Trouble passing application argument to spark-submit with Scala


I'm pretty new to functional programming and don't have an imperative programming background. I'm running through some basic Scala/Spark tutorials online and am having difficulty submitting a Scala application through spark-submit.

In particular, I'm getting a java.lang.ArrayIndexOutOfBoundsException: 0. I researched it and found that the array element at position 0 is the culprit. Looking further, I added some basic debugging to tell me whether the main application was actually picking up the argument at runtime, and it is not. Here is the code:

    import org.apache.spark.{SparkConf, SparkContext}

    object SparkMeApp {
      def main(args: Array[String]): Unit = {
        try {
          // Program works fine if the path to the file is hardcoded
          // val logFile = "c:\\users\\garveyj\\desktop\\netsetup.log"
          val logFile = args(0)
          val conf = new SparkConf().setAppName("sparkme application").setMaster("local[*]")
          val sc = new SparkContext(conf)
          val logData = sc.textFile(logFile, 2).cache()
          val numFound = logData.filter(line => line.contains("found")).count()
          val numData = logData.filter(line => line.contains("data")).count()
          println("")
          println("lines found: %s, lines data: %s".format(numFound, numData))
          println("")
        }
        catch {
          // Print how many arguments actually reached main
          case aoub: ArrayIndexOutOfBoundsException => println(args.length)
        }
      }
    }

To submit the application using spark-submit I use:

    spark-submit --class SparkMeApp --master "local[*]" --jars target\scala-2.10\firstsparkapplication_2.10-1.0.jar netsetup.log

...where netsetup.log is in the same directory from which I'm submitting the application. The output of the application is simply: 0. If I remove the try/catch, the output is:

    Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
            at SparkMeApp$.main(SparkMeApp.scala:12)
            at SparkMeApp.main(SparkMeApp.scala)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
            at java.lang.reflect.Method.invoke(Unknown Source)
            at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
            at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
            at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
            at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
            at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

It's worth pointing out that the application runs fine if I remove the argument and hard-code the path to the log file. I don't know what I'm missing here. Any direction would be appreciated. Thanks in advance!

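You are invoking spark-submit incorrectly. The value after --jars is read as a list of extra dependency jars, so your application jar is consumed by that flag, netsetup.log is then treated as the application jar, and nothing is left over to pass to main as args. The command should look like this: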

    ./spark-submit --class SparkMeApp --master "local[*]" \
      example.jar examplefile.txt

You only need to pass --jars if there is an external dependency and you want to distribute that jar to the executors.
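To make the effect of the misplaced --jars flag concrete, here is a toy model of how the command line gets divided. It is only a sketch and not Spark's real parser: the object name SubmitArgsSketch, the split helper, and the three-flag set are all assumptions made for illustration.

```scala
// Simplified illustration only: this is NOT Spark's actual argument parser.
// It handles just the three flags that appear in this thread.
object SubmitArgsSketch {
  // Flags that consume the token following them (hypothetical, minimal set).
  val flagsWithValue: Set[String] = Set("--class", "--master", "--jars")

  // Returns (application jar, arguments forwarded to the application's main).
  def split(tokens: List[String]): (Option[String], List[String]) = tokens match {
    case flag :: _ :: rest if flagsWithValue(flag) => split(rest) // skip the flag and its value
    case jar :: appArgs => (Some(jar), appArgs) // first free token is the application jar
    case Nil => (None, Nil)
  }

  def main(args: Array[String]): Unit = {
    // The command from the question: the jar is swallowed by --jars.
    val asked = List("--class", "SparkMeApp", "--master", "local[*]",
      "--jars", "example.jar", "netsetup.log")
    // The corrected command: the jar is the first free token.
    val fixed = List("--class", "SparkMeApp", "--master", "local[*]",
      "example.jar", "netsetup.log")

    println(split(asked)) // (Some(netsetup.log),List())
    println(split(fixed)) // (Some(example.jar),List(netsetup.log))
  }
}
```

In the first case the log file ends up in the application-jar slot and the argument list is empty, which is consistent with args.length printing 0.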

If you had enabled INFO/WARN logging in log4j.properties, you would have caught it:

    Warning: Local jar /home/user/downloads/spark-1.4.0/bin/netsetup.log does not exist, skipping.
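Independently of the logging, the application itself can report a missing argument more clearly than an ArrayIndexOutOfBoundsException does. The following is a minimal sketch of that idea, not part of the original program: the object name ArgsGuard, the helper logFileFrom, and the usage text are made up for illustration.

```scala
// Minimal sketch: check that an argument is present before indexing into args.
// ArgsGuard, logFileFrom and the usage text are illustrative assumptions.
object ArgsGuard {
  // Right(path) when a first argument exists, Left(usage message) otherwise.
  def logFileFrom(args: Array[String]): Either[String, String] =
    args.headOption.toRight(
      s"Usage: spark-submit --class <main class> <application jar> <log file> " +
        s"(got ${args.length} arguments)")

  def main(args: Array[String]): Unit =
    logFileFrom(args) match {
      case Right(path) =>
        // The SparkContext setup from the question would go here.
        println(s"Reading log file: $path")
      case Left(usage) =>
        System.err.println(usage)
        sys.exit(1)
    }
}
```

With a guard like this, a run with no application arguments fails immediately with a message that names the expected command shape.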
