I'm pretty new to functional programming and don't have an imperative programming background. I'm running through basic Scala/Spark tutorials online and having difficulty submitting a Scala application through spark-submit.
In particular, I'm getting a java.lang.ArrayIndexOutOfBoundsException: 0 exception. I researched it and found that the array element at position 0 is the culprit. Looking further, I used some basic debugging to tell me whether the main application was picking up the argument at runtime; it was not. Here is the code:
import org.apache.spark.{SparkConf, SparkContext}

object SparkMeApp {
  def main(args: Array[String]) {
    try {
      // program works fine if the path to the file is hardcoded
      // val logFile = "C:\\Users\\garveyj\\Desktop\\NetSetup.log"
      val logFile = args(0)
      val conf = new SparkConf().setAppName("SparkMe Application").setMaster("local[*]")
      val sc = new SparkContext(conf)
      val logData = sc.textFile(logFile, 2).cache()
      val numFound = logData.filter(line => line.contains("found")).count()
      val numData = logData.filter(line => line.contains("data")).count()
      println("")
      println("Lines found: %s, Lines data: %s".format(numFound, numData))
      println("")
    } catch {
      case aoub: ArrayIndexOutOfBoundsException => println(args.length)
    }
  }
}
To submit the application using spark-submit, I use:
spark-submit --class SparkMeApp --master "local[*]" --jars target\scala-2.10\firstsparkapplication_2.10-1.0.jar netsetup.log
...where netsetup.log is in the same directory from which I'm submitting the application. The output of the application is simply: 0. If I remove the try/catch, the output is:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
        at SparkMeApp$.main(SparkMeApp.scala:12)
        at SparkMeApp.main(SparkMeApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
It's worth pointing out that the application runs fine if I remove the argument and hardcode the path to the log file. I don't know what I'm missing here. Any direction is appreciated. Thanks in advance!
You are doing the spark-submit wrong. The actual command is:
./spark-submit --class SparkMeApp --master "local[*]" \
  example.jar examplefile.txt
You only need to pass --jars if there are external dependencies and you want to distribute those jars to all the executors. The application jar itself goes as the first positional argument, and everything after it is passed to your main method. In your command, spark-submit treated netsetup.log as a jar to ship and your application jar as the positional argument, so no arguments reached main and args(0) threw.
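As a sketch of the distinction (some-dependency.jar is a hypothetical external library, not something from your build), the corrected submission with an extra dependency would look like:

```shell
# app jar is the positional argument; --jars is only for extra dependencies
./spark-submit --class SparkMeApp --master "local[*]" \
  --jars /path/to/some-dependency.jar \
  target/scala-2.10/firstsparkapplication_2.10-1.0.jar netsetup.log
```

Everything after the application jar (here netsetup.log) is forwarded to your main method as args.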
If you had enabled log4j.properties at INFO/WARN, you would have caught the warning:
Warning: Local jar /home/user/Downloads/spark-1.4.0/bin/netsetup.log does not exist, skipping.
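As a side note, rather than catching ArrayIndexOutOfBoundsException, it is cleaner to validate args up front and fail fast with a usage message. A minimal sketch, reusing the SparkMeApp name from the question (the parseArgs helper and usage string are my own invention, and the Spark setup is elided):

```scala
object SparkMeApp {
  // parse the first argument, or return a usage message; pure, so it is easy to test
  def parseArgs(args: Array[String]): Either[String, String] =
    if (args.isEmpty) Left("Usage: SparkMeApp <logFile>") else Right(args(0))

  def main(args: Array[String]): Unit =
    parseArgs(args) match {
      case Left(usage) =>
        System.err.println(usage)
        sys.exit(1)
      case Right(logFile) =>
        // build the SparkConf/SparkContext and process logFile here, as in the question
        println(s"Processing: $logFile")
    }
}
```

With this structure, submitting without an argument prints the usage message and exits nonzero instead of dying with a stack trace.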