cassandra/datastax: programmatically setting datastax package

The following spark-submit script works:

nohup ./bin/spark-submit   --jars ./ikoda/extrajars/ikoda_assembled_ml_nlp.jar,./ikoda/extrajars/stanford-corenlp-3.8.0.jar,./ikoda/extrajars/stanford-parser-3.8.0.jar \
--packages datastax:spark-cassandra-connector:2.0.1-s_2.11 \
--class ikoda.mlserver.Application \
--conf spark.cassandra.connection.host=192.168.0.33 \
--master local[*]  ./ikoda/ikodaanalysis-mlserver-0.1.0.jar   1000  > ./logs/nohup.out &

Programmatically, I can set the same --conf options through SparkConf:

    val conf = new SparkConf().setMaster("local[4]").setAppName("MLPCURLModelGenerationDataStream")
    conf.set("spark.streaming.stopGracefullyOnShutdown", "true")
    conf.set("spark.cassandra.connection.host", sparkcassandraconnectionhost)
    conf.set("spark.driver.maxResultSize", sparkdrivermaxResultSize)
    conf.set("spark.network.timeout", sparknetworktimeout)

Question

Can I add --packages datastax:spark-cassandra-connector:2.0.1-s_2.11 programmatically? If yes, how?

1 answer

  • answered 2018-07-12 08:17 Aaron Makubuya

    Yes. The configuration property that corresponds to --packages is spark.jars.packages:

    conf.set(
      "spark.jars.packages",
      "datastax:spark-cassandra-connector:2.0.1-s_2.11")
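
    For context, here is a minimal sketch of where that line might sit in a complete driver program. The object name PackagesConfSketch is hypothetical, and the host value is copied from the spark-submit script in the question. The property has to be on the SparkConf before the SparkContext is created, since the context works from its own copy of the configuration. Whether the connector is actually fetched may depend on how the application is launched, so treat this only as an illustration of where the setting goes, not as a verified recipe.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver object, for illustration only.
object PackagesConfSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("MLPCURLModelGenerationDataStream")
      // Same Maven coordinate that was passed to --packages in the question.
      .set("spark.jars.packages", "datastax:spark-cassandra-connector:2.0.1-s_2.11")
      // Host taken from the spark-submit script in the question.
      .set("spark.cassandra.connection.host", "192.168.0.33")

    // All properties must be set before this point; the context copies the conf.
    val sc = new SparkContext(conf)
    try {
      // Read the value back to confirm it is present on the active configuration.
      println(sc.getConf.get("spark.jars.packages"))
    } finally {
      sc.stop()
    }
  }
}
```

    Several packages can be listed in the same property as a comma-separated string of Maven coordinates.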