How do I read SequenceFile data in Spark with Scala?

This is my first attempt at reading sequence-format data in Scala. I would greatly appreciate it if someone could help me with the right command.


hdfs dfs -cat orders03132_seq/part-m-00000 | head

My command:

sc.sequenceFile("orders03132_seq/part-m-00000", classOf[Int], classOf[String]).first


18/03/13 16:59:28 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1) java.lang.RuntimeException: WritableName can't load class: orders at$Reader.getValueClass(

Thank you very much in advance.

1 answer

  • answered 2018-03-14 11:11 suj1th

    You need to read it as a Hadoop file, with K and V replaced by the Writable key and value classes stored in the file. You can do this with something like:

    sc.hadoopFile[K, V, SequenceFileInputFormat[K,V]]("path/to/file")

    Refer to the documentation here.
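
    As a rough sketch of how the hadoopFile call might look with concrete types filled in: the example below assumes the keys are IntWritable and the values are Text, which may not match your file. The error in the question ("WritableName can't load class: orders") suggests the value class recorded in the file header is a class named orders; if so, that class would have to be on the classpath and used in place of Text. The object name and app name are made up for illustration, and the old-API SequenceFileInputFormat (org.apache.hadoop.mapred) is used because that is what hadoopFile expects.

    ```scala
    import org.apache.hadoop.io.{IntWritable, Text}
    import org.apache.hadoop.mapred.SequenceFileInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical standalone driver; in spark-shell, `sc` already exists
    // and only the lines inside the try block are needed.
    object ReadSequenceFileSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("read-seq-sketch").setMaster("local[*]")
        val sc = new SparkContext(conf)
        try {
          // Path taken from the question.
          val path = "orders03132_seq/part-m-00000"

          // ASSUMPTION: keys are IntWritable and values are Text.
          // Swap in the classes actually stored in the file.
          val raw = sc.hadoopFile[IntWritable, Text, SequenceFileInputFormat[IntWritable, Text]](path)

          // Hadoop reuses Writable objects between records, so convert
          // them to plain Scala values before caching or collecting.
          val rows = raw.map { case (k, v) => (k.get, v.toString) }

          println(rows.first())
        } finally {
          sc.stop()
        }
      }
    }
    ```

    Note that classOf[Int] and classOf[String] are not Writable classes; the class-based sc.sequenceFile overload would need classOf[IntWritable] and classOf[Text] under the same assumption about the file's contents.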