How do I read sequence data in Scala in Spark?
This is my first attempt at reading sequence-format data in Scala. I would greatly appreciate it if someone could help me with the right command.
hdfs dfs -cat orders03132_seq/part-m-00000 | head
SEQ!org.apache.hadoop.io.LongWritableorde [binary data]
sc.sequenceFile("orders03132_seq/part-m-00000", classOf[Int], classOf[String]).first
18/03/13 16:59:28 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.RuntimeException: java.io.IOException: WritableName can't load class: orders
    at org.apache.hadoop.io.SequenceFile$Reader.getValueClass(SequenceFile.java:2103)
Thank you very much in advance.
You need to read it as a Hadoop file, with K and V replaced by the key and value Writable classes stored in the file. You can do this with something like:
sc.hadoopFile[K, V, SequenceFileInputFormat[K,V]]("path/to/file")
Refer to the Spark API documentation for SparkContext.hadoopFile.
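To make the K and V placeholders concrete, here is a minimal sketch that writes a tiny sequence file and reads it back with hadoopFile. It assumes the key class is LongWritable and the value class is Text; the object name SequenceFileSketch, the helper readPairs, and the temporary path are made up for illustration. The file in the question is different: judging by the error, its header names a value class called orders, so that class would have to be on the Spark classpath (for example through a jar passed with --jars) and used in place of Text.

```scala
import java.nio.file.Files

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.SequenceFileInputFormat
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SequenceFileSketch {
  // Assumption: keys are LongWritable and values are Text.
  // Swap in the classes named in your own file's header.
  def readPairs(sc: SparkContext, path: String): RDD[(Long, String)] =
    sc.hadoopFile[LongWritable, Text, SequenceFileInputFormat[LongWritable, Text]](path)
      // Hadoop reuses Writable instances between records,
      // so copy the contents into plain Scala values right away.
      .map { case (k, v) => (k.get, v.toString) }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("seq-sketch").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      // Hypothetical scratch location, used only for this demo.
      val path = Files.createTempDirectory("seq-sketch").resolve("orders_demo").toString

      // Write two (Long, String) records as a LongWritable/Text sequence file.
      sc.parallelize(Seq((1L, "first order"), (2L, "second order")))
        .saveAsSequenceFile(path)

      // Read them back and print them in key order.
      readPairs(sc, path).collect().sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, sc already exists, so only the imports and the body of readPairs are needed. Note that SequenceFileInputFormat here comes from org.apache.hadoop.mapred (the old Hadoop API), which is what hadoopFile expects; the org.apache.hadoop.mapreduce variant goes with newAPIHadoopFile instead.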