How to explode each row that is an Array into columns in Spark (Scala)?

I have a Spark DataFrame with a single column 'value', where each row is an Array of equal length. How can I explode this single 'value' column into multiple columns that follow a schema like this?

Single-column DataFrame

import org.apache.spark.sql.types.{StructType, IntegerType, StringType}

val bronzeDfSchema = new StructType()
  .add("DATE", IntegerType)
  .add("NUMARTS", IntegerType)
  .add("COUNTS", StringType)
  .add("THEMES", StringType)
  .add("LOCATIONS", StringType)
  .add("PERSONS", StringType)
  .add("ORGANIZATIONS", StringType)
  .add("TONE", StringType)
  .add("CAMEOEVENTIDS", StringType)
  .add("SOURCES", StringType)
  .add("SOURCEURLS", StringType)

Thank you!

1 answer

  • answered 2021-07-24 20:00 Rushabh Gujarathi

    This should work just fine

    import org.apache.spark.sql.functions.col

    // Pair each target column name with its index in the 'value' array.
    val schema = Seq(("DATE", 0), ("NUMARTS", 1), ("COUNTS", 2), ("THEMES", 3),
      ("LOCATIONS", 4), ("PERSONS", 5), ("ORGANIZATIONS", 6), ("TONE", 7),
      ("CAMEOEVENTIDS", 8), ("SOURCES", 9), ("SOURCEURLS", 10))

    // Fold over the pairs, adding one named column per (name, index).
    val df2 = schema.foldLeft(df) { (df, x) =>
      df.withColumn(x._1, col("value").getItem(x._2))
    }
    

    After you do this, just cast each new column to the data type you want (e.g. DATE and NUMARTS to IntegerType).
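    The foldLeft pattern above can be illustrated with plain Scala collections, no Spark session needed; this is a sketch with made-up sample data, where each (name, index) pair pulls one element out of every row's array and files it under the column name, just as each `withColumn` call does on the DataFrame:

    ```scala
    // (name, index) pairs, as in the Spark answer (first three shown).
    val schema = Seq(("DATE", 0), ("NUMARTS", 1), ("COUNTS", 2))

    // Two sample "rows", each a fixed-length Array (hypothetical values).
    val rows = Seq(
      Array("20210724", "5", "a,b"),
      Array("20210725", "7", "c,d")
    )

    // Fold one named field in per step, mirroring one withColumn per step.
    val named: Seq[Map[String, String]] =
      rows.map { row =>
        schema.foldLeft(Map.empty[String, String]) { (acc, field) =>
          acc + (field._1 -> row(field._2))
        }
      }
    // named.head("NUMARTS") == "5"
    ```

    The same idea scales to all eleven columns: the accumulator (the DataFrame, or here a Map) grows by one named field on each fold step.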
