Apache Spark Dataframe - Get length of each column

Question: In an Apache Spark DataFrame, using Python, how can we get the data type and length of each column? I'm using the latest version of Python.

Using pandas dataframe, I do it as follows:

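import pandas as pd

df = pd.read_csv(r'C:\TestFolder\myFile1.csv', low_memory=False)

# Cast to str first: the .str accessor raises on non-string columns
for col in df:
    print(col, '->', df[col].astype(str).str.len().max())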

1 answer

  • answered 2022-05-07 05:16 Vaebhav

    PySpark DataFrames also have a describe() method, similar to pandas, which you can use in this case.


