write PySpark dataframe to csv
# The CSV writer cannot store array columns, so this example converts
# column_as_array to a string column (column_as_str) before saving.
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def array_to_string(my_list):
    return '[' + ','.join([str(elem) for elem in my_list]) + ']'

array_to_string_udf = udf(array_to_string, StringType())

df = df.withColumn('column_as_str', array_to_string_udf(df["column_as_array"]))

# Then you can drop the old column (array type) before saving.
df.drop("column_as_array").write.csv(...)
Source: stackoverflow.com
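The function wrapped by the UDF above is plain Python, so its output format can be checked without a Spark session. The sketch below is one possible variant, not the original answer's code: the None check is an added assumption to cover null arrays, which would otherwise raise a TypeError inside the UDF.

```python
def array_to_string(my_list):
    """Render a list as '[a,b,c]'; return None for a null array."""
    if my_list is None:
        # Assumption: null arrays should stay null in the output column.
        # Spark passes null column values to a Python UDF as None.
        return None
    return '[' + ','.join(str(elem) for elem in my_list) + ']'


print(array_to_string([1, 2, 3]))   # [1,2,3]
print(array_to_string(['a', 'b']))  # [a,b]
print(array_to_string([]))          # []
print(array_to_string(None))        # None
```

Note that elements containing commas are not quoted or escaped by this function, so the string is meant for display rather than for parsing back into a list.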
pyspark dataframe to single csv
df.repartition(1).write.csv('/path/csvname.csv')
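Note that Spark treats the path given to write.csv as a directory: with repartition(1) it holds a single part-*.csv file plus marker files such as _SUCCESS, not a file named csvname.csv. A minimal sketch of pulling that part file out under a chosen name follows. It assumes the output is uncompressed and sits on the local filesystem (not HDFS or S3); collect_single_csv and the target path are hypothetical names used only for illustration.

```python
import glob
import os
import shutil


def collect_single_csv(spark_output_dir, target_file):
    """Move the only part-*.csv file out of a Spark output directory.

    Hypothetical helper: assumes a local, uncompressed output written
    with a single partition.
    """
    part_files = glob.glob(os.path.join(spark_output_dir, 'part-*.csv'))
    if len(part_files) != 1:
        raise ValueError(
            'expected exactly one part file, found %d' % len(part_files))
    shutil.move(part_files[0], target_file)
    return target_file


# Usage after the write above (paths are placeholders):
# collect_single_csv('/path/csvname.csv', '/path/result.csv')
```

Raising an error when the directory does not contain exactly one part file avoids silently picking one partition of a multi-partition write.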
save dataframe to a csv local file pyspark
df.repartition(1).write.format('com.databricks.spark.csv').save("/path/to/file/myfile.csv", header='true')
Source: stackoverflow.com