dataframe groupby multiple columns
grouped_multiple = df.groupby(['Team', 'Pos']).agg({'Age': ['mean', 'min', 'max']})
grouped_multiple.columns = ['age_mean', 'age_min', 'age_max']
grouped_multiple = grouped_multiple.reset_index()
print(grouped_multiple)
Source: jamesrledoux.com
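For reference, here is a minimal self-contained sketch of the same pandas pattern using named aggregation, which avoids renaming the columns by hand. The sample data is made up purely for illustration; only the column names `Team`, `Pos`, and `Age` come from the snippet above.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "Team": ["A", "A", "A", "B", "B"],
    "Pos": ["G", "G", "F", "G", "G"],
    "Age": [20, 30, 25, 22, 28],
})

# Named aggregation: each keyword becomes an output column,
# built from an (input column, aggregation function) pair.
grouped_multiple = (
    df.groupby(["Team", "Pos"])
      .agg(
          age_mean=("Age", "mean"),
          age_min=("Age", "min"),
          age_max=("Age", "max"),
      )
      .reset_index()  # turn the Team/Pos group keys back into ordinary columns
)
print(grouped_multiple)
```

Named aggregation requires pandas 0.25 or later; on older versions the dict-plus-rename approach shown above is the usual fallback.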
pyspark group by and average in dataframes
df.groupBy("Profession").agg({'Age':'avg', 'Gender':'count'}).show()
Source: stackoverflow.com
pyspark groupby multiple columns
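# groupBy alone returns a GroupedData object; chain an aggregation such as count() to get a DataFrame
df.groupBy("year", "sex").count().show()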
Source: stackoverflow.com