usage of agg(), aggregate()

2021-10-16 Pandas

In this article, we will talk about the usage of agg and aggregate in Pandas.


You can use the agg() and aggregate() methods to aggregate the columns or rows of a DataFrame. agg() is an alias for aggregate().
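As a quick check of the alias relationship, the following minimal sketch calls both methods on a small, hypothetical DataFrame (the name small and its values are illustrative only and are not part of the test data used in this article) and compares the results.

```python
import pandas as pd

# Hypothetical two-column frame, used only for this illustration
small = pd.DataFrame({"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]})

# agg() and aggregate() accept the same arguments
via_agg = small.agg("sum")
via_aggregate = small.aggregate("sum")

# Both calls produce the same Series
print(via_agg.equals(via_aggregate))
```

Since the two names behave the same way, the rest of this article uses the shorter agg().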

First, we will prepare some test data.

import pandas as pd

# class, name, height, weight
data = [("A", "Kevin", 170, 60), 
        ("A", "Jack", 168, 59), 
        ("A", "Mary", 160, 50), 
        ("B", "Tom", 175, 65), 
        ("B", "Annie", 162, 51)]
df = pd.DataFrame(data=data, columns=["class", "name", "height", "weight"])
df

Result

	class	name	height	weight
0	A	Kevin	170	60
1	A	Jack	168	59
2	A	Mary	160	50
3	B	Tom	175	65
4	B	Annie	162	51

Basic usage of agg()

Pass a function name (string), a callable, or a list of them as the argument of agg() to indicate the operation to apply. Here, we will use a list of function-name strings.

Definition of agg()

DataFrame.agg(func=None, axis=0, *args, **kwargs)

func can be a function, a function name given as a string, a list of functions and/or function names (e.g. [np.sum, 'mean']), or a dict that maps axis labels to functions, function names, or lists of them. Examples of functions: np.sum, np.mean. Examples of function names: 'sum', 'mean', 'count'.

# specify a list of functions
df.agg(['sum', 'mean', 'min', 'max'])

Result

	class	name	height	weight
sum	AAABB	KevinJackMaryTomAnnie	835.0	285.0
min	A	Annie	160.0	50.0
max	B	Tom	175.0	65.0
mean	NaN	NaN	167.0	57.0
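The table above includes the string columns class and name. Newer pandas releases may raise a TypeError when 'mean' is applied to string columns instead of returning NaN, so the following is a minimal sketch that restricts the aggregation to the numeric columns. The variable names numeric and summary are chosen here for illustration.

```python
import pandas as pd

# Same test data as in this article: class, name, height, weight
data = [("A", "Kevin", 170, 60),
        ("A", "Jack", 168, 59),
        ("A", "Mary", 160, 50),
        ("B", "Tom", 175, 65),
        ("B", "Annie", 162, 51)]
df = pd.DataFrame(data=data, columns=["class", "name", "height", "weight"])

# Assumption: keep only the numeric columns so every function is defined
numeric = df[["height", "weight"]]

# One row per function, one column per numeric column
summary = numeric.agg(["sum", "mean", "min", "max"])
print(summary)
```

Selecting the columns first keeps the call independent of how a given pandas version treats non-numeric data.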

If a list is specified, a DataFrame is returned, even when the list contains only one function. If a single function name is specified, a Series is returned.

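# Use the numeric columns only, because 'mean' is not defined for strings

# The return value is a DataFrame
this_is_a_dataframe = df[["height", "weight"]].agg(['mean'])

# The return value is a Series
this_is_a_series = df[["height", "weight"]].agg('mean')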
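To confirm the difference in return types, here is a minimal sketch that prints the type of each result. It rebuilds only the numeric part of the test data, and the variable names are illustrative.

```python
import pandas as pd

# Numeric columns of the test data used in this article
numeric = pd.DataFrame({"height": [170, 168, 160, 175, 162],
                        "weight": [60, 59, 50, 65, 51]})

result_from_list = numeric.agg(["mean"])  # list of functions
result_from_name = numeric.agg("mean")    # single function name

print(type(result_from_list).__name__)  # DataFrame
print(type(result_from_name).__name__)  # Series
```

The DataFrame has one row per function in the list, while the Series is indexed by column name.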

If we want to apply different aggregations to different columns, we can pass a dict whose keys are column names and whose values are the aggregation functions to apply.

# specify a dict that maps each column to a list of functions
df.agg({"height": ['sum', 'mean'], "weight": ['min', 'max']})

Result

	height	weight
sum	835.0	NaN
mean	167.0	NaN
min	NaN	50.0
max	NaN	65.0
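The dict values do not have to be lists. As a minimal sketch, mapping each column to a single function name returns a Series with one value per column, with no NaN padding. The variable name per_column is illustrative.

```python
import pandas as pd

# Numeric columns of the test data used in this article
df = pd.DataFrame({"height": [170, 168, 160, 175, 162],
                   "weight": [60, 59, 50, 65, 51]})

# One function name per column: the result is a Series indexed by column name
per_column = df.agg({"height": "mean", "weight": "max"})
print(per_column)
```

This form is convenient when each column needs exactly one summary value.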

By default, the aggregation is performed on each column. If we want to aggregate each row instead, we can specify axis=1 or axis='columns'.

axis: If 0 or 'index': apply the function to each column. If 1 or 'columns': apply the function to each row.

# Calculate sum of height and weight in row direction
df[["height", "weight"]].agg("sum", axis=1)
# or
df[["height", "weight"]].agg("sum", axis='columns')
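The article does not show the result of the row-wise call, so here is a minimal sketch that prints it. It rebuilds only the two numeric columns of the test data, and the variable name row_totals is illustrative.

```python
import pandas as pd

# Numeric columns of the test data used in this article
df = pd.DataFrame({"height": [170, 168, 160, 175, 162],
                   "weight": [60, 59, 50, 65, 51]})

# axis=1 aggregates across the columns, giving one value per row
row_totals = df[["height", "weight"]].agg("sum", axis=1)
print(row_totals.tolist())  # [230, 227, 210, 240, 213]
```

The result is a Series with the same index as the original rows.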

Aggregation function usage examples

We can pass a function name as a string to apply an aggregation.

df["height"].agg("mean")

Result

167.0

We can also pass a function object instead of its name.

import numpy as np
df["height"].agg(np.mean)

Result

167.0

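We can also pass a lambda function. Note that this lambda returns one value per element, so the result is a Series of the same length rather than a single summary value.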

df["height"].agg(lambda x: x/10)

Result


0	17.0
1	16.8
2	16.0
3	17.5
4	16.2

Name: height, dtype: float64

We can also define our own function and pass it to agg(). Like the lambda above, this function is applied to each element.

def myfunc(h):
    return "Height: " + str(h)

df["height"].agg(myfunc)

Result


0	Height: 170
1	Height: 168
2	Height: 160
3	Height: 175
4	Height: 162

Name: height, dtype: object

We can also pass a list of functions to apply several of them at once.

df["height"].agg([lambda x: x/10, myfunc])

Result

	<lambda>	myfunc
0	17.0	Height: 170
1	16.8	Height: 168
2	16.0	Height: 160
3	17.5	Height: 175
4	16.2	Height: 162
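The lambda and myfunc examples above return one value per element, which makes them transformations rather than aggregations. Recent pandas releases appear to deprecate this element-wise fallback in Series.agg() and point to Series.transform() for such cases, so the following minimal sketch contrasts the two methods. The variable names are illustrative.

```python
import pandas as pd

# Height column of the test data used in this article
heights = pd.Series([170, 168, 160, 175, 162], name="height")

# Element-wise operation: the result has the same length as the input
scaled = heights.transform(lambda x: x / 10)
print(scaled.tolist())  # [17.0, 16.8, 16.0, 17.5, 16.2]

# True aggregation: one value per function name
extremes = heights.agg(["min", "max"])
print(extremes["min"], extremes["max"])  # 160 175
```

Using transform() for element-wise work and agg() for summaries keeps the intent of each call clear.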

That's it!