pandas.core.groupby.DataFrameGroupBy.aggregate#
- DataFrameGroupBy.aggregate(func=None, *args, engine=None, engine_kwargs=None, **kwargs)[source]#
- Aggregate using one or more operations over the specified axis. - Parameters:
- funcfunction, str, list, dict or None
- Function to use for aggregating the data. If a function, must either work when passed a DataFrame or when passed to DataFrame.apply. - Accepted combinations are: - function 
- string function name 
- list of functions and/or function names, e.g. - [np.sum, 'mean']
- dict of axis labels -> functions, function names or list of such. 
- None, in which case - **kwargsare used with Named Aggregation. Here the output has one column for each element in- **kwargs. The name of the column is keyword, whereas the value determines the aggregation used to compute the values in the column.- Can also accept a Numba JIT function with - engine='numba'specified. Only passing a single function is supported with this engine.- If the - 'numba'engine is chosen, the function must be a user defined function with- valuesand- indexas the first and second arguments respectively in the function signature. Each group’s index will be passed to the user defined function and optionally available for use.
 
- *args
- Positional arguments to pass to func. 
- enginestr, default None
- 'cython': Runs the function through C-extensions from cython.
- 'numba': Runs the function through JIT compiled code from numba.
- None: Defaults to- 'cython'or globally setting- compute.use_numba
 
- engine_kwargsdict, default None
- For - 'cython'engine, there are no accepted- engine_kwargs
- For - 'numba'engine, the engine can accept- nopython,- nogiland- paralleldictionary keys. The values must either be- Trueor- False. The default- engine_kwargsfor the- 'numba'engine is- {'nopython': True, 'nogil': False, 'parallel': False}and will be applied to the function
 
- **kwargs
- If - funcis None,- **kwargsare used to define the output names and aggregations via Named Aggregation. See- funcentry.
- Otherwise, keyword arguments to be passed into func. 
 
 
- Returns:
- DataFrame
 
 - See also - DataFrame.groupby.apply
- Apply function func group-wise and combine the results together. 
- DataFrame.groupby.transform
- Transforms the Series on each group based on the given function. 
- DataFrame.aggregate
- Aggregate using one or more operations over the specified axis. 
 - Notes - When using - engine='numba', there will be no “fall back” behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried.- Functions that mutate the passed object can produce unexpected behavior or errors and are not supported. See Mutating with User Defined Function (UDF) methods for more details. - Changed in version 1.3.0: The resulting dtype will reflect the return value of the passed - func, see the examples below.- Examples - >>> df = pd.DataFrame( ... { ... "A": [1, 1, 2, 2], ... "B": [1, 2, 3, 4], ... "C": [0.362838, 0.227877, 1.267767, -0.562860], ... } ... ) - >>> df A B C 0 1 1 0.362838 1 1 2 0.227877 2 2 3 1.267767 3 2 4 -0.562860 - The aggregation is for each column. - >>> df.groupby('A').agg('min') B C A 1 1 0.227877 2 3 -0.562860 - Multiple aggregations - >>> df.groupby('A').agg(['min', 'max']) B C min max min max A 1 1 2 0.227877 0.362838 2 3 4 -0.562860 1.267767 - Select a column for aggregation - >>> df.groupby('A').B.agg(['min', 'max']) min max A 1 1 2 2 3 4 - User-defined function for aggregation - >>> df.groupby('A').agg(lambda x: sum(x) + 2) B C A 1 5 2.590715 2 9 2.704907 - Different aggregations per column - >>> df.groupby('A').agg({'B': ['min', 'max'], 'C': 'sum'}) B C min max sum A 1 1 2 0.590715 2 3 4 0.704907 - To control the output names with different aggregations per column, pandas supports “named aggregation” - >>> df.groupby("A").agg( ... b_min=pd.NamedAgg(column="B", aggfunc="min"), ... c_sum=pd.NamedAgg(column="C", aggfunc="sum")) b_min c_sum A 1 1 0.590715 2 3 0.704907 - The keywords are the output column names 
- The values are tuples whose first element is the column to select and the second element is the aggregation to apply to that column. Pandas provides the - pandas.NamedAggnamedtuple with the fields- ['column', 'aggfunc']to make it clearer what the arguments are. As usual, the aggregation can be a callable or a string alias.
 - See Named aggregation for more. - Changed in version 1.3.0: The resulting dtype will reflect the return value of the aggregating function. - >>> df.groupby("A")[["B"]].agg(lambda x: x.astype(float).min()) B A 1 1.0 2 3.0