Pandas >> Sort
This tutorial explains how to sort data in Pandas: sorting by values in ascending or descending order, sorting by multiple columns, sorting the index, sorting columns by name, sorting a series, sorting a datetime index, and sorting by a date column.
Sort values
To sort a Pandas dataframe by the values in a particular column, you can use the sort_values() function. This function allows you to specify the column to sort by and the sort order (ascending or descending).
Here is an example of how to use the sort_values() function to sort a dataframe by the values in the “Age” column in ascending order:
ascending
import pandas as pd
# Load a sample dataframe
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45]})
# Sort the dataframe by the values in the "Age" column in ascending order
df_sorted = df.sort_values(by='Age')
# The sorted dataframe will have the rows in ascending order of the "Age" column
print(df_sorted)
Output:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 35
3 David 40
4 Eve 45
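Because the sample data is already in ascending order, the sort leaves it looking unchanged. The following is a minimal sketch with the rows deliberately shuffled (the name df_shuffled and the row order are made up for illustration). It shows that each row keeps its original index label after sorting.

```python
import pandas as pd

# Hypothetical sample data, deliberately out of order
df_shuffled = pd.DataFrame({'Name': ['Charlie', 'Eve', 'Alice', 'David', 'Bob'],
                            'Age': [35, 45, 25, 40, 30]})

# Sort by "Age" in ascending order (the default)
df_sorted = df_shuffled.sort_values(by='Age')

# Names now appear in order of increasing age
print(df_sorted['Name'].tolist())   # ['Alice', 'Bob', 'Charlie', 'David', 'Eve']
# The original row labels travel with their rows
print(df_sorted.index.tolist())     # [2, 4, 0, 3, 1]
```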
descending
To sort the dataframe in descending order, you can use the ascending parameter:
# Sort the dataframe by the values in the "Age" column in descending order
df_sorted = df.sort_values(by='Age', ascending=False)
# The sorted dataframe will have the rows in descending order of the "Age" column
print(df_sorted)
Output:
Name Age
4 Eve 45
3 David 40
2 Charlie 35
1 Bob 30
0 Alice 25
Sort multiple columns
You can also sort the dataframe by multiple columns by specifying a list of column names in the by parameter. The rows will be sorted first by the values in the first column, and then by the values in the second column, and so on.
Here is a sample of sorting values by two columns.
# Sort the dataframe by the values in the "Age" column and then by the values in the "Name" column
df_sorted = df.sort_values(by=['Age', 'Name'], ascending=[True, True])
# The sorted dataframe will have the rows first sorted by the values in the "Age" column and then by the values in the "Name" column
print(df_sorted)
Output:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 35
3 David 40
4 Eve 45
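In the sample above every age is unique, so the second sort key never comes into play. As a rough illustration, the sketch below uses made-up data with repeated ages (df_ties is a hypothetical name) and a different direction for each column, so the effect of the second key is visible.

```python
import pandas as pd

# Hypothetical data with repeated ages so that the second key matters
df_ties = pd.DataFrame({'Name': ['Dana', 'Alex', 'Cody', 'Beth'],
                        'Age': [30, 25, 30, 25]})

# "Age" ascending; rows with the same age are ordered by "Name" descending
df_sorted = df_ties.sort_values(by=['Age', 'Name'], ascending=[True, False])

print(df_sorted['Name'].tolist())   # ['Beth', 'Alex', 'Dana', 'Cody']
print(df_sorted['Age'].tolist())    # [25, 25, 30, 30]
```

The ascending list must have one entry per column named in by.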
Note that the sort_values() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
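The short sketch below illustrates this behaviour on a small made-up dataframe (df_small is a hypothetical name). It also shows two optional parameters of sort_values(): ignore_index, which is assumed to be available here (pandas 1.0 or later), and inplace.

```python
import pandas as pd

# Hypothetical two-row dataframe, out of order
df_small = pd.DataFrame({'Name': ['Bob', 'Alice'], 'Age': [30, 25]})

# sort_values() returns a new dataframe; df_small itself is not changed
df_sorted = df_small.sort_values(by='Age')
print(df_small['Name'].tolist())       # ['Bob', 'Alice']
print(df_sorted.index.tolist())        # [1, 0]

# ignore_index=True renumbers the rows of the result from 0
df_renumbered = df_small.sort_values(by='Age', ignore_index=True)
print(df_renumbered.index.tolist())    # [0, 1]

# inplace=True sorts df_small itself and returns None
df_small.sort_values(by='Age', inplace=True)
print(df_small['Name'].tolist())       # ['Alice', 'Bob']
```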
Sort index
To sort the index of a Pandas dataframe, you can use the sort_index() function. This function allows you to specify the sort order (ascending or descending) for the index.
Here is an example of how to use the sort_index() function to sort the index of a dataframe in ascending order:
import pandas as pd
# Load a sample dataframe with a multi-level index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45]},
index=pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z'), ('d', 'x'), ('e', 'y')],
names=['First', 'Second']))
# Sort the index of the dataframe in ascending order
df_sorted = df.sort_index()
# The sorted dataframe will have the index sorted in ascending order
print(df_sorted)
Output:
                 Name  Age
First Second
a     x         Alice   25
b     y           Bob   30
c     z       Charlie   35
d     x         David   40
e     y           Eve   45
To sort the index in descending order, you can use the ascending parameter:
# Sort the index of the dataframe in descending order
df_sorted = df.sort_index(ascending=False)
# The sorted dataframe will have the index sorted in descending order
print(df_sorted)
Output:
                 Name  Age
First Second
e     y           Eve   45
d     x         David   40
c     z       Charlie   35
b     y           Bob   30
a     x         Alice   25
You can also sort the index by a specific level by using the level parameter. For example, to sort the index by the second level:
# Sort the index of the dataframe by the second level
df_sorted = df.sort_index(level=1)
# The sorted dataframe will have the index sorted by the second level
print(df_sorted)
Output:
                 Name  Age
First Second
a     x         Alice   25
d     x         David   40
b     y           Bob   30
e     y           Eve   45
c     z       Charlie   35
Note that the sort_index() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
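A level can also be referred to by its name instead of its position. The sketch below rebuilds a similar multi-level index (df_multi is a hypothetical name, with only the "Age" column kept for brevity) and sorts by the "Second" level. It also shows the sort_remaining parameter, which controls whether the other levels are used to order rows that share the same value in the chosen level.

```python
import pandas as pd

# Hypothetical dataframe with a two-level index named "First" and "Second"
df_multi = pd.DataFrame(
    {'Age': [25, 30, 35, 40, 45]},
    index=pd.MultiIndex.from_tuples(
        [('a', 'x'), ('b', 'y'), ('c', 'z'), ('d', 'x'), ('e', 'y')],
        names=['First', 'Second']))

# Sort by the level called "Second"; equivalent to level=1
df_by_name = df_multi.sort_index(level='Second')
print(df_by_name.index.tolist())
# [('a', 'x'), ('d', 'x'), ('b', 'y'), ('e', 'y'), ('c', 'z')]

# sort_remaining=False sorts on "Second" only and does not use "First"
df_second_only = df_multi.sort_index(level='Second', sort_remaining=False)
print(df_second_only.index.get_level_values('Second').tolist())
# ['x', 'x', 'y', 'y', 'z']
```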
Sort columns by name
To sort the columns of a Pandas dataframe by column name, you can use the sort_index() function and specify the axis parameter as 1.
Here is an example of how to use the sort_index() function to sort the columns of a dataframe in ascending order:
import pandas as pd
# Load a sample dataframe
df = pd.DataFrame({'Age': [25, 30, 35, 40, 45],
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve']})
# Sort the columns of the dataframe in ascending order
df_sorted = df.sort_index(axis=1)
# The sorted dataframe will have the columns sorted in ascending order
print(df_sorted)
Output:
Age Name
0 25 Alice
1 30 Bob
2 35 Charlie
3 40 David
4 45 Eve
To sort the columns in descending order, you can use the ascending parameter:
# Sort the columns of the dataframe in descending order
df_sorted = df.sort_index(axis=1, ascending=False)
# The sorted dataframe will have the columns sorted in descending order
print(df_sorted)
Output:
Name Age
0 Alice 25
1 Bob 30
2 Charlie 35
3 David 40
4 Eve 45
Note that the sort_index() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
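Column names are compared as plain strings, so uppercase names sort before lowercase ones. As a possible refinement, the sketch below uses the key parameter of sort_index() to sort case-insensitively. The dataframe df_cols is made up for illustration, and the key parameter is assumed to be available (pandas 1.1 or later).

```python
import pandas as pd

# Hypothetical dataframe with mixed-case column names
df_cols = pd.DataFrame({'c': [1], 'B': [2], 'a': [3]})

# Default ordering: uppercase letters sort before lowercase letters
df_default = df_cols.sort_index(axis=1)
print(df_default.columns.tolist())   # ['B', 'a', 'c']

# key receives the column labels and returns the values to sort by;
# lowercasing them gives a case-insensitive ordering
df_ci = df_cols.sort_index(axis=1, key=lambda labels: labels.str.lower())
print(df_ci.columns.tolist())        # ['a', 'B', 'c']
```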
Sort a series
To sort a Pandas series by its values, you can use the sort_values() function. This function allows you to specify the sort order (ascending or descending) for the values.
Here is an example of how to use the sort_values() function to sort a series in ascending order:
import pandas as pd
# Load a sample series
s = pd.Series([25, 30, 35, 40, 45], index=['Alice', 'Bob', 'Charlie', 'David', 'Eve'])
# Sort the series in ascending order
s_sorted = s.sort_values()
# The sorted series will have the values sorted in ascending order
print(s_sorted)
Output:
Alice 25
Bob 30
Charlie 35
David 40
Eve 45
dtype: int64
To sort the series in descending order, you can use the ascending parameter:
# Sort the series in descending order
s_sorted = s.sort_values(ascending=False)
# The sorted series will have the values sorted in descending order
print(s_sorted)
Output:
Eve 45
David 40
Charlie 35
Bob 30
Alice 25
dtype: int64
Note that the sort_values() function returns a new series, so you need to assign the returned series to a new variable or overwrite the existing series if you want to keep the sorted data.
If you want to sort the series by its index instead of its values, you can use the sort_index() function. This function allows you to specify the sort order (ascending or descending) for the index.
# Sort the index of the series in ascending order
s_sorted = s.sort_index()
# The sorted series will have the index sorted in ascending order
print(s_sorted)
# Sort the index of the series in descending order
s_sorted = s.sort_index(ascending=False)
# The sorted series will have the index sorted in descending order
print(s_sorted)
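The sample series is already ordered by both value and label, so the sketch below uses a small made-up series (s_missing is a hypothetical name) with a missing value. It shows where missing values end up in sort_values() and how the na_position parameter changes that, followed by a descending sort on the index.

```python
import pandas as pd

# Hypothetical series with one missing value
s_missing = pd.Series([30.0, None, 25.0], index=['Bob', 'Zed', 'Alice'])

# By default, missing values are placed last
na_last = s_missing.sort_values()
print(na_last.index.tolist())     # ['Alice', 'Bob', 'Zed']

# na_position='first' places missing values at the top instead
na_first = s_missing.sort_values(na_position='first')
print(na_first.index.tolist())    # ['Zed', 'Alice', 'Bob']

# sort_index() orders by label rather than by value
by_label = s_missing.sort_index(ascending=False)
print(by_label.index.tolist())    # ['Zed', 'Bob', 'Alice']
```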
Sort datetime index
To sort a Pandas dataframe or series with a datetime index, you can use the sort_index() function and specify the level parameter if the index has multiple levels.
Here is an example of how to use the sort_index() function to sort a dataframe with a datetime index in ascending order:
import pandas as pd
# Load a sample dataframe with a datetime index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45]},
index=pd.date_range('2022-01-01', periods=5))
# Sort the index of the dataframe in ascending order
df_sorted = df.sort_index()
# The sorted dataframe will have the index sorted in ascending order
print(df_sorted)
Output:
Name Age
2022-01-01 Alice 25
2022-01-02 Bob 30
2022-01-03 Charlie 35
2022-01-04 David 40
2022-01-05 Eve 45
To sort the index in descending order, you can use the ascending parameter:
# Sort the index of the dataframe in descending order
df_sorted = df.sort_index(ascending=False)
# The sorted dataframe will have the index sorted in descending order
print(df_sorted)
Output:
Name Age
2022-01-05 Eve 45
2022-01-04 David 40
2022-01-03 Charlie 35
2022-01-02 Bob 30
2022-01-01 Alice 25
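An index built with pd.date_range() is already in chronological order. The sketch below starts from dates that are out of order (df_dates and its values are made up for illustration) and checks the index before and after sorting with is_monotonic_increasing.

```python
import pandas as pd

# Hypothetical dataframe whose datetime index is out of order
dates = pd.to_datetime(['2022-01-03', '2022-01-01', '2022-01-02'])
df_dates = pd.DataFrame({'Name': ['Charlie', 'Alice', 'Bob']}, index=dates)
print(df_dates.index.is_monotonic_increasing)    # False

# sort_index() puts the rows in chronological order
df_chrono = df_dates.sort_index()
print(df_chrono.index.is_monotonic_increasing)   # True
print(df_chrono['Name'].tolist())                # ['Alice', 'Bob', 'Charlie']
```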
If the dataframe has a multi-level index with a datetime index, you can use the level parameter to specify the level to sort by:
# Load a sample dataframe with a multi-level index with a datetime index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45, 25, 30, 35, 40, 45]},
index=pd.MultiIndex.from_product([pd.date_range('2022-01-01', periods=5), ['a', 'b']],
names=['Date', 'Category']))
# Sort the index of the dataframe by the second level ("Category") in descending order
df_sorted = df.sort_index(level=1, ascending=False)
# The rows are ordered by "Category" in descending order; within each category, the dates also run in descending order
print(df_sorted)
Output:
                        Name  Age
Date       Category
2022-01-05 b             Eve   45
2022-01-04 b         Charlie   35
2022-01-03 b           Alice   25
2022-01-02 b           David   40
2022-01-01 b             Bob   30
2022-01-05 a           David   40
2022-01-04 a             Bob   30
2022-01-03 a             Eve   45
2022-01-02 a         Charlie   35
2022-01-01 a           Alice   25
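When each level needs its own direction, level and ascending can both be given as lists. The sketch below uses a smaller made-up dataframe (df_levels, with a placeholder "Value" column) and sorts the dates from newest to oldest while keeping the categories in alphabetical order within each date.

```python
import pandas as pd

# Hypothetical dataframe indexed by date and category
idx = pd.MultiIndex.from_product(
    [pd.date_range('2022-01-01', periods=3), ['a', 'b']],
    names=['Date', 'Category'])
df_levels = pd.DataFrame({'Value': list(range(6))}, index=idx)

# "Date" descending, "Category" ascending within each date
df_sorted = df_levels.sort_index(level=['Date', 'Category'],
                                 ascending=[False, True])

print(df_sorted['Value'].tolist())   # [4, 5, 2, 3, 0, 1]
print(df_sorted.index.get_level_values('Category').tolist())
# ['a', 'b', 'a', 'b', 'a', 'b']
```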
Sort by date
To sort a Pandas dataframe or series by a date column, you can use the sort_values() function and specify the column name to sort by.
Here is an example of how to use the sort_values() function to sort a dataframe by a date column in ascending order:
import pandas as pd
# Load a sample dataframe with a date column
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 45],
'Date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05']})
# Convert the "Date" column to datetime
df['Date'] = pd.to_datetime(df['Date'])
# Sort the dataframe by the "Date" column in ascending order
df_sorted = df.sort_values(by='Date')
# The sorted dataframe will have the rows sorted by the "Date" column in ascending order
print(df_sorted)
Output:
Name Age Date
0 Alice 25 2022-01-01
1 Bob 30 2022-01-02
2 Charlie 35 2022-01-03
3 David 40 2022-01-04
4 Eve 45 2022-01-05
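The conversion with pd.to_datetime() matters most when the dates are stored as text in a format that does not sort chronologically. The sketch below uses made-up month/day/year strings (df_text and the "Parsed" column are hypothetical names) to compare the two orderings.

```python
import pandas as pd

# Hypothetical date strings in month/day/year format
df_text = pd.DataFrame({'Date': ['10/01/2022', '02/15/2022', '12/31/2021']})

# Sorting the raw strings compares them character by character
text_order = df_text.sort_values(by='Date')['Date'].tolist()
print(text_order)   # ['02/15/2022', '10/01/2022', '12/31/2021']

# Parsing the strings first gives a chronological sort
df_text['Parsed'] = pd.to_datetime(df_text['Date'], format='%m/%d/%Y')
date_order = df_text.sort_values(by='Parsed')['Date'].tolist()
print(date_order)   # ['12/31/2021', '02/15/2022', '10/01/2022']
```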
To sort the dataframe in descending order, you can use the ascending parameter:
# Sort the dataframe by the "Date" column in descending order
df_sorted = df.sort_values(by='Date', ascending=False)
# The sorted dataframe will have the rows sorted by the "Date" column in descending order
print(df_sorted)
Output:
Name Age Date
4 Eve 45 2022-01-05
3 David 40 2022-01-04
2 Charlie 35 2022-01-03
1 Bob 30 2022-01-02
0 Alice