Pandas >> Sort

2022-12-29 Pandas

This tutorial explains how to sort in Pandas: sorting values in ascending or descending order, sorting by multiple columns, sorting the index, sorting columns by name, sorting a series, sorting a datetime index, and sorting by a date column.

Sort values

To sort a Pandas dataframe by the values in a particular column, use the sort_values() method. It takes the column to sort by (the by parameter) and the sort order (the ascending parameter, which is True by default).

Here is an example of how to use the sort_values() function to sort a dataframe by the values in the “Age” column in ascending order:

ascending

import pandas as pd

# Load a sample dataframe
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45]})

# Sort the dataframe by the values in the "Age" column in ascending order
df_sorted = df.sort_values(by='Age')

# The sorted dataframe will have the rows in ascending order of the "Age" column
print(df_sorted)

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
4      Eve   45

descending

To sort the dataframe in descending order, set the ascending parameter to False:

# Sort the dataframe by the values in the "Age" column in descending order
df_sorted = df.sort_values(by='Age', ascending=False)

# The sorted dataframe will have the rows in descending order of the "Age" column
print(df_sorted)

Output:

      Name  Age
4      Eve   45
3    David   40
2  Charlie   35
1      Bob   30
0    Alice   25
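
The descending output keeps the original row labels (4, 3, 2, 1, 0). If fresh labels are preferred, the following is a minimal sketch, assuming pandas 1.0 or later, where sort_values() accepts the ignore_index parameter. The variable name df_desc is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45]})

# Sort by "Age" in descending order and relabel the rows 0..n-1
# (ignore_index is assumed to be available, i.e. pandas >= 1.0)
df_desc = df.sort_values(by='Age', ascending=False, ignore_index=True)

print(df_desc)
```

With ignore_index=True the first row is still Eve, but it now carries the label 0 instead of 4.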

Sort multiple columns

You can also sort the dataframe by multiple columns by passing a list of column names to the by parameter. The rows are sorted by the first column; rows with equal values there are then ordered by the second column, and so on. The ascending parameter can likewise be a list with one True or False per column.

Here is a sample of sorting values by two columns.

# Sort the dataframe by the values in the "Age" column and then by the values in the "Name" column
df_sorted = df.sort_values(by=['Age', 'Name'], ascending=[True, True])

# The sorted dataframe will have the rows first sorted by the values in the "Age" column and then by the values in the "Name" column
print(df_sorted)

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
4      Eve   45

Note that the sort_values() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
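
Because every age in the sample data is unique, the second sort key never comes into play there. The following is a small sketch with hypothetical data (df_ties is not part of the example above) in which ages repeat, so the second column and a mixed sort order become visible.

```python
import pandas as pd

# Hypothetical data with repeated ages so that the second key matters
df_ties = pd.DataFrame({'Name': ['Dan', 'Alice', 'Carol', 'Bob'],
                        'Age': [30, 25, 30, 25]})

# "Age" ascending, then "Name" descending within each age
df_ties_sorted = df_ties.sort_values(by=['Age', 'Name'], ascending=[True, False])

print(df_ties_sorted['Name'].tolist())
# ['Bob', 'Alice', 'Dan', 'Carol']
```

Both 25-year-olds come first, and within each age group the names run from Z to A.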

Sort index

To sort the index of a Pandas dataframe, you can use the sort_index() function. This function allows you to specify the sort order (ascending or descending) for the index.

Here is an example of how to use the sort_index() function to sort the index of a dataframe in ascending order:

import pandas as pd

# Load a sample dataframe with a multi-level index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45]},
                  index=pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z'), ('d', 'x'), ('e', 'y')],
                                                 names=['First', 'Second']))

# Sort the index of the dataframe in ascending order
df_sorted = df.sort_index()

# The sorted dataframe will have the index sorted in ascending order
print(df_sorted)

Output:

                 Name  Age
First Second
a     x         Alice   25
b     y           Bob   30
c     z       Charlie   35
d     x         David   40
e     y           Eve   45

To sort the index in descending order, set the ascending parameter to False:

# Sort the index of the dataframe in descending order
df_sorted = df.sort_index(ascending=False)

# The sorted dataframe will have the index sorted in descending order
print(df_sorted)

Output:

                 Name  Age
First Second
e     y           Eve   45
d     x         David   40
c     z       Charlie   35
b     y           Bob   30
a     x         Alice   25

You can also sort the index by a specific level by using the level parameter. For example, to sort the index by the second level:

# Sort the index of the dataframe by the second level
df_sorted = df.sort_index(level=1)

# The sorted dataframe will have the index sorted by the second level
print(df_sorted)

Output:

                 Name  Age
First Second
a     x         Alice   25
d     x         David   40
b     y           Bob   30
e     y           Eve   45
c     z       Charlie   35

Note that the sort_index() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
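
The sample index above is already in ascending order, so the effect of sort_index() is easier to see on an index that starts out shuffled. The following is a minimal sketch with a hypothetical dataframe (df_unsorted is an illustrative name); it also shows that the original object is left untouched.

```python
import pandas as pd

# Hypothetical dataframe whose integer index is out of order
df_unsorted = pd.DataFrame({'Name': ['Charlie', 'Alice', 'Bob']}, index=[2, 0, 1])

# sort_index() returns a new dataframe ordered by the index labels
df_by_index = df_unsorted.sort_index()

print(df_by_index)
print(df_unsorted.index.tolist())  # the original order is unchanged
```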

Sort columns by name

To sort the columns of a Pandas dataframe by column name, you can use the sort_index() function and specify the axis parameter as 1.

Here is an example of how to use the sort_index() function to sort the columns of a dataframe in ascending order:

import pandas as pd

# Load a sample dataframe
df = pd.DataFrame({'Age': [25, 30, 35, 40, 45],
                   'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve']})

# Sort the columns of the dataframe in ascending order
df_sorted = df.sort_index(axis=1)

# The sorted dataframe will have the columns sorted in ascending order
print(df_sorted)

Output:

   Age     Name
0   25    Alice
1   30      Bob
2   35  Charlie
3   40    David
4   45      Eve

To sort the columns in descending order, set the ascending parameter to False:

# Sort the columns of the dataframe in descending order
df_sorted = df.sort_index(axis=1, ascending=False)

# The sorted dataframe will have the columns sorted in descending order
print(df_sorted)

Output:

      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   35
3    David   40
4      Eve   45

Note that the sort_index() function returns a new dataframe, so you need to assign the returned dataframe to a new variable or overwrite the existing dataframe if you want to keep the sorted data.
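
With only two columns that are already in alphabetical order, the ascending case looks like a no-op. The following sketch uses a hypothetical three-column dataframe (df_cols) and, for comparison, reorders the columns by selecting them with Python's built-in sorted(). It assumes all column labels are strings.

```python
import pandas as pd

# Hypothetical dataframe with columns in no particular order
df_cols = pd.DataFrame({'Name': ['Alice', 'Bob'],
                        'City': ['Oslo', 'Rome'],
                        'Age': [25, 30]})

# Sort the column labels alphabetically
by_sort_index = df_cols.sort_index(axis=1)

# The same reordering via plain Python sorting of the labels
by_selection = df_cols[sorted(df_cols.columns)]

print(list(by_sort_index.columns))
# ['Age', 'City', 'Name']
```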

Sort a series

To sort a Pandas series by its values, you can use the sort_values() function. This function allows you to specify the sort order (ascending or descending) for the values.

Here is an example of how to use the sort_values() function to sort a series in ascending order:

import pandas as pd

# Load a sample series
s = pd.Series([25, 30, 35, 40, 45], index=['Alice', 'Bob', 'Charlie', 'David', 'Eve'])

# Sort the series in ascending order
s_sorted = s.sort_values()

# The sorted series will have the values sorted in ascending order
print(s_sorted)

Output:

Alice      25
Bob        30
Charlie    35
David      40
Eve        45
dtype: int64

To sort the series in descending order, set the ascending parameter to False:

# Sort the series in descending order
s_sorted = s.sort_values(ascending=False)

# The sorted series will have the values sorted in descending order
print(s_sorted)

Output:

Eve        45
David      40
Charlie    35
Bob        30
Alice      25
dtype: int64

Note that the sort_values() function returns a new series, so you need to assign the returned series to a new variable or overwrite the existing series if you want to keep the sorted data.

If you want to sort the series by its index instead of its values, you can use the sort_index() function. This function allows you to specify the sort order (ascending or descending) for the index.

# Sort the index of the series in ascending order
s_sorted = s.sort_index()

# The sorted series will have the index sorted in ascending order
print(s_sorted)

# Sort the index of the series in descending order
s_sorted = s.sort_index(ascending=False)

# The sorted series will have the index sorted in descending order
print(s_sorted)
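
The sample series is already ordered both by value and by label, so the two methods are hard to tell apart on it. The following is a small sketch on a hypothetical series (s_demo) whose values and labels are both shuffled.

```python
import pandas as pd

# Hypothetical series with values and labels out of order
s_demo = pd.Series([40, 25, 35], index=['David', 'Alice', 'Charlie'])

# Order by the values (ascending)
by_value = s_demo.sort_values()

# Order by the index labels (descending)
by_label = s_demo.sort_index(ascending=False)

print(by_value.index.tolist())   # ['Alice', 'Charlie', 'David']
print(by_label.index.tolist())   # ['David', 'Charlie', 'Alice']
```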

Sort datetime index

To sort a Pandas dataframe or series with a datetime index, you can use the sort_index() function and specify the level parameter if the index has multiple levels.

Here is an example of how to use the sort_index() function to sort a dataframe with a datetime index in ascending order:

import pandas as pd

# Load a sample dataframe with a datetime index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45]},
                  index=pd.date_range('2022-01-01', periods=5))

# Sort the index of the dataframe in ascending order
df_sorted = df.sort_index()

# The sorted dataframe will have the index sorted in ascending order
print(df_sorted)

Output:

               Name  Age
2022-01-01    Alice   25
2022-01-02      Bob   30
2022-01-03  Charlie   35
2022-01-04    David   40
2022-01-05      Eve   45

To sort the index in descending order, set the ascending parameter to False:

# Sort the index of the dataframe in descending order
df_sorted = df.sort_index(ascending=False)

# The sorted dataframe will have the index sorted in descending order
print(df_sorted)

Output:

               Name  Age
2022-01-05      Eve   45
2022-01-04    David   40
2022-01-03  Charlie   35
2022-01-02      Bob   30
2022-01-01    Alice   25
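
An index built with pd.date_range() is chronological from the start. The following sketch builds a hypothetical dataframe (df_ts) with timestamps out of order and uses the index attribute is_monotonic_increasing to check the order before and after sorting.

```python
import pandas as pd

# Hypothetical dataframe whose datetime index is not chronological
df_ts = pd.DataFrame({'Value': [3, 1, 2]},
                     index=pd.to_datetime(['2022-01-03', '2022-01-01', '2022-01-02']))

print(df_ts.index.is_monotonic_increasing)         # False

# Sort the rows chronologically by the datetime index
df_ts_sorted = df_ts.sort_index()

print(df_ts_sorted.index.is_monotonic_increasing)  # True
print(df_ts_sorted['Value'].tolist())              # [1, 2, 3]
```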

If the dataframe has a multi-level index that includes a datetime level, you can use the level parameter to specify the level to sort by:

# Load a sample dataframe with a multi-level index with a datetime index
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45, 25, 30, 35, 40, 45]},
                  index=pd.MultiIndex.from_product([pd.date_range('2022-01-01', periods=5), ['a', 'b']],
                                                 names=['Date', 'Category']))

# Sort the index by the second level ("Category") in descending order;
# the remaining level (the datetime "Date" level) is then sorted in descending order as well
df_sorted = df.sort_index(level=1, ascending=False)

# The sorted dataframe lists category "b" before "a", with the dates descending within each category
print(df_sorted)

Output:

                        Name  Age
Date       Category              
2022-01-05 b             Eve   45
2022-01-04 b         Charlie   35
2022-01-03 b           Alice   25
2022-01-02 b           David   40
2022-01-01 b             Bob   30
2022-01-05 a           David   40
2022-01-04 a             Bob   30
2022-01-03 a             Eve   45
2022-01-02 a         Charlie   35
2022-01-01 a           Alice   25
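
To sort by the datetime level itself, the level can be given by position (0) or by name. The following is a minimal sketch on a hypothetical dataframe (df_dates) whose dates start out shuffled; the column name Value is illustrative only.

```python
import pandas as pd

# Hypothetical multi-level index with the dates out of order
idx = pd.MultiIndex.from_tuples(
    [(pd.Timestamp('2022-01-03'), 'a'),
     (pd.Timestamp('2022-01-01'), 'b'),
     (pd.Timestamp('2022-01-02'), 'a')],
    names=['Date', 'Category'])
df_dates = pd.DataFrame({'Value': [3, 1, 2]}, index=idx)

# Sort by the "Date" level, referenced by its name, in ascending order
df_by_date = df_dates.sort_index(level='Date')

print(df_by_date)
```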

Sort by date

To sort a Pandas dataframe or series by a date column, you can use the sort_values() function and specify the column name to sort by.

Here is an example of how to use the sort_values() function to sort a dataframe by a date column in ascending order:

import pandas as pd

# Load a sample dataframe with a date column
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
                   'Age': [25, 30, 35, 40, 45],
                   'Date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04', '2022-01-05']})

# Convert the "Date" column to datetime
df['Date'] = pd.to_datetime(df['Date'])

# Sort the dataframe by the "Date" column in ascending order
df_sorted = df.sort_values(by='Date')

# The sorted dataframe will have the rows sorted by the "Date" column in ascending order
print(df_sorted)

Output:

      Name  Age       Date
0    Alice   25 2022-01-01
1      Bob   30 2022-01-02
2  Charlie   35 2022-01-03
3    David   40 2022-01-04
4      Eve   45 2022-01-05

To sort the dataframe in descending order, set the ascending parameter to False:

# Sort the dataframe by the "Date" column in descending order
df_sorted = df.sort_values(by='Date', ascending=False)

# The sorted dataframe will have the rows sorted by the "Date" column in descending order
print(df_sorted)

Output:

      Name  Age       Date
4      Eve   45 2022-01-05
3    David   40 2022-01-04
2  Charlie   35 2022-01-03
1      Bob   30 2022-01-02
0    Alice   25 2022-01-01
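
Converting the column with pd.to_datetime() matters because date strings are otherwise compared as text. The following sketch uses hypothetical data (df_raw, with month/day/year strings) to show how the text order and the chronological order can differ. The explicit format string is an assumption about how the dates are written.

```python
import pandas as pd

# Hypothetical data with dates stored as month/day/year strings
df_raw = pd.DataFrame({'Event': ['A', 'B', 'C'],
                       'Date': ['01/15/2023', '12/01/2021', '03/10/2022']})

# Sorting the raw strings compares them character by character
as_text = df_raw.sort_values(by='Date')

# Parse the strings into real datetimes, then sort chronologically
df_raw['Parsed'] = pd.to_datetime(df_raw['Date'], format='%m/%d/%Y')
as_dates = df_raw.sort_values(by='Parsed')

print(as_text['Event'].tolist())   # ['A', 'C', 'B']
print(as_dates['Event'].tolist())  # ['B', 'C', 'A']
```

ISO-style strings such as 2022-01-01 happen to sort the same way as text and as dates, which is why the problem does not show up in the example above.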
