Pandas >> How to Process a Whole Column Like a String
In this article, we show how to apply string operations to every value in a pandas column through the column's str attribute.
First, we prepare some data for the demonstration.
Preparing data
import pandas as pd
df = pd.DataFrame({
"name": [" Kevin ", " Jack", "Mary ", " Bob", " Robert ", "Amy "],
"score": [80, 90, 95, 93, 88, 81],
"class": ["A", "B", "A", "A", "B", "B"]
}, index=["K", "J", "M", "B", "R", "A"])
df
Result
| | name | score | class |
|---|---|---|---|
| K | Kevin | 80 | A |
| J | Jack | 90 | B |
| M | Mary | 95 | A |
| B | Bob | 93 | A |
| R | Robert | 88 | B |
| A | Amy | 81 | B |
How to convert a column to lower case
We can use the lower() method of the column's str attribute to convert every value in the column to lower case.
df["name"] = df["name"].str.lower()
df.loc["R", "name"]
Result
' robert '
How to convert a column to upper case
We can use the upper() method of the column's str attribute to convert every value in the column to upper case. Here we only display the result and do not assign it back to the column.
df["name"].str.upper()
Result
K KEVIN
J JACK
M MARY
B BOB
R ROBERT
A AMY
Name: name, dtype: object
How to remove whitespace at the end of all strings in a column
We can use the rstrip() method of the column's str attribute to remove whitespace at the end of every string in a column.
df["name"].str.rstrip()
Result
K kevin
J jack
M mary
B bob
R robert
A amy
Name: name, dtype: object
How to remove whitespace at the beginning of all strings in a column
We can use the lstrip() method of the column's str attribute to remove whitespace at the beginning of every string in a column.
df["name"].str.lstrip()
Result
K kevin
J jack
M mary
B bob
R robert
A amy
Name: name, dtype: object
How to remove whitespace at the beginning and end of all strings in a column
We can use the strip() method of the column's str attribute to remove whitespace at both the beginning and the end of every string in a column.
df["message"] = "hello " + df["name"].str.strip()
df["message"]
Result
K hello kevin
J hello jack
M hello mary
B hello bob
R hello robert
A hello amy
Name: message, dtype: object
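The string methods above return a new Series and leave the column untouched unless the result is assigned back. The following is a minimal sketch of that behavior, assuming a small stand-alone Series (the names variable and its values are made up for illustration and only mirror the article's name column). It also shows that the methods can be chained.

```python
import pandas as pd

# Hypothetical sample data that mirrors the article's "name" column.
names = pd.Series([" Kevin ", " Jack", "Mary "], index=["K", "J", "M"])

# Each .str method returns a new Series, so calls can be chained.
cleaned = names.str.strip().str.lower()

print(cleaned.tolist())  # ['kevin', 'jack', 'mary']
print(names.tolist())    # the original is unchanged: [' Kevin ', ' Jack', 'Mary ']
```

To keep the cleaned values, assign the result back, for example names = names.str.strip().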
How to capitalize the first letter of each string in the Series
We can use the str.capitalize method to capitalize the first letter of each string in the Series. The remaining characters are converted to lower case.
df["message"].str.capitalize()
Result
K Hello kevin
J Hello jack
M Hello mary
B Hello bob
R Hello robert
A Hello amy
Name: message, dtype: object
How to capitalize the first letter of each word in the Series
We can use str.title to capitalize the first letter of each word in every string in the Series.
df["message"].str.title()
Result
K Hello Kevin
J Hello Jack
M Hello Mary
B Hello Bob
R Hello Robert
A Hello Amy
Name: message, dtype: object
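The difference between str.capitalize and str.title is easier to see on mixed-case input. Below is a small sketch, assuming a hypothetical Series s that is not part of the article's DataFrame.

```python
import pandas as pd

# Hypothetical mixed-case input, chosen to make the difference visible.
s = pd.Series(["hello KEVIN", "hello bob"])

# capitalize: first character upper case, everything else lower case.
capitalized = s.str.capitalize()

# title: first character of every word upper case, the rest lower case.
titled = s.str.title()

print(capitalized.tolist())  # ['Hello kevin', 'Hello bob']
print(titled.tolist())       # ['Hello Kevin', 'Hello Bob']
```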
How to slice a substring from each string in the column Series
We can use str.slice to extract a substring from each string in the column Series. Here we keep everything from position 6 onward, which drops the leading "hello ".
df["message"].str.slice(6)
Result
K kevin
J jack
M mary
B bob
R robert
A amy
Name: message, dtype: object
Of course, you can also replace the slice method with Python's slicing syntax applied to the str attribute.
df["message"].str[6:]
Result
K kevin
J jack
M mary
B bob
R robert
A amy
Name: message, dtype: object
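Both forms also accept a stop position. As a minimal sketch, assuming a hypothetical messages Series with the same shape as the message column, the first five characters can be taken either way.

```python
import pandas as pd

# Hypothetical data shaped like the article's "message" column.
messages = pd.Series(["hello kevin", "hello bob"], index=["K", "B"])

# str.slice(start, stop) and the [start:stop] syntax are interchangeable.
first_word = messages.str.slice(0, 5)
same_first_word = messages.str[:5]

# Everything from position 6 onward, as in the example above.
rest = messages.str[6:]

print(first_word.tolist())  # ['hello', 'hello']
print(rest.tolist())        # ['kevin', 'bob']
```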
How to check if a string is contained in each column value
We can use str.contains to check whether each string in a column contains a given string.
The boolean result can be used to filter the rows that meet the condition.
df["message"].str.contains("b")
Result
K False
J False
M False
B True
R True
A False
Name: message, dtype: bool
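The filtering step mentioned above can be sketched as follows. This assumes a small hypothetical DataFrame rather than the article's full data, and passes regex=False so that the pattern is treated as a literal string instead of a regular expression.

```python
import pandas as pd

# Hypothetical subset of the article's data.
df = pd.DataFrame(
    {
        "message": ["hello kevin", "hello bob", "hello robert"],
        "score": [80, 93, 88],
    },
    index=["K", "B", "R"],
)

# Boolean Series: True where the message contains the literal "b".
mask = df["message"].str.contains("b", regex=False)

# Index the DataFrame with the mask to keep only the matching rows.
matches = df[mask]

print(matches.index.tolist())  # ['B', 'R']
```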
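How to check if each value in a column starts with a string
We can use str.startswith to check whether each string in a column starts with a given string. The values in the name column still carry their leading whitespace, so we strip them first; otherwise no value would start with "k".
df["name"].str.strip().str.startswith("k")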
Result
K True
J False
M False
B False
R False
A False
Name: name, dtype: bool
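Leading whitespace counts as part of the string, so it changes what str.startswith reports. Here is a minimal sketch, assuming a hypothetical names Series whose values still have their surrounding spaces.

```python
import pandas as pd

# Hypothetical lower-cased names that still carry surrounding whitespace.
names = pd.Series([" kevin ", " jack", "mary "], index=["K", "J", "M"])

# Without stripping, " kevin " starts with a space, not with "k".
raw = names.str.startswith("k")

# Stripping first gives the intended comparison.
stripped = names.str.strip().str.startswith("k")

print(raw.tolist())       # [False, False, False]
print(stripped.tolist())  # [True, False, False]
```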
How to split the strings in a column
We can use str.split to split every string in a column. Without arguments, it splits on whitespace, and each value in the result is a list of the pieces.
df["message"].str.split()
Result
K [hello, kevin]
J [hello, jack]
M [hello, mary]
B [hello, bob]
R [hello, robert]
A [hello, amy]
Name: message, dtype: object
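The lists produced by str.split can be indexed through the str attribute again, or spread into separate columns with expand=True. The sketch below assumes a hypothetical messages Series shaped like the message column.

```python
import pandas as pd

# Hypothetical data shaped like the article's "message" column.
messages = pd.Series(["hello kevin", "hello bob"], index=["K", "B"])

# Split on whitespace; each value becomes a list of words.
parts = messages.str.split()

# Pick the second element of every list.
second_word = parts.str[1]

# expand=True returns a DataFrame with one column per piece (columns 0 and 1).
table = messages.str.split(expand=True)

print(second_word.tolist())  # ['kevin', 'bob']
print(table.shape)           # (2, 2)
```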