2021-11-14 Pandas

[Pandas] How to Process Column as String

This article shows how to apply string operations to every value of a pandas column at once, using the column's str accessor.

First, we prepare some data for the demonstration.

Preparing data

import pandas as pd

df = pd.DataFrame({
    "name": [" Kevin ", " Jack", "Mary ", "  Bob", "  Robert  ", "Amy  "],
    "score": [80, 90, 95, 93, 88, 81],
    "class": ["A", "B", "A", "A", "B", "B"]
}, index=["K", "J", "M", "B", "R", "A"])

df

Result

   name    score  class
K  Kevin   80     A
J  Jack    90     B
M  Mary    95     A
B  Bob     93     A
R  Robert  88     B
A  Amy     81     B

How to convert a column to lower case

We can use the lower() method of the column's str accessor to convert every value in the column to lower case.

df["name"] = df["name"].str.lower()
df.loc["R", "name"]

Result

'  robert  '
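Real columns often contain missing values. The following is a minimal sketch, using a hypothetical Series that is not part of the data above, of how the str accessor treats a missing entry: it stays missing instead of raising an error.

```python
import pandas as pd

# Hypothetical data for illustration: the second entry is missing
names = pd.Series([" Kevin ", None, "Mary "])

lowered = names.str.lower()

# The missing entry is passed through as a missing value
print(lowered.isna().tolist())  # [False, True, False]
print(repr(lowered.iloc[0]))    # ' kevin '
```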

How to convert a column to upper case

We can use the upper() method of the column's str accessor to convert every value in the column to upper case. Here we only display the result and do not assign it back, so the name column stays in lower case for the following sections.

df["name"].str.upper()

Result

K        KEVIN 
J          JACK
M         MARY 
B           BOB
R      ROBERT  
A         AMY  
Name: name, dtype: object

How to remove whitespace at the end of every string in a column

We can use the rstrip() method of the column's str accessor to remove trailing whitespace from every string in the column.

df["name"].str.rstrip()

Result

K       kevin
J        jack
M        mary
B         bob
R      robert
A         amy
Name: name, dtype: object

How to remove whitespace at the beginning of every string in a column

We can use the lstrip() method of the column's str accessor to remove leading whitespace from every string in the column.

df["name"].str.lstrip()

Result

K      kevin 
J        jack
M       mary 
B         bob
R    robert  
A       amy  
Name: name, dtype: object

How to remove whitespace at the beginning and end of every string in a column

We can use the strip() method of the column's str accessor to remove both leading and trailing whitespace from every string in the column. The result is a Series, so it can be combined with other strings directly.

df["message"] = "hello " + df["name"].str.strip()
df["message"]

Result

K     hello kevin
J      hello jack
M      hello mary
B       hello bob
R    hello robert
A       hello amy
Name: message, dtype: object
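strip() is not limited to whitespace. As a small sketch, assuming a hypothetical Series of codes padded with asterisks, passing a string of characters removes those characters from both ends instead.

```python
import pandas as pd

# Hypothetical data for illustration: codes padded with '*'
codes = pd.Series(["**A1**", "*B2", "C3***"])

# Remove every leading and trailing '*' from each value
cleaned = codes.str.strip("*")
print(cleaned.tolist())  # ['A1', 'B2', 'C3']
```

The same argument is accepted by lstrip() and rstrip().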

How to capitalize the first letter of each string in the Series

We can use the str.capitalize method to capitalize the first letter of each string in the Series. The remaining characters are converted to lower case.

df["message"].str.capitalize()

Result

K     Hello kevin
J      Hello jack
M      Hello mary
B       Hello bob
R    Hello robert
A       Hello amy
Name: message, dtype: object

How to capitalize the first letter of each word in the Series

We can use str.title to capitalize the first letter of each word in the Series.

df["message"].str.title()

Result

K     Hello Kevin
J      Hello Jack
M      Hello Mary
B       Hello Bob
R    Hello Robert
A       Hello Amy
Name: message, dtype: object
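Because each str method returns a new Series, the calls can be chained. The sketch below, on a hypothetical Series of untidy names, strips whitespace and then applies title case in one expression.

```python
import pandas as pd

# Hypothetical data for illustration: inconsistent spacing and casing
raw = pd.Series([" kevin ", "  mary ann", "BOB  "])

# Strip the whitespace first, then capitalize each word
clean = raw.str.strip().str.title()
print(clean.tolist())  # ['Kevin', 'Mary Ann', 'Bob']
```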

How to slice a substring from each string in the Series

We can use str.slice to extract a substring from each string in the Series. Here we take everything from position 6 onward, which drops the "hello " prefix.

df["message"].str.slice(6)

Result

K     kevin
J      jack
M      mary
B       bob
R    robert
A       amy
Name: message, dtype: object

You can also use Python's slice syntax on the str accessor instead of the slice method.

df["message"].str[6:]

Result

K     kevin
J      jack
M      mary
B       bob
R    robert
A       amy
Name: message, dtype: object
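Slicing on the str accessor follows the usual Python rules for start, stop, step and negative positions. A short sketch on a hypothetical two-row Series:

```python
import pandas as pd

# Hypothetical data for illustration
messages = pd.Series(["hello kevin", "hello bob"])

first_five = messages.str.slice(0, 5)         # characters 0 to 4
last_three = messages.str[-3:]                # the last three characters
every_second = messages.str.slice(0, 5, 2)    # step of 2 over the first five

print(first_five.tolist())    # ['hello', 'hello']
print(last_three.tolist())    # ['vin', 'bob']
print(every_second.tolist())  # ['hlo', 'hlo']
```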

How to check whether each value in a column contains a string

We can use str.contains to check, for every string in a column, whether it contains a given substring.
The resulting boolean Series can be used to filter the rows that meet the condition.

df["message"].str.contains("b")

Result

K    False
J    False
M    False
B     True
R     True
A    False
Name: message, dtype: bool
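Filtering rows with the boolean result can be sketched as follows. The DataFrame df_demo is a hypothetical stand-in for the data above, and case=False is used to show that the match can be made case-insensitive.

```python
import pandas as pd

# Hypothetical data for illustration
df_demo = pd.DataFrame(
    {"message": ["hello kevin", "hello BOB", "hello robert"],
     "score": [80, 93, 88]},
    index=["K", "B", "R"],
)

# Case-sensitive match: only 'hello robert' contains a lower-case 'b'
strict = df_demo[df_demo["message"].str.contains("b")]

# Case-insensitive match: 'hello BOB' is selected as well
relaxed = df_demo[df_demo["message"].str.contains("b", case=False)]

print(strict.index.tolist())   # ['R']
print(relaxed.index.tolist())  # ['B', 'R']
```

Note that str.contains treats the pattern as a regular expression by default; pass regex=False to match it literally.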

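How to check whether each value in a column starts with a string

We can use str.startswith to check, for every string in a column, whether it starts with a given prefix. The comparison is case-sensitive and begins at the very first character, so we strip the whitespace and lower-case the names first.

df["name"].str.strip().str.lower().str.startswith("k")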

Result

K     True
J    False
M    False
B    False
R    False
A    False
Name: name, dtype: bool
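When the column may contain missing values, the na parameter of str.startswith sets the result for those entries. The sketch below uses a hypothetical Series; depending on the pandas version and dtype, the default result for a missing entry may itself be missing, so passing na=False makes the mask explicitly boolean and safe to filter with.

```python
import pandas as pd

# Hypothetical data for illustration: the second entry is missing
names = pd.Series(["kevin", None, "mary"])

# Treat missing entries as "does not start with the prefix"
mask = names.str.startswith("k", na=False)

print(mask.tolist())         # [True, False, False]
print(names[mask].tolist())  # ['kevin']
```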

How to split each string in a column

We can use str.split to split every string in a column. Without arguments it splits on whitespace, and each value becomes a list of substrings.

df["message"].str.split()

Result

K     [hello, kevin]
J      [hello, jack]
M      [hello, mary]
B       [hello, bob]
R    [hello, robert]
A       [hello, amy]
Name: message, dtype: object
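Lists inside a column are awkward to work with, so two follow-ups are common. The sketch below, on a hypothetical Series, uses expand=True to spread the pieces into separate columns and str[1] to pick a single piece; the column names greeting and name are chosen here only for illustration.

```python
import pandas as pd

# Hypothetical data for illustration
messages = pd.Series(["hello kevin", "hello bob"], index=["K", "B"])

# expand=True returns a DataFrame with one column per piece
parts = messages.str.split(expand=True)
parts.columns = ["greeting", "name"]
print(parts.loc["B", "name"])  # bob

# Indexing through the str accessor picks one element from each list
second = messages.str.split().str[1]
print(second.tolist())  # ['kevin', 'bob']
```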
