String Operations on Pandas DataFrame

String Methods

In this article, let’s cover the string operations that can be performed on the columns of a pandas dataframe through the .str accessor.

Converting to uppercase/lowercase/title case: str.upper, str.lower, str.title
strip, lstrip, rstrip
split
replace
startswith, endswith, contains

1. Converting to uppercase/lowercase/title case

Example 1: Converting strings in a column in pandas dataframe to uppercase

import pandas as pd

s1=pd.read_csv("s1.csv")
s1.head()

s1['Place']=s1.Place.str.upper()
s1.head()

All values in the “Place” column are converted to uppercase.

Example 2: Converting strings in a column in pandas dataframe to lowercase

s1['Place']=s1.Place.str.lower()
s1.head()

Example 3: Converting strings in a column in pandas dataframe to titlecase

s1['Place']=s1.Place.str.title()
s1.head()
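The examples above read from s1.csv, which is not included with the article. As a rough, self-contained sketch, the snippet below uses a small made-up dataframe (the values in "Place" are hypothetical) to show the three case methods side by side.

```python
import pandas as pd

# Hypothetical data standing in for s1.csv, which is not shown in the article
df = pd.DataFrame({"Place": ["charlotte,nc", "DALLAS,tx", "Austin,TX"]})

upper = df["Place"].str.upper()   # every letter uppercased
lower = df["Place"].str.lower()   # every letter lowercased
title = df["Place"].str.title()   # first letter of each word uppercased

print(pd.DataFrame({"upper": upper, "lower": lower, "title": title}))
```

Note that str.title() starts a new word after any non-letter character, so a state code such as "nc" after a comma becomes "Nc" rather than "NC".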

2. strip, lstrip, rstrip

Example 1: Let’s strip both the leading and trailing ‘$’ signs from the “Salary” column.

strip → removes leading and trailing whitespace by default, or any of the characters passed as the argument.

s1['Salary']=s1.Salary.str.strip('$')
s1.head()

Example 2: Let’s strip leading ‘$’ from the “Salary” column.

lstrip → removes leading whitespace by default, or any of the characters passed as the argument.

s1['Salary']=s1.Salary.str.lstrip('$')
s1.head()

Example 3: Let’s strip the trailing ‘$’ symbol from the “Salary” column.

rstrip → removes trailing whitespace by default, or any of the characters passed as the argument.

s1['Salary']=s1.Salary.str.rstrip('$')
s1.head()
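To make the difference between the three methods easier to see, here is a minimal sketch on a made-up Series (the salary strings are hypothetical, since the contents of s1.csv are not shown). It also illustrates that the argument is treated as a set of characters to remove, not as a fixed prefix or suffix.

```python
import pandas as pd

# Hypothetical salary strings; the real s1.csv values are not shown in the article
salary = pd.Series(["$5000$", "$$6200", "7100$", " $800 "])

both = salary.str.strip("$")    # removes '$' from both ends
left = salary.str.lstrip("$")   # removes leading '$' only
right = salary.str.rstrip("$")  # removes trailing '$' only

# The argument is a set of characters: this strips any mix of '$' and spaces
clean = salary.str.strip("$ ")

print(pd.DataFrame({"both": both, "left": left, "right": right, "clean": clean}))
```

The last value, " $800 ", is left untouched by strip("$") because its outermost characters are spaces. Passing "$ " removes both kinds of character.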

3. split

If we need to split a column in the dataframe into two columns based on a delimiter string, we can use the str.split() function with expand=True.

Example 1: Splitting “Place” column into “City” and “State” columns based on delimiter string “,”

s1[['City','State']]=s1.Place.str.split(',',expand=True)
s1.head()
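Assigning the split result to exactly two columns only works when every row produces two pieces. As a hedged sketch on made-up data (the place names are hypothetical), passing n=1 limits the split to the first delimiter, so rows with extra commas still fit into two columns, and rows without a comma get a missing value in the second column.

```python
import pandas as pd

# Hypothetical data: one row has an extra comma, one has no comma at all
df = pd.DataFrame(
    {"Place": ["Charlotte,NC", "Dallas,TX", "Washington,DC,USA", "Raleigh"]}
)

# n=1 splits only on the first comma, so expand=True yields two columns
df[["City", "State"]] = df["Place"].str.split(",", n=1, expand=True)

print(df)
```

Without n=1, the third row would produce three pieces and the assignment to two columns would fail.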

4. replace

If we need to replace a substring in a column in the pandas dataframe, we can use the str.replace() function.

Example: Replacing “nc” by “North Carolina” in the “State” column

s1['State']=s1.State.str.replace('nc','North Carolina')
s1.head()
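One thing to keep in mind is that str.replace() works on substrings, and its default for the regex parameter changed between pandas versions (it was True before pandas 2.0 and is False from 2.0 onward). The sketch below, on a made-up Series, passes regex explicitly and contrasts a literal substring replacement with an anchored pattern that only matches whole values.

```python
import pandas as pd

# Hypothetical state values, including ones that merely contain "nc"
state = pd.Series(["nc", "tx", "nc.", "inc"])

# Literal substring replacement: every occurrence of "nc" is replaced
literal = state.str.replace("nc", "North Carolina", regex=False)

# Anchored regex: replace only when the whole value is exactly "nc"
exact = state.str.replace(r"^nc$", "North Carolina", regex=True)

print(pd.DataFrame({"literal": literal, "exact": exact}))
```

If the goal is to map whole values rather than substrings, the anchored form (or Series.replace, which matches entire values) may be the safer choice.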

5. startswith, endswith, contains

startswith

str.startswith(“prefix”) → Returns True for each string that starts with the given “prefix”.
We can apply this function to a column in a pandas dataframe to filter the rows whose value in that column starts with the given prefix.

Example 1: Filtering rows that start with “C” in the “Place” column

s2=s1.loc[s1.Place.str.startswith("C")]
s2
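If the column contains missing values, the boolean mask can contain missing entries too, and in many pandas versions filtering with such a mask raises an error. As a minimal sketch on made-up data, the na parameter of str.startswith() (str.endswith() accepts it as well) can be used to treat missing values as non-matches.

```python
import pandas as pd

# Hypothetical data with one missing value in "Place"
df = pd.DataFrame({"Place": ["Charlotte", "Dallas", None, "Chicago"]})

# na=False treats missing values as non-matches, so the mask is purely boolean
mask = df["Place"].str.startswith("C", na=False)
starts_with_c = df.loc[mask]

print(starts_with_c)
```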

endswith

str.endswith(“suffix”) → Returns True for each string that ends with the given “suffix”.
We can apply this function to a column in a pandas dataframe to filter the rows whose value in that column ends with the given suffix.

Example 2: Filtering rows that end with “as” in the “Place” column

s2=s1.loc[s1.Place.str.endswith("as")]
s2

contains

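str.contains(“substring”) → Returns True for each string that contains the given “substring”. By default, the argument is interpreted as a regular expression.

Example 3: Filtering rows that contain the substring “lotte” in the “Place” column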

s2=s1.loc[s1.Place.str.contains("lotte")]
s2
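Because str.contains() is case-sensitive and regex-based by default, a filter can miss rows or misread special characters. The following sketch, on made-up data, shows one possible way to do a case-insensitive, literal substring match that also ignores missing values.

```python
import pandas as pd

# Hypothetical data with mixed case and one missing value
df = pd.DataFrame({"Place": ["Charlotte", "CHARLOTTE", "Dallas", None]})

# case=False ignores letter case, regex=False matches the text literally,
# and na=False treats missing values as non-matches
mask = df["Place"].str.contains("lotte", case=False, na=False, regex=False)
matches = df.loc[mask]

print(matches)
```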

References:

https://pandas.pydata.org/pandas-docs/version/0.22/api.html#string-handling

Thanks for reading!

