Different Ways to Replace Occurrences of a Substring in Python Strings

Using string methods and regexes in Python

Replace Occurrences of Substrings in Strings in Python

  1. str.replace()
  2. re.sub()
  3. re.subn()

Let’s see how to replace substrings in strings using the methods listed above.


1. Replace all occurrences of substring

Using ‘str.replace()’

Syntax: str.replace(old, new, count)

Example 1: Replace substring “two” with “one”

s1 = "one apple,two orange,two banana"
s2 = s1.replace("two", "one")
print(s2)
#Output:one apple,one orange,one banana

By default, str.replace() replaces all occurrences of “two” with “one”.
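
Two details are worth checking before moving on: str.replace() is case-sensitive, and it returns the string unchanged when nothing matches. The short sketch below illustrates both; the sample strings are made up for this illustration.

```python
s1 = "Two apple,two orange"

# "Two" (capital T) is not touched because the match is case-sensitive.
case_sensitive = s1.replace("two", "one")
print(case_sensitive)
#Output:Two apple,one orange

# No occurrence of "six", so the result equals the original string.
no_match = s1.replace("six", "one")
print(no_match)
#Output:Two apple,two orange
```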

2. Replace only the first occurrence of a substring

Using ‘str.replace’

Example 1: Replace substring “two” with “one” for the first occurrence only.

To replace only the first occurrence, pass count=1. Likewise, to replace the first two occurrences, pass count=2.

s1 = "one apple,two orange,two banana"
s2 = s1.replace("two", "one", 1)
print(s2)
#Output:one apple,one orange,two banana
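
The count=2 case mentioned above can be sketched the same way. The input string here is an assumption chosen so that it contains three occurrences of “two”.

```python
s1 = "two apple,two orange,two banana"

# Only the first two occurrences (scanning left to right) are replaced.
first_two = s1.replace("two", "one", 2)
print(first_two)
#Output:one apple,one orange,two banana

# A count larger than the number of occurrences simply replaces them all.
all_replaced = s1.replace("two", "one", 10)
print(all_replaced)
#Output:one apple,one orange,one banana
```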

3. Case-insensitive replacement.

Using ‘re.sub()’

Syntax:

re.sub(pattern, repl, string, count=0, flags=0)

Pass flags=re.IGNORECASE to make the match case-insensitive.

Example 1: Replace “An” or “an” with “one”.

import re
s1 = "An apple,an avocado"
pattern = re.compile('an', re.IGNORECASE)
s2 = pattern.sub("one", s1)
print(s2)
#Output:one apple,one avocado
  • pattern = re.compile('an', re.IGNORECASE) → defines a pattern that matches the substring “an”. The re.IGNORECASE flag makes the match case-insensitive, so it matches “an”, “AN”, “An” and “aN”.
  • s2 = pattern.sub("one", s1) → replaces every match of the pattern in s1 with “one”.
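
The same result can be had without compiling the pattern first, by calling re.sub() directly as in the syntax line above. A minimal sketch follows. Note that flags is passed by keyword: the fourth positional parameter of re.sub() is count, so a flag passed positionally would be read as a count.

```python
import re

s1 = "An apple,an avocado"

# flags must be given by keyword; the fourth positional slot is count.
s2 = re.sub("an", "one", s1, flags=re.IGNORECASE)
print(s2)
#Output:one apple,one avocado

# count=1 limits the replacement to the first match only.
first_only = re.sub("an", "one", s1, count=1, flags=re.IGNORECASE)
print(first_only)
#Output:one apple,an avocado
```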

Example 2: Case-insensitive replacement using re.subn()

Using ‘re.subn()’

Syntax: re.subn(pattern, repl, string, count=0, flags=0)

Same as re.sub(), but it returns a tuple (new_string, number_of_subs_made).

If we want to know the number of substitutions made, re.subn() can be used.

import re
s1 = "An apple,an avocado"
pattern = re.compile('an', re.IGNORECASE)
s2 = pattern.subn("one", s1)
print(s2)
#Output:('one apple,one avocado', 2)
  • ('one apple,one avocado', 2) → the modified string and the number of substitutions made.
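
Since re.subn() returns a tuple, it is often handy to unpack it into two variables. The sketch below does that; the variable names new_string and n_subs are just illustrative choices.

```python
import re

s1 = "An apple,an avocado"

# Unpack the (new_string, number_of_subs_made) tuple directly.
new_string, n_subs = re.subn("an", "one", s1, flags=re.IGNORECASE)
print(new_string)
#Output:one apple,one avocado
print(n_subs)
#Output:2

# When nothing matches, the string comes back unchanged with a count of 0.
unchanged, zero = re.subn("xyz", "one", s1)
print(zero)
#Output:0
```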

4. Avoid replacement on parts of words.

Example 1: Replace “an” with “one”, but not where “an” is part of a longer word.

If we use str.replace(), “an” inside “orange” also gets replaced.

s1="an apple,an orange"
s2=s1.replace("an","one")
print(s2)
#Output:one apple,one oronege

To avoid replacement on parts of words, re.sub() can be used with the word-boundary marker \b.

import re
s1 = "an apple,an orange"
pattern = re.compile(r'\ban\b')
s2 = pattern.sub("one", s1)
print(s2)
#Output:one apple,one orange
  • pattern = re.compile(r'\ban\b') → \b matches the empty string, but only at the beginning or end of a word. Since r'\ban\b' requires a word boundary both before and after “an”, it matches “an” only as a whole word. So the “an” inside “orange” is not replaced.
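
The whole-word idea can be wrapped in a small helper. This is only a sketch: replace_whole_word is a hypothetical name, not a standard-library function, and it assumes the word starts and ends with a letter, digit or underscore so that \b applies on both sides. re.escape() is used so that characters such as “.” in the word are matched literally.

```python
import re

def replace_whole_word(text, word, replacement):
    """Replace `word` in `text` only where it stands as a whole word.

    Hypothetical helper for illustration. Backslashes in `replacement`
    would be interpreted by re.sub, so plain text is assumed here.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return pattern.sub(replacement, text)

print(replace_whole_word("an apple,an orange", "an", "one"))
#Output:one apple,one orange

# The "." is escaped, so "a.b" does not match "axb".
print(replace_whole_word("a.b axb", "a.b", "X"))
#Output:X axb
```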

5. Replace multiple words by one word.

Example 1: Replace “hr” and “hour” with “Hours”

import re
s1 = "hr,hour"
pattern = re.compile('(hr|hour)')
s2 = pattern.sub("Hours", s1)
print(s2)
#Output:Hours,Hours
  • pattern = re.compile('(hr|hour)')
  • () → group
  • | → either or
  • '(hr|hour)' → matches either “hr” or “hour”
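
One thing to watch with an alternation like this is that it also matches inside longer words; “three”, for example, contains “hr”. A possible way around it, sketched below, is to combine the alternation with the word-boundary marker \b. The input string is made up for this illustration, and (?:...) is a non-capturing group used only to keep the two alternatives together.

```python
import re

s1 = "three hr,one hour"

# Without word boundaries, the "hr" inside "three" is replaced too.
loose = re.sub("(hr|hour)", "Hours", s1)
print(loose)
#Output:tHoursee Hours,one Hours

# With \b on both sides, only the standalone words are replaced.
strict = re.sub(r"\b(?:hr|hour)\b", "Hours", s1)
print(strict)
#Output:three Hours,one Hours
```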

6. Replace a specific set of characters by a single character.

Example: Replace @, #, $, % with ‘-’

import re
s1 = "1@2#3$4%5"
s2 = re.sub("[@#$%]", "-", s1)
print(s2)
#Output:1-2-3-4-5
  • [] → used to indicate a set of characters
  • "[@#$%]" → the pattern matches any one of the characters listed within []
  • re.sub("[@#$%]", "-", s1) → each matched character is replaced by ‘-’
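
Inside [], most regex metacharacters (such as $) are treated as ordinary characters, but a hyphen between two characters defines a range. If the hyphen itself should be part of the set, one common approach is to put it last, as in this small sketch (the input string and the ‘_’ replacement are assumptions for the example).

```python
import re

s1 = "1@2-3$4"

# The trailing "-" in the set is a literal hyphen, not a range.
result = re.sub("[@#$%-]", "_", s1)
print(result)
#Output:1_2_3_4
```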

7. Replace one or more occurrences of a character by a single character.

import re
s1 = "1.99,2.999,3.9999"
s2 = re.sub("[9]+", "0", s1)
print(s2)
#Output:1.0,2.0,3.0
  • s2 = re.sub("[9]+", "0", s1)
  • "[9]+" → matches one or more consecutive occurrences of 9.
  • + → matches one or more occurrences of the preceding character or set.
  • re.sub("[9]+", "0", s1) → replaces each run of one or more ‘9’ with a single ‘0’ in string s1.

Using ‘re.subn()’

import re
s1 = "1.99,2.999,3.9999"
s2 = re.subn("[9]+", "0", s1)
print(s2)
#Output:('1.0,2.0,3.0', 3)
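
The + quantifier works on a set with several characters as well, which gives a way to collapse a run of mixed symbols into one character. The sketch below compares the pattern with and without +; the input string is invented for the comparison.

```python
import re

s1 = "1@@2##$3%%%4"

# Without +, every symbol is replaced individually.
per_char = re.sub("[@#$%]", "-", s1)
print(per_char)
#Output:1--2---3---4

# With +, each run of symbols is replaced by a single "-".
per_run = re.sub("[@#$%]+", "-", s1)
print(per_run)
#Output:1-2-3-4
```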

Takeaways:

  1. To replace a fixed string, str.replace() can be used. It is typically faster than the re module for this case.
  2. If a specific pattern needs to be replaced, then re.sub() or re.subn() can be used.
  3. None of the above-mentioned methods modifies the original string. Each one returns a new string with the replacements applied.
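
The first and third takeaways can be checked with a quick experiment. This is a rough sketch using the standard timeit module: the timings depend on the machine and Python version, so no expected numbers are shown, and the iteration count of 100,000 is an arbitrary choice.

```python
import re
import timeit

s1 = "one apple,two orange,two banana"

s2 = s1.replace("two", "one")
s3 = re.sub("two", "one", s1)

# The original string is left untouched by both methods.
print(s1)
#Output:one apple,two orange,two banana
print(s2 == s3)
#Output:True

# Rough timing comparison for a fixed-string replacement.
t_replace = timeit.timeit(lambda: s1.replace("two", "one"), number=100_000)
t_sub = timeit.timeit(lambda: re.sub("two", "one", s1), number=100_000)
print(f"str.replace: {t_replace:.4f}s, re.sub: {t_sub:.4f}s")
```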

My other blogs related to string methods

Remove Whitespaces from Strings in Python

5 Ways to Find the Index of a Substring in Python

5 Different Ways to Remove Specific Characters From a String in Python
