Both sub() and gsub() in R replace matched text in a string, but how do they differ? This article shows the difference between them and how to use each one effectively.
What are the sub() and gsub() functions in R?
The sub() function
The sub() function replaces only the first match in a string with another string.
Syntax:
sub(target, replacement, string)
Parameters:
- target: The substring (or pattern) to be replaced.
- replacement: The substring to put in its place.
- string: The string, or vector of strings, in which the replacement is made.
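In base R, the three arguments of sub() are named pattern, replacement, and x, which correspond to target, replacement, and string above. A minimal sketch with the named arguments (the variable names sentence and result are only illustrative):

```r
# Illustrative input for this sketch
sentence <- "one fish, two fish"

# Only the first occurrence of "fish" is replaced
result <- sub(pattern = "fish", replacement = "cat", x = sentence)
print(result)
# [1] "one cat, two fish"
```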
The gsub() function
The gsub() function finds and replaces every match in a string with another string. Because it replaces all matches, it is the natural choice for substituting or removing every occurrence of a character or set of characters.
Syntax:
gsub(target, replacement, string)
Parameters:
- target: The substring (or pattern) to be replaced.
- replacement: The substring to put in its place.
- string: The string, or vector of strings, in which the replacement is made.
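One detail worth keeping in mind is that the target is treated as a regular expression by default, so characters such as the dot have a special meaning. The sketch below, which uses an illustrative version string, shows how the fixed = TRUE argument of gsub() makes the target match literally instead:

```r
# Illustrative input for this sketch
version <- "1.2.3"

# By default "." is a regular expression that matches any character,
# so every character is replaced
as_regex <- gsub(".", "-", version)
print(as_regex)
# [1] "-----"

# With fixed = TRUE the dot is matched literally
as_literal <- gsub(".", "-", version, fixed = TRUE)
print(as_literal)
# [1] "1-2-3"
```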
How to use the sub() and gsub() functions?
Replace a substring in a string
Take a simple example to compare the sub() and gsub() functions. In the example, we replace the lowercase character i with the uppercase I. On the first string we use sub(), so only the first i is replaced; on the second string we use gsub(), so every i is replaced.
Code:
string1 <- "This is a string replaced by the sub() function"
string2 <- "This is a string replaced by the gsub() function"

# Replace the first i with I using sub()
string1 <- sub("i", "I", string1)

# Replace every i with I using gsub()
string2 <- gsub("i", "I", string2)

print(string1)
print(string2)
Result:
[1] "ThIs is a string replaced by the sub() function"
[1] "ThIs Is a strIng replaced by the gsub() functIon"
Replace a substring in a data frame
We can apply the sub() and gsub() functions to a data frame column to replace substrings. Each cell in the column is treated as a separate string, so sub() replaces only the first match in each cell, while gsub() replaces every match. The following example shows the difference.
Code:
index <- c(0, 1, 2, 3)
char <- c("this is", "a", "sample", "string")
df1 <- data.frame(index, char)
df2 <- data.frame(index, char)

# Replace the first s in each cell with S using sub()
df1$char <- sub("s", "S", df1$char)

# Replace every s in each cell with S using gsub()
df2$char <- gsub("s", "S", df2$char)

print("The first data frame")
print(df1)
print("The second data frame")
print(df2)
Result:
[1] "The first data frame"
  index    char
1     0 thiS is
2     1       a
3     2  Sample
4     3  String
[1] "The second data frame"
  index    char
1     0 thiS iS
2     1       a
3     2  Sample
4     3  String
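The same idea extends to several text columns at once. The following is only a sketch, assuming a hypothetical data frame df with two character columns named first and second; it applies gsub() to every character column and leaves the numeric column untouched:

```r
# Hypothetical data frame for this sketch; column names are illustrative
df <- data.frame(
  index = c(0, 1),
  first = c("this is", "sample"),
  second = c("string", "class"),
  stringsAsFactors = FALSE
)

# Find the character columns, then replace every s with S in each of them
is_text <- sapply(df, is.character)
df[is_text] <- lapply(df[is_text], function(column) gsub("s", "S", column))

print(df)
```

Selecting the columns with is.character() keeps the numeric index column from being converted to text.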
Replace a set of characters with the gsub() function
The target passed to gsub() can also describe a set of characters, written as a regular expression. The sub() function accepts the same patterns but still replaces only the first match. In this example, we remove all digits from the string by using the pattern [0-9].
Code:
string <- "Abraham Lincoln was born on February 12 1809"

# Remove all digits from the string
result <- gsub("[0-9]", "", string)

print("The strings before and after are:")
print(string)
# trimws() drops the trailing spaces left behind by the removed digits
print(trimws(result))
Result:
[1] "The strings before and after are:"
[1] "Abraham Lincoln was born on February 12 1809"
[1] "Abraham Lincoln was born on February"
Summary
In summary, both sub() and gsub() in R replace a matched substring with another string: sub() replaces only the first match, while gsub() replaces every match. Both functions accept a regular expression as the pattern.
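As a quick check of the difference, the sketch below passes the same regular expression to both functions, reusing the sentence from the earlier example. The variable names first_only and all_matches are illustrative:

```r
string <- "Abraham Lincoln was born on February 12 1809"

# sub() removes only the first run of digits ("12")
first_only <- sub("[0-9]+", "", string)

# gsub() removes every run of digits ("12" and "1809")
all_matches <- gsub("[0-9]+", "", string)

print(first_only)
print(trimws(all_matches))
```

The first result still contains 1809, while the second contains no digits at all.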

My name is Robert Collier. I graduated in IT at HUST university. My interest is learning programming languages; my strengths are Python, C, C++, and Machine Learning/Deep Learning/NLP. I will share all the knowledge I have through my articles. Hope you like them.