This article introduces regular expressions in the R programming language. Base R ships with several functions for searching, splitting, and replacing text with regex patterns. Read on to learn how to use them.
What is a regular expression in R?
Regular expressions, also known as “regex” or “regexp,” are used to define patterns that can be matched against a string. These patterns can be used to search for and replace specific characters within a text string. Regular expressions are a powerful tool for manipulating and processing text data.
How to use regular expressions in R?
The regular expression commands
As an example, let’s use a vector of invalid column names of the kind you might get when importing a sheet from Excel. They start with a digit, which R does not allow in a syntactic name:
# Create the data
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")
First, we select the elements of this vector that contain the word “apple.” The grep() function is the basic regex command in R. By default, it only outputs the indices of the elements that match:
# Create the data
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")
grep(pattern = "apple", x = x)
[1] 1 3
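If you want the matching strings themselves rather than their positions, grep() accepts a value argument. The snippet below is a minimal sketch that reuses the vector x from above; the variable name matches is only illustrative.

```r
# Same example vector as above
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")

# value = TRUE returns the matching elements instead of their indices
matches <- grep(pattern = "apple", x = x, value = TRUE)
print(matches)
```

Here matches holds "2_2apple" and "4_3apple", the two elements found at positions 1 and 3.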
You can also use the grepl() function, which returns a logical vector indicating which elements of a character vector match a regex pattern.
# Create the data
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")
grepl(pattern = "apple", x = x)
[1]  TRUE FALSE  TRUE FALSE
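A logical vector like this is convenient for subsetting and counting. The following is a small sketch under the same assumptions as above; the names is_apple, apples, others, and n_apple are made up for illustration.

```r
# Same example vector as above
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")

# One TRUE/FALSE per element of x
is_apple <- grepl(pattern = "apple", x = x)

apples  <- x[is_apple]    # elements that match the pattern
others  <- x[!is_apple]   # elements that do not match
n_apple <- sum(is_apple)  # TRUE counts as 1, so this is the number of matches

print(apples)
print(others)
print(n_apple)
```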
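Now, we will use the strsplit() function to split each string at the underscore. It returns a list with one character vector per input string, so we use the sapply() function with the "[" extraction operator to pull the second piece out of each list element and collect the results in a single character vector.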
# Create the data
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")
sapply(strsplit(x, split = "_"), "[", 2)
[1] "2apple"  "2banana" "3apple"  "1orange"
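The same result can also be reached with a single substitution instead of a split. This is a sketch of an alternative, not the approach used above: the pattern "^[^_]*_" is an assumption chosen for this data, and it removes everything up to and including the first underscore.

```r
# Same example vector as above
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")

# ^      anchors the match at the start of the string
# [^_]*  matches any run of characters that are not underscores
# _      matches the first underscore itself
after_underscore <- sub(pattern = "^[^_]*_", replacement = "", x = x)
print(after_underscore)
```

Because sub() returns a character vector directly, no sapply() step is needed here. The strsplit() version is the better fit when you need several of the pieces rather than just one.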
Finding and replacing in R
To perform a search and replace on all instances of a pattern, use gsub(). For example, the following code replaces the word ‘apple’ with the word ‘banana’. The pattern "apple$" matches ‘apple’ only at the end of a string, and ignore.case = TRUE makes the match case-insensitive:
# Create the data
x <- c("2_2apple", "3_2banana", "4_3apple", "5_1orange")
gsub(pattern = "apple$", replacement = "banana", x = x, ignore.case = TRUE)
[1] "2_2banana" "3_2banana" "4_3banana" "5_1orange"
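To see what the $ anchor and ignore.case contribute, it helps to try the same call on strings where they make a difference. The vector y below is made up for this illustration and is not part of the original data.

```r
# Hypothetical vector chosen to exercise the anchor and the case flag
y <- c("Apple", "apple_pie", "pineapple")

# "apple$" only matches when the string ends in "apple";
# ignore.case = TRUE lets "Apple" match as well
anchored <- gsub(pattern = "apple$", replacement = "banana", x = y,
                 ignore.case = TRUE)
print(anchored)
```

The first and third elements end in "apple" (ignoring case) and are changed to "banana" and "pinebanana", while "apple_pie" is left alone because its "apple" is not at the end of the string.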
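As another example, you can replace a run of digits inside a string. The related sub() function takes the same arguments as gsub() but replaces only the first match in each string. Check out the code example below.
First, we have the data below: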
# Create the data
str <- "The best is 42"
# View the data
str
[1] "The best is 42"
Here, we use the gsub() function to replace ‘42’ with ‘Learn Share IT’. The pattern "[0-9]+" matches one or more consecutive digits.
# Create the data
str <- "The best is 42"
# Replace the digits
gsub("[0-9]+", "Learn Share IT", str)
[1] "The best is Learn Share IT"
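The difference between sub() and gsub() only shows up when a string contains more than one match. A minimal sketch, using a made-up string s and a placeholder replacement "N":

```r
# Hypothetical string that contains two separate numbers
s <- "Room 12, floor 3"

# sub() replaces only the first run of digits
first_only <- sub(pattern = "[0-9]+", replacement = "N", x = s)

# gsub() replaces every run of digits
all_matches <- gsub(pattern = "[0-9]+", replacement = "N", x = s)

print(first_only)
print(all_matches)
```

With this input, first_only is "Room N, floor 3" and all_matches is "Room N, floor N".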
In conclusion, we have covered the basics of regular expressions in R, and we hope the examples above make them easier to understand. If you have any questions, don’t hesitate to comment below. Thanks for reading!
Have a great day!
Name of the university: HCMUS