When working with strings, you will sometimes need to pull out text that is enclosed in quotation marks inside a larger string. In this article, we show you how to extract strings between quotes in Python.
Extract strings between quotes in Python
To extract quoted substrings from a parent string, you can use any of the following three methods:
Combination of the replace(), startswith() and endswith() methods
In this first approach, we split the string into words and use the startswith() and endswith() methods to find the words that are wrapped in double quotes. The replace() method then removes the double quotes from each match.
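message = 'RGB stands for "red" "green" and "blue"'
# Split on whitespace, so each word becomes one token
temp = message.split()
result = []
for i in temp:
    # Keep only tokens that both start and end with a double quote
    if i.startswith('"') and i.endswith('"'):
        i = i.replace('"', "")
        result.append(i)
print(result)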
Output:
['red', 'green', 'blue']
This method is rather cumbersome because it needs three string methods and an explicit loop. It also has a clear limitation: it does not work when the quoted text contains spaces, such as a phrase or a sentence.
message = (
    '"The Starry Night", "Wheatfield with Crows" '
    'and "Café Terrace at Night" are one of the masterpieces of Vincent van Gogh'
)
temp = message.split()
result = []
for i in temp:
    # No token passes this test, because each title is spread over several tokens
    if i.startswith('"') and i.endswith('"'):
        i = i.replace('"', "")
        result.append(i)
print(result)
Output:
[]
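As you can see, the result is an empty list instead of the expected ['The Starry Night', 'Wheatfield with Crows', 'Café Terrace at Night']. The reason is that split() breaks the string at every whitespace character, so a multi-word title is spread over several tokens (for example '"The', 'Starry' and 'Night",'), and no single token both starts and ends with a double quote. You can copy this code to your computer and run it in debug mode to inspect the tokens yourself.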
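The token-by-token checks can be made visible with a small sketch like the one below. The helper name inspect_tokens is hypothetical and is not part of the original code; it only reports, for every whitespace-separated token, whether the two tests would pass. The message is shortened to the three titles for readability.

```python
def inspect_tokens(message):
    """Return (token, starts_with_quote, ends_with_quote) for each whitespace-separated token."""
    return [
        (tok, tok.startswith('"'), tok.endswith('"'))
        for tok in message.split()
    ]


message = '"The Starry Night", "Wheatfield with Crows" and "Café Terrace at Night"'

for token, starts, ends in inspect_tokens(message):
    print(token, starts, ends)

# Collect the tokens that pass both checks; for this message there are none.
matches = [tok for tok, starts, ends in inspect_tokens(message) if starts and ends]
print(matches)  # []
```

Some tokens start with a quote and others end with one, but never the same token, which is why the filter keeps nothing.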
Using the split() method
In this approach, we pass the double quote character to split() as the separator, so the string is broken at every double quote instead of at whitespace.
message = 'RGB stands for "red" "green" and "blue"'
# Split at every double quote character
temp = message.split('"')
# The quoted pieces sit at the odd indexes: 1, 3, 5, ...
final_result = temp[1::2]
print(temp)
print(final_result)
Output:
['RGB stands for ', 'red', ' ', 'green', ' and ', 'blue', '']
['red', 'green', 'blue']
We then retrieve the substrings we need by their index in the newly created list.
Specifically, in this example, we use the slice [1::2] to pick the quoted substrings out of the 'temp' list. Because every double quote switches between text outside and text inside the quotes, the first quoted substring is at index 1 and each following one is 2 positions further along (indexes 1, 3, 5). That is why we use [1::2].
Pay attention to the indexes when you use this method: the slice only returns the right pieces when every opening quote has a matching closing quote.
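As a minimal sketch of how the slicing idea could be wrapped in a reusable function, the example below adds a check on the number of pieces. The function name extract_quoted and the decision to raise ValueError are assumptions made for illustration, not part of the original method.

```python
def extract_quoted(text, quote='"'):
    """Return the pieces of text that sit between pairs of quote characters."""
    pieces = text.split(quote)
    # n quote characters produce n + 1 pieces, so an even number of pieces
    # means an odd number of quotes, i.e. one quote has no partner.
    if len(pieces) % 2 == 0:
        raise ValueError("unbalanced quote characters")
    # Quoted pieces are at the odd indexes: 1, 3, 5, ...
    return pieces[1::2]


print(extract_quoted('RGB stands for "red" "green" and "blue"'))
# ['red', 'green', 'blue']
```

Without such a check, a string with a stray quote would still produce a list, but its last element would be text that was never closed by a second quote.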
Using the re.findall() function
The last and simplest way to extract strings between quotes in Python is to use a regular expression with the findall() function from the re module.
import re

message = (
    '"The Starry Night", "Wheatfield with Crows" '
    'and "Café Terrace at Night" are one of the masterpieces of Vincent van Gogh'
)
# Capture everything between each pair of double quotes
result = re.findall(r'"([^"]*)"', message)
print(result)
Output:
['The Starry Night', 'Wheatfield with Crows', 'Café Terrace at Night']
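With this approach, you no longer need to track positions or indexes yourself.
All you need is to call re.findall() with the pattern '"([^"]*)"'. The pattern matches an opening double quote, captures zero or more characters that are not a double quote, and then matches the closing double quote. Because the pattern contains one capture group, findall() returns only the captured text, without the quotes.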
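To make the pattern easier to read, it can be written with the re.VERBOSE flag, which lets each part carry its own comment. This is only a sketch of the same expression; the constant name QUOTED is a hypothetical name chosen for the example, and the message is shortened to the three titles.

```python
import re

# Same pattern as '"([^"]*)"', spread over several lines with comments.
QUOTED = re.compile(r'''
    "          # opening double quote
    ([^"]*)    # capture group: zero or more characters that are not a double quote
    "          # closing double quote
''', re.VERBOSE)

message = '"The Starry Night", "Wheatfield with Crows" and "Café Terrace at Night"'

titles = QUOTED.findall(message)
print(titles)
# ['The Starry Night', 'Wheatfield with Crows', 'Café Terrace at Night']

# Because the group uses *, an empty pair of quotes yields an empty string.
print(QUOTED.findall('say "hi" and ""'))
# ['hi', '']
```

If empty matches are not wanted, replacing * with + in the capture group is one option, since + requires at least one character between the quotes.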
Summary
In conclusion, we have introduced three methods to extract strings between quotes in Python: combining startswith(), endswith() and replace(), slicing the result of split(), and calling re.findall(). You can choose any of them depending on your needs. However, we recommend the second and third methods because their syntax is more concise and they also handle quoted text that contains spaces.
My name is Christopher Gonzalez. I graduated from HUST two years ago with a major in IT, and I am here to assist you in learning programming languages. If you have any questions about Python, JavaScript, TypeScript, Node.js or React.js, feel free to contact me. I will back you up.