Python Strings And Simple Text Manipulation

Text manipulation is a cornerstone of nearly every programming language, and Python is no exception. Strings are easy to start with, yet they offer a surprising amount of depth. Let’s learn the basics of Python strings with the examples below.

Python Strings

Python uses strings (or, more specifically, str objects) to handle textual data. Like lists and tuples, these objects are a sequence type, but they are immutable and tailored for representing sequences of Unicode code points.

String Literals

String literals in Python can be written in several ways, depending on which quotation marks enclose them: single quotes, double quotes, or triple quotes.

These strings represent the same sequence of characters in Python:

>>> single = 'learnshareit'
>>> double = "learnshareit"
>>> triple_single = '''learnshareit'''
>>> triple_double = """learnshareit"""

You can confirm that they hold the same contents with the equality operator:

>>> single == double
True
>>> double == triple_single
True
>>> triple_single == triple_double
True

Most of the time, single and double quotes are interchangeable, but a string must close with the same kind of quote that opened it. Mixing them leads to a syntax error:

>>> print("this is a mixed quote')
  File "<stdin>", line 1
    print("this is a mixed quote')
          ^
SyntaxError: unterminated string literal (detected at line 1)

There are times when you need an apostrophe inside a string, such as in contractions like “it’s” or “don’t”. Enclosing such a string in single quotes produces an error:

>>> a = 'I'm Tom'
  File "<stdin>", line 1
    a = 'I'm Tom'
           ^
SyntaxError: invalid syntax

Python interprets the apostrophe here as the closing quotation mark of the string. The literal therefore ends right after the letter I, and the characters that follow are invalid syntax.

You should use double quotes to embed the apostrophe:

>>> a = "I'm Tom"

Likewise, you can use single quotes to embed double quotes, such as when quoting direct speech:

>>> b = 'He said "Do it"'

Triple-quoted literals can not only contain both single and double quotes but also span multiple lines. All whitespace is preserved in the resulting str object, including line breaks and leading or trailing spaces:

>>> print("""He said "I'm done"
... "Okay", I responded""")
He said "I'm done"
"Okay", I responded

str()

In addition to quotation marks, the str() constructor can also be used to create strings. It converts an input object into its string version. In these examples, str() creates the string version of a number and of a string literal:

>>> str(4)
'4'
>>> str('learnshareit')
'learnshareit'

If you don’t provide any object, str() returns an empty string:

>>> str()
''
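As a rough illustration of the same idea, the sketch below passes a few other built-in objects to str(). The variable names are made up for this example, and the comments show what each call returns.

```python
# str() returns the printable text form of whatever object it receives.
from_float = str(3.14)        # '3.14'
from_bool = str(True)         # 'True'
from_list = str([1, 2, 3])    # '[1, 2, 3]'
from_none = str(None)         # 'None'

# The result is always a str, so it can be combined with other strings.
message = 'Pi is roughly ' + from_float
print(message)                # Pi is roughly 3.14
```

This conversion is handy when you need to build a message out of values that are not strings yet.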

Escape Characters

Python strings support escape sequences, which start with a backslash (\). The backslash gives the character that follows it a special meaning. For example, you can put it before a single or double quote to tell Python not to treat that quote as a string delimiter:

>>> a = 'I\'m Tom'
>>> print(a)
I'm Tom

Another common use of escape characters is to break a line and start a new one with \n:

>>> a = 'Hello\nWelcome to LearnShareIT'
>>> print(a)
Hello
Welcome to LearnShareIT
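A few other escape sequences come up often. The sketch below shows two of them, \t (a tab) and \\ (a literal backslash). The sample values, including the file path, are made up for illustration.

```python
# \t inserts a single tab character between the two words.
row = 'name\tage'
print(row)
print(len(row))       # 8

# \\ produces one literal backslash, so this string has 17 characters.
path = 'C:\\temp\\notes.txt'
print(path)           # C:\temp\notes.txt
print(len(path))      # 17
```

Each escape sequence counts as one character in the resulting string, which is why len() reports fewer characters than you typed.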

Sequence Operations

You can use common sequence operations on string objects in the same way you invoke them on other sequence types like lists.

For instance, you can check for the existence of a substring in another string with the in operator:

>>> a = 'learnshareit'
>>> 'it' in a
True

The in keyword is also part of the for loop syntax, which lets you iterate through a string one character at a time:

>>> a = 'Tom'
>>> for i in a:
...     print(i)
... 
T
o
m

The plus (+) operator concatenates two strings, creating a new string:

>>> a = 'learnshareit'
>>> b = '.com'
>>> a + b
'learnshareit.com'

You can access a character or a substring with indexing and slicing:

>>> a = 'learnshareit'
>>> a[1]
'e'
>>> a[2:5]
'arn'
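Indexing and slicing also accept negative numbers and omitted bounds. The following is a small sketch of those variations on the same string; the variable names are chosen only for this example.

```python
a = 'learnshareit'

last_char = a[-1]     # 't'        (negative indexes count from the end)
first_five = a[:5]    # 'learn'    (an omitted start means index 0)
rest = a[5:]          # 'shareit'  (an omitted end means the end of the string)
last_two = a[-2:]     # 'it'

print(last_char, first_five, rest, last_two)   # t learn shareit it
```

A slice always stops just before its end index, so a[:5] and a[5:] together cover the whole string without overlapping.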

The length of a string is returned by the len() function:

>>> len(a)
12

You can even find the number of occurrences of a substring in a string with the count() method:

>>> a.count('e')
2

Additional Methods

In addition to the sequence operations above, str objects also come with several methods of their own.

split()

The built-in split() method breaks a string into substrings based on a certain separator. It returns a list whose items are those smaller strings:

>>> a = 'Tom,Donald,Noah,Oliver'
>>> a.split(',')
['Tom', 'Donald', 'Noah', 'Oliver']

If you don’t give the split() method any delimiter, it will treat consecutive whitespace as a single separator:

>>> a = 'Tom Donald  Noah    Oliver'
>>> a.split()
['Tom', 'Donald', 'Noah', 'Oliver']
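The difference between calling split() with no argument and with an explicit space is easy to miss. The sketch below compares the two and also shows the optional second argument, maxsplit, which limits how many splits are made. The sample strings are made up.

```python
a = 'Tom Donald  Noah'

# No argument: a run of whitespace counts as one separator.
by_whitespace = a.split()
print(by_whitespace)     # ['Tom', 'Donald', 'Noah']

# Explicit single space: every space separates, so the double
# space produces an empty string in the result.
by_space = a.split(' ')
print(by_space)          # ['Tom', 'Donald', '', 'Noah']

# maxsplit=1 stops after the first split and keeps the rest intact.
b = 'Tom,Donald,Noah,Oliver'
first_and_rest = b.split(',', 1)
print(first_and_rest)    # ['Tom', 'Donald,Noah,Oliver']
```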

join()

This method has the opposite purpose of split() – it concatenates multiple strings into one.

It is important to note that join() is called on the separator string and joins all the strings in the iterable you pass to it. For instance, this example joins the names in a list, using a comma and a space as the separator:

>>> names = ['Tom', 'Donald', 'Noah', 'Oliver']
>>> ', '.join(names)
'Tom, Donald, Noah, Oliver'
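One detail worth knowing is that join() only accepts strings. The sketch below, using a made-up list of numbers, shows the error you get otherwise and one common way around it with str().

```python
numbers = [1, 2, 3]

# join() accepts only strings; passing integers raises TypeError.
try:
    '-'.join(numbers)
except TypeError as error:
    print('join failed:', error)

# Converting each item with str() first makes the call succeed.
joined = '-'.join(str(n) for n in numbers)
print(joined)            # 1-2-3

# split() and join() with the same separator undo each other.
names = 'Tom, Donald, Noah'
round_trip = ', '.join(names.split(', '))
print(round_trip == names)   # True
```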

replace()

For simple string substitution, use the replace() method. Its syntax is as follows:

replace(old, new, count)

Remember that strings are immutable, so replace() can’t modify the original string. Instead, it returns a copy of the string in which occurrences of old are replaced with new. The optional count parameter limits the number of replacements: only the first count occurrences, counted from the left, are replaced. Without it, every occurrence is replaced.

>>> a = 'Copyright 2014-2021'
>>> b = a.replace('2021', '2022')
>>> print(a)
Copyright 2014-2021
>>> print(b)
Copyright 2014-2022

Notice how the original string a is still intact, even after you have run replace() on it.

strip()

It is common to remove leading or trailing characters from a string, typically whitespace. When no argument is given, the strip() method removes whitespace characters from the beginning and the end of a string:

>>> a = '    learnshareit  '
>>> a.strip()
'learnshareit'

To strip characters from only the left or the right side of the string, use lstrip() or rstrip():

>>> a.lstrip()
'learnshareit  '
>>> a.rstrip()
'    learnshareit'

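You can also tell strip() to remove certain characters. The argument is treated as a set of characters, not as a prefix or suffix: strip() keeps removing characters from both ends until it meets one that is not in the set.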

>>> a = 'learnshareit.com'
>>> a.strip('cmowz.')
'learnshareit'
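Because the argument is a set of characters rather than an exact ending, the strip family can remove more than you expect. The sketch below uses a made-up domain name to show this, along with removesuffix(), which is available in Python 3.9 and later and removes one exact suffix.

```python
site = 'mom.com'

# Every character of 'mom.com' is in the set {'.', 'c', 'o', 'm'},
# so rstrip() removes the whole string, not just the '.com' ending.
stripped = site.rstrip('.com')
print(repr(stripped))    # ''

# removesuffix() (Python 3.9+) removes the exact suffix at most once.
trimmed = site.removesuffix('.com')
print(trimmed)           # mom
```

If you are on an older Python version, slicing off the suffix after checking it with endswith() is one possible alternative.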

Letter Case Methods

There are plenty of methods you can use to change the letter case of a string object. Remember that none of them actually affects the original string. They just carry out the operations on a copy of it and return the result.

The capitalize() method returns a copy of the string in which the first character is uppercase and all the remaining characters are lowercase:

>>> a = 'goOd morNiNg'
>>> a.capitalize()
'Good morning'

To capitalize the first letter of every word within a string (and lowercase the remaining letters), use title():

>>> a.title()
'Good Morning'

This is how you can convert all characters to uppercase or lowercase:

>>> a.upper()
'GOOD MORNING'
>>> a.lower()
'good morning'
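A frequent use of these methods is comparing text without regard to letter case. The following is a minimal sketch with two made-up strings; it converts both sides to lowercase before comparing them.

```python
first = 'Good Morning'
second = 'GOOD morning'

# A plain comparison is case-sensitive.
exact_match = first == second
print(exact_match)       # False

# Lowercasing both sides first ignores the difference in case.
loose_match = first.lower() == second.lower()
print(loose_match)       # True
```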

Summary

At first glance, Python strings seem simple, but they support a wide range of operations. This article covers only the basics. There are many other string operations as well, which you can learn more about on our website.
