# How To Remove URLs From Text In Python

To remove URLs from text in Python, you can use the re.findall() or re.sub() functions, or the urllib module. Read on for the details of each approach.

## Remove URLs from Text in Python

Suppose we have a text string that contains a URL.

Example:

myString = '''
'''

To remove URLs from the text, use one of the following methods:

### Use the findall() function

You can use the findall() function to search for URLs and then delete those URLs with the replace() function. Note that the findall() function is in the re module, so you need to import re before calling findall().

Syntax:

re.findall(regex, string)

Parameters:

• regex: the regular expression pattern to search for.
• string: string you want the regular expression to search for.

The findall() function returns a list containing the pattern matches in the string. If not found, the function returns an empty list.
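As a quick illustration of that return value, the sketch below calls findall() on one string that contains URLs and one that does not. The sample strings, the example.com and example.org addresses, and the URL_PATTERN name are placeholders chosen for this illustration.

```python
import re

# Pattern assumed for illustration: "http://" or "https://" followed by non-whitespace
URL_PATTERN = r"https?://\S+"

with_urls = "Docs at https://example.com/docs and http://example.org"
without_urls = "No links here"

found = re.findall(URL_PATTERN, with_urls)
missing = re.findall(URL_PATTERN, without_urls)

print(found)    # ['https://example.com/docs', 'http://example.org']
print(missing)  # []
```

Because an empty list is falsy, a simple `if found:` check is enough to tell whether any URL was matched.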

Example:

• Import re module.
• Create a string with the URL.
• Use the findall() function to find the URLs in the string.
• Use the replace() function to replace each URL with an empty string, which removes it from the text.

import re

# String containing URL
myString = "This is a string with a URL https://learnshareit.com/"

# Use findall() function to search for URL
search = re.findall(r'http://\S+|https://\S+', myString)

# Start from the original string and remove each URL that was found
text = myString
for url in search:
    text = text.replace(url, '')

print('String after removing URL:', text)

Output:

String after removing URL: This is a string with a URL

### Use the re.sub() function

The re module provides many functions for working with regular expressions, and re.sub() is one of the most useful.

The re.sub() function replaces every match of the pattern in the string with the given replacement and returns the modified string.

Syntax:

re.sub(pattern, repl, string, count=0)

Parameters:

• pattern: the regular expression to search for.
• repl: the replacement for each part of the string that matches the pattern.
• string: the string to search.
• count: the maximum number of replacements. If it is omitted or set to 0, every match is replaced.
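To make the effect of count concrete, here is a minimal sketch that runs re.sub() on the same string with and without a limit. The sample string and its example.com and example.org addresses are made up for illustration.

```python
import re

# Hypothetical sample text containing two URLs
sample = "First https://example.com then https://example.org end"

# count=1 replaces only the first match
first_only = re.sub(r"https?://\S+", "", sample, count=1)

# count omitted (default 0) replaces every match
all_removed = re.sub(r"https?://\S+", "", sample)

print(repr(first_only))   # 'First  then https://example.org end'
print(repr(all_removed))  # 'First  then  end'
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with the flags argument that follows it.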

Example:

• Import the re module.
• Create a string with a URL.
• Use the re.sub() function to remove the URL.

import re

# String containing URL
myString = '''
'''

# Use the re.sub function to remove URL from the string
text = re.sub(r"\S*https?:\S*", "", myString)

print('String after removing URL:', text)

Output:

String after removing URL:
Text1
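Removing a URL with re.sub() usually leaves extra spaces where the URL used to be. The sketch below wraps the substitution in a small helper that also tidies that whitespace. The remove_urls name, the sample sentence, and the simpler `https?://\S+` pattern are assumptions made for this illustration, not part of the example above.

```python
import re

def remove_urls(text):
    """Remove http/https URLs and collapse leftover spaces (illustrative helper)."""
    no_urls = re.sub(r"https?://\S+", "", text)
    # Collapse runs of spaces or tabs left behind, then trim both ends
    return re.sub(r"[ \t]+", " ", no_urls).strip()

sample = "Visit https://example.com/page?id=1 for details, or http://example.org today."
print(remove_urls(sample))  # Visit for details, or today.
```

Note that `\S+` consumes everything up to the next whitespace, so punctuation attached directly to a URL would be removed along with it.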

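### Use the urllib module

You can also use the urlparse() function from the urllib.parse module. It returns a result whose scheme attribute is non-empty when a string starts with a URL scheme such as https. Combined with the split() function, this lets you filter the URL out of the string.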

Example:

• The urllib.parse module provides the urlparse() function, which parses a URL into its components.
• Use the scheme attribute of the result to check whether a word looks like a URL.
• Use the split() function to split the string into a list of words, then keep only the words whose scheme is empty.
• Finally, use the join() function to join the remaining words.

from urllib.parse import urlparse

# String containing URL
myString = "This is a text with a URL https://learnshareit.com/"

# Keep only the words that do not parse as a URL
search = [word for word in myString.split() if not urlparse(word).scheme]

# Merge string after removing URL
text = ' '.join(search)

print('String after removing URL:', text)

Output:

String after removing URL: This is a text with a URL
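Checking only the scheme attribute can be too lenient, because urlparse() may report a scheme for ordinary words that end in a colon, such as "Note:". A stricter variant, sketched below, accepts a word as a URL only when the scheme is http or https and a host part is present. The is_web_url helper and the sample sentence are hypothetical names introduced for this illustration.

```python
from urllib.parse import urlparse

def is_web_url(token):
    """Return True only for tokens with an http/https scheme and a host part."""
    parsed = urlparse(token)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

sample = "Note: see https://example.com/ for details"

# Keep every word that is not recognised as a web URL
kept = [word for word in sample.split() if not is_web_url(word)]
print(" ".join(kept))  # Note: see for details
```

Like the original approach, this rebuilds the string with single spaces, so any original line breaks or repeated spaces are not preserved.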