Using Python Regex to extract phone numbers from a text file

import re

with open('lorem.txt', 'rt') as myfile:  # Open lorem.txt for reading text
    contents = myfile.read()  # Read the entire file into a string
# print(contents)  # Print the string if you want to

# Now let's extract the phone numbers from the string
reg_ex = r"\+?\d+(?:[- (]+\d+\)?)+"
print(re.findall(reg_ex, contents))  # Pass the pattern variable defined above

The code imports the re module, which provides support for regular expressions in Python.
reg_ex = r"\+?\d+(?:[- (]+\d+\)?)+" defines a regular expression pattern. Let's break it down:
\+? : Matches an optional plus sign. The backslash (\) escapes the plus sign because + otherwise has a special meaning in regular expressions.
\d+ : Matches one or more digits (\d). This is the first numeric part of the phone number.
(?:[- (]+\d+\)?)+ : A non-capturing group (?: ... ) that must match one or more times. Let's break it down further:
    [- (]+ : Matches one or more hyphens, spaces, or opening parentheses. The square brackets define a character class.
    \d+ : Matches one or more digits.
    \)? : Matches an optional closing parenthesis (\)).
The + after the non-capturing group (?:[- (]+\d+\)?) lets the regular expression match repeated occurrences of the separator-and-digits sequence, so the entire phone number is matched as one string.
re.findall(reg_ex, contents) searches for all non-overlapping matches of the pattern reg_ex in the contents string. Because the pattern contains no capturing groups, it returns a list of the full matched substrings.
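To make the breakdown above concrete, here is a minimal sketch that runs the same pattern on a short string instead of lorem.txt. The sample text and the names sample, capturing and last_groups are made up for illustration. The second pattern is a hypothetical variant with a capturing group, included only to show why the non-capturing form is the one that works with re.findall.

```python
import re

# Same pattern as in the article
reg_ex = r"\+?\d+(?:[- (]+\d+\)?)+"

# Hypothetical sample text (not the contents of lorem.txt)
sample = "Call +1 (555) 123-4567 or 555-9876 today."

# With a non-capturing group, findall returns each full match
matches = re.findall(reg_ex, sample)
print(matches)  # ['+1 (555) 123-4567', '555-9876']

# Illustrative variant: the same pattern with a capturing group.
# findall now returns only the last repetition of the group.
capturing = r"\+?\d+([- (]+\d+\)?)+"
last_groups = re.findall(capturing, sample)
print(last_groups)  # ['-4567', '-9876']
```

If a capturing group is ever needed, re.finditer together with match.group(0) still gives the full matched number.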
Tips:
\+? : The plus sign (\+) is optional (?), so this matches zero or one occurrence of it. This allows for phone numbers with or without a plus sign at the beginning, which usually indicates an international number.
\d+ : Matches one or more digits (\d). It matches a numeric portion of the phone number, such as the area code or the subscriber number.
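As a quick check of these tips, the sketch below runs the pattern on numbers with and without a leading plus sign. The sample strings are invented for illustration. It also shows two consequences of how the pattern is built, which you may want to keep in mind: the group must match at least once, so a run of digits with no separator is not matched, and anything shaped like digits plus separators (a date, for example) is matched.

```python
import re

# Same pattern as in the article
reg_ex = r"\+?\d+(?:[- (]+\d+\)?)+"

# Hypothetical sample: one number with a leading plus, one without
sample = "UK: +44 20 7946 0958, local: 020 7946 0958"
numbers = re.findall(reg_ex, sample)
print(numbers)  # ['+44 20 7946 0958', '020 7946 0958']

# The group (?: ... )+ needs at least one separator followed by digits,
# so a plain run of digits is not matched at all
no_separator = re.findall(reg_ex, "5551234")
print(no_separator)  # []

# Digits joined by hyphens match too, even when they are not a phone number
date_like = re.findall(reg_ex, "Due 2023-01-15")
print(date_like)  # ['2023-01-15']
```

Depending on the text being scanned, a follow-up filter (for instance, on the number of digits in each match) may be needed to discard such false positives.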
Written by

Data Sensei
A data analytics engineer with four years of experience working as a data engineer. Holds an MSc in Data.