Python Regular Expressions (Regex)
Regular Expressions, commonly known as Regex, are powerful tools used for searching, matching, extracting, and manipulating text.
Python provides built-in support for regular expressions through the re module.
Regex is widely used for:
- Data validation
- Text searching
- Email verification
- Log analysis
- Data extraction
- Web scraping
Understanding regular expressions can significantly improve your ability to process text efficiently.
What is a Regular Expression?
A Regular Expression is a sequence of characters that defines a search pattern.
For example:
    r"python"
This pattern matches the word:
    python
Regex can be simple or extremely complex depending on the requirements.
Importing the re Module
Before using regular expressions, import Python's built-in re module:
    import re
The re.search() Function
The search() function scans a string and returns a match object for the first location where the pattern matches, or None if there is no match.
Example
    import re
    text = "Welcome to Python Programming"
    result = re.search("Python", text)
    if result:
        print("Match Found")
Output
    Match Found
The re.match() Function
The match() function checks only the beginning of a string.
Example
    import re
    text = "Python Tutorial"
    result = re.match("Python", text)
    print(result)
Output
    <re.Match object; span=(0, 6), match='Python'>
The re.findall() Function
Returns a list of all non-overlapping matches in the string.
Example
    import re
    text = "Python Java Python C++ Python"
    matches = re.findall("Python", text)
    print(matches)
Output
    ['Python', 'Python', 'Python']
The re.finditer() Function
Returns an iterator containing match objects.
Example
    import re
    text = "Python Python Python"
    for match in re.finditer("Python", text):
        print(match.start())
Output
    0
    7
    14
The re.sub() Function
Used to replace matching text.
Example
    import re
    text = "I love Java"
    new_text = re.sub("Java", "Python", text)
    print(new_text)
Output
    I love Python
The re.split() Function
Splits a string based on a pattern.
Example
    import re
    text = "Apple,Banana,Orange"
    result = re.split(",", text)
    print(result)
Output
    ['Apple', 'Banana', 'Orange']
Common Regex Metacharacters
| Symbol | Meaning |
|---|---|
| . | Any character except a newline |
| ^ | Start of string |
| $ | End of string |
| * | Zero or more occurrences |
| + | One or more occurrences |
| ? | Zero or one occurrence (optional) |
| [] | Character set |
| \| | OR operator |
| () | Grouping |
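To make the table above concrete, here is a small sketch that exercises several of these metacharacters. The sample strings and variable names are made up for illustration.

```python
import re

# "." matches any single character except a newline.
dot_matches = re.findall(r"c.t", "cat cot cut ct")

# "^" and "$" anchor the pattern to the start and end of the string.
starts_with_py = bool(re.search(r"^Py", "Python"))
ends_with_on = bool(re.search(r"on$", "Python"))

# "|" chooses between alternatives.
alternatives = re.findall(r"cat|dog", "cat, dog, bird")

# "()" groups part of a pattern so that part can be captured.
captured = re.search(r"(cat|dog)s", "hot dogs").group(1)

print(dot_matches)     # ['cat', 'cot', 'cut']
print(starts_with_py)  # True
print(ends_with_on)    # True
print(alternatives)    # ['cat', 'dog']
print(captured)        # dog
```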
Character Classes
Digits
    \d
Matches:
    0-9
Example:
    import re
    text = "Order 123"
    print(re.findall(r"\d", text))
Output:
    ['1', '2', '3']
Non-Digits
    \D
Matches any non-numeric character.
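As a brief illustration, \D can pick out everything that is not a digit. The first string mirrors the one used for \d; the phone-style string is invented for this sketch.

```python
import re

text = "Order 123"

# Each non-digit character is returned separately, including the space.
non_digits = re.findall(r"\D", text)
print(non_digits)  # ['O', 'r', 'd', 'e', 'r', ' ']

# A common use: strip everything except the digits.
digits_only = re.sub(r"\D", "", "(555) 010-9999")
print(digits_only)  # 5550109999
```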
Word Characters
    \w
Matches:
    Letters, numbers, underscore
Example:
    import re
    text = "Python_3"
    print(re.findall(r"\w", text))
Whitespace
    \s
Matches spaces, tabs, and line breaks.
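A short sketch of \s in practice, assuming a made-up string that mixes spaces, a tab, and a line break:

```python
import re

text = "one  two\tthree\nfour"

# \s+ treats any run of spaces, tabs, or line breaks as one separator.
words = re.split(r"\s+", text)
print(words)  # ['one', 'two', 'three', 'four']

# The same pattern can collapse irregular whitespace into single spaces.
normalized = re.sub(r"\s+", " ", text)
print(normalized)  # one two three four
```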
Quantifiers
Zero or More (*)
    ab*
Matches:
    a
    ab
    abb
    abbb
One or More (+)
    ab+
Matches:
    ab
    abb
    abbb
Optional (?)
    colou?r
Matches:
    color
    colour
Character Sets
Example
    import re
    text = "cat bat rat"
    matches = re.findall("[cbr]at", text)
    print(matches)
Output
    ['cat', 'bat', 'rat']
Range Matching
Example
    import re
    text = "ABC123"
    matches = re.findall("[A-Z]", text)
    print(matches)
Output
    ['A', 'B', 'C']
Email Validation Example
    import re
    email = "user@example.com"
    pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$'
    if re.match(pattern, email):
        print("Valid Email")
    else:
        print("Invalid Email")
Output
    Valid Email
Phone Number Validation
    import re
    phone = "1234567890"
    if re.match(r'^\d{10}$', phone):
        print("Valid Number")
Output
    Valid Number
Extracting Numbers from Text
    import re
    text = "Price: $250, Discount: 15%"
    numbers = re.findall(r'\d+', text)
    print(numbers)
Output
    ['250', '15']
Extracting URLs
    import re
    text = "Visit https://example.com"
    urls = re.findall(r'https?://\S+', text)
    print(urls)
Output
    ['https://example.com']
Real-World Applications of Regex
Regular expressions are commonly used for:
- Email validation
- Phone number validation
- Password checking
- Data extraction
- Log file analysis
- Search engines
- Web scraping
- Form validation
- Text processing
- Data cleaning
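As one illustration of the log file analysis item, here is a minimal sketch that pulls error messages out of log lines. The log format, the sample lines, and the names LOG_PATTERN and parse_line are all hypothetical; real logs will need a pattern adapted to their own layout.

```python
import re

# Hypothetical log lines, invented for this example.
log_lines = [
    "2024-01-15 10:32:01 ERROR Disk quota exceeded",
    "2024-01-15 10:32:05 INFO Backup finished",
    "2024-01-15 10:33:12 ERROR Connection timed out",
]

# Four groups: date, time, level, then the rest of the line as the message.
LOG_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.+)$")


def parse_line(line):
    """Return (date, time, level, message), or None if the line does not fit."""
    m = LOG_PATTERN.match(line)
    return m.groups() if m else None


parsed = [parse_line(line) for line in log_lines]
error_messages = [p[3] for p in parsed if p and p[2] == "ERROR"]
print(error_messages)  # ['Disk quota exceeded', 'Connection timed out']
```

Compiling the pattern once with re.compile() avoids repeating it for every line and gives the pattern a readable name.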
Best Practices
- Keep regex patterns simple.
- Use raw strings (r"") whenever possible.
- Test patterns thoroughly.
- Add comments for complex regex.
- Avoid overly complicated expressions.
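Two of these practices can be sketched briefly. The first part shows why raw strings matter, using the \b word-boundary escape (not covered above). The second uses the re.VERBOSE flag to put comments inside a pattern. The DATE pattern is a hypothetical example that checks only the shape of a date, not whether the date is valid.

```python
import re

# Without the r prefix, Python turns "\b" into a backspace character,
# so the pattern never sees the word-boundary escape.
without_raw = re.findall("\bcat\b", "cat concat")
with_raw = re.findall(r"\bcat\b", "cat concat")
print(without_raw)  # []
print(with_raw)     # ['cat']

# re.VERBOSE ignores whitespace in the pattern and allows "#" comments.
DATE = re.compile(
    r"""
    ^(\d{4})   # year: four digits
    -          # literal hyphen
    (\d{2})    # month: two digits
    -          # literal hyphen
    (\d{2})$   # day: two digits
    """,
    re.VERBOSE,
)

date_parts = DATE.match("2024-03-09").groups()
print(date_parts)  # ('2024', '03', '09')
```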
Common Mistakes
Forgetting Raw Strings
Wrong:
    "\d+"
Correct:
    r"\d+"
Using Match Instead of Search
match() checks only the beginning of a string.
search() scans the entire string.
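The difference is easy to see when the pattern occurs later in the string. The sketch below uses a made-up sample string, and also shows re.fullmatch(), which succeeds only when the entire string matches.

```python
import re

text = "Learn Python"

# match() fails because "Python" is not at the start of the string.
at_start = re.match("Python", text)
print(at_start)  # None

# search() finds the same pattern further along.
anywhere = re.search("Python", text)
print(anywhere.start())  # 6

# fullmatch() rejects the string because of the trailing "x".
whole = re.fullmatch(r"\d{10}", "1234567890x")
print(whole)  # None
```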
Summary
Python Regular Expressions provide a flexible and powerful way to work with text. Using the re module, developers can search, validate, replace, split, and extract data efficiently.
Mastering regex is essential for handling real-world text processing tasks and is a valuable skill for every Python developer.
Conclusion
Regular Expressions are among the most useful text-processing tools available in Python. Whether you are validating user input, extracting information from files, cleaning datasets, or building web applications, regex can save time and simplify complex tasks.
Learning regex may seem challenging at first, but with practice, it becomes an indispensable part of your Python toolkit.

