Regular expressions (regex) are patterns used to match and manipulate text. They are a powerful tool for text-processing tasks such as validation, searching, extracting, and replacing data.
Regex syntax can look intimidating at first, but it is built from a small set of building blocks, and once you understand them you can read and write most everyday patterns.
Why Learn Regex?
- Validation: Check if a string matches a specific format (e.g., email, phone number).
- Searching: Find specific text patterns in large datasets.
- Text Manipulation: Extract, replace, or split text based on patterns.
Regex Basics
1. Literals
Literals match the exact text you specify. For example:
- cat matches cat in the text.
- hello matches hello.
2. Metacharacters
Metacharacters are special symbols with specific meanings:
Character | Description | Example |
---|---|---|
. | Matches any single character (except a newline, by default) | c.t matches cat, cut |
^ | Matches the start of a string | ^Hello matches the Hello in Hello world, but nothing in Say Hello |
$ | Matches the end of a string | world$ matches the world in Hello world |
* | Matches 0 or more repetitions of the preceding element | ca*t matches ct, cat, caaat |
+ | Matches 1 or more repetitions of the preceding element | ca+t matches cat, caaat but not ct |
? | Matches 0 or 1 repetitions of the preceding element | ca?t matches cat or ct |
\ | Escapes a special character | \. matches a literal period . |
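The rows of this table can be checked with a short C# sketch. The helper name `Matches` and the sample strings are illustrative assumptions, not part of any library; the only real API used is `Regex.IsMatch`, which reports whether a pattern matches anywhere in the input.

```csharp
using System;
using System.Text.RegularExpressions;

static class MetacharacterDemo
{
    // Hypothetical helper: true if the pattern matches anywhere in the input.
    public static bool Matches(string input, string pattern) =>
        Regex.IsMatch(input, pattern);

    static void Main()
    {
        Console.WriteLine(Matches("cut", @"c.t"));            // True: . stands in for 'u'
        Console.WriteLine(Matches("Say Hello", @"^Hello"));   // False: Hello is not at the start
        Console.WriteLine(Matches("Hello world", @"world$")); // True: world is at the end
        Console.WriteLine(Matches("ct", @"ca*t"));            // True: * allows zero 'a'
        Console.WriteLine(Matches("ct", @"ca+t"));            // False: + needs at least one 'a'
        Console.WriteLine(Matches("ct", @"ca?t"));            // True: ? allows zero or one 'a'
        Console.WriteLine(Matches("v1x0", @"1.0"));           // True: unescaped . accepts 'x'
        Console.WriteLine(Matches("v1x0", @"1\.0"));          // False: \. needs a literal period
    }
}
```

The last two lines show why escaping matters: an unescaped period quietly accepts any character in that position.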
Regex Character Classes
Character classes let you define a set of characters to match.
Class | Description | Example |
---|---|---|
[abc] | Matches a, b, or c | [aeiou] matches vowels |
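[^abc] | Matches any character except a, b, or c | [^aeiou] matches any character that is not a lowercase vowel (consonants, but also digits, spaces, and punctuation) |
[a-z] | Matches any character in the range a to z (lowercase ASCII letters) | [0-9] matches any digit |
\d | Matches any digit (0-9; some engines, including .NET, also accept other Unicode digits by default) | \d+ matches 123 |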
\D | Matches any non-digit | \D+ matches abc |
\w | Matches any word character (letters, digits, underscores) | \w+ matches hello123 |
\W | Matches any non-word character | \W+ matches !@# |
\s | Matches any whitespace character (space, tab, newline) | \s+ matches a run of spaces or tabs |
\S | Matches any non-whitespace | \S+ matches hello |
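To see what each class actually picks up, the following C# sketch lists every match of a pattern in one sample string. `FindAll` and the input "cat 42!" are assumptions made for illustration; the sketch relies only on `Regex.Matches` from the standard library.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class CharacterClassDemo
{
    // Hypothetical helper: returns the text of every match, in order.
    public static string[] FindAll(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    static void Main()
    {
        string input = "cat 42!";
        Console.WriteLine(string.Join("|", FindAll(input, @"\d+")));      // 42
        Console.WriteLine(string.Join("|", FindAll(input, @"\w+")));      // cat|42
        Console.WriteLine(string.Join("|", FindAll(input, @"[aeiou]")));  // a
        // A negated class matches everything outside the set,
        // so the space, the digits, and the '!' are all included.
        Console.WriteLine(string.Join("|", FindAll(input, @"[^aeiou]"))); // c|t| |4|2|!
    }
}
```

Note that a negated class such as [^aeiou] is broader than "consonants": it matches any character outside the set.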
Quantifiers
Quantifiers define how many times a character or group should be matched.
Quantifier | Description | Example |
---|---|---|
{n} | Matches exactly n times | \d{3} matches 123 |
{n,} | Matches n or more times | \d{3,} matches 12345 |
{n,m} | Matches between n and m times, inclusive | \d{2,4} matches 12, 123, 1234 |
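Quantifiers are greedy by default: they take as many repetitions as the limit allows. A minimal C# sketch of that behaviour follows, assuming a hypothetical helper `FirstMatch` that returns the first match or an empty string when there is none.

```csharp
using System;
using System.Text.RegularExpressions;

static class QuantifierDemo
{
    // Hypothetical helper: text of the first match, or "" when nothing matches
    // (none of the patterns below can match empty text, so "" is unambiguous).
    public static string FirstMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "";
    }

    static void Main()
    {
        Console.WriteLine(FirstMatch("12345", @"\d{3}"));   // 123: exactly three digits
        Console.WriteLine(FirstMatch("12345", @"\d{3,}"));  // 12345: three or more, takes all
        Console.WriteLine(FirstMatch("12345", @"\d{2,4}")); // 1234: stops at the upper limit
        Console.WriteLine(FirstMatch("7", @"\d{2,4}") == ""); // True: one digit is too few
        // With anchors, the whole string must consist of exactly three digits.
        Console.WriteLine(FirstMatch("12345", @"^\d{3}$") == ""); // True: five digits is too many
    }
}
```

Without anchors, \d{3} happily matches the first three digits of a longer number, which matters when the pattern is used for validation.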
Grouping and Capturing
Parentheses () group patterns and capture matched text.
- (ab)+ matches one or more repetitions of ab.
- Captured groups can be referred to using backreferences (\1, \2).
Example, where | inside the group means "or":
Regex: (cat|dog)
Input: I have a cat and a dog.
Matches: cat, dog
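The same example, plus a backreference, can be sketched in C#. The method names `FindPets` and `FindRepeatedWord` are hypothetical. Captured text is read through `Match.Groups`, where index 1 is the first pair of parentheses.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class GroupingDemo
{
    // Returns the text captured by group 1 for every match of (cat|dog).
    public static string[] FindPets(string input) =>
        Regex.Matches(input, @"(cat|dog)")
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToArray();

    // Finds a word that is immediately repeated, such as "is is".
    // \1 must match exactly the text that group 1 captured.
    // Returns "" when no repeated word is found.
    public static string FindRepeatedWord(string input)
    {
        Match m = Regex.Match(input, @"\b(\w+)\s+\1\b");
        return m.Success ? m.Groups[1].Value : "";
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", FindPets("I have a cat and a dog."))); // cat, dog
        Console.WriteLine(FindRepeatedWord("This is is a test"));                  // is
    }
}
```

The \b on each side of the repeated-word pattern keeps it from matching inside longer words, for example the "is" at the end of "This".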
Anchors
Anchors define the position in the text:
Anchor | Description | Example |
---|---|---|
^ | Start of a string | ^Hello matches Hello |
$ | End of a string | world$ matches world |
\b | Word boundary | \bcat\b matches cat but not cats |
\B | Non-word boundary | \Bcat matches the cat in scatter but not a standalone cat |
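Counting matches in one sample sentence shows how each anchor narrows the result. This is a sketch only: the helper `CountMatches` and the sample text are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Hypothetical helper: number of non-overlapping matches in the input.
    public static int CountMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Count;

    static void Main()
    {
        string text = "cat cats scatter concat";
        Console.WriteLine(CountMatches(text, @"cat"));     // 4: every occurrence
        Console.WriteLine(CountMatches(text, @"\bcat\b")); // 1: only the standalone word
        Console.WriteLine(CountMatches(text, @"\bcat"));   // 2: cat, cats
        Console.WriteLine(CountMatches(text, @"\Bcat"));   // 2: scatter, concat
        Console.WriteLine(CountMatches(text, @"^cat"));    // 1: only at the start of the string
        Console.WriteLine(CountMatches(text, @"cat$"));    // 1: only at the end (in concat)
    }
}
```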
Common Regex Patterns
Here are some common regex examples:
Task | Regex | Explanation |
---|---|---|
Match an email address (simplified) | [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} | Local part, @, domain, and a top-level domain of at least two letters; a practical approximation, not full address validation |
Match a phone number | \+?\d{1,3}[-.\s]?\(?\d{1,4}\)?[-.\s]?\d{1,4}[-.\s]?\d{1,9} | Phone numbers in various formats |
Match a URL | https?://[^\s/$.?#].[^\s]* | http or https URLs, up to the next whitespace |
Match digits only | ^\d+$ | Entire string is digits |
Match a date (DD/MM/YYYY) | \b\d{2}/\d{2}/\d{4}\b | Checks the shape of the date only; 99/99/9999 also matches |
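Most of these patterns are unanchored, so they find a match anywhere in the text. To validate a whole input instead, one option is to wrap the pattern in ^(?:...)$, sketched below in C#. The helper names are hypothetical, and (?:...) is a non-capturing group used only to keep the anchors applied to the whole pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommonPatternDemo
{
    // Patterns copied from the table of common patterns.
    const string Email = @"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}";
    const string Date = @"\b\d{2}/\d{2}/\d{4}\b";

    // Hypothetical helper: the pattern has to cover the entire input.
    // (.NET's $ also tolerates a single trailing newline.)
    public static bool IsWholeMatch(string input, string pattern) =>
        Regex.IsMatch(input, "^(?:" + pattern + ")$");

    public static bool LooksLikeEmail(string input) => IsWholeMatch(input, Email);
    public static bool LooksLikeDate(string input) => IsWholeMatch(input, Date);

    static void Main()
    {
        Console.WriteLine(LooksLikeEmail("example@domain.com"));          // True
        Console.WriteLine(LooksLikeEmail("write to example@domain.com")); // False: extra text
        Console.WriteLine(Regex.IsMatch("write to example@domain.com", Email)); // True: search mode
        Console.WriteLine(LooksLikeDate("25/12/2024")); // True
        Console.WriteLine(LooksLikeDate("99/99/9999")); // True: the shape is right, the date is not
        Console.WriteLine(LooksLikeDate("5/1/2024"));   // False: needs two-digit day and month
    }
}
```

A regex like this checks format only. Whether a date exists or an address can receive mail has to be verified by other means.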
Using Regex in Code
Here’s an example of using regex in C#:
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        string input = "My email is example@domain.com";
        // Simplified email pattern from the table of common patterns.
        string pattern = @"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}";

        // Regex.Match returns the first match found in the input.
        Match match = Regex.Match(input, pattern);
        if (match.Success)
        {
            Console.WriteLine($"Found email: {match.Value}");
        }
    }
}
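Replacing and splitting, mentioned at the start of this article, use the same `Regex` class. The sketch below assumes three hypothetical helpers built on `Regex.Replace` and `Regex.Split`. In a .NET replacement string, $1, $2, and $3 refer to the captured groups.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplaceSplitDemo
{
    // Replaces every run of digits with a single '#'.
    public static string MaskDigits(string input) =>
        Regex.Replace(input, @"\d+", "#");

    // Rewrites DD/MM/YYYY as YYYY-MM-DD by reordering the three captured groups.
    public static string ToIsoDate(string input) =>
        Regex.Replace(input, @"\b(\d{2})/(\d{2})/(\d{4})\b", "$3-$2-$1");

    // Splits on commas or semicolons, swallowing any whitespace around them.
    public static string[] SplitFields(string input) =>
        Regex.Split(input, @"\s*[,;]\s*");

    static void Main()
    {
        Console.WriteLine(MaskDigits("Room 101, floor 3")); // Room #, floor #
        Console.WriteLine(ToIsoDate("Due 25/12/2024"));     // Due 2024-12-25
        Console.WriteLine(string.Join("|", SplitFields("red, green;blue ,  yellow"))); // red|green|blue|yellow
    }
}
```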
Tools for Practicing Regex
- Regex101 – Interactive regex testing tool with explanations.
- RegExr – Test regex with examples and a cheat sheet.
- Regex Crossword – Fun regex puzzles to practice.
Tips for Mastering Regex
- Start Simple: Begin with basic patterns and gradually use advanced features.
- Use Tools: Visual tools like Regex101 can help you debug and understand patterns.
- Practice Often: Experiment with different patterns to get comfortable.
- Break Down Patterns: Analyze complex regex step-by-step.
Regex is a skill that improves with practice. Once mastered, it becomes an indispensable tool in your programming toolkit.