Skip to main content
Toolsbase Logo

Introduction to Regular Expressions — Common Patterns and Practical Examples

Toolsbase Editorial Team
(Updated: )
Regular ExpressionsProgrammingText Processing

What Are Regular Expressions?

Regular expressions (regex) are a special notation for describing patterns in strings. They are used in virtually every situation that requires string processing, including text searching, replacement, and validation.

Nearly all programming languages support regular expressions, so once you learn them, you can apply them across JavaScript, Python, Java, PHP, Go, and more.

Basic Metacharacters

Regular expressions are composed of ordinary characters and metacharacters (characters with special meanings). Let's start with the basic metacharacters.

Character Classes

Metacharacter Meaning Example Matches
. Any single character (except newline) a.c abc, a1c, a-c
\d Digit (0-9) \d{3} 123, 456
\D Non-digit \D+ abc, ---
\w Alphanumeric and underscore \w+ hello_123
\W Non-word character \W @, #,
\s Whitespace \s+ , \t, \n
\S Non-whitespace \S+ hello

Anchors

Anchors match positions rather than characters themselves.

Metacharacter Meaning Example Description
^ Start of line ^Hello Matches "Hello" at the start
$ End of line end$ Matches "end" at the end
\b Word boundary \bcat\b Matches "cat", not "catch"

Character Sets

Square brackets [] let you define custom character classes.

[abc]       # Any of a, b, or c
[a-z]       # Any lowercase letter
[A-Za-z]    # Any letter (upper or lower)
[0-9]       # Any digit (equivalent to \d)
[^0-9]      # Any non-digit (^ means negation)
[a-zA-Z0-9_] # Alphanumeric and underscore (equivalent to \w)

Quantifiers

Quantifiers specify how many times the preceding element should repeat.

Quantifier Meaning Example Matches
* 0 or more ab*c ac, abc, abbc
+ 1 or more ab+c abc, abbc
? 0 or 1 colou?r color, colour
{n} Exactly n \d{4} 2026
{n,} n or more \d{2,} 12, 123, 1234
{n,m} Between n and m \d{2,4} 12, 123, 1234

Greedy vs. Lazy Matching

By default, quantifiers are "greedy" — they match as many characters as possible. Adding ? after a quantifier makes it "lazy," matching as few as possible.

# Input: <div>hello</div>
<.*>      # Greedy: matches <div>hello</div> entirely
<.*?>     # Lazy: matches <div> only

Lazy matching is particularly useful when extracting HTML tags or quoted strings.

Grouping and Capturing

Parentheses () group patterns together and capture (extract) the matched portions.

Basic Grouping

(abc)+      # One or more repetitions of "abc"
(red|blue)  # "red" or "blue"

Using Capture Groups

const pattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = "2026-03-08".match(pattern);

console.log(match[1]); // "2026" (year)
console.log(match[2]); // "03"   (month)
console.log(match[3]); // "08"   (day)

Named Capture Groups

The (?<name>...) syntax lets you assign names to groups, greatly improving readability.

const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = "2026-03-08".match(pattern);

console.log(match.groups.year);  // "2026"
console.log(match.groups.month); // "03"
console.log(match.groups.day);   // "08"

Non-Capturing Groups

When you don't need to save the match result, (?:...) creates a non-capturing group. This offers a slight performance advantage.

(?:http|https)://   # Groups the protocol part without capturing it

Lookahead and Lookbehind

Lookaheads and lookbehinds specify conditions about surrounding text without including it in the match result.

Syntax Name Description
(?=...) Positive lookahead Position followed by the pattern
(?!...) Negative lookahead Position not followed by the pattern
(?<=...) Positive lookbehind Position preceded by the pattern
(?<!...) Negative lookbehind Position not preceded by the pattern
# Match digits followed by "px" (without including "px" in the match)
\d+(?=px)

# Input: "100px 200em 300px"
# Matches: "100", "300"
# Match digits preceded by "$"
(?<=\$)\d+

# Input: "$100 200 $300"
# Matches: "100", "300"

Practical Pattern Examples

Email Address Validation

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This pattern validates common email address formats. However, full RFC 5321 compliance would require a much more complex pattern. In practice, a simple format check combined with a confirmation email is the recommended approach.

Phone Number (US Format)

^(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$

This pattern handles various US phone number formats with or without country code, parentheses, and different separators.

URL Validation

^https?:\/\/[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?(\.[a-zA-Z]{2,})+([\/\w.-]*)*\/?$

ZIP Code (US)

^\d{5}(-\d{4})?$

Matches both "12345" and "12345-6789" formats.

Password Strength Check

Validates that a password is at least 8 characters long and contains at least one uppercase letter, one lowercase letter, and one digit.

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$

This pattern uses three positive lookaheads to independently check for the presence of each character type.

Regular Expression Performance

Regular expressions are powerful, but poorly written patterns can cause performance issues.

Patterns to Avoid

  • Catastrophic backtracking: Nested quantifiers like (a+)+b can cause exponentially increasing processing time with certain inputs
  • Unnecessary capturing: Use (?:...) for groups where capturing is not needed
  • Overly complex patterns: Instead of handling everything in a single regex, split the logic into multiple steps

Recommendations

  • Use regex visualization and debugging tools
  • Start with simple patterns and add conditions incrementally
  • Test with real input data and cover edge cases

Conclusion

Regular expressions are a versatile tool for text processing. By mastering metacharacters, quantifiers, grouping, and lookaheads, you can handle most practical challenges. While they may seem complex at first, learning the most commonly used patterns gradually will build your confidence. Using a regex tester to verify pattern behavior in real-time is the most efficient way to learn.