Regex Guide — Patterns & Examples | Toolsbase

What Are Regular Expressions?

Regular expressions (regex) are a special notation for describing patterns in strings. They are used in virtually every situation that requires string processing, including text searching, replacement, and validation.

Nearly all programming languages support regular expressions, so once you learn them, you can apply them across JavaScript, Python, Java, PHP, Go, and more.

Basic Metacharacters

Regular expressions are composed of ordinary characters and metacharacters (characters with special meanings). Let's start with the basic metacharacters.

Character Classes

Metacharacter	Meaning	Example	Matches
`.`	Any single character (except newline)	`a.c`	`abc`, `a1c`, `a-c`
`\d`	Digit (0-9)	`\d{3}`	`123`, `456`
`\D`	Non-digit	`\D+`	`abc`, `---`
`\w`	Alphanumeric and underscore	`\w+`	`hello_123`
`\W`	Non-word character	`\W`	`@`, `#`,
`\s`	Whitespace	`\s+`	, `\t`, `\n`
`\S`	Non-whitespace	`\S+`	`hello`

Anchors

Anchors match positions rather than characters themselves.

Metacharacter	Meaning	Example	Description
`^`	Start of line	`^Hello`	Matches "Hello" at the start
`$`	End of line	`end$`	Matches "end" at the end
`\b`	Word boundary	`\bcat\b`	Matches "cat", not "catch"

Character Sets

Square brackets [] let you define custom character classes.

[abc]       # Any of a, b, or c
[a-z]       # Any lowercase letter
[A-Za-z]    # Any letter (upper or lower)
[0-9]       # Any digit (equivalent to \d)
[^0-9]      # Any non-digit (^ means negation)
[a-zA-Z0-9_] # Alphanumeric and underscore (equivalent to \w)

Quantifiers

Quantifiers specify how many times the preceding element should repeat.

Quantifier	Meaning	Example	Matches
`*`	0 or more	`ab*c`	`ac`, `abc`, `abbc`
`+`	1 or more	`ab+c`	`abc`, `abbc`
`?`	0 or 1	`colou?r`	`color`, `colour`
`{n}`	Exactly n	`\d{4}`	`2026`
`{n,}`	n or more	`\d{2,}`	`12`, `123`, `1234`
`{n,m}`	Between n and m	`\d{2,4}`	`12`, `123`, `1234`

Greedy vs. Lazy Matching

By default, quantifiers are "greedy" — they match as many characters as possible. Adding ? after a quantifier makes it "lazy," matching as few as possible.

# Input: <div>hello</div>
<.*>      # Greedy: matches <div>hello</div> entirely
<.*?>     # Lazy: matches <div> only

Lazy matching is particularly useful when extracting HTML tags or quoted strings.

Grouping and Capturing

Parentheses () group patterns together and capture (extract) the matched portions.

Basic Grouping

(abc)+      # One or more repetitions of "abc"
(red|blue)  # "red" or "blue"

Using Capture Groups

const pattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = "2026-03-08".match(pattern);

console.log(match[1]); // "2026" (year)
console.log(match[2]); // "03"   (month)
console.log(match[3]); // "08"   (day)

Named Capture Groups

The (?<name>...) syntax lets you assign names to groups, greatly improving readability.

const pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const match = "2026-03-08".match(pattern);

console.log(match.groups.year);  // "2026"
console.log(match.groups.month); // "03"
console.log(match.groups.day);   // "08"

Non-Capturing Groups

When you don't need to save the match result, (?:...) creates a non-capturing group. This offers a slight performance advantage.

(?:http|https)://   # Groups the protocol part without capturing it

Lookahead and Lookbehind

Lookaheads and lookbehinds specify conditions about surrounding text without including it in the match result.

Syntax	Name	Description
`(?=...)`	Positive lookahead	Position followed by the pattern
`(?!...)`	Negative lookahead	Position not followed by the pattern
`(?<=...)`	Positive lookbehind	Position preceded by the pattern
`(?<!...)`	Negative lookbehind	Position not preceded by the pattern

# Match digits followed by "px" (without including "px" in the match)
\d+(?=px)

# Input: "100px 200em 300px"
# Matches: "100", "300"

# Match digits preceded by "$"
(?<=\$)\d+

# Input: "$100 200 $300"
# Matches: "100", "300"

Practical Pattern Examples

Email Address Validation

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$

This pattern validates common email address formats. However, full RFC 5321 compliance would require a much more complex pattern. In practice, a simple format check combined with a confirmation email is the recommended approach.

Phone Number (US Format)

^(\+1)?[-.\s]?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$

This pattern handles various US phone number formats with or without country code, parentheses, and different separators.

URL Validation

^https?:\/\/[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?(\.[a-zA-Z]{2,})+([\/\w.-]*)*\/?$

ZIP Code (US)

^\d{5}(-\d{4})?$

Matches both "12345" and "12345-6789" formats.

Password Strength Check

Validates that a password is at least 8 characters long and contains at least one uppercase letter, one lowercase letter, and one digit.

^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)[a-zA-Z\d@$!%*?&]{8,}$

This pattern uses three positive lookaheads to independently check for the presence of each character type.

Regular Expression Performance

Regular expressions are powerful, but poorly written patterns can cause performance issues.

Patterns to Avoid

Catastrophic backtracking: Nested quantifiers like (a+)+b can cause exponentially increasing processing time with certain inputs
Unnecessary capturing: Use (?:...) for groups where capturing is not needed
Overly complex patterns: Instead of handling everything in a single regex, split the logic into multiple steps

Recommendations

Use regex visualization and debugging tools
Start with simple patterns and add conditions incrementally
Test with real input data and cover edge cases

Conclusion

Regular expressions are a versatile tool for text processing. By mastering metacharacters, quantifiers, grouping, and lookaheads, you can handle most practical challenges. While they may seem complex at first, learning the most commonly used patterns gradually will build your confidence. Using a regex tester to verify pattern behavior in real-time is the most efficient way to learn.