Regex Cheat Sheet: Syntax, Patterns & Copy-Paste Examples
A practical regular expression cheat sheet — anchors, character classes, quantifiers, groups, and lookarounds — plus ready-to-use patterns for email, URLs, dates and more, and the gotchas to avoid.
Regular expressions are a tiny language for describing text patterns. The syntax is dense but small — once you have the building blocks, you can read and write most patterns. Here they are, grouped, with real examples.
Anchors & boundaries
| Token | Matches |
|---|---|
| ^ | Start of string (or of each line, in multiline mode) |
| $ | End of string (or of each line, in multiline mode) |
| \b | Word boundary (between a word char and a non-word char) |
| \B | Not a word boundary |
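A short sketch of these tokens in use, assuming JavaScript regex literals. The sample string and variable names are made up for illustration.

```javascript
// Sample text is illustrative only.
const text = "cat concat cats";

// \bcat\b: "cat" as a whole word (a boundary on both sides).
const wholeWord = text.match(/\bcat\b/g);

// No boundaries: also finds "cat" inside "concat" and "cats".
const anywhere = text.match(/cat/g);

// cat\B: "cat" immediately followed by another word character (only "cats").
const insideWord = text.match(/cat\B/g);

// ^ and $ pin the match to both ends of the string.
const exact = /^cat$/.test("cat");
const notExact = /^cat$/.test("cat concat");

console.log(wholeWord.length, anywhere.length, insideWord.length, exact, notExact);
// prints: 1 3 1 true false
```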
Character classes
| Token | Matches |
|---|---|
| . | Any character except newline |
| \d / \D | A digit / a non-digit |
| \w / \W | Word char [A-Za-z0-9_] / non-word char |
| \s / \S | Whitespace / non-whitespace |
| [abc] | Any one of a, b, c |
| [^abc] | Any char except a, b, c |
| [a-z] | Any char in the range a–z |
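To make the classes concrete, here is a small sketch, again assuming JavaScript. The order string is invented sample input.

```javascript
// Sample input is illustrative only.
const sample = "Order #A-1042 shipped on 2024-03-15";

const digitRuns = sample.match(/\d+/g); // ["1042", "2024", "03", "15"]
const wordRuns = sample.match(/\w+/g);  // runs of [A-Za-z0-9_]
const fields = sample.split(/\s+/);     // split on runs of whitespace

// Negated class: strip everything that is not a lowercase letter.
const lettersOnly = "a1-b2_c3".replace(/[^a-z]/g, "");

// Range: only the letters a through f.
const isAtoF = /^[a-f]+$/.test("cafe");

// The dot stops at a newline (unless the s flag is set).
const dotCrossesNewline = /a.b/.test("a\nb");

console.log(wordRuns.length, fields.length, lettersOnly, isAtoF, dotCrossesNewline);
// prints: 8 5 abc true false
```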
Quantifiers
| Token | Meaning |
|---|---|
| * | 0 or more |
| + | 1 or more |
| ? | 0 or 1 (optional) |
| {n} | Exactly n |
| {n,} | n or more |
| {n,m} | Between n and m, inclusive |
| *? +? ?? | Lazy versions: match as few as possible |
By default quantifiers are greedy — they grab as much as they can, then back off. Add ? to make them lazy. On <a><b>, the pattern <.*> matches the whole thing; <.*?> matches just <a>.
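The same comparison as runnable code, assuming JavaScript. The negated-class variant is added here as a third option and is often the most predictable of the three.

```javascript
const html = "<a><b>";

const greedy = html.match(/<.*>/)[0];   // "<a><b>": .* runs to the last ">"
const lazy = html.match(/<.*?>/)[0];    // "<a>": .*? stops at the first ">"
const eachTag = html.match(/<[^>]*>/g); // ["<a>", "<b>"]: no backing off needed

// Counted quantifier: {2,4} accepts 2, 3, or 4 repetitions.
const twoToFour = ["1", "12", "1234", "12345"].filter((s) => /^\d{2,4}$/.test(s));

console.log(greedy, lazy, eachTag.length, twoToFour.join(","));
// prints: <a><b> <a> 2 12,1234
```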
Groups & alternation
| Token | Meaning |
|---|---|
| (abc) | Capturing group: remembered as group 1 |
| (?:abc) | Non-capturing group: grouped but not stored |
| (?<name>abc) | Named group: refer to it as "name" |
| a(?=b) | Lookahead: "a" only if followed by "b" |
| a(?!b) | Negative lookahead: "a" only if NOT followed by "b" |
| (?<=b)a | Lookbehind: "a" only if preceded by "b" |
| (?<!b)a | Negative lookbehind: "a" only if NOT preceded by "b" |
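Alternation uses the pipe: "cat OR dog" is written cat|dog. The pipe binds more loosely than anything else, so ^cat|dog$ means "starts with cat" OR "ends with dog". Wrap the alternatives in a group to bound them: ^(cat|dog)$.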
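A sketch of groups, lookarounds, and alternation together, assuming a JavaScript engine with named-group and lookbehind support (current Node.js and modern browsers). All sample strings and variable names are illustrative.

```javascript
// Named and numbered groups (the date string is just sample input).
const m = "2024-03-15".match(/(?<year>\d{4})-(?<month>\d{2})-(\d{2})/);
const year = m.groups.year; // "2024"
const month = m[2];         // "03": named groups still get a number
const day = m[3];           // "15"

// Non-capturing group: repeated as a unit, but nothing is stored.
const repeated = "abcabc".match(/(?:abc)+/); // repeated[0] is "abcabc"

// Lookahead: digits only when "px" follows; "px" is not part of the match.
const pixels = "10px 20em 30px".match(/\d+(?=px)/g); // ["10", "30"]

// Negative lookahead: a name that does not start with "admin".
const allowedName = /^(?!admin)\w+$/;

// Lookbehind: digits only when "USD " comes right before them.
const usd = "USD 100, EUR 200".match(/(?<=USD )\d+/g); // ["100"]

// Ungrouped alternation: "starts with cat" OR "ends with dog".
const looseAlt = /^cat|dog$/.test("catalog");     // true
const groupedAlt = /^(cat|dog)$/.test("catalog"); // false

console.log(year, month, day, pixels.join(","), usd.join(","), looseAlt, groupedAlt);
// prints: 2024 03 15 10,30 100 true false
```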
Flags
| Flag | Effect |
|---|---|
| g | Global: find all matches, not just the first |
| i | Case-insensitive |
| m | Multiline: ^ and $ match at line starts/ends |
| s | Dotall: . also matches newline |
| u | Unicode mode: match by code point rather than by UTF-16 unit |
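The flag letters above are the ones JavaScript uses, so the sketch below assumes JavaScript; other engines may spell the same options differently. The log text is invented sample input.

```javascript
// Three-line sample input, illustrative only.
const log = "Error: disk\nerror: net\nok";

const firstOnly = log.match(/error/i)[0]; // "Error": i ignores case
const everyMatch = log.match(/error/gi);  // ["Error", "error"]: g finds all
const lineStarts = log.match(/^\w+/gm);   // ["Error", "error", "ok"]: m makes ^ per-line

// s lets the dot cross the newline between "disk" and "error".
const withoutS = /disk.error/.test(log);  // false
const withS = /disk.error/s.test(log);    // true

// u treats an astral character (here U+1F600) as one character, not two.
const withoutU = /^.$/.test("\u{1F600}"); // false
const withU = /^.$/u.test("\u{1F600}");   // true

console.log(firstOnly, everyMatch.length, lineStarts.length, withoutS, withS, withoutU, withU);
// prints: Error 2 3 false true false true
```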
Copy-paste patterns
These are pragmatic, not RFC-perfect: they check the shape of the input and cover the common real-world cases without becoming unreadable.
# Email (sanity check only; confirm the address by sending to it)
^[^\s@]+@[^\s@]+\.[^\s@]+$
# URL (http/https)
^https?:\/\/[^\s/$.?#].[^\s]*$
# IPv4 address (shape only: does not check that each octet is 0-255)
^(\d{1,3}\.){3}\d{1,3}$
# ISO date YYYY-MM-DD (shape only: does not check month or day ranges)
^\d{4}-\d{2}-\d{2}$
# Hex color #fff or #ffffff
^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
# Slug my-post-title
^[a-z0-9]+(?:-[a-z0-9]+)*$
# US phone, loose
^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$
# Strong-ish password: 8+, upper, lower, digit
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$
That last one is a great example of lookaheads as AND conditions: each (?=...) independently asserts “somewhere ahead there’s a lowercase / uppercase / digit” without consuming any characters, so all three checks start from the same position. Only then does .{8,} consume the string and enforce the length.
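One way to convince yourself the lookaheads really act as AND conditions is to compare the pattern with the same rule written as separate checks. The sketch below assumes JavaScript; strongishByHand is a hypothetical helper written for this comparison, and the passwords are made-up samples.

```javascript
const strongish = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$/;

// Hypothetical helper: the same rule spelled out as four separate tests.
function strongishByHand(pw) {
  return /[a-z]/.test(pw) && /[A-Z]/.test(pw) && /\d/.test(pw) && /^.{8,}$/.test(pw);
}

// Illustrative samples: one passing, then one failing each condition.
const samples = ["Passw0rd", "password1", "PASSWORD1", "Password", "Pw0rd"];
const results = samples.map((pw) => [pw, strongish.test(pw), strongishByHand(pw)]);

for (const [pw, byRegex, byHand] of results) {
  console.log(pw, byRegex, byHand);
}
// Only "Passw0rd" passes; the two columns agree on every sample.
```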
The gotchas
1. Greedy by default. If a pattern is matching too much, you probably want a lazy quantifier (*?) or a negated class ([^"]* instead of .*).
2. Escape special characters. . * + ? ( ) [ ] { } ^ $ | \ are metacharacters; / is not one, but it needs escaping wherever it delimits the pattern, as in a /.../ literal. To match a literal dot, write \.. Inside [...], most lose their special meaning, but escape ], \, ^ (if first), and - (if not at an edge).
3. Validating email with regex is a trap. The “correct” RFC 5322 email regex is monstrous and still wrong. Use the simple pattern above for a sanity check, then actually send a confirmation. Don’t reject real addresses to satisfy a regex.
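4. Catastrophic backtracking. In a backtracking engine, nested quantifiers like (a+)+$ on a long non-matching string (a run of a's followed by one other character) force the engine to try exponentially many ways of splitting the run, so the match can hang for seconds or minutes. Avoid quantifiers inside quantified groups; prefer specific classes and anchors.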
5. Anchor your validators. Without ^…$, a pattern succeeds if it matches anywhere in the input: \d{4} happily matches inside abc1234xyz. For validation, anchor both ends.
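Several of these gotchas are easy to reproduce. The sketch below assumes JavaScript; escapeRegExp is a helper name chosen here (JavaScript has traditionally needed a hand-written one), and the nested-quantifier pattern is only run on short inputs so it cannot hang.

```javascript
// 1. Greedy .* versus a negated class on quoted strings.
const quote = 'say "hi" and "bye"';
const tooMuch = quote.match(/".*"/)[0];    // '"hi" and "bye"'
const eachQuote = quote.match(/"[^"]*"/g); // ['"hi"', '"bye"']

// 2. Escape literal text before building a pattern from it.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}
const unescaped = new RegExp("^1.5$").test("1x5");                       // true: . matches "x"
const escaped = new RegExp("^" + escapeRegExp("1.5") + "$").test("1x5"); // false

// 4. (a+)+ and a+ accept the same strings; the flat form has nothing to retry.
const nested = /^(a+)+$/;
const flat = /^a+$/;
const sameAnswers = ["a", "aaa", "aab", ""].every((s) => nested.test(s) === flat.test(s));

// 5. Unanchored versus anchored.
const loose = /\d{4}/.test("abc1234xyz");    // true
const strict = /^\d{4}$/.test("abc1234xyz"); // false

console.log(tooMuch, eachQuote.length, unescaped, escaped, sameAnswers, loose, strict);
// prints: "hi" and "bye" 2 true false true true false
```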
Test it before you ship it
Regex is famously easy to get almost right. Don’t eyeball it:
- Try patterns against your real input in the Regex Tester — live highlighting of every match and group as you type.
- Grab a vetted starting point from the Regex Library instead of reinventing email/URL/date patterns.
Build the pattern, throw your edge cases at it, then paste it into your code.
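If you would rather keep the edge cases next to the code, a table of inputs and expected results works too. This is a minimal sketch, assuming JavaScript; runCases and the sample inputs are hypothetical, and the patterns are copied from the list above. Note the cases that document what the loose patterns accept on purpose.

```javascript
const email = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const url = /^https?:\/\/[^\s/$.?#].[^\s]*$/;
const ipv4 = /^(\d{1,3}\.){3}\d{1,3}$/;
const isoDate = /^\d{4}-\d{2}-\d{2}$/;
const phone = /^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$/;

// Each case: which pattern, what input, and whether it should match.
const cases = [
  { name: "email ok", re: email, input: "a@b.co", expected: true },
  { name: "email no dot", re: email, input: "a@b", expected: false },
  { name: "url ok", re: url, input: "https://example.com/a?b=1", expected: true },
  { name: "url wrong scheme", re: url, input: "ftp://example.com", expected: false },
  { name: "ipv4 ok", re: ipv4, input: "192.168.0.1", expected: true },
  { name: "ipv4 too short", re: ipv4, input: "192.168.0", expected: false },
  { name: "ipv4 shape only", re: ipv4, input: "999.999.999.999", expected: true },
  { name: "date ok", re: isoDate, input: "2024-03-15", expected: true },
  { name: "date one-digit month", re: isoDate, input: "2024-3-15", expected: false },
  { name: "date shape only", re: isoDate, input: "2024-99-99", expected: true },
  { name: "phone ok", re: phone, input: "(555) 123-4567", expected: true },
  { name: "phone too short", re: phone, input: "555-1234", expected: false },
  { name: "phone loose paren", re: phone, input: "(555-123-4567", expected: true },
];

// Returns a description of every case whose result differs from the expectation.
function runCases(list) {
  return list
    .filter((c) => c.re.test(c.input) !== c.expected)
    .map((c) => `${c.name}: ${JSON.stringify(c.input)}`);
}

const failures = runCases(cases);
console.log(failures.length === 0 ? "all cases pass" : failures);
```

When a "shape only" case bothers you, that is the signal to add a second check in code (parse the date, range-check the octets) rather than to grow the regex.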