Regex
Regular expressions for pattern matching and text processing in various programming languages and tools.
Overview
Regular expressions (regex) are patterns used to match character combinations in strings. They’re essential for text processing, validation, searching, and data extraction across virtually all programming languages and tools.
Basic Syntax
Literal Characters
hello # Matches "hello" exactly
123 # Matches "123" exactly
Meta Characters
. # Any character except newline
^ # Start of string/line
$ # End of string/line
\ # Escape character
| # OR operator
() # Group
[] # Character class
{} # Quantifier
Character Classes
Predefined Classes
. # Any character except newline
\d # Any digit (0-9)
\D # Any non-digit
\w # Any word character (a-z, A-Z, 0-9, _)
\W # Any non-word character
\s # Any whitespace character
\S # Any non-whitespace character
Custom Classes
[abc] # Any of a, b, or c
[a-z] # Any lowercase letter
[A-Z] # Any uppercase letter
[0-9] # Any digit
[^abc] # Any character except a, b, or c
[a-zA-Z0-9] # Any alphanumeric character
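The classes above can be tried out with Python's standard re module. This is a minimal sketch; the sample strings are made up for illustration. One caveat worth knowing: in Python 3, \d and \w are Unicode-aware for str patterns unless re.ASCII is passed.

```python
import re

# Sample text is illustrative only.
text = "Bob_7 paid $42."

numbers = re.findall(r"\d+", text)             # runs of digits
words = re.findall(r"\w+", text)               # runs of word characters
capitals = re.findall(r"[A-Z]", text)          # custom class
symbols = re.findall(r"[^a-zA-Z0-9\s]", text)  # negated class

# \w matches accented letters by default; re.ASCII limits it to [a-zA-Z0-9_].
unicode_words = re.findall(r"\w+", "caf\u00e9")
ascii_words = re.findall(r"\w+", "caf\u00e9", re.ASCII)

print(numbers, words, capitals, symbols, unicode_words, ascii_words)
```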
Quantifiers
Basic Quantifiers
* # 0 or more
+ # 1 or more
? # 0 or 1 (optional)
{n} # Exactly n times
{n,} # n or more times
{n,m} # Between n and m times
Examples
a* # "", "a", "aa", "aaa", ...
a+ # "a", "aa", "aaa", ... (not empty)
a? # "" or "a"
a{3} # "aaa" exactly
a{2,4} # "aa", "aaa", or "aaaa"
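As a quick check of the quantifier examples, the sketch below uses re.fullmatch so that the whole string must fit the pattern. The helper name matches_whole is hypothetical, chosen only for this illustration.

```python
import re

def matches_whole(pattern: str, s: str) -> bool:
    """Return True if the entire string s fits the pattern."""
    return re.fullmatch(pattern, s) is not None

results = {
    "a* on ''": matches_whole(r"a*", ""),             # zero repetitions allowed
    "a+ on ''": matches_whole(r"a+", ""),             # needs at least one "a"
    "a? on 'aa'": matches_whole(r"a?", "aa"),         # at most one "a"
    "a{3} on 'aaa'": matches_whole(r"a{3}", "aaa"),   # exactly three
    "a{2,4} on 'aaaaa'": matches_whole(r"a{2,4}", "aaaaa"),  # five is too many
}

for label, ok in results.items():
    print(f"{label}: {ok}")
```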
Anchors
Position Anchors
^ # Start of string/line
$ # End of string/line
\b # Word boundary
\B # Non-word boundary
Examples
^hello # "hello" at start of line
world$ # "world" at end of line
\bcat\b # "cat" as whole word
\Bcat\B # "cat" only inside a longer word (e.g. in "concatenate")
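The anchor behavior can be sketched in Python as follows. The sample strings are invented for illustration; note that ^ only matches at each line start when re.MULTILINE is passed.

```python
import re

text = "cat concatenate bobcat"

# \b requires a word boundary on both sides, so only the standalone word matches.
whole_words = re.findall(r"\bcat\b", text)

# \B requires word characters on both sides, so only the "cat" inside
# "concatenate" qualifies ("bobcat" ends at a boundary).
inner_starts = [m.start() for m in re.finditer(r"\Bcat\B", text)]

lines = "hello world\nhello again"
first_only = re.findall(r"^hello", lines)                # start of string only
every_line = re.findall(r"^hello", lines, re.MULTILINE)  # start of each line

print(whole_words, inner_starts, first_only, every_line)
```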
Groups and Capturing
Groups
(abc) # Capture group
(?:abc) # Non-capturing group
(?P<name>abc) # Named group (Python)
(?<name>abc) # Named group (C#, Java, JavaScript, PCRE)
Backreferences
(cat)\1 # Matches "catcat"
(\w+)\s+\1 # Matches repeated words
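A short Python sketch of named groups and backreferences. It uses the Python-style (?P<name>...) syntax; the sample strings are illustrative, and the \b anchors around the repeated-word pattern are an addition to keep it from matching inside longer words.

```python
import re

# Named groups: access captures by name instead of position.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "released 2023-01")
parts = m.groupdict()

# Backreference \1 must repeat exactly what group 1 captured.
sentence = "it was the the best"
repeated = re.search(r"\b(\w+)\s+\1\b", sentence).group(1)

# The same backreference syntax works in the replacement string.
deduped = re.sub(r"\b(\w+)\s+\1\b", r"\1", sentence)

print(parts, repeated, deduped)
```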
Common Patterns
Email Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Phone Numbers
^\+?1?[-.\s]?\(?[0-9]{3}\)?[-.\s]?[0-9]{3}[-.\s]?[0-9]{4}$
URLs
^https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)$
IPv4 Address
^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$
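One way to exercise the IPv4 pattern above in Python is sketched below. The helper name is_ipv4 is hypothetical; re.fullmatch is used so a trailing newline is not silently accepted.

```python
import re

# Pattern copied from the entry above.
IPV4 = re.compile(
    r"^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$"
)

def is_ipv4(s: str) -> bool:
    """Return True if s looks like a dotted-quad IPv4 address."""
    return IPV4.fullmatch(s) is not None

for candidate in ["192.168.1.1", "10.0.0.255", "256.1.1.1", "1.2.3"]:
    print(candidate, is_ipv4(candidate))
```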
Date Formats
# MM/DD/YYYY
^(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}$
# YYYY-MM-DD
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$
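These date patterns check the shape of the string, not whether the date exists on the calendar (February 31 passes). A possible approach, sketched here with a hypothetical helper is_real_iso_date, is to combine the regex with datetime.strptime.

```python
import re
from datetime import datetime

# YYYY-MM-DD pattern copied from the entry above.
ISO_DATE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$")

def is_real_iso_date(s: str) -> bool:
    """Check the format with the regex, then the calendar with datetime."""
    if ISO_DATE.fullmatch(s) is None:
        return False
    try:
        datetime.strptime(s, "%Y-%m-%d")
    except ValueError:
        return False
    return True

shape_ok = ISO_DATE.fullmatch("2023-02-31") is not None  # regex alone accepts it
print(shape_ok, is_real_iso_date("2023-02-31"), is_real_iso_date("2023-02-28"))
```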
Credit Card Numbers
# Visa
^4[0-9]{12}(?:[0-9]{3})?$
# MasterCard (51-55 prefixes only; the newer 2221-2720 range is not covered)
^5[1-5][0-9]{14}$
# American Express
^3[47][0-9]{13}$
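The card patterns above only test prefix and length, so they suit brand detection rather than real validation (no checksum is verified). A minimal sketch, assuming a hypothetical helper card_brand that strips spaces and dashes first:

```python
import re
from typing import Optional

# Patterns copied from the entries above.
CARD_PATTERNS = {
    "Visa": re.compile(r"^4[0-9]{12}(?:[0-9]{3})?$"),
    "MasterCard": re.compile(r"^5[1-5][0-9]{14}$"),
    "American Express": re.compile(r"^3[47][0-9]{13}$"),
}

def card_brand(number: str) -> Optional[str]:
    """Return the first brand whose pattern fits, or None."""
    digits = re.sub(r"[\s-]", "", number)  # drop spaces and dashes
    for brand, pattern in CARD_PATTERNS.items():
        if pattern.fullmatch(digits):
            return brand
    return None

# Well-known test numbers, not real accounts.
for sample in ["4111 1111 1111 1111", "5500-0000-0000-0004", "378282246310005", "1234"]:
    print(sample, card_brand(sample))
```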
Language-Specific Usage
JavaScript
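// Create regex (literal form)
const regex = /pattern/flags;
// Equivalent constructor form (useful for patterns built at runtime)
const regex2 = new RegExp('pattern', 'flags');
// Test match (returns true or false)
regex.test(string);
// Find matches
string.match(regex); // Match array, or null if nothing matches
string.search(regex); // Index of the first match, or -1
string.replace(regex, replacement); // New string with the replacement applied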
Python
import re
# Compile regex
pattern = re.compile(r'regex_pattern')
# Match functions
re.match(pattern, string) # Match at beginning
re.search(pattern, string) # Find first match
re.findall(pattern, string) # Find all matches
re.sub(pattern, replacement, string) # Replace
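The difference between these functions is easiest to see on one input. The following sketch uses a made-up sample string and calls the methods on the compiled pattern, which is equivalent to passing it to the module-level functions.

```python
import re

pattern = re.compile(r"\d+")
text = "room 12, floor 3"

at_start = pattern.match(text)           # None: the text does not start with a digit
first = pattern.search(text).group()     # first match anywhere in the text
all_matches = pattern.findall(text)      # every non-overlapping match
replaced = pattern.sub("#", text)        # replace every match
whole = pattern.fullmatch("12")          # the entire string must match

print(at_start, first, all_matches, replaced, whole is not None)
```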
Bash/grep
# Basic grep
grep 'pattern' file.txt
# Extended regex
grep -E 'pattern' file.txt
egrep 'pattern' file.txt # Obsolescent alias for grep -E
# Perl-compatible regex (GNU grep; not available in every grep implementation)
grep -P 'pattern' file.txt
sed
# Replace with regex
sed 's/pattern/replacement/g' file.txt
# Extended regex
sed -E 's/pattern/replacement/g' file.txt
Flags/Modifiers
Common Flags
i # Case insensitive
g # Global (find all matches, not just the first)
m # Multiline (^ and $ match at the start and end of each line)
s # Dot matches newline
x # Extended (ignore whitespace and allow comments in the pattern)
Examples
/hello/i # Case insensitive
/hello/g # Global search
/hello/gi # Case insensitive + global
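In Python the same flags are passed as constants from the re module rather than as suffix letters. There is no g flag: findall, finditer, and sub already process every match. A brief sketch with illustrative inputs:

```python
import re

# i -> re.IGNORECASE
any_case = re.findall(r"hello", "Hello hello", re.IGNORECASE)

# m -> re.MULTILINE: ^ matches at the start of each line
line_starts = re.findall(r"^\w+", "one\ntwo", re.MULTILINE)

# s -> re.DOTALL: "." also matches a newline
without_dotall = re.search(r"a.b", "a\nb")
with_dotall = re.search(r"a.b", "a\nb", re.DOTALL)

# x -> re.VERBOSE: whitespace and # comments inside the pattern are ignored
phone = re.compile(r"""
    (\d{3})   # prefix
    -         # separator
    (\d{4})   # line number
""", re.VERBOSE)
phone_parts = phone.search("call 555-1234").groups()

print(any_case, line_starts, without_dotall, with_dotall is not None, phone_parts)
```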
Advanced Features
Lookahead/Lookbehind
(?=pattern) # Positive lookahead
(?!pattern) # Negative lookahead
(?<=pattern) # Positive lookbehind
(?<!pattern) # Negative lookbehind
Examples
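\d+(?=px) # Numbers followed by "px"
\d+(?!\d|px) # Numbers not followed by "px" (plain \d+(?!px) backtracks and matches "10" in "100px")
(?<=\$)\d+ # Numbers preceded by "$"
(?<![\d$])\d+ # Numbers not preceded by "$" (plain (?<!\$)\d+ matches "00" in "$100")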
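Negative lookarounds interact with backtracking in a way that is easy to miss: the engine may shorten or shift the digit run until the assertion passes. The sketch below, using made-up inputs, compares the simple forms with stricter variants that also exclude a neighboring digit.

```python
import re

sizes = "10px 20em 300px"
prices = "$15 and 30"

followed = re.findall(r"\d+(?=px)", sizes)

# Naive negative lookahead: "1" in "10px" is followed by "0", not "px", so it matches.
naive_ahead = re.findall(r"\d+(?!px)", "10px 20em")
strict_ahead = re.findall(r"\d+(?!\d|px)", "10px 20em")

preceded = re.findall(r"(?<=\$)\d+", prices)

# Naive negative lookbehind: "5" in "$15" is preceded by "1", not "$", so it matches.
naive_behind = re.findall(r"(?<!\$)\d+", prices)
strict_behind = re.findall(r"(?<![\d$])\d+", prices)

print(followed, naive_ahead, strict_ahead, preceded, naive_behind, strict_behind)
```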
Practical Examples
Extract Domain from Email
@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})
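Because the pattern has a single capture group, re.findall returns just the captured domains. A small sketch with invented addresses:

```python
import re

# Pattern copied from the entry above.
DOMAIN = re.compile(r"@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})")

text = "Contact alice@example.com or bob@mail.example.org."
domains = DOMAIN.findall(text)  # the trailing sentence period is not captured

print(domains)
```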
Find HTML Tags
<\/?[a-zA-Z][^>]*>
Match Quoted Strings
"([^"\\]|\\.)*"
Find CSS Colors
#(?:[0-9a-fA-F]{3}){1,2}\b # 3 or 6 hex digits (the looser #[0-9a-fA-F]{3,6} also accepts 4 or 5)
Extract URLs from Text
https?:\/\/[^\s]+
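Since the pattern stops only at whitespace, it also captures punctuation that follows a URL in running text. One possible cleanup, shown here as a sketch with made-up URLs, is to strip common trailing punctuation afterwards; the set of characters stripped is an assumption and may need adjusting.

```python
import re

# Same pattern as above; "/" needs no escaping in Python.
URL = re.compile(r"https?://[^\s]+")

text = "Docs: https://example.com/docs, mirror: http://test.org."
raw = URL.findall(text)

# Assumed cleanup step: drop sentence punctuation stuck to the end.
cleaned = [u.rstrip(".,;:!?") for u in raw]

print(raw)
print(cleaned)
```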
Match JSON Values
"[^"]*":\s*("[^"]*"|\d+|true|false|null)
Testing and Debugging
Online Tools
- regex101.com - Interactive regex tester
- regexr.com - Visual regex builder
- regexpal.com - Simple regex tester
Command Line Testing
# Test with grep
echo "test string" | grep -E 'pattern'
# Test with sed
echo "test string" | sed 's/pattern/replacement/'
# Test with Python
python3 -c "import re; print(re.search(r'pattern', 'test string'))"
Performance Tips
Best Practices
- Use specific characters instead of . when possible
- Avoid nested quantifiers like (a+)+
- Use non-capturing groups (?:...) when you don’t need the captured text
- Anchor patterns with ^ and $ when appropriate
- Use word boundaries \b for word matching
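The choice between capturing and non-capturing groups also changes results, not only performance. In Python, re.findall returns group contents when the pattern has a capture group, as this short sketch with an illustrative input shows:

```python
import re

text = "ababab xx ab"

# With a capturing group, findall returns only the last repetition of the group.
captured = re.findall(r"(ab)+", text)

# With a non-capturing group, findall returns the whole match.
whole = re.findall(r"(?:ab)+", text)

print(captured, whole)
```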
Common Pitfalls
- Greedy vs Non-greedy: .* vs .*?
- Backtracking: Avoid complex nested patterns
- Case sensitivity: Remember to use the i flag when needed
- Escaping: Don’t forget to escape special characters
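The greedy versus non-greedy difference is easiest to see on markup-like text. A minimal sketch, using an invented HTML snippet:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last </b>, swallowing everything in between.
greedy = re.findall(r"<b>.*</b>", html)

# Non-greedy: .*? stops at the first </b> that allows a match.
lazy = re.findall(r"<b>.*?</b>", html)

# Capturing just the inner text.
inner = re.findall(r"<b>(.*?)</b>", html)

print(greedy, lazy, inner)
```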
Quick Reference
Most Common Patterns
\d+ # One or more digits
\w+ # One or more word characters
\s+ # One or more whitespace
[a-zA-Z]+ # One or more letters
\b\w+\b # Whole words
^.+$ # Entire line (non-empty)
.* # Any characters (greedy)
.*? # Any characters (non-greedy)
Escape Sequences
\. # Literal dot
\* # Literal asterisk
\+ # Literal plus
\? # Literal question mark
\\ # Literal backslash
\( # Literal parenthesis
\[ # Literal bracket
\{ # Literal brace
\| # Literal pipe
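When the text to match comes from user input or another variable, escaping by hand is error-prone. In Python, re.escape adds the backslashes automatically. A brief sketch with an illustrative literal:

```python
import re

literal = "1+1=2 (approx.)"
haystack = "so 1+1=2 (approx.) holds"

# Unescaped, "+", "(", ")" and "." are treated as metacharacters.
unescaped_hit = re.search(literal, haystack)

# re.escape backslash-escapes every special character in the literal.
escaped_hit = re.search(re.escape(literal), haystack)

print(unescaped_hit, escaped_hit.group())
```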
See Also
- man grep - Pattern matching in files
- man sed - Stream editor with regex support
- man awk - Pattern scanning and processing
- MDN Regex Guide
- Python re module
Categories: tools
Last updated: January 1, 2023