    How to Write and Test Regular Expressions

    A regular expression (regex) is a pattern that matches text. It is used to validate input (email, phone, password), search and replace text, extract data from logs, and parse structured content. The core syntax is largely shared across Python, JavaScript, Java, and most other languages, though details such as flag names and escaping rules differ. Once you understand the core building blocks, you can construct complex patterns from simple parts.

    Last updated: March 31, 2026

    The Formula

    Anchors:           ^ (start), $ (end)
    Character classes: [abc] matches a, b, or c | [a-z] range | \d digit | \w word char | \s whitespace
    Quantifiers:       * (0+), + (1+), ? (0 or 1), {n} exactly n, {n,m} between n and m (inclusive)
    Groups:            (abc) capturing group | (?:abc) non-capturing group | a|b alternation (a or b)
    Lookaheads:        (?=...) positive | (?!...) negative
    Flags:             g = global (find all matches), i = case-insensitive, m = multiline (^ and $ match line boundaries). In JavaScript: /pattern/gi. Python has no g flag: use re.findall or re.finditer, with re.IGNORECASE and re.MULTILINE for i and m.
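
    The building blocks above can be tried out in a few lines. The following is a minimal Python sketch; the sample strings and variable names are made up for illustration, and re.findall stands in for the g flag.

```python
import re

# Anchors: ^cat$ must cover the whole string, so "concat" fails.
anchored = [bool(re.search(r"^cat$", s)) for s in ["cat", "concat"]]

# Character class + quantifier: three digits, a dash, four digits.
phone_ok = bool(re.fullmatch(r"\d{3}-\d{4}", "555-0199"))

# Non-capturing group with alternation; only (\d+) is captured.
m = re.search(r"(?:USD|EUR) (\d+)", "Total: EUR 42")
amount = m.group(1)

# Positive lookahead: match "foo" only when "bar" follows.
# The lookahead text itself is not part of the match.
lookahead = re.findall(r"foo(?=bar)", "foobar foobaz")

# Flags: re.IGNORECASE corresponds to i; findall returns every match.
all_hits = re.findall(r"cat", "Cat cat CAT", flags=re.IGNORECASE)

print(anchored, phone_ok, amount, lookahead, all_hits)
```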

    Variable Definitions

    Symbol      Name               Description
    .           Wildcard           Matches any single character except newline
    \d, \w, \s  Shorthand classes  \d = [0-9], \w = [a-zA-Z0-9_], \s = [ \t\n\r\f\v] (ASCII definitions; Unicode-aware engines match more)
    ^, $        Anchors            ^ matches start of string/line, $ matches end of string/line
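
    A short Python sketch of how these symbols behave in practice. The sample strings are illustrative only. Note one Python-specific detail: on str patterns, \d is Unicode-aware by default, so it accepts more than the table's [0-9] unless re.ASCII is set.

```python
import re

# "." matches any single character except a newline by default.
dot_results = [bool(re.fullmatch(r"a.c", s)) for s in ["abc", "a-c", "a\nc"]]

# \w+ grabs runs of letters, digits, and underscores.
words = re.findall(r"\w+", "snake_case, v2!")

# \s matches the space, the tab, and the newline here.
spaces = re.findall(r"\s", "a b\tc\nd")

# "\u0663" is ARABIC-INDIC DIGIT THREE: a digit, but not in [0-9].
unicode_digit = bool(re.fullmatch(r"\d", "\u0663"))
ascii_digit = bool(re.fullmatch(r"\d", "\u0663", flags=re.ASCII))

print(dot_results, words, spaces, unicode_digit, ascii_digit)
```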

    Step-by-Step Example

    Write a regex to validate a basic email address like user@example.com.

    Given

    Target format: user@domain.tld

    Solution

    1. Match the local part (letters, digits, and the symbols . _ % + -): [a-zA-Z0-9._%+-]+
    2. Match the @ symbol literally: @
    3. Match the domain name (letters, digits, dots, hyphens): [a-zA-Z0-9.-]+
    4. Match the dot before the TLD: \.
    5. Match the TLD (2-6 letters; longer TLDs such as .technology need a wider range like {2,}): [a-zA-Z]{2,6}
    6. Add anchors for a full-string match: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$

    Pattern: /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/i (the i flag is optional here, since the character classes already list both cases). Matches user@example.com ✓; rejects user@ and @domain.com ✗
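
    The finished pattern can be checked against the sample inputs with a small Python sketch. The names EMAIL_RE and is_basic_email are hypothetical helpers, not part of any library. It uses fullmatch because, in Python, $ also matches just before a trailing newline.

```python
import re

# The pattern from step 6, unchanged.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$")


def is_basic_email(text: str) -> bool:
    """Return True if text has the basic user@domain.tld shape."""
    # fullmatch rejects "user@example.com\n", which match() would accept.
    return EMAIL_RE.fullmatch(text) is not None


samples = ["user@example.com", "user@", "@domain.com", "user@example.com\n"]
results = {s: is_basic_email(s) for s in samples}
print(results)
```

    This is a shape check only. It does not confirm that the address exists, and the {2,6} limit rejects addresses whose TLD is longer than six letters.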

    Common Mistakes to Avoid

    Forgetting to escape the dot (.): an unescaped . matches any character except newline, so a.c matches "abc" as well as "a.c". Use \. when you mean a literal dot.
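
    A minimal Python sketch of this mistake, using a made-up decimal-number check. The variable names are illustrative.

```python
import re

loose = re.compile(r"^\d+.\d+$")    # unescaped dot: any character
strict = re.compile(r"^\d+\.\d+$")  # escaped dot: a literal "."

loose_hit = bool(loose.match("3x14"))    # accepted by mistake
strict_hit = bool(strict.match("3x14"))  # rejected
strict_ok = bool(strict.match("3.14"))   # accepted

# re.escape adds the backslash for you when the text is not a pattern.
escaped = re.escape("a.b")

print(loose_hit, strict_hit, strict_ok, escaped)
```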

    Using greedy quantifiers when lazy is needed: .* is greedy and matches as much as possible. .*? is lazy and matches as little as possible, stopping at the first point where the rest of the pattern can match.
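
    The difference is easiest to see on text with two closing delimiters. A small Python sketch with an invented sample string:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last </b>, swallowing both tags in one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first </b> that lets the pattern succeed.
lazy = re.findall(r"<b>.*?</b>", html)

print(greedy)
print(lazy)
```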

    Anchoring with ^ and $ while the multiline flag is enabled unintentionally: in multiline mode, ^ and $ match at every line boundary, so a pattern meant to validate the whole string can pass if any single line matches.
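
    A Python sketch of how this can let bad input through. The input string is invented for illustration. Python's \A and \Z anchor to the whole string regardless of flags; other languages spell these anchors differently or lack them.

```python
import re

text = "ok\nDROP TABLE users"

# Without the flag, ^...$ must span the whole string, so this fails.
single = bool(re.search(r"^[a-z]+$", text))

# With re.MULTILINE, the first line "ok" alone satisfies the pattern.
multi = bool(re.search(r"^[a-z]+$", text, flags=re.MULTILINE))

# \A and \Z ignore re.MULTILINE and always mean start/end of string.
whole = bool(re.search(r"\A[a-z]+\Z", text, flags=re.MULTILINE))

print(single, multi, whole)
```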

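    Forgetting to escape backslashes in string literals: in many languages, the backslash in \d must be written \\d inside an ordinary string literal (for example in Java, or when passing a string to JavaScript's RegExp constructor). In Python, a raw string such as r"\d" avoids the doubling.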
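
    In Python the same issue shows up with ordinary versus raw string literals. A brief sketch with illustrative variable names:

```python
import re

# Both literals produce the same three-character pattern: \d+
plain = "\\d+"  # ordinary literal, backslash doubled
raw = r"\d+"    # raw literal, backslash kept as typed
same = plain == raw

# Pitfall: in an ordinary literal, "\b" is a backspace character,
# not the regex word boundary, so this search finds nothing.
wrong = re.search("\bcat\b", "a cat sat")
right = re.search(r"\bcat\b", "a cat sat")

print(same, wrong is None, right.group(0))
```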
