Regular Expressions Cookbook, 2nd Edition

Cover of Regular Expressions Cookbook, 2nd Edition by Jan Goyvaerts... Published by O'Reilly Media, Inc.

4.19. Validate Password Complexity

Problem

You’re tasked with ensuring that any passwords chosen by your website users meet your organization’s minimum complexity requirements.

Solution

The following regular expressions check many individual conditions, and can be mixed and matched as necessary to meet your business requirements. At the end of this section, we’ve included several JavaScript code examples that show how you can tie these regular expressions together as part of a password security validation routine.

Length between 8 and 32 characters

^.{8,32}$
Regex options: Dot matches line breaks (“^ and $ match at line breaks” must not be set)
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

Standard JavaScript doesn’t have a “dot matches line breaks” option. Use [\s\S] instead of a dot in JavaScript to ensure that the regex works correctly even for crazy passwords that include line breaks:

^[\s\S]{8,32}$
Regex options: None (“^ and $ match at line breaks” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

ASCII visible and space characters only

If this next regex matches a password, you can be sure it includes only the characters A–Z, a–z, 0–9, space, and ASCII punctuation. No control characters, line breaks, or characters outside of the ASCII table are allowed:

^[\x20-\x7E]+$
Regex options: None (“^ and $ match at line breaks” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

If you want to additionally prevent the use of spaces, use ^[\x21-\x7E]+$ instead.
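As a quick illustration, the following sketch wraps both character-range checks in small functions. The function and variable names are hypothetical, chosen only for this example; the regexes are the two just shown.

```javascript
// Hypothetical helper names; only the two regexes come from this recipe.
var asciiVisibleOrSpace = /^[\x20-\x7E]+$/,
    asciiVisibleNoSpace = /^[\x21-\x7E]+$/;

function isAsciiPrintable(password) {
    return asciiVisibleOrSpace.test(password);
}

function isAsciiPrintableNoSpaces(password) {
    return asciiVisibleNoSpace.test(password);
}

console.log(isAsciiPrintable("correct horse"));         // true
console.log(isAsciiPrintableNoSpaces("correct horse")); // false (contains a space)
console.log(isAsciiPrintable("caf\u00E9"));             // false (e-acute is outside ASCII)
console.log(isAsciiPrintable("tab\there"));             // false (tab is a control character)
```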

One or more uppercase letters

ASCII uppercase letters only:

[A-Z]
Regex options: None (“case insensitive” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Any Unicode uppercase letter:

\p{Lu}
Regex options: None (“case insensitive” must not be set)
Regex flavors: .NET, Java, PCRE, Perl, Ruby 1.9

If you want to check for the presence of any letter character (not limited to uppercase), enable the “case insensitive” option or use [A-Za-z]. For the Unicode case, you can use \p{L}, which matches any kind of letter from any language.

One or more lowercase letters

ASCII lowercase letters only:

[a-z]
Regex options: None (“case insensitive” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Any Unicode lowercase letter:

\p{Ll}
Regex options: None (“case insensitive” must not be set)
Regex flavors: .NET, Java, PCRE, Perl, Ruby 1.9
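
JavaScript is absent from the flavor lists for \p{Lu} and \p{Ll} because the JavaScript standard of the time had no syntax for Unicode categories. Engines that implement ECMAScript 2018 accept both tokens when the regex carries the u flag. The following is a minimal sketch that assumes such an engine; the helper names are hypothetical.

```javascript
// Assumption: an ES2018-capable engine. \p{...} only works with the "u" flag.
var unicodeUpper = /\p{Lu}/u,
    unicodeLower = /\p{Ll}/u;

function hasUnicodeUpper(password) {
    return unicodeUpper.test(password);
}

function hasUnicodeLower(password) {
    return unicodeLower.test(password);
}

console.log(hasUnicodeUpper("\u00C9cole"));   // true: E-acute is an uppercase letter
console.log(/[A-Z]/.test("\u00C9cole"));      // false: E-acute is not in A-Z
console.log(hasUnicodeLower("\u03B1\u03B2")); // true: Greek alpha and beta are lowercase
console.log(hasUnicodeUpper("1234"));         // false: no letters at all
```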

One or more numbers

[0-9]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

One or more special characters

ASCII punctuation and spaces only:

[ !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Anything other than ASCII letters and numbers:

[^A-Za-z0-9]
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Disallow three or more sequential identical characters

This next regex is intended to rule out passwords like 111111. It works in the opposite way of the others in this recipe. If it matches, the password doesn’t meet the condition. In other words, the regex only matches strings that repeat a character three times in a row.

(.)\1\1
Regex options: Dot matches line breaks
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

([\s\S])\1\1
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
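
Because this rule is inverted, the result of the test has to be negated before it can be combined with the other checks. A minimal sketch of that step follows; the function name passesRepeatRule is hypothetical.

```javascript
// Hypothetical helper: returns true when the password has NO run of
// three identical characters in a row.
var tripleRepeat = /([\s\S])\1\1/;

function passesRepeatRule(password) {
    // Note the negation: a regex match means the rule is violated
    return !tripleRepeat.test(password);
}

console.log(passesRepeatRule("111111"));    // false
console.log(passesRepeatRule("aabbcc"));    // true (no character appears three times in a row)
console.log(passesRepeatRule("paSSSword")); // false
```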

Example JavaScript solution, basic

The following code combines five password requirements:

  • Length between 8 and 32 characters.

  • One or more uppercase letters.

  • One or more lowercase letters.

  • One or more numbers.

  • One or more special characters (ASCII punctuation or space characters).

function validate(password) {
    var minMaxLength = /^[\s\S]{8,32}$/,
        upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[ !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]/;

    if (minMaxLength.test(password) &&
        upper.test(password) &&
        lower.test(password) &&
        number.test(password) &&
        special.test(password)
    ) {
        return true;
    }

    return false;
}

The validate function just shown returns true if the provided string meets the password requirements. Otherwise, false is returned.
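
If you would rather tell users which requirement they missed, the same five regexes can be paired with messages. The following is only a sketch of that idea, not part of the recipe's solution; the function name listFailedRules and the message wording are assumptions made for illustration.

```javascript
// Hypothetical variant of validate(): returns the messages for every failed rule.
function listFailedRules(password) {
    var rules = [
            { regex: /^[\s\S]{8,32}$/, message: "Must be 8 to 32 characters long" },
            { regex: /[A-Z]/, message: "Must contain an uppercase letter" },
            { regex: /[a-z]/, message: "Must contain a lowercase letter" },
            { regex: /[0-9]/, message: "Must contain a number" },
            { regex: /[ !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]/,
              message: "Must contain a special character" }
        ],
        failed = [],
        i;

    for (i = 0; i < rules.length; i++) {
        if (!rules[i].regex.test(password)) {
            failed.push(rules[i].message);
        }
    }

    return failed; // An empty array means every rule passed
}

console.log(listFailedRules("Passw0rd!")); // []
console.log(listFailedRules("password"));  // uppercase, number, and special character messages
```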

Example JavaScript solution, with x out of y validation

This next example enforces a minimum and maximum password length (8–32 characters), and additionally requires that at least three of the following four character types are present:

  • One or more uppercase letters.

  • One or more lowercase letters.

  • One or more numbers.

  • One or more special characters (anything other than ASCII letters and numbers).

function validate(password) {
    var minMaxLength = /^[\s\S]{8,32}$/,
        upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[^A-Za-z0-9]/,
        count = 0;

    if (minMaxLength.test(password)) {
        // Only need 3 out of 4 of these to match
        if (upper.test(password)) count++;
        if (lower.test(password)) count++;
        if (number.test(password)) count++;
        if (special.test(password)) count++;
    }

    return count >= 3;
}

As before, this modified validate function returns true if the provided password meets the overall requirements. If not, it returns false.

Example JavaScript solution, with password security ranking

This final code example is the most complicated of the bunch. It assigns a positive or negative score to various conditions, and uses the regexes we’ve been looking at to help calculate an overall score for the provided password. The rankPassword function returns a number from 0–4 that corresponds to the password rankings “Too Short,” “Weak,” “Medium,” “Strong,” and “Very Strong”:

var rank = {
    TOO_SHORT: 0,
    WEAK: 1,
    MEDIUM: 2,
    STRONG: 3,
    VERY_STRONG: 4
};

function rankPassword(password) {
    var upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[^A-Za-z0-9]/,
        minLength = 8,
        score = 0;

    if (password.length < minLength) {
        return rank.TOO_SHORT; // End early
    }

    // Increment the score for each of these conditions
    if (upper.test(password)) score++;
    if (lower.test(password)) score++;
    if (number.test(password)) score++;
    if (special.test(password)) score++;

    // Penalize if there aren't at least three char types
    if (score < 3) score--;

    if (password.length > minLength) {
        // Increment the score for every 2 chars longer than the minimum
        score += Math.floor((password.length - minLength) / 2);
    }

    // Return a ranking based on the calculated score
    if (score < 3) return rank.WEAK; // score is 2 or lower
    if (score < 4) return rank.MEDIUM; // score is 3
    if (score < 6) return rank.STRONG; // score is 4 or 5
    return rank.VERY_STRONG; // score is 6 or higher
}

// Test it...
var result = rankPassword("password1"),
    labels = ["Too Short", "Weak", "Medium", "Strong", "Very Strong"];

alert(labels[result]); // -> Weak

Because of how this password ranking algorithm is designed, it can serve two purposes equally well. First, it can be used to give users guidance about the quality of their password while they’re still typing it. Second, it lets you easily reject passwords that don’t rank at whatever you choose as your minimum security threshold. For example, the condition if(result <= rank.MEDIUM) can be used to reject any password that isn’t ranked as “Strong” or “Very Strong.”

Discussion

Users are notorious for choosing simple or common passwords that are easy to remember. But easy to remember doesn’t necessarily translate into something that keeps their account and your company’s information safe. It’s therefore typically necessary to protect users from themselves by enforcing minimum password complexity rules. However, the exact rules to use can vary widely between businesses and systems, which is why this recipe includes numerous regexes that serve as the raw ingredients to help you cook up whatever combination of validation rules you choose.

Limiting each regex to a specific rule brings the additional benefit of simplicity. As a result, all of the regexes shown thus far are fairly straightforward. Following are a few additional notes on each of them:

Length between 8 and 32 characters

To require a different minimum or maximum length, change the numbers used as the upper and lower bounds for the quantifier {8,32}. If you don’t want to specify a maximum, use {8,}, or remove the $ anchor and change the quantifier to {8}.

All of the programming languages covered by this book provide a simple and efficient way to determine the length of a string. However, using a regex allows you to test both the minimum and maximum length at the same time, and makes it easier to mix and match password complexity rules by choosing from a list of regexes.
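
To make the two approaches concrete, here is a minimal sketch that builds the length regex from configurable bounds and sets it beside a plain length comparison. Both function names are hypothetical.

```javascript
// Hypothetical helper: builds the length regex for any bounds.
function makeLengthRegex(min, max) {
    // Leaving out max produces {min,}, which sets no upper limit
    var upper = (max === undefined) ? "" : max;
    return new RegExp("^[\\s\\S]{" + min + "," + upper + "}$");
}

// The same test without a regex, using the string's length property
function lengthInRange(password, min, max) {
    return password.length >= min && password.length <= max;
}

var between8And32 = makeLengthRegex(8, 32), // Equivalent to /^[\s\S]{8,32}$/
    atLeast12 = makeLengthRegex(12);        // Equivalent to /^[\s\S]{12,}$/

console.log(between8And32.test("short"));         // false
console.log(between8And32.test("longenough"));    // true (10 characters)
console.log(atLeast12.test("longenough"));        // false
console.log(lengthInRange("longenough", 8, 32));  // true
```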

ASCII visible and space characters only

As mentioned earlier, this regex allows the characters A–Z, a–z, 0–9, space, and ASCII punctuation only. To be more specific about the allowed punctuation characters, they are !, ", #, $, %, &, ', (, ), *, +, -, ., /, :, ;, <, =, >, ?, @, [, \, ], ^, _, `, {, |, }, ~, and comma. In other words, all the punctuation you can type using a standard U.S. keyboard.

Limiting passwords to these characters can help avoid character encoding related issues, but keep in mind that it also limits the potential complexity of your passwords.

Uppercase letters

To check whether the password contains two or more uppercase letters, use [A-Z].*[A-Z]. For three or more, use [A-Z].*[A-Z].*[A-Z] or (?:[A-Z].*){3}. If you’re allowing any Unicode uppercase letters, just change each [A-Z] in the preceding examples to \p{Lu}. In JavaScript, replace the dots with [\s\S].

Lowercase letters

As with the “uppercase letters” regex, you can check whether the password contains at least two lowercase letters using [a-z].*[a-z]. For three or more, use [a-z].*[a-z].*[a-z] or (?:[a-z].*){3}. If you’re allowing any Unicode lowercase letters, change each [a-z] to \p{Ll}. In JavaScript, replace the dots with [\s\S].

Numbers

You can check whether the password contains two or more numbers using [0-9].*[0-9], and [0-9].*[0-9].*[0-9] or (?:[0-9].*){3} for three or more. In JavaScript, replace the dots with [\s\S].
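
The “two or more” and “three or more” patterns for uppercase letters, lowercase letters, and numbers all follow the same template, so they can be generated. The sketch below assumes a hypothetical helper named requireAtLeast that accepts a character class as a string.

```javascript
// Hypothetical helper: charClass is a character class in string form, such as "[A-Z]".
function requireAtLeast(charClass, count) {
    // Builds a regex like (?:[A-Z][\s\S]*){3}, with [\s\S] standing in
    // for the dot because this is JavaScript
    return new RegExp("(?:" + charClass + "[\\s\\S]*){" + count + "}");
}

var twoUpper = requireAtLeast("[A-Z]", 2),
    threeDigits = requireAtLeast("[0-9]", 3);

console.log(twoUpper.test("PassWord"));  // true
console.log(twoUpper.test("Password"));  // false
console.log(threeDigits.test("a1b2c3")); // true
console.log(threeDigits.test("a1b2c"));  // false
```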

We didn’t include a listing for matching any Unicode decimal digit (\p{Nd}), because it’s uncommon to treat characters other than 0–9 as numbers (although readers who speak Arabic or Hindi might disagree!).

Special characters

Use the same principles shown for letters and numbers if you want to require more than one special character. For instance, using [^A-Za-z0-9].*[^A-Za-z0-9] would require the password to contain at least two special characters.

Note that [^A-Za-z0-9] is not the same as \W (the negated version of the \w shorthand for word characters). \W fails to match the underscore in addition to ASCII letters and numbers, whereas here we want the underscore to count as a special character. In some regex flavors, \W also fails to match any Unicode letter or decimal digit from any language.
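
The difference is easy to see with a password whose only special character is an underscore. This short sketch uses JavaScript, where \w covers ASCII letters, digits, and the underscore only; other flavors may treat the non-ASCII example differently.

```javascript
var special = /[^A-Za-z0-9]/,
    nonWord = /\W/;

console.log(special.test("snake_case1")); // true: the underscore counts as special
console.log(nonWord.test("snake_case1")); // false: \w includes the underscore, so \W skips it

// In JavaScript, \w is ASCII-only, so both regexes match the accented letter
console.log(special.test("caf\u00E9"));   // true
console.log(nonWord.test("caf\u00E9"));   // true
```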

Disallow three or more sequential identical characters

This regex matches repeated characters using backreferences to a previously matched character. Recipe 2.10 explains how backreferences work. If you want to disallow any use of repeated characters, change the regex to (.)\1. To allow up to three repeated characters but not four, use (.)\1\1\1 or (.)\1{3}.

Remember that you need to check whether this regular expression doesn’t match your subject text. A match would indicate that repeated characters are present.
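
The allowed run length can also be made configurable. The following sketch assumes a hypothetical factory function, makeRepeatChecker, that takes the longest run of identical characters you are willing to accept and returns a function performing the negated test.

```javascript
// Hypothetical factory: maxRun is the longest run of identical characters allowed.
function makeRepeatChecker(maxRun) {
    // One captured character followed by maxRun copies of it means the
    // run is maxRun + 1 long, which is one too many
    var tooManyRepeats = new RegExp("([\\s\\S])\\1{" + maxRun + "}");

    return function (password) {
        return !tooManyRepeats.test(password); // Negated: a match is a failure
    };
}

var noRepeats = makeRepeatChecker(1), // Same rule as ([\s\S])\1
    upToThree = makeRepeatChecker(3); // Same rule as ([\s\S])\1{3}

console.log(noRepeats("hello"));  // false ("ll" repeats)
console.log(noRepeats("helo"));   // true
console.log(upToThree("aaab"));   // true (three in a row is allowed)
console.log(upToThree("aaaab"));  // false (four in a row)
```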

Example JavaScript solutions

The three blocks of JavaScript example code each use this recipe’s regular expressions a bit differently.

The first example requires all conditions to be met or else the password fails. In the second example, acing the password test requires three out of four conditional requirements to be met. The third example is probably the most interesting. It includes a function called rankPassword that does what it says on the tin and ranks passwords by how secure they are. It can thus help provide a more user-friendly experience and encourage users to choose strong passwords.

The rankPassword function’s password ranking algorithm increments and decrements an internal password score based on multiple conditions. If the password’s length is less than the specified minimum of eight characters, the function returns early with the numeric equivalent of “Too Short.” Not including at least three character types incurs a one-point penalty, but this can be balanced out because every two additional characters after the minimum of eight adds a point to the running score.
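The scoring rules just described can be approximated with the sketch below. It is not the recipe's rankPassword listing: the function name scorePassword, the returned object shape, and the starting score of zero are all assumptions made for illustration, and the mapping from score to rank labels is left out because it isn't described here.

```javascript
// Hypothetical sketch of the described scoring rules; not the book's code.
function scorePassword(password) {
  const minLength = 8;
  if (password.length < minLength) {
    // The described function returns a numeric "Too Short" rank instead;
    // a flag is used here to keep the sketch unambiguous.
    return { tooShort: true, score: null };
  }

  let score = 0; // assumed starting value

  // Count how many of the four character types are present
  const typeTests = [/[A-Z]/, /[a-z]/, /[0-9]/, /[^A-Za-z0-9]/];
  const typesUsed = typeTests.filter((re) => re.test(password)).length;
  if (typesUsed < 3) {
    score -= 1; // one-point penalty for fewer than three character types
  }

  // Every two characters beyond the minimum length add one point
  score += Math.floor((password.length - minLength) / 2);

  return { tooShort: false, score: score };
}

console.log(scorePassword("short").tooShort);     // true
console.log(scorePassword("password").score);     // -1
console.log(scorePassword("abcdefghij").score);   // 0
console.log(scorePassword("Passw0rd1234").score); // 2
```

The third call shows the penalty being balanced out: a ten-character, lowercase-only password loses a point for variety but gains one for length.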

The code can of course be customized to further improve it or to meet your particular requirements. However, it works quite well as-is, regardless of what you throw at it. As a sanity check, we ran it against several hundred of the known most common (and therefore most insecure) user passwords. All came out ranked as either “Too Short” or “Weak,” which is exactly what we were hoping for.

Caution

Using JavaScript to validate passwords in a web browser can be very beneficial for your users, but make sure to also implement your validation routine on the server. If you don’t, it won’t work for users who disable JavaScript or use custom scripts to circumvent your client-side validation.

Variations

Validate multiple password rules with a single regex

Up to this point, we’ve split password validation into discrete rules that can be tested using simple regexes. That’s usually the best approach. It keeps the regexes readable, and makes it easier to provide error messages that identify why a password isn’t up to code. It can even help you rank a password’s complexity, as we’ve seen. However, there may be times when you don’t care about all that, or when one regex is all you can use. In any case, it’s common for people to want to validate multiple password rules using a single regex, so let’s take a look at how it can be done. We’ll use the following requirements:

  • Length between 8 and 32 characters.

  • One or more uppercase letters.

  • One or more lowercase letters.

  • One or more numbers.

Here’s a regex that pulls it all off:

^(?=.{8,32}$)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).*
Regex options: Dot matches line breaks (“^ and $ match at line breaks” must not be set)
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby

This regex can be used with standard JavaScript (which doesn’t have a “dot matches line breaks” option) if you replace each of the five dots with [\s\S]. Otherwise, you might fail to match some valid passwords that contain line breaks. Either way, though, the regex won’t match any invalid passwords.
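Written out for standard JavaScript, the substitution looks like the sketch below. The variable names are ours, and the dot-based version is included only to show how it misses a password that contains a line break.

```javascript
// All five dots replaced with [\s\S]
const passwordRule =
  /^(?=[\s\S]{8,32}$)(?=[\s\S]*[A-Z])(?=[\s\S]*[a-z])(?=[\s\S]*[0-9])[\s\S]*/;

// The original regex, without a "dot matches line breaks" option
const dotVersion = /^(?=.{8,32}$)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).*/;

console.log(passwordRule.test("Passw0rd"));   // true
console.log(passwordRule.test("password1"));  // false (no uppercase letter)
console.log(passwordRule.test("Pass\nw0rd")); // true
console.log(dotVersion.test("Pass\nw0rd"));   // false (dot stops at the line break)
```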

Notice how this regular expression puts each validation rule into its own lookahead group at the beginning of the regex. Because lookahead does not consume any characters as part of a match (see Recipe 2.16), each lookahead test runs from the very beginning of the string. When a lookahead succeeds, the regex moves along to test the next one, starting from the same position. Any lookahead that fails to find a match causes the overall match to fail.

The first lookahead, (?=.{8,32}$), ensures that any match is between 8 and 32 characters long. Make sure to keep the $ anchor after {8,32}; otherwise, the match will succeed even when there are more than 32 characters. The next three lookaheads search one by one for an uppercase letter, lowercase letter, and digit. Because each lookahead searches from the beginning of the string, they use .* before their respective character classes. This allows other characters to appear before the character type that they’re searching for.
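The effect of the $ anchor can be checked in isolation. This short JavaScript sketch (variable names are illustrative) tests only the length lookahead against a 40-character string.

```javascript
const anchored = /^(?=[\s\S]{8,32}$)/;  // length lookahead with the $ anchor
const unanchored = /^(?=[\s\S]{8,32})/; // same lookahead without it

const longPassword = "A".repeat(40);

console.log(anchored.test(longPassword));   // false (more than 32 characters)
console.log(unanchored.test(longPassword)); // true  (the first 32 satisfy it)
```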

By following the approach shown here, it’s possible to add as many lookahead-based password tests as you want to a single regex, so long as all of the conditions are always required.

The .* at the very end of this regex is not actually required. Without it, though, the regex would return a zero-length empty string when it successfully matches. The trailing .* lets the regex include the password itself in successful match results.
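To see the difference, compare what each variant returns as the matched text. This is a JavaScript sketch with [\s\S] standing in for the dot; the variable names are ours.

```javascript
const withTail =
  /^(?=[\s\S]{8,32}$)(?=[\s\S]*[A-Z])(?=[\s\S]*[a-z])(?=[\s\S]*[0-9])[\s\S]*/;
const withoutTail =
  /^(?=[\s\S]{8,32}$)(?=[\s\S]*[A-Z])(?=[\s\S]*[a-z])(?=[\s\S]*[0-9])/;

console.log("Passw0rd".match(withTail)[0]);    // "Passw0rd"
console.log("Passw0rd".match(withoutTail)[0]); // "" (zero-length match)
```

If you only call test() and never look at the matched text, both variants behave the same.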

Caution

It’s equally valid to write this regex as ^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).{8,32}$, with the length test coming after the lookaheads. Unfortunately, writing it this way triggers a bug in Internet Explorer 5.5–8 that prevents it from working correctly. Microsoft fixed the bug in the new regex engine included in IE9.

See Also

Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.2 explains how to match nonprinting characters. Recipe 2.3 explains character classes. Recipe 2.4 explains that the dot matches any character. Recipe 2.5 explains anchors. Recipe 2.7 explains how to match Unicode characters. Recipe 2.9 explains grouping. Recipe 2.10 explains backreferences. Recipe 2.12 explains repetition. Recipe 2.16 explains lookaround.
