zxcvbn: realistic password strength estimation

Posted by Dan Wheeler on April 10, 2012

Over the last few months, I’ve seen a password strength meter on almost every signup form I’ve encountered. Password strength meters are on fire.

Here’s a question: does a meter actually help people secure their accounts? It’s less important than other areas of web security, a short sample of which include:

  • Preventing online cracking with throttling or CAPTCHAs.
  • Preventing offline cracking by selecting a suitably slow hash function with user-unique salts.
  • Securing said password hashes.

With that disclaimer — yes. I’m convinced these meters have the potential to help. According to Mark Burnett’s 2006 book, Perfect Passwords: Selection, Protection, Authentication, which counted frequencies from a few million passwords over a variety of leaks, one in nine people had a password in this top 500 list. These passwords include some real stumpers: password1, compaq, 7777777, merlin, rosebud. Burnett ran a more recent study last year, looking at 6 million passwords, and found an insane 99.8% occur in the top 10,000 list, with 91% in the top 1,000. The methodology and bias is an important qualifier — for example, since these passwords mostly come from cracked hashes, the list is biased towards crackable passwords to begin with.

These are only the really easy-to-guess passwords. For the rest, I’d wager a large percentage are still predictable enough to be susceptible to a modest online attack. So I do think these meters could help, by encouraging stronger password decisions through direct feedback. But right now, with a few closed-source exceptions, I believe they mostly hurt. Here’s why.

Strength is best measured as entropy, in bits: it’s the number of times a space of possible passwords can be cut in half. A naive strength estimation goes like this:

# n: password length
# c: password cardinality: the size of the symbol space
#    (26 for lowercase letters only, 62 for a mix of lower+upper+numbers)
entropy = n * lg(c) # base 2 log

This brute-force analysis is accurate for people who choose random sequences of letters, numbers and symbols. But with few exceptions (shoutout to 1Password / KeePass), people of course choose patterns — dictionary words, spatial patterns like qwerty, asdf or zxcvbn, repeats like aaaaaaa, sequences like abcdef or 654321, or some combination of the above. For passwords with uppercase letters, odds are it’s the first letter that’s uppercase. Numbers and symbols are often predictable as well: l33t speak (3 for e, 0 for o, @ or 4 for a), years, dates, zip codes, and so on.

As a result, simplistic strength estimation gives bad advice. Without checking for common patterns, the practice of encouraging numbers and symbols means encouraging passwords that might only be slightly harder for a computer to crack, and yet frustratingly harder for a human to remember. xkcd nailed it:

As an independent Dropbox hackweek project, I thought it’d be fun to build an open source estimator that catches common patterns, and as a corollary, doesn’t penalize sufficiently complex passphrases like correcthorsebatterystaple. It’s now live on dropbox.com/register and available for use on github. Try the demo to experiment and see several example estimations.

The table below compares zxcvbn to other meters. The point isn’t to dismiss the others — password policy is highly subjective — rather, it’s to give a better picture of how zxcvbn is different.

qwER43@! Tr0ub4dour&3 correcthorsebatterystaple
zxcvbn
Dropbox (old)
Citibank
Bank of America (password not allowed) (password not allowed) (password not allowed)
Twitter
PayPal
eBay (password not allowed)
Facebook
Yahoo!
Gmail

A few notes:

  • I took these screenshots on April 3rd, 2012. I needed to crop the bar from the gmail signup form to make it fit in the table, making the difference in relative width more pronounced than on the form itself.
  • zxcvbn considers correcthorsebatterystaple the strongest password of the 3. The rest either consider it the weakest or disallow it. (Twitter gives about the same score for each, but if you squint, the scores are slightly different.)
  • zxcvbn considers qwER43@! weak because it’s a short QWERTY pattern. It adds extra entropy for each turn and shifted character.
  • The PayPal meter considers qwER43@! weak but aaAA11!! strong. Speculation, but that might be because it detects spatial patterns too.
  • Bank of America doesn’t allow passwords over 20 characters, disallowing correcthorsebatterystaple. Passwords can contain some symbols, but not & or !, disallowing the other two passwords. eBay doesn’t allow passwords over 20 characters either.
  • Few of these meters appear to use the naive estimation I opened with; otherwise correcthorsebatterystaple would have a high rating from its long length. Dropbox used to add points for each unique lowercase letter, uppercase letter, number, and symbol, up to a certain cap for each group. This mostly has the same only-works-for-brute-force problem, although it also checked against a common passwords dictionary. I don’t know the details behind the other meters, but a scoring checklist is another common approach (which also doesn’t check for many patterns).
  • I picked Troubadour to be the base word of the second column, not Troubador as occurs in xkcd, which is an uncommon spelling.

Installation

zxcvbn has no dependencies and works on ie7+/opera/ff/safari/chrome. The best way to add it to your registration page is:

<script type="text/javascript" src="zxcvbn-async.js">
</script>

zxcvbn-async.js is a measly 350 bytes. On window.load, after your page loads and renders, it’ll load zxcvbn.js, a fat 680k (320k gzipped), most of which is a dictionary. I haven’t found the script size to be an issue; since a password is usually not the first thing a user enters on a signup form, there’s plenty of time to load. Here’s a comprehensive rundown of crossbrowser asynchronous script loading.

zxcvbn adds a single function to the global namespace:

zxcvbn(password, user_inputs)

It takes one required argument, a password, and returns a result object. The result includes a few properties:

result.entropy            # bits
result.crack_time         # estimation of actual crack time, in seconds.
result.crack_time_display # same crack time, as a friendlier string:
                          # "instant", "6 minutes", "centuries", etc.
result.score              # 0, 1, 2, 3 or 4 if crack time is less than
                          # 10**2, 10**4, 10**6, 10**8, Infinity.
                          # (helpful for implementing a strength bar.)
result.match_sequence     # the detected patterns used to calculate entropy.
result.calculation_time   # how long it took to calculate an answer,
                          # in milliseconds. usually only a few ms.

The optional user_inputs argument is an array of strings that zxcvbn will add to its internal dictionary. This can be whatever list of strings you like, but it’s meant for user inputs from other fields of the form, like name and email. That way a password that includes the user’s personal info can be heavily penalized. This list is also good for site-specific vocabulary. For example, ours includes dropbox.

zxcvbn is written in CoffeeScript. zxcvbn.js and zxcvbn-async.js are unreadably closure-compiled, but if you’d like to extend zxcvbn and send me a pull request, the README has development setup info.

The rest of this post details zxcvbn’s design.

The model

zxcvbn consists of three stages: match, score, then search.

  • match enumerates all the (possibly overlapping) patterns it can detect. Currently zxcvbn matches against several dictionaries (English words, names and surnames, Burnett’s 10,000 common passwords), spatial keyboard patterns (QWERTY, Dvorak, and keypad patterns), repeats (aaa), sequences (123, gfedcba), years from 1900 to 2019, and dates (3-13-1997, 13.3.1997, 1331997). For all dictionaries, match recognizes uppercasing and common l33t substitutions.
  • score calculates an entropy for each matched pattern, independent of the rest of the password, assuming the attacker knows the pattern. A simple example: rrrrr. In this case, the attacker needs to iterate over all repeats from length 1 to 5 that start with a lowercase letter:

    entropy = lg(26*5) # about 7 bits
    
  • search is where Occam’s razor comes in. Given the full set of possibly overlapping matches, search finds the simplest (lowest entropy) non-overlapping sequence. For example, if the password is damnation, that could be analyzed as two words, dam and nation, or as one. It’s important that it be analyzed as one, because an attacker trying dictionary words will crack it as one word long before two. (As an aside, overlapping patterns are also the primary agent behind accidentally tragic domain name registrations, like childrens-laughter.com but without the hyphen.)

Search is the crux of the model. I’ll start there and work backwards.

Minimum entropy search

zxcvbn calculates a password’s entropy to be the sum of its constituent patterns. Any gaps between matched patterns are treated as brute-force “patterns” that also contribute to the total entropy. For example:

entropy("stockwell4$eR123698745") == surname_entropy("stockwell") +
                                     bruteforce_entropy("4$eR") +
                                     keypad_entropy("123698745")

That a password’s entropy is the sum of its parts is a big assumption. However, it’s a conservative assumption. By disregarding the “configuration entropy” — the entropy from the number and arrangement of the pieces — zxcvbn is purposely underestimating, by giving a password’s structure away for free: It assumes attackers already know the structure (for example, surname-bruteforce-keypad), and from there, it calculates how many guesses they’d need to iterate through. This is a significant underestimation for complex structures. Considering correcthorsebatterystaple, word-word-word-word, an attacker running a program like L0phtCrack or John the Ripper would typically try many simpler structures first, such as word, word-number, or word-word, before reaching word-word-word-word. I’m OK with this for three reasons:

  • It’s difficult to formulate a sound model for structural entropy; statistically, I don’t happen to know what structures people choose most, so I’d rather do the safe thing and underestimate.
  • For a complex structure, the sum of the pieces alone is often sufficient to give an “excellent” rating. For example, even knowing the word-word-word-word structure of correcthorsebatterystaple, an attacker would need to spend centuries cracking it.
  • Most people don’t have complex password structures. Disregarding structure only underestimates by a few bits in the common case.

With this assumption out of the way, here’s an efficient dynamic programming algorithm in CoffeeScript for finding the minimum non-overlapping match sequence. It runs in O(n·m) time for a length-n password with m (possibly overlapping) candidate matches.

# matches: the password's full array of candidate matches.
# each match has a start index (match.i) and an end index (match.j) into
# the password, inclusive.
minimum_entropy_match_sequence = (password, matches) ->
  # e.g. 26 for lowercase-only
  bruteforce_cardinality = calc_bruteforce_cardinality password
  up_to_k = []      # minimum entropy up to k.
  backpointers = [] # for the optimal sequence of matches up to k,
                    # holds the final match (match.j == k).
                    # null means the sequence ends w/ a brute-force char
  for k in [0...password.length]
    # starting scenario to try to beat:
    # adding a brute-force character to the minimum entropy sequence at k-1.
    up_to_k[k] = (up_to_k[k-1] or 0) + lg bruteforce_cardinality
    backpointers[k] = null
    for match in matches when match.j == k
      [i, j] = [match.i, match.j]
      # see if minimum entropy up to i-1 + entropy of this match is less
      # than the current minimum at j.
      candidate_entropy = (up_to_k[i-1] or 0) + calc_entropy(match)
      if candidate_entropy < up_to_k[j]
        up_to_k[j] = candidate_entropy
        backpointers[j] = match

  # walk backwards and decode the best sequence
  match_sequence = []
  k = password.length - 1
  while k >= 0
    match = backpointers[k]
    if match
      match_sequence.push match
      k = match.i - 1
    else
      k -= 1
  match_sequence.reverse()

  # fill in the blanks between pattern matches with bruteforce "matches"
  # that way the match sequence fully covers the password:
  # match1.j == match2.i - 1 for every adjacent match1, match2.
  make_bruteforce_match = (i, j) ->
    pattern: 'bruteforce'
    i: i
    j: j
    token: password[i..j]
    entropy: lg Math.pow(bruteforce_cardinality, j - i + 1)
    cardinality: bruteforce_cardinality
  k = 0
  match_sequence_copy = []
  for match in match_sequence # fill gaps in the middle
    [i, j] = [match.i, match.j]
    if i - k > 0
      match_sequence_copy.push make_bruteforce_match(k, i - 1)
    k = j + 1
    match_sequence_copy.push match
  if k < password.length # fill gap at the end
    match_sequence_copy.push make_bruteforce_match(k, password.length - 1)
  match_sequence = match_sequence_copy

  # or 0 corner case is for an empty password ''
  min_entropy = up_to_k[password.length - 1] or 0
  crack_time = entropy_to_crack_time min_entropy

  # final result object
  password: password
  entropy: round_to_x_digits min_entropy, 3
  match_sequence: match_sequence
  crack_time: round_to_x_digits crack_time, 3
  crack_time_display: display_time crack_time
  score: crack_time_to_score crack_time

backpointers[j] holds the match in this sequence that ends at password position j, or null if the sequence doesn’t include such a match. Typical of dynamic programming, constructing the optimal sequence requires starting at the end and working backwards.

Especially because this is running browser-side as the user types, efficiency does matter. To get something up and running I started with the simpler O(2m) approach of calculating the sum for every possible non-overlapping subset, and it slowed down quickly. Currently all together, zxcvbn takes no more than a few milliseconds for most passwords. To give a rough ballpark: running Chrome on a 2.4 GHz Intel Xeon, correcthorsebatterystaple took about 3ms on average. coRrecth0rseba++ery9/23/2007staple$ took about 12ms on average.

Threat model: entropy to crack time

Entropy isn’t intuitive: How do I know if 28 bits is strong or weak? In other words, how should I go from entropy to actual estimated crack time? This requires more assumptions in the form of a threat model. Let’s assume:

  • Passwords are stored as salted hashes, with a different random salt per user, making rainbow attacks infeasible.
  • An attacker manages to steal every hash and salt. The attacker is now guessing passwords offline at max rate.
  • The attacker has several CPUs at their disposal.

Here’s some back-of-the-envelope numbers:

# for a hash function like bcrypt/scrypt/PBKDF2, 10ms is a safe lower bound
# for one guess. usually a guess would take longer -- this assumes fast
# hardware and a small work factor. adjust for your site accordingly if you
# use another hash function, possibly by several orders of magnitude!
SINGLE_GUESS = .010 # seconds
NUM_ATTACKERS = 100 # number of cores guessing in parallel.

SECONDS_PER_GUESS = SINGLE_GUESS / NUM_ATTACKERS

entropy_to_crack_time = (entropy) ->
  .5 * Math.pow(2, entropy) * SECONDS_PER_GUESS

I added a .5 term because we’re measuring the average crack time, not the time to try the full space.

This math is perhaps overly safe. Large-scale hash theft is a rare catastrophe, and unless you’re being specifically targeted, it’s unlikely an attacker would dedicate 100 cores to your single password. Normally an attacker has to guess online and deal with network latency, throttling, and CAPTCHAs.

Entropy calculation

Up next is how zxcvbn calculates the entropy of each constituent pattern. calc_entropy() is the entry point. It’s a simple dispatch:

calc_entropy = (match) ->
  return match.entropy if match.entropy?
  entropy_func = switch match.pattern
    when 'repeat'     then repeat_entropy
    when 'sequence'   then sequence_entropy
    when 'digits'     then digits_entropy
    when 'year'       then year_entropy
    when 'date'       then date_entropy
    when 'spatial'    then spatial_entropy
    when 'dictionary' then dictionary_entropy
  match.entropy = entropy_func match

I gave an outline earlier for how repeat_entropy works. You can see the full scoring code on github, but I’ll describe two other scoring functions here to give a taste: spatial_entropy and dictionary_entropy.

Consider the spatial pattern qwertyhnm. It starts at q, its length is 9, and it has 3 turns: the initial turn moving right, then down-right, then right. To parameterize:

s # number of possible starting characters.
  # 47 for QWERTY/Dvorak, 15 for pc keypad, 16 for mac keypad.
L # password length. L >= 2
t # number of turns. t <= L - 1
  # for example, a length-3 password can have at most 2 turns, like "qaw".
d # average "degree" of each key -- the number of adjacent keys.
  # about 4.6 for QWERTY/Dvorak. (g has 6 neighbors, tilda only has 1.)

The space of total possibilities is then all possible spatial patterns of length L or less with t turns or less:

\lg \left ( \sum_{i=2}^{L} \sum_{j=1}^{\min{t,i - 1}} {{i - 1}\choose{j - 1}} s d^{j} \right )

(i – 1) choose (j – 1) counts the possible configurations of turn points for a length-i spatial pattern with j turns. The -1 is added to both terms because the first turn always occurs on the first letter. At each of j turns, there’s d possible directions to go, for a total of dj possibilities per configuration. An attacker would need to try each starting character too, hence the s. This math is only a rough approximation. For example, many of the alternatives counted in the equation aren’t actually possible on a keyboard: for a length-5 pattern with 1 turn, “start at q moving left” gets counted, but isn’t actually possible.

CoffeeScript allows natural expression of the above:

lg = (n) -> Math.log(n) / Math.log(2)

nPk = (n, k) ->
  return 0 if k > n
  result = 1
  result *= m for m in [n-k+1..n]
  result

nCk = (n, k) ->
  return 1 if k == 0
  k_fact = 1
  k_fact *= m for m in [1..k]
  nPk(n, k) / k_fact

spatial_entropy = (match) ->
  if match.graph in ['qwerty', 'dvorak']
    s = KEYBOARD_STARTING_POSITIONS
    d = KEYBOARD_AVERAGE_DEGREE
  else
    s = KEYPAD_STARTING_POSITIONS
    d = KEYPAD_AVERAGE_DEGREE
  possibilities = 0
  L = match.token.length
  t = match.turns
  # estimate num patterns w/ length L or less and t turns or less.
  for i in [2..L]
    possible_turns = Math.min(t, i - 1)
    for j in [1..possible_turns]
      possibilities += nCk(i - 1, j - 1) * s * Math.pow(d, j)
  entropy = lg possibilities
  # add extra entropy for shifted keys. (% instead of 5, A instead of a.)
  # math is similar to extra entropy from uppercase letters in dictionary
  # matches, see the next snippet below.
  if match.shifted_count
    S = match.shifted_count
    U = match.token.length - match.shifted_count # unshifted count
    possibilities = 0
    possibilities += nCk(S + U, i) for i in [0..Math.min(S, U)]
    entropy += lg possibilities
  entropy

On to dictionary entropy:

dictionary_entropy = (match) ->
  entropy = lg match.rank
  entropy += extra_uppercasing_entropy match
  entropy += extra_l33t_entropy match
  entropy

The first line is the most important: The match has an associated frequency rank, where words like the and good have low rank, and words like photojournalist and maelstrom have high rank. This lets zxcvbn scale the calculation to an appropriate dictionary size on the fly, because if a password contains only common words, a cracker can succeed with a smaller dictionary. This is one reason why xkcd and zxcvbn slightly disagree on entropy for correcthorsebatterystaple (45.2 bits vs 44). The xkcd example used a fixed dictionary size of 211 (about 2k words), whereas zxcvbn is adaptive. Adaptive sizing is also the reason zxcvbn.js includes entire dictionaries instead of a space-efficient Bloom filter — rank is needed in addition to a membership test.

I’ll explain how frequency ranks are derived in the data section at the end. Uppercasing entropy looks like this:

extra_uppercase_entropy = (match) ->
  word = match.token
  return 0 if word.match ALL_LOWER
  # a capitalized word is the most common capitalization scheme,
  # so it only doubles the search space (uncapitalized + capitalized):
  # 1 extra bit of entropy.
  # allcaps and end-capitalized are common enough too,
  # underestimate as 1 extra bit to be safe.
  for regex in [START_UPPER, END_UPPER, ALL_UPPER]
    return 1 if word.match regex
  # otherwise calculate the number of ways to capitalize
  # U+L uppercase+lowercase letters with U uppercase letters or less.
  # or, if there's more uppercase than lower (for e.g. PASSwORD), the number
  # of ways to lowercase U+L letters with L lowercase letters or less.
  U = (chr for chr in word.split('') when chr.match /[A-Z]/).length
  L = (chr for chr in word.split('') when chr.match /[a-z]/).length
  possibilities = 0
  possibilities += nCk(U + L, i) for i in [0..Math.min(U, L)]
  lg possibilities

So, 1 extra bit for first-letter-uppercase and other common capitalizations. If the uppercasing doesn’t fit these common molds, it adds:

\lg \left ( \sum_{i=1}^{min(U, L)} {{U + L}\choose{i}} \right )

The math for l33t substitution is similar, but with variables that count substituted and unsubstituted characters instead of uppers and lowers.

Pattern matching

So far I covered pattern entropy, but not how zxcvbn finds patterns in the first place. Dictionary match is straightforward: check every substring of the password to see if it’s in the dictionary:

dictionary_match = (password, ranked_dict) ->
  result = []
  len = password.length
  password_lower = password.toLowerCase()
  for i in [0...len]
    for j in [i...len]
      if password_lower[i..j] of ranked_dict
        word = password_lower[i..j]
        rank = ranked_dict[word]
        result.push(
          pattern: 'dictionary'
          i: i
          j: j
          token: password[i..j]
          matched_word: word
          rank: rank
	)
  result

ranked_dict maps from a word to its frequency rank. It’s like an array of words, ordered by high-frequency-first, but with index and value flipped. l33t substitutions are detected in a separate matcher that uses dictionary_match as a primitive. Spatial patterns like bvcxz are matched with an adjacency graph approach that counts turns and shifts along the way. Dates and years are matched with regexes. Hit matching.coffee on github to read more.

Data

As mentioned earlier, the 10k password list is from Burnett, released in 2011.

Frequency-ranked names and surnames come from the freely available 2000 US Census. To help zxcvbn not crash ie7, I cut off the surname dictionary, which has a long tail, at the 80th percentile (meaning 80% of Americans have one of the surnames in the list). Common first names include the 90th percentile.

The 40k frequency list of English words comes from a project on Wiktionary, which counted about 29M words across US television and movies. My hunch is that of all the lists I could find online, television and movie scripts will capture popular usage (and hence likely words used in passwords) better than other sources of English, but this is an untested hypothesis. The list is a bit dated; for example, Frasier is the 824th most common word.

Conclusion

At first glance, building a good estimator looks about as hard as building a good cracker. This is true in a tautological sort of way if the goal is accuracy, because “ideal entropy” — entropy according to a perfect model — would measure exactly how many guesses a given cracker (with a smart operator behind it) would need to take. The goal isn’t accuracy, though. The goal is to give sound password advice. And this actually makes the job a bit easier: I can take the liberty of underestimating entropy, for example, with the only downside of encouraging passwords that are stronger than they need to be, which is frustrating but not dangerous.

Good estimation is still difficult, and the main reason is there’s so many different patterns a person might use. zxcvbn doesn’t catch words without their first letter, words without vowels, misspelled words, n-grams, zipcodes from populous areas, disconnected spatial patterns like qzwxec, and many more. Obscure patterns (like Catalan numbers) aren’t important to catch, but for each common pattern that zxcvbn misses and a cracker might know about, zxcvbn overestimates entropy, and that’s the worst kind of bug. Possible improvements:

  • zxcvbn currently only supports English words, with a frequency list skewed toward American usage and spelling. Names and surnames, coming from the US census, are also skewed. Of the many keyboard layouts in the world, zxcvbn recognizes but a few. Better country-specific datasets, with an option to choose which to download, would be a big improvement.
  • As this study by Joseph Bonneau attests, people frequently choose common phrases in addition to common words. zxcvbn would be better if it recognized “Harry Potter” as a common phrase, rather than a semi-common name and surname. Google’s n-gram corpus fits in a terabyte, and even a good bigram list is impractical to download browser-side, so this functionality would require server-side evaluation and infrastructure cost. Server-side evaluation would also allow a much larger single-word dictionary, such as Google’s unigram set.
  • It’d be better if zxcvbn tolerated misspellings of a word up to a certain edit distance. That would bring in many word-based patterns, like skip-the-first-letter. It’s hard because word segmentation gets tricky, especially with the added complexity of recognizing l33t substitutions.

Even with these shortcomings, I believe zxcvbn succeeds in giving better password advice in a world where bad password decisions are widespread. I hope you find it useful. Please fork on github and have fun!

Big thanks to Chris Varenhorst, Gautam Jayaraman, Ben Darnell, Alicia Chen, Todd Eisenberger, Kannan Goundan, Chris Beckmann, Rian Hunter, Brian Smith, Martin Baker, Ivan Kirigin, Julie Tung, Tido the Great, Ramsey Homsany, Bart Volkmer and Sarah Niyogi for helping review this post.

  • Anonymous

    “Here’s a question: does a meter actually help people secure their accounts?”

    I was happy to see you start with this question, but I don’t think it was actually meaningfully answered in this post.

    The evidence you referenced does show that users choose common or easily-guessable passwords. However, do we have any evidence that password meters actually help? Regardless of how much work is put into building an accurate password meter, is there any evidence that shows that users pay attention to the meter or come up with better passwords because of it?

    • Dan Wheeler

      I don’t have data, that’d make a great usability study. That being said I do think they can help by giving direct, immediate (entertaining?) feedback — more so than a password policy listed on a help page, for example.

    • http://roryokane.com/ Rory

      At the very least, tools like these can help security-savvy people choose memorable-but-safe new passwords. That is, not on a website signup page, but on the zxcvbn demo page, purposefully trying to choose a strong password. Perhaps parents should encourage their children to choose passwords that zxcvbn rates as secure.

  • http://blog.justindorfman.com jdorfman

    This is amazing. Thanks, now I need to change *some* of my passwords…

  • http://dragon-lair.net/ Ahmed

    This is great! I will be using this :)

  • http://www.goldmark.org/jeff/ Jeffrey Goldberg

    This is the first strength meter I’ve seen that explicitly attempts to take into account the crucial fact that the strength of a password depends on the number of alternative passwords that the same system could have resulted in.

    So thank you very much for that.

    I’ve discussed this issue here:  http://blog.agilebits.com/2011/08/10/better-master-passwords-the-geek-edition/and recommend a diceware-like scheme for those passwords that people have to remember. (For others, I recommend a good password management system, and so I should disclose that I work for AgileBits, the makers of 1Password.)

    Cheers,

    -j

    • Dan Wheeler

      Thanks Jeffrey!

    • http://www.facebook.com/martin.max.7906 Martin Max

      Of couse, you are right.
      http://goo.gl/rQi32

  • http://profiles.google.com/perthorsheim Per Thorsheim

    Dan; more password meters – and passwords – you can use for your evaluation:

    http://securitynirvana.blogspot.com/2010/11/revisiting-password-meters.html 

    Please also consider submitting a proposal here:http://securitynirvana.blogspot.com/2012/04/passwords12-call-for-presentations.html Best regards,
    Per Thorsheim
    securitynirvana.blogspot.com

  • Pingback: Sane Password Strength Guidelines : Charles Gardner, Sterling Ideas Inc, IT Consultant

  • http://www.facebook.com/caveman.og Brian McNett

     I did a few dry runs using your tester with the system I use for generating secure, yet memorable passwords, which is a variation on Randal Munroe’s “correcthorsebatterystaple”.   I’m able to run between 70 and 90 bits of entropy, which is, I think sufficient for most purposes. My very best attempt hit 161 bits of entropy, and yes, I think I can remember that one too.

    Your limited client-side dictionary choked on some of the words in my best effort, and failed to recognize some uncommon words in my others, so it’s possible I’ve exceeded the limitations of your code.

    Bank of America disallowing some special characters in their passwords leads one to believe that they have issues sanitizing data inputs, which rather than fix, they’ve coded around. This results in a policy both overly restrictive, and quixotically, less secure. Oh, and it reveals they have unsanitized data inputs. Useful information for someone cracking their systems.

    At any rate, quite impressive.

  • Alankeister

    You may have a bug.  ”nothing to see here..” takes centuries.  But  ”nothing to see here…” with one more period, takes 43 years.

    • Dan Wheeler

      To keep candidate matches down, I currently only consider repeat patterns with three letters or more. I’ll change that to fix this problem!

  • http://twitter.com/adelcambre Andy Delcambre

    There is an interesting detail that one letter “words” like I and A count as dictionary words. So when they occur in random passwords they significantly bring down the entropy. 

    Compare these two password: qH3ipizm vs qH3fpfzm

    The only difference is that the first has I’s and the second has F’s in their place. 

    The first has an entropy of 35.489 and the second 46.692. Seems a bit off. 

    http://acd.io/FjYl

    http://acd.io/FjOE

    • Dan Wheeler

      The main problem at play here is I’m not adding “structural entropy” — entropy for each additional pattern that is added. As a result random character passwords are being overfitted instead of recognized as a single bruteforce region. 

      I still think it’s important to match single-letter words. For passphrases that contain the trigram “i am a”, for example, i’d like to give low entropy for “i” and “a” since zxcvbn doesn’t have an n-gram list to work with.

      Thanks for the feedback!

  • http://xato.net/ Mark Burnett

    First, glad to see someone else who has spent way too much time thinking about passwords!

    What you are trying to accomplish is actually much harder than most people realize and what you have done so far is pretty impressive. I have attempted this myself before and failed to come up with anything that actually works.

    Here’s a discrepancy to consider:
    aaaaaaaaaaz = crack time (seconds):
    0.338
    aaazaaaaaaa = crack time (seconds):
    18.455
    aaaZaaaaaaa = crack time (seconds):
    60.333
    azazazazazaz = crack time (seconds):
    3896630.102

    Here’s another:
    do you know? = crack time (seconds):
    3696.822
    do:you-kknow = crack time (seconds):
    3696.822
    do yuo know? = crack time (seconds):
    759249605.538

    And another:
    mountain = crack time (seconds):
    0.011
    niatnuom = crack time (seconds):
    70100.118

    I think all your logic is sound, you just need to take into consideration common cracking rules (see http://hashcat.net/wiki/rule_based_attack). Dictionary and brute-force you have covered well, now just take into consideration mutations.

     

    • Dan Wheeler

      Thanks Mark! Means a lot coming from you. Transformations like rotations, reflections, and range deletions get difficult to segment efficiently browser-side, but there’s definitely some easier improvements I’ll be working on to start with.

    • http://profiles.google.com/perthorsheim Per Thorsheim

      We meet again Mark. :)

      One of the things I’m a little surprised about with password meters such as this is the apparent lack of differentiating between online and offline password cracking attempts.

      No need to say, online attacks, lets say against a https webserver, cannot reach the massive number of attempts/second as we can do using a cheap graphics card offline against MD5, LM, NTLM or SHA-1…. or a bunch of others as well.
      While establishing online services without any type of account lockout or rate-limiting for logon attempts is pretty stupid to do these days, they do exist all over the internet.

      For offline attempts to crack password encryption or hash algorithms, with or without salt added to it (or more advanced systems), someone has to get access to the stored secrets. Usually if that happens, we’re already at “game over” status for the service owner. The next target is turning those low-value secrets into high-value resellable data by cracking the passwords and selling them off with usernames, e-mail addresses etc, in order to search for password reuse. Which, as we all know too well, happens all over the place. 67% of us is assumed to do it, according to Troy Hunt (sony/gawker analysis).

      I’m also surprised that many password meters does not ask, care or compare the password against the username, fullname, e-mail address and so forth. Isolated the password may show high entropy (or whatever flavor of password strength evaluation you choose), but still the password should be considered … mediocre or even complete crap (username = password). As far as I remember the rockyou list did contain e-mail addresses, although I never tried myself to confirm if those were actually passwords used for accounts with the same name.

      • http://xato.net/ Mark Burnett

        This illustrates just how complex it gets the more you think about it. If you are going to consider username and email address, why not also take into consideration other user info? What shows up on their facebook page? What about their answers to secret questions? And how about the user’s password history? Or as Dan mentioned, check Google’s n-gram corpus? And what about other languages? And what about crackability vs guessability?

        A password meter really can’t tell you for sure if a password is secure, although it can tell if it knows a particular password is weak. But ultimately the security of any particular password is unique to each user.

      • Dan Wheeler

        Hey Per,

        Focusing on an offline attack scenario is just meant to keep the math conservative, so it covers online attacks too.

        The second argument to zxcvbn(), user_inputs, is used to penalize passwords that contain a user’s full name, email, etc. 

        Good to hear from you

    • http://profiles.google.com/rauldmiller Raul Miller

      I think that assuming a specific search algorithm would be overspecifying the problem.  Keep in mind that a good password implementation should set off an alarm for query patterns corresponding to obvious search algorithms (even when the end points are distributed).

  • http://pulse.yahoo.com/_RJZRADMWDF7SJUEZGAABH6KKZM fook u

    the end of the day, learn another language possibly korean/japanese/chinese/arabic/swedish/hebrew/hindi and combine your multi-talented-linguistic to create password.  Want some advise about easy-remember password? pick any line from the bible/holy book of your religion and I can guarantee you that they won’t know (i.e. favorite mantras/gospels) 

  • Cliff Frey

    Excellent article and library.
    I think that you have a mistake in the stated entropy of the dictionary used for correcthorsebatterystaple.  The article says that it came from a dictionary of 2^16 words, but the xkcd cartoon assumes that it came from a dictionary of 2^11 words.

    • Dan Wheeler

      fail, thanks, fixed

  • Jens Günther

    It is possible to expand the example with other dictionaries, eg.: spanish, french, german etc…? 

    • Anonymous

      Yeah that would be even more difficult to crack, especially if you combine multiple languages!

  • http://twitter.com/stebehn Stephan

    BTW a similar project: http://howsecureismypassword.net/

  • Grégory Joseph

    Dude, I wanted to try it out in the context of DropBox (i.e to see how it looks like); aren’t you guys going to add this to the “change password” screen too ? ;)
    Thanks for the article and for open-sourcing though, great read and great project !

  • Michal Gebauer

    Password created from more concatenated words is not a solution to brute force character cracking, because instead of alphabet from single characters you use  alphabet of multiple characters.

    Basically using substitution you end up with same problem.

    • Joost

      An alphabet of multiple characters is way bigger than 26 entries, though, and quickly becomes infeasible for a brute forcer. In the end you’re just overfitting on passwords as you near an alphabet with entries of length 28. Then you do find correcthorsebatterystaple, but there are 26^28 entries in your alphabet.

    • Fadzlan

      Yes, you can brute force with characters. And you can brute force with words.

      Taking account of ASCII, one character represents 95 maximum permutation of printable characters. Hence for any 12 character password, there are 540,360,087,662,637,000,000,000 possible permutations.

      As for English language, for any 4 words password, there are possibility of 148,257,516,582,243,000,000,000,000 permutations, considering there are 171,476 permutations per word.

      I would argue that it will be easier to remember the 4 words password rather than any random characters of 12 words. As for brute force concern, if any of the 4 words passwords are longer than 12 characters, it will be more secure despite being made of dictionary words as some might say.

      And if you put some symbols between the words or … within the words, the maximum permutation will be significantly higher.

      I suppose that what it means by entropy right?

      • Anonymous

        In most cases you know the min.- and maxlength of a password for a given site. For passwords following the XKCD-Pattern, we can assume that:
        - Most passwords are longer than two words (Else it would be too easy to guess)
        - Most passwords are shorter than five words (Else it would be too hard to remember and most likely hit the max length restriction)
        - The password fits completely into the maxlength-restriction, i.e. the last word used is not truncated
        - Nouns are far more likely than other words
        - The words used have to be known by the users (obviously), but the vast majority of people have a very limited vocabulary (I’ve read somewhere that in extreme cases, some of us Germans use less than 500 different words frequently)
        - Long words are less likely than others, as they could cause the password to get too long

        With these assumptions, you could go and compile a list of passwords which are far more likely than others and use it as a starting point for the brute-force.

        • http://ploogy.com/ Adam Brenecki

          I think anyone recommending the XKCD technique would make a point of saying that it doesn’t work unless you let a computer randomly generate the password (such as on passphra.se). The comic assumes a random choice from a list of 2048 common words.

          Plus, I don’t think many sites that hash their passwords enforce upper length limits. (In fact, I’m always suspicious when they do.)

    • Anonymous

      I’ve thought about that too, but if you do the calculations then it actually takes longer to try and guess 4 correct patterns of the total English words than trying to guess random 20 or so combinations of alphabets.

  • http://www.thomasvanhoutte.be/ Thomas Vanhoutte

    Nice calculations :-)

  • Benedikt

    I hope you added correcthorsebatterystaple to the list of weak passwords after publishing this post? ;-)

    • Anonymous

      They did, and it displays a special message when you try :)

  • Anonymous

    I am surprised to see that https://www.grc.com/haystack.htm has not been brought up yet, it is another great resource for password evaluation.

    • http://twitter.com/greatwhitehat Jimmy Johansson

      Password Haystacks from Mr. Gibson made hard-cracked passwords easy to remember. 12349876Facebook12349876%12349876 is easy to remember, but takes time to crack.

      But there is more to this. Online attacks should be taken out by automatic IP-block on too many failed attempts. And the hashing and salting should be done correctly to make offline attacks too time consuming (rainbow tables).

    • http://www.goldmark.org/jeff/ Jeffrey Goldberg

      I have to disagree about Haystack for strength checking. Although it does mention it in the pages, the password strength measure crucially fails to graph the principle that:

      The strength of a password creation system is not how many letters, digits, and symbols you end up with, but how many ways you could get a different result using the same system.

  • Craig Jones

    The assumption, too, is that the website is storing a hashed version of the password and not a plaintext version.  Twice in the past couple of months I have “requested a password” and they send me back my password in *plaintext*. One of the websites is a secure site for storing patient information here in Maryland (I am going to write them, it is stupid, they should fire the vendor or developer).

  • http://chxor.chxo.com/ Anonymous

    Hey, nice!

    I built an open source password generator inspired by the same xkcd panel, you can use it at http://chxo.com/widgets/wordkey/

    There is also an html5 version that generates the passwords client-side so you don’t have to trust the server.
    http://chxo.com/widgets/wordkey/wordkey.html

  • Poopscumbucket

    Sometimes password fields are truncated, adding to short passwords.

  • look who’s talking

    I wish I didn’t need to see user names with offensive words no matter how good the contents are.

  • Pingback: matts perl | Pearltrees

  • http://twitter.com/bfred_it Federico Brigante

    Come on guys, even you know it’s not that great! :P

  • http://twitter.com/bfred_it Federico Brigante

    Come on guys, even you know it’s not that great! :P

  • http://blog.plover.com/ Mark Dominus

    Thanks for this excellent article.

    Your nCk function can be greatly improved. The version you have is extremely susceptible to arithmetic overflow even when the final result is small.  For example, nCk(20,18) = 380 is calculated as 20!/18! = 2432902008176640000 / 6402373705728000. Happily, there is a slightly simpler and more efficient algorithm, based on the recurrence nCk(n,k) = (n/k)*nCk(n-1,k-1), that does not suffer from this problem. Code is here: http://blog.plover.com/math/choose.html .

    • Dan Wheeler

      Thanks Mark! Pushed a fix adapted from your article. 

  • Wolf Software

    This MIGHT be of use to some of you also:

     http://www.wolf-software.com/downloads/php-classes/security-classes/brute-force-calculator-class/

  • Anonymous

    Great LIB. Thanks for sharing.

    And this need a Improve, on date pattern.
    13.3.1997 => date (MM/dd/yyyy)
    3.13.1997 => date (dd/MM/yyyy) 
    1331997 => NOT date
    3131997 => NOT date 
    19970313 => NOT date (ISO Date)
    1997.03.13 => NOT date (ISO Date) 

    https://en.wikipedia.org/wiki/Date_formats

    • Dan Wheeler

      thanks adad, date matching is now much improved, and includes all of your examples. 

  • http://twitter.com/LMDDGTFY LMDDGTFY

    Some i10n/l18n based on some browser/OS parameter would make this perfect

  • http://www.n8w.com/f/f/diqus Nate Williams

    Very interesting read. 

    The main reason people use easy-to-crack passwords is because they are easy to remember and the users are not informed about what makes a “good” password.

    Even a “good” password doesn’t protect you if you use the same password on multiple sites and one of those sites database gets hacked and the passwords are stored in clear text. The hacker can try your email and passwords on more important sites (paypal, banking,etc)A meter doesn’t really tell you how to create a strong password .. it just shows you whether it’s strong or not .. and then you can fiddle around with it until its very strong and very hard to remember.  I think the thing to do is to educate users on how they can strong unique passwords for multiple sites 

    example For passwords think of an algorithm that can be easily applied to all domain names. You will have to remember your algorithm, but once you know it , it will be easy to apply to any site and a lot easier to remember.Here are some algorithm examplesfirst letter of site +”2m0nkey!5″+ number of vowels + third letter of web site + “215″position of the  first vowel + “sHell02″ +   last letter + “06″ + second letternumber of letters + “t0mmy26″+ second to last letter + last voweletcExamplesSo if I am using the algorithm below, here would be my login information for each sitethe position of the first vowel + “sHell02″ + last letter + “06″ + second letterweb site: http://www.privatebanking.comusername: johndoe+privatebanking@gmail.compassword: 3sHell02g06rweb site: http://www.mywebsite.comusername:  johndoe +mywebsite@gmail.compassword: 4sHell02e06yweb site: http://www.emailsite.comusername:  johndoe +emailsite@gmail.compassword: 1sHell02e06m

  • Michael

    why is zyxwvutsrqponmlkjihgfedcb instant and zyxwvutsrqponmlkjihgfedcba  centuries?

    • Doug

      a drunk person can do that

  • Sheharyar Naseer

    Pretty Cool!

  • Intermernet

    Brilliant. I love the result of trying to use “correcthorsebatterystaple” as a dropbox password now :-)

    • http://www.gerv.net/ Gervase Markham

       Although if you try and use it with spaces, as the comic actually suggests, then you get a “Great!”.

      Also, there’s a message for “Tr0ub4dor&3″.

  • Pingback: Dropbox tech blog » Blog Archive » zxcvbn: realistic password strength estimation « wHo-et-Be

  • Pingback: And it’s all about security. | DocMan

  • Pingback: Dropbox ha reso open source il sistema di valutazione delle password | RampaCrew

  • Pingback: Hoe maak je een veilig wachtwoord? Zo dus. | Jaap Walhout

  • Pingback: Dropbox ha reso open source il sistema di valutazione delle password | Indipedia – Indipendenti nella rete

  • Pingback: zxcvbn: validazione delle password in javascript dal team di DropBox « Andrea Fortuna

  • Anonymous

    My first Hotmail password in the 90′s had 12 letters, mwahaha.

  • Adam

    I currently use the password_strength rails gem for the password checking, which has the same results as your old validation (stro, strong, so-so). I’d love to use your new validation, is there a way to have the check on both client and server side? Otherwise you’d just need to disable javascript (I browse with it always off for most websites) and any password would be accepted.
    Anyone who is already porting it to the rails world?

  • http://twitter.com/smallhadron Mark W.

    Great article and some good ideas. Been trying to implement some of this stuff at howsecureismypassword.net, but with very little free time I’ve not got round to a lot of it. Need to set aside a few days some time soon.

    • Alex Howell

      It’s sensible to immediately type your newly crafted password into a random website? (http://howsecureismypassword.net/) For all I know, the site just takes it and adds it into a cracker’s dictionary….?

      Surely http://howsecureismypassword.net/ should just have a sign saying “no matter how secure your password was, it will be a damn sight less if you type it in here”?

  • Sarahw

    I logged into my dropbox this morning to get a file I needed for school, and all the icons are missing? Does anyone know why this would be?

  • Pingback: Dropbox tech blog » zxcvbn: realistic password strength estimation

  • Pingback: Agency Byte » Fusion Radar: April 19, 2012

  • Pingback: Bocados de Actualidad (148º) | Versvs

  • Pingback: Ежедневник kvarkson'a | 23 Апреля / Атака трансформеров

  • Pingback: Passwords: What is a good minimum password requirement for a product requiring a login? - Quora

  • Pingback: La entropía y la seguridad

  • Leo

    I’m somewhat confused. Take for instance this password:

    Müller

    which is evaluated as follows:

    password: Müller
    entropy: 38.456
    crack time (seconds): 18857475.781
    crack time (display): 9 months

    However, Müller is the second most common family name in the German language. Granted there’s an umlaut, but still…

    • Rob

       In the source files, there are lots of _English_ surnames. When you want to use the script in Germany. Just make a list of surnames (and common German words). So that it will be of more use for you :-)

  • http://happywalldecals.com/ sarh @ Wall Stickers

    Wow, really enjoyed this post. So thorough and helpful, and I liked how you focused on each aspect which makes it easier to implement on my own blog.

  • Pingback: Zxcvbn: validazione delle password in javascript dal team di dropbox

  • Pingback: 10 Ways to Improve Password Security in Your Organization - Online File Storage | Online File Storage

  • lud mud

    Hello,

    I would like to know what i would have to modifiy in zxcvbn to get wxcvbn, an azerty-oriented version ?

    Thanks !

  • Pingback: ¿Cuan seguras son nuestras contraseñas? | Centro de Producciones Multimedia · UAE

  • Anonymous

    If a web site told me that it wouldn’t accept the first password because
    it was too weak, despite being the only website in the world I’ve used
    it on, I would naturally assume that their website is routinely hacked
    and I should assume the password database is public information. On the
    other extreme is github, where ‘password1′ is perfectly valid but
    ‘f^FH]!n’ isn’t because no numbers are present.

    Getting people
    on-board with one-use passwords, and building password stores or
    generators into browsers, is far more important than having an insanely
    long or complex password. The only time I need to care about Password
    Maker’s output is when a site thinks my password is too complex or not
    complex enough, when the reality is that any of my individual passwords.
    No, silly web sites, you aren’t as important to my security as you
    might think you are, and why should I care if it would take Russian
    hackers months or centuries to crack my Dropbox password when my banking
    passwords can’t be longer than 8 characters, isn’t case sensitive, and
    can’t have any symbols, and are probably being hacked nightly in a way
    the bank is too incompetent to find out? Fine, inform me, but let me sign up on my own terms.

  • Pingback: Correct horse battery staple « Populi Blog

  • Robin Powell

    It’s rather too generous with “dictionary words”; a random sequence of a and i characters, of any length, is considered to have an entropy of about length*1.5, because it considers them very obvious dictionary words.  :D  This is true with embedded a and i, too, so that 9SNF has an entropy of 24, but 9aNi has an entropy of 15.

    I don’t think this was intentional.  :D

    -Robin

  • http://www.geeknoob.com/ Kundan Bhardwaj

    To be honest I am more interested in the top 500 list of passwords cloud. Thanks as its amazing.

  • Pingback: Anonymous

  • http://www.formlogix.com/on-line-forms.htm on line forms

    A very simple technique for keeping the password is used. while filling up the online forms the password should be strong otherwise it doesn’t acceptable. Always select the password of good strength before filling online forms.

  • http://www.formlogix.com/on-line-forms.htm on line forms

    A very simple technique for keeping the password is used. while filling
    up the online forms the password should be strong otherwise it doesn’t
    acceptable. Always select the password of good strength before filling
    online forms. 

  • Guest

    Awe crap. Now I have to change my dropbox password. and -1 for requiring acceptance of 3rd party cookies to log-in. Bad web design.

  • http://twitter.com/grkuntzmd G Ralph Kuntz, MD

    Very nice coverage of a difficult but important subject.

  • Doug

    ; )smile(;=4 years
    ; )smile( ;=centuries

  • http://www.philliphaydon.com Phillip Haydon

    For the past 8 years, my banking passwords have been no shorter than 15 characters. About 2 years ago, anz.co.nz decided 16 characters was max… I was forced to shorten my 21 character password…

    • G. Ralph Kuntz, MD

      Why do financial institutions do that? American Express limits passwords to 12 characters. Have they not heard of secure hashes?

      • Dick Harris

        Because bank IT in general is a result of red tape and design by committee, not common sense and intelligence. I should know, I work in the financial industry.

  • Pingback: Password Policy in Cloud Foundry | CloudFoundry.org Blog

  • G. Ralph Kuntz, MD

    Very nice article. I have ported your algorithm to (functional programming) Scala code and plan on uploading it to Github after I have a chance to clean it up a bit.

    Have you considered adding the diceware (http://world.std.com/~reinhold/diceware.html) word list? Would this improve the algorithm?

  • http://profile.yahoo.com/Z5B23BDR7EZLMHIEV7BO7LTCDE Alex

    This is great! Thanks for your work on this, Dan.

    I found this after reading this: http://arstechnica.com/security/2012/08/passwords-under-assault/
    It led me to searching around for a good way to test some new passwords I’m trying out. I’ve been very impressed by zxcvbn’s ability to detect all kinds of words and patterns and thus recognize the password as being easy to crack, given the new algorithms hackers are using now.

    I type in the Dvorak keyboard as well as QWERTY, so I’ve been just typing a long, plain, l33tspeak word in the QWERTY pattern on a Dvorak keyboard. It makes it easy to remember, but very random. Even your otherwise very hard to fool zxcvbn tool doesn’t recognize any patterns. I don’t think this is a common enough practice for any hackers to run their normal smart algorithms over again, but this time assuming someone is typing in a QWERTY pattern on a Dvorak keyboard. Being unusual has its advantages!

  • Gfgdu

    hgfu,gvlikgjufifyugfufyuguwefgtusrdtgfugf

  • Jgbdcjdsk

    hjgdcxjkdshgcjsdhgk

  • “>

    nice

  • G. Ralph Kuntz, MD

    It looks like the `check_date` function in `matching.coffee` does not deal with 2-digit years.

  • Anthony

    As a gizmo lover but only a moderately capable techie/programmer, I’m not clear what the final outcome of this article was! Do we go with: (a) Tr0ub4dour&3 or (b) correcthorsebatterystaple or (c) zxcvbn ???

    • CheckTheFreeDemo

      Those score 2, 4 and 0 (out of 4), respectively.
      Check the example, it’s explained there.

  • http://blog.gingerlime.com/ Yoav Aner

    This is really neat. One thing I didn’t see mentioned and I really wonder about however – on the dropbox registration page, there’s no password confirmation input field.

    This must be the first time I’ve seen a password being requested without confirmation. Any particular reason for this? Do you simply assume people will reset their password if they typed it incorrectly?