We want to divorce the recovery code from the actual symmetric key that is derived from the recovery code, for two reasons:
- There is no need to force the user to write down the long symmetric key
- Every cipher needs a different key length, and there is no reason to tie the recovery code length to any particular cipher.
Keys for symmetric encryption do not need to be longer than 128 bits. Even if a computer could do trillions of attacks a second, and you had a million of these computers, it would still take a long time to exhaust the potential key space by brute force.
- possibilities = 2**128
- tries_per_second = 1e6 * 1e12 (million computers that could make a trillion attempts a second)
- 10 trillion years before the key space will be exhausted (possibilities / tries_per_second / years_in_seconds)
If we wanted to force users to have 128 bits in their recovery code:
- a hex code would require 32 characters
- an alphanumeric code would require 22 characters (e.g. Log2(63**22) = 131)
Both of those are higher than we would like. Typically, no one bothers with making users write down the raw 128 symmetric key because there is no need. Instead, a user is given a shorter key that is then passed through a key-stretching algorithm or Key Derivation Function (KDF) to achieve the same effect at a lower entropy. This works really well so long as the code is actually random. If it is not random, like a user chosen password, then obviously there are ways to shortcut the brute force attack process.
What about KDF using sha256?
- A typical unix crypt-style sha256 password digest is 5000 rounds of sha256 (with salt). PBKDF2-HMAC-SHA256 typically uses a similar number of rounds (although it is configurable).
- Off the shelf products like Brutalis can do 23 billion SHA256 hashes per second
- Brutalis are kind of cheap, so lets assume an adversary has 10 of them. This makes their attempt_rate = 46e6/second. e.g.
23e9 * 10 / 5000
- There are 3.37e20 seconds in 10 trillion years. e.g.
10.7e12 * 365 * 24 * 60 * 60
- You could make 1.55e28 attempts in that time. e.g.
46e6 * 3.37e20, or 93.64 bits of entropy
- It would require an alphanumeric code of length 16 to achieve similar entropy. e.g.
- So, to achieve the same security with a short code as you get with a raw 128 key, against this hypothetical adversary, it would require recovery codes with 16 random digits. If we wanted to make it case insensitive and remove digits that are easily confused (e.g. zero and 'o'), it would require 20 digits.
This assumes there is a salt. In other words, the salt is stored on disk in the clear and is available to anyone attempting recovery for a particular user (after authenticating using the recovery code, with a different salt).
This rough estimation does not include the time to attempt to decrypt once the symmetric key is derived, but it gives a ballpark estimate of how short you can make the code without compromising any resistance to brute force attacks.
A reduction in size from 22 characters to 16 characters does not seem so beneficial.
What about a KDF using argon2?
- Lets assume the attacker has 1000x the CPU power, but limited memory.