# PBKDF2

In cryptography, **PBKDF1** and **PBKDF2** (**Password-Based Key Derivation Function 1 and 2**) are key derivation functions with a sliding computational cost, used to reduce the vulnerability of encrypted keys to brute-force attacks.

PBKDF2 is part of RSA Laboratories' Public-Key Cryptography Standards (PKCS) series, specifically PKCS #5 v2.0, also published as the Internet Engineering Task Force's RFC 2898. It supersedes PBKDF1, which could only produce derived keys up to 160 bits long. RFC 8018, published in 2017, still recommends PBKDF2 for password hashing, even though newer password hashing functions such as Argon2 are designed to address weaknesses in older functions such as PBKDF2.

## Purpose and operation

PBKDF2 applies a pseudorandom function, such as a hash-based message authentication code (HMAC), to the input password or passphrase along with a salt value, and repeats the process many times to produce a *derived key*, which can then be used as a cryptographic key in subsequent operations. The added computational work makes password cracking much more difficult, and is known as key stretching.

When the standard was written in 2000, the recommended minimum number of iterations was 1000, but the parameter is intended to be increased over time as CPU speeds increase. As of 2005, a Kerberos standard recommended 4096 iterations. Apple iOS 3 used 2000 iterations and iOS 4 a higher count, while in 2011 LastPass used 5000 iterations for JavaScript clients and a higher count for server-side hashing.
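Because the iteration count is meant to track hardware speed, one common approach is to measure it on the target machine rather than hard-code it. The following is a minimal sketch of that idea using Python's `hashlib.pbkdf2_hmac`; the helper name `calibrate_iterations`, the starting count, and the time budget are illustrative assumptions, not values taken from the standard.

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float = 0.05,
                         start: int = 1000,
                         hash_name: str = "sha256") -> int:
    """Double the iteration count until one derivation takes at least
    target_seconds on this machine (hypothetical helper)."""
    password = b"benchmark-password"  # dummy inputs, used only for timing
    salt = b"\x00" * 16
    iterations = start
    while True:
        t0 = time.perf_counter()
        hashlib.pbkdf2_hmac(hash_name, password, salt, iterations)
        elapsed = time.perf_counter() - t0
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2  # cost grows linearly with the iteration count


chosen = calibrate_iterations()
print("iterations for this machine:", chosen)
```

The result varies from machine to machine, which is the point: the same time budget maps to a larger count on faster hardware.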

Having a salt added to the password reduces the ability to use precomputed hashes (rainbow tables) for attacks, and means that multiple passwords have to be tested individually, not all at once. The standard recommends a salt length of at least 64 bits.
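The effect of the salt can be illustrated with a short Python sketch. It assumes HMAC-SHA256 as the pseudorandom function, a 128-bit random salt, and an illustrative iteration count; the helper `derive_key` is a hypothetical name introduced here.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 (illustrative parameters)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt, iterations, dklen=32)


# Two users with the same password but independent random salts.
salt_a = os.urandom(16)  # 128 bits, above the 64-bit minimum
salt_b = os.urandom(16)
key_a = derive_key("correct horse", salt_a)
key_b = derive_key("correct horse", salt_b)

# The derived keys differ, so a table precomputed for one salt
# is useless against the other.
print(key_a.hex())
print(key_b.hex())
```

The salt is not secret; it is stored alongside the derived key so that the same derivation can be repeated when the password is checked.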

## Key derivation process

The PBKDF2 key derivation function has five input parameters:

$$DK = \mathrm{PBKDF2}(\mathrm{PRF}, \mathit{Password}, \mathit{Salt}, c, \mathit{dkLen})$$

where:

- *PRF* is a pseudorandom function of two parameters with output length *hLen* (e.g. a keyed HMAC)
- *Password* is the master password from which a derived key is generated
- *Salt* is a sequence of bits, known as a cryptographic salt
- *c* is the number of iterations desired
- *dkLen* is the desired length of the derived key
- *DK* is the generated derived key

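Each *hLen*-bit block $T_i$ of the derived key *DK* is computed as follows (with $\Vert$ marking string concatenation):

$$DK = T_1 \mathbin{\Vert} T_2 \mathbin{\Vert} \cdots \mathbin{\Vert} T_{\lceil \mathit{dkLen}/\mathit{hLen} \rceil}$$

$$T_i = F(\mathit{Password}, \mathit{Salt}, c, i)$$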

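The function *F* is the exclusive or (XOR, written $\oplus$) of *c* iterations of chained PRFs. The first iteration of PRF uses *Password* as the PRF key and *Salt* concatenated with *i* encoded as a big-endian 32-bit integer as the input. (Note that *i* is a 1-based index.) Subsequent iterations of PRF use *Password* as the PRF key and the output of the previous PRF computation as the input:

$$F(\mathit{Password}, \mathit{Salt}, c, i) = U_1 \oplus U_2 \oplus \cdots \oplus U_c$$

where:

$$U_1 = \mathrm{PRF}(\mathit{Password}, \mathit{Salt} \mathbin{\Vert} \mathrm{INT\_32\_BE}(i))$$

$$U_2 = \mathrm{PRF}(\mathit{Password}, U_1)$$

$$\vdots$$

$$U_c = \mathrm{PRF}(\mathit{Password}, U_{c-1})$$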
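The construction described in this section can be sketched directly in Python. This is a minimal illustration rather than a reference implementation: it assumes HMAC as the PRF, measures `dk_len` in bytes instead of bits, and truncates the final block to the requested length. The function name `pbkdf2` and its parameter names are chosen here for illustration.

```python
import hashlib
import hmac
import math
import struct


def pbkdf2(hash_name: str, password: bytes, salt: bytes,
           c: int, dk_len: int) -> bytes:
    """PBKDF2 with HMAC-<hash_name> as the PRF; dk_len is in bytes."""
    h_len = hashlib.new(hash_name).digest_size
    n_blocks = math.ceil(dk_len / h_len)

    def prf(data: bytes) -> bytes:
        # Password is always the HMAC key.
        return hmac.new(password, data, hash_name).digest()

    def f(i: int) -> bytes:
        # U_1 = PRF(Password, Salt || INT_32_BE(i))
        u = prf(salt + struct.pack(">I", i))
        t = int.from_bytes(u, "big")
        for _ in range(c - 1):
            u = prf(u)                      # U_j = PRF(Password, U_{j-1})
            t ^= int.from_bytes(u, "big")   # accumulate the XOR of all U_j
        return t.to_bytes(h_len, "big")

    # DK = T_1 || T_2 || ..., truncated to dk_len bytes; i is 1-based.
    dk = b"".join(f(i) for i in range(1, n_blocks + 1))
    return dk[:dk_len]


derived = pbkdf2("sha256", b"password", b"salt", 1000, 40)
print(derived.hex())
```

For real use, a library routine such as `hashlib.pbkdf2_hmac` is preferable; the sketch is useful mainly for checking one's understanding against it.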

For example, WPA2 uses:

$$DK = \mathrm{PBKDF2}(\text{HMAC-SHA1}, \mathit{passphrase}, \mathit{ssid}, 4096, 256)$$

## HMAC collisions

PBKDF2 has an interesting property when using HMAC as its pseudo-random function. It is possible to trivially construct any number of resulting collisions for different passwords. If a supplied password is longer than the block size of the underlying HMAC hash function, the password is first pre-hashed into a digest, and that digest is instead used as the password. For example, the following password is too long:

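**Password:** `plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd`

therefore (when, for example, using HMAC-SHA1) it is pre-hashed using SHA-1 into:

**SHA1** (hex): `65426b585154667542717027635463617226672a`

which can be represented in ASCII as:

**SHA1** (ASCII): `eBkXQTfuBqp'cTcar&g*`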

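This means that, regardless of the salt or iteration count, PBKDF2 with HMAC-SHA1 will generate the same key bytes for the passwords:

- `plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd`
- `eBkXQTfuBqp'cTcar&g*`

The collision is tied to the hash function underlying the HMAC: with a different hash (e.g. SHA-256), the long password is pre-hashed into a different digest, so this particular pair no longer collides.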

For example, using:

- **PRF:** HMAC-SHA1
- **Salt:** A009C1A485912C6AE630D3E744240B04
- **Iterations:** 1,000
- **Desired key length:** 16 bytes

the following two function calls:

- PBKDF2-HMAC-SHA1(`plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd`, ...)
- PBKDF2-HMAC-SHA1(`eBkXQTfuBqp'cTcar&g*`, ...)

will generate the same derived key bytes. These derived key collisions do not represent a security vulnerability, because an attacker must still know the original password in order to compute its *hash*. The collisions are therefore a mere curiosity.
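The collision can be reproduced with Python's `hashlib`. In this sketch the second password is not typed in but computed as the SHA-1 digest of the long one, which is what HMAC does internally with a key longer than its 64-byte block size. The helper name `dk` is an illustrative assumption; the salt, iteration count, and key length are the ones listed above.

```python
import hashlib

long_pw = b"plnlrtfpijpuhqylxbgqiiyipieyxvfsavzgxbbcfusqkozwpngsyejqlmjsytrmd"
salt = bytes.fromhex("A009C1A485912C6AE630D3E744240B04")

# HMAC replaces a key longer than the block size by its digest,
# so the SHA-1 digest of the long password acts as an equivalent key.
short_pw = hashlib.sha1(long_pw).digest()


def dk(hash_name: str, password: bytes) -> bytes:
    """16-byte PBKDF2 key with 1,000 iterations (parameters from the text)."""
    return hashlib.pbkdf2_hmac(hash_name, password, salt, 1000, dklen=16)


# With HMAC-SHA1 both passwords lead to the same HMAC key.
same_sha1 = dk("sha1", long_pw) == dk("sha1", short_pw)

# With HMAC-SHA256 the long password is pre-hashed with SHA-256 instead,
# so the two passwords no longer map to the same HMAC key.
same_sha256 = dk("sha256", long_pw) == dk("sha256", short_pw)

print(short_pw)
print("collide under HMAC-SHA1:  ", same_sha1)
print("collide under HMAC-SHA256:", same_sha256)
```

The same trick works for any password longer than the block size of the hash in use, which is why the number of constructible collisions is unlimited.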

## Alternatives to PBKDF2

One weakness of PBKDF2 is that while its number of iterations can be adjusted to make it take an arbitrarily large amount of computing time, it can be implemented with a small circuit and very little RAM, which makes brute-force attacks using application-specific integrated circuits (ASICs) or graphics processing units (GPUs) relatively cheap. The bcrypt key derivation function requires a larger amount of RAM (though this is still not separately tunable, i.e. it is fixed for a given amount of CPU time) and is slightly stronger against such attacks, while the more modern scrypt key derivation function can use arbitrarily large amounts of memory and is therefore more resistant to ASIC and GPU attacks.
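The difference in tunable parameters can be seen by calling both functions from Python's `hashlib`. This is a rough sketch with illustrative parameter values; it assumes a Python build in which `hashlib.scrypt` is available (it depends on the underlying OpenSSL version), and the memory figure is the usual approximation of 128 × *n* × *r* bytes rather than an exact measurement.

```python
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)

# PBKDF2: the only cost knob is the iteration count (CPU time).
pbkdf2_key = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000, dklen=32)

# scrypt: n and r also determine how much memory each guess needs.
n, r, p = 2**14, 8, 1
scrypt_key = hashlib.scrypt(password, salt=salt, n=n, r=r, p=p, dklen=32)

# Approximate working memory per derivation, in MiB.
approx_memory_mib = 128 * n * r / 2**20

print(pbkdf2_key.hex())
print(scrypt_key.hex())
print(f"scrypt working memory: about {approx_memory_mib:.0f} MiB")
```

Raising *n* increases both time and memory for scrypt, whereas raising the PBKDF2 iteration count increases time only.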