Encrypting Your Plaintext Passwords

If you have been following technology news, you may have heard that the popular social application site RockYou was recently hacked, and that all of its user passwords, which were stored in plaintext, were stolen (over 32 million accounts). This is a terrible security lapse, not just because it compromises every RockYou account, but because many users reuse the same password across multiple sites, so a file of emails and plaintext passwords lets the attacker break into a great many of those people's accounts elsewhere on the web. Repeat: the popularity of RockYou means that anyone with a RockYou account is likely to have their accounts on other sites compromised.

Like many people conscious of web security, my initial reaction to the incident was to shake my head and tsk-tsk at RockYou's foolishness in storing their passwords in plaintext. Then I realized that the real problem isn't just that plenty of sites out there do this; it's that, even after this report, they likely have no idea how to fix it. Think about it: if you weren't sophisticated enough to encrypt your passwords in the first place, you likely aren't up to the task of migrating your plaintext passwords into an encrypted format, which is a tricky migration involving lots of moving parts and little details.

In fact, I imagine that right now there are countless startups and little web companies who have seen that report and realized, "Hey, our passwords are stored in plaintext!" and then "Crap, I have no idea how we're supposed to fix that!" and might just be waiting in silent/hopeful fear that their sites don't get hacked. While the migration is a straightforward operation for the best programmers, there are many amateur sites programmed by regular dudes who never thought about password encryption and some of those have probably taken off in popularity, and those pose a real danger. Unfortunately, once you get popular, getting hacked is only a matter of time.

This blog post is therefore a step-by-step description of how to migrate your site from using plaintext passwords to encrypted passwords. If you run a small (or large) but growing website, you probably want to fix this immediately. If you do not understand all the steps, find a technical friend (or a reliable contractor) who does, and ask them to implement it for you. I am also available for consulting gigs at a very high hourly rate.

The Concepts

The core concept is the one-way encryption function (strictly speaking, a one-way hash function). Don't worry, you won't need to write this function, you just need to use it. Most popular web scripting languages provide one, and it is relatively straightforward to use. In all my examples, I will use PHP, the most popular of the modern web languages - the examples all roughly transfer to other languages. In PHP, this function is called crypt(). There are actually a couple of things wrong with this function, so before you go and use it, read the documentation carefully (like the part where it says the standard DES-based algorithm only uses the first eight characters of the string).

Using crypt(), you take a password, encrypt it to get a string of gibberish, and you store that string of gibberish instead of the plaintext password. You do this when the user signs up - they type in their password, and your signup script encrypts the password and writes that result to the database instead of the plaintext that the user typed in.

How do you check if the user enters the correct password when they log in, then? When the user logs in, they'll type in their password, and you run what they typed through the crypt() function AGAIN, which gives you a string of gibberish to compare to the gibberish you originally recorded in the database. If they match, then it means the user entered the correct password. If they don't, it means they entered something else (i.e. wrong).

Why does this work? It's because crypt() is a one-way encryption, which means that you can use it to encrypt a piece of text (like a plaintext password) into gibberish, but you cannot decrypt that gibberish back into the original text. This means that if a hacker penetrates your security and steals your database contents, all they will get is a list of encrypted gibberish which they CAN'T use to log into your users' accounts on your site or other sites.

Got it? Okay, good, because I lied - it's not that simple: actually, they still can.

What they can do is brute-force it: run every possible password through crypt(), one after another, and compare each resulting string of gibberish against the gibberish they found in your database. Whenever one matches, they've stumbled onto the password of one of your users. "All possible passwords" is not actually a very large set (for modern computers), especially since most users have very simple passwords, so even if they don't crack the hard ones, they'll get a lot of the simple ones.
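To make the attack concrete, here is a minimal sketch of it. It assumes every account was encrypted with one shared crypt() setting, which is roughly what having no per-user salt amounts to. The `SHARED_SETTING` constant, the usernames, and the passwords are all made up for illustration.

```php
<?php
// Hypothetical "unsalted" setup: every account uses the same fixed crypt()
// setting (bcrypt at its lowest cost), so equal passwords give equal gibberish.
const SHARED_SETTING = '$2y$04$......................';

// Stolen table: username => gibberish (illustrative data only).
$stolen = [
    'alice' => crypt('123456', SHARED_SETTING),
    'bob'   => crypt('letmein', SHARED_SETTING),
    'carol' => crypt('123456', SHARED_SETTING),
    'dave'  => crypt('x9$Qv!rT2#pL', SHARED_SETTING),
];

// Encrypts each guess once and looks for it in the whole table.
function dictionary_attack(array $stolen, array $guesses): array
{
    $cracked = [];
    foreach ($guesses as $guess) {
        $gibberish = crypt($guess, SHARED_SETTING);
        foreach (array_keys($stolen, $gibberish, true) as $username) {
            $cracked[$username] = $guess;
        }
    }
    return $cracked;
}

$cracked = dictionary_attack($stolen, ['password', '123456', 'qwerty', 'letmein']);
print_r($cracked);
```

Note that one crypt() call per guess is enough to test that guess against every account at once, and that the two users who share a password are exposed together.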

To fight that, we add something called the "salt." The salt is a random string of characters that you also pass to crypt(). When you give crypt() a plaintext password plus a salt, it will produce a different string of gibberish than if you gave it a different salt. The resulting gibberish has as its first few characters the salt, which seems insecure but actually isn't (think about it). The result is that your attacker can no longer encrypt each possible password once and compare it against your whole database; they have to try every possible password combined with the particular salt for each user, which multiplies the attacker's work by the number of accounts.
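Here is a small sketch of the salt in action. It assumes PHP 7 or later and the bcrypt variant of crypt() (the "$2y$" prefix with a cost of 10), where the salt plus its prefix take up the first 29 characters of the gibberish. The `random_setting()` helper is a name I made up, not something PHP provides.

```php
<?php
// Hypothetical helper: builds a crypt() setting string with a fresh random salt.
// Assumes bcrypt ("$2y$", cost 10), whose salt is 22 characters drawn from
// ./0-9A-Za-z, and PHP 7+ for random_bytes().
function random_setting(): string
{
    $salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);
    return '$2y$10$' . $salt;
}

// Two users who happen to pick the same password.
$alice = crypt('123456', random_setting());
$carol = crypt('123456', random_setting());

// Different salts, so the stored gibberish differs.
var_dump($alice === $carol);

// The first 29 characters of the gibberish hold the prefix and the salt, which
// is all you need to recompute the gibberish from the correct password.
$recomputed = crypt('123456', substr($alice, 0, 29));
var_dump(hash_equals($alice, $recomputed));
```

The salt being readable is fine: it is not a secret, it just forces the attacker to redo the guessing for each account separately.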

To review, what you want to do is:

Upon signup, you take the user's password, come up with a random salt (do NOT produce the salt from the user's password, like MD5'ing it - it must be an unrelated random string), and do the following:
1. Create a random string of N characters, call it random_salt.

2. crypt(plaintext_password, random_salt) -> gibberish

3. The gibberish will look like: random_salt, followed by more gibberish.

4. Store this thing in the database. Do not store the plaintext password anywhere.

When the user logs in, ask for their username and password.
1. Look up the encrypted gibberish for that username in the database.

2. Take the first N characters of the gibberish, which will be the random_salt.

3. crypt(what_the_user_entered_as_their_password_upon_login, random_salt) -> new_gibberish

4. If the new_gibberish matches the gibberish you got from the database, then log them in.

Interestingly, what you do when a user forgets their password is the following:
1. You do NOT email them their password (you can't; you never stored their original plaintext)

2. Verify somehow that this is really the user (secret question, whatever).

3. Email them a link to a page where they reset their password. When they enter this new password, do the same thing as in signup: encrypt the password and store the resulting gibberish.
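One way the emailed reset link could work is with a random single-use token, sketched below. This is my assumption about the mechanism, not something the steps above prescribe: the `$resets` array stands in for a hypothetical reset table, and both function names are made up. It needs PHP 7 or later.

```php
<?php
// In-memory stand-in for a password-reset table (hypothetical schema).
$resets = [];

// Creates a single-use token for the emailed link and stores only its hash.
function create_reset_token(array &$resets, string $username, int $ttl_seconds = 3600): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters
    $resets[$username] = [
        'token_hash' => hash('sha256', $token),
        'expires_at' => time() + $ttl_seconds,
    ];
    return $token;
}

// Returns true once for a valid, unexpired token, then forgets it.
function redeem_reset_token(array &$resets, string $username, string $token): bool
{
    if (!isset($resets[$username])) {
        return false;
    }
    $row = $resets[$username];
    $valid = time() <= $row['expires_at']
        && hash_equals($row['token_hash'], hash('sha256', $token));
    if ($valid) {
        unset($resets[$username]);
    }
    return $valid;
}

$token = create_reset_token($resets, 'alice');
var_dump(redeem_reset_token($resets, 'alice', 'not-the-token')); // wrong token
var_dump(redeem_reset_token($resets, 'alice', $token));          // the emailed token
var_dump(redeem_reset_token($resets, 'alice', $token));          // same token, reused
```

Once the token is redeemed, you take the new password the user types and treat it exactly like a signup: encrypt it and store the gibberish.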

That describes the end state of what your system is supposed to do. Notice that it is able to handle all registration, login, and password resets without storing the plaintext passwords anywhere. Obviously, it's easiest to have originally written your site that way, but hey, if you didn't know about it, how could you have? The tough part is if you originally wrote your site to store plaintext passwords, how do you migrate to a system that only stores encrypted passwords?

The Migration Path

First, if you were paying attention to the above section, you would realize that it should involve writing a couple functions (these are pseudo-PHP):
// Takes a password the user gave us, and generates the
// crypt-text (gibberish)
function password_encrypt($plaintext) { }

// Takes the password a user entered and the
// encrypted password from the database
// and checks if they match.
function password_check($plaintext, $encrypted_password) { }
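For the curious, here is one way those two functions might be filled in. Treat it as a sketch under assumptions rather than the definitive version: it assumes PHP 7 or later and picks the bcrypt variant of crypt() with a cost of 10, neither of which the outline above specifies.

```php
<?php
// Takes a password the user gave us and returns the gibberish to store.
// Assumption: bcrypt via crypt() ("$2y$", cost 10) with a 22-character salt.
function password_encrypt(string $plaintext): string
{
    $salt = substr(strtr(base64_encode(random_bytes(16)), '+', '.'), 0, 22);
    $encrypted = crypt($plaintext, '$2y$10$' . $salt);
    if (strlen($encrypted) !== 60) {
        // crypt() signals failure with a short string instead of an exception.
        throw new RuntimeException('crypt() did not return a bcrypt string');
    }
    return $encrypted;
}

// Re-encrypts what the user typed, using the salt embedded at the start of the
// stored gibberish, and compares the two strings byte for byte.
function password_check(string $plaintext, string $encrypted_password): bool
{
    return hash_equals($encrypted_password, crypt($plaintext, $encrypted_password));
}

$stored = password_encrypt('correct horse');
var_dump(password_check('correct horse', $stored));
var_dump(password_check('wrong guess', $stored));
```

Passing the whole stored string back to crypt() works because crypt() only reads the salt portion at the front of it. hash_equals() is used instead of == so the comparison is exact and does not leak timing information.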

However, the main problem with the migration is that your passwords are currently stored in plaintext, so how do you migrate them all over without taking down your site in order to encrypt everything (which can take a while) and launch the new encryption-ready code (which could have bugs)?

Step 1:

Write code that dual-writes the plaintext and encrypted text: Create a new column in your user table to store the encrypted password. In your signup code, right next to where you currently store the plaintext password, add code that also encrypts the password and stores the encrypted password in the new column. Launch this code. Now all new user registrations are producing an encrypted version of their password as well. You can verify this by looking in the database.
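A minimal sketch of the dual write, with a plain array standing in for the user table. The column names `password` and `password_crypt` are hypothetical, and to keep the snippet self-contained it calls PHP's built-in password_hash() (PHP 5.5 and later) where your own password_encrypt() would go.

```php
<?php
// In-memory stand-in for the user table. The column names `password`
// (existing plaintext) and `password_crypt` (new) are hypothetical.
$users = [];

function signup(array &$users, string $username, string $plaintext): void
{
    $users[$username] = [
        // Existing write: leave it alone until the final step.
        'password' => $plaintext,
        // New write: password_hash() stands in for password_encrypt().
        'password_crypt' => password_hash($plaintext, PASSWORD_BCRYPT),
    ];
}

signup($users, 'alice', 'hunter2');

// "Looking in the database": the new column is filled in and it verifies.
var_dump(password_verify('hunter2', $users['alice']['password_crypt']));
```

Nothing reads the new column yet, so if the encryption code has a bug at this stage, no login breaks.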

Step 2:

Write a script that goes through all user accounts and encrypts the plaintext passwords, storing the new encrypted password in the new column in the user table. This script might take a long time to run, but you can run it whenever you want during off-peak hours: thanks to Step 1, all new data is already being stored in encrypted format, so the script only has to fix the existing plaintext data once.
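The backfill might look something like the sketch below. It assumes a user table modeled as an array with the hypothetical columns `password` and `password_crypt`, uses the built-in password_hash() as a stand-in for your encryption function, and works in small batches so it can be stopped and restarted. The batching is my suggestion, not a requirement.

```php
<?php
// Backfills up to $batch_size rows that still lack an encrypted password and
// returns how many it changed. Safe to re-run: finished rows are skipped.
function backfill_batch(array &$users, int $batch_size): int
{
    $done = 0;
    foreach ($users as $username => $row) {
        if ($done >= $batch_size) {
            break;
        }
        if ($row['password_crypt'] === null) {
            $users[$username]['password_crypt'] =
                password_hash($row['password'], PASSWORD_BCRYPT);
            $done++;
        }
    }
    return $done;
}

// Illustrative data: two legacy rows and one written after Step 1 launched.
$users = [
    'alice' => ['password' => 'hunter2', 'password_crypt' => null],
    'bob'   => ['password' => 'letmein', 'password_crypt' => null],
    'carol' => ['password' => 'qwerty',
                'password_crypt' => password_hash('qwerty', PASSWORD_BCRYPT)],
];

while (backfill_batch($users, 1) > 0) {
    // A real script might pause here to stay gentle on the database.
}
```

Because rows that already have an encrypted password are skipped, running the script twice (or after a crash) does no harm.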

Step 3:

Change your login code to check against the encrypted passwords instead of the plaintext passwords. This is the moment of truth: if you've written your code correctly (and tested it), you shouldn't have any problems. I recommend that you do whatever you need to do in order to allow you to revert quickly (e.g. comment out the old code instead of deleting it, or if you have site infrastructure to flip back and forth between functionality, use that) in case you made a mistake: you will know almost immediately because if you made a mistake, your users won't be able to log in. Let this code run for a day or week or however long it takes for you to feel sure that users are successfully logging in and authentication is being done with the encrypted versions of the passwords.
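Here is a sketch of a login check that can be flipped back to the old behavior in one place. The `USE_ENCRYPTED_LOGIN` constant and the column names are hypothetical, the array stands in for your user table, and password_verify() (PHP 5.5 and later) stands in for your own password_check().

```php
<?php
// Hypothetical switch: set to false to fall back to the old plaintext check.
const USE_ENCRYPTED_LOGIN = true;

function login(
    array $users,
    string $username,
    string $typed,
    bool $use_encrypted = USE_ENCRYPTED_LOGIN
): bool {
    if (!isset($users[$username])) {
        return false;
    }
    $row = $users[$username];
    if ($use_encrypted) {
        // New path: compare against the encrypted column.
        return password_verify($typed, $row['password_crypt']);
    }
    // Old path, kept only so you can revert quickly.
    return hash_equals($row['password'], $typed);
}

// Illustrative row that has been through Steps 1 and 2.
$users = [
    'alice' => [
        'password'       => 'hunter2',
        'password_crypt' => password_hash('hunter2', PASSWORD_BCRYPT),
    ],
];

var_dump(login($users, 'alice', 'hunter2'));
var_dump(login($users, 'alice', 'not-it'));
```

Since the plaintext column still exists at this point, flipping the switch back is a complete rollback.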

Step 4:

Delete the plaintext passwords. Yes, you will never be able to get them back - that is the point. If you are really paranoid, you may want to copy them off to another table and delete them from the main user table and, after a requisite period of time, destroy the copy. Make sure you destroy all backups and logs as well.
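Since this step cannot be undone, you may want one last sanity check before pulling the trigger. The sketch below is an optional extra of my own, not part of the step as described: it refuses to delete anything unless every row's encrypted password verifies against the plaintext still on file. The array stands in for your user table, the column names are hypothetical, and password_verify() stands in for your own check function.

```php
<?php
// Returns the usernames whose encrypted password is missing or does not
// match the plaintext still on file.
function rows_not_ready(array $users): array
{
    $bad = [];
    foreach ($users as $username => $row) {
        if ($row['password_crypt'] === null
            || !password_verify($row['password'], $row['password_crypt'])) {
            $bad[] = $username;
        }
    }
    return $bad;
}

// Removes the plaintext column, but only if every row is fully migrated.
function delete_plaintext(array &$users): void
{
    if (rows_not_ready($users) !== []) {
        throw new RuntimeException('Some rows are not migrated yet');
    }
    foreach (array_keys($users) as $username) {
        unset($users[$username]['password']);
    }
}

// Illustrative data: one fully migrated row.
$users = [
    'alice' => [
        'password'       => 'hunter2',
        'password_crypt' => password_hash('hunter2', PASSWORD_BCRYPT),
    ],
];
delete_plaintext($users);
var_dump(array_key_exists('password', $users['alice']));
```

In a real database the last part would be dropping the column, plus the cleanup of backups and logs mentioned above.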

A Caveat

I've simplified a number of core implementation issues in this description (in particular, there are some crypto quirks with PHP's built-in crypt function that make it non-ideal, plus a MySQL quirk where string comparisons are case-insensitive by default) to make it accessible to the audience most likely to find this useful. If you are serious about doing this, my post should provide you with a high-level roadmap of what you need to do to close a big security hole, but before you begin make sure to read all the documentation on crypt() (or whatever function(s) you intend to use) very carefully and write out your plan. If you did not fully understand this article, find someone else who does - a good test for whether you have found the right person to do the job for you will be if that person can point out all the little technical errors and glossed-over details I've deliberately left in this article.
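To illustrate the case-insensitivity quirk without needing a database, the sketch below emulates a case-insensitive comparison with strcasecmp() and contrasts it with a byte-for-byte comparison. The emulation is my stand-in for what a case-insensitive column comparison would do, and it uses PHP's built-in password_hash() (PHP 5.5 and later) to produce a sample string of gibberish.

```php
<?php
$stored = password_hash('hunter2', PASSWORD_BCRYPT);

// A different string that equals $stored only if letter case is ignored.
$lookalike = strtolower($stored) === $stored
    ? strtoupper($stored)
    : strtolower($stored);

// What a case-insensitive comparison sees.
var_dump(strcasecmp($stored, $lookalike) === 0);

// What a byte-for-byte comparison sees.
var_dump(hash_equals($stored, $lookalike));
```

One way around the quirk is to look the row up by username only and then compare the gibberish in PHP, for example with hash_equals() or with password_verify(), which is the companion to password_hash().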

Good luck.



Originally posted here on 2009 Dec 15, by Yishan Wong.