Encryption is a must when dealing with sensitive data or passwords. In a previous blog post we looked at encrypting using werkzeug, which comes with Flask. In this post we'll take it further and use a popular password hashing library called passlib.
Not relying on werkzeug means you can take anything in this blog post and apply it to any Python app—and not just Flask apps.
I'd recommend reading my previous blog post on encryption for a general overview regarding what encryption is and why it's important. This is also another good, quick guide on cryptography.
Hashing algorithms
There are many different hashing algorithms, and researchers keep designing new and better ones. Some that were used in the past are now considered unsafe, while the most modern ones are considered impractical to crack with current hardware when configured properly.
Hashing vs. encrypting. A simple explanation is that if you encrypt something, you can decrypt it. If you hash something, you cannot 'unhash' it.
For passwords, we hash them because we don't want anybody to be able to turn them back to readable text. Read on to see how to check a hashed password is correct!
Some of the hashing algorithms that are presently out there are:
- Argon2
- PBKDF2
- Bcrypt / Scrypt
These are different in their hardware requirements, speed, and implementation. However, they are all extremely difficult to crack.
I wouldn't recommend trying to learn everything about how hashing algorithms work, because it is a very complex field (full of mathematics!). A very modern, heavily scrutinised, and seemingly well-rounded algorithm is Argon2. However, you may run into issues using it on some operating systems.
Since the biggest problem with encryption and hashing is that software developers don't use it enough, any of the choices is better than none of them. I would recommend using PBKDF2 or Argon2, as they are both "better" than Bcrypt and not as hardware-consuming as Scrypt.
To keep things simple we'll use PBKDF2—as it works on all operating systems and hosting providers.
We'll be using passlib as our Python library to implement hashing. They have an excellent guide on how to pick a hashing algorithm.
Using passlib
To install passlib, all we need to do is:
pip install passlib
You may get some warnings about missing libraries. These would be used for other encryption and hashing algorithms. If you want to use something other than PBKDF2, please look at the optional libraries section in the Passlib documentation for information on which one(s) you need.
The CryptContext
Passlib works by defining a context. In it, we specify what algorithm we'll use for hashing, as well as any configuration parameters.
We can skip using a CryptContext and instead interact with the hashing algorithm directly; Passlib provides a way to do this. However, using a CryptContext allows us to define configuration parameters more explicitly, and makes things more readable. It also allows us to support multiple algorithms at once, should that be a requirement for our system.
Creating the context with PBKDF2 is done as shown below. Place the following code in a new file (e.g. security.py):
from passlib.context import CryptContext

pwd_context = CryptContext(
    schemes=["pbkdf2_sha256"],
    default="pbkdf2_sha256",
    pbkdf2_sha256__default_rounds=30000
)
What are rounds?
A round is a part of the algorithm that runs many times in order to reduce "crackability". Different algorithms recommend using different numbers of rounds.
This is a fantastic StackOverflow answer from a passlib developer which goes into more detail about rounds, and also how to calculate what number of rounds to choose.
The more rounds you choose, the longer the algorithm takes to hash a password, and the longer your users wait for your application to respond. Pick a number too large, and your users will be frustrated. Pick a number too low, and your users' data will not be safe.
A rule of thumb is to make the users wait a maximum of ~350 milliseconds—so you can fine tune the number of rounds so your algorithm doesn't take longer than that.
Here's an explanation on why waiting ~350ms is good.
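One rough way to pick a number is to time the algorithm on the machine that will run your app. The sketch below uses `hashlib.pbkdf2_hmac` from the standard library rather than passlib, so treat the numbers as an approximation. The `time_pbkdf2` helper and the candidate round counts are made up for illustration.

```python
import hashlib
import os
import time


def time_pbkdf2(rounds, password=b"example-password"):
    """Return the seconds one PBKDF2-HMAC-SHA256 run takes with `rounds` iterations."""
    salt = os.urandom(16)  # random salt, as a real password hash would use
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", password, salt, rounds)
    return time.perf_counter() - start


# Illustrative candidates; timings depend entirely on your hardware.
for rounds in (10_000, 30_000, 100_000):
    elapsed_ms = time_pbkdf2(rounds) * 1000
    print(f"{rounds:>7} rounds: {elapsed_ms:.1f} ms")
```

If the largest value finishes well under your time budget, you can likely afford more rounds. If it overshoots, scale the number down.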
Encrypting and verifying passwords
Now that we have our CryptContext, we can use it to hash and verify passwords. Let's add a couple of functions inside the same file where we created the context:
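def encrypt_password(password):
    # Despite the name, this is a one-way hash. hash() replaces the deprecated encrypt().
    return pwd_context.hash(password)

def check_encrypted_password(password, hashed):
    # Returns True if the password matches the stored hash.
    return pwd_context.verify(password, hashed)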
That's it! Whenever we want to store a user's password in a database, we should hash it first using our encrypt_password() function. Whenever we want to log in a user, we'll compare the password they provided in the login form with the hash stored in the database, using the check_encrypted_password() function.
Note that at no point do we 'unhash' the password in the database. We run the algorithm on the password we want to check, using the salt and settings stored alongside the original hash, and that produces a new hash. If the hash in the database and the new one are the same, we know the password is correct!
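To make that idea concrete, here is a toy sketch of the check written with only the standard library. It is not how passlib is implemented internally. The `toy_hash` and `toy_verify` names are made up, and in a real app you should keep using the passlib functions above.

```python
import hashlib
import hmac
import os

ROUNDS = 30_000  # illustrative, same number as in the context earlier


def toy_hash(password, salt=None):
    """Return (salt, digest). A fresh random salt is made unless one is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ROUNDS)
    return salt, digest


def toy_verify(password, salt, stored_digest):
    """Re-hash the candidate with the stored salt and compare the digests."""
    _, candidate = toy_hash(password, salt)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(candidate, stored_digest)


salt, stored = toy_hash("hunter2")          # what we would save at sign-up
print(toy_verify("hunter2", salt, stored))  # right password
print(toy_verify("hunter3", salt, stored))  # wrong password

# Hashing the same password again picks a new salt, so the digest differs.
other_salt, other = toy_hash("hunter2")
print(other == stored)
```

The salt is why the check has to reuse the stored values. Two hashes of the same password will not match each other unless they share a salt.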
Last updated in May 2019. This post was initially published in December 2017.