Hash Algorithms – Web Development


There are many popular hash algorithms and, of course, you can write your own, but you should never write your own. The first lesson you'll learn in CS 387 is: don't write your own, at least for our purposes. If you're building a hashtable, you can do whatever you want. But if you're going to use a hash for security purposes, don't write your own. If you want to learn how to make a hashtable, knock yourself out. Somebody's probably already done it better, but hey, you never know.
So, some popular algorithms. The first is crc32, which is basically designed for checksums. If you were to send somebody a big file, you might also include a CRC of that file. That's a simple way for them to verify that they've got the entire file and that it's not corrupted, because you can send a hash more easily than you can send a whole file. You can copy and paste a hash; it's just a few bytes. They can check that the file they received has the same CRC as the file you sent, and then nobody has to verify bit for bit that they have the correct file.
So crc32 is really fast, and its only real purpose is doing checksums, basically creating a hash of a large file. Its security properties are not very good. It's very easy to find what we call a collision, which is when two different inputs hash to the same value. The whole point of a hash is that we get a different value for almost anything we hash. Now, obviously, if the size of the input is substantially greater than the size of the output, there are going to be collisions. The point is that it should be hard to find them, and with crc32 it's very easy to find them. The reason you'd use CRC is that you don't care about collisions, you just care about speed. And CRC is very fast.
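The checksum workflow described above can be sketched in Python using `zlib.crc32` from the standard library. This is only an illustration, not something from the lecture: the sample data and the helper names `crc32_hex` and `matches` are made up for the example.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the CRC-32 of data as 8 hexadecimal characters."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def matches(data: bytes, checksum: str) -> bool:
    """Check whether data has the given CRC-32 checksum."""
    return crc32_hex(data) == checksum


# The sender computes a checksum and ships it alongside the file.
original = b"pretend this is a big file " * 1000
sent_checksum = crc32_hex(original)

# The receiver recomputes the checksum on what actually arrived.
received_ok = bytes(original)

# Simulate corruption in transit by flipping a single bit.
damaged = bytearray(original)
damaged[10] ^= 0x01
received_bad = bytes(damaged)

print(sent_checksum, matches(received_ok, sent_checksum))
print(matches(received_bad, sent_checksum))
```

This catches accidental corruption, but it is not a security check: someone who wants a different file with the same crc32 can produce one easily.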
The most popular hashing algorithm out there is still md5. It used to be used because it was both fast (not as fast as crc32, but pretty fast) and, people thought, pretty secure. Except it's not. Md5 has been broken repeatedly over the last few years, and it's very easy to find md5 collisions: it's very easy to come up with two different values of x that hash to the same y, which is, as you'll see, a big problem. So we won't be using md5 for much at all in this class.
Well, it has its use cases, right? If you have a limited input, it's hard to find a collision. A certain class of attacks works by making x longer and longer and longer, and that's a really easy way to find a collision in md5. But if you limit the length of x, you don't have to worry too much about that vulnerability. So, anyway, just keep that in mind: when you really care about the data, don't use md5.
The second most popular hash is called sha1. It's not as fast, but it's fairly secure. Only now are we starting to hear about demonstrations of people finding collisions in sha1. It's still pretty good, and it's actually the second most used hash behind md5. But for things going forward, you should really use something like sha256, which is, as you might guess, kind of a bigger version of sha1. Actually, it's not just bigger; I believe the algorithm has changed as well. So we'll say sha1 is secure-ish and sha256 is pretty good. It's going to take some time.
Now, of course, the trade-off is speed. Right now, the better the hashing algorithm, the slower it is. So these four (crc32, md5, sha1, sha256) are basically listed in order of increasing cost and increasing security. It's no big surprise that there's an inverse correlation between speed and security, but that's the name of the game.
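To make the "bigger version" remark concrete, here is a small Python sketch that hashes the same message with md5, sha1, and sha256 using the standard `hashlib` module and reports each digest size in bits. The message and the `digest_bits` dictionary are just illustrative, and the sketch assumes the local Python build has md5 and sha1 enabled.

```python
import hashlib

message = b"hello, world"  # arbitrary sample input

# Map each algorithm name to the size of its output in bits.
digest_bits = {}
for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, message)
    digest_bits[name] = h.digest_size * 8
    print(f"{name:>6}  {digest_bits[name]:>3} bits  {h.hexdigest()}")
```

A longer digest means a larger output space, which is one reason collisions get harder to find, although the design of the algorithm matters at least as much as the length.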
So depending on what problem you're working on, you may actually need to make a decision. But for our purposes, we're not going to get a whole lot of traffic, so we'll probably use sha256 for most things. So, let me show you how you would use this.
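As a minimal sketch of what using sha256 from Python might look like (assuming the standard `hashlib` module, with `hash_str` being a hypothetical helper name chosen for this example):

```python
import hashlib


def hash_str(s: str) -> str:
    """Return the sha256 hash of a string as 64 hex characters."""
    # hashlib works on bytes, so the string is encoded first.
    return hashlib.sha256(s.encode("utf-8")).hexdigest()


digest = hash_str("hello")
print(digest)

# The same input always produces the same hash.
print(hash_str("hello") == digest)
```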
