Here's part of a book I may never write.
Introduction
Why kids?
I read books on codes and ciphers when I was a kid and I loved them. Now I'm grown up I still like reading about codes and cryptography (and maths and science). I don't understand everything, but I reckon I can explain enough so that kids can have the same rough understanding of what's going on as me. There are still books on codes and ciphers for kids, but they don't go deep enough (note to self: check whether this is actually true before embarking on book). When I was a kid I wasn't aware that codes played that much part in life, apart from whatever spies got up to. These days however as soon as you go to a website and a padlock appears in your browser then you're using cryptography. This book still may not go deep enough for you, but hopefully it will take you deep enough to either satisfy your curiosity. You don't have to be a kid to read it, but the alliteration of the ks and cs sounds good.Why curious?
You don't have to know how things work in order to use them, but if you're the sort of person who wonders what's really going on then this book is for you.A shark through a drainpipe - one way hashes
Sharks have quite prominent gills. I can imagine, (though I haven't actually done it myself), that if you put a shark down a drainpipe that was just about wide enough to take it, it would be quite hard to drag it back up the other way. Not impossible, but so hard that you wouldn't really want to bother. A one way hash is like that.
Imagine you have a really long brioche roll with chocolate chips spread along it. They are spread so that every centimetre of roll either has one or no chocolate chips in it. You could chop this roll into 32cm pieces and then compare one pair of pieces side by side like this.
If one of the rolls has a chocolate chip at that point then you transfer that chip to a new roll at the same point.
If neither of the rolls has a chocolate chip then the new roll doesn't get one either.
If both have a chip then the new roll doesn't get one.
This set of rules is called the XOR function (exclusive or - one or the other but not both).
You then compare another 32cm length with the new roll until you've used up all the roll. (If the last bit is less than 32cm long then assume that it's actually 32cm long with no chips in the bits that aren't there.) At the end of it you'll have a brioche roll with a pattern of chocolate chips and gaps. As there's no randomness if you start with a certain pattern in your original roll then you'll always end up with the same pattern in the final roll. A set of rules used in this way is called a hash function. In practise hash functions are more complicated.
This is a one way hash. There could be two different original patterns that end up with the same final pattern. However to try and work backwards from the final roll to work out the original roll is hard - like dragging the shark backwards through the drainpipe. So it's not exactly one way, but near enough.
You could come up with different rules of chopping and combining and a good hash function will mean that if you start out with two original rolls that only have one chocolate chip difference the final roll will be really different.
So what's the use of a hash function? One thing you can do with it is to check that a file that you've downloaded is complete and hasn't got corrupted on the way. If the site you download the file tells you the hash of the file then you can use a program to work the hash of your file and if it matches then it is very probably the same as the original. Like the two rolls that are only different by one chocolate chip, a small corruption would give you a different hash, so it's really unlikely that the file would be corrupted in such as way that it's different to the original, but has the same hash.
Hashes and cryptography
One way that hashes are used in cryptography is when checking passwords. If a computer had a list of usernames and passwords in a file somewhere, and someone got that file, then they could log in as any one of those users. So the passwords are stored in hashed form. When someone types in their password it's hashed in the same way and then checked against the stored password. This is where the "really hard to find another thing that when hashed gives the same result" property is important.