A (Brief) History Of Codes And Code-Breaking (Part 1 of 3)


How confident are you of your privacy? When you log in to Facebook, Gmail or Twitter, how confident are you that no one else can view your information? More importantly, how do you keep your bank details safe when paying for a novelty festive hat on eBay?

In our modern world, we have come to rely on encryption as a method of keeping our secrets secret. Since we first began transposing our thoughts onto papyrus, wax tablets or paper, we have tried to invent ever more complex codes so that our thoughts remain secret, shared only with those we choose.

A simple code is the Caesar shift. Named after Julius Caesar, who popularised its use, the Caesar shift works by substituting every letter with the letter a fixed number of positions further along the alphabet, wrapping around from ‘Z’ back to ‘A’. For example, with a shift of one, an ‘A’ would become ‘B’, a ‘B’ would become ‘C’, a ‘C’ would become ‘D’ and so on. The size of the shift is called the ‘key’ and the text it produces is called the ‘ciphertext’. Using the key above, the following sentence would be transformed as follows:

SEND TROOPS TO DEFEND ROME.
TFOE USPPQT UP EFGFOE SPNF.
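
The example above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `caesar_shift` is made up for illustration, and it assumes the message uses capital letters A to Z, leaving spaces and punctuation untouched.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_shift(text, key):
    """Shift every capital letter in text by key positions, wrapping after Z."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            # Move along the alphabet; the % 26 wraps Z back round to A.
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            # Spaces and punctuation are copied across unchanged.
            result.append(ch)
    return "".join(result)

ciphertext = caesar_shift("SEND TROOPS TO DEFEND ROME.", 1)
print(ciphertext)                    # TFOE USPPQT UP EFGFOE SPNF.
print(caesar_shift(ciphertext, -1))  # SEND TROOPS TO DEFEND ROME.
```

Shifting by a negative key undoes the encryption, which is all the recipient needs to do.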

It is important that the person receiving the message knows what the key is (i.e. that the cipher alphabet has been shifted by one letter). They can then apply the key in reverse to obtain the original message. This looks like quite a good way of hiding a message, but it is actually very easy for a third party to “crack”. Since there are 26 letters in the alphabet, there are only 25 shifts that actually change the text. This means that a third person who wants to read the message has to try at most 25 keys.
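
To see just how little work that is, here is a rough sketch of the attack in Python. The names `shift_back` and `brute_force` are invented for this example, and it assumes the message is in capital letters. It simply undoes every possible shift and leaves a human to spot the line that reads as English.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_back(ciphertext, key):
    """Undo a Caesar shift of key positions on capital letters."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - key) % 26] if ch in ALPHABET else ch
        for ch in ciphertext
    )

def brute_force(ciphertext):
    """Return the candidate plaintext for each of the 25 possible keys."""
    return {key: shift_back(ciphertext, key) for key in range(1, 26)}

candidates = brute_force("TFOE USPPQT UP EFGFOE SPNF.")
for key, text in candidates.items():
    print(key, text)
```

Only one of the 25 lines printed is readable, and it appears next to the key that was used.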

A cipher is much harder to break if letters are randomly assigned, rather than just shifting the alphabet. By randomly assigning letters, the number of possible keys increases dramatically! There are 26! (roughly 4 × 10^26) ways of rearranging the 26 letters of the alphabet. Even if we only allow keys that swap the letters in 13 pairs (so that ‘A’ becomes ‘Q’ and ‘Q’ becomes ‘A’, for example), there are still 7,905,853,580,625 possible keys. This is a lot more than 25 keys, and if someone suspected that you had substituted each letter with another random letter they would have to spend some time attempting every possible key (if the third party could try a thousand keys a second, it would take about 250 years to try every one of the paired keys, and vastly longer for the full set). Of course, the recipient of the message has the key, so decrypting the message is much easier: they just need to apply the key in reverse!
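
The sketch below builds a random substitution key in Python and checks the arithmetic. The helper names `random_key` and `substitute` are hypothetical, and the fixed seed is only there so the example is repeatable. Note that a completely free rearrangement of the alphabet allows 26! keys, while the figure 7,905,853,580,625 matches the number of keys that swap the letters in 13 pairs.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase

def random_key(rng):
    """Map each letter of the alphabet to a randomly chosen substitute."""
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def substitute(text, key):
    """Replace each letter using key; other characters pass through."""
    return "".join(key.get(ch, ch) for ch in text)

rng = random.Random(2024)  # seeded for repeatability, not for real secrecy
key = random_key(rng)
reverse_key = {cipher: plain for plain, cipher in key.items()}

message = "SEND TROOPS TO DEFEND ROME."
ciphertext = substitute(message, key)
print(ciphertext)
print(substitute(ciphertext, reverse_key))  # back to the original message

full_keys = math.factorial(26)                # every rearrangement of 26 letters
paired_keys = math.prod(range(1, 26, 2))      # 25 x 23 x ... x 3 x 1 pairings
seconds_per_year = 365.25 * 24 * 60 * 60
years_for_paired = paired_keys / 1000 / seconds_per_year  # at 1,000 keys a second

print(full_keys)
print(paired_keys)
print(years_for_paired)
```

The recipient only needs the reversed dictionary, which is why decrypting with the key is instant while guessing it is not.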

This is unbreakable, surely? Well, actually no. Some letters are more frequent than others. For example, the letter ‘E’ is the most frequent in English, while the least frequent letters are ‘X’, ‘Q’, ‘J’ and ‘Z’. If the ciphertext is long enough, someone could do a frequency analysis of it. They could then take a good guess that the most frequent letter stands for ‘E’ and so on. From this, they might be able to build up the key and eventually decrypt the message. There's a really cool online tool for automatically analysing letter frequencies available at The Black Chamber, developed by Simon Singh.
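
Counting letters is easy to do by hand, but here is a minimal Python sketch of the idea. The function name `letter_frequencies` is made up for this example, and the sample text is far shorter than anything you would want to analyse for real.

```python
from collections import Counter

def letter_frequencies(text):
    """Count the letters A to Z in text, most frequent first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

ciphertext = "TFOE USPPQT UP EFGFOE SPNF."
for letter, count in letter_frequencies(ciphertext):
    print(letter, count)
```

Even in this tiny message, ‘F’ and ‘P’ come out on top with four appearances each, and ‘F’ does indeed stand for ‘E’. With a longer ciphertext the counts settle much closer to the typical pattern of English, which is what makes the guesswork reliable.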

In this post, we've seen how simple Caesar shift ciphers aren't very secure (because a third party only has to try 25 possible keys to crack the code). We've also seen that a code can be made more secure by increasing the number of possible keys. However, a fundamental problem of the codes discussed in this post is that they use only one substitute alphabet in place of the original alphabet. The characteristics of the original language are therefore still present in the ciphertext: certain characters occur more frequently than others, so a third-party 'hacker' could guess that these correspond to the most frequent letters of the original language and from there figure out the rest of the code.

The next post will look at techniques for further improving the complexity of codes. But as the complexity of the code improves, so too do the methods for breaking it.

