A (Brief) History Of Codes And Code-Breaking (Part 1 of 3)
How confident are you of your privacy? When you log in to
Facebook, Gmail or Twitter, how confident are you that no one else can view
your information? More importantly, how do you keep your bank details safe when
paying for a novelty festive hat on eBay?
In our modern world, we have come to rely on encryption as a
method of keeping our secrets secret. Ever since we first began committing our
thoughts to papyrus, wax tablets or paper, we have tried to invent ever more
complex codes so that those thoughts stay secret, shared only with
the people we choose. One of the simplest codes is the Caesar shift. Named
after Julius Caesar, who popularised its use, the Caesar shift works by substituting
every letter with a letter a certain number of positions along the alphabet.
For example, with a shift of one, an ‘A’ would become ‘B’, a ‘B’ would become ‘C’, a ‘C’ would
become ‘D’ and so on, with ‘Z’ wrapping around to ‘A’. The size of the shift is called the ‘key’ and the text it produces is called the ‘cipher text’ (or simply the ‘cipher’). Using the key above, the following sentence would be
transformed as follows:
SEND TROOPS TO DEFEND ROME.
TFOE USPPQT UP EFGFOE SPNF.
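The shift described above can be sketched in a few lines of Python. This is only an illustration: the function name caesar_shift and the choice to leave spaces and punctuation untouched are assumptions made for the example, not part of any historical description of the cipher.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' to 'Z'

def caesar_shift(text, key):
    """Move every letter `key` places along the alphabet, wrapping Z round to A."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            shifted.append(ch)  # spaces and punctuation are left as they are
    return "".join(shifted)

cipher_text = caesar_shift("SEND TROOPS TO DEFEND ROME.", 1)
print(cipher_text)                    # TFOE USPPQT UP EFGFOE SPNF.
print(caesar_shift(cipher_text, -1))  # SEND TROOPS TO DEFEND ROME.
```

Shifting by the negative of the key is all it takes to turn the cipher back into the original message.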
It is important that the person receiving the code knows
what the key is (i.e. that the cipher alphabet has been shifted by one letter).
They can then apply the key in reverse to obtain the original message. This
method looks like quite a good way of hiding a message; however, it is actually very
easy for a third party to “crack”. Since there are 26 letters in the alphabet,
there are only 25 shifts that can be applied (a shift of 26 brings every letter back to itself). This means that a third party
who wants to read the message only has to try 25 keys.
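To see how little work those 25 attempts really are, here is a rough sketch of a brute-force attack. The helper names shift_back and brute_force are made up for this example, and it assumes the message is written in capital letters.

```python
def shift_back(cipher_text, key):
    """Undo a Caesar shift of `key` places (capital letters only)."""
    return "".join(
        chr((ord(ch) - ord("A") - key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in cipher_text
    )

def brute_force(cipher_text):
    """Return all 25 candidate messages, one for each possible key."""
    return {key: shift_back(cipher_text, key) for key in range(1, 26)}

candidates = brute_force("TFOE USPPQT UP EFGFOE SPNF.")
for key, guess in candidates.items():
    print(key, guess)  # only one of the 25 lines will read as sensible English
```

The attacker does not even need to know the key in advance: the one readable line in the output gives away both the message and the shift that was used.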
A cipher is much harder to break if letters are randomly
assigned, rather than just shifting the alphabet. By randomly assigning
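letters, the number of possible keys increases dramatically! There are 7,905,853,580,625
ways of pairing up the 26 letters so that each letter swaps with its partner, and if any letter may stand for any other the number of keys rises to 26!, or roughly 4 × 10^26 (for more fun with large numbers and
permutations look here).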
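A randomly assigned cipher alphabet might be sketched like this. The names make_key, substitute and invert are invented for the example, and the seed is only there so that the same key can be regenerated; a real key would of course be kept secret rather than derived from a number in the code.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def make_key(seed=None):
    """Map each plain letter to a randomly chosen cipher letter."""
    rng = random.Random(seed)
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def substitute(text, key):
    """Replace every letter using `key`; anything else passes through unchanged."""
    return "".join(key.get(ch, ch) for ch in text.upper())

def invert(key):
    """Swap the key round so that it decrypts instead of encrypts."""
    return {cipher: plain for plain, cipher in key.items()}

key = make_key(seed=2024)
cipher_text = substitute("SEND TROOPS TO DEFEND ROME.", key)
print(cipher_text)
print(substitute(cipher_text, invert(key)))  # back to the original message
```

Note that a plain shuffle like this may occasionally leave a letter standing for itself, which this sketch does not try to prevent.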
This is a lot more than 25 keys, and if someone suspected that you had substituted each letter for another random letter they would have to spend some
time attempting every possible key (if the third party could try a thousand
keys a second, it would take about 250 years to try all 7,905,853,580,625 keys). Of course, the
recipient of the message has the key, so decrypting the cipher is much easier –
they just need to apply the key in reverse!
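The arithmetic behind these figures can be checked quickly. The number 7,905,853,580,625 is equal to 1 × 3 × 5 × ... × 25, which counts the ways of splitting 26 letters into 13 swapping pairs; the sketch below also works out 26!, the count when any letter may stand for any other. The rate of a thousand keys a second comes from the text, while the 365.25-day year is an assumption made for the calculation.

```python
import math

# Ways of pairing 26 letters into 13 swaps: 1 x 3 x 5 x ... x 25
pairings = math.prod(range(1, 26, 2))

# Ways of rearranging the whole alphabet: 26 x 25 x ... x 1
rearrangements = math.factorial(26)

KEYS_PER_SECOND = 1000
SECONDS_PER_YEAR = 60 * 60 * 24 * 365.25  # assumed length of a year

years = pairings / KEYS_PER_SECOND / SECONDS_PER_YEAR

print(pairings)                  # 7905853580625
print(f"{rearrangements:.2e}")   # about 4 followed by 26 zeros
print(f"{years:.0f} years")      # roughly two and a half centuries
```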
This is unbreakable, surely? Well, actually no. Some letters are more frequent than others. For example, the letter ‘E’ is the
most frequent in English, while the least frequent letters are ‘X’, ‘Q’,
‘J’ and ‘Z’. If a cipher is long enough, someone could do
a frequency analysis of the text. They could then take a good guess that the
most frequent letter in the cipher stands for ‘E’, the next most frequent for another common letter, and so
on. From this, they might be able to build up the key and eventually decrypt
the code. There's a really cool online tool for automatically analysing letter frequencies available at The Black Chamber, developed by Simon Singh.
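For readers who would rather count letters themselves, a frequency analysis is only a few lines of Python. This is a minimal sketch: the function name letter_frequencies is made up, and the example message is far shorter than a real analysis would need.

```python
from collections import Counter

def letter_frequencies(cipher_text):
    """Count each letter in the cipher and list them from most to least frequent."""
    letters = [ch for ch in cipher_text.upper() if "A" <= ch <= "Z"]
    return Counter(letters).most_common()

ranked = letter_frequencies("TFOE USPPQT UP EFGFOE SPNF.")
for letter, count in ranked:
    print(letter, count)
```

In this message ‘F’ and ‘P’ come out on top with four appearances each. They stand for ‘E’ and ‘O’ in the original, so even a sentence this short hints at the key, although a reliable guess normally needs a much longer cipher.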
In this post, we've seen that simple Caesar shift ciphers aren't very secure (because a third party only has to try 25 possible keys to crack the code). We've also seen that a code can be made more secure by increasing the number of possible keys. However, a fundamental problem with the codes discussed in this post is that they use only one substitute alphabet in place of the original alphabet. The language characteristics of the original text are therefore still present in the cipher text (for example, certain characters in the cipher will occur with a higher frequency than others; a third-party 'hacker' could guess that these correspond to the most frequent letters of the original language and from there figure out the rest of the code).
The next post will look at techniques for further improving the complexity of codes. But as the complexity of the code improves, so too do the methods for breaking it.