Cryptography Puzzles

The Puzzles

Spoilers ahead! - This page contains the answers to the puzzles. If you haven't already come up with your own answers, you're in the wrong place. Please see the descriptions of the Cryptography puzzles first, think about them, and then read this page!

For solving some of these puzzles, we'll use the popular programming language Python (specifically python3) and use the interactive shell to execute the commands as we type them. If you don't have python handy, you can still read along to follow what's happening.

Puzzle 1

This is a really gentle start. Even if you've never heard of cryptography, you should be able to just follow the letter mappings to get the questions. Here they are though:

1a: Where was the centre of C O D E B R E A K I N G during W W 2 ?
Answer: Bletchley

1b: The man who designed the machine that C R A C K E D the E N I G M A   C O D E was ?
Answer: Turing

1c: The fundamental B U I L D I N G block of E L E C T R O N I C devices is the ?
Answer: Transistor

Explanation: this is a simple substitution cipher, and apart from the 7 special symbols, it uses a very simple code of A=1, B=2, C=3 and so on. So it's about as simple as it gets. The extra symbols make things more cumbersome, but no more secure. Note that the translation goes both ways, so to get a "2" in puzzle 1a, the question had to use a "B".

Puzzle 2

5 8 1 14 13 0 2 2 8 18 4 16 20 4 13 2 4

It certainly looks like each number represents a letter, but we can't use our very simple scheme from puzzle 1 (where A=1, B=2 and so on) because we can see a "0" in the number list. So now we have to guess how the alphabet is organised, and given that there's a number zero, let's try A=0, B=1 and so on and see what we get.

F I B O N A C C I S E Q U E N C E. We're lucky first time, and get the Fibonacci Sequence as the answer.

The reason there was a picture of a sunflower as a hint was because the Fibonacci numbers occur surprisingly often in natural growth patterns, like pine cones, flower petals, ferns, pineapple skin patterns and various vegetables.

Explanation: this is clearly another substitution cipher, which is not only simply monoalphabetic but also not even shifted.

Python: Obviously we don't really need to use any programming to solve this easy puzzle, but the concepts will be useful later on. The important function here is chr, which takes a character code (like 65) and turns it into the corresponding character (in this case, 'A'). Its opposite is ord, which takes a character (like 'h') and gives you the corresponding character code (in this case, 104).

So, in the python shell:

secret = "5 8 1 14 13 0 2 2 8 18 4 16 20 4 13 2 4"
for i in secret.split(" "):
    print(chr(ord('A') + int(i)))

Or, as a one-liner, using a list comprehension:

"".join([chr(ord('A') + int(i)) for i in secret.split(" ")])
'FIBONACCISEQUENCE'
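
Going the other way uses ord in just the same fashion. As a small sketch (the function name encode is only an illustrative choice, not part of the puzzle), we can re-encode the answer and check that we get the puzzle's numbers back:

```python
def encode(plaintext):
    # A=0, B=1, ...: subtract the character code of 'A' from each letter's code.
    # Anything that is not a letter (such as the space) is simply skipped.
    return " ".join(str(ord(c) - ord('A')) for c in plaintext.upper() if c.isalpha())

secret = "5 8 1 14 13 0 2 2 8 18 4 16 20 4 13 2 4"
print(encode("Fibonacci sequence"))
print(encode("Fibonacci sequence") == secret)   # True
```

Since the encoding and decoding are exact mirror images, knowing the scheme is all an attacker needs.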

Puzzle 3

X S K L V V O H H Y L H V

So now instead of letters being transformed into numbers, it looks like they have been transformed into other letters. And we're given the clue that this puzzle is using a Caesar cipher, which means the letters are all just shifted forwards or backwards by a constant amount.

In the simplest variant, an A would be transformed to a B, a B to a C, and so on until a Z got cycled back around to an A. This corresponds to a forwards shift of 1 for each letter. We can shift by any number of letters from 1 to 25 (shifting by 0 or 26 doesn't do very much!), so the best thing to do here is try out all the 25 possible shifts and see if anything sensible comes out.

It wouldn't take too long to run the shifts with pencil and paper, but we'll run it through python again for practice. It'll help with the later puzzles too.

Python: Again, we'll use python3 and its interactive shell to execute the commands as we type them. Because we don't know what the shift is, we'll loop over all 26 possible shifts (including the trivial zero shift!).

secret = "XSKLVVOHHYLHV"
for key in range(26):
    msg = ""
    for x in secret:
        letterIndex = (ord(x) - ord('A') + key) % 26
        msg += chr(ord('a') + letterIndex)
    print(key, msg)

This code loops through the 26 possible values of key, and each time round the loop it builds a decoded version msg using the key as the shift. For each letter in the secret, it makes a new letterIndex from the character code, shifted by key and wrapped around at 26, and then makes a new character using 'a' and this new letter index. Then it prints the key and the decoded message to the screen.

The results are as follows, with each line showing the key and the message.

0 xsklvvohhylhv
1 ytlmwwpiizmiw
2 zumnxxqjjanjx
3 avnoyyrkkboky
4 bwopzzsllcplz
5 cxpqaatmmdqma
6 dyqrbbunnernb
7 ezrsccvoofsoc
8 fastddwppgtpd
9 gbtueexqqhuqe
10 hcuvffyrrivrf
11 idvwggzssjwsg
12 jewxhhattkxth
13 kfxyiibuulyui
14 lgyzjjcvvmzvj
15 mhzakkdwwnawk
16 niabllexxobxl
17 ojbcmmfyypcym
18 pkcdnngzzqdzn
19 qldeoohaareao
20 rmefppibbsfbp
21 snfgqqjcctgcq
22 toghrrkdduhdr
23 uphissleevies
24 vqijttmffwjft
25 wrjkuunggxkgu

Most of these are nonsense, but key 23 gives "up his sleevies" which is of course the answer to the question "where does he keep his armies?".

Here we've seen how the Caesar cipher works and how weak it is, and how easy it is to break just by trying the 25 possible shifts. Needless to say, it's normally useful for a cipher scheme to have more than 25 possible keys!
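
To make the shift reusable, the loop body can be wrapped in a function. This is only a sketch (the name caesar and its handling of non-letters are our own choices). It also shows that decoding with key 23 is the same as shifting back by 3, so the puzzle text was presumably produced with a forwards shift of 3:

```python
def caesar(text, shift):
    # Shift each letter A-Z by `shift` places, wrapping around at 26;
    # anything that isn't a letter is passed through unchanged.
    out = ""
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out += chr(ord("A") + (ord(ch) - ord("A") + shift) % 26)
        else:
            out += ch
    return out

secret = "XSKLVVOHHYLHV"
plain = caesar(secret, 23)   # the key found by brute force
print(plain)                 # UPHISSLEEVIES
print(caesar(plain, 3))      # XSKLVVOHHYLHV, the puzzle text again
```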

Puzzle 4

   
[Image: the puzzle text written in pigpen cipher symbols, with only the numerals "18" and a comma left readable]

If you have a substitution cipher like this, it doesn't make a great deal of difference whether letters are substituted for other letters, or for numbers, or for symbols. It's still just a consistent (monoalphabetic) substitution. So techniques like frequency analysis work just as well no matter what symbols are used. In this example, we're especially lucky that the spaces between the words have been preserved, which makes things much easier.

Our first step will be to get rid of the awkward symbols, by making another substitution. Quite arbitrarily, we'll take the symbols in the order they appear, and use upper case letters instead. This gives us:

AB CDE 18CD FEBCGHI, JHEEKLMNBM GMEO PAQ PEB FAPDEHM CN REEP CDEAH PHASLCE HEFNHOM

Thanks to the spaces and the numerals, we can make a good guess that "CD" means "th", giving 18th, and that therefore "CDE" means "the". If we try this out (using lower-case letters for the ones we've decoded), we can gradually fill in the missing pieces.

"AB CDE 18CD FEBCGHI, JHEEKLMNBM GMEO PAQ PEB FAPDEHM CN REEP CDEAH PHASLCE HEFNHOM".replace("C","t").replace("D","h").replace("E","e")
'AB the 18th FeBtGHI, JHeeKLMNBM GMeO PAQ PeB FAPheHM tN ReeP theAH PHASLte HeFNHOM'

There aren't many two-letter words beginning with "t", and we can guess that the fourth word could be "century", which fills in many more gaps:

secret = "AB CDE 18CD FEBCGHI, JHEEKLMNBM GMEO PAQ PEB FAPDEHM CN REEP CDEAH PHASLCE HEFNHOM"
(secret.replace("C","t").replace("D","h").replace("E","e").replace("N","o").replace("B","n").replace("F","c")
 .replace("G","u").replace("H","r").replace("I","y"))
'An the 18th century, JreeKLMonM uMeO PAQ Pen cAPherM to ReeP theAr PrASLte recorOM'

Now we can guess the words "their" and "ciphers":

(secret.replace("C","t").replace("D","h").replace("E","e").replace("N","o").replace("B","n").replace("F","c")
 .replace("G","u").replace("H","r").replace("I","y").replace("A","i").replace("P","p").replace("M","s"))
'in the 18th century, JreeKLsons useO piQ pen ciphers to Reep their priSLte recorOs'

And if we can assume the words "freemasons", "keep" and "records":

(secret.replace("C","t").replace("D","h").replace("E","e").replace("N","o").replace("B","n").replace("F","c")
 .replace("G","u").replace("H","r").replace("I","y").replace("A","i").replace("P","p").replace("M","s")
 .replace("J","f").replace("K","m").replace("L","a").replace("O","d").replace("R","k"))
'in the 18th century, freemasons used piQ pen ciphers to keep their priSate records'

Now we've only got the letters l, j, z, w, g, x, v, b, q left, and we can make more guesses at the words "pig" and "private". If we check the meaning of the term "Pigpen cipher", then we can easily confirm the correctness of our deciphering.

To make the translation a bit less clumsy, we can use a python dictionary to hold our mappings:

letterMap = {"C":"t", "D":"h", "E":"e", "N":"o", "B":"n", "F":"c", "G":"u", "H":"r", "I":"y", "A":"i",
"P":"p", "M":"s", "J":"f", "K":"m", "L":"a", "O":"d", "R":"k", "S":"v", "Q":"g"}
msg = secret
for k in letterMap.keys():
    msg = msg.replace(k, letterMap[k])
msg
'in the 18th century, freemasons used pig pen ciphers to keep their private records'
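
Note that the chain of replace calls only works here because our stand-in letters are upper case and the decoded ones are lower case, so no replacement can disturb an earlier one. As an alternative sketch, Python's built-in str.maketrans and str.translate substitute every character in a single pass and don't rely on that trick:

```python
secret = "AB CDE 18CD FEBCGHI, JHEEKLMNBM GMEO PAQ PEB FAPDEHM CN REEP CDEAH PHASLCE HEFNHOM"
letterMap = {"C":"t", "D":"h", "E":"e", "N":"o", "B":"n", "F":"c", "G":"u", "H":"r", "I":"y", "A":"i",
             "P":"p", "M":"s", "J":"f", "K":"m", "L":"a", "O":"d", "R":"k", "S":"v", "Q":"g"}

# Build a translation table from the dictionary and apply it in one go;
# characters without an entry (digits, spaces, the comma) are left alone.
table = str.maketrans(letterMap)
msg = secret.translate(table)
print(msg)
```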

Whereas the simple shift cipher only had a maximum of 25 keys, we can see that a general substitution cipher (whether it uses letters or symbols) has in principle 26! possible keys, or over 10^26 (roughly 4 × 10^26), a truly immense number. This sounds terrifically secure, but we have also seen how we were able to break it even without knowing the details of the pigpen cipher. Obviously the spaces and punctuation helped us a lot, but even without this, if we had longer examples of enciphered text, we could use statistical techniques to greatly narrow down the probabilities.
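
As a rough sketch of both points, we can count the stand-in letters with collections.Counter from the standard library and compute the size of the key space with math.factorial. This is only an illustration on our one short sentence; real frequency analysis needs much more text:

```python
import math
from collections import Counter

secret = "AB CDE 18CD FEBCGHI, JHEEKLMNBM GMEO PAQ PEB FAPDEHM CN REEP CDEAH PHASLCE HEFNHOM"

# Count only the letters, ignoring digits, spaces and punctuation
counts = Counter(c for c in secret if c.isalpha())
print(counts.most_common(3))            # 'E' comes first, with 12 occurrences

# Number of possible keys for a general 26-letter substitution
print(f"{math.factorial(26):.2e}")      # 4.03e+26
```

The most frequent stand-in letter, E, really does turn out to be the plaintext letter "e", which is the most common letter in English text.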

Puzzle 5

81, 1, 68, 59, 68, 86, 53, 76, 105, 53, 24, 22, 89, 5, 57, 68, 77, 50, 89, 81, 85, 4, 113, 71, 95, 86, 47, 44, 45, 33, 11, 64, 99, 12, 63, 10, 73, 8, 87, 52, 67, 68, 24, 72, 63, 25, 77, 6, 13, 3, 68, 57, 63, 101, 99, 60, 43, 14, 76, 88, 64, 47, 7, 53, 50, 99, 66, 76, 60, 22, 1, 99, 5, 47, 62, 53, 106, 8, 9, 81, 2, 68, 53, 75, 89, 52, 8, 25, 77, 27, 28, 113, 42, 4, 63, 75, 34, 63, 71, 63, 27, 52, 88, 76, 11, 17, 8, 11, 26, 77, 32, 113, 45, 13, 52, 77, 76, 11, 14, 13, 11, 66, 44, 63, 6, 115, 44, 37, 77, 7, 31, 6, 67, 63, 42, 77, 17, 13, 57, 84, 45, 8, 15, 63, 86, 43, 77, 68, 62, 74, 68, 23, 63, 92, 14, 68, 66, 53, 22, 52, 8, 24, 44, 68, 13, 81, 63, 18, 17, 53, 46, 72, 68, 44, 83, 39, 92, 62, 77, 28, 31, 52, 67, 63, 53, 28, 77, 43, 53, 13, 3, 3, 68, 65, 43, 63, 45, 34, 8, 26, 73, 67, 63, 68, 3, 63, 42, 68, 60, 65, 21, 4, 92, 73, 52, 74, 8, 57, 68, 65, 43, 63, 44, 38, 20, 13, 10, 52, 5, 63, 92, 50, 68, 66, 74, 67, 13, 81, 33, 75, 68, 81, 80, 63, 70 ?

At first glance, it seems like each number corresponds to a single letter (of course that is not guaranteed!), but it is soon clear that the numbers cover a surprisingly wide range.

len(secret5)
245
min(secret5), max(secret5)
(1, 115)
len(set(secret5))
77

So if each number represents a single letter, the message is 245 characters long, and the numbers go from a minimum of 1 to a maximum of 115. There are 77 different numbers in the puzzle, so it is clearly not a simple monoalphabetic cipher. There are still various possibilities though - each number could represent a pair of characters, somehow, or it might use a polyalphabetic cipher, in which more than one translation table is used depending on the position.

Yet splitting up the text in groups of different lengths and looking at the number frequencies doesn't reveal any obvious patterns. Instead, a few hints from the puzzle were needed (about the key being "elementary" and referring to the periodic table), and the realisation that the numbers from 1 to 115 could correspond to the elements in the periodic table.

Trying this out, we can collect the first letter of each element's chemical symbol as follows:

elements = "hhlbbcnofnnmaspscakcstvcmfcnczggasbkrsyznmtrrpacisstixcblcpnpsegtdhetylhtwroipahtpbparfratpunpacbcefmnlrdsbhmdrcufuluu"

and then using this as a lookup for each of the numbers in the secret:

"".join([elements[i-1] for i in secret5])
'theperiodictableisatabulararrangementofthechemicalelementsorganisedonthebasisoftheiratomicnumberselectronconfigurations
andrecurringchemicalpropertiesweveusedittocreateacipherbyusingtheinitiallettersoftheelementsbuttwoletterscantbeusedwhatarethey'

Or, in more readable form: "The periodic table is a tabular arrangement of the chemical elements organised on the basis of their atomic numbers, electron configurations and recurring chemical properties. We've used it to create a cipher by using the initial letters of the elements but two letters can't be used, what are they?"

If the letters are taken from the first letters of the chemical symbols, then this cipher can only deal with letters which are present in the periodic table. In order to find out which letters are not covered, we can take the difference between the full alphabet, and the element list:

set("abcdefghijklmnopqrstuvwxyz").difference(set(elements))
{'q', 'j'}

So the answer is that the letters 'j' and 'q' can't be used, because there aren't any symbols in the periodic table beginning with those letters.
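
For anyone who would rather not trust a hand-typed lookup string, here is a sketch that rebuilds it from the chemical symbols themselves. The symbol list is typed in from the periodic table. It is an assumption on our part that elements 113, 115, 117 and 118 should use the old placeholder symbols Uut, Uup, Uus and Uuo, but that is what matches the lookup string used above:

```python
# Element symbols in order of atomic number (1 to 118), with the old
# placeholder symbols for elements 113, 115, 117 and 118.
symbols = """
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn
Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La
Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po
At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg
Cn Uut Fl Uup Lv Uus Uuo
""".split()

# First letter of each symbol, lower-cased, gives the lookup string
elements = "".join(s[0].lower() for s in symbols)
print(len(symbols))                                   # 118

# Decode the first few numbers of the puzzle (atomic numbers start at 1)
start = [81, 1, 68, 59, 68, 86, 53, 76, 105, 53, 24]
decoded = "".join(elements[i - 1] for i in start)
print(decoded)                                        # theperiodic

# Letters that no symbol starts with
unused = sorted(set("abcdefghijklmnopqrstuvwxyz") - set(elements))
print(unused)                                         # ['j', 'q']
```

With the permanent symbols that were later adopted (Nh, Mc, Ts, Og), numbers 113 and 115 would decode as 'n' and 'm' rather than 'u', so the puzzle presumably relies on the older table.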

Puzzle 6a

22 4a 72 27 65 72 20 6e 79 79 20 7a 6e 71 20 75 72 65 72 2e 20 56 27 7a 20 7a 6e 71 2e 20 4c 62 68 27 65 72 20 7a 6e 71 2e 22 20 22 55 62 6a 20 71 62 20 6c 62 68 20 78 61 62 6a 20 56 27 7a 20 7a 6e 71 3f 22 20 66 6e 76 71 20 4e 79 76 70 72 2e 20 22 4c 62 68 20 7a 68 66 67 20 6f 72 2c 22 20 66 6e 76 71 20 67 75 72 20 50 6e 67 2c 20 22 62 65 20 6c 62 68 20 6a 62 68 79 71 61 27 67 20 75 6e 69 72 20 70 62 7a 72 20 75 72 65 72 2e

Here we no longer have plain numbers, but pairs of alphanumeric characters: digits from 0 to 9 and letters from a to f. So this smells strongly of hexadecimal numbers, and the repeated occurrences of "20" (the hexadecimal ASCII code for a space) make it worth checking whether they are all hexadecimal ASCII codes.

secret6a = ("22 4a 72 27 65 72 20 6e 79 79 20 7a 6e 71 20 75 72 65 72 2e 20 56 27 7a 20 7a 6e 71 2e 20 4c 62 68 27 65 72 20 7a 6e 71 "
    "2e 22 20 22 55 62 6a 20 71 62 20 6c 62 68 20 78 61 62 6a 20 56 27 7a 20 7a 6e 71 3f 22 20 66 6e 76 71 20 4e 79 76 70 72 2e 20 22 4c 62 "
    "68 20 7a 68 66 67 20 6f 72 2c 22 20 66 6e 76 71 20 67 75 72 20 50 6e 67 2c 20 22 62 65 20 6c 62 68 20 6a 62 68 79 71 61 27 67 20 75 6e "
    "69 72 20 70 62 7a 72 20 75 72 65 72 2e")

"".join([chr(int(i,16)) for i in secret6a.split(" ")])
'"Jr\'er nyy znq urer. V\'z znq. Lbh\'er znq." "Ubj qb lbh xabj V\'z znq?" fnvq Nyvpr. "Lbh zhfg or," fnvq gur Png, "be lbh jbhyqa\'g unir pbzr urer.'

This looks very promising, particularly with the punctuation, but it is obviously still scrambled. We can see repeated occurrences of "znq", "fnvq" and "urer", and in particular the word "V'z" looks like it could be "I'm" or "I'd". Also the first word "Jr'er", given that it's not "we'll" or "he'll", could well be "we're".

Note that the backslashes in python's output are just an artefact of how it prints strings containing both apostrophes and double quotes.

msg = _.upper()
msg.replace("V", "i").replace("J","w").replace("E","r").replace("R","e")
'"we\'re NYY ZNQ Uere. i\'Z ZNQ. LBH\'re ZNQ." "UBw QB LBH XABw i\'Z ZNQ?" FNiQ NYiPe. "LBH ZHFG Oe," FNiQ GUe PNG, "Br LBH wBHYQA\'G UNIe PBZe Uere.'

Now we can be pretty sure that "LBH" is "you" and "LBH're" is "you're". Also, "Uere" is probably "here". Which gives us:

(msg.replace("V", "i").replace("J","w").replace("E","r").replace("R","e")
 .replace("U","h").replace("B","o").replace("L","y").replace("H","u"))
'"we\'re NYY ZNQ here. i\'Z ZNQ. you\'re ZNQ." "how Qo you XAow i\'Z ZNQ?" FNiQ NYiPe. "you ZuFG Oe," FNiQ Ghe PNG, "or you wouYQA\'G hNIe PoZe here.'

Now we can guess "Qo" must be "do" and "Oe" must be "be". And "wouYQA'G" must be "wouldn't", plus we can guess "said" and "know":

_.replace("Q","d").replace("O","b").replace("Y","l").replace("A","n").replace("G","t").replace("F","s").replace("N","a").replace("X","k")
'"we\'re all Zad here. i\'Z Zad. you\'re Zad." "how do you know i\'Z Zad?" said aliPe. "you Zust be," said the Pat, "or you wouldn\'t haIe PoZe here.'

Now we can see "must" and "alice", giving us "i'm" and "mad", so we can give the final deciphered quote:

"We're all mad here. I'm mad. You're mad."
"How do you know I'm mad?" said Alice.
"You must be," said the cat, "or you wouldn't have come here.

Note that the BBC gives the answer as ending with a double quote, but the last character code is 2e, which is a full stop.

It should be clear that using the hexadecimal ASCII codes of letters is far from being secure, and the subsequent monoalphabetic substitution did not add a great deal of complexity. Once again we see how useful word boundaries and punctuation can be for deciphering.
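
Looking back at the letter pairs we guessed (J and W, V and I, E and R), each pair is exactly 13 places apart in the alphabet, which is consistent with the well-known ROT13 scheme, a Caesar shift of 13. As a sketch, using only the first 20 bytes of the puzzle, Python's standard library can do both steps directly:

```python
import codecs

# First 20 bytes of the puzzle text (the opening sentence only)
prefix = "22 4a 72 27 65 72 20 6e 79 79 20 7a 6e 71 20 75 72 65 72 2e"

# Step 1: hexadecimal ASCII codes -> text
scrambled = bytes.fromhex(prefix).decode("ascii")
print(scrambled)     # "Jr'er nyy znq urer.

# Step 2: undo the 13-place shift; punctuation is left untouched
clear = codecs.decode(scrambled, "rot_13")
print(clear)         # "We're all mad here.
```

Because 13 is half of 26, applying the same shift twice gets back to the start, so the same call both scrambles and unscrambles.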

Puzzle 6b

Key: 3 8 1 0 8

Ciphertext: 1528262114512379959787446361667336365541049710185448490827733939750117578606349583824805994668155766548948086204569455380471171904239315967452691

The first thought is to somehow combine each of the digits of the ciphertext with one of the digits of the key. But of course all these ciphertext digits are from 0 to 9, and the key digits from 0 to 8, so any sum could only be a maximum of 17, and therefore insufficient to represent all the letters. So that can't be it.

The repeated "8" in the key is a little odd, but not too unusual, but the main problem is that there is no obvious connection to individual letters. If it's not one digit plus one digit giving one letter, then what could it be?

Even after giving up and reading the BBC's answer (simply xor-ing the digits together), there still wasn't much light shed on how this works. Taking the first digits, 1 and 3, gives 2 when xor-ed together, the 5 and 8 give 13, and the 2 and 1 give 3. But that doesn't get us any closer to the answer, unfortunately. What letters could 2, 13, 3 be? And we'd still have the problem that xor-ing two single digits can only produce values from 0 to 15, so we could represent at most 16 letters this way.

The answer is to not treat the ciphertext as a string of digits, but as a real (rather large) decimal number. And the key, confusingly, must be treated first as a string, and repeated until it's the same length as the ciphertext string, but then reinterpreted as another (also rather large) decimal number. So this isn't the kind of deciphering that's easy to do in one's head.

len(secret6b)
145
key6b = "38108"
len(secret6b)/len(key6b)
29.0
int(secret6b) ^ int(key6b*29)
39731163911532973211211111111432115111114116321111023210910110911111412132116104971163211111010812132119
11111410711532989799107119971141001154639

So the key, which is 5 digits long, apparently needs to be repeated 29 times so that it is 145 digits long like the ciphertext is. Then they're both treated as decimal numbers and xor-ed together. This still doesn't look too promising though. We've xor-ed two very large numbers together, and got another very large number as the result. What now?

Again we have to rely on the BBC's answers page, which tells us that pairs or triplets of these decimal digits can be used as ASCII codes to give letters. Exactly where the splits must be made is a little ambiguous, but given that we know we only want character codes up to around 127 or so, and probably don't want codes below 30, we get the following:

msg6b = ("39,73,116,39,115,32,97,32,112,111,111,114,32,115,111,114,116,32,111,102,32,109,101,109,111,114,121,32,116,104,97,116,32,111,110,"
    "108,121,32,119,111,114,107,115,32,98,97,99,107,119,97,114,100,115,46,39").split(",")
"".join([chr(int(i)) for i in msg6b])
"'It's a poor sort of memory that only works backwards.'"

This puzzle was definitely more difficult to decipher, but it suffers from being ambiguous in the splitting up of the digits, and presumably if the message length were not a multiple of 5 then it would have to be padded somehow until it was. Plus it's not particularly convenient to do by hand, and even using a computer it would be very difficult to implement for longer messages for which the numbers just become inconveniently large. It's also fairly inefficient, using 145 characters of ciphertext to produce only 55 characters of message.
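
If we are willing to assume that every code is a printable ASCII value between 32 and 126, the splitting can be done mechanically: codes from 32 to 99 have two digits and start with 3 to 9, while codes from 100 to 126 have three digits and start with 1. The following is only a sketch under that assumption (the name split_codes is ours), tried out on the first 32 digits of the xor result:

```python
def split_codes(digits):
    # Assumes every code is printable ASCII (32..126): a leading '1'
    # means a three-digit code, any other leading digit a two-digit code.
    codes = []
    i = 0
    while i < len(digits):
        width = 3 if digits[i] == "1" else 2
        codes.append(int(digits[i:i + width]))
        i += width
    return codes

# The first 32 digits of the xor result
digits = "39731163911532973211211111111432"
codes = split_codes(digits)
print(codes)
print("".join(chr(c) for c in codes))   # 'It's a poor
```

Without that assumption (for example if control codes such as 10 for a newline were allowed), the leading digit would no longer settle the width and the ambiguity would return.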

Interestingly, because this scheme doesn't work on a character-by-character basis, if you wanted to change one of the plaintext characters, even by a single bit like from 'H' to 'I', the effect will ripple through a large number of the decimal characters giving a significantly different ciphertext. Also, changing the case of a single character, like from a 'T' to a 't', could require one more digit in the ASCII representation, and then the decimal digit string wouldn't be a multiple of 5 digits any more. Finally, even if you have your plaintext of the required length, and you've chosen a key, it could be that the xor of these two values gives a ciphertext whose length in decimal representation is not exactly the same length as the digit string (it could be smaller or larger), and so the ciphertext length wouldn't be a multiple of the key length. So you'd have to have a rule to concatenate the key strings as many times as necessary until the length is equal to or greater than the length of the ciphertext.

Puzzle 6c

[Image: a chessboard with a scattering of black and white pieces on its 64 squares]

This appears to be a regular chessboard, with a scattering of black and white pieces on it. The shading of the squares doesn't appear to hold any information, because they are just alternating light and dark squares as on a regular chessboard. So the message must be somehow contained in the positioning of the pieces.

One idea would be that numbers are encoded in a base-3 format, so an empty square could be a zero, a black piece a one and a white piece a two (for example). Then groups of squares might combine these base-3 numbers into a letter code, for example 3 squares could give a number between 0 and 26, enough for one letter (or perhaps a space). Unfortunately the 64 squares on the board don't divide neatly into groups of 3, and it's also not clear how the groups of 3 might be formed.

We can quickly see that the distribution of these tokens is not even: there are only 12 white pieces and 19 empty squares, yet 33 black pieces. That isn't what one would expect from a numerical encoding.

Again we have to cheat, and the BBC gives us the insight that we should think about Morse code. Even though the direction of reading the squares is entirely arbitrary, we try using our standard Western reading direction, starting from the top-left and reading rightwards, line by line. If we write a white piece as "W" and a black piece as "B", this gives us:

WWW.BBWB.BBWB..BWW.BB.W.BBBB..W.BBBB.B.BB.BWB..BBBB.B.BW.WBB.BBB

If we now know that this is Morse code, there are only two options left to us - either white means dot and black dash, or the other way around. So we try both:

SQQ DME? E?TMK ?TNWO
OFF WITH THEIR HEADS

So it seems that white pieces represent a dash and black pieces represent a dot, and the answer is "Off with their heads", yet another reference to Alice in Wonderland by Lewis Carroll.
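
Trying both options by hand is tedious, so here is a sketch that does it for us. The Morse table is the standard one for the letters A to Z. Treating one empty square as a gap between letters and two empty squares as a gap between words is our own reading of the board, and any pattern not in the table is shown as "?":

```python
MORSE = {
    ".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E", "..-.": "F",
    "--.": "G", "....": "H", "..": "I", ".---": "J", "-.-": "K", ".-..": "L",
    "--": "M", "-.": "N", "---": "O", ".--.": "P", "--.-": "Q", ".-.": "R",
    "...": "S", "-": "T", "..-": "U", "...-": "V", ".--": "W", "-..-": "X",
    "-.--": "Y", "--..": "Z",
}

board = "WWW.BBWB.BBWB..BWW.BB.W.BBBB..W.BBBB.B.BB.BWB..BBBB.B.BW.WBB.BBB"

def decode(board, dash):
    # `dash` says which piece colour ("W" or "B") stands for a dash;
    # the other colour then stands for a dot.
    dot = "B" if dash == "W" else "W"
    to_morse = str.maketrans({dash: "-", dot: "."})
    words = []
    for word in board.split(".."):          # two empty squares: word gap
        letters = [MORSE.get(token.translate(to_morse), "?")
                   for token in word.split(".")]   # one empty square: letter gap
        words.append("".join(letters))
    return " ".join(words)

print(decode(board, dash="B"))   # SQQ DME? E?TMK ?TNWO
print(decode(board, dash="W"))   # OFF WITH THEIR HEADS
```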

It's an interesting way of hiding the information in what looks like a chess board, but obviously Morse code isn't a way to encrypt and hide information, it's a useful way to encode information precisely because it's standard and well-known. So in terms of cryptography and "cybersecurity" (as the BBC put it) it's a little obscure (and hence not completely trivial to crack), but not actually secure at all.

Solution