Foray into Cryptography

Until recently my main experience with cryptography was fitting my password to the arbitrary requirements of whatever website I’m signing up for. ‘Crypto’ keeps popping up randomly in web apps, so I bought a used version of Understanding Cryptography to understand… cryptography. The previous owner extensively highlighted the introductory chapter, but abruptly stopped at around page 22. I was about ready to give up too, but the next page has this fantastic quote in the “Lessons Learned” summary section:

“Never ever develop your own crypto algorithm unless you have a team of experienced cryptanalysts checking your design.”

I can definitely get behind something that not only recommends being lazy, but is also one of the major tenets of the field. Despite the brilliance of a single person or small team working in isolation, they will probably never beat the time-tested, heavily researched methods. I love that in the job of hiding information, the experts are extremely open about how to hide it.

One of the first exercises in the book is to crack a deciphered message that uses the infamous Caesar cipher.

The Caesar Cipher, or Shift Cipher, encrypts a document by assigning each letter a numerical value then ‘shifting’ it by some amount. For example, if the key is 3 then ‘A’ becomes ‘D’, ‘B’ becomes ‘E’, and so on. Once you get to ‘Z’ it loops around, so ‘Y’ becomes ‘B’, ‘Z’ becomes ‘C’, etc.

The encryption function e for our cleartext (normal) letter x, looks like this:

e(x) = (x + k) % 26

The decryption function for the encrypted letter y, looks like this:

d(y) = (y - k) % 26

where is the key (the number of spaces to shift) and is the modulo operator.

This encryption method isn’t very secure because the encrypted data retains characteristics of English (or whatever language the cleartext is written in). Certain letters and words are statistically more likely than others, so through analysis of the encrypted text you can determine the encryption key. Here’s the encrypted text given in the book:


Now the plan for decryption (in ruby):

  1. Write helper functions to convert words to numbers where A = 0, B = 1, C = 2, and vice versa.
  2. Write a function that will retrieve the letter frequency from the text
  3. Compare the most frequent letters to the most frequent letters in English to determine the key.

Admittedly, that’s a lot of work when you could just count the letters to skip right to step 3. But let’s pretend we had to do this on thousands of pages worth of data.

To turn letters into numbers I added a method to the String class named numerify:

class String
  #makes it so that 'a' is 0, 'b' is 1, etc.
  def numerify
    num = []
    self.downcase.each_char {|c| num << c.ord - 97}

Notice that returns an Array of numbers, so to reverse that function I had to modify the Array class:

class Array
  #takes an array of encrypted letters (as numbers) and turns into lowercase letters
  def denumerify
    letters = ""
    self.each {|n| letters += (n + 97).chr}

Now we write the encryption and decryption functions using the equations from above.

#input: string to be encrypted, key (int)
#output: encrypted string as a comma separated string
def encrypt(x, k)
  e = []
  x.numerify.each {|i| e << ((i + k) % 26)}
#input: takes in an encrypted string and the key,
#output returns the decrypted string
def decrypt(y, k)
  d = []
  y.numerify.each{|i| d << ((i - k) % 26)}

Finally, we have to count the letter frequency so we can actually decipher the message. Here was my original way of doing it, which returns a key => value hash (letter => count).

letter_freq = {}
encrypted_text.chars.each do |i|
  letter_freq[i].nil? ? letter_freq[i] = 1 : letter_freq[i] += 1

But there’s always a better way! I found this more rubyesque method on Stack Overflow which I could tack onto the ‘String’ class:

class String
  def frequency
    self.chomp.each_char.with_object( do |letter, hash|
      hash[letter.downcase] += 1
  #makes it so that 'a' is 0, 'b' is 1, etc.
  def numerify
    num = []
    self.downcase.each_char {|c| num << c.ord - 97}

Almost there. Now we can call the frequency method on our encrypted string to get the letter count, then sort by occurrence:

letter_freq = encrypted_string.frequency
#sort by occurance
sorted_letter_frequency = []
letter_freq.each_pair do |i|
  sorted_letter_frequency << i
  sorted_letter_frequency.sort! do |i, j|
    j[1] <=> i[1]

The most common letter in the encrypted text is t, then i, then x. Wikipedia has some nice graphs on letter frequency, but usually you can just assume the most common letter is ‘e’.

E(4) -> T(19) = 15 letters in between, so we print out the decryption with a key of 15, and there’s our answer.

p decrypt(encrypted_string, 15)

I’m not going to spoil this one, but let’s just say these crypto guys are a bit sick in the head. Anyways, it was fun pretending to be Jake Gyllenhaal in Zodiac and I hope to write more about encryption in the future.