Lab 02 - Shift and Vigenère ciphers

• format_funcs.py: the format functions from the previous lab;
• caesar.py: the implementation of the Caesar encryption and decryption from the previous lab;
• msg_ex2.txt: the text which needs to be decrypted for exercise 2;
• msg_ex3.txt: the text which needs to be decrypted for exercise 3;
• ex1.py: implementation of exercise 1;
• ex2.py: implementation of exercise 2;
• ex3.py: implementation of exercise 3.

You need to fill in the TODOs in ex1.py, ex2.py and ex3.py.

Exercise 1 (2p)

Alice sends Bob the following ciphertexts:

LDPWKHORUGBRXUJRG
DTZXMFQQSTYRFPJDTZWXJQKFSDLWFAJSNRFLJ
SIOMBUFFHINNUEYNBYHUGYIZNBYFILXSIOLAIXCHPUCH
ERZRZOREGURFNOONGUQNLGBXRRCVGUBYL
CJIJPMTJPMAVOCZMVIYTJPMHJOCZM
DTZXMFQQSTYRZWIJW
ZPVTIBMMOPUDPNNJUBEVMUFSZ
FVBZOHSSUVAZALHS
KAGETMXXZAFSUHQRMXEQFQEFUYAZKMSMUZEFKAGDZQUSTNAGD
MCIGVOZZBCHRSGWFSOBMHVWBUHVOHPSZCBUGHCMCIFBSWUVPCIF

Charlie manages to capture the ciphertexts and finds that they were encrypted with the shift cipher (each message possibly with a different key). Can you decrypt the messages?

Charlie also knows that the plaintext consists only of the English letters A to Z (all capitals, no punctuation).

Hint: What do all the plaintexts have in common? The answer is YOU.
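
One way to use the hint is to try all 26 keys on each ciphertext and keep only the candidate plaintexts that contain the word YOU. The sketch below is written for Python 3 and does not rely on caesar.py. The helper names shift_decrypt and candidates_with_crib are made up for this illustration, and it assumes that key k means "each letter was moved forward by k positions when encrypting".

```python
import string

ALPHABET = string.ascii_uppercase

def shift_decrypt(ciphertext, key):
    """Undo a shift cipher: move every letter back by `key` positions."""
    return "".join(ALPHABET[(ALPHABET.index(c) - key) % 26] for c in ciphertext)

def candidates_with_crib(ciphertext, crib="YOU"):
    """Return the (key, plaintext) pairs whose plaintext contains `crib`."""
    hits = []
    for key in range(26):
        plaintext = shift_decrypt(ciphertext, key)
        if crib in plaintext:
            hits.append((key, plaintext))
    return hits
```

Calling candidates_with_crib on each captured ciphertext should leave very few candidates per message; if more than one key survives, the remaining plaintexts can be checked by eye.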

Exercise 2 (4p)

Alice sends Bob another ciphertext, much longer this time (see msg_ex2.txt).

Charlie needs to decrypt this as well. Some colleagues tell him this is encrypted using the substitution cipher, and that again the plaintext consists only of the English letters A to Z (all capitals, no punctuation). Try to help Charlie to decrypt this.

Hint: use the frequency analysis mechanisms we discussed in class. Note that the letter frequencies in the ciphertext do not match the given table exactly. In particular, the two most frequent letters match the table well, but the others are sometimes swapped. However, Charlie knows that the most frequent bigrams are the following (from most to least frequent): TH, HE, IN, OR, HA, ET, AN, EA, IS, OU, HI, ER, ST, RE, ND

With this information, can you tell what the ciphertext is about?
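
A possible starting point for the frequency analysis is to count single letters and bigrams in the ciphertext and compare the rankings with the letter table and the bigram list from the hint. This is a minimal Python 3 sketch; the function names letter_frequencies and top_bigrams are illustrative and not part of the lab files.

```python
from collections import Counter

def letter_frequencies(text):
    """Relative frequency of each letter A-Z, most frequent first."""
    letters = [c for c in text if "A" <= c <= "Z"]
    total = len(letters)
    return {c: n / total for c, n in Counter(letters).most_common()}

def top_bigrams(text, k=15):
    """The k most frequent pairs of adjacent letters with their counts."""
    letters = "".join(c for c in text if "A" <= c <= "Z")
    pairs = Counter(letters[i:i + 2] for i in range(len(letters) - 1))
    return pairs.most_common(k)
```

Mapping the two most frequent ciphertext letters first, then using the top ciphertext bigrams to guess the letters of TH and HE, usually gives enough partial plaintext to fill in the rest of the substitution by hand.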

Exercise 3 (4p)

Charlie manages to capture one last communication, which turns out to be the most important, so it is crucial that he decrypts it. However, this time Alice used the Vigenère cipher, with a key that Charlie knows has 7 characters.

The ciphertext is in the file msg_ex3.txt. Try the method of multiplying probabilities as explained in class and see if you can decrypt it. You can find details about this method here.

These are the known frequencies of the plaintext:

{'A': 0.07048643054277828,
'C': 0.01577161913523459,
'B': 0.012074517019319227,
'E': 0.13185372585096597,
'D': 0.043393514259429625,
'G': 0.01952621895124195,
'F': 0.023867295308187673,
'I': 0.06153403863845446,
'H': 0.08655128794848206,
'K': 0.007566697332106716,
'J': 0.0017594296228150873,
'M': 0.029657313707451703,
'L': 0.04609015639374425,
'O': 0.07679967801287949,
'N': 0.060217341306347746,
'Q': 0.0006382244710211592,
'P': 0.014357175712971482,
'S': 0.05892939282428703,
'R': 0.05765294388224471,
'U': 0.02749540018399264,
'T': 0.09984475620975161,
'W': 0.01892824287028519,
'V': 0.011148804047838086,
'Y': 0.023045078196872126,
'X': 0.0005289788408463661,
'Z': 0.00028173873045078196}
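
The following Python 3 sketch shows one common form of this attack. It assumes that the method from class is the following: for each of the 7 key positions, collect the ciphertext letters encrypted with that key letter, compute their observed frequencies q, and pick the shift k that maximizes the sum over i of p[i] * q[(i + k) mod 26], where p is the table above. The names vigenere, best_shift and recover_key are hypothetical, and plain_freq stands for the frequency dictionary given above.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    """Shift letter i of `text` by key letter i mod len(key) (backwards to decrypt)."""
    sign = -1 if decrypt else 1
    out = []
    for i, c in enumerate(text):
        k = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(ALPHABET.index(c) + sign * k) % 26])
    return "".join(out)

def best_shift(stream, plain_freq):
    """Return the shift k that maximizes sum_i p[i] * q[(i + k) % 26]."""
    n = len(stream) or 1  # avoid dividing by zero on an empty stream
    q = [stream.count(c) / n for c in ALPHABET]
    p = [plain_freq.get(c, 0.0) for c in ALPHABET]
    scores = [sum(p[i] * q[(i + k) % 26] for i in range(26)) for k in range(26)]
    return max(range(26), key=lambda k: scores[k])

def recover_key(ciphertext, key_len, plain_freq):
    """Solve each key position independently and join the resulting letters."""
    return "".join(
        ALPHABET[best_shift(ciphertext[j::key_len], plain_freq)]
        for j in range(key_len)
    )
```

With the ciphertext loaded as a string of capital letters, recover_key(ciphertext, 7, plain_freq) gives a candidate key and vigenere(ciphertext, key, decrypt=True) the corresponding plaintext. For typical English text the score at the correct shift is about 0.065, noticeably higher than at wrong shifts, so printing the scores is a useful sanity check.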

Bonus: Exercise 4 (3p)

In class we explained that the one-time pad is malleable (i.e. we can easily change the encrypted plaintext by simply modifying the ciphertext). We have also discussed why using the CRC was a very bad idea in the design of WEP, due to its linearity.

You are given the following ciphertext in hexadecimal:

021e0e061d1694c9

which you know is the one-time pad encryption of the message “floare” concatenated with its CRC-16 (in hex, “8E31”), obtained from this website: http://www.lammertbies.nl/comm/info/crc-calculation.html

If we need to modify the ciphertext so that a correct decryption outputs “albina” instead of “floare” and such that the CRC-16 calculation remains correct, what is the modification we need to perform?

Output the new ciphertext after the necessary modifications and show that it correctly leads to the plaintext “albina” and a correct computation of its CRC-16.

You might find this starting script useful:

ex4_draft.py
import sys
import random
import string
import operator

def strxor(a, b):  # xor two strings (trims the longer input)
    return "".join(chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b))

def hexxor(a, b):  # xor two hex strings (trims the longer input)
    ha = bytes.fromhex(a)
    hb = bytes.fromhex(b)
    return bytes(x ^ y for (x, y) in zip(ha, hb)).hex()

def main():
    # Plaintexts
    s1 = 'floare'
    s2 = 'albina'
    G = ''  # To find

    # Obtain crc of s1
    # See this site:
    # http://www.lammertbies.nl/comm/info/crc-calculation.html
    x1 = s1.encode().hex()
    x2 = s2.encode().hex()
    print("x1: " + x1)
    crc1 = '8E31'  # CRC-16 of x1

    # Compute delta (xor) of x1 and x2:
    xd = hexxor(x1, x2)
    print("xd: " + xd)

if __name__ == "__main__":
    main()

Use the linearity property of CRC-16: for two messages m and d of equal length, CRC(m XOR d) = CRC(m) XOR CRC(d).

If d = 'floare' XOR 'albina' and C = [C1 | C2] = [m XOR G1 | CRC(m) XOR G2], then C1' = C1 XOR d.
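
The idea can be checked end to end on a pad you choose yourself. The Python 3 sketch below assumes that CRC-16 means the CRC-16/ARC variant (polynomial 0x8005 in reflected form 0xA001, initial value 0, no final XOR) and that the two CRC bytes are appended in big-endian order; whether this variant and byte order reproduce the 8E31 value quoted above is something to verify, not a given. The helpers crc16, xor, seal and forge are illustrative names.

```python
def crc16(data):
    """Bitwise CRC-16/ARC (assumed variant): reflected poly 0xA001, init 0."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc

def xor(a, b):
    """XOR two byte strings (trims the longer input)."""
    return bytes(x ^ y for x, y in zip(a, b))

def seal(msg, pad):
    """One-time pad encryption of msg || CRC(msg)."""
    return xor(msg + crc16(msg).to_bytes(2, "big"), pad)

def forge(ciphertext, old, new):
    """Turn an encryption of `old` into one of `new` without knowing the pad."""
    d = xor(old, new)
    delta = d + crc16(d).to_bytes(2, "big")  # [d | CRC(d)]
    return xor(ciphertext, delta)
```

Decrypting forge(seal(b"floare", pad), b"floare", b"albina") with the same pad yields b"albina" followed by a CRC that matches crc16(b"albina"), because XORing CRC(d) into the second part turns CRC(m) into CRC(m XOR d). Note that forge never touches the pad, which is exactly the malleability discussed in class.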