Check out this tutorial: here.
During the labs, you will often need to convert data from one format to another. The most used data formats are the following:
Finally, here you have some useful conversion functions and XOR operations for different data formats:
import base64

# CONVERSION FUNCTIONS
def _chunks(string, chunk_size):
    # split a string into consecutive pieces of chunk_size characters
    for i in range(0, len(string), chunk_size):
        yield string[i:i+chunk_size]

def hex_2_bin(data):
    return ''.join(f'{int(x, 16):08b}' for x in _chunks(data, 2))

def str_2_bin(data):
    return ''.join(f'{ord(c):08b}' for c in data)

def bin_2_hex(data):
    return ''.join(f'{int(b, 2):02x}' for b in _chunks(data, 8))

def str_2_hex(data):
    return ''.join(f'{ord(c):02x}' for c in data)

def bin_2_str(data):
    return ''.join(chr(int(b, 2)) for b in _chunks(data, 8))

def hex_2_str(data):
    return ''.join(chr(int(x, 16)) for x in _chunks(data, 2))

# XOR FUNCTIONS
def strxor(a, b):  # xor two strings, trims the longer input
    return ''.join(chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b))

def bitxor(a, b):  # xor two bit-strings, trims the longer input
    return ''.join(str(int(x) ^ int(y)) for (x, y) in zip(a, b))

def hexxor(a, b):  # xor two hex-strings, trims the longer input
    # format each byte as two hex digits so leading zeros are not lost
    return ''.join(f'{int(x, 16) ^ int(y, 16):02x}' for (x, y) in zip(_chunks(a, 2), _chunks(b, 2)))

# BASE64 FUNCTIONS
def b64decode(data):
    return bytes_to_string(base64.b64decode(string_to_bytes(data)))

def b64encode(data):
    return bytes_to_string(base64.b64encode(string_to_bytes(data)))

# PYTHON3 'BYTES' FUNCTIONS
def bytes_to_string(bytes_data):
    return bytes_data.decode()  # default utf-8

def string_to_bytes(string_data):
    return string_data.encode()  # default utf-8
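As a cross-check for the helpers above, the same conversions can be sketched with the standard library alone. The sample string and the variable names below are illustrative assumptions, not part of the lab code.

```python
import base64

text = "Hi!"                 # illustrative sample, not taken from the lab
raw = text.encode()          # str -> bytes (UTF-8 by default)

hex_form = raw.hex()                              # bytes -> hex string
bin_form = ''.join(f'{b:08b}' for b in raw)       # bytes -> bit string, 8 bits per byte
b64_form = base64.b64encode(raw).decode()         # bytes -> Base64 text

print(hex_form)   # 486921
print(bin_form)   # 010010000110100100100001
print(b64_form)   # SGkh

# Going back: each representation decodes to the original text
print(bytes.fromhex(hex_form).decode())       # Hi!
print(base64.b64decode(b64_form).decode())    # Hi!
```

All three outputs describe the same three bytes; only the notation differs.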
Check out the following examples:
text1 = "Ana are mere"
text2 = b"Ana are mere"
type(text1) # <class 'str'>
type(text2) # <class 'bytes'>
Both texts store essentially the same information. The difference is in how the data is represented internally, as two different object types. During the labs, we will mostly work with str types, but some external libraries may require converting the data from its string representation to a bytes object.
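A minimal sketch of converting between the two types, assuming UTF-8 (the default for encode() and decode()); the variable names are illustrative:

```python
text = "Ana are mere"

# str -> bytes: encode() uses UTF-8 unless told otherwise
data = text.encode()

# bytes -> str: decode() reverses the conversion
restored = data.decode()

print(type(data))         # <class 'bytes'>
print(data)               # b'Ana are mere'
print(restored == text)   # True
```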
Decode the following strings:
C1 = "010101100110000101101100011010000110000101101100011011000110000100100001"
C2 = "526f636b2c2050617065722c2053636973736f727321"
C3 = "WW91IGRvbid0IG5lZWQgYSBrZXkgdG8gZW5jb2RlIGRhdGEu"
Find the plaintext messages for the following ciphertexts knowing that the cipher is the XOR operation (ciphertext = plaintext XOR key) and the key is “abcdefghijkl”.
C1 = "000100010001000000001100000000110001011100000111000010100000100100011101000001010001100100000101"
C2 = "02030F07100A061C060B1909"
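Since XOR is its own inverse, applying the same key twice returns the original data. The sketch below illustrates this on a made-up message and key (both are hypothetical, chosen only for the example, and differ from the ones in the exercise); xor_bytes is an illustrative helper working on bytes rather than on the string formats used above.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    # XOR two byte strings position by position; zip() stops at the shorter input
    return bytes(x ^ y for x, y in zip(a, b))

key = b"lab-key-0123"         # hypothetical key, not the one from the exercise
plaintext = b"attack at 10"   # hypothetical message, same length as the key

ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)   # XOR with the same key undoes the encryption

print(ciphertext.hex())
print(recovered)   # b'attack at 10'
```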
Let's start with a simple one:
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def caesar_enc(letter):
    if letter < 'A' or letter > 'Z':
        print("Invalid letter")
        return
    else:
        return alphabet[(ord(letter) - ord('A') + 3) % len(alphabet)]
Create a new file named caesar.py containing the code above. To test the code, open the interpreter and try the following:
shell$ python
>>> from caesar import *
>>> print(alphabet)
>>> alphabet[0]
>>> ord('A')
>>> len(alphabet)
>>> ord('D') - ord('A')
>>> 26 % 26
>>> 28 % 26
>>> -1 % 26
>>> caesar_enc('D')
>>> caesar_enc('Z')
>>> caesar_enc('B')
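The modulo expressions in the session above are what make the cipher wrap around the end of the alphabet. As a small sketch of that behaviour (the variable name shifted is illustrative): in Python, the result of % with a positive modulus is never negative.

```python
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

print(26 % 26)   # 0
print(28 % 26)   # 2
print(-1 % 26)   # 25, Python keeps the result in the range 0..25

# 'Z' sits at position 25; shifting by 3 gives 28, which wraps to position 2
shifted = (ord('Z') - ord('A') + 3) % len(alphabet)
print(alphabet[shifted])   # C
```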
Add a 'caesar_dec' function to 'caesar.py', which decrypts a single letter encrypted using Caesar's cipher.
We'll now expand our function to take strings as input.
alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def caesar_enc_string(plaintext):
    ciphertext = ''
    for letter in plaintext:
        ciphertext = ciphertext + caesar_enc(letter)
    return ciphertext
Test the above by starting a new interpreter:
linux$ python -i caesar.py
>>> test = 'HELLO'
>>> test + 'WORLD'
>>> caesar_enc_string(test)
Another way to run things, which can be very useful in general, is to use a main() function and write your program script as follows:
alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def caesar_enc(letter):
    if letter < 'A' or letter > 'Z':
        print('Invalid letter')
        return
    else:
        return alphabet[(ord(letter) - ord('A') + 3) % len(alphabet)]

def caesar_enc_string(plaintext):
    ciphertext = ''
    for letter in plaintext:
        ciphertext = ciphertext + caesar_enc(letter)
    return ciphertext

def main():
    m = 'BINEATIVENIT'
    c = caesar_enc_string(m)
    print(c)

if __name__ == "__main__":
    main()
Then you can simply run the program, or type the following in a terminal:
python test_caesar.py
Add the corresponding 'caesar_dec_string' function.
Python allows passing default values to parameters. We can use default parameter values to expand our 'caesar_enc' function to take the key as an additional parameter, without breaking compatibility with our previous code.
def caesar_enc(letter, k = 3):
    if letter < 'A' or letter > 'Z':
        print('Invalid letter')
        return None
    else:
        return alphabet[(ord(letter) - ord('A') + k) % len(alphabet)]

def caesar_enc_string(plaintext, k = 3):
    ciphertext = ''
    for letter in plaintext:
        ciphertext = ciphertext + caesar_enc(letter, k)
    return ciphertext
To test the new functions, try the following:
shell$ python -i caesar.py
>>> caesar_enc_string('HELLO')
>>> caesar_enc_string('HELLO', 0)
>>> caesar_enc_string('HELLO', 1)
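A default parameter can be omitted, passed by position, or passed by keyword. The sketch below shows the three call styles on a hypothetical helper, shift_index, which is not part of caesar.py and only shifts a 0-based alphabet position.

```python
def shift_index(index, k=3):
    # hypothetical helper: shift a 0-based alphabet position by k, wrapping at 26
    return (index + k) % 26

print(shift_index(25))        # 2, uses the default k=3
print(shift_index(25, 1))     # 0, key passed by position
print(shift_index(25, k=0))   # 25, key passed by keyword, no shift
```

Existing calls that pass only the first argument keep working unchanged, which is why adding k with a default value does not break earlier code.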
Using default parameters, expand your shift cipher decryption functions to support arbitrary keys.
Decrypt the last ciphertext knowing that all the messages were encrypted with the same key using OTP.
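The reason key reuse matters can be sketched as follows: when two messages are XORed with the same pad, XORing the two ciphertexts cancels the key and leaves the XOR of the two plaintexts. The messages and the helper xor_bytes below are hypothetical and only illustrate this property; they are not the lab's ciphertexts.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # XOR two byte strings position by position; zip() stops at the shorter input
    return bytes(x ^ y for x, y in zip(a, b))

m1 = b"first message"        # hypothetical plaintexts of equal length
m2 = b"other message"
key = os.urandom(len(m1))    # random pad, wrongly used twice below

c1 = xor_bytes(m1, key)
c2 = xor_bytes(m2, key)

# The key cancels out: c1 XOR c2 == m1 XOR m2
print(xor_bytes(c1, c2) == xor_bytes(m1, m2))   # True

# Knowing (or guessing) m1 then reveals m2 without ever learning the key
print(xor_bytes(xor_bytes(c1, c2), m1))         # b'other message'
```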
1. Activate WSL (Windows Subsystem for Linux) here.
2. Install Ubuntu from the Windows Store.
3. Open a terminal and type “ubuntu”.
4. Wait.