In this exercise we'll try to break a Linear Congruential Generator (LCG), which may be used to generate “poor” random numbers. We implemented such a weak RNG, used it to generate a sequence of bytes, and then encrypted a plaintext message with those bytes. The resulting ciphertext, in hexadecimal, is:
a432109f58ff6a0f2e6cb280526708baece6680acc1f5fcdb9523129434ae9f6ae9edc2f224b73a8
You know that the LCG uses the following formula to produce each byte:
s_next = (a * s_prev + b) mod p
where both s_prev and s_next are byte values (between 0 and 255) and p is 257. Both a and b are values between 0 and 256.
You also know that the first 16 letters of the plaintext are “Let all creation” and that the ciphertext was generated by xor-ing a string of consecutive bytes generated by the LCG with the plaintext.
Can you break the LCG and predict the RNG stream, so that in the end you recover the entire plaintext?
You may use this starting code:
```python
import sys
import random
import string
import operator


# Parameters for weak LC RNG
class WeakRNG:
    "Simple class for weak RNG"

    def __init__(self):
        self.rstate = 0
        self.maxn = 255
        self.a = 0  # Set this to correct value
        self.b = 0  # Set this to correct value
        self.p = 257

    def init_state(self):
        "Initialise rstate"
        self.rstate = 0  # Set this to some value
        self.update_state()

    def update_state(self):
        "Update state"
        self.rstate = (self.a * self.rstate + self.b) % self.p

    def get_prg_byte(self):
        "Return a new PRG byte and update PRG state"
        b = self.rstate & 0xFF
        self.update_state()
        return b


def strxor(a, b):  # xor two strings (trims the longer input)
    return "".join([chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b)])


def hexxor(a, b):  # xor two hex strings (trims the longer input)
    ha = bytes.fromhex(a)
    hb = bytes.fromhex(b)
    return bytes(x ^ y for (x, y) in zip(ha, hb)).hex()


def main():
    # Initialise weak rng
    wr = WeakRNG()
    wr.init_state()

    # Print ciphertext
    CH = 'a432109f58ff6a0f2e6cb280526708baece6680acc1f5fcdb9523129434ae9f6ae9edc2f224b73a8'
    print("Full ciphertext in hexa: " + CH)

    # Print known plaintext
    pknown = 'Let all creation'
    nb = len(pknown)
    print("Known plaintext: " + pknown)
    pkh = pknown.encode().hex()
    print("Plaintext in hexa: " + pkh)

    # Obtain first nb bytes of RNG
    gh = hexxor(pkh, CH[0:nb*2])
    print(gh)
    gbytes = []
    for i in range(nb):
        gbytes.append(int(gh[2*i:2*i+2], 16))
    print("Bytes of RNG: ")
    print(gbytes)

    # Break the LCG here:
    # 1. find a and b
    # 2. predict/generate rest of RNG bytes
    # 3. decrypt plaintext

    # Print full plaintext
    p = ''
    print("Full plaintext is: " + p)


if __name__ == "__main__":
    main()
```
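As a hint for step 1 of the starting code, the sketch below shows one possible way to recover the parameters of an LCG from three consecutive states. Subtracting s_1 = (a * s_0 + b) mod p from s_2 = (a * s_1 + b) mod p gives s_2 - s_1 = a * (s_1 - s_0) mod p, which can be solved for a because p is prime. The helper names `modinv` and `recover_lcg`, as well as the toy parameters, are made up for illustration and are not part of the exercise; the sketch also assumes that the observed bytes are equal to the internal states and that s_0 differs from s_1.

```python
def modinv(x, p):
    """Inverse of x modulo a prime p (Fermat's little theorem)."""
    x %= p
    if x == 0:
        raise ValueError("0 has no inverse modulo p")
    return pow(x, p - 2, p)


def recover_lcg(s0, s1, s2, p=257):
    """Recover (a, b) from three consecutive LCG states s0, s1, s2.

    s1 = a*s0 + b and s2 = a*s1 + b (mod p), so subtracting gives
    s2 - s1 = a * (s1 - s0) (mod p).
    """
    a = ((s2 - s1) * modinv(s1 - s0, p)) % p
    b = (s1 - a * s0) % p
    return a, b


# Toy generator with made-up parameters (NOT the ones from the exercise).
toy_a, toy_b, toy_p = 3, 11, 257
states = [42]
for _ in range(5):
    states.append((toy_a * states[-1] + toy_b) % toy_p)

a, b = recover_lcg(states[0], states[1], states[2], toy_p)
print(a, b)

# Once a and b are known, every later state can be predicted.
predicted = (a * states[2] + b) % toy_p
print(predicted == states[3])
```

The same idea should carry over to the bytes recovered from the known plaintext; the remaining known bytes can then be used to check that the recovered pair reproduces the whole observed stream.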
Let's use the experiment defined earlier as a pseudorandom generator ($\mathsf{PRG}$) as follows:
a. Implement the frequency (monobit) test from NIST (see section 2.1) and check whether a sequence generated by the above $\mathsf{PRG}$ (say, of length $n=100$) seems random or not.
b. Run the test on a random bitstring (e.g. a string such as R used by the above $\mathsf{PRG}$), and compare the result of the test.
If the two results are different across many iterations, this test already gives you an attacker that breaks the $\mathsf{PRG}$.
```python
import random


def get_random_string(n):
    # generate random bit string
    bstr = bin(random.getrandbits(n)).lstrip('0b').zfill(n)
    return bstr
```
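A minimal sketch of the monobit test is given below, assuming the input is a string of '0' and '1' characters. It follows the usual description of the test: map each bit to -1 or +1, sum the values to get S_n, compute s_obs = |S_n| / sqrt(n), and take the p-value erfc(s_obs / sqrt(2)). The function name `monobit_test` and the `alpha` parameter are illustrative choices; the 0.01 threshold is the significance level commonly used with the NIST tests.

```python
import math


def monobit_test(bits, alpha=0.01):
    """Frequency (monobit) test on a string of '0'/'1' characters.

    Returns (p_value, passed); the sequence is accepted as random
    when p_value >= alpha.
    """
    n = len(bits)
    # Map 0 -> -1 and 1 -> +1, then sum.
    s_n = sum(1 if bit == '1' else -1 for bit in bits)
    s_obs = abs(s_n) / math.sqrt(n)
    p_value = math.erfc(s_obs / math.sqrt(2))
    return p_value, p_value >= alpha


# Short illustrative input: six ones and four zeros.
p_value, passed = monobit_test('1011010101')
print(p_value, passed)
```

For a real comparison, the test should be run on much longer strings (for example $n=100$, as suggested above) and repeated many times, since a single p-value says little on its own.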
In this exercise we'll build a simple Linear Feedback Shift Register (LFSR). LFSRs produce random bit strings with good statistical properties, but are very easy to predict.
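The register is a sequence of $n$ bits; an LFSR is defined by its initial state and by a feedback polynomial, whose exponents give the positions of the bits that are $\mathsf{xor}$-ed to produce each new bit.
For example, given an $18$-bit LFSR, the polynomial $X^{18} + X^{11} + 1$ and the initial state: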
```
state = '001001001001001001'
                   *      *
```
we generate a new bit $b$ by $\mathsf{xor}$-ing bits $11$ ($0$) and $18$ ($1$), thus obtaining $b = 1$. We then shift the whole register to the right (thus dropping the right-most bit, which is the bit we add to the generated random sequence) and insert $b$ to the left. Thus, the new state is:
state = '100100100100100100'
The process is repeated until the desired number of bits have been generated.
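The steps above can be sketched as follows. This is only one possible implementation, assuming the register is kept as a string of '0' and '1' characters with bit $1$ as the left-most character; the names `lfsr_step` and `lfsr_bits` are illustrative and not part of the exercise.

```python
def lfsr_step(state, taps):
    """Advance the register by one step.

    state: string of '0'/'1' characters, bit 1 is the left-most one.
    taps:  1-indexed bit positions to xor (the polynomial's exponents).
    Returns (new_state, output_bit).
    """
    new_bit = 0
    for position in taps:
        new_bit ^= int(state[position - 1])
    output_bit = state[-1]                 # right-most bit is emitted
    new_state = str(new_bit) + state[:-1]  # shift right, insert on the left
    return new_state, output_bit


def lfsr_bits(state, taps, count):
    """Generate `count` output bits, returned as a string."""
    out = []
    for _ in range(count):
        state, bit = lfsr_step(state, taps)
        out.append(bit)
    return "".join(out)


state = '001001001001001001'
taps = (11, 18)  # exponents of X^18 + X^11 + 1
print(lfsr_step(state, taps)[0])
stream = lfsr_bits(state, taps, 100)
print(len(stream))
```

Keeping the state as a string mirrors the notation used in the example; an integer with bit shifts would be faster, but harder to compare against the states shown here.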
Using the above starting state and polynomial, generate $100$ random bits and run the monobit statistical test from the previous exercise to see if their frequency seems random.