Lab 03 - PRGs

The PowerPoint presentation for this lab can be found here.

utils.py
import base64
 
# CONVERSION FUNCTIONS
 
def _chunks(string, chunk_size):
    for i in range(0, len(string), chunk_size):
        yield string[i:i+chunk_size]
 
def _hex(x):
    return format(x, '02x')
 
def hex_2_bin(data):
    return ''.join(f'{int(x, 16):08b}' for x in _chunks(data, 2))
 
def str_2_bin(data):
    return ''.join(f'{ord(c):08b}' for c in data)
 
def bin_2_hex(data):
    return ''.join(f'{int(b, 2):02x}' for b in _chunks(data, 8))
 
def str_2_hex(data):
    return ''.join(f'{ord(c):02x}' for c in data)
 
def bin_2_str(data):
    return ''.join(chr(int(b, 2)) for b in _chunks(data, 8))
 
def hex_2_str(data):
    return ''.join(chr(int(x, 16)) for x in _chunks(data, 2))
 
# XOR FUNCTIONS
 
def strxor(a, b):  # xor two strings, trims the longer input
    return ''.join(chr(ord(x) ^ ord(y)) for (x, y) in zip(a, b))
 
def bitxor(a, b):  # xor two bit-strings, trims the longer input
    return ''.join(str(int(x) ^ int(y)) for (x, y) in zip(a, b))
 
def hexxor(a, b):  # xor two hex-strings, trims the longer input
    return ''.join(_hex(int(x, 16) ^ int(y, 16)) for (x, y) in zip(_chunks(a, 2), _chunks(b, 2)))
 
# BASE64 FUNCTIONS
 
def b64decode(data):
    return bytes_to_string(base64.b64decode(string_to_bytes(data)))
 
def b64encode(data):
    return bytes_to_string(base64.b64encode(string_to_bytes(data)))
 
# PYTHON3 'BYTES' FUNCTIONS
 
def bytes_to_string(bytes_data):
    return bytes_data.decode()  # default utf-8
 
def string_to_bytes(string_data):
    return string_data.encode()  # default utf-8

Exercise 1

In this exercise we'll try to break a Linear Congruential Generator (LCG), which may be used to generate “poor” random numbers. We implemented such a weak RNG, used it to generate a sequence of bytes, and then encrypted a plaintext message with that sequence. The resulting ciphertext, in hexadecimal, is:

a432109f58ff6a0f2e6cb280526708baece6680acc1f5fcdb9523129434ae9f6ae9edc2f224b73a8

You know that the LCG uses the following formula to produce each byte:

s_next = (a * s_prev + b) mod p

where both s_prev and s_next are byte values (between 0 and 255) and p is 257. Both a and b are values between 0 and 256.

You also know that the first 16 letters of the plaintext are “Let all creation” and that the ciphertext was generated by XOR-ing a string of consecutive bytes generated by the LCG with the plaintext.

Can you break the LCG and predict the RNG stream so that in the end you find the entire plaintext?

You may use this starting code:

'ex1_weak_rng.py'
from utils import *
 
#Parameters for weak LC RNG
class WeakRNG:
    "Simple class for weak RNG"
    def __init__(self):
        self.rstate = 0
        self.maxn = 255
        self.a = 0 #Set this to correct value
        self.b = 0 #Set this to correct value
        self.p = 257
 
    def init_state(self):
        "Initialise rstate"
        self.rstate = 0 #Set this to some value
        self.update_state()
 
    def update_state(self):
        "Update state"
        self.rstate = (self.a * self.rstate + self.b) % self.p
 
    def get_prg_byte(self):
        "Return a new PRG byte and update PRG state"
        b = self.rstate & 0xFF
        self.update_state()
        return b
 
 
def main():
 
    #Initialise weak rng
    wr = WeakRNG()
    wr.init_state()
 
    #Print ciphertext
    CH = 'a432109f58ff6a0f2e6cb280526708baece6680acc1f5fcdb9523129434ae9f6ae9edc2f224b73a8'
    print("Full ciphertext in hexa: " + CH)
 
    #Print known plaintext
    pknown = 'Let all creation'
    nb = len(pknown)
    print("Known plaintext: " + pknown)
    pkh = str_2_hex(pknown)
    print("Plaintext in hexa: " + pkh)
 
    #Obtain first nb bytes of RNG
    gh = hexxor(pkh, CH[0:nb*2])
    print(gh)
    gbytes = []
    for i in range(nb):
        gbytes.append(ord(hex_2_str(gh[2*i:2*i+2])))
    print("Bytes of RNG: ")
    print(gbytes)
 
    #Break the LCG here:
    #1. find a and b
    #2. predict/generate rest of RNG bytes
    #3. decrypt plaintext
 
    # Print full plaintext
    p = ''
    print("Full plaintext is: " + p)
 
 
if __name__ == "__main__":
    main()  
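
One possible way to approach step 1 is sketched below on a toy stream, not on the ciphertext above. The helper names `lcg_stream` and `recover_lcg_params`, and the parameters a = 5, b = 7 with seed 3, are assumptions made purely for illustration. Because p = 257 is small, the sketch simply tries every (a, b) pair and keeps those consistent with the observed bytes. Steps whose previous byte is 0 are skipped, since the state could have been either 0 or 256 before the `& 0xFF` mask.

```python
def lcg_stream(a, b, seed, n, p=257):
    """Return n output bytes of a toy LCG (hypothetical helper, mirrors WeakRNG)."""
    out, state = [], seed
    for _ in range(n):
        out.append(state & 0xFF)        # a state of 256 is emitted as byte 0
        state = (a * state + b) % p
    return out


def recover_lcg_params(observed, p=257):
    """Brute-force every (a, b) pair consistent with consecutive output bytes."""
    candidates = []
    for a in range(p):
        for b in range(p):
            consistent = True
            for prev, nxt in zip(observed, observed[1:]):
                if prev == 0:
                    continue            # ambiguous step: state was 0 or 256
                if ((a * prev + b) % p) & 0xFF != nxt:
                    consistent = False
                    break
            if consistent:
                candidates.append((a, b))
    return candidates


# Toy parameters chosen for the demo; they are not the lab's secret values.
observed = lcg_stream(a=5, b=7, seed=3, n=6)
print(observed)
print(recover_lcg_params(observed))
```

Once a single candidate remains, the same update rule can be iterated from the last known byte to predict the rest of the stream.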

Exercise 2

Advantage. The purpose of this problem is to clarify the concept of advantage. Consider the following two experiments $\mathsf{EXP}(0)$ and $\mathsf{EXP}(1)$:

  • In $\mathsf{EXP}(0)$ the challenger flips a fair coin (probability $1/2$ for HEADS and $1/2$ for TAILS) and sends the result to the adversary $\mathsf{A}$.
  • In $\mathsf{EXP}(1)$ the challenger always sends TAILS to the adversary.

Let r = 0 for HEADS and r = 1 for TAILS. Then we have the experiment as shown below:

The adversary’s goal is to distinguish these two experiments: at the end of each experiment the adversary outputs a bit $0$ or $1$ as its guess for which experiment it is in. For $b = 0,1$ let $W_{b}$ be the event that in experiment $b$ the adversary outputs $1$. The adversary tries to maximize its distinguishing advantage, namely the quantity $\mathsf{Adv} = \left| \mathsf{Pr}\left[W_{0}\right] - \mathsf{Pr}\left[W_{1}\right] \right| \in \left[0, 1\right]$.

The advantage $\mathsf{Adv}$ captures the adversary’s ability to distinguish the two experiments. If the advantage is $0$ then the adversary behaves exactly the same in both experiments and therefore does not distinguish between them. If the advantage is $1$ then the adversary can tell perfectly what experiment it is in. If the advantage is negligible for all efficient adversaries (as defined in class) then we say that the two experiments are indistinguishable.

a. Calculate the advantage of each of the following adversaries:

  • A1: Always output $1$.
  • A2: Ignore the result reported by the challenger, and randomly output $0$ or $1$ with even probability.
  • A3: Output $1$ if HEADS was received from the challenger, else output $0$.
  • A4: Output $0$ if HEADS was received from the challenger, else output $1$.
  • A5: If HEADS was received, output $1$. If TAILS was received, randomly output $0$ or $1$ with even probability.
  • A6: If HEADS was received, randomly output $0$ or $1$ with even probability. If TAILS was received, output $0$.

b. What is the maximum advantage possible in distinguishing these two experiments? Explain why.

You may want to compute the general formula, regardless of what the adversary is.
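
As a minimal sketch of how the definition of $\mathsf{Adv}$ can be evaluated mechanically, the snippet below describes an adversary by two numbers: the probability that it outputs $1$ after seeing HEADS and after seeing TAILS. The function name `advantage` and the example adversary (3/4 on HEADS, 1/4 on TAILS) are assumptions for illustration and are not among A1 to A6.

```python
from fractions import Fraction


def advantage(p_heads, p_tails):
    """Return |Pr[W0] - Pr[W1]| for an adversary given by two probabilities.

    p_heads: probability of outputting 1 after receiving HEADS
    p_tails: probability of outputting 1 after receiving TAILS
    """
    half = Fraction(1, 2)
    pr_w0 = half * p_heads + half * p_tails   # EXP(0): fair coin
    pr_w1 = p_tails                           # EXP(1): always TAILS
    return abs(pr_w0 - pr_w1)


# Hypothetical adversary: outputs 1 with probability 3/4 on HEADS, 1/4 on TAILS.
print(advantage(Fraction(3, 4), Fraction(1, 4)))
```

Exact fractions are used instead of floats so that the result can be compared directly with a hand computation.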

Exercise 3

One application of computing an adversary's advantage against a cryptographic scheme is to show whether a $\mathsf{PRG}$ is secure. Recall from the lecture that $G : K \rightarrow \{0, 1\}^n$ is a secure $\mathsf{PRG}$ if, for any efficient (i.e. polynomial-time) statistical test $\mathsf{A}$, the advantage $\mathsf{Adv_{PRG}[A, G]}$ is negligible. A statistical test is an algorithm that tries to determine whether its input looks random, so it can be defined as follows:

\begin{equation*} A(x) = \begin{cases} 1 & \text{if x is random}\\ 0 & \text{otherwise} \end{cases} \end{equation*}

You can associate the statistical test $\mathsf{A}$ with an adversary that tries to break the $\mathsf{PRG}$ (i.e. it can distinguish the output of the $\mathsf{PRG}$ from a truly random generator). In this case, we shall define the experiments as follows:

  • In $\mathsf{EXP}(0)$ the challenger sends the output $G(k)$ generated by the $\mathsf{PRG}$ to the adversary $\mathsf{A}$;
  • In $\mathsf{EXP}(1)$ the challenger sends the output $r$ generated by a truly random generator to the adversary.

Also, $W_{b}$ represents the event in which the adversary guesses that the input is random (i.e. $A(x) = 1$), in the experiment $b$. The idea behind this is that if the adversary can tell the difference between the $\mathsf{PRG}$ and a truly random generator, then the $\mathsf{PRG}$ is not secure. Thus, the advantage becomes: $\mathsf{Adv_{PRG}[A, G]} = \left|\underset{k \leftarrow K}{Pr}[A(G(k)) = 1] - \underset{r \leftarrow \{0, 1\}^n}{Pr}[A(r) = 1] \right|$.
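
To make the formula concrete, the sketch below computes $\mathsf{Adv_{PRG}}$ exactly by enumerating all seeds and all $n$-bit strings for a deliberately weak toy generator. Everything here is an assumption for illustration: `toy_prg` (which appends the parity of the seed), `stat_test`, and the sizes $s = 4$, $n = 5$ are not part of the exercise. Enumeration is only feasible for such tiny parameters.

```python
from fractions import Fraction
from itertools import product


def toy_prg(seed_bits):
    """Hypothetical insecure generator: the seed followed by its parity bit."""
    return seed_bits + (sum(seed_bits) % 2,)


def stat_test(x):
    """Output 1 ("looks random") iff the last bit is NOT the parity of the rest."""
    return 1 if x[-1] != sum(x[:-1]) % 2 else 0


def prg_advantage(prg, test, s, n):
    """Exact |Pr[A(G(k)) = 1] - Pr[A(r) = 1]| over all s-bit seeds and n-bit strings."""
    seeds = list(product((0, 1), repeat=s))
    strings = list(product((0, 1), repeat=n))
    pr_prg = Fraction(sum(test(prg(k)) for k in seeds), len(seeds))
    pr_rand = Fraction(sum(test(r) for r in strings), len(strings))
    return abs(pr_prg - pr_rand)


print(prg_advantage(toy_prg, stat_test, s=4, n=5))
```

The test never outputs 1 on the toy generator's output, but outputs 1 on half of all truly random strings, so the advantage is far from negligible.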

Although proving that any $\mathsf{PRG}$ is secure would imply $\mathsf{P} \neq \mathsf{NP}$ (so no provably secure $\mathsf{PRG}$ is currently known), in practice some $\mathsf{PRG}$s are considered secure on heuristic grounds.

In this exercise, you are given a $\mathsf{PRG}$ $G : \{0, 1\}^s \rightarrow \{0, 1\}^n$ that is known to be secure and your assignment is to determine which of the following $\mathsf{PRG}s$ are also secure:

  • $G_{1}(k) = G(k) \oplus 1^n$
  • $G_{2}(k) = G(k) \| 0$
  • $G_{3}(k) = G(0)$
  • $G_{4}(k=(k_1, k_2)) = G(k_1) \| G(k_2)$
  • $G_{5}(k)=G(k) \| G(k)$

For strings $y$ and $z$ we use $y \| z$ to denote the concatenation of $y$ and $z$.

Some of these exercises may not require computing any advantage.

Exercise 4

Let's use the experiment defined earlier (in Exercise 2) as a pseudorandom generator ($\mathsf{PRG}$) as follows:

  1. Set a desired output length $n$
  2. Obtain a random sequence $R$ of bits of length $n$ (e.g. using the Linear-congruential generator from Exercise 1)
  3. For each bit $r$ in the random sequence $R$ generated in the previous step, output a bit $b$ as follows:
  • if the bit $r$ is $0$, then output a random bit $b \in \{0, 1\}$
  • if the bit $r$ is $1$, then output $1$
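
The three steps above can be sketched as follows. The function name `biased_prg` is an assumption, and for simplicity the sequence $R$ is drawn here from Python's `random` module rather than from the LCG of Exercise 1.

```python
import random


def biased_prg(r_bits, rng=random):
    """Apply step 3 to a bit string R: r = 0 gives a random bit, r = 1 gives 1."""
    out = []
    for r in r_bits:
        if r == '1':
            out.append('1')
        else:
            out.append(str(rng.getrandbits(1)))
    return ''.join(out)


rng = random.Random(0)   # fixed seed only to make the demo repeatable
R = ''.join(str(rng.getrandbits(1)) for _ in range(100))
out = biased_prg(R, rng)
print(out)
print(out.count('1') / len(out))   # fraction of ones, typically around 0.75
```

Each output bit is 1 with probability 1/2 + 1/2 * 1/2 = 3/4, which is the bias a frequency test should pick up.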

a. Implement the frequency (monobit) test from NIST (see section 2.1) and check if a sequence generated by the above $\mathsf{PRG}$ (say $n=100$) seems random or not.

b. Run the test on a random bitstring (e.g. a string such as R used by the above $\mathsf{PRG}$), and compare the result of the test.

If the two results are different across many iterations, this test already gives you an attacker that breaks the $\mathsf{PRG}$.

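You may use a function like this to generate a random bitstring:

import random
 
def get_random_string(n):
    """Return a random bit string of exactly n characters."""
    # format(..., 'b') yields the binary digits without the '0b' prefix;
    # zfill restores any leading zeros that were dropped.
    return format(random.getrandbits(n), 'b').zfill(n)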

Also, in Python you may find the functions sqrt, fabs and erfc from the math module useful.
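
A minimal sketch of the frequency (monobit) test, following the usual description of the NIST procedure: map bits to ±1, sum them, normalise by $\sqrt{n}$, and take `erfc` of the result divided by $\sqrt{2}$. The function names and the 0.01 significance level used as the default threshold are assumptions of this sketch; check them against the NIST document before relying on them.

```python
from math import erfc, fabs, sqrt


def monobit_p_value(bits):
    """P-value of the frequency (monobit) test for a string of '0'/'1' characters."""
    n = len(bits)
    s_n = sum(1 if bit == '1' else -1 for bit in bits)   # +1 for '1', -1 for '0'
    s_obs = fabs(s_n) / sqrt(n)
    return erfc(s_obs / sqrt(2))


def looks_random(bits, alpha=0.01):
    """Accept the sequence as random when the P-value is at least alpha."""
    return monobit_p_value(bits) >= alpha


# Short 10-bit input: six ones and four zeros give a P-value of about 0.527.
print(monobit_p_value('1011010101'))
print(looks_random('1' * 100))   # an all-ones string is clearly rejected
```

Running `looks_random` on many outputs of the biased generator and on many truly random strings shows how often each is accepted.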

Exercise 5 (Bonus)

In this exercise, you will check whether the Salsa20 stream cipher is indeed faster than RC4. Download this archive. Run the program using the same input file for both Salsa20 and RC4. Compare the execution time in the two cases.

On UNIX systems (and WSL), use the prepare.sh script to compile the sources and generate the required files. If you are not using WSL on Windows, the prepare.ps1 script generates the required files but does not compile the sources.

Use the ”-h” option to display the list of all arguments accepted by the program.

Look at the file salsa20.h. Where are the encryption rounds executed?

Try to implement the same functionality using OpenSSL. OpenSSL supports RC4 but not Salsa20. Use ChaCha20 instead of Salsa20, since it is an improved variant of the algorithm.

ic/labs/03.1633706368.txt.gz · Last modified: 2021/10/08 18:19 by philip.dumitru
CC Attribution-Share Alike 3.0 Unported