The simplest of the encryption modes is the Electronic Codebook (ECB) mode (named after conventional physical codebooks). The message is divided into blocks, and each block is encrypted separately.
Since each block of plaintext is encrypted with the key independently, identical blocks of plaintext yield identical blocks of ciphertext. The well-known illustration is the "ECB penguin": an image of Tux, the Linux mascot, encrypted in ECB mode still shows the penguin's outline, because areas of uniform color produce repeated ciphertext blocks.
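The repetition is easy to see in a small experiment. The sketch below is only an illustration: to stay runnable with the standard library it does not use a real cipher but a keyed, deterministic stand-in (`toy_block_encrypt`, a truncated HMAC), which is enough to show the ECB property. The key, the helper names and the PKCS#7-style padding are all assumptions made for this example.

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # assumed block size for this toy example


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for a real block cipher (NOT secure, NOT invertible):
    a keyed, deterministic function from 16 bytes to 16 bytes."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK_SIZE]


def pad(data: bytes) -> bytes:
    """PKCS#7-style padding up to a multiple of BLOCK_SIZE."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """ECB mode: every block is encrypted separately with the same key."""
    padded = pad(plaintext)
    return b"".join(
        toy_block_encrypt(key, padded[i:i + BLOCK_SIZE])
        for i in range(0, len(padded), BLOCK_SIZE)
    )


key = b"hypothetical key"  # made-up key for the demonstration
ciphertext = ecb_encrypt(key, b"YELLOW SUBMARINE" * 2 + b"something else!!")
blocks = [ciphertext[i:i + BLOCK_SIZE] for i in range(0, len(ciphertext), BLOCK_SIZE)]

print(blocks[0] == blocks[1])  # True: identical plaintext blocks
print(blocks[1] == blocks[2])  # False: different plaintext blocks
```

With a real block cipher in ECB mode the comparison gives the same result, since the only property used here is that equal input blocks produce equal output blocks.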
The vulnerability happens when:
For the next exercises, we will use the following code stub.
The first step in attacking a block-based cipher is to determine the size of the block. Feed identical bytes of your-string to the function one at a time: start with 1 byte (“A”), then “AA”, then “AAA” and so on. Discover the block size of the cipher. You know it, but do this step anyway.
How does the message length relate to the number of cipher blocks?
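One way to answer this is to watch the ciphertext length while the input grows: it stays flat and then jumps by exactly one block. The sketch below assumes a hypothetical `oracle` that encrypts prefix + your-string + target with PKCS#7-style padding; under that assumption a message of n bytes gives n // block_size + 1 cipher blocks. The block function is again a truncated-HMAC stand-in rather than a real cipher, and the key, prefix and target values are made up.

```python
import hashlib
import hmac

BLOCK_SIZE = 16                              # what the attacker wants to discover
KEY = b"hypothetical key"                    # made-up secret key
PREFIX = b"\xde\xad"                         # fixed prefix, unknown to the attacker
TARGET = b"we attack at dawn, bring snacks"  # made-up secret target


def oracle(chosen: bytes) -> bytes:
    """Hypothetical oracle: ECB-encrypts PREFIX + chosen + TARGET.
    The block function is a keyed stand-in, not a real cipher."""
    data = PREFIX + chosen + TARGET
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    data += bytes([n]) * n                   # PKCS#7-style padding (assumed)
    return b"".join(
        hmac.new(KEY, data[i:i + BLOCK_SIZE], hashlib.sha256).digest()[:BLOCK_SIZE]
        for i in range(0, len(data), BLOCK_SIZE)
    )


def detect_block_size(oracle) -> int:
    """Grow the input one byte at a time until the ciphertext length jumps;
    the size of the jump is the block size."""
    base_len = len(oracle(b""))
    for n in range(1, 257):
        new_len = len(oracle(b"A" * n))
        if new_len != base_len:
            return new_len - base_len
    raise ValueError("ciphertext length never changed")


print(detect_block_size(oracle))  # 16
```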
We give the oracle chosen plaintext of increasing length. When we find a block that does not change after one more byte of chosen plaintext is added, that block contains only prefix and chosen plaintext. E.g.:
    RRTT TT
    RRXT TTT
    RRXX TTTT
    RRXX XTTT T    *detected that first block did not change*
    RRXT TTT
Here R denotes the random prefix, X the input we give to the oracle (hereafter called the chosen plaintext), and T the target.
Now we know the pad length required to align the target with a block boundary.
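The detection step can be sketched as follows. The `oracle`, key, prefix and target are hypothetical stand-ins (a truncated HMAC instead of a real cipher), and `find_alignment` is a name invented for this example. One detail goes beyond the description above: if the target happens to start with the fill byte, the block stops changing one byte too early, so this sketch repeats the search with two different fill bytes and keeps the larger result.

```python
import hashlib
import hmac

BLOCK_SIZE = 16
KEY = b"hypothetical key"                    # made-up secret key
PREFIX = b"\xde\xad"                         # "RR": fixed, unknown to the attacker
TARGET = b"we attack at dawn, bring snacks"  # made-up secret target


def oracle(chosen: bytes) -> bytes:
    """Hypothetical oracle: ECB-encrypts PREFIX + chosen + TARGET
    with a keyed stand-in block function (not a real cipher)."""
    data = PREFIX + chosen + TARGET
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    data += bytes([n]) * n                   # PKCS#7-style padding (assumed)
    return b"".join(
        hmac.new(KEY, data[i:i + BLOCK_SIZE], hashlib.sha256).digest()[:BLOCK_SIZE]
        for i in range(0, len(data), BLOCK_SIZE)
    )


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def find_alignment(oracle):
    """Return (block_index, pad_len): pad_len chosen bytes fill block
    block_index exactly, so whatever follows starts on a block boundary."""
    # The block holding the first chosen byte is the first one that differs
    # when only that byte changes.
    a, b = split_blocks(oracle(b"A")), split_blocks(oracle(b"B"))
    index = next(i for i, (x, y) in enumerate(zip(a, b)) if x != y)

    def pad_for(fill: bytes) -> int:
        for n in range(1, BLOCK_SIZE + 1):
            current = split_blocks(oracle(fill * n))[index]
            longer = split_blocks(oracle(fill * (n + 1)))[index]
            if current == longer:            # block no longer changes
                return n
        raise ValueError("block never stabilised")

    # Two fill bytes guard against the target starting with the fill byte.
    return index, max(pad_for(b"A"), pad_for(b"B"))


print(find_alignment(oracle))  # (0, 14)
```

With the 2-byte prefix assumed here, 14 chosen bytes complete the first block, which matches the RRXX pattern in the 4-byte illustration above.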
Suppose we have a block cipher that takes a 16-byte plaintext and produces a 16-byte ciphertext. We use this block cipher to encrypt two blocks' worth of unknown data; call them m1 and m2. Additionally, we are allowed to prepend some data of our choosing to these two blocks; call it m0. Nothing in this scheme prevents us from choosing an m0 that is 16 bytes long, so in ECB mode the first block returned is Enc(m0): we effectively have an encryption oracle for a full block and can get the encryption of arbitrary blocks of data, which will come in handy. If we instead set m0 to 15 known bytes, the first block consists of those 15 bytes followed by the first byte of m1, and with the encryption oracle we can brute-force that last byte:
         Block 1      Block 2  Block 3
    |RRXXXXXXXXXXXXX?|?......?|?......?|
     |----known----||--m1---|
We just have to send all 256 possible guesses for Block 1 to the encryption oracle and see which one matches the output. Let's say we get a match on the byte encoding “w”. We then repeat the process with a one byte shorter m0 to get the next byte in the same fashion:
         Block 1      Block 2  Block 3
    |RRXXXXXXXXXXXXw?|?......?|?......?|
     |----known----|
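A single brute-force step of this kind might look like the sketch below. It assumes the alignment is already known (14 chosen bytes complete Block 1, i.e. a 2-byte prefix) and uses a hypothetical `oracle` built on a truncated-HMAC stand-in instead of a real cipher; the key, prefix, target and the name `recover_first_byte` are made up for the illustration.

```python
import hashlib
import hmac

BLOCK_SIZE = 16
KEY = b"hypothetical key"                    # made-up secret key
PREFIX = b"\xde\xad"                         # "RR": fixed, unknown to the attacker
TARGET = b"we attack at dawn, bring snacks"  # made-up secret target


def oracle(chosen: bytes) -> bytes:
    """Hypothetical oracle: ECB-encrypts PREFIX + chosen + TARGET
    with a keyed stand-in block function (not a real cipher)."""
    data = PREFIX + chosen + TARGET
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    data += bytes([n]) * n                   # PKCS#7-style padding (assumed)
    return b"".join(
        hmac.new(KEY, data[i:i + BLOCK_SIZE], hashlib.sha256).digest()[:BLOCK_SIZE]
        for i in range(0, len(data), BLOCK_SIZE)
    )


def recover_first_byte(oracle, pad_len: int, block_index: int = 0) -> bytes:
    """Leave exactly one unknown byte at the end of the block, then try
    all 256 values for it and compare against the real ciphertext block."""
    filler = b"X" * (pad_len - 1)            # one byte short of filling the block
    start = block_index * BLOCK_SIZE
    wanted = oracle(filler)[start:start + BLOCK_SIZE]
    for guess in range(256):
        trial = oracle(filler + bytes([guess]))
        if trial[start:start + BLOCK_SIZE] == wanted:
            return bytes([guess])
    raise ValueError("no guess matched; check pad_len and block_index")


print(recover_first_byte(oracle, pad_len=14))  # b'w'
```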
We can repeat this process for each byte until we have the whole first block m1, which let's say is “we attack at daw”. Unfortunately at this point we can't reduce m0 by any more bytes since m0 would be 0 bytes and we would simply get:
           M1           M2
    |we attack at daw|?......?|
     |----known-----|
But since we now know all of m1, we can use the same sort of attack we used to recover the first byte of m1 to recover the first byte of m2. Suppose we again choose m0 to be of length 15 bytes:
         Block 1         Block 2      Block 3
    |RRXXXXXXXXXXXXXw|e attack at daw?|?......?|
     |------------known-------------|
There's only one unknown byte in Block 2 so all we have to do is again submit all 256 guesses to the encryption oracle, except this time for Block 2 instead of Block 1! This process can be repeated to decrypt an arbitrary amount of ciphertext that is ECB encrypted as long as we can prepend data to the plaintext and have access to an encryption oracle.
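Putting the steps together, the whole byte-at-a-time recovery can be sketched as a single loop. This is an illustration under stated assumptions, not a reference implementation: the `oracle` is a hypothetical stand-in (truncated HMAC instead of a real cipher), the key, prefix and target are made up, the alignment values (14 pad bytes, block index 0) are taken as already known, and the stopping rule relies on PKCS#7-style padding.

```python
import hashlib
import hmac

BLOCK_SIZE = 16
KEY = b"hypothetical key"                    # made-up secret key
PREFIX = b"\xde\xad"                         # "RR": fixed, unknown to the attacker
TARGET = b"we attack at dawn, bring snacks"  # made-up secret target


def oracle(chosen: bytes) -> bytes:
    """Hypothetical oracle: ECB-encrypts PREFIX + chosen + TARGET
    with a keyed stand-in block function (not a real cipher)."""
    data = PREFIX + chosen + TARGET
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    data += bytes([n]) * n                   # PKCS#7-style padding (assumed)
    return b"".join(
        hmac.new(KEY, data[i:i + BLOCK_SIZE], hashlib.sha256).digest()[:BLOCK_SIZE]
        for i in range(0, len(data), BLOCK_SIZE)
    )


def recover_target(oracle, pad_len: int, block_index: int) -> bytes:
    """Recover the target one byte at a time. pad_len chosen bytes are
    assumed to complete block block_index (the block where the prefix ends)."""
    prefix_len = (block_index + 1) * BLOCK_SIZE - pad_len
    recovered = b""
    while True:
        i = len(recovered)
        # Filler length that puts target byte i in the last slot of a block.
        filler = b"X" * ((pad_len - 1 - i) % BLOCK_SIZE)
        start = (prefix_len + len(filler) + i) // BLOCK_SIZE * BLOCK_SIZE
        wanted = oracle(filler)[start:start + BLOCK_SIZE]
        for guess in range(256):
            trial = oracle(filler + recovered + bytes([guess]))
            if trial[start:start + BLOCK_SIZE] == wanted:
                recovered += bytes([guess])
                break
        else:
            # No guess matched: we ran into the padding. The last "recovered"
            # byte was the single 0x01 padding byte, so drop it.
            return recovered[:-1]


print(recover_target(oracle, pad_len=14, block_index=0))
# b'we attack at dawn, bring snacks'
```

The filler shrinks by one byte per recovered byte and wraps back to 15 once it reaches zero, which is the move from Block 1 to Block 2 described above. Each byte costs at most 256 oracle queries plus one reference query.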