Keyed Permutations
AES, like all good block ciphers, performs a “keyed permutation”. This means that it maps every possible input block to a unique output block, with a key determining which permutation to perform.
Using the same key, the permutation can be performed in reverse, mapping the output block back to the original input block. It is important that there is a one-to-one correspondence between input and output blocks, otherwise we wouldn’t be able to rely on the ciphertext to decrypt back to the same plaintext we started with. What is the mathematical term for a one-to-one correspondence?
Solution
flag: crypto{bijection}
Resisting Bruteforce
If a block cipher is secure, there should be no way for an attacker to distinguish the output of AES from a random permutation of bits. Furthermore, there should be no better way to undo the permutation than simply bruteforcing every possible key. That’s why academics consider a cipher theoretically “broken” if they can find an attack that takes fewer steps to perform than bruteforcing the key, even if that attack is practically infeasible.
It turns out that there is an attack on AES that’s better than bruteforce, but only slightly – it lowers the security level of AES-128 down to 126.1 bits, and hasn’t been improved on for over 8 years. Given the large “security margin” provided by 128 bits, and the lack of improvements despite extensive study, it’s not considered a credible risk to the security of AES. But yes, in a very narrow sense, it “breaks” AES.
Finally, while quantum computers have the potential to completely break popular public-key cryptosystems like RSA via Shor’s algorithm, they are thought to only cut in half the security level of symmetric cryptosystems via Grover’s algorithm. This is one reason why people recommend using AES-256, despite it being less performant, as it would still provide a very adequate 128 bits of security in a quantum future.
What is the name for the best single-key attack against AES?
Solution
flag: crypto{biclique}
Structure of AES
To achieve a keyed permutation that is infeasible to invert without the key, AES applies a large number of ad-hoc mixing operations on the input. This is in stark contrast to public-key cryptosystems like RSA, which are based on elegant individual mathematical problems. AES is much less elegant, but it’s very fast.
At a high level, AES-128 begins with a “key schedule” and then runs 10 rounds over a state. The starting state is just the plaintext block that we want to encrypt, represented as a 4x4 matrix of bytes. Over the course of the 10 rounds, the state is repeatedly modified by a number of invertible transformations.
Here’s an overview of the phases of AES encryption:
- KeyExpansion or Key Schedule
From the 128 bit key, 11 separate 128 bit “round keys” are derived: one to be used in each AddRoundKey step.
- Initial key addition
AddRoundKey - the bytes of the first round key are XOR’d with the bytes of the state.
- Round - this phase is looped 10 times, for 9 main rounds plus one “final round”
a) SubBytes - each byte of the state is substituted for a different byte according to a lookup table (“S-box”).
b) ShiftRows - the last three rows of the state matrix are transposed–shifted over a column or two or three.
c) MixColumns - matrix multiplication is performed on the columns of the state, combining the four bytes in each column. This is skipped in the final round.
d) AddRoundKey - the bytes of the current round key are XOR’d with the bytes of the state.
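The ordering of the phases above can be sketched as a list of step names. This only illustrates the sequence for AES-128 and performs no real transformation; the function name `aes128_encrypt_schedule` is made up for this example.

```python
def aes128_encrypt_schedule(n_rounds=10):
    """Return the ordered (step, round_key_index) pairs of AES-128 encryption."""
    steps = [("AddRoundKey", 0)]  # initial key addition uses round key 0
    for rnd in range(1, n_rounds + 1):
        steps.append(("SubBytes", None))
        steps.append(("ShiftRows", None))
        if rnd != n_rounds:  # MixColumns is skipped in the final round
            steps.append(("MixColumns", None))
        steps.append(("AddRoundKey", rnd))  # one round key per round
    return steps


schedule = aes128_encrypt_schedule()
names = [name for name, _ in schedule]
print(names.count("AddRoundKey"), names.count("MixColumns"))
```

Counting the steps shows why 11 round keys are needed: one for the initial key addition and one for each of the 10 rounds.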
Included is a bytes2matrix function for converting our initial plaintext block into a state matrix. Write a matrix2bytes function to turn that matrix back into bytes, and submit the resulting plaintext as the flag.
Challenge files:
- matrix.py
Resources:
file: matrix.py
Solution
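A minimal sketch of the round trip, assuming bytes2matrix splits the block into four lists of four bytes as the text describes (the provided file may differ in detail). The matrix here is rebuilt from the flag string rather than copied from matrix.py.

```python
def bytes2matrix(text):
    """Split a 16-byte block into a 4x4 matrix (four lists of four bytes)."""
    return [list(text[i:i + 4]) for i in range(0, len(text), 4)]


def matrix2bytes(matrix):
    """Flatten a 4x4 matrix back into a 16-byte block."""
    return bytes(sum(matrix, []))


block = b"crypto{inmatrix}"
matrix = bytes2matrix(block)
print(matrix2bytes(matrix).decode())
```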
> print(matrix2bytes(matrix))
crypto{inmatrix}
flag: crypto{inmatrix}
Round Keys
We’re going to skip over the finer details of the KeyExpansion phase for now. The main point is that it takes in our 16 byte key and produces 11 4x4 matrices called “round keys” derived from our initial key. These round keys allow AES to get extra mileage out of the single key that we provided.
The initial key addition phase, which is next, has a single AddRoundKey step. The AddRoundKey step is straightforward: it XORs the current state with the current round key.
AddRoundKey also occurs as the final step of each round. AddRoundKey is what makes AES a “keyed permutation” rather than just a permutation. It’s the only part of AES where the key is mixed into the state, but is crucial for determining the permutation that occurs.
As you’ve seen in previous challenges, XOR is an easily invertible operation if you know the key, but tough to undo if you don’t. Now imagine trying to recover plaintext which has been XOR’d with 11 different keys, and heavily jumbled between each XOR operation with a series of substitution and transposition ciphers. That’s kinda what AES does! And we’ll see just how effective the jumbling is in the next few challenges.
Complete the add_round_key function, then use the matrix2bytes function to get your next flag.
Challenge files:
- add_round_key.py
file: add_round_key.py
Solution
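One way add_round_key could be written is sketched below. The state and round key values are hypothetical and chosen only for illustration; they are not the ones from add_round_key.py.

```python
def add_round_key(state, round_key):
    """XOR each byte of the state with the matching byte of the round key."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


# Hypothetical 4x4 values, for illustration only.
state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
round_key = [[0xAA] * 4, [0x55] * 4, [0xFF] * 4, [0x0F] * 4]

mixed = add_round_key(state, round_key)
# XOR is self-inverse: adding the same round key again restores the state.
restored = add_round_key(mixed, round_key)
```

The second call is exactly what decryption does for this step, which is why AddRoundKey needs no separate inverse.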
> add_round_key(state, round_key)
[[99, 114, 121, 112],
[116, 111, 123, 114],
[48, 117, 110, 100],
[107, 51, 121, 125]]
> print(matrix2bytes(add_round_key(state, round_key)))
crypto{r0undk3y}
flag: crypto{r0undk3y}
Confusion through Substitution
The first step of each AES round is SubBytes. This involves taking each byte of the state matrix and substituting it for a different byte in a preset 16x16 lookup table. The lookup table is called a “Substitution box” or “S-box” for short, and can be perplexing at first sight. Let’s break it down.
In 1945 American mathematician Claude Shannon published a groundbreaking paper on Information Theory. It identified “confusion” as an essential property of a secure cipher. “Confusion” means that the relationship between the ciphertext and the key should be as complex as possible. Given just a ciphertext, there should be no way to learn anything about the key.
If a cipher has poor confusion, it is possible to express a relationship between ciphertext, key, and plaintext as a linear function. For instance, in a Caesar cipher, ciphertext = plaintext + key. That’s an obvious relation, which is easy to reverse. More complicated linear transformations can be solved using techniques like Gaussian elimination. Even low-degree polynomials, e.g. an equation like x^4 + 51x^3 + x, can be solved efficiently using algebraic methods. However, the higher the degree of a polynomial, generally the harder it becomes to solve – it can only be approximated by a larger and larger amount of linear functions.
The main purpose of the S-box is to transform the input in a way that is resistant to being approximated by linear functions. S-boxes are aiming for high non-linearity, and while AES’s one is not perfect, it’s pretty close. The fast lookup in an S-box is a shortcut for performing a very nonlinear function on the input bytes. This function involves taking the modular inverse in the Galois field 2**8 and then applying an affine transformation which has been tweaked for maximum confusion. The simplest way to express the function is through the following high-degree polynomial:
diagram showing S-Box equation
To make the S-box, the function has been calculated on all input values from 0x00 to 0xff and the outputs put in the lookup table.
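As a sketch of how the table can be produced, the code below computes each entry as the inverse in GF(2**8) followed by the standard affine transformation, then builds the inverse table by swapping inputs and outputs. The helper names (`gf_mul`, `gf_inv`, `rotl8`, `s_box_entry`) are made up for this example, and the slow exponentiation is chosen for readability rather than speed.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2**8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2**8); 0 is mapped to 0 by convention."""
    if a == 0:
        return 0
    result = 1
    for _ in range(254):  # a**254 is the inverse because a**255 == 1
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def s_box_entry(x):
    """Field inverse followed by the AES affine transformation."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


s_box = [s_box_entry(x) for x in range(256)]

# The inverse S-box simply swaps inputs and outputs.
inv_s_box = [0] * 256
for x, y in enumerate(s_box):
    inv_s_box[y] = x


def sub_bytes(state, box):
    """Substitute every byte of a 4x4 state through the given lookup table."""
    return [[box[b] for b in row] for row in state]
```

Passing `inv_s_box` instead of `s_box` to `sub_bytes` undoes the substitution, which is what the challenge asks for.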
Implement sub_bytes, send the state matrix through the inverse S-box and then convert it to bytes to get the flag.
Challenge files:
- sbox.py
file: sbox.py
Solution
> sub_bytes(state, inv_s_box)
b'crypto{l1n34rly}'
flag: crypto{l1n34rly}
Diffusion through Permutation
We’ve seen how S-box substitution provides confusion. The other crucial property described by Shannon is “diffusion”. This relates to how every part of a cipher’s input should spread to every part of the output.
Substitution on its own creates non-linearity, however it doesn’t distribute it over the entire state. Without diffusion, the same byte in the same position would get the same transformations applied to it each round. This would allow cryptanalysts to attack each byte position in the state matrix separately. We need to alternate substitutions by scrambling the state (in an invertible way) so that substitutions applied on one byte influence all other bytes in the state. Each input into the next S-box then becomes a function of multiple bytes, meaning that with every round the algebraic complexity of the system increases enormously.
The ShiftRows and MixColumns steps combine to achieve this. They work together to ensure every byte affects every other byte in the state within just two rounds.
ShiftRows is the most simple transformation in AES. It keeps the first row of the state matrix the same. The second row is shifted over one column to the left, wrapping around. The third row is shifted two columns, the fourth row by three. Wikipedia puts it nicely: “the importance of this step is to avoid the columns being encrypted independently, in which case AES degenerates into four independent block ciphers.”
MixColumns is more complex. It performs Matrix multiplication in Rijndael’s Galois field between the columns of the state matrix and a preset matrix. Each single byte of each column therefore affects all the bytes of the resulting column. The implementation details are nuanced; this page and Wikipedia do a good job of covering them.
We’ve provided code to perform MixColumns and the forward ShiftRows operation. After implementing inv_shift_rows, take the state, run inv_mix_columns on it, then inv_shift_rows, convert to bytes and you will have your flag.
Challenge files:
- diffusion.py
file: diffusion.py
Solution
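A sketch of the ShiftRows pair follows. It assumes the state layout in which each inner list holds four consecutive bytes of the block, so "row r" of the AES state is index r of every inner list. The input matrix is the state shown after inv_mix_columns in the output below.

```python
def shift_rows(s):
    """Rotate row r of the AES state left by r positions (in place)."""
    s[0][1], s[1][1], s[2][1], s[3][1] = s[1][1], s[2][1], s[3][1], s[0][1]
    s[0][2], s[1][2], s[2][2], s[3][2] = s[2][2], s[3][2], s[0][2], s[1][2]
    s[0][3], s[1][3], s[2][3], s[3][3] = s[3][3], s[0][3], s[1][3], s[2][3]


def inv_shift_rows(s):
    """Undo shift_rows by rotating row r right by r positions (in place)."""
    s[0][1], s[1][1], s[2][1], s[3][1] = s[3][1], s[0][1], s[1][1], s[2][1]
    s[0][2], s[1][2], s[2][2], s[3][2] = s[2][2], s[3][2], s[0][2], s[1][2]
    s[0][3], s[1][3], s[2][3], s[3][3] = s[1][3], s[2][3], s[3][3], s[0][3]


state = [
    [99, 111, 102, 125],
    [116, 102, 82, 112],
    [49, 51, 121, 100],
    [115, 114, 123, 85],
]
inv_shift_rows(state)
print(bytes(sum(state, [])))
```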
> inv_mix_columns(state)
> state
[[99, 111, 102, 125],
[116, 102, 82, 112],
[49, 51, 121, 100],
[115, 114, 123, 85]]
> inv_shift_rows(state)
> state
[[99, 114, 121, 112],
[116, 111, 123, 100],
[49, 102, 102, 85],
[115, 51, 82, 125]]
> matrix2bytes(state)
b'crypto{d1ffUs3R}'
flag: crypto{d1ffUs3R}
Bringing It All Together
Apart from the KeyExpansion phase, we’ve sketched out all the components of AES. We’ve shown how SubBytes provides confusion and ShiftRows and MixColumns provide diffusion, and how these two properties work together to repeatedly circulate non-linear transformations over the state. Finally, AddRoundKey seeds the key into this substitution-permutation network, making the cipher a keyed permutation.
Decryption involves performing the steps described in the “Structure of AES” challenge in reverse, applying the inverse operations. Note that the KeyExpansion still needs to be run first, and the round keys will be used in reverse order. AddRoundKey and its inverse are identical as XOR has the self-inverse property.
We’ve provided the key expansion code, and ciphertext that’s been properly encrypted by AES-128. Copy in all the building blocks you’ve coded so far, and complete the decrypt function that implements the steps shown in the diagram. The decrypted plaintext is the flag.
Yes, you can cheat on this challenge, but where’s the fun in that?
The code used in these exercises has been taken from Bo Zhu’s super simple Python AES implementation, so we’ve reproduced the license here.
Challenge files:
- aes_decrypt.py
- LICENSE
Resources:
file: aes_decrypt.py
Solution
> print(decrypt(key,ciphertext))
b'crypto{MYAES128}'
flag: crypto{MYAES128}
Modes of Operation Starter
The previous set of challenges showed how AES performs a keyed permutation on a block of data. In practice, we need to encrypt messages much longer than a single block. A mode of operation describes how to use a cipher like AES on longer messages.
All modes have serious weaknesses when used incorrectly. The challenges in this category take you to a different section of the website where you can interact with APIs and exploit those weaknesses. Get yourself acquainted with the interface and use it to take your next flag!
Solution
1. Visit http://aes.cryptohack.org/block_cipher_starter
2. Visit https://aes.cryptohack.org//block_cipher_starter/encrypt_flag/
{"ciphertext":"1b36a55b687f21f73fe0bed721c1a5c305716a9a1c1745d50a39e0ae8f2fb9ba"}
3. Decrypt ciphertext
4. Hex Decode
flag: crypto{bl0ck_c1ph3r5_4r3_f457_!}
Passwords as Keys
It is essential that keys in symmetric-key algorithms are random bytes, instead of passwords or other predictable data. The random bytes should be generated using a cryptographically-secure pseudorandom number generator (CSPRNG). If the keys are predictable in any way, then the security level of the cipher is reduced and it may be possible for an attacker who gets access to the ciphertext to decrypt it.
Just because a key looks like it is formed of random bytes, does not mean that it necessarily is. In this case the key has been derived from a simple password using a hashing function, which makes the ciphertext crackable.
For this challenge you may script your HTTP requests to the endpoints, or alternatively attack the ciphertext offline. Good luck!
Solution
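The offline attack can be sketched as a dictionary search: hash each candidate password into a key, decrypt, and check for the flag prefix. Two things here are assumptions rather than facts from the challenge: the key is taken to be the MD5 digest of the password, and `toy_decrypt` is a repeating-XOR stand-in for AES-ECB so that the sketch runs with the standard library alone. The word list and secret are made up.

```python
import hashlib


def derive_key(password: bytes) -> bytes:
    # Assumption: the key is the MD5 digest of the password (16 bytes).
    return hashlib.md5(password).digest()


def toy_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    # Stand-in for AES-ECB decryption: XOR with a repeating key. NOT a real cipher.
    return bytes(c ^ key[i % len(key)] for i, c in enumerate(ciphertext))


def dictionary_attack(ciphertext, wordlist, decrypt=toy_decrypt, marker=b"crypto{"):
    """Try every word as the password; return the first that yields a flag."""
    for word in wordlist:
        plaintext = decrypt(derive_key(word), ciphertext)
        if plaintext.startswith(marker):
            return word, plaintext
    return None


# Local demo with a hypothetical password and flag (XOR is its own inverse).
secret_word = b"hunter2"
ciphertext = toy_decrypt(derive_key(secret_word), b"crypto{example_flag}")
found = dictionary_attack(ciphertext, [b"password", b"letmein", b"hunter2"])
print(found)
```

Against the real ciphertext, `decrypt` would be swapped for an AES-ECB decryption (for example via pycryptodome) and the word list for a real dictionary file.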
flag: crypto{k3y5__r__n07__p455w0rdz?}
ECB Oracle
ECB is the most simple mode, with each plaintext block encrypted entirely independently. In this case, your input is prepended to the secret flag and encrypted and that’s it. We don’t even provide a decrypt function. Perhaps you don’t need a padding oracle when you have an “ECB oracle”?
source code:
Solution
This problem was a bit difficult for me to solve. The first step in understanding it was looking more into how the pad function actually works in the backend of pycryptodome. This is more easily demonstrated through an example.
> from Crypto.Util.Padding import pad
> [pad(b'?'*i, 16) for i in range(1,17)] # We want to see 1-16, so we set the range to 17 since it doesn't include the last value.
[b'?\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f\x0f',
b'??\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e\x0e',
b'???\r\r\r\r\r\r\r\r\r\r\r\r\r',
b'????\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c\x0c',
b'?????\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b\x0b',
b'??????\n\n\n\n\n\n\n\n\n\n',
b'???????\t\t\t\t\t\t\t\t\t',
b'????????\x08\x08\x08\x08\x08\x08\x08\x08',
b'?????????\x07\x07\x07\x07\x07\x07\x07',
b'??????????\x06\x06\x06\x06\x06\x06',
b'???????????\x05\x05\x05\x05\x05',
b'????????????\x04\x04\x04\x04',
b'?????????????\x03\x03\x03',
b'??????????????\x02\x02',
b'???????????????\x01',
b'????????????????\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10']
When the number of ?’s we provide is less than the block_size of 16, padding is added to fill the block. However, if exactly 16 bytes (or ?’s) are provided, the pad function appends a whole new block of padding (2 blocks of 16 bytes, totalling 32 bytes). Therefore, by sending in garbage of increasing length and watching for the ciphertext to grow by a block, we can leak the length of the flag.
TL;DR: we can compare the number of bytes we send in against the number of blocks that come back to determine the flag length. So, if we send in x-1 bytes and get 2 blocks (32 bytes total), then send in x bytes and get 3 blocks (48 bytes total), we know:
flag_length = 2 blocks * 16 bytes/block - x bytes -> 32 bytes - x bytes
Time to script…
Garbage (1 bytes): ?
Ciphertext (32 bytes): 341dd0bf293efbc386baa0450a9f7a121d91b08cbd0a3ff55d6225e7f2cb1fe1
Garbage (2 bytes): ??
Ciphertext (32 bytes): c7404a9325a0b5bd6f663638f86f6d14133cfd98ad547f6a1dae2cdccac36bda
Garbage (3 bytes): ???
Ciphertext (32 bytes): 1e46817be36af1d0d263fed68ab2b3b440b0f25961b3330f4880effcdd1d9372
Garbage (4 bytes): ????
Ciphertext (32 bytes): 700b70280d82306f6b577da0e70914003775fe4513f275eeb4a20548db2b1372
Garbage (5 bytes): ?????
Ciphertext (32 bytes): 043d72cd0c91d09ee654031196f6a203f0aaaa9734688f65f110768d242965f2
Garbage (6 bytes): ??????
Ciphertext (32 bytes): dd90d7562c5d17c5ff323f1a024483749f1d9ea23fc3f0857e80d9254d053b99
Garbage (7 bytes): ???????
Ciphertext (48 bytes): d32166eeaa47575cf04ae32526b6006c1f227170511203bbb211a703905f9e5128ae38bc1312435b814108836328262a
Garbage (8 bytes): ????????
Ciphertext (48 bytes): 9eaa653b150e6218b56d887fb99d00f21206ed1975a928fe0813952f43171080b6800d8b95758d3bf16d0f75ca9f38e8
Garbage (9 bytes): ?????????
Ciphertext (48 bytes): 7a987178f47d51a9d650fdd312580bf50e5eaa21119c631ae8304d9d2c1ef310593c7d830f153c8a4b7f41c116065d73
Garbage (10 bytes): ??????????
Ciphertext (48 bytes): 66a3abc4b4b82020586064a647d7fa75e1434b90c1f8633f9818265dfff40e08c9dd5d6a2703d3beb1def6e688083f3a
Garbage (11 bytes): ???????????
Ciphertext (48 bytes): 0fd571c8e551af19d186a30c9b3c02034cc4a08218ed3d4aead3c49d2c49e5b5ed8db991b61458ce267a94b891a472fb
Garbage (12 bytes): ????????????
Ciphertext (48 bytes): 12d380f787bdbc1bc7a4b8619f45609af2f1b66e09e912eff09384648f453f3db11abf8572ab7a34347ada3026bf0911
Garbage (13 bytes): ?????????????
Ciphertext (48 bytes): ae7db02691622c4562866713fa013c5cee268318401e6c194260a8ffa3d2df71190f2b09607b5764d577d4d5569b9059
Garbage (14 bytes): ??????????????
Ciphertext (48 bytes): 5f26e6ffabb6962705a174ac4b463bcc7bd036db83a337f9f4f867dd5691e1f7fd105ad78e0e6fa84f694743dd59cc94
Garbage (15 bytes): ???????????????
Ciphertext (48 bytes): eed4350b17297b157330c401581ac453a6734bcc83eace3a107321af7775026b7d715d5c670622e24c462ae108288f25
Garbage (16 bytes): ????????????????
Ciphertext (48 bytes): da572df43b1a6bd8ad66da297d64c445bea177bc81b326eef475195dee42ba6c2eb3958aa4a3fa0d49789f5152a4eed2
As shown above, when 7 bytes of garbage are sent in, a new block is made. This means our flag must be 32-7 = 25 bytes.
> flag_len = [len(i.hex())//2 - x - 2 for x, (i,j) in enumerate(zip(ciphers,ciphers[1:])) if len(j.hex())>len(i.hex())][0]
25
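The same length calculation can be reproduced offline. The sketch below models only the ciphertext length, using the fact that this padding always adds between 1 and 16 bytes; `find_flag_length` and the simulated oracle are hypothetical names, and in the real attack `ciphertext_len` would query the server instead.

```python
BLOCK = 16


def padded_len(n: int, block: int = BLOCK) -> int:
    """Length after padding: always rounds up to the next full block."""
    return (n // block + 1) * block


def find_flag_length(ciphertext_len, block: int = BLOCK) -> int:
    """ciphertext_len(x) returns the ciphertext length for x garbage bytes."""
    previous = ciphertext_len(1)
    for x in range(2, block + 2):
        current = ciphertext_len(x)
        if current > previous:  # x garbage bytes pushed us into a new block
            return previous - x
        previous = current
    raise ValueError("no block boundary found")


SECRET_LEN = 25  # the length derived in the text, used to simulate the server


def simulated_oracle(x):
    return padded_len(x + SECRET_LEN)


print(find_flag_length(simulated_oracle))
```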
From here, we know that the flag will need to be held within 2 blocks. If we want to leak the flag, we will need 4 blocks total (2 for garbage and leaking + 2 for holding the flag and padding), totalling 64 bytes. To make more sense of how this works, let’s start with leaking a few bytes manually.
From previous challenges, let’s assume the flag does adhere to the format crypto{...}. When scripting, it’s easy to check all possible bytes, but since this is a manual example, let’s be smart and guess the first byte is c. From above, we can see:
Garbage (15 bytes): ???????????????
Ciphertext (48 bytes): eed4350b17297b157330c401581ac453a6734bcc83eace3a107321af7775026b7d715d5c670622e24c462ae108288f25
As we remember, each block is 16 bytes. Since we only sent in 15 bytes and this plaintext is prepended to the flag, we know that the next byte (16th byte) has to be the first byte of the flag. We also know that each block is independent and will have its own ciphertext. This means that if the 16th byte is the same as the first byte of the flag, we will get the same ciphertext for block 1. For example:
Garbage (15 bytes): ???????????????
Ciphertext (48 bytes): eed4350b17297b157330c401581ac453 a6734bcc83eace3a107321af7775026b 7d715d5c670622e24c462ae108288f25
Garbage (15 bytes) + 'c' (1 byte): ???????????????c
Ciphertext (48 bytes): eed4350b17297b157330c401581ac453 bea177bc81b326eef475195dee42ba6c 2eb3958aa4a3fa0d49789f5152a4eed2
As we can see, the first block of ciphertext for each payload is the same. eed4350b17297b157330c401581ac453 == eed4350b17297b157330c401581ac453
Let’s try it again for the sake of clarity. Now that we know the first letter of the flag is c, we need to reduce the amount of garbage we send in to 14 bytes (14 + len('c') = 15) so there is only one byte we need to guess. We can try r due to the flag format.
Garbage (14 bytes): ??????????????
Ciphertext (48 bytes): 5f26e6ffabb6962705a174ac4b463bcc 7bd036db83a337f9f4f867dd5691e1f7 fd105ad78e0e6fa84f694743dd59cc94
Garbage (14 bytes) + 'c' (1 byte) + 'r' (1 byte): ??????????????cr
Ciphertext (48 bytes): 5f26e6ffabb6962705a174ac4b463bcc bea177bc81b326eef475195dee42ba6c 2eb3958aa4a3fa0d49789f5152a4eed2
That’s it basically. Just script this process!
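The manual process above can be scripted roughly as follows. To keep the sketch runnable without a server or a crypto library, the oracle is simulated locally: `toy_block_encrypt` is a hash-based stand-in for AES (deterministic per block, which is all ECB leaks here), and `FLAG` and `KEY` are hypothetical values.

```python
import hashlib

BLOCK = 16
FLAG = b"crypto{toy_3cb_fl4g}"  # hypothetical secret for the local demo
KEY = b"demo-key"               # hypothetical key


def pad(data):
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def toy_block_encrypt(block):
    # Stand-in for AES: a deterministic keyed function on 16-byte blocks.
    # It is not invertible, which is fine because the oracle only encrypts.
    return hashlib.sha256(KEY + block).digest()[:BLOCK]


def oracle(user_input):
    """Simulates the challenge: ECB-encrypt pad(user_input + FLAG)."""
    data = pad(user_input + FLAG)
    return b"".join(toy_block_encrypt(data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))


def recover_flag(oracle, flag_len):
    window = (flag_len // BLOCK + 1) * BLOCK  # bytes of ciphertext to compare
    known = b""
    for _ in range(flag_len):
        # Leave exactly one unknown flag byte at the end of the window.
        filler = b"?" * (window - 1 - len(known))
        target = oracle(filler)[:window]
        for guess in range(256):
            candidate = filler + known + bytes([guess])
            if oracle(candidate)[:window] == target:
                known += bytes([guess])
                break
    return known


print(recover_flag(oracle, len(FLAG)))
```

Each recovered byte costs at most 256 oracle queries, so the whole flag falls in a few thousand requests.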
flag: crypto{p3n6u1n5_h473_3cb}
ECB CBC WTF
Here you can encrypt in CBC but only decrypt in ECB. That shouldn’t be a weakness because they’re different modes… right?
source code:
Solution
To solve this challenge, we first need to look at the differences between CBC and ECB.
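This boils down to the following, writing e() and d() for the block cipher’s encryption and decryption under the key:
ECB
Encryption: c1=e(p1) c2=e(p2)
Decryption: p1=d(c1) p2=d(c2)
CBC
Encryption: c1=e(p1^iv) c2=e(p2^c1)
Decryption: p1=d(c1)^iv p2=d(c2)^c1
The ECB decrypt endpoint gives us d() of any block, so a CBC plaintext block is just the ECB decryption of its ciphertext block XORed with the previous ciphertext block. Since iv in this case is random, we will not be able to decrypt p1 without it. However, we can decrypt p2 and p3.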
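The relation between the two modes can be checked with a small local experiment. This is a sketch under stated assumptions: the block cipher is a toy invertible stand-in for AES (XOR with the key, then byte reversal), and the key, IV and plaintext are hypothetical. Only `recover_after_first` represents what the attacker does.

```python
BLOCK = 16
KEY = bytes(range(16))  # hypothetical key; unknown to the attacker in the challenge


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block):
    # Stand-in for AES: XOR with the key, then reverse. Invertible but insecure.
    return xor(block, KEY)[::-1]


def toy_decrypt_block(block):
    return xor(block[::-1], KEY)


def cbc_encrypt(plaintext, iv):
    out, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def ecb_decrypt(ciphertext):
    """Plays the role of the server's ECB decryption endpoint."""
    return b"".join(toy_decrypt_block(ciphertext[i:i + BLOCK])
                    for i in range(0, len(ciphertext), BLOCK))


def recover_after_first(ciphertext):
    """p_i = d(c_i) ^ c_(i-1) for every block after the first."""
    return xor(ecb_decrypt(ciphertext[BLOCK:]), ciphertext[:-BLOCK])


iv = bytes([7] * BLOCK)                                              # hypothetical
plaintext = b"first block here" + b"crypto{toy_demo_" + b"flag_goes_here!}"
ciphertext = cbc_encrypt(plaintext, iv)
print(recover_after_first(ciphertext))
```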
> print(flag)
crypto{3cb_5uck5_4v01d_17_!!!!!}
flag: crypto{3cb_5uck5_4v01d_17_!!!!!}
Flipping Cookie
You can get a cookie for my website, but it won’t help you read the flag… I think.
source code:
Solution
To solve this problem, turning CBC decryption into a system of equations makes the rest trivial. As we previously covered (see the images above), CBC decryption can be broken down like so:
Encryption:
e(p1 ^ iv) = c1
e(p2 ^ c1) = c2
e(p3 ^ c2) = c3
Decryption:
d(c1) ^ iv = p1
d(c2) ^ c1 = p2
d(c3) ^ c2 = p3
From the source code, we are given iv, the ciphertext, and the plaintext. However, it turns out we only really need iv and p1 to solve. This is because our goal is only to change admin=False to admin=True through our own malicious iv payload. The server decrypts c1 with the same key either way, so d(c1) is a fixed value that we can cancel out. Let’s change these terms to something more readily understandable:
1.
d(c1) ^ iv = p1 -> d(c1) ^ given_iv = given_pt
d(c1) ^ iv = p1 -> d(c1) ^ iv_payload = pt_payload
________________________________________________________________________________
2. d(c1) = given_pt ^ given_iv
________________________________________________________________________________
3. (given_pt ^ given_iv) ^ iv_payload = pt_payload
________________________________________________________________________________
4. iv_payload = pt_payload ^ given_iv ^ given_pt
So all we have to do is XOR our desired plaintext admin=True... with the given iv and the given plaintext admin=False... to get our iv_payload. Then, we can just send this payload to the server and get the flag!
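The final equation can be checked with plain XORs, with no cipher needed. The IV and both 16-byte plaintext blocks below are hypothetical values picked for illustration; the real cookie contents come from the challenge source.

```python
def xor(*parts):
    """XOR any number of equal-length byte strings together."""
    out = bytes(len(parts[0]))
    for part in parts:
        out = bytes(a ^ b for a, b in zip(out, part))
    return out


given_iv = bytes.fromhex("000102030405060708090a0b0c0d0e0f")  # hypothetical IV
given_pt = b"admin=False;expi"   # hypothetical first plaintext block
wanted_pt = b"admin=True;;expi"  # what we want the server to see instead

# Step 4 of the derivation: iv_payload = pt_payload ^ given_iv ^ given_pt
iv_payload = xor(wanted_pt, given_iv, given_pt)

# What the server computes: d(c1) is fixed, only the IV changes.
d_c1 = xor(given_pt, given_iv)
forged = xor(d_c1, iv_payload)
print(forged)
```

Only the first block can be rewritten this way, since later blocks are XORed with ciphertext blocks rather than with the IV.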
flag: crypto{4u7h3n71c4710n_15_3553n714l}
Symmetry
Some block cipher modes, such as OFB, CTR, or CFB, turn a block cipher into a stream cipher. The idea behind stream ciphers is to produce a pseudorandom keystream which is then XORed with the plaintext. One advantage of stream ciphers is that they can work on plaintexts of arbitrary length, with no padding required.
OFB is an obscure cipher mode, with no real benefits these days over using CTR. This challenge introduces an unusual property of OFB.
source code:
Solution
After checking Wikipedia, I realized the encryption and decryption methods are the same for OFB! We’ve got all we need.
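The symmetry is easy to see in a small sketch: the keystream depends only on the key and IV, so running the same function twice gives back the input. `toy_encrypt_block` is a hash-based stand-in for AES encryption (OFB never uses the decrypt direction), and the key, IV and message are hypothetical.

```python
import hashlib

BLOCK = 16


def toy_encrypt_block(key, block):
    # Stand-in for AES encryption; OFB only ever uses the encrypt direction.
    return hashlib.sha256(key + block).digest()[:BLOCK]


def ofb(key, iv, data):
    """OFB mode: encryption and decryption are the same operation."""
    out, feedback = bytearray(), iv
    for i in range(0, len(data), BLOCK):
        feedback = toy_encrypt_block(key, feedback)  # next keystream block
        chunk = data[i:i + BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, feedback))
    return bytes(out)


key, iv = b"demo-key", bytes(BLOCK)          # hypothetical values
message = b"crypto{toy_ofb_demo_message}"    # no padding required
ciphertext = ofb(key, iv, message)
print(ofb(key, iv, ciphertext))
```

In the challenge this means that feeding the encrypted flag back into the encryption endpoint with the same IV acts as decryption.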
flag: crypto{0fb_15_5ymm37r1c4l_!!!11!}
Bean Counter
I’ve struggled to get PyCrypto’s counter mode doing what I want, so I’ve turned ECB mode into CTR myself. My counter can go both upwards and downwards to throw off cryptanalysts! There’s no chance they’ll be able to read my picture.
source code:
Solution
The trick to this solution is realizing that we know both the plaintext (the PNG header) and the ciphertext (given) of the first block. XORing the two gives the keystream for that block, which acts as the XOR key; we can then use it to decrypt the rest of the ciphertext.
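A minimal sketch of that recovery, assuming the same 16-byte keystream block is reused for every block of the image (which is what lets one known block decrypt the rest). The first 16 bytes of any PNG are the 8-byte signature followed by the IHDR chunk length and type; the keystream and image body below are made up for the demo.

```python
# PNG signature (8 bytes) + IHDR chunk length 0x0000000D + chunk type "IHDR"
PNG_HEADER = bytes.fromhex("89504e470d0a1a0a0000000d49484452")


def xor_repeat(data, keystream):
    """XOR data with a keystream block repeated as often as needed."""
    return bytes(b ^ keystream[i % len(keystream)] for i, b in enumerate(data))


def recover(ciphertext):
    # Known plaintext XOR ciphertext gives the keystream of the first block.
    keystream = bytes(c ^ p for c, p in zip(ciphertext[:16], PNG_HEADER))
    return xor_repeat(ciphertext, keystream)


# Local demo with a hypothetical keystream and a fake image body.
secret_keystream = bytes(range(100, 116))
fake_png = PNG_HEADER + b"...rest of the image data..."
ciphertext = xor_repeat(fake_png, secret_keystream)
print(recover(ciphertext) == fake_png)
```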