Mask

ångstromCTF 2022 (200 points)

Don’t forget to wear your mask…

nc challs.actf.co 31501

If I had a nickel for every CTF challenge I’ve done that involves understanding the internal structure of a QR code, I would have two nickels. Which isn’t a lot, etc etc. That previous challenge probably helped me get first blood on this.

The source code is wonderfully short:

Basically, the flag is 17 bytes, we specify a 17-byte string, and we get the left half of a QR code of the flag xored with our string. However, I actually didn’t use the xor at all; I passed in an all-zero bytestring to get this partial QR code and just solved it from there.

█▀▀▀▀▀█ ▄ ▀
█ ███ █ ▀▀█
█ ▀▀▀ █ █▀▀
▀▀▀▀▀▀▀ █▄█
▀█ ▀▄ ▀▀ ▄▀
▀▀▄▀▄█▀▄ ▄ 
▀ ▀   ▀ ▄█▄
█▀▀▀▀▀█ ▀█ 
█ ███ █ ▄▄█
█ ▀▀▀ █ ▄██
▀▀▀▀▀▀▀ ▀▀ 

Once more to the spreadsheet

Conveniently, Wikipedia’s QR Character Placement diagram features a QR code with exactly the right size, length, and encoding mode.

A diagram of a QR code showing the fixed patterns in the corners, the format info near the corners in red, and various gray pixels encoding some control information around the string 'www.wikipedia.org' followed by seven error-correction bytes E1 through E7.
(CC0 from Wikimedia Commons)

Instead of doing anything smarter, I wrote a little code to convert the Unicode QR code into bits, copied it into a spreadsheet, and manually colored and transcribed bits from there.

The partial QR code from the challenge with analogous regions colored
The partial QR code from the challenge. The fixed dark module is in magenta, the (partial) fixed “timing pattern” is in red, one complete copy of the format information is in blue, and the other partial copy of the format information is in cyan.

From there, I referred to Thonky.com’s QR Code Tutorial a lot, just like I did in the 2018 PlaidCTF challenge I linked earlier. The Module Placement in Matrix page explains the 2D structure of a QR code. The first thing we have to handle is the Mask Pattern, a repeating pattern chosen from a small list of possibilities that’s xored with the rest of the QR code to minimize certain patterns that might be an obstacle for visual parsing. Looking up the format information string in the Format version tables shows that we have Mask Pattern 7, which flips the modules satisfying ( ((row + column) mod 2) + ((row * column) mod 3) ) mod 2 == 0. We can undo the mask with a spreadsheet formula:

The non-fixed data bits in the partial QR code after applying the mask
Unmasked QR bits

Then, we can transcribe the last few bits of the data by reading in the somewhat unnatural “boustrophedon” order:

To get any further, though, we will need to understand how QR error correction codes work quite precisely.

The structure of QR codes

To simplify this writeup, I will focus on how the specific QR code in this challenge works, with only brief comments on how QR codes of other sizes work. After fixed patterns, format info, and masking, there are 208 “modules” that can be black or white, expressing 208 bits, structured as follows:

  • The first four bits specify the encoding mode.
  • The next byte specifies the length of the payload. (Note that because we started with four bits, this byte and the following bytes aren’t aligned to the byte boundaries of the outer message.)
  • The next 18 bytes encode the payload.
  • There are four bits that form an “end” marker. (Messages with other modes or lengths will have more complicated padding here.)
  • Finally, there are 7 bytes of error correction. These bytes are computed after all the preceding data has been computed and concatenated, and is agnostic to their inner structure.

We know the QR code’s encoding mode, how long the flag is, and its first few characters, so by following Thonky’s tutorial forward (or just trying to generate a QR code ourselves) we discover that our QR code’s data must begin with the following 6½ bytes:

0b01000001
0b00010110
0b00010110
0b00110111
0b01000110
0b01100111
0b1011????

Just by visually comparing the partial QR code we have to the diagram, we see that we get all 7 error correction bytes from there, so now we need to understand how they’re computed and we can do with them. Thonky’s page on error-correction coding is really detailed, so you can read that if you want such a treatment; here, I’m just going to assume you know some abstract algebra and plow through.

When computing the error-correction bytes, all bytes are treated as elements of the finite field \(\mathbb{F}_{2^8}\), specifically as generated by a generator \(\alpha\) satisfying \(\alpha^8 = \alpha^4 + \alpha^3 + \alpha^2 + 1\) (in binary 0b100011101). That is, the byte consisting of the bits \(\overline{b_7b_6b_5b_4b_3b_2b_1b_0}\) is identified with the field element \(b_7\alpha^7 + b_6\alpha^6 + \cdots + b_0\alpha^0\). So to “add” bytes, you XOR them. To multiply bytes, you can do a painstaking bit-by-bit convolution and xor out 0b100011101 as it appears, but in practice, it’s simpler to go through precomputed log and antilog tables.

Together, the data bytes and error correction bytes comprise 26 bytes, which we will index in decreasing order because they’ll be the coefficients of a polynomial \(c_{25}, c_{24}, \ldots, c_0 \in \mathbb{F}_{2^8}\). The first 19 bytes \(c_{25}, \ldots, c_7\) are determined by the data. Then, the error-correction bytes \(c_6, \ldots, c_0\) are chosen such that the polynomial \[c_{25}x^{25} + c_{24}x^{24} + \cdots + c_1x^1 + c_0x^0\] is divisible by the generator polynomial \(g(x) := \prod_{i=0}^6 (x - \alpha^i)\). (Equivalently, since every polynomial is its own negation, you can generate the error-correction bytes by dividing the polynomial \[c_{25}x^{25} + c_{24}x^{24} + \cdots + c_7x^7\] by the generator polynomial and taking the coefficients of the remainder.)

This is a BCH code, one perspective of a Reed-Solomon code, which can correct up to 7 missing bytes: that is, given any sequence of bytes correctly constructed from the above process, if any 7 bytes are erased (while preserving the information of which positions the bytes were erased at), all the erased bytes can be reconstructed. It’s not obvious why this is always possible, nor is the general argument that useful to us anyway, so I’ll just give a rough proof sketch in a footnote.1

The actual reconstruction

Though reconstructing arbitrary erased bytes is somewhat mathematically difficult, for this challenge we only need to reconstruct adjacent erased bytes, which is a lot easier. To put this in context of what we know, here are all the bits at the start and the end of the QR code data that we’re sure of. (Technically, we can fill in a bunch more bits in the middle because the set of characters that can be in our flag is so small, but it’s annoying to use that information.)

We have 15½ of the 26 bytes, and can error-correct any 7. So we can just brute force 3½ bytes — let’s just say the first 3½ bytes of ?’s above — and try to reconstruct the rest using the error-correction to see if the result is a flag. That is, given coefficients \(c_{25}, c_{24}, \ldots, c_{16}\) and \(c_{8}, \ldots, c_{0}\), we want to pick the unknown coefficients \(c_{15}, c_{14}, \ldots, c_{9}\) such that the polynomial \[c_{25}x^{25} + c_{24}x^{24} + \cdots + c_1x^1 + c_0x^0\] is divisible by \(g(x) = \prod_{i=0}^6 (x - \alpha^i)\). Equivalently, we want a polynomial of the form \(c_{15}x^{15} + \cdots + c_9x^9\) that’s congruent to the polynomial with the coefficients we know, \[P(x) := c_{25}x^{25} + c_{24}x^{24} + \cdots + c_{16}x^{16} + c_8x^8 + c_7x^7 + \cdots c_0x^0,\] modulo \(g(x)\). (Again, there’s no need for a minus sign since our field has characteristic 2.)

This would be very easy if we were reconstructing the bottom 7 coefficients: we can just take the remainder of \(P\) when divided by \(g\), in the normal long division way. Even though we’re reconstructing a different set of coefficients, we can start by doing this division anyway to get a polynomial \(Q\) with degree less than 7. The process would look something like this (where, again, all the integer coefficients are actually representations of elements of \(\mathbb{F}_{2^8}\), so when we add/subtract them, we xor the integers).

                        ______________________________
x⁷ + 127x⁶ + ... + 117 / 65x²⁵ +  22x²⁴ +  22x²³ + ...
                         65x²⁵ + 201x²⁴ + 145x²³ + ... ←  65g(x)x¹⁸
                         _____________________________
                                 223x²⁴ + 135x²³ + ...
                                 223x²⁴ + 219x²³ + ... ← 223g(x)x¹⁷
                                 _____________________
                                           92x²³ + ...
                                           92x²³ + ... ←  92g(x)x¹⁶
                                           ___________
                                                   ...

Now, we can zero out the bottom coefficient of \(Q\) while giving it a nonzero \(x^7\) coefficient as follows: if the bottom coefficient of \(Q\) be \(Q_0\) and the bottom coefficient of \(g\) is \(g_0\), then add (or subtract) \((Q_0/g_0)g\) to \(Q\). Then, we can zero out the \(x^1\) coefficient while giving it a nonzero \(x^8\) coefficient by adding (or subtracting) \((Q_1/g_0)x^1g\). And so on. An example might look like this:

               251x⁶ +  19x⁵ + ... + 218x² + 170x¹ + 154x⁰
        51x⁷ + 121x⁶ + 134x⁵ + ... + 208x² + 144x¹ + 154x⁰ ←  (154/g₀)g(x)
      ____________________________________________________
        51x⁷ + 130x⁶ + 149x⁵ + ... +  10x² +  58x¹
125x⁸ + 15x⁷ + 155x⁶ + 218x⁵ + ... + 223x² +  58x¹         ←   (58/g₀)g(x)x
__________________________________________________________
125x⁸ + 60x⁷ +  25x⁶ +  79x⁵ + ... + 213x²
...

Eventually we get to a \(Q\) with nonzero coefficients for only \(x^{15}\) through \(x^9\), which is exactly what we want. (This is basically “backwards” long division — you can view it as dividing \(Q(x)/x^{15}\) by \(g(x)/x^{7}\) when treating both as polynomials of \(1/x\).)

Now that we know how to reconstruct 7 bytes, the remaining space of 3½ bytes has size \(256^{3^1\!/_2} = 2^{28}\) and is still a bit large to brute force, but since we know the flag only contains underscores and lowercase letters, our actual brute force surface can be more like \(3 \cdot 27^3 = 59049\). Below I am too lazy to worry about boundaries and actually use a \(3 \cdot 48^3 = 331776\) brute force, which is still plenty fast.

Onto the implementation: To avoid thinking too hard, I copied some code over from python-qrcode, the exact QR code library used by the challenge, and modified it. (Original code © 2011, Lincoln Loop, under BSD-3, with its own lineage.) Here’s the code that’s useful for us to work in \(\mathbb{F}_{2^8}[x]\):

EXP_TABLE = list(range(256))

LOG_TABLE = list(range(256))

for i in range(8):
    EXP_TABLE[i] = 1 << i

for i in range(8, 256):
    # α^i = α^(i - 8) × α^8
    #     = α^(i - 8) × (α^4 + α^3 + α^2 + 1)
    #     = α^(i - 4) + α^(i - 5) + α^(i - 6) + α^(i - 8)
    EXP_TABLE[i] = (
        EXP_TABLE[i - 4] ^ EXP_TABLE[i - 5] ^ EXP_TABLE[i - 6] ^ EXP_TABLE[i - 8]
    )

for i in range(255):
    LOG_TABLE[EXP_TABLE[i]] = i

def glog(n):
    if n < 1:
        raise ValueError(f"glog({n})")
    return LOG_TABLE[n]

def gexp(n):
    return EXP_TABLE[n % 255]

class Polynomial:
    def __init__(self, num, shift):
        # Leading coefficient first, so num[i] is coefficient of x^(len(num)-1-i)
        if not num:
            raise Exception(f"{len(num)}/{shift}")

        offset = 0
        for offset in range(len(num)):
            if num[offset] != 0:
                break

        self.num = num[offset:] + [0] * shift

    def __getitem__(self, index):
        return self.num[index]

    def __iter__(self):
        return iter(self.num)

    def __len__(self):
        return len(self.num)

    def __add__(self, other):
        num = [0] * max(len(self), len(other))

        for i, item in enumerate(self):
            num[i + len(num) - len(self)] ^= item
        for i, item in enumerate(other):
            num[i + len(num) - len(other)] ^= item

        return Polynomial(num, 0)

    def __mul__(self, other):
        if isinstance(other, int):
            other = Polynomial([other], 0)
        num = [0] * (len(self) + len(other) - 1)

        for i, item in enumerate(self):
            for j, other_item in enumerate(other):
                num[i + j] ^= gexp(glog(item) + glog(other_item))

        return Polynomial(num, 0)

    def __mod__(self, other):
        difference = len(self) - len(other)
        if difference < 0:
            return self

        ratio = glog(self[0]) - glog(other[0])

        num = [
            item ^ gexp(glog(other_item) + ratio)
            for item, other_item in zip(self, other)
        ]
        if difference:
            num.extend(self[-difference:])

        # recursive call
        return Polynomial(num, 0) % other

    def __lshift__(self, n):
        # multiply by x^n
        return Polynomial(self.num + [0] * n, 0)

    def __rshift__(self, n):
        # divide by x^n
        num = self.num[:]
        while n > 0:
            assert num.pop() == 0
            n -= 1
        return Polynomial(num, 0)

    def __str__(self):
        return f"Polynomial({self.num})"

We could compute the generator polynomial, but we could also copy its coordinates from another file:

In the final solve script, we first precompute \(g/g_0\) for ease of performing our error correction. We are also rather lazy and leave garbage from previous iterations in the coefficients \(c_{15}, \ldots, c_{9}\) we’re solving for, but it’s fine because we always compute the right polynomial we want to add to the full polynomial with the garbage included and it cancels out. The rest is as described above, with our \(3 \cdot 48^3 = 331776\) brute force into error correction.

Maybe a minute later we have our flag:

actf{not_quadres}

  1. This is equivalent to claiming that no two distinct valid codewords (correctly error-corrected byte sequences) differ in 7 or fewer bytes. By subtracting those two hypothetical codewords’ polynomials, this is equivalent to claiming that no nonzero polynomial exists in \(\mathbb{F}_{2^8}\) that has degree at most 25, has at most 7 nonzero coefficients, and evaluates to 0 at all 7 of \(\alpha^0, \alpha^1, \ldots, \alpha^6\). By viewing such a hypothetical polynomial as a linear combination of its monomials, this is equivalent to claiming that every 7 distinct vectors of the form \((\alpha^k, \alpha^{2k}, \ldots, \alpha^{7k})\) for \(k = 0, 1, \ldots, 25\) are linearly independent. This turns out to be true because any such 7 vectors would form a Vandermonde matrix, whose determinant has a neat formula, so we’re done.

    This is covered, slightly obscurely, in the “Properties section” of the BCH code Wikipedia article.

(note: the commenting setup here is experimental and I may not check my comments often; if you want to tell me something instead of the world, email me!)