## Category → CS

One thing many mathematically-inclined programmers run into when implementing mathematical algorithms, particularly number-theoretic ones, is that the modulo operation doesn’t behave how they expect or prefer.

In many languages, this operator is denoted %. Concretely, one might prefer that, if the second argument is positive, then the modulo operation would always give a nonnegative result. Under this behavior, the expression (-5) % 3 would evaluate to 1 rather than -2. This is a lot more useful for number theory because then for positive integers n, the % n operation actually maps integers to exactly the n canonical representatives for the residue classes. As a result, $$a \equiv b \mod n$$ if and only if a % n == b % n. You can also do things like index into a length-n array with a % n and know that the index will be in-bounds. Finally, there are also optimization opportunities: modding by a power of 2 becomes equivalent to a simple bitwise AND, which is really fast on modern computers.

A few programming languages, notably Python, do implement % this way. However, the majority of languages today, including pretty much everything remotely descended from C, do not; instead, (-5) % 3 is -2. This post attempts to track down why.

The first thing to note is that there is a more important number-theoretic identity we’d like to have:

$\texttt{a} = (\texttt{a / b}) \cdot \texttt{b} + (\texttt{a \% b}). \tag{1}$

In words, the integer division and modulo operators should give a breakdown of a into a sum of some copies of b plus a remainder. Note that this equation also implies that specifying the rounding behavior of division is equivalent to specifying the sign behavior of the modulo operation, which will come up repeatedly later.

It’s also very uncontroversial that that remainder should have no copies of b, positive or negative, left over, which gives the constraint:

$|\texttt{a \% b}| < |\texttt{b}|. \tag{2}$

Every programming language I can think of satisfies these two constraints.1 So far so good. However, these two constraints don’t uniquely determine the values of a % b when a isn’t divisible by b; there are two possible values for a % b, one positive and one negative. Concretely, we could express $$-5$$ as either $$(-1) \cdot 3 + (-2)$$ or $$(-2) \cdot 3 + 1$$, so (-5) % 3 could be -2 or 1.

It’s still mostly uncontroversial that, if a and b are both positive, then a % b should be nonnegative as well; we could call this constraint (3).2 However, if a is negative and b is positive, programming languages start to diverge in their behavior. Why?

It’s December, so it’s time for a lot of things, including Advent of Code. I will not be able to be as competitive as I was last year, and already lost a lot of points to a really silly mistake on day 1, but I’ll be playing when I can and golfing the problems when I have time (so far: 7 + 14 bytes).

As one might expect, Day 1 is not too complex, but the second part can be analyzed to some mathematical depth and was discussed a bit on Reddit; plus, it occurred to me recently that I set up KaTeX on my blog but never used it, so I was looking for an excuse to write some equations anyway.

The problem statement for part 2, in brief: We are tasked with calculating the total mass of fuel required to launch a rocket module of a given mass. For something of mass $$m$$, one can compute the directly required mass of fuel by dividing $$m$$ by 3, rounding down, and subtracting 2; if the result is negative, it is taken to be 0 instead. However, the directly required fuel also requires fuel itself, calculated from its own mass by the same procedure, and that required fuel requires fuel based on its own mass, and so on until you reach fuel with 0 requirement.

Call me maybe? nc rev.chal.csaw.io 1001

A rev with a nasty binary. There are so many functions. I do not like this binary.

### Static Analysis

After staring at the sea of functions in IDA for a little bit, I gave up and tried dumb things instead.

come get me

http://web.chal.csaw.io:1003

This was a web challenge with a few pages. The “User” page displayed some user information:

Name: Alice
Email: [email protected]
Group: CSAW2019
Intro: Alice is cool

Name: Bob
Email: [email protected]
Group: CSAW2019
Intro: Bob is cool too

The “About” page simply told us, “Flag is located at /flag.txt, come get it”. The most interesting page was “Upload”, where we could view an example users XML file:

Welcome to pwn.

nc pwn.chal.csaw.io 1005

Ahhh, CSAW CTF. Amidst all the other CTFs where we’re competing with security professionals who probably have decades of experience and who follow security developments for a living or whatever, there remains a competition where scrubs like me can apply our extremely basic CTF skills and still feel kinda smart by earning points. Now that I’ve graduated and am no longer eligible, our team was pretty small and I didn’t dedicate the full weekend to the CTF, but it means I got to do the really easy challenges in the categories that I was the worst at, by which I mean pwn.

baby_boi is pretty much the simplest possible modern ROP (the modern security protections NX and ASLR are not artificially disabled, but you get everything you need to work around them). We even get source code.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv[]) {
char buf[32];
printf("Hello!\n");
printf("Here I am: %p\n", printf);
gets(buf);
}

So there’s nothing novel here for experienced pwners, but I feel like there is a shortage of tutorials that walk you through how to solve a textbook ROP the way you’d want to solve it in a CTF, so here is a writeup.

We interrupt the irregularly scheduled philosophical posts for some programming memes:

Over the last few days, the Internet has divided itself over what the value of the expression 8÷2(2+2) should be. Some say it should be evaluated as (8÷2)×(2+2) = 16. Some say it should be evaluated as 8÷(2×(2+2)) = 1.

At the risk of belaboring the obvious, the core dispute here is not really mathematical. There is not some sequence of mathematical operations that produces some number, where mathematicians disagree about what number it produces. Instead, this is a dispute about mathematical notation: what sequence of mathematical operations the expression corresponds to the way it’s written. Specifically, it is a dispute about whether multiplication written as juxtaposition (how “2” is written right next to “(2+2)”) has strictly higher precedence than division. It is closer to a linguistic or typographical dispute than a purely mathematical one, and the correct answer to the dispute is that whoever wrote the expression that way should learn to write math better.

This debate is not even new. The internet had fun arguing over 48÷2(9+3) and 6÷2(1+2), which are functionally identical ambiguous expressions, eight years ago. I don’t know why the debate is resurging now and why we still haven’t gotten tired of it.

But life is short, so since we’re here anyway, let’s make some additional memes.

$python3 Python 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> 8/2(2+2) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'int' object is not callable http://111.186.63.17/perf.data.gz Environment: Ubuntu 16.04+latex In this challenge, we get a gzipped file called perf.data and a minimal description of an environment. Googling this reveals that perf.data is a record format of the perf tool, a Linux profiler. Installing perf allows us to read perf.data and see some pretty interactive tables of statistics in our terminal describing the profiling results, from which we can see some libraries and addresses being called, but they don’t reveal much about what’s going on. One hacky way to see more of the underlying data in a more human-readable way (and to see just how much of it there is) is perf report -D, which dumps the raw data in an ASCII format, but this is still not that useful. (One might hope that one could simply grep for the flag in this big text dump, but it’s nowhere to be seen.) Still, from this file, we can definitely read off all the exact library versions that the perf record was run against. A more accurate but less informative title for this post would be “How I wish React and Redux were explained to me”. Note that this does not imply that this method of explanation is suitable for anybody else. I suspect it won’t be for most people. I had to learn React and Redux the past summer for my internship at MemSQL, and there were hundreds of articles that explain React and Redux in addition to the (fine) built-in documentation, but none of them scratched the itch; I wanted to know what was going on completely, including some of the technical details and the philosophy I ought to be following, as well as efficiently. I did not need another explanation about how to think functionally, in JavaScript types or with immutable data. React’s chapter on Conditional Rendering, for example, felt so inefficient — I know what if statements and conditional expressions are, and I know how to refactor complicated subexpressions into variables… So here’s the guide I wish I had. I think. It’s been months since I started it (as usual, for posts on this blog) and it is probably incomplete. However, I haven’t written React/Redux deeply in a while, so I didn’t have much motivation to continue to investigate the incomplete bits; and the perfect is the enemy of the good, so here it is. We’ve burrowed ourselves deep within the facility, gaining access to the programable logic controllers (PLC) that drive their nuclear enrichment centrifuges. Kinetic damage is necessary, we need you to neutralize these machines. You can access this challenge at https://wargames.ret2.systems/csaw_2018_plc_challenge A much belated post. This is a pwn challenge on a custom online wargaming platform. We are provided with the assembly of what’s ostensibly a programmable logic controller (PLC) for a centrifuge in a nuclear reactor. The challenge looks like it’s still up, so you can take a look and follow along. This was the first ROP (okay, spoiler, it’s a ROP) I ever pulled off live during an actual CTF, which I was pretty excited about. The web platform meant I had to worry less about setup, and even though some of the tools it provided were a little lacking (no gdb shortcuts like until, no pwntools utilities for packing/unpacking numbers, … no one_gadget), I think they ultimately made the whole thing a lot more educational for me, so kudos to the folks behind it. I’ve included a brief description of all the exploit techniques that lead up to ROP when we get to that, so hopefully this post will be useful even if you don’t know much about pwning binaries. The prerequisites would be some knowledge with x86 assembly, how executables are loaded into memory, and how to use gdb (or fictionalized web knockoffs thereof). The villains are communicating with their own proprietary file format. Figure out what it is. $ nc proprietary.ctfcompetition.com 1337
We get a server that will talk to us on a port and a flag.ctf file that’s definitely not a binary. It’s a black-box reversing challenge! I was @-mentioned as the person who might want to due to solving bananaScript (CSAW CTF Quals 2017) as a black box, although that gave a binary that it was possible in theory to reverse. Here black-box reversing is the only option.
For the first few lines of input that the server wants, it responds with quite helpful error messages to help you appease it. If the first line you give it is not P6, it complains: