### Signed Modulo

One thing many mathematically-inclined programmers run into when implementing mathematical algorithms, particularly number-theoretic ones, is that the modulo operation doesn’t behave how they expect or prefer.

In many languages, this operator is denoted `%`. Concretely, one might prefer that, if the second argument is positive, then the modulo operation would always give a nonnegative result. Under this behavior, the expression `(-5) % 3` would evaluate to `1` rather than `-2`. This is a lot more useful for number theory because then for positive integers `n`, the `% n` operation actually maps integers to exactly the `n` canonical representatives for the residue classes. As a result, \(a \equiv b \mod n\) if and only if `a % n == b % n`. You can also do things like index into a length-`n` array with `a % n` and know that the index will be in-bounds. Finally, there are also optimization opportunities: modding by a power of 2 becomes equivalent to a simple bitwise AND, which is really fast on modern computers.
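In Python, whose `%` already behaves this way, all of these properties can be checked directly:

```python
# Python's % takes the sign of the divisor, so with a positive
# modulus the result is always nonnegative.
assert (-5) % 3 == 1

# a ≡ b (mod n) exactly when the two remainders agree.
a, b, n = -5, 7, 3   # -5 and 7 differ by 12, a multiple of 3
assert (a - b) % n == 0
assert a % n == b % n

# Safe in-bounds indexing into a length-n list.
xs = ["p", "q", "r"]
assert xs[-5 % len(xs)] == "q"

# Modding by a power of two is a bitwise AND against (2**k - 1)
# on the two's-complement representation.
assert (-5) % 8 == (-5) & 7 == 3
```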

A few programming languages, notably Python, do implement `%` this way. However, the majority of languages today, including pretty much everything remotely descended from C, do not; instead, `(-5) % 3` is `-2`. This post attempts to track down why.
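The C-family behavior can be reproduced in Python itself via `math.fmod`, which truncates the quotient toward zero instead of rounding it down:

```python
import math

# math.fmod follows the C convention: the quotient is truncated
# toward zero, so the remainder takes the sign of the dividend.
assert math.fmod(-5, 3) == -2.0

# Python's own % rounds the quotient down (floor division) instead.
assert (-5) % 3 == 1
```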

The first thing to note is that there is a more important number-theoretic identity we’d like to have:

\[\texttt{a} = (\texttt{a / b}) \cdot \texttt{b} + (\texttt{a \% b}). \tag{1}\]

In words, the integer division and modulo operators should give a breakdown of `a` into a sum of some copies of `b` plus a remainder. Note that this equation also implies that specifying the rounding behavior of division is equivalent to specifying the sign behavior of the modulo operation, which will come up repeatedly later.
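Both conventions satisfy identity (1); they simply pair a different quotient with a different remainder. A quick Python check, using the built-in `//` and `%` for the floored convention and a hypothetical `trunc_divmod` helper (not from the original post) for the C-style one:

```python
def trunc_divmod(a, b):
    # C-style division: quotient truncated toward zero,
    # remainder takes the sign of the dividend.
    q = int(a / b)  # int() truncates toward zero
    return q, a - q * b

a, b = -5, 3

# Floored convention (Python's built-ins): -5 == (-2)*3 + 1
assert a == (a // b) * b + (a % b)

# Truncated convention (C-style): -5 == (-1)*3 + (-2)
q, r = trunc_divmod(a, b)
assert (q, r) == (-1, -2)
assert a == q * b + r
```

Either way, identity (1) holds; only the choice of rounding for the quotient differs.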

It’s also very uncontroversial that the remainder should have no copies of `b`, positive or negative, left over, which gives the constraint:

\[|\texttt{a \% b}| < |\texttt{b}|. \tag{2}\]

Every programming language I can think of satisfies these two constraints.^{1} So far so good. However, these two constraints don’t uniquely determine the value of `a % b` when `a` isn’t divisible by `b`; there are two possible values for `a % b`, one positive and one negative. Concretely, we could express \(-5\) as either \((-1) \cdot 3 + (-2)\) or \((-2) \cdot 3 + 1\), so `(-5) % 3` could be `-2` or `1`.
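Enumerating every remainder allowed by constraints (1) and (2) makes the ambiguity explicit. A small brute-force sketch (the helper name is made up for illustration):

```python
def valid_remainders(a, b):
    # All pairs (q, r) with a == q*b + r and |r| < |b|,
    # found by scanning a generous range of quotients.
    return [(q, a - q * b)
            for q in range(-abs(a) - 1, abs(a) + 2)
            if abs(a - q * b) < abs(b)]

# Exactly two valid breakdowns of -5 by 3:
#   (-2)*3 + 1   and   (-1)*3 + (-2)
assert valid_remainders(-5, 3) == [(-2, 1), (-1, -2)]
```

Whenever `b` does not divide `a`, the scan finds exactly two candidates, one with each sign; a language's `%` must pick one.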

It’s still mostly uncontroversial that, if `a` and `b` are both positive, then `a % b` should be nonnegative as well; we could call this constraint (3).^{2} However, if `a` is negative and `b` is positive, programming languages start to diverge in their behavior. Why?