Marking Maths, Part 2

We’ve worked on entering mathematical equations: that’s straightforward in principle, though very hard to do in a way that feels natural on a computer. Harder still is marking such an equation, by which we mean comparing it to a known answer to see if it’s equal.

‘Equal’ here has to be defined carefully and is sometimes split into two categories. There are syntactically equal expressions, where we say $x + y$ is equal to $y + x$ because they are both just adding $x$ and $y$ together and the order in addition doesn’t matter. Then there are semantically equal expressions, where $(x + y)^2$ equals $x^2 + 2xy + y^2$, say, or $\frac{1}{\pi}$ equals $\frac{2\sqrt{2}}{9801}\sum_{k=0}^{\infty}\frac{(4k)!(1103+26390k)}{(k!)^{4}\,396^{4k}}$.
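To make the distinction concrete, here is a minimal sketch in SymPy (the library introduced below). It assumes that structural comparison with `==` is a reasonable stand-in for "syntactic" equality and that simplifying the difference to zero stands in for "semantic" equality; the variable names are purely illustrative.

```python
import sympy as sp

x, y = sp.symbols("x y")

# `==` compares expression trees, not mathematical value.
# SymPy stores sums in a canonical order, so x + y and y + x are the same tree.
same_tree_commuted = (x + y) == (y + x)

factored = (x + y)**2
expanded = x**2 + 2*x*y + y**2

# Different trees, even though they describe the same function.
same_tree_expanded = factored == expanded

# A semantic check: does the difference simplify to zero?
same_value = sp.simplify(factored - expanded) == 0

print(same_tree_commuted, same_tree_expanded, same_value)
```

The first and third comparisons succeed, while the second fails: the factored and expanded forms agree in value but not in structure.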

Checking semantic equality is hard. In fact, for sufficiently rich classes of expressions it’s known to be undecidable in general (see Richardson’s Theorem), but we’ve never been ones to listen to that sort of thing!

We use SymPy to do the checking; it’s an open-source computer algebra library for Python with excellent support for precisely the equality checking we want. There were quite a few hurdles to clear in integrating SymPy into Isaac: it’s Python but Isaac is Java; it can be quite slow sometimes; it checks Python expressions, not LaTeX or a syntax tree directly; and worst of all it’s far too good at what it does (see the image below). It’s amazing at semantic equality, but sometimes we just want syntactic equality!

[Image: a question on Isaac asking for 'F/m' as an answer, but saying that a complicated expression involving G's, sqrt(3)'s and (sqrt(F/m))^2 is correct!]
Caption: SymPy recognises that the expression simplifies down to $\frac{F}{m}$: but that's not what we wanted here!

Actually, as a fun aside, SymPy didn’t match the complicated entry in the image to the expected answer of $\frac{F}{m}$: it says they’re not equal. It turns out this isn’t because it couldn’t check it, but because it assumes every symbol is a complex number, and so it won’t simplify $\sqrt{x^{2}}$ to $x$. This highlights another issue: does SymPy say two expressions aren’t equal because they really aren’t equal, or because it can’t simplify them well enough? So I added code to sample the functions at random points of their inputs[1] and see whether the results match to within a very small numerical error. It won’t be right every time, but it will flag up possible matches for a human to review.
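Both points can be sketched in a few lines of SymPy. This is an illustration under stated assumptions, not the code used on Isaac: the helper `numerically_equal`, its sampling range, its tolerance and the example `submitted` expression are all hypothetical. The helper samples over the union of the symbols in both expressions, which is the situation the footnote describes.

```python
import random
import sympy as sp

x = sp.Symbol("x")                 # no assumptions, so treated as complex
p = sp.Symbol("p", positive=True)  # explicitly positive

unsimplified = sp.sqrt(x**2)       # left as sqrt(x**2)
simplified = sp.sqrt(p**2)         # becomes p


def numerically_equal(a, b, trials=10, tol=1e-9, seed=0):
    """Hypothetical helper: compare a and b at random positive points."""
    rng = random.Random(seed)
    # Sample every symbol that appears in either expression.
    symbols = sorted(a.free_symbols | b.free_symbols, key=lambda s: s.name)
    for _ in range(trials):
        point = {s: rng.uniform(0.5, 2.0) for s in symbols}
        va = complex(a.subs(point).evalf())
        vb = complex(b.subs(point).evalf())
        # Relative tolerance, with a floor of 1 for values near zero.
        if abs(va - vb) > tol * max(1.0, abs(va), abs(vb)):
            return False
    return True


F, m, G = sp.symbols("F m G")
expected = F / m
# Illustrative submission only (not the one in the image): G appears here
# but not in the expected answer, and cancels only for positive values.
submitted = G * sp.sqrt(F**2) / (m * sp.sqrt(G**2))

print(submitted == expected)                   # structural comparison
print(numerically_equal(submitted, expected))  # sampled comparison
```

Because this sketch samples only positive values, it treats $\sqrt{F^{2}}$ and $F$ as matching even though they differ for negative $F$. That is exactly the kind of result that should be flagged for a human to review rather than accepted automatically.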

We then built into the system a way to check syntactic equality too. This turns out to be really useful when we want to make sure someone has factorised an expression – the factorised correct answer is (semantically) equal to the wrong non-factorised answer! Computers don’t yet understand these subtleties.
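As a rough sketch of how the two kinds of check might be combined for a factorisation question, the hypothetical `mark` function below tries a structural match first and then falls back to a semantic one. The function name and its three return labels are assumptions for illustration, not Isaac's actual marking logic.

```python
import sympy as sp

x = sp.symbols("x")


def mark(submitted, expected):
    """Hypothetical marker: classify a submission against the expected answer."""
    if submitted == expected:
        return "exact"        # same expression tree
    if sp.simplify(submitted - expected) == 0:
        return "equivalent"   # same value, different form
    return "different"


expected = (x + 1) * (x + 2)

print(mark((x + 2) * (x + 1), expected))  # factors in another order
print(mark(x**2 + 3*x + 2, expected))     # correct value, not factorised
print(mark(x**2 + 3*x + 3, expected))     # wrong value
```

A question that demands a factorised answer could accept only the "exact" result, while "equivalent" could trigger feedback along the lines of "right value, but please factorise".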

It’s being slowly rolled out: there are some beta questions which use the checker, and we hope for lots more questions on Isaac to use it soon enough!


[1] If you’re paying close attention you’ll notice that the expected answer contains the variables $F$ and $m$, but that the thing submitted contains $G$ too. It’s obvious to a person that $G$ cancels out immediately, but not to the computer. So the two expressions can depend on different numbers of variables, and we have to be able to sample points in both $n$- and $m$-dimensional spaces and compare. Fun!


James Sharkey

James works on both the Physics and Computing sides of the Isaac Physics project, having previously worked on the Dynamics and Maths questions.