The Inspiration for Mathematician’s Assistant


In high school, math is largely pattern matching: if the polynomial’s highest exponent is two, use the quadratic formula. As an undergraduate, the game changes. There’s a bigger element of problem solving. You now have a whole collection of tools, and it’s not obvious what you should use. Integration by parts? Make a substitution? If so, which substitution?

Mathematics is a game played according to certain simple rules with meaningless marks on paper.

David Hilbert

We create our Computer Algebra Systems largely as pattern matching engines, with no understanding of our goal. That is why they sometimes give answers that are correct but useless, and why their output often needs significant simplification or factoring before it becomes useful.

Still, this is workable, and very useful when the pattern matching rules are known. But when a working mathematician or physicist is exploring a new area, there are no rules yet. They need to discover them, through insight and reflection, guided by trial and error. They give themselves problems, and try to solve them. They try a few things, hit some roadblocks, then try to reflect on the nature of the roadblock.

What would a program look like that’s more like the undergraduate: some new rules are defined, and the game is to get to a goal, but it isn’t told how? Could it follow roughly the path that the undergraduate does? What does an undergraduate do anyway?

A simple test case: knowing only simple laws of arithmetic, what’s needed to derive the quadratic formula, if you’ve never seen a polynomial before?

  1. Your goal is to manipulate the equation ax^2+bx+c=0 to get x alone on the left hand side. In other words, you’re given a goal that can be stated syntactically: a single x on the left hand side.
  2. The simple rules are mostly the distributivity, associativity and commutativity of addition and multiplication, and some similar rules for subtraction and division.
  3. To get x alone on the LHS, you first have to get an equation in which x appears only once. (Spoiler: this will lead to completing the square.) In other words, you create subgoals as markers along the way to the result.
  4. You use your experience with simpler problems to guide you here. For example, you know that (x+y)^2=x^2+2xy+y^2, and that right hand side looks a lot like your initial expression. Can you turn one into the other somehow?
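
For reference, here is how the chain of subgoals above plays out when written down, as a standard completing-the-square derivation. It assumes a ≠ 0, and the last two steps need a square root, which goes slightly beyond the arithmetic rules listed in step 2.

```latex
\begin{align*}
ax^2 + bx + c &= 0 \\
x^2 + \frac{b}{a}x &= -\frac{c}{a} \\
x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} &= \frac{b^2}{4a^2} - \frac{c}{a} \\
\left(x + \frac{b}{2a}\right)^2 &= \frac{b^2 - 4ac}{4a^2} \\
x + \frac{b}{2a} &= \pm\frac{\sqrt{b^2 - 4ac}}{2a} \\
x &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
\end{align*}
```

The fourth line is where the subgoal from step 3 is reached: x appears exactly once, and the identity from step 4 is what gets you there.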

The general structure is:

  1. A goal that can be stated as a syntactic constraint on an expression, the meaningless marks.
  2. The simple rules, like the distributive rule.
  3. Some experience to decide which rules to use in which scenarios.
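
To make the first two ingredients concrete, here is a minimal Python sketch on a simpler problem, solving a*x + b = c for x. Everything in it is an assumption made for illustration: expressions are nested tuples like ("+", left, right), and the names is_goal, successors and solve are hypothetical. The third ingredient, experience, is deliberately missing: the search is a blind breadth-first search that tries every rule.

```python
from collections import deque

def contains(expr, sym):
    """True if the symbol occurs anywhere in the expression tree."""
    if expr == sym:
        return True
    return isinstance(expr, tuple) and any(contains(arg, sym) for arg in expr[1:])

def is_goal(eq):
    """Syntactic goal: a bare x on the left, and no x on the right."""
    op, lhs, rhs = eq
    return op == "=" and lhs == "x" and not contains(rhs, "x")

def successors(eq):
    """Apply each simple rule to the top of the left hand side."""
    _, lhs, rhs = eq
    if not isinstance(lhs, tuple):
        return
    op, a, b = lhs
    if op in ("+", "*"):
        yield "commute", ("=", (op, b, a), rhs)
    if op == "+":
        yield "subtract from both sides", ("=", a, ("-", rhs, b))
    if op == "*":
        # Assumes the divisor b is nonzero; the sketch does not track that.
        yield "divide both sides", ("=", a, ("/", rhs, b))

def solve(eq, max_steps=10):
    """Breadth-first search from eq to any equation satisfying is_goal."""
    queue = deque([(eq, [])])
    seen = {eq}
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return state, path
        if len(path) >= max_steps:
            continue
        for name, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [name]))
    return None

# a*x + b = c
start = ("=", ("+", ("*", "a", "x"), "b"), "c")
print(solve(start))
```

On this input the search returns x = (c - b) / a after three rule applications. Blind search like this stops being workable very quickly as the rule set grows, which is exactly where the third ingredient comes in.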

Of course, how to encode these into a computer isn’t straightforward. How do you represent goals? Perhaps as predicates on trees, starting with a generalization of regular expressions to trees. How do you represent distance to the goal, to guide the search? One option is a generalized edit distance, where you can add new operations as you “chunk” sequences of rules into a new, single rule. How do you express which rules are likely to work in which scenarios? Modern machine learning holds promise here: how different is this from learning which move to make in a game like Go?
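
As a rough illustration of the first two questions, here is a Python sketch of a tree pattern with wildcards and a crude distance to the goal. It is a toy under stated assumptions, not a proposal for the real design: expressions are nested tuples, wildcards are strings starting with "?", and match, count, depth_of and distance are hypothetical names. The distance is a hand-written stand-in for the generalized edit distance mentioned above, not an implementation of it.

```python
def match(pattern, expr, bindings=None):
    """Return wildcard bindings if expr matches pattern, else None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        # A wildcard must bind to the same subtree everywhere it appears.
        if pattern in bindings and bindings[pattern] != expr:
            return None
        bindings[pattern] = expr
        return bindings
    if isinstance(pattern, tuple) and isinstance(expr, tuple):
        if len(pattern) != len(expr):
            return None
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None

def count(expr, sym):
    """Number of occurrences of sym in the expression tree."""
    if expr == sym:
        return 1
    if isinstance(expr, tuple):
        return sum(count(arg, sym) for arg in expr[1:])
    return 0

def depth_of(expr, sym):
    """Depth of the shallowest occurrence of sym, or None if absent."""
    if expr == sym:
        return 0
    if isinstance(expr, tuple):
        depths = [d for d in (depth_of(arg, sym) for arg in expr[1:]) if d is not None]
        if depths:
            return 1 + min(depths)
    return None

def distance(eq, sym="x"):
    """Crude heuristic: penalize extra copies of sym and how deeply it is buried."""
    _, lhs, rhs = eq
    occurrences = count(lhs, sym) + count(rhs, sym)
    d = depth_of(lhs, sym)
    buried = d if d is not None else 1  # sym missing from the left hand side
    return 2 * abs(occurrences - 1) + buried

# The right hand side of (p+q)^2 = p^2 + 2pq + q^2, as a tree pattern.
square = ("+", ("+", ("^", "?p", 2), ("*", ("*", 2, "?p"), "?q")), ("^", "?q", 2))
expr = ("+", ("+", ("^", "x", 2), ("*", ("*", 2, "x"), "k")), ("^", "k", 2))
print(match(square, expr))

# a*x^2 + b*x + c = 0
quadratic = ("=", ("+", ("+", ("*", "a", ("^", "x", 2)), ("*", "b", "x")), "c"), 0)
print(distance(quadratic))
```

With this heuristic the original quadratic scores higher than a completed square, which in turn scores higher than a bare x on the left, so it at least orders the subgoals the way the undergraduate would. Whether anything this simple would actually guide a search well is an open question.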

It’s still helpful to have an external entity to feed in example problems. It’s still useful to learn from other solutions. So this is collaborative with a person. That’s the reason behind the name “Mathematician’s Assistant.” It assists a person, rather than working alone.

My favourite description of my Ph.D. thesis was by Ian Horswill: “you’ve automated the Master’s thesis!” What could you do if you had a fast but not particularly clever Master’s student? Or a whole collection of them? How much more of a mathematical space could you explore in a month or a year?

As a Ph.D. student or working mathematician, new skills come into play. A big one is coming up with definitions. This is a very different skill from solving undergraduate problem sets. It’s somewhat implicit in the “chunking” I mentioned above: finding patterns that seem to be useful in multiple contexts, and generalizing them the right way. For example, while Werner Heisenberg was working on Quantum Mechanics, he re-invented the concept of the matrix, calling it a “table”, because linear algebra wasn’t well known in the 1920s.
