Research

Many difficult computational tasks can be boiled down to one word: search. Be it program generation, theorem proving, satisfiability modulo theories (SMT), or playing board games, the core idea is to search for a configuration or an expression that satisfies a list of simple but expressive criteria. For theorem proving, the criteria come from the axiomatic type system of a proof assistant. For board games, they are the rules of the game. In highly specialized tasks such as SMT and board games, mechanical programs already achieve super-human performance; examples include CVC5 (for SMT) and Stockfish (for chess).

However, progress on automatically searching for proofs of mathematical statements has been slow because the search space of this problem is unbounded. One reason is the generality of theorem proving: SMT, chess, program generation, and natural language reasoning problems can all be expressed as mathematical statements. The reward for developing such an algorithm is also immense, as it could address major problems in today's artificial intelligence systems.

On the other hand, machine learning methods have achieved incredible success in some intellectually challenging tasks: AlphaGo for playing Go, GPT for text generation, Stable Diffusion for image generation, and others. This naturally leads to a question: can machine learning methods be applied to the difficult task of theorem proving?

Research Interests

My main research interest is using a hybrid of formal methods and machine learning for theorem proving. Formal methods have achieved success in problems with limited domains (e.g. linear programming, SAT, cryptography). Dr. Barrett and I coined the term machine-assisted theorem proving (MATP) for this hybrid approach, to avoid confusion with the existing term automated theorem proving (ATP), which refers to first-order logic only. The term also emphasises the secondary nature of machine learning compared to formal methods.

I created Pantograph / PyPantograph, an interface to Lean 4 whose main goal is to facilitate the proof search process. Pantograph was developed from the ground up in Lean so that a machine can search for any proof a human operator could write down. It also overcomes several design limitations of previous works such as Lean REPL, LeanGym, and LeanDojo.
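As a rough illustration, the sketch below shows how a Python program might drive a proof through PyPantograph. The Server, goal_start, and goal_tactic names follow PyPantograph's published examples, but exact signatures may differ between versions, so treat this as a sketch rather than a definitive usage guide.

```python
from pantograph import Server

# Start a Lean 4 server in the background (names follow the PyPantograph
# examples; exact signatures may vary between versions).
server = Server()

# Create an initial proof state from a goal expression.
state = server.goal_start("forall (p q : Prop), Or p q -> Or q p")

# Apply tactics one step at a time. Each call returns a new proof state,
# so a search procedure is free to branch and backtrack over states.
state = server.goal_tactic(state, goal_id=0, tactic="intro p q h")
print(state)  # inspect the remaining goals
```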

I am working on Trillium, a platform for training and evaluating hybrid theorem proving agents. Because large language models are resource-intensive and unreliable, I am testing theorem proving consortia consisting of formal-method and neural-network agents. By placing neural networks in a secondary role, we can first try existing, more transparent automated reasoning methods such as SMT solvers to make progress on a proof, and only consult the neural network when those methods fail.
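The consortium idea can be summarised in a short sketch. Everything below is illustrative: the agent interface and helper names are assumptions made for exposition, not Trillium's actual API.

```python
from typing import Callable, Optional

# Illustrative only: each agent takes a proof state and returns a new state
# on success, or None on failure. This interface is an assumption, not
# Trillium's actual API.
Agent = Callable[[str], Optional[str]]

def prove_step(state: str, automation: list[Agent], llm_agent: Agent) -> Optional[str]:
    """Advance a proof state, consulting the neural agent only as a last resort."""
    for agent in automation:       # transparent methods first: decision procedures, SMT dispatch, ...
        next_state = agent(state)
        if next_state is not None:
            return next_state
    return llm_agent(state)        # fall back to the language model only when automation fails
```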

Prospective Collaborators

Please send me an email at aniva at stanford dot edu and we can discuss collaborations. Undergraduate and master's students are welcome.

My current research project (open to collaboration) is to use language models with my tool Pantograph. We can now port many algorithms that were previously implemented for Isabelle and try to improve theorem proving performance in Lean. In PyPantograph, I have an example of running the Draft-Sketch-Prove experiment on Lean. This example is still at an early stage and there is room for substantial performance improvements.
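For readers unfamiliar with the method, the Draft-Sketch-Prove loop roughly follows the shape below. This is a schematic of the published method, not the PyPantograph example itself; the helper names are placeholders for model calls and for low-level automation driven through Pantograph.

```python
from typing import Callable

# Schematic of the Draft-Sketch-Prove loop; all helpers are placeholders.
def draft_sketch_prove(
    theorem: str,
    draft: Callable[[str], str],           # LLM: write an informal natural-language proof
    sketch: Callable[[str, str], list],    # LLM: produce a formal skeleton as a list of open holes
    close_hole: Callable[[object], bool],  # automation: discharge one hole (e.g. Lean tactics)
) -> bool:
    informal_proof = draft(theorem)              # 1. draft
    holes = sketch(theorem, informal_proof)      # 2. sketch
    return all(close_hole(h) for h in holes)     # 3. prove
```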

