diff --git a/blog/bitter-algebra.md b/blog/bitter-algebra.md
index 3da0bfd..dbad0e7 100644
--- a/blog/bitter-algebra.md
+++ b/blog/bitter-algebra.md
@@ -18,7 +18,7 @@ There is a well-developed [theory](https://en.wikipedia.org/wiki/Confluence_(abs
 
 If a natural heuristic isn't available, we must simply create one, by using deep learning to approximate the functions we wish we could have. The approach used to significant success in [AlphaProof](https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/) is, according to the scant public descriptions, based on [AlphaZero](https://arxiv.org/abs/1712.01815). It works over formal proofs (in [Lean](https://lean-lang.org/)) rather than expression rewrites, and bootstraps a policy neural net (which generates a distribution over next steps) and value neural net (which rates probability of success) by using MCTS to "amplify" the policy. This was, apparently, quite computationally expensive, and is perhaps not very practical for cheap home use. However, our potential design space is broader than this.
 
-Lean is a complex and heavy programming language, so executing a Lean program is expensive compared to neural net forward passes, so AlphaProof has to use a relatively large neural net to match up inference time and proof validation time[^4]. This may not be true of a well-written symbolic expression manipulator. There's a clever and little-known program called [RIES](https://www.mrob.com/pub/ries/index.html), which finds equations given their solutions (the opposite of a normal numerical evaluator). Using caching (roughly) and a heavily optimized expression generator/evaluator which essentially brute-forces short [RPN](https://en.wikipedia.org/wiki/Reverse_Polish_notation) programs sorted by complexity, it can generate and evaluate several billion expressions a second (and differentiate top candidates). While a flexible pattern matcher would have to be slower, it is likely possible to outperform a standard CAS with explicit design for performance and externalization of ease-of-use features. Larger-scale search with a worse neural-net heuristic trades expensive GPU time for cheaper GPU time and might still be competitive, as StockFish is in chess versus more heavily neural-net-driven engines (though this is down to properties of chess's search trees and is not, to my knowledge, true in e.g. Go).
+Lean is a complex and heavy programming language, so executing a Lean program is expensive compared to a neural net forward pass, and AlphaProof therefore has to use a relatively large neural net to keep inference time and proof validation time comparable[^4]. This may not be true of a well-written symbolic expression manipulator. There's a clever and little-known program called [RIES](https://www.mrob.com/pub/ries/index.html), which finds equations given their solutions (the opposite of a normal numerical evaluator). Using caching (roughly) and a heavily optimized expression generator/evaluator which essentially brute-forces short [RPN](https://en.wikipedia.org/wiki/Reverse_Polish_notation) programs in order of complexity, it can generate and evaluate several billion expressions a second (and differentiate the top candidates). While a flexible pattern matcher would necessarily be slower, a rewriter designed explicitly for performance, with ease-of-use features pushed outside the core loop, could likely still outperform a standard CAS. Larger-scale search with a worse neural-net heuristic trades expensive GPU time for cheaper CPU time and might still be competitive, as Stockfish is in chess versus more heavily neural-net-driven engines like [LC0](https://lczero.org/) (though this is down to properties of chess's search trees and is not, to my knowledge, true in e.g. Go). There has also been some work in the other direction - more neural net use and less search. [Deep Learning for Symbolic Mathematics](https://arxiv.org/abs/1912.01412) trains standard sequence-to-sequence models to do integration and solve ODEs with apparently good results (compared with traditional computer algebra systems). However, it relies on external methods to generate good training datasets (e.g. for integrals, differentiating random expressions and using integration by parts). [PatternBoost](https://arxiv.org/abs/2411.00566) constructs extremal objects by repeatedly retraining an autoregressive model on the best of its outputs (after improving them with greedy search), and was good enough to solve an open problem despite little compute, somewhat cursed tokenization, and treating the problem as unconditionally sampling from "reasonably good" examples. Some [recent work in reinforcement learning](https://arxiv.org/abs/2408.05804) has also demonstrated directed exploration emerging from a simple contrastive algorithm, which could be useful here - PatternBoost-style construction could be reformulated as an RL problem amenable to this[^5], and a contrastive goal-expression/current-expression-and-rule encoder is almost exactly the necessary distance heuristic I mentioned earlier. A few toy sketches of these ideas follow.
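+
+To make the RIES idea more concrete, here is a minimal Python sketch in the same spirit: enumerate short RPN programs in order of a crude complexity measure, evaluate each one, and keep those whose value lands near a target. Everything here (the operator set, the tolerance, the `find_expressions` helper) is invented for illustration and bears no relation to RIES's actual, far more optimized implementation.
+
+```python
+import math
+from itertools import product
+
+# Toy RIES-flavoured search: brute-force short RPN programs ordered by
+# length (a crude complexity measure), evaluate them, and report any whose
+# value hits a target. All names and constants are made up for illustration.
+
+CONSTANTS = {"1": 1.0, "2": 2.0, "pi": math.pi, "e": math.e}
+UNARY = {"sqrt": math.sqrt, "ln": math.log, "exp": math.exp}
+BINARY = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
+          "*": lambda a, b: a * b, "/": lambda a, b: a / b,
+          "^": lambda a, b: a ** b}
+TOKENS = list(CONSTANTS) + list(UNARY) + list(BINARY)
+
+def eval_rpn(program):
+    """Evaluate an RPN token sequence; return None if it is malformed."""
+    stack = []
+    try:
+        for tok in program:
+            if tok in CONSTANTS:
+                stack.append(CONSTANTS[tok])
+            elif tok in UNARY:
+                stack.append(UNARY[tok](stack.pop()))
+            else:
+                b, a = stack.pop(), stack.pop()
+                stack.append(BINARY[tok](a, b))
+    except (IndexError, ValueError, OverflowError, ZeroDivisionError):
+        return None
+    result = stack[0] if len(stack) == 1 else None
+    return result if isinstance(result, float) else None
+
+def find_expressions(target, max_len=4, tol=1e-3):
+    """Yield (program, value) pairs whose value is within tol of target,
+    enumerated in order of increasing program length."""
+    for length in range(1, max_len + 1):
+        for program in product(TOKENS, repeat=length):
+            value = eval_rpn(program)
+            if value is not None and abs(value - target) < tol:
+                yield program, value
+
+if __name__ == "__main__":
+    # 2.5066... should turn up, e.g., "2 pi * sqrt", i.e. sqrt(2*pi). The
+    # real RIES does this billions of times per second with caching and a
+    # much smarter generator.
+    for program, value in find_expressions(2.5066):
+        print(" ".join(program), "=", value)
+```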
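+
+PatternBoost's outer loop is simple enough to caricature in a few lines. The sketch below is a deliberately crude stand-in, not the paper's code or architecture: the "model" is just a vector of independent per-position probabilities rather than a transformer, and the "extremal" objective is a toy one, but the sample, greedy-improve, keep-the-best, retrain cycle has the same shape.
+
+```python
+import random
+
+# Cartoon of a PatternBoost-style loop (heavily simplified): sample
+# candidate objects from a generative model, polish each with greedy local
+# search, retrain the model on the best of them, and repeat. The "model"
+# here is a vector of independent per-position probabilities so that the
+# sketch runs with no ML dependencies.
+
+N = 30        # objects are bit-strings of length N
+BATCH = 200   # candidates sampled per round
+ELITE = 25    # how many of the best candidates to retrain on
+
+def score(bits):
+    # Toy "extremal" objective: as many ones as possible, but adjacent ones
+    # are penalized (optimum: 15 alternating ones for N = 30).
+    ones = sum(bits)
+    adjacent = sum(1 for a, b in zip(bits, bits[1:]) if a == b == 1)
+    return ones - 2 * adjacent
+
+def greedy(bits):
+    # Local phase: keep applying improving single-bit flips until none help.
+    improved = True
+    while improved:
+        improved = False
+        for i in range(N):
+            candidate = bits[:i] + [1 - bits[i]] + bits[i + 1:]
+            if score(candidate) > score(bits):
+                bits, improved = candidate, True
+    return bits
+
+def sample(probs):
+    return [int(random.random() < p) for p in probs]
+
+def retrain(elite):
+    # Global phase: refit the "model" to the elite set (per-position
+    # frequency of ones, lightly smoothed).
+    return [(sum(b[i] for b in elite) + 1) / (len(elite) + 2) for i in range(N)]
+
+probs = [0.5] * N
+for rnd in range(10):
+    batch = sorted((greedy(sample(probs)) for _ in range(BATCH)),
+                   key=score, reverse=True)
+    elite = batch[:ELITE]
+    probs = retrain(elite)
+    print(f"round {rnd}: best score {score(elite[0])}")
+```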
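+
+And to show where a learned distance would actually plug in, here is a sketch of best-first search over rewrites in which the frontier is ordered by the distance between embeddings of the current expression and the goal. The `embed` function is a placeholder bag-of-characters stub standing in for a trained contrastive encoder, and the rewrite rules are toy regex identities I made up; only the control flow is the point.
+
+```python
+import heapq
+import re
+
+# Best-first rewrite search guided by an embedding distance to the goal.
+# embed() stands in for a trained contrastive encoder (e.g. one whose
+# distance predicts remaining rewrite steps); the rules are toy identities.
+
+RULES = [
+    (r"\(([a-z0-9])\)", r"\1"),   # (t) -> t for a single atom
+    (r"([a-z0-9])\*1", r"\1"),    # t*1 -> t
+    (r"1\*([a-z0-9])", r"\1"),    # 1*t -> t
+    (r"([a-z0-9])\+0", r"\1"),    # t+0 -> t
+    (r"0\+([a-z0-9])", r"\1"),    # 0+t -> t
+    (r"([a-z0-9])\*0", "0"),      # t*0 -> 0
+]
+
+def embed(expr):
+    # Stand-in "encoder": a bag-of-characters vector.
+    return [expr.count(c) for c in "+-*/()x01"]
+
+def distance(u, v):
+    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
+
+def rewrites(expr):
+    # Yield every expression reachable by applying one rule at one position.
+    for pattern, replacement in RULES:
+        for m in re.finditer(pattern, expr):
+            yield expr[:m.start()] + m.expand(replacement) + expr[m.end():]
+
+def guided_rewrite(start, goal, budget=1000):
+    """Best-first search from start towards goal, always expanding the
+    expression whose embedding is closest to the goal's."""
+    goal_vec = embed(goal)
+    frontier = [(distance(embed(start), goal_vec), start, [start])]
+    seen = {start}
+    while frontier and budget:
+        budget -= 1
+        _, expr, path = heapq.heappop(frontier)
+        if expr == goal:
+            return path
+        for nxt in rewrites(expr):
+            if nxt not in seen:
+                seen.add(nxt)
+                heapq.heappush(frontier,
+                               (distance(embed(nxt), goal_vec), nxt, path + [nxt]))
+    return None
+
+if __name__ == "__main__":
+    # e.g. (x+0)*1 -> (x)*1 -> x*1 -> x
+    print(guided_rewrite("(x+0)*1", "x"))
+```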