DeepMind's game-playing AI has broken a 50-year-old computing record
DeepMind has used AlphaZero, its AI built for playing board games, to break a record that had stood for more than 50 years, finding a faster way to solve a fundamental math problem in computer science.
Matrix multiplication is a fundamental operation in a wide variety of fields, from displaying images on a screen to simulating complex physics. It is also a cornerstone of machine learning itself. Accelerating this calculation could have far-reaching effects on the efficiency and cost of thousands of common computing tasks.
"This is a really amazing result," says François Le Gall, a mathematician at Nagoya University in Japan who was not involved in the study. He claims that matrix multiplication is ubiquitous in engineering. The use of matrices is commonplace for solving numerical problems.
Despite its ubiquity, the calculation is still not well understood. A matrix is simply a grid of numbers that can represent anything you like. Multiplying two matrices together typically involves multiplying the rows of one by the columns of the other, a basic technique taught in high school. Pushmeet Kohli, head of DeepMind's AI for Science team, describes it as "the ABC of computing."
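As a concrete reminder of that textbook method, here is a minimal Python sketch of the row-by-column procedure (a plain illustration, not DeepMind's code):

```python
def matmul(A, B):
    """Textbook matrix multiplication: each entry of the result is
    the dot product of a row of A with a column of B."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):          # row of A
        for j in range(p):      # column of B
            for k in range(m):  # dot product
                C[i][j] += A[i][k] * B[k][j]
    return C

# Two 2x2 matrices cost 2*2*2 = 8 scalar multiplications this way.
print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```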
But going faster is complicated. Nobody knows the best algorithm for solving it, says Le Gall. It is one of the biggest open problems in computer science.
One reason is that there are more ways to multiply two matrices together than there are atoms in the universe (10 to the power of 33, for some of the cases the researchers looked at). "The number of possible actions is almost infinite," says Thomas Hubert, an engineer at DeepMind.
The trick was to turn the problem into a kind of three-dimensional board game, called TensorGame. The board represents the multiplication problem to be solved, and each move represents the next step in solving it. The series of moves made in a game therefore represents an algorithm.
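Under the hood, the "board" is the three-dimensional tensor that encodes matrix multiplication, and each move subtracts a rank-one tensor from it; the game is won when the board reaches all zeros, and the number of moves equals the number of multiplications the resulting algorithm uses. A rough numpy sketch of that setup, based on the tensor formulation in DeepMind's paper (names and layout here are illustrative, not DeepMind's code):

```python
import numpy as np

def matmul_tensor(n):
    """The 3D tensor encoding n x n matrix multiplication, with matrix
    indices flattened: C[i,j] = sum_k A[i,k] * B[k,j] becomes
    T[i*n+k, k*n+j, i*n+j] = 1."""
    T = np.zeros((n * n, n * n, n * n), dtype=int)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[i * n + k, k * n + j, i * n + j] = 1
    return T

def play_move(board, u, v, w):
    """One TensorGame move: subtract the rank-one tensor u (x) v (x) w.
    Each move corresponds to one scalar multiplication in the algorithm."""
    return board - np.einsum("i,j,k->ijk", u, v, w)

board = matmul_tensor(2)
# The game is won when the board is all zeros; the number of moves taken
# is the number of multiplications the discovered algorithm needs.
print(board.shape, "won:", not board.any())
```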
The team trained a new version of AlphaZero, called AlphaTensor, to play this game. Instead of learning the best series of moves to make in Go or chess, AlphaTensor learned the best series of steps to take when multiplying matrices. It was rewarded for winning the game in as few moves as possible.
Hubert, one of the lead researchers on AlphaZero, says the team transformed the problem into a game because games are its preferred framework.
In a paper published in Nature today, the researchers describe their findings. AlphaTensor discovered a way to multiply two four-by-four matrices that is faster than a method devised in 1969 by the German mathematician Volker Strassen, which nobody had been able to improve on since. The basic high-school method takes 64 multiplications, and Strassen's takes 49. AlphaTensor found an algorithm that does it in just 47.
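For reference, Strassen's trick at the two-by-two level replaces the 8 textbook multiplications with 7; applied recursively to the four two-by-two blocks of a four-by-four matrix, that gives 7 × 7 = 49 multiplications. A sketch of the two-by-two case (this is the standard Strassen scheme, not AlphaTensor's newly discovered 47-multiplication algorithm):

```python
def strassen_2x2(A, B):
    """Strassen's 2x2 scheme: 7 multiplications instead of 8. Applied
    recursively to 2x2 blocks, a 4x4 product costs 7 * 7 = 49."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```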
Overall, AlphaTensor beat the best existing algorithms for more than 70 different sizes of matrix. It reduced the number of steps needed to multiply two nine-by-nine matrices from 511 to 498, and the number required for multiplying two 11-by-11 matrices from 919 to 896. In many other cases, AlphaTensor rediscovered the best existing algorithm.
The researchers were surprised by how many different correct algorithms AlphaTensor found for each size of matrix. DeepMind researcher Hussein Fawzi finds it mind-boggling that there are at least 14,000 ways to multiply four-by-four matrices.
Having found what were, in theory, the fastest algorithms, the DeepMind team wanted to know which would also be fast in practice. Because chips are often designed for specific types of computation, different algorithms can fare better or worse on different hardware. The team used AlphaTensor to look for algorithms tailored to the Nvidia V100 GPU and Google's TPU, two of the chips most commonly used for training neural networks. The algorithms it found were 10% to 20% faster at matrix multiplication on those chips than the ones typically used.
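In practice this comes down to micro-benchmarking: run each candidate algorithm on the target chip and keep the fastest. Here is a toy illustration of the idea in Python, timing a hardware-tuned library routine against the naive triple loop; the 10% to 20% figures above come from DeepMind's far more careful GPU and TPU benchmarks, not from anything like this:

```python
import timeit
import numpy as np

def naive(A, B):
    # Textbook triple loop; its memory-access pattern maps poorly
    # onto modern hardware.
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A, B = np.random.rand(64, 64), np.random.rand(64, 64)
# The same mathematical operation, two implementations: which wins
# depends on how well each algorithm maps onto the chip running it.
print("naive:", timeit.timeit(lambda: naive(A, B), number=1))
print("tuned:", timeit.timeit(lambda: A @ B, number=1))
```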
Virginia Vassilevska Williams, a computer scientist at MIT's Computer Science and Artificial Intelligence Laboratory, is thrilled by the results. She notes that people have been using computational approaches to discover new algorithms for matrix multiplication for some time, and that many of the fastest algorithms in use today were devised this way. Yet none had managed to improve on Strassen's long-standing result for four-by-four matrices.
Williams says the new approach "does something completely different" from previous ones. She is curious to see whether the new method supersedes the older ones, or whether the two could be combined to produce something even better.
DeepMind now plans to use AlphaTensor to search for other types of algorithms. Kohli describes it as "a novel approach to the study of computing."