MuZero, DeepMind's new artificial intelligence, can master a variety of games without being told their rules in advance

DeepMind is a subsidiary of Alphabet. The company rose to prominence in 2016, when its AlphaGo artificial intelligence system, which honed its play over millions of virtual games, reached a level of skill in the Chinese game of Go that even a world champion could not seriously resist. In 2018, the more advanced AlphaZero system achieved similar results in three logic games at once: Go, chess, and shogi (Japanese chess). And now DeepMind presents its next creation, the MuZero system, which does not even need to be given the basic rules of a game in order to learn it and become skilled at it.

Training the MuZero system begins with the system making its first move (or moves) and exploring the possibilities that the rules of the game allow. As it does so, the system analyzes the “bonuses” the game awards for correct actions: in the case of “Pac-Man” these are the yellow dots eaten, and in the case of chess, moves that bring a win decisively closer. The system then begins to develop its abilities, playing against the opponent again and again and trying to collect more bonuses.
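The idea of learning a game from experience and then planning with what was learned can be sketched in a few lines of Python. This is only a loose illustration under stated assumptions, not MuZero's actual method: MuZero relies on neural networks and tree search, whereas this sketch uses a simple lookup table and value iteration. The `HiddenCorridor` game, the `learn_model` and `plan` functions, and all parameter values are hypothetical names invented for the example.

```python
import random


class HiddenCorridor:
    """Toy game whose rules the agent never reads directly (hypothetical)."""
    SIZE = 5  # cells 0..4; the only bonus sits in the last cell

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action 0 = left, 1 = right; walls clamp the position
        move = 1 if action == 1 else -1
        self.pos = max(0, min(self.SIZE - 1, self.pos + move))
        done = self.pos == self.SIZE - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


def learn_model(env, episodes=200, max_steps=20, seed=0):
    """Explore with random moves and record what each move did."""
    rng = random.Random(seed)
    model = {}  # (state, action) -> (next_state, reward, done)
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = rng.choice([0, 1])
            next_state, reward, done = env.step(action)
            model[(state, action)] = (next_state, reward, done)
            if done:
                break
            state = next_state
    return model


def plan(model, gamma=0.9, sweeps=50):
    """Value iteration over the learned table; the real game is not touched."""
    states = {s for s, _ in model} | {ns for ns, _, _ in model.values()}
    values = {s: 0.0 for s in states}
    for _ in range(sweeps):
        for s in states:
            returns = [r + (0.0 if d else gamma * values[ns])
                       for (ms, _), (ns, r, d) in model.items() if ms == s]
            if returns:
                values[s] = max(returns)
    policy = {}
    for s in states:
        options = {a: r + (0.0 if d else gamma * values[ns])
                   for (ms, a), (ns, r, d) in model.items() if ms == s}
        if options:
            policy[s] = max(options, key=options.get)  # best known action
    return policy


model = learn_model(HiddenCorridor())
policy = plan(model)
```

Here the agent is never told that moving right leads to the bonus. It only observes the outcome of its own random moves, stores them, and then picks in each cell the action that its stored experience says is most rewarding.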

Learning the rules and improving its level of play at the same time gives the MuZero system a huge advantage in efficiency and “economy” of data usage over previous versions of the system. There is a disadvantage, however: full training of the MuZero system requires quite large computational resources. Once trained, on the other hand, the system needs only a small amount of computing power to make the right decisions quickly, even on very limited hardware such as a smartphone that is far from the most powerful in existence.

The self-learning method implemented here already comes close to DeepMind’s main goal: to create an artificial intelligence system capable of learning on its own, just as small children do. Moreover, this learning method is well suited to teaching artificial intelligence in environments where the end goal or the task as a whole cannot be described accurately and clearly enough. Most of the problems that artificial intelligence will have to solve in the real world in the future belong to this class.

In parallel with its work on games, DeepMind’s researchers have begun their first attempts to use the system for practical purposes. “We are now studying MuZero’s performance in video compression and other areas where, for many reasons, previous generations of systems such as AlphaZero could not be used,” said Thomas Hubert, lead researcher.

Other practical applications for such versatile artificial intelligence systems include the autonomous driving of robotic cars developed by another Alphabet subsidiary, Waymo; the biochemical field, in which the sister program AlphaFold has recently shown impressive results; and much more.