The platform uses diffusion and autoregressive models to simulate every possible outcome in a given scenario and presents them as synthetic video content. (Image credit: Getty Images/VICTOR de SCHWANBERG/SCIENCE PHOTO LIBRARY)
LAS VEGAS — Scientists have developed a new “multiverse simulation” platform that can generate massive amounts of data to train advanced, self-learning robots powered by artificial intelligence (AI).
The tool, called Cosmos, enables researchers to build “world foundation models” — neural networks that replicate real-world conditions and laws of physics to predict reliable outcomes, according to Nvidia, the platform’s creator. These generative AI models can create synthetic data to train embodied or physical AI systems, such as autonomous vehicles (AVs) or humanoid robots.
Training AI systems requires vast amounts of data, but researchers predict that the supply of available training data could run out by 2026. AI systems that interact with the physical environment, such as robots, often require real-world footage, which is difficult and expensive to collect.
However, creating genuinely useful synthetic data is challenging, and one study has previously warned that poorly constructed synthetic data can mislead the models trained on it. Cosmos is designed to address these issues by allowing scientists to quickly produce massive amounts of artificial video content grounded in real physics principles.
“Today’s humanoid developers have hundreds of operators who perform thousands of repeated demonstrations just to learn a few skills,” said Rev Lebaredian, vice president of Omniverse and simulation technologies at Nvidia, at a virtual press conference Monday (Jan. 6) at CES 2025 in Las Vegas. “Autonomous car developers need to drive millions of miles; processing, filtering, and annotating huge amounts of data is even more resource-intensive; and physical testing is risky. Humanoid developers are wasting a lot of time and resources when a single robot prototype can cost hundreds of thousands of dollars.”
Simulation of the multiverse
The core element of the platform is a multiverse simulation, in which Cosmos connects with Nvidia’s Omniverse software system to generate all possible future outcomes in a given scenario. This data is then used to train a robot or self-driving car.
The platform uses diffusion models of the kind used to generate images — machine learning algorithms that learn to create data by adding “noise” to a data set and then learning to remove it — as well as autoregressive models, which are statistical methods that predict the next step in a sequence from the previous steps. Combining the two, the platform can take text, images, or video and then generate frames that predict what will happen next in a given scenario in real time.
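Cosmos itself relies on large neural networks, but the two ideas named above can be illustrated with a deliberately tiny sketch. The example below is hypothetical and uses only NumPy: the “denoising” is a crude average over many noisy copies (a stand-in for what a trained diffusion model learns to do), and the “autoregressive” part fits a one-step linear predictor to a sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Diffusion-style idea (toy): corrupt a clean signal with Gaussian noise,
# then recover an estimate of it — here by averaging many noisy copies,
# standing in for a learned denoiser.
signal = np.sin(np.linspace(0, 2 * np.pi, 100))        # a clean "frame"
noisy = signal + rng.normal(0, 0.5, size=(200, 100))   # 200 noisy versions
denoised = noisy.mean(axis=0)                          # crude "denoising"

# Autoregressive idea (toy): predict the next value of the sequence as a
# linear function of the previous one — an AR(1) model fit by least squares.
x, y = signal[:-1], signal[1:]
a = (x @ y) / (x @ x)       # least-squares slope for y ≈ a * x
next_pred = a * signal[-1]  # one-step-ahead "next frame" prediction
```

In a system like Cosmos, both roles are played by deep networks trained on video rather than by averages and linear fits, but the division of labor is the same: one component generates plausible frames from noise, the other predicts what comes next.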
“This is robotics’ ChatGPT moment. Like large language models, world foundation models are fundamental to the development of robots and drones, but not all developers have the knowledge and resources to train their own,” Jensen Huang, founder and CEO of Nvidia, said in a statement. “We built Cosmos to make physical AI more accessible and make general robotics easier for every developer.”
World foundation models created with Cosmos are also available under an open source license.
Source: www.livescience.com