How synthetic data generation is helping train AI

00:00 Speaker A
Now one of the most important things we need to do with physical AI is to create the data that will train the AI in the first place. Where does this data come from?
00:09 Speaker A
Unlike with language, where we created a body of text that we consider the basic facts an AI can learn from, how do we teach an AI the basic facts of physics?
00:20 Speaker A
There are many, many videos, but not enough to capture the variety and type of interaction we need.
00:30 Speaker A
This is where great minds come together and turn what used to be calculations into data.
00:43 Speaker A
Now, using synthetic data generation, grounded and conditioned by the laws of physics, grounded and conditioned by fundamental facts,
00:54 Speaker A
we can now selectively and intelligently generate data that we can use to train AI. For example, here’s what goes into this AI, the Cosmos AI world model. On the left:
01:06 Speaker A
The output of a traffic simulator.
01:10 Speaker A
This traffic simulator output on its own is not enough for an AI to learn from.
01:17 Speaker A
We can take that and put it into a Cosmos base model and create physically based and physically plausible surround video that the AI can now learn from.




