Origin Lab bets $8M on the gaming industry as the next frontier for physical AI training

The data bottleneck for physical AI
For the last few years, the AI gold rush has been fueled by the open internet—vast swathes of Common Crawl data and Reddit threads that taught Large Language Models how to mimic human conversation. But as the industry shifts toward ‘world models’—AI designed to understand physics, spatial awareness, and the tangible rules of the physical environment—the internet is no longer enough. You cannot teach a robot how to navigate a cluttered kitchen or how an object bounces off a wall simply by scraping text.
This gap has created a desperate scramble among AI labs to find high-fidelity, structured data that maps physical interactions. Enter Origin Lab, a startup that has just secured $8 million in seed funding to turn the video game industry into a massive, licensed data farm for the next generation of AI.
Mining the virtual for the physical
The premise is straightforward but technically ambitious: video games are essentially highly sophisticated physics simulators. Whether it is the lighting engines of Unreal Engine 5 or the rigid-body dynamics in a racing sim, game developers have already spent decades perfecting how digital objects move, collide, and react to gravity. To a world-model researcher, this is a goldmine of pre-labeled, controllable environment data.
Origin Lab is positioning itself as the critical infrastructure layer between these two worlds. Rather than having AI labs scrape gameplay footage from Twitch or YouTube without permission (a practice that led to early controversies with OpenAI’s Sora), Origin Lab creates a legal, structured marketplace. It provides the tools to convert game assets and simulated walkthroughs into training sets that labs like Fei-Fei Li’s World Labs or Yann LeCun’s AMI Labs can ingest.
“The AI systems that are being built now need to understand how the physical world works and how things move,” co-CEO and co-founder Anne-Margot Rodde explained. According to Rodde, the infrastructure to connect AI researchers with the gaming industry simply didn’t exist until now. By acting as the bridge, Origin Lab allows game studios to monetize their digital assets in a way that doesn’t interfere with their primary product: the game itself.
The ‘Scale AI’ effect
The investment, led by Lightspeed Ventures with participation from SV Angel, Eniac, Seven Stars, and FPV, signals a broader trend in the AI ecosystem. The industry is moving away from ‘big data’ (quantity) toward ‘smart data’ (quality). We’ve seen this play out with the meteoric rise of Scale AI, which became a powerhouse by providing the human-in-the-loop labeling necessary to make LLMs viable.
Faraz Fatemi, a partner at Lightspeed, notes that revenue at data vendors is scaling sharply because data is now the primary bottleneck for well-capitalized AI labs. When the compute is available and the architectures are stable, the only thing holding back a model’s performance is the quality of the ground-truth data it’s fed.
Legal safeguards in a scrape-heavy era
The timing of Origin Lab’s launch is particularly relevant given the current legal climate. AI companies are increasingly facing lawsuits over the unauthorized use of intellectual property. The previous era of ‘scrape everything and ask for forgiveness later’ is hitting a wall, especially with high-value visual assets.
By focusing on licensed data, Origin Lab avoids the ‘Sora problem,’ where early versions of the video generator were seen regurgitating recognizable game footage. Instead of accidental leakage, Origin Lab proposes a transactional model where game companies are paid for their contribution to AI development. This transforms the relationship between the gaming industry and AI labs from adversarial to symbiotic.
The operational side of the business will likely involve a mix of automated rendering runs and the creation of synthetic datasets—essentially filming thousands of hours of virtual interactions under controlled parameters to ensure the AI learns the ‘correct’ physics of a scene without the noise of a real-world recording.
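To make "controlled parameters" concrete, here is a minimal Python sketch of the general idea: a toy simulation whose settings are fixed up front, so every frame comes with exact ground-truth labels. It is an illustration only, not Origin Lab's pipeline, which has not been described publicly in this detail. The names `SceneParams`, `simulate_bounce`, and `build_dataset` are hypothetical, and a bouncing ball stands in for a real game engine.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical knobs for one simulated episode (illustrative only)."""
    gravity: float = 9.81       # m/s^2, pulling toward the floor
    restitution: float = 0.8    # fraction of speed kept after each bounce
    start_height: float = 2.0   # metres above the floor at the start
    dt: float = 1.0 / 60.0      # seconds per frame (60 fps)
    frames: int = 240           # episode length in frames


def simulate_bounce(params: SceneParams) -> list[dict]:
    """Drop a ball onto a floor at y = 0 and return one labeled record per frame."""
    y, vy = params.start_height, 0.0
    records = []
    for frame in range(params.frames):
        # Semi-implicit Euler step: update velocity first, then position.
        vy -= params.gravity * params.dt
        y += vy * params.dt
        bounced = y < 0.0
        if bounced:
            # Clamp to the floor and reflect the velocity with some energy loss.
            y = 0.0
            vy = -vy * params.restitution
        records.append({"frame": frame, "y": y, "vy": vy, "bounced": bounced})
    return records


def build_dataset(restitutions: list[float]) -> list[dict]:
    """Sweep one parameter and pair each episode with the settings that produced it."""
    dataset = []
    for r in restitutions:
        params = SceneParams(restitution=r)
        dataset.append({"params": asdict(params), "frames": simulate_bounce(params)})
    return dataset


if __name__ == "__main__":
    data = build_dataset([0.5, 0.8])
    print(len(data), "episodes,", len(data[0]["frames"]), "frames each")
```

Because the simulation is deterministic, the same parameters always reproduce the same episode, and the labels (position, velocity, contact events) are exact rather than estimated from pixels. That is the property a real-world recording cannot offer.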