Updated: May 05, 2025 07:12

AI Summary: At a recent NVIDIA GTC panel, industry leaders discussed the rapid acceleration of robotics, driven by converging advances in AI models, simulation data, and hardware. Key themes included a shift toward learning-based approaches, exemplified by NVIDIA's GR00T project for humanoids, and the grounding of robots in physical reality, which prevents AI hallucination. Panelists predict specialized robots will deliver value soon, with broader adoption and more general-purpose capabilities emerging within the next decade, marking a pivotal time for the field.
The recent NVIDIA GTC panel brought together industry leaders to discuss where humanoid robotics stands today and where it's headed. The panel featured an impressive lineup: Bernt Bornich (1X), Deepak Pathak (Skild AI), Pras Velagapudi (Agility Robotics), Aaron Saunders (Boston Dynamics), and Jim Fan (NVIDIA Gear Lab), moderated by Tiffany Janzen of Tiffentech. Let's dive into the key insights from this fascinating discussion.
Why Robotics is Accelerating Now
For decades, robotics was considered the oldest application of AI, yet it progressed slower than other AI domains. What's changed? According to the panel, several critical factors have converged:
- Advanced AI Models: Foundation models like large language models (LLMs) have provided better reasoning capabilities and multimodal understanding that are crucial prerequisites for robotics
- Data Generation at Scale: GPU-accelerated simulation now enables generating 10 years of training data in just hours
- Hardware Improvements: Robot hardware has become both better and more affordable (from $1.5M for NASA's Robonaut in 2001 to around $40K today)
- Closing the Sim-to-Real Gap: Better physics simulation that can run faster than real-time
- Component Commoditization: Global supply chain advancements in batteries, sensors, and compute components
As Jim Fan put it: "The rest of the models are becoming really, really good, so we can start to tackle robotics a lot more systematically."
Learning by Experience: A Fundamental Shift
One of the most significant paradigm shifts has been moving away from traditional control theory toward learning-based approaches. As Deepak Pathak explained:
Robotics so far has been the field of controls... Controls was not designed for robotics. It had its limelight during World War II for flying planes, missiles... But it's not in the same spirit [as Turing's vision]. It's not childlike learning. In a child, you are not teaching them calculus first to learn how to walk... You are learning by experience.
This shift from programming experience to learning by experience represents a fundamental change in how we approach robotics. Bernt Bornich added that this is about creating a "data flywheel" where robots can learn in the real world, which is where true intelligence emerges.
The GR00T Strategy: NVIDIA's Approach to Humanoid Intelligence
Jim Fan shared insights into NVIDIA's GR00T project, which aims to build a foundation model for humanoid robots. Their strategy follows two key principles:
- Make the model as simple as possible - "Photons to actions," where the model takes pixels from cameras and directly outputs continuous floating-point control values for motors, with no intermediate steps.
- Create a sophisticated data pipeline organized as a pyramid:
  - Top: high-quality real robot data (limited by physics to 24 hours per robot per day)
  - Middle: simulation data using physics engines like Isaac
  - Bottom: internet-sourced multimodal data and neural simulations
This approach resulted in the GR00T N1 model, recently open-sourced as "the world's first open humanoid robot foundation model" at just 2 billion parameters.
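To make the "photons to actions" idea concrete, here is a minimal, purely illustrative sketch: raw camera pixels go in, continuous floating-point motor commands come out, with no hand-coded intermediate representation. The class name, sizes, and the single random linear layer are all hypothetical stand-ins; the real GR00T N1 is a 2-billion-parameter neural network, not anything like this toy.

```python
import random

class PixelsToActionsPolicy:
    """Toy end-to-end policy: flattened pixels -> continuous motor values.

    This is a sketch of the *interface* described on the panel, not of the
    actual GR00T architecture. A single random linear layer stands in for
    learned parameters.
    """

    def __init__(self, num_pixels: int, num_motors: int, seed: int = 0):
        rng = random.Random(seed)
        # One weight per (motor, pixel) pair -- a stand-in for trained weights.
        self.weights = [
            [rng.uniform(-1e-3, 1e-3) for _ in range(num_pixels)]
            for _ in range(num_motors)
        ]

    def act(self, pixels: list[float]) -> list[float]:
        # Map raw pixel intensities directly to motor commands,
        # with no intermediate perception or planning step.
        return [sum(w * p for w, p in zip(row, pixels)) for row in self.weights]


policy = PixelsToActionsPolicy(num_pixels=64, num_motors=4)
frame = [0.5] * 64            # a flattened grayscale camera frame
actions = policy.act(frame)   # continuous floating-point motor commands
print(len(actions))           # 4
```

The design point the sketch captures is that the model's input/output contract is pixels to floats; everything in between is learned rather than engineered.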
The Hardware Challenge: Cross-Embodiment
One fascinating challenge discussed was "cross-embodiment" - the ability of a single AI model to control different robot bodies. As Aaron Saunders noted, hardware matters significantly:
The reality is that mostly what we want to build these machines for extends beyond the simple tabletop tasks... If you want to be lifting big, heavy complex objects or working with something hot... hardware really does matter and it has to co-evolve.
Jim Fan pointed out that even robots of the same model can vary due to manufacturing tolerances, creating a cross-embodiment challenge even within one generation. However, he also noted that humans excel at cross-embodiment (like adapting to control different characters in video games), suggesting this problem is solvable. Bernt Bornich added an intriguing perspective that robots need to be learning engines that can adapt on the fly:
What humans have is not a system which can do many things. It's a system which can learn to do many things. What we are carrying in our head is a learning engine.
Robotics and Hallucination
In a particularly insightful exchange, the panel discussed how robotics differs from other AI applications in its relationship with reality. Deepak pointed out that unlike LLMs, robots cannot "hallucinate" in the same way because they interact with the physical world:
Robots cannot hallucinate. Because if I have to hallucinate what will happen if I push this bottle from here to there? I can just try it. It will drop. I can see... Interaction is the enemy of hallucination.
Bernt shared a practical example of this grounding in reality: a robot they programmed to check and close toilet seats. By interacting with the real world, the robot could verify its perceptions, providing clear feedback on what was correct.
The Timeline: What's Next?
When asked about the next 2-5 years, the panelists gave nuanced perspectives:
- Bernt Bornich (1X): While full realization might take 10 years, in 5 years robots will be widely adopted. "Three to five years, it is pretty much out there amongst most people, and even if not everyone has a robot, people know someone who has a robot."
- Deepak Pathak (Skild AI): Task-specific robots will come much sooner than general ones. "The robot that solves all tasks everywhere, that may be farther. But we will start seeing robots that solve few tasks... And even they are super useful."
- Pras Velagapudi (Agility Robotics): The expectation that robots can be multi-purpose rather than single-purpose is the new waterline. "That's what really drives it - people wanting those things really drives investment and focus."
- Aaron Saunders (Boston Dynamics): We should focus on the rate of progress and "beachheads" - specific use cases where robots deliver value. "Specialist robots that are delivering value in a commercial setting, I think we're going to have that in the next one or two years."
- Jim Fan (NVIDIA): In the short term (2-5 years), we'll fully understand the "embodied scaling laws" for robotics AI. Long term (20 years), robots will accelerate scientific research and even build the next generation of robots. "Everything that moves will be autonomous."
The excitement around humanoid robotics is palpable. We're experiencing a convergence of AI advances, hardware improvements, and paradigm shifts in learning approaches that are accelerating progress at an unprecedented rate. While general-purpose humanoid robots may still be some years away, the panel's insights suggest we're on the cusp of seeing increasingly capable robots performing valuable tasks in our homes, workplaces, and research labs.
As Jim Fan eloquently put it:
We as a generation were born too late to explore the earth, were born too early to travel to other galaxies, were born just in time to solve robotics.