Loft Orbital and NASA JPL Successfully Deploy Google DeepMind’s Gemma 3 in Orbit

A Paradigm Shift in Orbital Intelligence
For decades, the workflow of Earth observation has been a linear, cumbersome process: a satellite captures massive amounts of raw imagery, beams it back to a ground station, and waits for human analysts or terrestrial servers to parse the data. This ‘store-and-forward’ model creates a significant bottleneck, often delaying critical insights by hours or even days.
That model was fundamentally challenged in April when the Yam-9 satellite, operated by space infrastructure firm Loft Orbital, successfully identified specific targets autonomously. This wasn’t a simple object-detection algorithm—the kind that looks for a specific pixel pattern of a ship or a plane. Instead, the spacecraft utilized a Vision-Language Model (VLM) to interpret complex natural language queries and find corresponding visual data directly in orbit.
- Real-Time Triage: Satellites can now decide what data is important enough to send back, drastically reducing bandwidth waste.
- Natural Language Interface: Operators can query satellites using human language rather than complex coordinate-based tasking.
- Edge Compute Milestone: The success proves that sophisticated AI like Google DeepMind’s Gemma 3 can operate within the harsh power and thermal constraints of space.
This milestone, achieved through a collaboration between Loft Orbital and NASA’s Jet Propulsion Laboratory (JPL), signals the transition from ‘dumb’ sensors to intelligent agents capable of independent reasoning while orbiting the planet.
The Technical Engine: Gemma 3 and NAVI-Orbital
The intelligence powering Yam-9 is Gemma 3, a lightweight vision-language model developed by Google DeepMind. Unlike traditional LLMs that process only text, a VLM can ‘see’ and ‘read’ simultaneously, allowing it to understand the relationship between a textual description (e.g., “infrastructure around a railway hub”) and a visual image.
However, deploying a model designed for data centers into the vacuum of space presents immense technical hurdles. Space-grade hardware cannot match the TFLOPS of a terrestrial H100 cluster. To bridge this gap, the team used the Nvidia Jetson AGX Orin, a high-performance system-on-module designed for edge AI. Even with this hardware, the software required significant optimization.
Juan Delfa Victoria, a technical leader within NASA JPL’s AI group, spearheaded the development of NAVI-Orbital. This software package acted as the operational ‘harness’ for Gemma 3. The engineering challenge was not the model itself, which is available off the shelf, but stripping away unnecessary libraries and memory overhead so the model could run within the tight RAM and power envelopes of the Yam-9 platform.
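To see why memory is the binding constraint, a rough rule of thumb is that a model’s weights occupy about (parameter count × bytes per parameter). The sketch below applies that rule; the 4-billion-parameter size and the precisions listed are illustrative assumptions, since the article does not say which Gemma 3 variant or numeric format flew on Yam-9.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# The parameter count and precisions are illustrative assumptions,
# not figures reported for the Yam-9 deployment.

BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for model weights in gigabytes (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    params = 4e9  # hypothetical 4-billion-parameter model
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gb(params, precision):.1f} GB")
```

This counts weights only. Activations, the vision encoder’s working buffers, and the runtime’s own libraries add to the total, which is the overhead a harness like the one described would need to trim.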
The Execution of the Demonstration
During the April tests, researchers tasked the VLM with identifying complex environmental intersections. Rather than searching for a specific GPS coordinate, they queried the model to find areas where “natural environments meet human development.” The VLM analyzed the sensor stream in real time, identified the corresponding visual signatures, and flagged them as areas of interest.
This capability transforms the satellite from a camera into an analyst. By processing data at the edge, Yam-9 avoids the ‘data deluge’, the phenomenon where satellites collect more information than their downlink pipes can possibly transmit.
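The flag-what-matters loop described above can be sketched in a few lines. This is a minimal illustration, not the NAVI-Orbital implementation: `Frame`, `triage`, and `stub_score` are hypothetical names, and `stub_score` is a placeholder that stands in for a real VLM call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Frame:
    """Hypothetical container for one captured image."""
    frame_id: int
    pixels: bytes  # placeholder for raw image data


def triage(
    frames: Iterable[Frame],
    query: str,
    score: Callable[[Frame, str], float],
    threshold: float = 0.5,
) -> List[int]:
    """Return the IDs of frames whose relevance score meets the threshold."""
    return [f.frame_id for f in frames if score(f, query) >= threshold]


def stub_score(frame: Frame, query: str) -> float:
    """Stand-in for a VLM relevance score in [0, 1]; NOT a real model."""
    return (frame.frame_id % 4) / 4


if __name__ == "__main__":
    frames = [Frame(i, b"") for i in range(8)]
    query = "natural environments meet human development"
    print(triage(frames, query, stub_score))  # IDs kept for downlink
```

Passing the scorer in as a function keeps the triage logic independent of whichever model produces the score, so the threshold can be tuned separately from the model.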
What This Means for the Space Economy
The implications of this breakthrough extend beyond technical curiosity; they impact the valuation and utility of space-based assets. Currently, the value of a satellite is often tied to its sensor resolution and orbit. With onboard AI, the value shifts toward the intelligence layer.
Reducing the Downlink Bottleneck
Most satellites are limited by their X-band or Ka-band downlink speeds. When a satellite captures a high-resolution image of a city, it might generate gigabytes of data, of which perhaps only 1% is relevant to the user. With ‘orbital triage,’ a VLM can discard the irrelevant remainder (such as cloud cover or empty fields) and transmit only the critical findings, effectively multiplying the usable bandwidth of the constellation.
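To make the arithmetic concrete, here is a small sketch using the illustrative 1% figure from the paragraph above. The capture size (10 GB) and link rate (200 Mbps) are assumptions chosen for round numbers, not Yam-9 specifications.

```python
def downlink_seconds(data_gb: float, link_mbps: float) -> float:
    """Time to send data_gb gigabytes over a link of link_mbps megabits/s."""
    bits = data_gb * 8e9  # 1 GB = 8e9 bits (decimal gigabytes)
    return bits / (link_mbps * 1e6)


def effective_gain(relevant_fraction: float) -> float:
    """Upper bound on how much more relevant data fits in the same link time."""
    return 1.0 / relevant_fraction


if __name__ == "__main__":
    capture_gb = 10.0   # assumed size of one raw capture
    link_mbps = 200.0   # assumed downlink rate
    relevant = 0.01     # illustrative 1% relevance figure

    raw = downlink_seconds(capture_gb, link_mbps)
    triaged = downlink_seconds(capture_gb * relevant, link_mbps)
    print(f"raw: {raw:.0f} s, triaged: {triaged:.0f} s, "
          f"gain up to {effective_gain(relevant):.0f}x")
```

The gain is an upper bound: a real system would also transmit metadata, and would likely keep some margin of borderline data in case the model misjudges relevance.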
The Move Toward ‘Always-On’ Patrol Layers
Paul Lasserre, head of AI at Loft Orbital, suggests that this opens the door to autonomous patrol layers. In a traditional setup, a user must request an image of a specific border. In an AI-driven setup, the user provides a logic-based instruction: “Monitor this border and alert me immediately when suspicious activity occurs.”
This creates a proactive rather than reactive intelligence system. Instead of reviewing images after the fact, the AI provides real-time triggers, allowing ground operators to respond to events as they happen.
The Competitive Landscape: Planet Labs and Kepler
Loft Orbital is not alone in the pursuit of orbital compute. The industry is rapidly pivoting toward edge processing to maintain a competitive edge.
| Company | Hardware/Approach | Current AI Capability |
|---|---|---|
| Loft Orbital | Nvidia Jetson AGX Orin | Vision-Language Models (VLM) for autonomous querying. |
| Planet Labs | Jetson Orin Processors | Primarily object detection; currently researching VLMs. |
| Kepler Communications | Large-scale GPU clusters in space | Undisclosed use cases via partner NDAs. |
Planet Labs, which operates one of the largest constellations of Earth-imaging satellites, has confirmed it is researching VLM applications. Meanwhile, Kepler Communications has positioned itself as the ‘cloud provider’ of space, operating the largest group of GPUs in orbit. Citing non-disclosure agreements, the company has not confirmed any VLM deployment, but it has noted multiple ‘undisclosed use cases’ for its compute environments since January.
Beyond Earth: The Path to Deep Space Assistants
While the current focus is on Earth observation, the DNA of the NAVI-Orbital project is rooted in deep space exploration. The original conceptualization by JPL researcher Taran Cyriac John was aimed at assisting astronauts on the Moon or Mars.
The physical constraints of extravehicular activity (EVA) are severe. Astronauts in pressurized suits cannot use keyboards or complex touchscreens. An interactive, voice-and-vision-capable AI assistant, similar to those seen in science fiction, would allow an astronaut to ask, “What is this rock formation?” or “Where is the nearest oxygen leak?” and receive an immediate, visually contextualized answer without needing to communicate back to Earth. For Mars, the one-way signal delay alone ranges from roughly 3 to 22 minutes, depending on the planets’ positions.
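The Mars delay is simply light travel time, so it can be checked with one division. The sketch below uses approximate closest and farthest Earth-Mars distances as assumptions; real values vary with each orbital cycle.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact by definition

# Approximate Earth-Mars distances (assumed round figures).
MARS_CLOSEST_KM = 54.6e6
MARS_FARTHEST_KM = 401e6


def one_way_delay_minutes(distance_km: float) -> float:
    """One-way signal travel time in minutes for a given distance."""
    return distance_km / SPEED_OF_LIGHT_KM_S / 60


if __name__ == "__main__":
    print(f"closest:  {one_way_delay_minutes(MARS_CLOSEST_KM):.1f} min")
    print(f"farthest: {one_way_delay_minutes(MARS_FARTHEST_KM):.1f} min")
```

A question and its answer need a round trip, so the wait for a reply from Earth is double these figures before any human response time is added.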
Technical Limitations and Engineering Trade-offs
Despite the success, the road to a fully autonomous constellation is fraught with hardware limitations. The deployment of Gemma 3 on Yam-9 highlights three primary constraints:
- Thermal Management: GPUs generate significant heat. In the vacuum of space, there is no air to carry heat away via convection, meaning all thermal management must rely on radiation and conduction. Running a VLM at full tilt can quickly overheat a spacecraft.
- Radiation Hardening: High-energy particles in orbit can cause ‘bit flips’ (single-event upsets) in memory. While the Jetson Orin is powerful, it is a commercial part that is not radiation-hardened the way older, slower space-grade processors are, so it requires sophisticated software error correction.
- Power Budget: Every watt used by the GPU is a watt taken away from the sensors or the propulsion system. Finding the equilibrium between AI inference and spacecraft longevity is a constant balancing act.
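One classic software defense against bit flips is triple modular redundancy: keep three copies of a value and take a bitwise majority vote when reading it. The sketch below illustrates that general technique only; the article does not describe which error-correction scheme Yam-9 or NAVI-Orbital actually uses.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three copies.

    Each output bit takes the value held by at least two of the inputs,
    so a flipped bit in any single copy is outvoted.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    original = 0b1011_0010
    corrupted = original ^ (1 << 3)  # simulate one bit flip in one copy

    recovered = majority_vote(original, corrupted, original)
    print(recovered == original)
```

The scheme triples memory use and fails if the same bit flips in two copies, which is one reason the power and memory trade-offs in this list are linked.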
Frequently Asked Questions
What is a Vision-Language Model (VLM)?
A Vision-Language Model (VLM) is a type of artificial intelligence that combines a vision encoder (to understand images) with a language model (to understand text). Unlike a conventional image classifier that can only label an image as “dog” or “cat,” a VLM can understand complex prompts like “Find the building with the red roof next to the river” and locate it within an image.
How does this differ from existing satellite AI?
Most existing satellite AI uses discriminative models, which are trained to recognize specific, predefined patterns (e.g., spotting a specific type of cargo ship). The VLM on Yam-9 is a generative model that can interpret new natural language queries without being retrained for every new object it looks for.
Why is the Nvidia Jetson Orin used in space?
The Nvidia Jetson Orin provides the necessary tensor cores and GPU acceleration required to run deep learning models. It offers a high performance-per-watt ratio, which is critical for satellites that rely on limited solar power.
Can this AI be used for military purposes?
While the Yam-9 demonstration focused on civilian infrastructure and environment, the ability to autonomously detect “suspicious activity” or specific military assets in real time has significant implications for intelligence, surveillance, and reconnaissance (ISR) capabilities.
Does this mean satellites can now ‘think’ for themselves?
Not in the sense of consciousness. The AI is performing complex pattern matching and probabilistic reasoning based on its training data. It is following a set of logical instructions provided by humans, but it is doing so without needing a human to guide it through every single image.