What if every AI query you made carried a hidden environmental cost? It’s not science fiction—it’s reality. As AI systems power everything from customer support chatbots to medical diagnostics, their energy, water, and carbon footprints are growing. At Google, we set out to measure that impact—not with hypothetical models, but with real data from our production AI serving infrastructure.
For years, industry estimates of AI’s environmental cost relied on idealized lab conditions or partial calculations. They missed crucial variables: hardware utilization, cooling demands, network latency, and even the water needed to cool data centers. We realized that without accurate baselines, we couldn’t optimize for sustainability. So we built our own framework.
Our research, published today, reveals the first real-world measurements of AI inference at scale. The median Gemini text prompt now uses just 0.24 watt-hours of energy, the equivalent of watching TV for less than nine seconds. It emits 0.03 grams of CO2e and consumes 0.26 milliliters of water, roughly five drops. These figures may seem small, but multiplied across billions of daily queries, the aggregate footprint is substantial.
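As a quick sanity check on those equivalences, here is the arithmetic, assuming a roughly 100 W television and about 0.05 mL per drop of water (both round-number assumptions of ours, not figures from the study):

```python
# Back-of-the-envelope check of the per-prompt equivalences.
ENERGY_WH = 0.24   # median energy per Gemini text prompt (Wh)
WATER_ML = 0.26    # water consumed per prompt (mL)

TV_WATTS = 100     # assumed draw of a typical television
DROP_ML = 0.05     # assumed volume of one water drop

tv_seconds = ENERGY_WH / TV_WATTS * 3600  # Wh / W = hours; convert to seconds
drops = WATER_ML / DROP_ML

print(f"TV equivalent: {tv_seconds:.1f} s")  # ~8.6 s, under nine seconds
print(f"Water: ~{drops:.0f} drops")          # ~5 drops
```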
This isn’t just a win for Google—it’s a breakthrough for the entire AI ecosystem. Over the past 12 months, we reduced the energy and carbon footprint per Gemini prompt by 33x and 44x, respectively. How? Not through one magic upgrade, but through relentless optimization across the full stack: more efficient model architectures, smarter routing of queries, improved chip utilization, better cooling systems, and data center location choices aligned with renewable energy grids.
Consider the real-world impact. A financial services firm using an AI-driven fraud detection system might process 10 million queries a day. Before optimization, that could mean roughly 190 kWh of electricity daily, about six days' worth of a typical US household's consumption. After applying similar efficiency techniques, the same volume could drop to under 6 kWh. That isn't just cost savings; it's measurable climate impact.
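A quick sketch of that arithmetic, assuming the same ~33x energy reduction cited above and an average US household draw of about 30 kWh per day (our assumption):

```python
# Hypothetical fraud-detection workload from the example above.
QUERIES_PER_DAY = 10_000_000
BEFORE_KWH = 190.0        # illustrative pre-optimization daily total
REDUCTION_FACTOR = 33     # energy-efficiency gain, mirroring the 33x above
HOME_KWH_PER_DAY = 30.0   # assumed average US household consumption

wh_per_query = BEFORE_KWH * 1000 / QUERIES_PER_DAY
after_kwh = BEFORE_KWH / REDUCTION_FACTOR

print(f"Cost per query before: {wh_per_query:.3f} Wh")                     # 0.019 Wh
print(f"Daily total after:     {after_kwh:.1f} kWh")                       # ~5.8 kWh
print(f"Household equivalent:  {BEFORE_KWH / HOME_KWH_PER_DAY:.1f} days")  # ~6.3 days
```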
Another example: a healthcare startup relying on AI to analyze radiology images. Efficient AI inference means faster diagnosis, lower operational costs, and less electricity drawn from fossil-fueled grids. By choosing infrastructure in regions with higher renewable penetration, the startup shrinks its carbon footprint further, demonstrating that sustainability isn't a trade-off against performance but an accelerator.
We’re sharing this methodology openly because sustainability in AI can’t be an afterthought. It must be engineered in. Whether you’re a startup scaling your first ML model or an enterprise running complex inference pipelines, you can start today:
- Measure what matters: Track energy, carbon, and water per inference, not just latency or accuracy. Use tools like the Green Algorithms calculator or build custom telemetry into your pipeline (a minimal telemetry sketch follows this list).
- Optimize holistically: Efficient models are only part of the solution. Ensure your cloud provider uses renewable energy, adopt dynamic scaling, and reduce over-provisioning. Even small tweaks like batching queries together can cut energy use by 20-40% (a toy model of why appears below).
- Demand transparency: Ask your AI vendors for emissions data per request. The more organizations ask, the more accountable the industry becomes.
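Here is the telemetry sketch referenced in the first point. It is a minimal illustration, not a production meter: read_accelerator_power_watts is a hypothetical hook you would wire to NVML, RAPL counters, or your provider's power telemetry, and the grid-intensity and water figures are placeholder assumptions.

```python
import time
from dataclasses import dataclass

GRID_G_CO2E_PER_KWH = 400.0  # assumed grid carbon intensity (gCO2e/kWh)
WUE_L_PER_KWH = 1.1          # assumed water usage effectiveness (L/kWh)

def read_accelerator_power_watts() -> float:
    """Hypothetical hook: return current device power draw in watts.

    In practice, wire this to NVML (nvidia-smi), RAPL counters, or
    your cloud provider's power telemetry.
    """
    return 250.0  # stand-in constant for illustration

@dataclass
class InferenceFootprint:
    energy_wh: float
    co2e_g: float
    water_ml: float

def measure(infer_fn, *args):
    """Run one inference call and estimate its energy, carbon, and water cost."""
    start_power = read_accelerator_power_watts()
    start = time.monotonic()
    result = infer_fn(*args)
    elapsed_s = time.monotonic() - start
    avg_watts = (start_power + read_accelerator_power_watts()) / 2

    energy_kwh = avg_watts * elapsed_s / 3600 / 1000
    footprint = InferenceFootprint(
        energy_wh=energy_kwh * 1000,
        co2e_g=energy_kwh * GRID_G_CO2E_PER_KWH,
        water_ml=energy_kwh * WUE_L_PER_KWH * 1000,  # liters -> milliliters
    )
    return result, footprint

# Usage: result, fp = measure(model.predict, batch)
```

Sampling power continuously during the call would be more accurate than the two-point average shown here, but the structure is the same: attribute measured watts and seconds to each request, then convert through your grid and cooling factors.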
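And for the batching claim in the second point, a toy model of why it helps: each forward pass pays a fixed overhead (weight loads, kernel launches, host round-trips) that a batch amortizes across its queries. The two cost constants are made-up numbers for illustration:

```python
# Toy energy model: fixed overhead per forward pass plus a marginal
# cost per query; batching spreads the overhead across more queries.
FIXED_OVERHEAD_WH = 0.10  # assumed cost per forward pass (Wh)
PER_QUERY_WH = 0.15       # assumed marginal cost per query (Wh)

def energy_per_query(batch_size: int) -> float:
    return (FIXED_OVERHEAD_WH + PER_QUERY_WH * batch_size) / batch_size

unbatched = energy_per_query(1)   # 0.250 Wh/query
batched = energy_per_query(8)     # 0.163 Wh/query
print(f"Savings at batch size 8: {1 - batched / unbatched:.0%}")  # ~35%
```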
The future of AI doesn’t need to be carbon-intensive. With thoughtful engineering and collective accountability, we can build systems that are not just intelligent—but responsible. The numbers tell us it’s possible. Now it’s up to all of us to make it standard.