Interactive explainer

How much energy does a Gemini prompt use?

Unpacking Google's methodology for measuring AI inference energy, emissions, and water, and what "median prompt" actually means.

Draft · Elsworth et al., Aug 2025 · arXiv:2508.15734
Draft note: this explainer is still in progress and may change as I refine the framing, calculations, and sourcing.

In August 2025, Google published the first detailed, first-party measurement of AI inference environmental costs in production. The headline: a median Gemini text prompt uses 0.24 Wh of energy, less than the energy of watching nine seconds of TV. But what goes into that number, and how is "median" defined?

01

Reproduce their numbers

Google's methodology measures four energy components for each prompt, then applies conversion factors to get emissions and water use. Adjust the sliders to see how each piece contributes. The defaults match Google's May 2025 numbers.

Energy components
- Active AI Accelerator: 0.14 Wh (slider range 0.01–0.5 Wh). TPU/GPU power during inference.
- CPU & DRAM: 0.06 Wh (slider range 0.01–0.2 Wh). Host system power for active machines.
- Idle Machines: 0.02 Wh (slider range 0–0.1 Wh). Reserved capacity for availability.
Conversion factors
- PUE: 1.09 (slider range 1–1.6). Power Usage Effectiveness; 1.0 means zero cooling overhead. Google fleet: 1.09. Industry average: about 1.55.
- Grid emission factor: 94 gCO2e/kWh (slider range 0–600 gCO2e/kWh). Google market-based: 94. US average: about 380. Coal-heavy grids: 500 and above.
- WUE: 1.15 L/kWh (slider range 0–3 L/kWh). Water Usage Effectiveness. Google: 1.15 L/kWh.
Outputs at the defaults:
- Energy: 0.24 Wh (about 9 s of TV)
- Emissions: 0.023 g CO2e
- Water: 0.25 mL (about 5 drops)
Energy breakdown
The math:
E_active = 0.14 + 0.06 + 0.02 = 0.22 Wh
E_total = 0.22 × 1.09 ≈ 0.24 Wh
CO2e = 0.24 Wh × 94 gCO2e/kWh ÷ 1000 ≈ 0.023 g
Water = 0.22 Wh × 1.15 L/kWh ≈ 0.25 mL
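The arithmetic above can be sketched in a few lines of Python. The constants are the explainer's defaults; the variable names are mine, not Google's.

```python
# Headline calculation using the May 2025 defaults from the explainer.
ACTIVE_ACCELERATOR_WH = 0.14  # TPU/GPU power during inference
CPU_DRAM_WH = 0.06            # host system power for active machines
IDLE_MACHINES_WH = 0.02       # reserved capacity for availability

PUE = 1.09                    # Google fleet Power Usage Effectiveness
GRID_GCO2E_PER_KWH = 94       # market-based grid emission factor
WUE_L_PER_KWH = 1.15          # Water Usage Effectiveness

e_active_wh = ACTIVE_ACCELERATOR_WH + CPU_DRAM_WH + IDLE_MACHINES_WH
e_total_wh = e_active_wh * PUE                   # apply cooling/overhead
co2e_g = e_total_wh * GRID_GCO2E_PER_KWH / 1000  # /1000 converts Wh to kWh
water_ml = e_active_wh * WUE_L_PER_KWH           # L/kWh applied to Wh yields mL

print(f"Energy:    {e_total_wh:.2f} Wh")   # 0.24 Wh
print(f"Emissions: {co2e_g:.3f} g CO2e")   # 0.023 g
print(f"Water:     {water_ml:.2f} mL")     # 0.25 mL
```

Note that, following the explainer's formulas, PUE scales the energy and emissions figures but the water figure is computed from the pre-PUE active energy.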
02

The median trick

Google reports the energy of the median prompt, but "median" here does not mean what many readers expect. Models are ranked from cheapest to most expensive per prompt, and the headline figure is the cost of whichever model serves the 50th-percentile prompt by volume. If a cheap model handles most traffic, the headline median stays cheap even when expensive models exist in the mix.

Models (defaults; each slider allows 100–20,000 prompts/day and 0.01–3 Wh/prompt)
- Flash-Lite: 5,000 prompts/day at 0.08 Wh/prompt
- Flash: 3,000 prompts/day at 0.18 Wh/prompt
- Pro: 1,500 prompts/day at 0.45 Wh/prompt
- (unnamed fourth model): 500 prompts/day at 1.20 Wh/prompt
Median prompt: 0.08 Wh (via Flash-Lite)
Mean prompt: 0.22 Wh (177% higher than the median)
[Chart: cumulative prompt distribution ranked by energy, from 0% (cheapest) to 100% (most expensive), showing Flash-Lite, Flash, and Pro segments with the p50 marker; a companion chart shows energy/prompt by model.]
Why it matters: The median is Flash-Lite at 0.08 Wh, but the volume-weighted mean is 0.22 Wh. The 177% gap means heavy users who trigger expensive models are not reflected in the headline figure. Try shifting prompt volume toward the expensive models to see the median jump.
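The median-by-volume computation can be sketched as follows, using the toy fleet's default numbers. The model names match the chart labels; "model-d" is a placeholder for the unnamed fourth model.

```python
# Fleet mix from the toy example: (name, prompts_per_day, wh_per_prompt).
fleet = [
    ("flash-lite", 5000, 0.08),
    ("flash",      3000, 0.18),
    ("pro",        1500, 0.45),
    ("model-d",     500, 1.20),  # placeholder name for the fourth model
]

# Rank models from cheapest to most expensive per prompt.
fleet.sort(key=lambda m: m[2])

total_prompts = sum(n for _, n, _ in fleet)

# Walk the cumulative prompt distribution to find which model
# serves the 50th-percentile prompt by volume.
cum = 0
for name, n, wh in fleet:
    cum += n
    if cum >= total_prompts / 2:
        median_model, median_wh = name, wh
        break

# Volume-weighted mean, by contrast, counts every prompt's cost.
mean_wh = sum(n * wh for _, n, wh in fleet) / total_prompts

print(f"median: {median_wh} Wh via {median_model}")  # 0.08 Wh via flash-lite
print(f"mean:   {mean_wh:.2f} Wh")                   # 0.22 Wh
print(f"gap:    {mean_wh / median_wh - 1:.0%}")      # 177%
```

Because the cheapest model alone covers the 50th-percentile prompt, the median never sees the expensive tail; the mean does, which is the entire gap the section describes.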
Source: Elsworth et al., "Measuring the environmental impact of delivering AI at Google Scale" (arXiv:2508.15734, Aug 2025). Numbers are based on Table 1 and Section 3 of the paper. This explainer is a simplified interactive reproduction; the actual methodology involves fleet-wide telemetry across thousands of machines.