Green AI isn’t about saying “no” to intelligence — it’s about asking it to do more with less.
Fewer tokens, right-sized models, cleaner runs. Think of it as performance engineering for cognition,
with a carbon budget.
A single request might seem trivial — but multiplied by billions, it stacks into datacentres, grids,
water systems and silicon supply chains. Design, infra and product choices now shape whether AI
becomes sustainable or just another energy sink.
What Green AI Means
Token sobriety: say less, mean more. Structure prompts; avoid dumping full docs.
Right-sized models: match model size to the real complexity of the task.
Fewer, better calls: cache stable outputs; don’t re-run just to “see another take.”
Cleaner runs: prefer greener regions/hours; monitor power usage effectiveness (PUE)
and water usage effectiveness (WUE); amortise training impacts over time.
Measure & share: track token budgets and show estimated kWh / CO₂ / H₂O
per 1k tokens in product documentation.
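To make "measure & share" concrete, here is a minimal accounting sketch in Python. All three conversion factors are placeholder assumptions; real values vary widely by model, provider and region, so substitute numbers from your own monitoring. The point is the bookkeeping, not the constants.

```python
# Per-request footprint logging. All three factors below are ASSUMED
# placeholders -- replace with figures for your provider, region and grid.

WH_PER_1K_TOKENS = 0.3    # assumed energy per 1k tokens (Wh)
CO2_G_PER_KWH = 400       # assumed grid intensity (gCO2/kWh)
WATER_L_PER_KWH = 1.8     # assumed datacentre water use (L/kWh)

def estimate_footprint(total_tokens: int) -> dict:
    """Convert a token count into rough kWh / CO2 / H2O estimates."""
    kwh = total_tokens / 1000 * WH_PER_1K_TOKENS / 1000
    return {
        "tokens": total_tokens,
        "kWh": round(kwh, 4),
        "gCO2": round(kwh * CO2_G_PER_KWH, 1),
        "L_H2O": round(kwh * WATER_L_PER_KWH, 2),
    }

print(estimate_footprint(250_000))  # e.g. one feature's daily traffic
```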
Do / Don’t
Do: Keep context concise (3–5 bullets) and define the output format (JSON, bullet list, schema, etc.).
Do: Reuse a stable system prompt; inject only the delta per request.
Do: Cap output (max_tokens) and stop once the useful answer appears (see the sketch after this list).
Don’t: Paste long PDFs “just in case.” Use retrieval-augmented generation (RAG) instead: chunk the source and retrieve only passages above a relevance threshold.
Don’t: Spin up the largest model for spell-check — that’s like taking a rocket to buy bread.
Don’t: Re-roll endlessly. Log, compare, pick, move on.
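Here is a sketch of those habits together, using the OpenAI Python SDK as one example client; the model name, cache size and output cap are illustrative choices, and any chat API with a system role and a token cap works the same way.

```python
from functools import lru_cache
from openai import OpenAI

client = OpenAI()
SYSTEM = "You are a concise copy editor. Reply as a JSON list of edits."  # stable, reused

@lru_cache(maxsize=1024)            # cache stable outputs: identical input, one call
def edit(text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",         # right-sized model for a small task
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},   # inject only the delta
        ],
        max_tokens=256,              # cap output; stop once the answer is done
    )
    return resp.choices[0].message.content
```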
Pick the Right Model
Tiny / small LLMs: tasks like grammar, extraction, or short rewrites can run on models such as
Gemma 2B, Llama 3 8B, or Mistral 7B.
Mid-size: for reasoning on short contexts or structured outputs, use models like
Claude 3 Haiku or GPT-4o mini.
Large / flagship: for long-context synthesis, multi-step reasoning or safety-critical work,
reserve models such as Claude 3 Opus, GPT-4 Turbo or Gemini 1.5 Pro.
Ultra-heavy or multi-model platforms: use only when the cross-modality adds proven value;
combine via orchestrators like Mammouth AI to reduce duplicate infrastructure calls.[9]
Rule of thumb: if a regular expression or a 7B model could solve it, don’t wake a 175B-parameter giant; a routing sketch follows below.
GPU use accounts for ~80% of an AI server’s environmental footprint.[8]
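The rule of thumb can be encoded as a tiny router. This is a sketch under strong assumptions: the tiers, model names and keyword heuristic are placeholders, and call_llm is a hypothetical helper to be wired to your provider; real routers score task complexity rather than matching on task labels.

```python
import re

def route(task: str, text: str) -> str:
    if task == "find_email":                       # deterministic: no model at all
        m = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
        return m.group(0) if m else ""
    if task in {"grammar", "extract", "short_rewrite"}:
        return call_llm("mistral-7b", text)        # small-model tier
    if task == "summarise" and len(text) < 4000:
        return call_llm("claude-3-haiku", text)    # mid-size tier
    return call_llm("flagship", text)              # escalate only when needed

def call_llm(model: str, text: str) -> str:
    # hypothetical helper: wire this to your provider's SDK
    raise NotImplementedError(f"send {len(text)} chars to {model}")
```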
AI Imagery & Video
Generating visuals and video is where AI turns from clever to energy-hungry.
A single 1-minute generative video using models such as Sora or Kling AI
involves dozens of high-resolution diffusion passes and can consume roughly as much energy
as running a microwave non-stop for 1.5–2 days.[10]
A single 1024×1024 image from Midjourney or ChatGPT Image can consume as much energy
as sending several hundred emails with attachments.
The problem scales fast: rendering iterations, upscales, and “prompt roulette”
multiply compute load by factors of 10–100. Behind every aesthetic experiment lies GPU time,
cooling water and silicon wear.
How to Create Responsibly
Batch and limit: generate several variations at once, review offline, then refine;
don’t re-prompt endlessly (see the sketch after this list).
Work small first: test at 512 px, upscale only the selected outputs.
Prefer efficient tools: use platforms offering model-selection and low-energy backends
(e.g. Leonardo AI Eco Mode, Hugging Face Diffusers Lite).
Cache and reuse assets: keep base compositions and re-edit locally instead of regenerating.
For professionals: measure cost-per-asset (tokens, Wh, CO₂) and include it in project reporting.
A transparent footprint becomes part of good creative practice.
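Here is "work small first" in practice, as a sketch with Hugging Face Diffusers, assuming a local CUDA GPU; the checkpoints, the batch of four drafts and the prompt are illustrative, not recommendations.

```python
import torch
from diffusers import AutoPipelineForText2Image, StableDiffusionUpscalePipeline

prompt = "isometric glass greenhouse at dawn"          # placeholder prompt

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sd-turbo", torch_dtype=torch.float16
).to("cuda")
# one batched 512 px pass instead of many full-size re-rolls
drafts = pipe(prompt, height=512, width=512,
              num_inference_steps=1, guidance_scale=0.0,
              num_images_per_prompt=4).images

chosen = 0  # review the drafts offline, then upscale only the keeper
upscaler = StableDiffusionUpscalePipeline.from_pretrained(
    "stabilityai/stable-diffusion-x4-upscaler", torch_dtype=torch.float16
).to("cuda")
upscaler(prompt=prompt, image=drafts[chosen]).images[0].save("asset.png")
```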
Artistic exploration isn’t the enemy — waste is. Fewer, deliberate generations are both greener and more creative.
Beyond Prompts
Greener runs: choose cleaner-grid regions or off-peak hours for scheduled tasks.
Amortise training: allocate training impact over real, long-term use.
Governance: set token budgets per feature; log & review outliers monthly (sketched below).
Platform efficiency: prefer tools that pool multiple models efficiently
— e.g. Mammouth AI, which integrates text, vision and reasoning models in a unified interface,
allowing selection of the smallest capable model for each task.[9]
UX matters: don’t ask users to “try again” 5 times — that’s carbon with extra clicks.
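A governance sketch to close with: per-feature budgets with outlier logging feeding the monthly review. The budget numbers and the 2× outlier rule are assumptions to tune per product.

```python
import logging
from collections import defaultdict

logging.basicConfig(level=logging.WARNING)
BUDGETS = {"autocomplete": 150, "summarise": 800, "report": 4000}  # tokens/request
usage = defaultdict(list)

def record(feature: str, tokens_used: int) -> None:
    usage[feature].append(tokens_used)
    budget = BUDGETS.get(feature, 500)      # default budget for unlisted features
    if tokens_used > 2 * budget:            # flag outliers for the monthly review
        logging.warning("token outlier: %s used %d (budget %d)",
                        feature, tokens_used, budget)

record("summarise", 2100)   # -> logged as an outlier
```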
References
Washington Post (26 Aug 2025). “ChatGPT is an energy guzzler.” washingtonpost.com
Washington Post (19 Jun 2025). “ChatGPT isn’t great for the planet.” washingtonpost.com