Green AI

Green AI isn’t about saying “no” to intelligence — it’s about asking it to do more with less. Fewer tokens, right-sized models, cleaner runs. Think of it as performance engineering for cognition, with a carbon budget.

A single request might seem trivial — but multiplied by billions, it stacks into datacentres, grids, water systems and silicon supply chains. Design, infra and product choices now shape whether AI becomes sustainable or just another energy sink.

What Green AI Means

  • Token sobriety: say less, mean more. Structure prompts; avoid dumping full docs.
  • Right-sized models: match model size to the real complexity of the task.
  • Fewer, better calls: cache stable outputs; don’t re-run just to “see another take.”
  • Cleaner runs: prefer greener regions/hours; monitor power and water usage effectiveness (PUE and WUE); amortise training impacts over time.
  • Measure & share: track token budgets and show estimated kWh / CO₂ / H₂O per 1k tokens in product documentation.
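The "measure & share" point can be sketched as a tiny footprint estimator. Both constants below are assumed placeholders for illustration; real per-token energy and grid carbon intensity vary widely by model, hardware, datacentre and region.

```python
# Rough per-request footprint estimator ("measure & share").
# Both constants are ASSUMED placeholders, not measured values.
WH_PER_1K_TOKENS = 0.3    # assumed inference energy, Wh per 1k tokens
CO2_G_PER_KWH = 400.0     # assumed grid carbon intensity, gCO2 per kWh

def request_footprint(prompt_tokens: int, output_tokens: int) -> dict:
    """Estimate energy (Wh) and carbon (gCO2) for one model call."""
    total = prompt_tokens + output_tokens
    wh = total / 1000 * WH_PER_1K_TOKENS
    co2_g = wh / 1000 * CO2_G_PER_KWH
    return {"tokens": total, "wh": round(wh, 4), "co2_g": round(co2_g, 4)}
```

Logging this per feature is enough to spot which endpoints burn the token budget.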

Do / Don’t

  • Do: Keep context concise (3–5 bullets), define the output format (JSON, bullet list, schema, etc.).
  • Do: Reuse a stable system prompt; inject only the delta per request.
  • Do: Cap output (max_tokens) and stop once the useful answer appears.
  • Don’t: Paste long PDFs “just in case.” Use RAG (chunk + threshold) instead.
  • Don’t: Spin up the largest model for spell-check — that’s like taking a rocket to buy bread.
  • Don’t: Re-roll endlessly. Log, compare, pick, move on.
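Three of the "Do" items — stable system prompt, per-request delta, capped output — can be combined with caching in one small sketch. `call_model` is a stand-in stub, not a real client API.

```python
import functools

# Reuse one stable system prompt; only the per-request delta varies.
SYSTEM_PROMPT = "You are a concise assistant. Reply as a short bullet list."

def call_model(system: str, delta: str, max_tokens: int) -> str:
    # Stand-in for a real API call; a real client would pass max_tokens
    # (or the provider's equivalent) to cap output length.
    return f"[answer to {delta!r}, capped at {max_tokens} tokens]"

@functools.lru_cache(maxsize=1024)
def ask(delta: str, max_tokens: int = 256) -> str:
    """Identical deltas hit the cache instead of re-running the model."""
    return call_model(SYSTEM_PROMPT, delta, max_tokens)
```

Repeated identical requests cost zero extra compute; `ask.cache_info()` shows the hit rate.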

Pick the Right Model

  • Tiny / small LLMs: tasks like grammar, extraction, or short rewrites can run on models such as Gemma 2B, Llama 3 8B, or Mistral 7B.
  • Mid-size: for reasoning on short contexts or structured outputs, use models like Claude 3 Haiku or GPT-4o mini.
  • Large / flagship: for long-context synthesis, multi-step reasoning or safety-critical work, reserve models such as Claude 3 Opus, GPT-4 Turbo or Gemini 1.5 Pro.
  • Ultra-heavy or multi-model platforms: use only when the cross-modality adds proven value; combine via orchestrators like Mammouth AI to reduce duplicate infrastructure calls.[9]

Rule of thumb: if a regular expression or a 7B model could solve it, don’t wake a 175B-parameter giant. GPU use accounts for ~80% of an AI server’s environmental footprint.[8]
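The tiers above can be wired into a "smallest capable model" router. The tier contents and thresholds here are illustrative assumptions, not a production policy.

```python
# Hypothetical router: pick the smallest capable tier per task.
# Tier names and thresholds are ASSUMPTIONS for the sketch.
TIERS = {"small": "llama-3-8b", "mid": "claude-3-haiku", "large": "claude-3-opus"}
LIGHT_TASKS = {"spellcheck", "extract", "rewrite"}

def route(task: str, context_tokens: int) -> str:
    """Return the cheapest tier that plausibly handles the task."""
    if task in LIGHT_TASKS and context_tokens < 2_000:
        return TIERS["small"]
    if task != "multi-step-reasoning" and context_tokens < 16_000:
        return TIERS["mid"]
    return TIERS["large"]
```

The point is the shape, not the numbers: escalation should be explicit and logged, never the default.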

AI Imagery & Video

Generating visuals and video is where AI turns from clever to energy-hungry. A single 1-minute generative video using models such as Sora or Kling AI involves dozens of high-resolution diffusion passes — roughly equivalent to running a microwave for 1.5–2 days non-stop.[10] A single 1024×1024 image from Midjourney or ChatGPT Image can consume as much energy as sending several hundred emails with attachments.

The problem scales fast: rendering iterations, upscales, and “prompt roulette” multiply compute load by factors of 10–100. Behind every aesthetic experiment lies GPU time, cooling water and silicon wear.

How to Create Responsibly

  • Batch and limit: generate several variations at once, review offline, then refine — don’t re-prompt endlessly.
  • Work small first: test at 512 px, upscale only the selected outputs.
  • Prefer efficient tools: use platforms offering model-selection and low-energy backends (e.g. Leonardo AI Eco Mode, Hugging Face Diffusers Lite).
  • Cache and reuse assets: keep base compositions and re-edit locally instead of regenerating.
  • For professionals: measure cost-per-asset (tokens, Wh, CO₂) and include it in project reporting. A transparent footprint becomes part of good creative practice.
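The "work small first" and "measure cost-per-asset" points compose into a simple batch ledger. The energy figures are assumed placeholders; the structure is what matters.

```python
# Cost-per-asset ledger for the "draft small, upscale keepers" workflow.
# All Wh figures are ASSUMED placeholders for illustration.
WH_PER_512_DRAFT = 1.0    # assumed Wh per 512 px draft
WH_PER_UPSCALE = 3.0      # assumed Wh per upscale pass
WH_PER_FULL_RES = 4.0     # assumed Wh per direct full-resolution render

def batch_wh(drafts: int, kept: int) -> float:
    """Wh for drafting every candidate small, upscaling only keepers."""
    return drafts * WH_PER_512_DRAFT + kept * WH_PER_UPSCALE
```

Under these assumptions, 8 drafts with 2 keepers cost 14 Wh versus 32 Wh for rendering all 8 at full resolution — the gap widens the more you explore before committing.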

Artistic exploration isn’t the enemy — waste is. Fewer, deliberate generations are both greener and more creative.

Beyond Prompts

  • Greener runs: choose cleaner-grid regions or off-peak hours for scheduled tasks.
  • Cache & reuse: embed once, reuse often. Memoize stable outputs server-side.
  • Amortise training: allocate training impact over real, long-term use.
  • Governance: set token budgets per feature; log & review outliers monthly.
  • Platform efficiency: prefer tools that pool multiple models efficiently — e.g. Mammouth AI, which integrates text, vision and reasoning models in a unified interface, allowing selection of the smallest capable model for each task.[9]
  • UX matters: don’t ask users to “try again” 5 times — that’s carbon with extra clicks.

References

  1. Washington Post (26 Aug 2025) — “ChatGPT is an energy guzzler.” washingtonpost.com
  2. Washington Post (19 Jun 2025) — “ChatGPT isn’t great for the planet.” washingtonpost.com
  3. IEA (2025) — Energy and AI. iea.org
  4. IEA (10 Apr 2025) — Data-centre demand could more than double by 2030. iea.org
  5. Li et al. (2023) — “Making AI Less ‘Thirsty’.” arxiv.org
  6. The Guardian (22 May 2025) — AI could reach ~49 % of datacentre power by end-2025. theguardian.com
  7. The Verge (May 2025) — AI may surpass Bitcoin by end-2025. theverge.com
  8. Green IT (2025) — “Environmental and Health Impacts of Artificial Intelligence” (Impacts environnementaux et sanitaires de l’intelligence artificielle). greenit.fr
  9. Veille Techno IT (2025) — “Mammouth AI: a multi-model platform with growing eco considerations.” veilletechno-it.info
  10. RIFS / Helmholtz Zentrum (2025) — “Environmental cost of multimodal AI systems.”

Figures remain indicative; generative imagery and video workloads have among the highest per-output energy intensities in computing.