
New research indicates methods that could reduce the substantial energy consumption of artificial intelligence.
(Image credit: J Studios via Getty Images)
While I enjoy my coffee in my Berlin dwelling and pose a question to Google’s AI chatbot, Gemini, it’s easy to overlook the power required to generate a response. Once the signal reaches my router, it travels, I presume, through copper or fiber-optic conduits to one of Google’s data center facilities. Within the data center’s intricate rows of stacked processors, my query is transformed into numerical data and subjected to billions of calculations to ascertain context and meaning. The reply, once formulated, returns in the blink of an eye.
Data centers — the vital core of the internet, enabling everything from email to web searches — have been around for decades. However, with the escalating popularity of AI for generating text, images, and video, their energy consumption is reaching unprecedented levels. According to Google’s own assessments, processing a text prompt of median length with its AI assistant, Gemini, uses approximately 0.24 watt-hours.
These individual amounts, though small — 0.24 watt-hours is comparable to watching television for roughly nine seconds — are rapidly accumulating. In March 2026, OpenAI projected that over 900 million individuals utilize its AI chatbot, ChatGPT, weekly, resulting in billions of queries daily.
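A back-of-the-envelope calculation shows how quickly such small amounts add up. The sketch below takes the 0.24 watt-hour and nine-second figures from the text. The daily prompt count of 2.5 billion is an assumed placeholder for "billions of queries daily," and applying Google's per-prompt figure to another company's traffic is likewise an assumption made purely for illustration.

```python
# Back-of-the-envelope arithmetic, not a measurement.
WH_PER_PROMPT = 0.24   # from the article: median-length Gemini text prompt
TV_SECONDS = 9         # from the article: roughly nine seconds of television

# Power draw of the television implied by the comparison:
# watt-hours -> watt-seconds, divided by the viewing time in seconds.
implied_tv_watts = WH_PER_PROMPT * 3600 / TV_SECONDS

# ASSUMPTION: 2.5 billion prompts per day, each costing the Gemini median.
# The article only says "billions of queries daily".
ASSUMED_PROMPTS_PER_DAY = 2.5e9

daily_mwh = WH_PER_PROMPT * ASSUMED_PROMPTS_PER_DAY / 1e6  # Wh -> MWh
yearly_gwh = daily_mwh * 365 / 1e3                         # MWh -> GWh

print(f"Implied TV power: {implied_tv_watts:.0f} W")
print(f"Daily energy:     {daily_mwh:.0f} MWh")
print(f"Yearly energy:    {yearly_gwh:.0f} GWh")
```

Under these assumptions the total comes to about 600 megawatt-hours a day. Changing `ASSUMED_PROMPTS_PER_DAY` scales the result proportionally.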
The precise electricity consumed by data centers, whether globally or within the United States, which hosts more than any other country, is not uniformly disclosed by all technology firms, according to Eric Masanet from the University of California, Santa Barbara, who studies data center sustainability. Nevertheless, based on the most recent evaluations by the International Energy Agency, US data centers consumed approximately 224 terawatt-hours of electricity in 2025 — exceeding 5 percent of the nation’s total electricity consumption. This represents a substantial increase from an estimated 1.9 percent usage in 2018, well before the widespread adoption of generative AI.
This electricity demand is poised for a significant surge. In their pursuit of market dominance for generative AI products, corporations such as Google, Meta, Amazon, OpenAI, Anthropic, Microsoft, and Oracle are committing tens to hundreds of billions of dollars to construct AI-centric data centers. In contrast to data centers of the pre-AI era, which might consume, for instance, 100 megawatts of electricity — sufficient to power 83,000 homes with average usage — the newer facilities are frequently “hyperscale” and can demand a gigawatt or more, equivalent to roughly one-tenth of the electrical capacity of Los Angeles.
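The household comparison can be unpacked with simple arithmetic. The sketch below uses only the figures quoted above and assumes, for illustration, that the 100-megawatt figure is a steady draw spread evenly across the 83,000 homes and that the same ratio holds for a one-gigawatt facility.

```python
# Figures quoted in the article.
PRE_AI_CENTER_MW = 100
HOMES_POWERED = 83_000
HYPERSCALE_MW = 1_000    # "a gigawatt or more"

HOURS_PER_YEAR = 8_760   # 365 days, ignoring leap years

# ASSUMPTION: the facility's draw is constant and shared evenly among homes.
kw_per_home = PRE_AI_CENTER_MW * 1_000 / HOMES_POWERED
kwh_per_home_per_year = kw_per_home * HOURS_PER_YEAR

# ASSUMPTION: the same homes-per-megawatt ratio applies at gigawatt scale.
homes_per_gigawatt = HYPERSCALE_MW / PRE_AI_CENTER_MW * HOMES_POWERED

print(f"Average draw per home: {kw_per_home:.2f} kW")
print(f"Implied yearly use per home: {kwh_per_home_per_year:,.0f} kWh")
print(f"Homes matched by one gigawatt: {homes_per_gigawatt:,.0f}")
```

By this ratio, a single gigawatt-scale facility corresponds to roughly 830,000 average homes.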
Masanet and fellow experts have expressed concern over the fact that much of this demand is being met by power plants fueled by fossil fuels, such as natural gas, the combustion of which releases carbon dioxide that contributes to global warming. A primary factor is that data centers are often established in locations lacking abundant sources of renewable energy, including hydropower, geothermal, solar, or wind power.
Technology companies often mitigate emissions by investing in renewable energy projects elsewhere. However, unless these clean energy installations generate more power than the data centers consume, this approach — at best — maintains the CO2 emissions of the centers at a static level rather than reducing them to zero, which is crucial for halting climate change. “For every megawatt for which we install fossil fuel power,” Masanet observes, “it sets us back on our progress.”
Furthermore, this analysis does not account for the resources utilized in the production of the hardware that outfits new data centers, nor the repercussions for communities situated near them, which frequently contend with air and noise pollution from gas facilities and potential strain on local water supplies, utilized for cooling the data centers.

According to a non-exhaustive database from the International Energy Agency, numerous data centers in the US are situated in the Virginia region.
(Image credit: IEA / ENERGY AND AI OBSERVATORY 2025. CC BY 4.0)
While projections for AI’s energy footprint remain exceptionally difficult to ascertain, particularly given the uncertainty surrounding the returns on AI investments, it is evident to experts that strategies for energy conservation are critically needed. Without them, one 2025 projection suggests that US data centers could soon emit the equivalent of 24 to 44 megatons of CO2 annually, with the latter figure comparable to Norway’s yearly emissions.
Consequently, computer scientists and engineers are re-evaluating some of the power-intensive hardware and software that drive AI. They are engaged in developing energy-efficient algorithms and processor designs, and are meticulously considering the placement and construction methods of data centers.
“AI’s energy cost is not an accident: This is basically a product of how our systems are built,” states Fengqi You, an authority on energy systems at Cornell University. However, he adds, “we could really reshape the trajectory” with the appropriate combination of solutions.
The origins of AI’s energy challenge
To grasp the energy expenditure of AI, it is beneficial to understand large language models (LLMs) — the foundation of AI text generation tools like chatbots and AI assistants — specifically, those based on a framework introduced in 2017 by the machine-learning laboratory Google Brain. This framework, known as transformer architecture, can process text with remarkable speed by concurrently analyzing each word and its relationship to every other word present. It “learns” word associations by calculating the strength of connection between each word and all other words within a given text, examining each word in numerous contexts. (A comparable design is employed for AI image and video generators.)
Computationally, this is achieved by converting words or word segments into numerical values and executing arithmetic operations (additions and multiplications) between them. A key factor in the speed is the ability to perform these calculations in parallel, facilitated by graphics processing units (GPUs) — primarily manufactured by NVIDIA — initially developed for rapid 3D rendering of visuals during video gaming.
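The idea of scoring every word against every other word can be illustrated with a stripped-down version of the attention calculation. This is a minimal sketch, not the architecture of any production model: the two-number word vectors are invented, and real transformers first pass each vector through learned projections, which are omitted here.

```python
import math

def dot(a, b):
    """Multiply matching entries of two vectors and add the results."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Score every vector against every vector, then blend them by those scores.

    SIMPLIFICATION: the raw vectors serve as queries, keys and values;
    a real transformer derives these with learned weights.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        blended = [sum(w * v[i] for w, v in zip(weights, vectors))
                   for i in range(dim)]
        outputs.append(blended)
        all_weights.append(weights)
    return outputs, all_weights

# HYPOTHETICAL two-number vectors standing in for three words.
tokens = ["coffee", "cup", "router"]
embeddings = [[1.0, 0.2], [0.9, 0.3], [-0.5, 1.0]]

outputs, weights = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word is connected to each of the three words. The nested loops consist only of multiplications and additions that do not depend on one another, which is why they suit parallel hardware such as GPUs.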

Manufacturers of the processing chips powering AI calculations are focusing on enhancing chip energy efficiency; examples include the latest AI-specific chips developed by NVIDIA.
(Image credit: NVIDIA)
The initial training of an LLM, necessary to acquire all these relationships, demands substantial energy. Because each word it learns from must be compared against all others within a given text segment, the number of computations the model performs — and consequently, the energy required — increases quadratically with the text length (meaning doubling the text length quadruples the computation count). This escalates rapidly, given that most LLMs are trained on vast amounts of publicly accessible internet text. Some estimates suggest that training GPT-4 — the 2023 iteration of ChatGPT — consumed between 50 and 60 gigawatt-hours of electricity, enough to power San Francisco for three to four days.
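The quadratic growth described above can be checked with a few lines of arithmetic. The sketch counts one comparison for every ordered pair of words in a text segment; treating each comparison as one unit of work is a simplifying assumption.

```python
def pairwise_comparisons(num_words):
    """Comparisons needed when every word is compared against every word."""
    return num_words * num_words

for num_words in (1_000, 2_000, 4_000):
    print(f"{num_words:>5} words -> {pairwise_comparisons(num_words):>12,} comparisons")

# Doubling the length quadruples the count.
ratio = pairwise_comparisons(2_000) / pairwise_comparisons(1_000)
print("Ratio after doubling:", ratio)
```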
However, experts are more concerned about the energy costs associated with using the models to generate data post-training, a process known as inference. “You train once, then you inference for a billion people in the world,” remarks Mosharaf Chowdhury, an AI systems specialist at the University of Michigan, who has been monitoring the electricity consumption of several large language models made available to the public.
This procedure is notably inefficient: Each time transformer models produce a word — by selecting the one with the highest probability of continuing the sequence, based on context — they subject the query and the partially formed response to the model. In doing so, they apply all the parameters calculated during training to interpret language patterns — numbering in the hundreds of billions or even trillions.
“The fact that you have to perform numerous calculations for a single word to be added — that’s a problematic aspect,” states Günter Klambauer, an AI expert at Johannes Kepler University in Austria.
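The word-by-word loop behind that complaint can be sketched with a toy stand-in for a model. Everything here is hypothetical: the probability table replaces a real neural network, and the parameter count of 100 billion is an assumed value at the low end of the "hundreds of billions" mentioned above.

```python
# HYPOTHETICAL toy "model": next-word probabilities keyed by the previous word.
# It looks only at the last word; a real transformer processes the whole
# sequence produced so far.
TOY_MODEL = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"reply": 0.6, "router": 0.4},
    "reply": {"returns": 0.7, "waits": 0.3},
    "returns": {"<end>": 1.0},
}

ASSUMED_PARAMETER_COUNT = 100_000_000_000  # assumption, see lead-in

def generate(prompt, max_words=10):
    """Add one word at a time, always picking the most probable candidate."""
    sequence = list(prompt)
    parameter_uses = 0
    for _ in range(max_words):
        next_probs = TOY_MODEL[sequence[-1]]
        # Every new word is charged one full pass over all parameters.
        parameter_uses += ASSUMED_PARAMETER_COUNT
        next_word = max(next_probs, key=next_probs.get)
        if next_word == "<end>":
            break
        sequence.append(next_word)
    return sequence, parameter_uses

words, uses = generate(["<start>"])
print(" ".join(words[1:]))
print(f"Parameter applications: {uses:,}")
```

Producing three words plus the end marker takes four passes, so the assumed parameter set is applied four times over for a three-word answer.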
Adjusting AI software for energy savings
This realization has spurred interest in smaller language models optimized for specific functions. These are trained with a more focused scope, possess fewer parameters — perhaps in the tens or hundreds of millions — and execute considerably less computation than their larger counterparts. In a 2025 UNESCO publication, computer scientist Ivana Drobnjak of University College London and her colleagues contrasted the energy usage of Meta’s Llama-3.1 language model with smaller AI models designed for particular tasks: DistilBART and t5-small-xsum for summarization, and others for translation or question answering. When employed for their intended purposes, the smaller models consumed over 90 percent less energy than Llama 3.1 for the same tasks.
Consequently, computer scientists have been motivated to incorporate a similar task specialization into LLMs themselves. In “mixture of experts” models, only specific segments of a large model are activated for distinct tasks. These segments “learn to handle different patterns in language,” according to Drobnjak.
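A toy version of this selective activation might look like the following. It is a sketch under stated assumptions, not the design of any named model: the four "experts" are trivial scaling functions, the router weights are invented, and picking the two highest-scoring experts is just one common way to do the selection.

```python
import math

def make_expert(scale):
    """HYPOTHETICAL expert: multiplies its input by a fixed number."""
    def expert(x):
        return [scale * value for value in x]
    return expert

EXPERTS = [make_expert(s) for s in (0.5, 1.0, 2.0, 4.0)]

# HYPOTHETICAL router weights, one row per expert.
ROUTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

def moe_layer(x, top_k=2):
    """Run only the top_k best-scoring experts and blend their outputs."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in ROUTER_WEIGHTS]
    chosen = sorted(range(len(EXPERTS)),
                    key=lambda i: scores[i], reverse=True)[:top_k]

    # Softmax over the chosen experts' scores gives the blending weights.
    highest = max(scores[i] for i in chosen)
    exps = {i: math.exp(scores[i] - highest) for i in chosen}
    total = sum(exps.values())

    output = [0.0] * len(x)
    for i in chosen:
        gate = exps[i] / total
        expert_output = EXPERTS[i](x)  # experts not chosen are never called
        output = [o + gate * e for o, e in zip(output, expert_output)]
    return output, chosen

output, chosen = moe_layer([2.0, 1.0])
print("Experts used:", chosen)
print("Output:", [round(v, 3) for v in output])
```

With `top_k=2`, half of the experts sit idle for any given input, which is where the computational saving comes from.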
This approach is believed to be a contributing factor to R1, an LLM developed by the Chinese company DeepSeek, reportedly exhibiting significantly lower energy consumption than other models (though independent experts have expressed skepticism regarding these figures). Udit Gupta, an expert in electrical and computer engineering at Cornell Tech, notes that LLMs like Gemini or ChatGPT similarly route inquiries to more specialized sub-models. “There’s a lot of work being done on how to assess the complexity of the query or task that’s coming from users and then find the right model,” Gupta explains. (While Google spokesperson Ralf Bremer confirms that the current 0.24 watt-hours for processing median-length Gemini prompts is 33 times more efficient than in 2024, some analysts suspect that processing queries with an LLM still demands more energy than a comparable web search.)
Researchers are also investigating alternative LLM architectures to overcome what Klambauer terms the “quadratic curse” inherent in transformer models.
One alternative, known as a long short-term memory (LSTM) model, circumvents this drastic energy increase by temporarily retaining a summary of the user’s input prompt and the text generated thus far, analogous to remembering key plot points rather than the entire narrative. This enables it to process only the summary, instead of all preceding text, each time it generates a new word. Klambauer indicates that this prevents LSTM’s energy demands from escalating sharply during query responses, consuming approximately 50 percent less energy than transformer-type models for processing texts of around 8,000 words.
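The difference in how the two approaches scale can be shown with a toy cost model. It assumes one unit of work per word looked at, which is a deliberate oversimplification; it illustrates the shape of the growth only and does not reproduce the roughly 50 percent figure quoted above.

```python
def reread_cost(prompt_words, new_words):
    """Toy cost when each new word requires looking at all preceding text."""
    total = 0
    for already_generated in range(new_words):
        total += prompt_words + already_generated
    return total

def summary_cost(prompt_words, new_words):
    """Toy cost when the text is read once and each new word
    only touches a fixed-size summary."""
    return prompt_words + new_words

for length in (1_000, 2_000, 4_000):
    print(f"{length:>5} words: reread={reread_cost(0, length):>10,}  "
          f"summary={summary_cost(0, length):>6,}")
```

In this model, doubling the output length roughly quadruples the cost of rereading but only doubles the cost of working from a summary.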
LSTM models were developed in the 1990s but were sidelined because transformers offered faster training times. However, Klambauer says that recent advances have improved LSTM performance, yielding an enhanced version known as xLSTM. He is collaborating with the Austrian startup NXAI to further develop and optimize xLSTM, “because we think it’s worth it for energy efficiency,” he states.
Nevertheless, major technology corporations have invested substantial time and resources into advancing transformer-based models, making a transition to alternative architectures economically challenging, according to Wolfgang Maaß, an AI and business informatics researcher at the German Research Center for Artificial Intelligence. “We have to see whether this becomes as dominant, or whether it finds a niche in the whole market.”
Computing with wafers and light
Although experts contend that the most immediate energy savings will arise from software optimizations, some are also targeting the power-intensive processing chips that underpin AI computations. Engineers have progressively improved chip efficiency by increasing the computational capacity within individual processors, thereby reducing the energy needed to transfer data between cooperating chips performing AI calculations. This has been achieved by shrinking the physical size of transistors — microscopic electrical switches that process data — embedded within the chips.
However, as engineers approach the physical limitations of transistor miniaturization, “we need to think of alternate ideas to improve the designs,” asserts computer architect Ajay Joshi of the Boston University Photonics Center.
One strategy involves increasing chip dimensions. “Wafer-scale chips,” which are the size of dinner plates, can accommodate approximately 70 times more transistors than a standard postage-stamp-sized GPU and require 143 times less electricity for communication compared to equivalent GPUs, according to computer engineer Rakesh Kumar of the University of Illinois Urbana-Champaign. Commercially produced by Cerebras, a California-based company, these wafer-scale chips present challenges, such as an elevated risk of damage during manufacturing. However, owing to their energy-saving and other advantageous attributes, “they would be very attractive to many hyperscalers and AI companies,” Kumar notes.

One approach to enhancing processor efficiency involves increasing their size to accommodate more transistors, the fundamental components of computers. “Wafer scale” chips, such as those developed by the California-based manufacturer Cerebras, reduce the energy expenditure associated with transferring information between individual chips.
(Image credit: CEREBRAS SYSTEMS)
Many technology firms say they have improved energy efficiency by creating their own processors specifically designed for AI computations, such as Amazon Web Services’ Trainium2 chip or Google’s Ironwood Tensor Processing Units. At NVIDIA, Josh Parker, the company’s head of sustainability, reports that its AI-specialized GPUs have evolved significantly from those used for gaming and are now engineered for optimal AI task efficiency; additional innovations, such as enhancing the interconnectivity between GPUs, have also contributed. “Over the past eight years, NVIDIA GPUs have improved 45,000 [times] in energy efficiency for large language model workloads,” he states.
Engineers are also exploring alternative computational methodologies. Conventional AI processors perform calculations by encoding numbers in a binary system of ones and zeros, achieved by switching transistors on and off (for instance, representing the number 5 requires four transistors to form the code 0101). However, transistors can function beyond simple binary switches, allowing or blocking electron flow; they can also operate as analog regulators, maintaining intermediate voltage levels that represent different numerical values. This requires fewer transistors, and consequently less energy, for computations. “People have known for decades that doing certain things in analog … can be a lot more energy efficient,” Kumar remarks.
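The trade-off between on/off switches and intermediate levels can be put in numbers. The sketch below is an idealization: "cell" is a hypothetical stand-in for one storage element, and it assumes every level can be told apart perfectly, whereas electrical noise limits how many levels a real analog device can distinguish.

```python
def cells_needed(num_values, levels_per_cell):
    """Smallest number of cells, each holding one of `levels_per_cell`
    levels, that can distinguish `num_values` different values."""
    cells = 0
    capacity = 1
    while capacity < num_values:
        capacity *= levels_per_cell
        cells += 1
    return cells

# The article's example: the number 5 written with four binary digits.
binary_code = format(5, "04b")
print("5 in binary:", binary_code)

# Representing any of 16 values (0 through 15):
print("Two-level cells needed:     ", cells_needed(16, 2))
print("Four-level cells needed:    ", cells_needed(16, 4))
print("Sixteen-level cells needed: ", cells_needed(16, 16))
```

Under this idealization, a single element with 16 distinguishable levels carries the same information as four binary switches.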
For instance, electrical engineer Paul Manea of the German research institute Forschungszentrum Jülich and his colleagues are developing devices termed “gain cells” that feature transistors operating in this analog manner. Crucially, gain cells possess the capability to both store the data necessary for processing a query and perform the computation of the answer. This addresses a significant energy bottleneck in conventional computing systems, where data storage and computation occur on separate hardware components.
This is particularly problematic for transformer-based LLMs, as each word generation necessitates transferring the query and the partially constructed response from memory to a processor. Manea and his team estimate that utilizing gain cells instead of traditional GPUs can reduce the energy consumption of a highly energy-intensive component of transformer-based LLMs by four orders of magnitude. However, further refinement is required before they can be more widely implemented, Manea indicates.
The concept of devices that both store and process information is central to “neuromorphic” computing, an emerging field of computer engineering inspired by the human brain’s exceptionally low energy consumption. Another brain-inspired innovation involves chips that encode information not through continuous data streams but, akin to human neurons, via the timing of voltage “spikes” that propagate through the system. Allowing components to remain dormant until activated “could potentially translate to less energy,” suggests Eleni Vasilaki, an expert in bioinspired machine learning at the University of Sheffield in England.
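Spike-based signaling is often introduced through a textbook simplification called a leaky integrate-and-fire neuron, sketched below. This is offered only as an illustration of the general idea and is not a description of any chip mentioned here; the threshold, leak factor and input values are all assumed.

```python
def leaky_integrate_and_fire(inputs, threshold=1.0, leak=0.9):
    """Accumulate input over time, let it decay, and emit a spike (1)
    whenever the accumulated level reaches the threshold."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
        else:
            spikes.append(0)
    return spikes

# HYPOTHETICAL input: two pulses, a quiet stretch, then a weak pulse.
inputs = [0.6, 0.6, 0.0, 0.0, 0.0, 0.3]
spikes = leaky_integrate_and_fire(inputs)
print(spikes)
print("Time steps with a spike:", sum(spikes), "of", len(spikes))
```

Information is carried by when the spike occurs, and for most time steps nothing is emitted at all, which is the property that could translate into lower energy use.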
Maaß, for example, is involved in a project that secured approximately $5.8 million from the German government to evaluate neuromorphic chips, among other strategies, for reducing the energy demands of AI models. Some brain-inspired chips are already commercially available, but the technology is still far from being suitable for mainstream computing, according to nanoelectronics expert Tony Kenyon of University College London, whose team recently received $17 million from the UK government for neuromorphic computing development.
Other scientists are developing chips that process information not using electrons but through the interaction of photons — light particles — with matter (fiber-optic cables, which encode and transmit data as light pulses, are utilized globally). With photons, greater volumes of information can be transmitted simultaneously, and signals can be manipulated much more rapidly, states Elena Goi, a photonic computing researcher at Friedrich Schiller University Jena in Germany.
Several companies have created chips capable of performing certain AI computations using optical methods, according to Joshi; he recently estimated that manufacturing optical chips could require up to an order of magnitude less energy than producing conventional chips of comparable size. Joshi anticipates that, “in 10 years, we would have a practical solution that can be deployed pervasively across the data centers.”
Transforming AI’s energy footprint
Even without fundamentally altering how computers operate, significant progress can be made in mitigating AI’s impact not only on energy consumption but also on the water resources used for cooling data centers. Crucially, technology companies should re-evaluate the siting of these centers, according to energy systems expert You. Existing US centers are concentrated in northern Virginia, a region with constrained water and renewable energy capacity compared to, for example, the Midwest. You recently projected that optimized siting, combined with energy-efficient hardware and software, could decrease the future carbon and water footprints of US data centers by 73 percent and 86 percent, respectively.

Data centers, and the gas plants often constructed to supply them, can contribute to air and noise pollution, as well as place additional strain on local water reserves, prompting many communities to object to their development.
(Image credit: SARA DIGGINS / THE AUSTIN AMERICAN-STATESMAN VIA GETTY IMAGES)
Masanet further suggests that technology firms with existing data centers nationwide could at least direct their model training activities to strategic locations. “Some companies like Google have been doing this: They shift their loads to follow renewables,” he notes. He also emphasizes the need to address the electricity and resources consumed in manufacturing processors for new data centers, as well as the electronic waste generated by the frequent replacement of outdated technology.
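The idea of shifting loads to follow renewables can be sketched as a simple scheduling rule. This is not a description of Google's or any other company's system: the site names and renewable shares are invented for illustration, and a real scheduler would also weigh capacity, data transfer and deadlines.

```python
# HYPOTHETICAL share of renewable power (0 to 1) at two sites
# over four time slots. All values are invented for illustration.
RENEWABLE_SHARE = {
    "site_a": [0.2, 0.6, 0.8, 0.3],
    "site_b": [0.7, 0.4, 0.5, 0.6],
}

def follow_renewables(shares):
    """For each time slot, pick the site with the highest renewable share."""
    num_slots = len(next(iter(shares.values())))
    schedule = []
    for slot in range(num_slots):
        best_site = max(shares, key=lambda site: shares[site][slot])
        schedule.append(best_site)
    return schedule

schedule = follow_renewables(RENEWABLE_SHARE)
for slot, site in enumerate(schedule):
    print(f"slot {slot}: run training at {site}")
```

Training jobs are a natural fit for this kind of rule because, unlike chatbot replies, they do not need to finish within fractions of a second.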
Minimizing electronic waste by extending hardware lifespan and recycling obsolete electronics is one of Amazon’s sustainability strategies, according to a statement provided to Knowable Magazine; this is complemented by designing data centers for energy and water efficiency and investing in a range of renewable and nuclear energy projects. “We’ll continue to implement solutions that benefit our customers and the communities we operate in,” states Brandon Oyer, Amazon Web Services’ head of energy and water in the Americas.
In the meantime, a Microsoft press representative highlights several sustainability initiatives undertaken by the company, including novel cooling technologies, investments in renewable energy, and waste reduction efforts. Google spokesperson Ralf Bremer underscores the company’s objective of achieving net-zero emissions across its operations by 2030 and replenishing 120 percent of the fresh water consumed by its offices and data centers by the same year. An OpenAI representative refers to a press release detailing efforts to minimize water usage and plans for solar energy generation at one of its facilities. Anthropic, Meta, and Oracle did not provide comments by the deadline.
While technology companies are incorporating sustainability considerations, their primary focus remains on rapidly expanding data center capacity, according to computer engineer Benjamin Lee of the University of Pennsylvania. He anticipates that, eventually, they will need to intensify efforts to enhance energy efficiency to decrease costs. Governments should assist in accelerating this transition, Masanet advises. To date, he and his team have identified nearly 220 policies proposed to address data center sustainability at the US state level, 18 at the federal level, and additional proposals from other countries, although not all were ultimately enacted.
“It’s clear that governments around the world are beginning to take action,” he observes. However, he adds, “we also see some state and local governments with proposed policies that mostly aim to incentivize and accelerate data center builds.”

The Industrial Sustainability Analysis Laboratory at the University of California, Santa Barbara, has been documenting state and federal policies pertaining to data centers. The overwhelming majority of these policies address data center sustainability, though some also include tax incentives. This compilation may not be exhaustive.
(Image credit: Knowable Magazine)
AI’s energy consumption will ultimately involve a trade-off: Will its problem-solving capabilities, applied to fields ranging from medical advancements to logistical improvements, conserve more resources than they require? However, while developing more efficient and energy-saving AI is important, so is carefully considering the necessity of AI applications, according to Kenyon. Is the world genuinely improved, for instance, by nonhuman “AI agents” handling customer service inquiries?
“I think it’s a common mistake, when a new technology comes in, to suddenly think, ‘Well, everything has to adopt that new technology,’” he states. “That approach really isn’t doing us any favors.”
This article was originally published in Knowable Magazine, an independent publication committed to making scientific knowledge accessible to everyone.
Source: www.livescience.com
