Every time I post about the insanity of AI data center water consumption I invariably get a variation on the following theme:
Why aren’t they using a closed loop system?
For the uninitiated, the focus of this discussion is GPU cooling methods. AI is powered by GPUs, and GPUs run really hot. Like insanely hot. For example, the chip that is powering Grok3, NVIDIA’s H100, runs at an average temperature of 87C/188 degrees Fahrenheit and has a max operating temperature of 98C/208F (NVIDIA H100 Product Brief). The chips are designed to slow down their processing when they reach 95C/203F, and shut down when they reach 98C/208F. If they run too long at the hottest temperatures, they wear out faster and don’t last as long. So in order to keep the chips running as efficiently as possible and get the most return on investment, data center operators have to implement systems to cool the chips.
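To make those thresholds concrete, here is a minimal sketch of the slow-down and shut-down behavior described above. The two temperature limits come from the figures quoted in this post; the function names and state labels are made up for illustration and are not NVIDIA's actual firmware logic.

```python
# Thresholds quoted in this post for the H100 (degrees Celsius).
THROTTLE_C = 95.0   # chip slows down its processing at this temperature
SHUTDOWN_C = 98.0   # chip shuts down at this temperature


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def thermal_state(temp_c: float) -> str:
    """Hypothetical helper: classify a chip temperature against the two limits."""
    if temp_c >= SHUTDOWN_C:
        return "shutdown"
    if temp_c >= THROTTLE_C:
        return "throttled"
    return "normal"


if __name__ == "__main__":
    for temp in (87.0, 95.0, 98.0):
        print(f"{temp:.0f}C / {c_to_f(temp):.1f}F -> {thermal_state(temp)}")
```

The gap between "normal" and "throttled" is only a few degrees, which is why the cooling system has so little room for error.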
So how do you cool a chip from 95C/203F to 87C/188F? Definitely not with an air conditioner. Have you ever walked into a party or event where the host forgot to turn up the air conditioner before everyone got there? It’s a little cool when you first walk in but it gets progressively hotter as more people show up and the night goes on. The air conditioner can’t keep up. But if the place is freezing when you walk in, the temperature stays comfortable no matter how many people show up. That’s because it’s easier to keep a cold place cool than to cool a hot place.
Data centers have traditionally used air to cool servers. They use a hot and cold aisle configuration (What are hot and cold aisles). The servers are located in the cold aisle, where they are surrounded by cold air. The air they heat up is then vented into a hot aisle. By pulling heat out of the cold aisles, this configuration allows the air conditioner to keep up with the demand - for traditional workloads.
AI workloads run on chips that require pretty extreme cooling. And they don’t run on a few chips. Grok3 is operating on a cluster of 200,000 chips, each of which needs to be cooled from 95C/203F to 87C/188F. Air conditioning just can’t keep up. So data centers are transitioning to liquid cooling to support AI workloads (When to move from air cooling to liquid cooling). Which is where our key question comes in.
The most commonly used liquid cooling system is the closed loop system. In a closed loop system, cold liquid is circulated past the server, the server heats it, the heated water is circulated through a heat exchanger where it is cooled, then the cold water is circulated past the server again (Understanding liquid cooling). That loop is closed. No liquids gained, no liquids lost. But that isn’t the only loop. The key component of the closed loop system is the second loop, the heat exchanger loop. That loop is open.
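For a rough sense of how much liquid the closed inner loop has to move, here is a back-of-the-envelope sketch using the standard heat transfer relation Q = m * c * dT. The 700 W heat load per chip and the 10 degree temperature rise across the server are assumptions picked for illustration, not figures from the sources cited here, and the coolant is assumed to be plain water.

```python
# Specific heat of liquid water, in joules per kilogram per kelvin.
WATER_SPECIFIC_HEAT = 4186.0


def coolant_flow_kg_per_s(heat_watts: float, delta_t_kelvin: float) -> float:
    """Mass flow needed to carry away heat_watts with a delta_t_kelvin rise.

    Rearranges Q = m_dot * c * dT to solve for m_dot.
    """
    if delta_t_kelvin <= 0:
        raise ValueError("temperature rise must be positive")
    return heat_watts / (WATER_SPECIFIC_HEAT * delta_t_kelvin)


if __name__ == "__main__":
    # Assumed values for illustration only.
    chip_heat_watts = 700.0
    coolant_rise_kelvin = 10.0

    flow = coolant_flow_kg_per_s(chip_heat_watts, coolant_rise_kelvin)
    # 1 kg of water is roughly 1 liter.
    print(f"{flow:.4f} kg/s, about {flow * 60:.2f} liters per minute per chip")
```

Under those assumptions the inner loop only needs about a liter per minute per chip, and it gets all of that liquid back. The heat it picks up still has to go somewhere, though.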
This diagram shows a complete closed loop liquid cooling system. The system has two loops. The inner loop, which we discussed earlier, circulates cold liquid to the server and hot liquid to the heat exchanger. The system also has an outer loop, which is highlighted. The outer loop cycles cold water through the heat exchanger and hot water to the water cooling tower (The immersion cooling technology). That is the loop that is responsible for data centers consuming 60% of the water they withdraw (Taps Run Dry).
There are two main technologies for cooling the liquids in a closed loop system: chillers and water cooling towers. Water cooling towers are favored for industrial-level cooling because of their energy efficiency (Cooling Tower v. Chiller). While both systems can utilize air for cooling, air cooling is only effective when they’re operated in significantly cold environments. In most environments, both systems utilize evaporative cooling. Evaporative cooling systems pass the heated water through a cool water sprayer or over a cold water-soaked medium, and the heat evaporates some portion of the water into the air, which carries away heat and cools the remaining water. Cooling towers require a steady stream of water to function, because they are constantly evaporating it. Water-cooled chillers require less water to function, but they are often used in conjunction with cooling towers in large-scale operations (How cooling towers and chillers work together).
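To see why evaporation adds up so fast, here is a rough sketch of the physics. It assumes all of the heat leaves the tower by evaporating water, uses an approximate latent heat of vaporization of 2.4 megajoules per kilogram, and treats 1 kilogram of water as 1 liter. The 1 megawatt heat load is a hypothetical round number, not a figure for any specific data center, and real towers also lose water to drift and blowdown, which this ignores.

```python
# Approximate energy needed to evaporate 1 kg of water at typical
# cooling tower temperatures (assumption; the value varies with temperature).
LATENT_HEAT_J_PER_KG = 2.4e6
LITERS_PER_US_GALLON = 3.785
SECONDS_PER_DAY = 86_400


def evaporated_gallons_per_day(heat_load_watts: float) -> float:
    """Estimate the water evaporated per day to reject a constant heat load."""
    kg_per_second = heat_load_watts / LATENT_HEAT_J_PER_KG
    liters_per_day = kg_per_second * SECONDS_PER_DAY  # 1 kg of water ~ 1 liter
    return liters_per_day / LITERS_PER_US_GALLON


if __name__ == "__main__":
    one_megawatt = 1_000_000.0  # hypothetical heat load
    gallons = evaporated_gallons_per_day(one_megawatt)
    print(f"About {gallons:,.0f} gallons evaporated per day per megawatt of heat")
```

Under those assumptions, every megawatt of heat costs roughly 9,500 gallons of water a day, and the estimate scales linearly with the heat load.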
This is why data centers are such massive consumers of water: only one of the loops is closed. The other is in a constant battle to cool massive amounts of super hot GPUs and is evaporating millions of gallons of water a day to keep up with the demand. In xAI’s case, that’s 12 million gallons of drinking water a day (xAI Memphis data center water usage estimate). The closed loop isn’t closed, folks. It never was.