
We are entering a new era in data center architecture. Gone are the days when we could draw as much power as we liked and think nothing of the consequences. Moore's Law has held up, and CPUs have become more densely packed with computational power, driving a sharp increase in power consumption. As power consumption increases, the first law of thermodynamics holds up its end of the bargain: the electricity a server consumes ends up as heat. That heat forces us to concern ourselves not only with the traditional server issues of performance and density but also with heat management.

Heat management has always been an issue, but it never required much more than functional fans and a good data center cooling system. My first job in IT included the administration of two data centers cooled with window-style AC units. Add cooling strategy to the list of questions I should have asked in the interview.

Common sense tells us that more expelled heat puts more strain on the AC unit until the unit must either be replaced or the data center overheats. Overheating can rear its ugly head in many ways, but it starts with the servers throttling the CPUs to a slower speed, which hinders performance. The performance hit may degrade the user experience, which at first glance may suggest a need for more servers. More servers generate more heat, which requires even more cooling.

What Can Be Done?

There is a silver lining in our data center clouds. We can reduce servers' power consumption and ease the load on our data center cooling systems. This is especially helpful if there are still data centers being cooled with window AC units.

How do we take this step into more efficient data center cooling? We were told liquid cooling is the solution. What we really needed were firm numbers to back up the claim that liquid-cooled servers cool more efficiently than air-cooled ones. To get these numbers, we measured the cumulative power draw of 33 servers over 24 hours at 99% CPU load while cooled with liquid and then, in a separate test, with air. We then compared the results and were impressed by the findings.

The Naysayers and Bookkeepers

We all know the naysayers' negative perceptions of change. Some of their precautions are valid, such as "change costs money" and "complexity is the enemy of execution." Other precautions are simply fear-mongering, so we need to be cautious. This caution leads the IT industry, and much of society, to stick with the status quo until a problem gets so large that we can't ignore it anymore, like the great garbage avalanche of 2505. Change may mean additional expenses for training or hardware, but change isn't always bad. Some even say that change is the one constant in life. The sooner we embrace the inevitability of change, the better.

Getting past the bookkeepers is often a challenge. How do we convince the finance department to invest in liquid-cooled servers? Convincing the bookkeepers is easy with ZutaCore because the switch to liquid-cooled systems saves electricity. It really is that simple. Less power is required to cool the servers, resulting in less heat generation. Less heat generation means less electricity is required to cool the data center. We like to call that a twofer because the savings show up in two places. Bookkeepers love savings, especially the twofer!

Math is Fun

To convince the bookkeepers to buy, you must first speak their language: math and money. For the sake of this article, we will only account for ongoing operational costs associated with electricity. Our testing looked at the difference in electricity usage between liquid-cooled and air-cooled systems. We will not speculate on ROI over a specific time frame, though a good chunk of that equation can be worked out from our testing results coupled with a quote from your WWT representative.

The Hardware

Servers

This test ran on 33 Dell servers. Thirty-two of them were identical Intel systems, and one had AMD CPUs. It is important to note the difference in the expected power draw of the CPUs because our results were very different for the two kinds of system. We believe the difference in savings is due to the CPUs' power consumption and not the manufacturer.

The other server was slightly different but similar 

ZutaCore Platform

The ZutaCore system is made up of several components. The Smart Heat Rejection Unit, or S-HRU, used in this POC is a 6U device that resembles a basic radiator. Hot coolant comes in and travels through a condenser. Fans on the back (see below) pull air over the condenser. The cooled liquid is pumped to the cool side of the Smart Refrigerant Distribution Unit, or S-RDU. From there it travels to the Enhanced Nucleation Evaporator, or ENE, which is just a fancy name for the CPU heat sink. The CPU heats the liquid into a vapor, which travels back to the hot side of the S-RDU and then on to the S-HRU for cooling. The ZutaCore cooling system is very simple.

Left: S-HRU front 
Right: S-HRU back
Left: Server with ENEs installed 
Right: S-RDU with servers connected

The Liquid

It is important to recognize that the ZutaCore product does not use water. The liquid is dielectric, non-corrosive, non-toxic and fire-retardant. What that really means is that it is perfect for use on electronics. This fact was demonstrated when my manager sprayed a generous portion on my laptop keyboard from what appeared to be a standard water bottle. I did not know what was going on, and you can imagine my reaction. My head went straight to, "No worries, everything important can be retrieved and I need a new laptop anyway." This was his way of announcing that I would conduct testing with the ZutaCore product line. It certainly grabbed my attention. What also got my attention was that my laptop did not skip a beat! The liquid evaporated quickly, leaving no trace, and the laptop unfortunately kept running.

Other ZutaCore Details Worth Noting

The ZutaCore system has a lot of smart features. The hot and cold connectors only connect to the hot or cold side, respectively. This is an important feature that prevents a line from being plugged into the wrong port. Another key feature is that the connections to the S-RDU use quick-disconnect brass fittings that allow easy removal of the coolant lines for server maintenance. During our testing, the servers were removed from the rack several times without issues. We did not have to purge, fill or burp the system. Once the coolant lines were reconnected, the servers were ready to power up. The ZutaCore system also looks great in a rack! The tubing colors are bright and bold, and everything had a clean, organized appearance.

The Software

The servers had RHEL 8.3 installed with the minimum software packages. The servers also had the ZutaCore SDC agent installed. An SDC server drove the SDC agents to load the CPUs to about 99%. The SDC server also collected telemetry data that we used in compiling the results. Note that approximately 1% of the CPU was left unused to allow for data collection from the agent to the server. No significant load was placed on the RAM or on hard drives during testing.

The Test 

Our testing was completed in two phases, each run for twenty-four hours. Phase 1 tested liquid-cooled servers and Phase 2 tested air-cooled servers. It is important to note that the S-HRU's power consumption was included in Phase 1 but not in Phase 2, since the S-HRU is part of the liquid cooling system. The test was performed in a consistent 18-degree Celsius ambient temperature environment. The surrounding racks were not populated, so there was no noisy-neighbor issue. To validate temperature consistency, eight temperature readings were taken at the rack's doors: four sensors were located at the front and four at the back. Data was recorded with a Huato S220 T-8 Thermocouple device. During both tests, the inlet air temperature stayed consistent.
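
The inlet consistency check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the test: the function names, the 1-degree tolerance and the sample readings are all assumptions made for the example, and only the 18-degree target comes from the article.

```python
from statistics import mean

# Target ambient temperature from the article; the tolerance is an assumed value.
TARGET_C = 18.0
TOLERANCE_C = 1.0  # hypothetical acceptance band, not from the test


def inlet_is_consistent(front_readings_c, target_c=TARGET_C, tolerance_c=TOLERANCE_C):
    """Return True if every front-door reading is within tolerance of the target."""
    return all(abs(r - target_c) <= tolerance_c for r in front_readings_c)


def front_to_back_delta(front_readings_c, back_readings_c):
    """Average temperature rise across the rack (back mean minus front mean)."""
    return mean(back_readings_c) - mean(front_readings_c)


if __name__ == "__main__":
    # Made-up readings for four front and four back sensors.
    front = [18.1, 17.9, 18.0, 18.2]
    back = [31.0, 30.5, 31.5, 31.0]
    print("inlet consistent:", inlet_is_consistent(front))
    print("front-to-back rise (C):", round(front_to_back_delta(front, back), 2))
```

Running the same check on both phases would show whether the two tests saw comparable inlet conditions before their exhaust temperatures are compared.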

We also measured CPU temperatures and frequencies, as well as server fan speeds, during the tests. This data is exposed through the SNMP stack of the servers' management plane and was collected using a Grafana deployment.

We monitored power at the PDU and iDRAC levels. For this testing, we used four PDUs in order to meet the high rack power requirements. The power data was also collected in Grafana and exported to CSV format, then compiled into some wonderful Excel charts and graphs (bookkeepers love spreadsheets).
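
For readers who would rather script the spreadsheet step, here is a minimal Python sketch of turning exported power samples into cumulative energy. It assumes a CSV with two columns named `time` (ISO 8601) and `watts`; those column names and the sample values are hypothetical, not the format of our actual export. The trapezoidal rule is one common way to integrate sampled power and is not necessarily what the spreadsheet did.

```python
import csv
import io
from datetime import datetime

# Hypothetical export: column names and values are illustrative only.
SAMPLE_CSV = """time,watts
2021-01-01T00:00:00,700
2021-01-01T01:00:00,700
2021-01-01T02:00:00,660
"""


def load_samples(csv_text, time_col="time", power_col="watts"):
    """Parse CSV text into a list of (timestamp, watts) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(datetime.fromisoformat(row[time_col]), float(row[power_col]))
            for row in reader]


def energy_kwh(samples):
    """Integrate power samples over time with the trapezoidal rule."""
    total_wh = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        total_wh += 0.5 * (p0 + p1) * hours  # average power times elapsed hours
    return total_wh / 1000.0


if __name__ == "__main__":
    print(f"{energy_kwh(load_samples(SAMPLE_CSV)):.2f} kWh")
```

Computing this figure once for the liquid phase and once for the air phase gives the two cumulative totals that the comparison rests on.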

The Results

Note that all of the items below are positive results for the ZutaCore liquid-cooled platform, except the last one, where air and liquid tied.

  1. There was a 4-degree Celsius drop in the combined temperature readings coming out of the rack when cooling with liquid versus standard air.
    1. This temperature drop means your AC unit works less, which saves money. Imagine the difference across an entire data center of hundreds of racks, each with a 4-degree Celsius exhaust reduction.
    2. It is also worth mentioning that the ambient temperature of the data center can be raised because of the more efficient cooling. WWT was not able to raise the data center temperature for a single rack, but it is easy to see that it could be at least 4 degrees higher because of the exhaust temperature reduction.
  2. There was a 5% drop in total server power consumption, which translated into approximately 35 watts per server.
    1. The AMD EPYC server consumed 17% less power when cooled with liquid than when cooled with air. We believe the larger power savings were due to the AMD processor using significantly more power than the Intel processor.
  3. The server fans remained idle on the Intel systems during power testing with the CPUs at 99% when cooled with liquid.
    1. The AMD fans ran faster than idle during liquid-cooled testing but still much slower than during air-cooled testing.
  4. All systems had consistent CPU frequencies during both tests, so there was no evidence of throttling by the management planes or the operating systems.
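
To put the power figures above into bookkeeper terms, the arithmetic can be sketched in Python as follows. The 35 watts per server, the 5% figure and the 33 servers come from the results; the electricity rate is a placeholder assumption to be replaced with your own, and the function names are ours. The sketch applies the approximate per-server figure to all 33 servers and counts only the server-side savings, not the reduced load on the data center cooling.

```python
# Figures from the results above (approximate).
WATTS_SAVED_PER_SERVER = 35.0
SAVINGS_FRACTION = 0.05
SERVER_COUNT = 33

# Placeholder electricity rate in dollars per kWh; an assumption, not a measured value.
ASSUMED_RATE_PER_KWH = 0.10


def energy_saved_kwh(watts_per_server, servers, hours):
    """Energy saved across all servers over the given number of hours."""
    return watts_per_server * servers * hours / 1000.0


def implied_baseline_watts(watts_saved, savings_fraction):
    """Per-server draw implied by a given absolute and fractional saving."""
    return watts_saved / savings_fraction


def cost_saved(kwh, rate_per_kwh):
    """Electricity cost avoided for a given energy saving."""
    return kwh * rate_per_kwh


if __name__ == "__main__":
    per_day = energy_saved_kwh(WATTS_SAVED_PER_SERVER, SERVER_COUNT, 24)
    per_year = energy_saved_kwh(WATTS_SAVED_PER_SERVER, SERVER_COUNT, 24 * 365)
    baseline = implied_baseline_watts(WATTS_SAVED_PER_SERVER, SAVINGS_FRACTION)
    print(f"Implied air-cooled draw per server: {baseline:.0f} W")
    print(f"Saved per 24-hour run: {per_day:.2f} kWh")
    print(f"Saved per year at constant load: {per_year:.1f} kWh")
    print(f"Cost avoided per year at the assumed rate: ${cost_saved(per_year, ASSUMED_RATE_PER_KWH):.2f}")
```

The yearly figure assumes the rack runs at the same 99% load around the clock, which a production environment rarely does, so treat it as an upper bound for this rack.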

 

CPU Frequency Remained Consistent Throughout Both Tests

 

The Refreshing Sound of Idle Fans

There is one other noticeable benefit of using ZutaCore's platform that unfortunately was not measured during the test. The decibel level coming from the rack of servers was drastically lower with liquid cooling than with air. Since we have no hard data, all we can offer is a description of what it was like near the rack during testing. As we stood behind the rack of 33 servers during ZutaCore liquid-cooled testing, we could have a conversation at normal volume. This is because the server fans remained nearly idle for the duration of the test. During air-cooled testing, the sound was deafening and we wore ear protection.

Call To Action

The ZutaCore product showed it can bring down power consumption and decrease cooling requirements. We can even improve data center decibel levels with this product. As the transition to a greener data center happens, everyone from the server administration staff to the data center technicians will enjoy the benefits of this solution. If you would like to see testing firsthand, please contact your WWT representative to schedule your own ZutaCore performance tests. Come see the ZutaCore solution at the ATC.
