I got an email from APC PowerChute Business Edition that said simply “Battery Disconnected.” The connected UPS is an APC SmartUPS 1500 model DLA1500 (the Dell equivalent of an SUA1500). Of course the battery was not actually disconnected; the connection inside the unit is a very secure clip that can’t just wiggle loose.
Once I got on site, I found that the “Replace Battery” light was blinking. For some reason there was no corresponding event in the PowerChute event log.
Of greater concern was that the unit was almost too hot to touch. When I got the battery door open, I could see the batteries bulged somewhat and were very hot. The plastic pull tab broke off when I tried to pull out the batteries. I finally managed to hook a wrench around the top and pull them out. And get this: two hours after removing them, the batteries still felt hot! I guess I’m lucky I didn’t have flammable hydrogen gas escaping as this thread says can happen. I disconnected the two batteries from each other and left them outside overnight to cool.
Reviewing the PowerChute data log, I see that the temperature is usually around 30 Celsius (86 Fahrenheit), but starting at about 7:30am, it slowly climbed to 41 C (106 F). The temperature was 40 C when the unit reported “battery disconnected.” Is it a coincidence that 40 C is the temperature at which the fan turns on (KB article)? Perhaps some safety switch was tripped to disconnect the battery? The fuse between the batteries did not trip.
Here’s the event log, showing the “battery disconnected” event at 2:39pm:
And here’s the temperature graphed from 4:10am to 4:50pm:
How Hot Is Too Hot?
According to its specs, the DLA1500 is supposed to operate safely in an environment up to 40 C. The building where this unit is installed does not have air conditioning, but the high in San Diego that day was 71 F, so the room was well below 80 F (27 C). I understand that ventilation is important, and I’ll move the UPS to sit on top of the server tower rather than beside it, but clearly the overheating here was not due to a hot room but to a battery or controller failure.
What Should Have Happened
I am definitely concerned that this high-end UPS did not detect a failed battery or an overheating condition earlier. All I got was the “battery disconnected” warning; since the server was still up, I didn’t consider it an emergency. Fortunately, it was a convenient time, so I went on site within a couple hours and discovered the overheated unit.
Next time: carefully check the Internal UPS Temp column in the Data Log. If it has risen 10 C in one day, consider a shutdown and get on site.
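Here is a rough sketch of that check in Python, in case you would rather script it than eyeball the column. It assumes you have already pulled the Data Log into a list of (timestamp, temperature in C) pairs; I am not assuming anything about PowerChute's export format, and the function names, the placeholder date, and the sample readings are all made up for illustration (loosely shaped like the numbers above, not the actual log).

```python
from datetime import datetime, timedelta

def max_rise_within(readings, window=timedelta(hours=24)):
    """Largest temperature increase (later reading minus earlier reading)
    between any two readings that are no more than `window` apart."""
    readings = sorted(readings)  # sort by timestamp
    best = 0.0
    for i, (t_i, temp_i) in enumerate(readings):
        for t_j, temp_j in readings[i + 1:]:
            if t_j - t_i > window:
                break  # later readings are even further away in time
            best = max(best, temp_j - temp_i)
    return best

def should_investigate(readings, limit_c=10.0):
    """True if the temperature rose by at least `limit_c` within one day."""
    return max_rise_within(readings) >= limit_c

# Illustrative readings only; the date is a placeholder.
sample = [
    (datetime(2013, 1, 1, 4, 10), 30.0),
    (datetime(2013, 1, 1, 7, 30), 30.0),
    (datetime(2013, 1, 1, 14, 39), 40.0),
    (datetime(2013, 1, 1, 16, 50), 41.0),
]
print(max_rise_within(sample))     # 11.0
print(should_investigate(sample))  # True
```

The 10 C limit is just the rule of thumb from this post, so adjust `limit_c` to taste.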
I wondered why PowerChute hadn’t warned me about the overheating. It turns out that the default setting for the “UPS Internal Temperature Threshold Exceeded” event is 70 C (158 F):
Wow! If the unit is almost too hot to touch at 40 C, 70 C seems way too high.
I don’t know how accurate these internal temperature sensors are. At another location, I have an APC SU1400NET that always reports about 40 C, but its case is hardly warm to the touch.
Maybe the solution is to check the normal operating temperature for a given UPS in the Data Log, then set PowerChute’s UPS Internal Temperature Threshold Exceeded event to trip about 5 C above that. So if 30 C is normal, set PowerChute to 35 C. Note that this is a full shutdown event by default, but that is certainly preferable to risking a fire in the server room.
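The arithmetic is trivial, but a small sketch makes the idea concrete. Taking the median of a handful of readings as the "normal" temperature is my own assumption (it ignores the odd spike), and the function name and sample values are hypothetical. The result still has to be entered into PowerChute by hand.

```python
from statistics import median

def suggested_threshold_c(normal_readings_c, margin_c=5):
    """Suggest a shutdown threshold: the typical (median) internal
    temperature plus a safety margin, both in degrees Celsius."""
    return median(normal_readings_c) + margin_c

# Illustrative readings from a normal day, in C.
print(suggested_threshold_c([30, 29, 30, 31, 30]))  # 35
```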
Has anyone else seen issues like this? What does PowerChute report as the typical internal temperature of your SmartUPS?
Update November 23, 2013: I just noticed that there is a separate Warning event called UPS Internal Temperature Warning which can be configured to send an email. It makes sense to set that a few degrees lower than the Critical Threshold Exceeded event, which shuts down the machine. That will hopefully give some advance notice of a UPS on its way to overheating.
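To picture how the two levels would work together, here is a tiny sketch. The thresholds (33 C warning, 35 C critical) are example values I picked for a UPS that normally runs at 30 C, and `classify` is a hypothetical helper, not anything PowerChute provides. I am also assuming that "exceeded" means strictly above the threshold.

```python
def classify(temp_c, warning_c, critical_c):
    """Return "ok", "warning", or "critical" for one temperature reading.
    The warning threshold must sit below the critical one."""
    if warning_c >= critical_c:
        raise ValueError("warning threshold must be below critical threshold")
    if temp_c > critical_c:
        return "critical"  # would trigger the shutdown event
    if temp_c > warning_c:
        return "warning"   # would only send an email
    return "ok"

for temp in (30, 34, 36):
    print(temp, classify(temp, warning_c=33, critical_c=35))
```

With those example thresholds, 30 C is fine, 34 C sends the email, and 36 C shuts the machine down.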