August 28, 2004

Hexogen Outage, 15-Aug-2004

Hexogen suffered an outage on Sunday, Aug 15, starting in the wee hours of the morning.

We lost one of the disks in the drive array, and a second flaked out on us during recovery, so we had to restore one partition from backup. The net result was the loss of a small number of email messages from roughly 5 AM to 10 AM Sunday.

UPDATE: The root cause of this failure (and subsequent ones during
the week of Aug 15) was a flaky power supply that was delivering only about
+4.4V on the +5V rail, causing internal components like the add-on PCI disk
controller and the internal IDE controller to behave erratically. We replaced
the power supply with a new one on Sunday, Aug 22, and things have been
good since then.
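
For a sense of how far off that reading was, here is a minimal sketch of the arithmetic. It assumes the usual ATX-style tolerance of plus or minus 5% on the +5V rail, which may not be the exact spec for this particular supply; the function name and constants are made up for illustration.

```python
# Assumption: +/-5% tolerance on the +5V rail (typical ATX figure),
# not a value taken from this machine's power supply documentation.
NOMINAL_V = 5.0
TOLERANCE = 0.05


def rail_status(measured_v, nominal_v=NOMINAL_V, tolerance=TOLERANCE):
    """Return (fractional deviation from nominal, whether it is within tolerance)."""
    deviation = (measured_v - nominal_v) / nominal_v
    return deviation, abs(deviation) <= tolerance


if __name__ == "__main__":
    deviation, in_spec = rail_status(4.4)  # the reading mentioned above
    print(f"{deviation:+.1%} from nominal, in spec: {in_spec}")
```

Under that assumption the rail should not drop below 4.75V, so a reading of about 4.4V is roughly 12% low and well out of range.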

Posted by hbm at August 28, 2004 05:25 PM