Looking back at January of this year, I thought, as did everyone else, that it would be Gnax’s only blunder with their UPS system. Gnax either cannot learn from its mistakes or is mired in incompetence higher up the management chain, because every server in the data center has just been hit by another power blip. Meanwhile, as critical problems like this remain unresolved, the CEO is more than happy to slap together uninformative, pompous graphs branded with his own last name. Marketing? Indeed. But before you market, make sure you can support the product, the data center, and your customers. Customers are your foundation. Without a sturdy foundation any business will topple; it is simply a matter of time. Marketing is a way to build that foundation, but ultimately maintenance and giving customers what they need is how you stay alive. Snafus like these are absolutely catastrophic to the image of a company.
Oh, we weren’t the only ones hit at the data center. No, all of its colo customers were knocked out as well. When you keep the filesystem healthy and test your servers thoroughly, you can recover quickly. Other guys who resell some existing technology? Eh, not so much.
Given today’s unfortunate events, the servers will be going down for a kernel upgrade tonight at 1 AM EDT (GMT-0400). This process should take 3 to 5 minutes. Once the servers are back up, we’ll be able to offline one of the two CPUs every night to save on power consumption.
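For the curious, here is a rough sketch of what offlining a CPU can look like on Linux. It assumes the kernel's CPU hotplug interface under `/sys/devices/system/cpu`, where writing `0` or `1` to a CPU's `online` file takes it down or brings it back. The function names and the `root` parameter are made up for illustration, and this is not necessarily how our own tooling does it.

```python
from pathlib import Path

# Standard sysfs location for CPU hotplug controls on Linux.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def set_cpu_online(cpu: int, online: bool, root: Path = SYSFS_CPU_ROOT) -> None:
    """Bring a CPU online or take it offline through its sysfs control file.

    Needs root privileges on a real system. `root` is a hypothetical
    parameter so the function can be pointed at a scratch directory.
    """
    control = root / f"cpu{cpu}" / "online"
    if not control.exists():
        # On many systems the boot CPU (cpu0) has no control file at all.
        raise FileNotFoundError(f"cpu{cpu} cannot be hot-plugged: {control} is missing")
    control.write_text("1" if online else "0")


def is_cpu_online(cpu: int, root: Path = SYSFS_CPU_ROOT) -> bool:
    """Report whether a CPU is online; a missing control file counts as online."""
    control = root / f"cpu{cpu}" / "online"
    if not control.exists():
        return True
    return control.read_text().strip() == "1"
```

Something like `set_cpu_online(1, False)` at night and `set_cpu_online(1, True)` in the morning, run from a scheduled job, would cover the nightly cycle. The scheduling part is likewise an assumption.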
– Matt
10:50 PM EDT Update: The outage was caused by an overloaded UPS following a lightning strike on a utility box outside the data center.