NI3: The Net Result of Imagination, Innovation, and Investment
Wednesday, February 04, 2004
On January 10, 2004, 13:35:50 GMT, thousands of EMC customers were hit by a Y2K-type bug in the code base of EMC's Celerra NAS products. Thousands of Celerra servers, and the applications that rely on them, were brought down by this bug. Below is an excerpt from an email authored by an EMC executive explaining the source of the bug.

Similar to the renown Y2K problems of the Year 2000, there are also other computer-related issues surrounding date and time, such as the known exposure for 'internal clock overflow' for the Year 2038. Since most computer systems measure clock time as the number of seconds since 01/01/1970, 00:00 midnight, once the Year 2038 is reached, the standard integer variables that affect date and time calculations overflow, leading to issues similar in nature to the Y2K problems. While these and other potential problems are generally well-known, what was not well-known or understood was that the Year 2004, specifically, Saturday, January 10, 2004, 13:35:50 GMT, being the halfway mark between 1970 and 2038, could present and manifest Y2K-like issues as well. Specifically, the problems experienced with the halfway mark for Celerra Servers and Time Synchronization, were exposed when a portion of the DART NAS Code relating to Kerberos, that conducted the addition and subtraction of two different dates, used an incorrect data type for this operation, resulting in the impact experienced by our Customers.
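
To make the "halfway mark" concrete: a signed 32-bit counter of seconds since 1970 overflows after 2^31 - 1 seconds, so the halfway point is 2^30 seconds, which falls on January 10, 2004, within about a minute of the time quoted above. The sketch below is only an illustration and not EMC's code. It assumes, hypothetically, a routine that adds two timestamps in a signed 32-bit integer before halving the sum; the names wrap_int32 and midpoint_int32 are invented for the example.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1   # largest value a signed 32-bit integer can hold
HALFWAY = 2**30         # seconds since 1970-01-01 00:00:00 UTC


def wrap_int32(value: int) -> int:
    """Simulate storing a value in a signed 32-bit integer (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


def midpoint_int32(t1: int, t2: int) -> int:
    """Hypothetical buggy routine: the sum is held in 32 bits before halving."""
    return wrap_int32(t1 + t2) // 2


# The two landmark instants, converted to calendar time in UTC.
halfway_utc = datetime.fromtimestamp(HALFWAY, tz=timezone.utc)
overflow_utc = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)

# Ten seconds before the halfway mark the sum of two timestamps still fits.
before = midpoint_int32(HALFWAY - 10, HALFWAY - 10)

# At the halfway mark the sum reaches 2**31 and wraps to a negative number.
after = midpoint_int32(HALFWAY, HALFWAY)

print(halfway_utc.isoformat())   # 2004-01-10T13:37:04+00:00
print(overflow_utc.isoformat())  # 2038-01-19T03:14:07+00:00
print(before == HALFWAY - 10)    # True
print(after)                     # -1073741824
```

Each timestamp on its own is still a perfectly valid 32-bit value in 2004. Only the intermediate sum overflows, which is why a bug of this kind can sit unnoticed until the clock crosses the halfway point. Doing the arithmetic in a 64-bit type, or subtracting before adding, avoids the wraparound.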

This entire incident was missed by the storage industry media. How could EMC miss a bug like this, given the experience of Y2K? It seriously calls into question quality control within EMC's software development organization. If EMC aspires to be a serious software company, it cannot let issues like this slip through. This is a serious setback for its software aspirations.


This work is licensed under a Creative Commons License.