I've been tasked with conducting a root cause analysis for a specific event that blew up our schedule last week, and with putting a root cause analysis process in place for future events within the data center. I've done a lot of Googling for root cause analysis and have come up with very little. I've found a book, but does anyone know of any resources that would help me put together a process specific to data centers?
This is no light task. In my entire career in IT and technology, I have never been part of a successful root cause analysis. That says a lot about the state of operations management in the private sector.