Data Center World is part of the Informa Tech Division of Informa PLC

Lessons Learned from a Data Center Incident

James Monek  (Director, Technology Infrastructure & Operations, Lehigh University)

Location: Room 201

Date: Wednesday, April 17

Time: 1:30 pm - 2:20 pm

Pass Type: AFCOM Solution Provider, All Access Conference, Industry Conference, Standard Conference

Track: Design, Build, Operate, Control

Session Type: Conference Session

Vault Recording: TBD

Audience Level: All Audiences

What would you do if you received a phone call at 5 a.m. saying there is a fire in the data center (besides thinking you had just woken up from a nightmare and considering going back to sleep)? While our incident wasn't that extreme, it was enough to trigger an emergency power-down and a halon discharge to protect the data center.

During this presentation, we'll cover the timeline, how we used our incident management and BCP/DR processes, the rapid response we administered to get all services back online, and the retrospectives that occurred afterwards. We'll explore the surprises that we encountered, lessons learned, and how well-prepared teams can work together under immense pressure.

Takeaways

  • Tips on handling a major data center incident
  • A closer look at how processes such as incident management, BCP/DR, and retrospectives are crucial to data center operations
  • Lessons from our incident that can strengthen your data center resiliency, redundancy, and operations