Discovery Cluster: Brief Power Outage
Incident Report for Northeastern University - ITS
Resolved
Most of Discovery’s compute nodes have been brought back online after the power outage. Approximately 50 nodes are still down, but all other nodes are now available for running jobs. Please check any jobs that you had running at the time of the outage, and resubmit any that were terminated. We will continue working to restore service to the remaining nodes. If you have any issues or questions, email us at rchelp@northeastern.edu.
Posted Sep 13, 2019 - 20:45 EDT
Investigating
The utility power feed to the MGHPCC, where the Discovery cluster is located, experienced a brief power outage at 6:32 PM on Friday, September 13. We are currently checking the Discovery cluster and will work to bring up any systems that are temporarily unavailable due to the outage. All storage systems and login nodes appear to be unaffected at this time. We will let you know when we have completed our check of the system. If you have any questions or issues, please send an email to rchelp@northeastern.edu.
Posted Sep 13, 2019 - 19:47 EDT
This incident affected: Research Computing (Discovery Cluster).