[RFO] NODE01.MCO down (Resolved)
  • Priority - Critical
  • Affecting Server - NODE01.MCO.NODESPACE.US
  • Reason for Outage (RFO)

    Location:
    NODE01.MCO.DC2

    Date:
    November 6, 2017

    Events:
    9:54 PM (GMT-5) During an automatic cPanel software upgrade, our automatic Let's Encrypt SSL service encountered an error on a single site. When Apache was restarted, it was unable to start, causing all websites on the server to go down. No other services (mail, DNS, etc.) were affected.
    10:32 PM (GMT-5) Our Linux Administration team removed the website whose SSL certificate was causing the error.
    10:34 PM (GMT-5) Our Linux Administration team was able to restart Apache successfully.

    Root cause of the issue:
    Our Linux Administration team's official diagnosis is that an automatically generated SSL certificate for a single site encountered an error when the server software was upgraded, which prevented Apache from starting. They have reached out to the software developer to alert them to this bug.
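
The failure described above (a restart attempted with a configuration Apache could not load) is the kind of situation a pre-restart check is meant to guard against. The following is a minimal sketch only, not NodeSpace's or cPanel's actual procedure: it assumes `apachectl` is on the PATH, and the helper names `run_command` and `safe_restart` are hypothetical. A configuration test may not catch every certificate problem, so it should be read as an illustration of the idea rather than a fix for this specific bug.

```python
import subprocess
from typing import Callable, Sequence

# Standard Apache control commands; the binary's location and name vary by
# system (assumption: "apachectl" is on the PATH).
CONFIGTEST_CMD = ["apachectl", "configtest"]
RESTART_CMD = ["apachectl", "graceful"]

# A runner takes a command and returns its exit code. Injecting it makes the
# logic testable without a real Apache installation.
Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def safe_restart(run: Runner = run_command) -> bool:
    """Restart Apache only if the configuration test passes.

    Returns True if the restart command ran and succeeded, False otherwise.
    """
    if run(CONFIGTEST_CMD) != 0:
        # The configuration did not load cleanly; leave the running server
        # untouched instead of restarting into a broken state.
        return False
    return run(RESTART_CMD) == 0
```

Calling `safe_restart()` with no arguments runs the real commands. Passing a stub runner exercises the decision logic without touching the server.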

    Resolution:
    We had to remove the single affected website from the server. Our server admin team will restore it from backup and will be in touch with the affected customer.

    Action plan:
    We have fully documented the software bug and have reported it to the software vendor.

    Total downtime:
    Customers on NODE01.MCO experienced 41 minutes of downtime. This downtime is not eligible for SLA credit.

    We sincerely apologize for the inconvenience and will do our best to reduce or eliminate future occurrences of the same problem.
  • Date - 2017-11-06 21:54 - 2017-11-06 22:34
  • Last Updated - 2017-11-06 22:51
