Service Disruption for 13th June 2017

Discussion in 'Forum Support' started by bioanarchism, 13 June 2017.

Thread Status:
Not open for further replies.
  1. bioanarchism
    bioanarchism Systems Architect Staff Member

    Dear Members,

    First, we would like to extend our deepest apologies for the service disruption that you experienced today.

    We deeply value the trust and confidence that you have placed in AMULETFORUMS, and we believe that you, as a treasured member of the forums, have a right to know about the service disruption that occurred today, 13th June 2017.

    TIMELINE

    At 1320hrs (UTC+8), we began to receive alerts about intermittent inaccessibility of the website. By 1345hrs (UTC+8), additional steps had been taken to preserve the forums database, and write activity on the clustered database was stopped immediately to prevent data discrepancies.

    By 1400hrs (UTC+8), it was evident that an underlying technical issue had occurred in the Load Balancing cluster, and the decision was made to switch the Load Balancers over to a standby location at a different datacenter.

    At 1500hrs (UTC+8), all system logs were scanned for security-related issues and the web server(s) were validated as being in good, secure health.

    At 1630hrs (UTC+8), verification checks of the database integrity were carried out, with positive results.

    Unfortunately, the transition was not completed as quickly as we had hoped, and network services were restored by 1700hrs (UTC+8). Additional configuration had to be applied before the new network setup was accepted, and full public-facing services were made accessible by 1930hrs (UTC+8).
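
    For members who want to see how long each stage took, the short Python sketch below computes the time elapsed since the first alert for each timestamp in the timeline. The timestamps come from the timeline above; the event labels are paraphrased, and the function names are illustrative only, not part of any tooling we run.

```python
from datetime import datetime

# Timestamps from the timeline above (all UTC+8, 13 June 2017).
# The labels are paraphrased summaries, not official event names.
TIMELINE = [
    ("1320", "first alerts received"),
    ("1345", "database writes stopped"),
    ("1400", "decision to move Load Balancers to standby datacenter"),
    ("1500", "log scan and web server validation"),
    ("1630", "database integrity verified"),
    ("1700", "network services restored"),
    ("1930", "full public-facing services accessible"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HHMM timestamps."""
    fmt = "%H%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def elapsed_since_first_alert(timeline):
    """Pair each event with the minutes elapsed since the first entry."""
    first = timeline[0][0]
    return [(t, label, minutes_between(first, t)) for t, label in timeline]


if __name__ == "__main__":
    for t, label, mins in elapsed_since_first_alert(TIMELINE):
        print(f"{t}hrs  +{mins // 60}h{mins % 60:02d}m  {label}")
```

    By this calculation, 6 hours and 10 minutes passed between the first alert at 1320hrs and full restoration at 1930hrs.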

    RESOLUTION AND REMEDY

    We have discovered that our best efforts at disaster recovery and business continuity planning (DR/BCP) did not produce satisfactory results, owing to the unreliability of the transition to the failover Load Balancer located in a different datacenter.

    In light of this, we are in the midst of rolling out a secondary set of Load Balancers within the same datacenter premises instead, and this set will act as the failover entity. The transition turnaround time is expected to be shorter as a result.

    Once again, we are sincerely sorry for the service disruption. The Account Management Team is also taking additional steps to compensate our Premium Membership account owners. If you are a Premium Membership owner, you will receive a notification message directly from us shortly.
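
    As a rough illustration of the failover idea described above, the Python sketch below picks the first healthy Load Balancer from an ordered list, with the same-datacenter secondary listed right after the primary. This is a minimal sketch under stated assumptions: the names, locations and health-check function are hypothetical placeholders and do not describe our actual configuration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class LoadBalancer:
    name: str      # hypothetical label, for illustration only
    location: str  # hypothetical datacenter identifier


def pick_active(
    candidates: Sequence[LoadBalancer],
    is_healthy: Callable[[LoadBalancer], bool],
) -> Optional[LoadBalancer]:
    """Return the first healthy candidate in priority order, or None."""
    for lb in candidates:
        if is_healthy(lb):
            return lb
    return None


if __name__ == "__main__":
    # Priority order: primary first, then the same-datacenter secondary.
    pool = [
        LoadBalancer("lb-primary", "dc-a"),
        LoadBalancer("lb-secondary", "dc-a"),
    ]
    # Stubbed health check that simulates the primary being down.
    down = {"lb-primary"}
    active = pick_active(pool, lambda lb: lb.name not in down)
    print(active.name if active else "no healthy load balancer")
```

    Keeping the failover target in the same datacenter means that a switch of this kind does not depend on a cross-datacenter network transition, which is the step that proved unreliable during this incident.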
     