Elevated errors on api.blockset.com
Incident Report for Blockset
Postmortem

On Sunday night, our internal monitoring systems alerted us to elevated error rates on api.blockset.com. The errors affected the BRD mobile app as well as enterprise Blockset customers. We apologize for the downtime and are taking measures to ensure this type of failure does not recur in our infrastructure. Several mitigation strategies are already in place, and more will be implemented before the end of the year.

Timeline

  • Our team was alerted at 2020-12-13 23:49 UTC to elevated error rates stemming from authentication failures. During the outage, between 20% and 80% of requests were failing.
  • We noticed that subscription (web-hook) queries, which are served by the same database as authentication, were timing out. Further investigation led us to believe this may have been caused by a new feature in our Ethereum support for delegated ERC-20 contracts.
  • We brought service fully back online at 2020-12-14 01:46 UTC by disabling web-hooks until the issue could be resolved. This reduced error rates to baseline.
  • We issued a patch to optimize subscription queries at 2020-12-14 02:15 UTC, which brought web-hooks back online.

Remediation

  • Isolate the authentication workload from other processes in Blockset. This will prevent irregularities in other subsystems, such as notifications, from affecting public API traffic.
  • Implement a fall-back authentication option for cases of temporary unavailability. This will allow requests to the public API to continue even if the authentication system suffers a temporary outage.
  • Optimize our subscription queries further. This will give our notification system more performance headroom so we can absorb abnormal traffic patterns in the future.
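
As an illustration of the fall-back authentication item above, the following is a minimal sketch of one possible approach, not a description of Blockset's actual implementation. It assumes a primary check that raises an error when the authentication backend is unreachable, and it honors tokens that were validated successfully within a short grace window. The names `FallbackAuthenticator`, `AuthUnavailable`, `primary_check`, and `grace_seconds` are all hypothetical.

```python
import time
from typing import Callable, Dict


class AuthUnavailable(Exception):
    """Hypothetical error: the primary auth backend cannot be reached."""


class FallbackAuthenticator:
    """Wraps a primary token check with a short-lived cache of good tokens.

    If the primary check is unavailable, a token is still accepted when it
    was validated successfully within the last `grace_seconds`.
    """

    def __init__(
        self,
        primary_check: Callable[[str], bool],
        grace_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._primary_check = primary_check
        self._grace_seconds = grace_seconds
        self._clock = clock
        # Maps token -> time of its most recent successful validation.
        self._last_good: Dict[str, float] = {}

    def is_authorized(self, token: str) -> bool:
        try:
            ok = self._primary_check(token)
        except AuthUnavailable:
            # Fall-back path: trust only tokens seen recently as valid.
            seen = self._last_good.get(token)
            if seen is None:
                return False
            return self._clock() - seen <= self._grace_seconds

        if ok:
            self._last_good[token] = self._clock()
        else:
            # A definitive rejection must also revoke any cached approval.
            self._last_good.pop(token, None)
        return ok
```

The trade-off in a design like this is that a token revoked during an outage stays usable until the grace window expires, so the window would need to be kept short.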
Posted Dec 14, 2020 - 20:48 UTC

Resolved
We have resolved all outstanding issues and our on-call team will continue to monitor closely for the next few hours.
Posted Dec 14, 2020 - 02:28 UTC
Update
We are continuing to monitor for any further issues.
Posted Dec 14, 2020 - 01:56 UTC
Monitoring
We have addressed the primary cause of the incident and are monitoring to ensure stability.
Posted Dec 14, 2020 - 01:26 UTC
Update
We are continuing to investigate this issue.
Posted Dec 14, 2020 - 00:46 UTC
Update
We are continuing to investigate this issue.
Posted Dec 14, 2020 - 00:44 UTC
Investigating
api.blockset.com is currently experiencing elevated error rates. The on-call team is investigating the issue and we should have a resolution shortly.
Posted Dec 14, 2020 - 00:29 UTC
This incident affected: Blockchains, Blocks, Currencies, Subscriptions, Transactions, and Transfers.