Cloud "oops turns out our distributed system is centralized in a single datacenter" Flare
Some other juicy bits about the #Cloudflare outage:
- No 24/7 (experienced) technician availability at the datacenter that hosted their control plane(!)
- No end-to-end service dependency tracking or diagrams
- Therefore, supposedly HA services depending on non-HA infrastructure
- Even if the "redundant" setup *did* work (it didn't), all three locations would be physically within *the same earthquake zone*
This is absolute clowncar level network administration, frankly, for something the size and importance of Cloudflare.
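For what it's worth, the "HA service depending on non-HA infrastructure" problem is something you can check mechanically once the dependencies are written down anywhere at all. A rough sketch, with everything in it assumed for illustration: the `Service` record, the `find_ha_violations` helper, and the service names are all hypothetical and have nothing to do with Cloudflare's actual tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical inventory record: a name, an HA claim, and direct dependencies."""
    name: str
    ha: bool
    depends_on: list[str] = field(default_factory=list)


def find_ha_violations(services):
    """Return dependency paths that lead from an HA service to a non-HA one.

    Each path is a list of service names, starting at the HA service and
    ending at the first non-HA (or untracked) dependency reached.
    """
    by_name = {s.name: s for s in services}
    violations = []
    for svc in services:
        if not svc.ha:
            continue
        # Depth-first walk over transitive dependencies of this HA service.
        stack = [(dep, [svc.name, dep]) for dep in svc.depends_on]
        seen = set()
        while stack:
            name, path = stack.pop()
            if name in seen:
                continue
            seen.add(name)
            dep = by_name.get(name)
            # An untracked dependency is treated as non-HA (assumption).
            if dep is None or not dep.ha:
                violations.append(path)
                continue
            stack.extend((d, path + [d]) for d in dep.depends_on)
    return violations


# Made-up inventory: two "HA" services that both bottom out in one non-HA box.
inventory = [
    Service("dashboard", ha=True, depends_on=["auth"]),
    Service("auth", ha=True, depends_on=["control-plane-db"]),
    Service("control-plane-db", ha=False),
]

for path in find_ha_violations(inventory):
    print(" -> ".join(path))
```

The point is only that the check is cheap once the graph exists; the hard part is keeping the inventory honest.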
@joepie91 wait, the internet went down while I was out eating dinner? I better go read up on this lol
@joepie91 sorry, are you talking about the multi-billion-dollar networking company or a 6-month-old startup?
@thibaultmol Yeahhhh
@joepie91 haven't followed what happened at Cloudflare but:
- BGP is what Cloudflare is using?
- you very much can fuck up BGP, remember the Facebook outage? BGP route hijacks also happen, sometimes to the extent of your traffic randomly going via China Telecom
- you need to register as an LIR with one of a few orgs like RIPE, and own/lease blocks of IP addresses for yourself, not sure what that costs
"Well, we *thought* we had High Availability, but we never actually tested that"
- Cloudflare, supposed distributed systems experts, processing a double-digit percentage of the world's web traffic