Many major internet services were down for nearly 6 hours yesterday because of problems in Cloudflare's systems. Those affected included X and ChatGPT, which became inaccessible because the very systems meant to protect them from disruption were themselves disrupted.
This morning, Cloudflare confirmed that yesterday's outage was not caused by a cyberattack, contrary to claims circulating on social media. Instead, a change to its ClickHouse database system caused the "feature file", a configuration file for the Bot Management module, to double in size.
The file grew past the system's size limit, which caused a surge in HTTP 5xx errors. Cloudflare services recovered and failed again several times yesterday because a new version of the file was generated every 5 minutes.
Cloudflare said that, following the incident, it has tightened its handling of configuration files and added a global kill switch to prevent a recurrence.
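
To illustrate the kind of safeguard described above, here is a minimal sketch of a loader that rejects an oversized configuration file and keeps serving with the last known-good version. It is not Cloudflare's implementation: the class name `FeatureConfigLoader`, the `max_features` limit, and the `kill_switch` flag are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureConfigLoader:
    """Hypothetical loader: validates each new file before activating it."""

    max_features: int = 200            # assumed limit, for illustration only
    kill_switch: bool = False          # when True, no new file is accepted
    active: list[str] = field(default_factory=list)

    def apply(self, candidate: list[str]) -> bool:
        """Try to activate a newly generated file; return True if accepted."""
        if self.kill_switch:
            return False               # operator has frozen the configuration
        if not candidate or len(candidate) > self.max_features:
            return False               # reject; the previous file stays active
        self.active = list(candidate)
        return True


loader = FeatureConfigLoader(max_features=3)
loader.apply(["f1", "f2"])                   # accepted
loader.apply(["f1", "f2", "f1", "f2"])       # doubled in size: rejected
print(loader.active)                         # still the last good file
```

The design choice in this sketch is that a bad file is treated as a rejected update rather than a fatal error, so a file regenerated every few minutes cannot repeatedly take the service down.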
