Human error caused 2022 Rogers outage, system ‘deficiencies’ made it worse
The 2022 Rogers outage that left 12 million people without wireless and hard-wired services was caused by human error and made worse by management and system “deficiencies,” says an independent review conducted for Canada’s telecommunications regulator.
The review report also says steps taken by Rogers since the outage are “satisfactory to improve the Rogers network resiliency and reliability, as well as to address the root cause of the July 2022 outage.”
The 26-hour outage started early in the morning of July 8 and left individuals and businesses without access to their mobile, home phone, internet and 911 services.
The Canadian Radio-television and Telecommunications Commission (CRTC) commissioned Xona Partners in September 2023 to undertake the review and determine what caused the outage.
The engineering consultancy was also tasked with looking at whether the measures taken by Rogers since the outage are sufficient to prevent another incident.
Xona Partners’ findings were contained in the executive summary of the review report, released this month. The CRTC says the full report contains sensitive information and will be released in redacted form at a later, unspecified date.
The report summary says that in the weeks leading up to the outage, Rogers was undergoing a seven-phase process to upgrade its network. The outage occurred during the sixth phase of the upgrade.
“The July 2022 outage is attributed to an error in configuring the distribution routers within the Rogers IP network,” the report says.
Staff at Rogers caused the shutdown, the report says, by removing a control filter that directed information to its appropriate destination.
Without the filter in place, a flood of information was sent into Rogers’ core network, overloading and crashing the system within minutes of the control filter being removed.
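The failure mode described in the summary can be illustrated with a minimal sketch. It assumes the control filter behaves like an allow-list sitting in front of a core with finite capacity. Every name and number here (`CORE_CAPACITY`, `allow_list_filter`, the entry counts) is hypothetical and chosen only for illustration; none of it reflects Rogers' actual configuration or figures from the report.

```python
from __future__ import annotations

from typing import Callable, Iterable, Optional

# Hypothetical limit: how many entries the core can absorb before it fails.
CORE_CAPACITY = 1_000


def allow_list_filter(allowed: set[str]) -> Callable[[str], bool]:
    """Return a predicate that passes only entries on the allow-list."""
    return lambda entry: entry in allowed


def entries_reaching_core(
    advertised: Iterable[str],
    control_filter: Optional[Callable[[str], bool]] = None,
) -> int:
    """Count the entries that reach the core; with no filter, all of them do."""
    if control_filter is None:
        return sum(1 for _ in advertised)
    return sum(1 for entry in advertised if control_filter(entry))


def core_survives(load: int, capacity: int = CORE_CAPACITY) -> bool:
    """The core stays up only while the load is within its capacity."""
    return load <= capacity


# Illustrative numbers only: many entries are on offer, few are meant for the core.
advertised = [f"entry-{i}" for i in range(50_000)]
allowed = {f"entry-{i}" for i in range(200)}

with_filter = entries_reaching_core(advertised, allow_list_filter(allowed))
without_filter = entries_reaching_core(advertised)

print(with_filter, core_survives(with_filter))
print(without_filter, core_survives(without_filter))
```

In this toy model the filter is the only thing keeping the load under the core's limit, so removing it takes the load from well within capacity to far beyond it in a single step.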
Algorithm designated network upgrade as ‘low’ risk
The report says Rogers’ core network manages wireless and hard-wired data both internally, within the company, and externally, for outside customers and service providers.
“With both the wireless and wireline networks sharing a common IP core network, the scope of the outage was extreme in that it resulted in a catastrophic loss of all services,” the report says.
Having wireless and wireline services share the same network is a practice “common to many service providers,” the report says, adding that companies find it an efficient way to “balance cost with performance.”
Rogers has since announced that it will develop a new, separate network for its wireless systems while keeping hard-wired services on the old core network. The report says that work is ongoing.
The review says that because the first five stages of the network update took place without incident, “the risk assessment algorithm downgraded the risk level for the sixth phase” of…