The global outage: how to manage the impact of service disruptions

Nikos Georgakopoulos
24 July 2024
7 min read

CrowdStrike and Microsoft: The great outage

In an increasingly online world, a cyber outage can bring businesses around the globe to a standstill. The gravity of such an outage was recently made clear when a software update from cybersecurity company CrowdStrike inadvertently disrupted IT systems globally, taking Windows machines offline.
The outage shut down critical infrastructure systems across many industries, cancelling more than 4,000 flights worldwide, and stalling financial and healthcare services. According to Microsoft, the tech failure disabled an estimated 8.5 million Windows devices.
Downtime can be a costly business, and this event is a reminder of how important it is to put safeguards in place before an outage happens. According to Gartner, downtime can cost companies $5,600 per minute, which adds up to more than $300,000 per hour. But by taking proactive steps to put the right technology in place before an outage, you can lessen its impact and recover faster. Many organisations don’t have the systems in place to manage a service outage, so the ability to assess risk and be proactive, not just reactive, is essential.
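As a rough illustration of how quickly these costs accumulate, the short Python sketch below multiplies an outage's duration by a per-minute cost. The $5,600 default is the Gartner figure quoted above; the function name and the example durations are assumptions made for illustration, and the real per-minute figure will vary from business to business.

```python
def estimate_downtime_cost(minutes: float, cost_per_minute: float = 5_600.0) -> float:
    """Return a rough downtime cost estimate in dollars.

    cost_per_minute defaults to the Gartner figure quoted in the article;
    replace it with a number that reflects your own business.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return minutes * cost_per_minute


if __name__ == "__main__":
    # Example durations are illustrative only.
    for label, minutes in [("15 minutes", 15), ("1 hour", 60), ("4 hours", 240)]:
        print(f"{label}: ${estimate_downtime_cost(minutes):,.0f}")
```

Even a simple estimate like this can help make the case for investing in safeguards before an outage rather than after one.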
In this blog, we’ll explore how you can mitigate the impact of an IT outage and, ultimately, reduce the risk of such incidents in future. But first, let’s take a look at the damage it can cause.

The damage of downtime

The potential revenue lost from downtime is bad enough. It can also place pressure on resources, since urgent action is needed to address the issue. But the damage doesn't stop there. The true cost of downtime can be hard to quantify. For example, a damaged reputation can lead to a decline in customer trust and reduced business over time.
Let’s take a look at four major impacts of service disruption:
  1. Negative brand perception. Service interruptions stop businesses from providing services to customers, which can frustrate or anger them and lead to a loss of trust. The CrowdStrike incident has been described as “the largest IT outage in history,” and an incident of that magnitude invites public scrutiny of a brand's integrity.
  2. Decreased customer loyalty. Interruptions decrease customer trust and may mean your customers switch to a competitor, resulting in loss of loyalty to your brand.
  3. Increased customer service demands. A major incident can create a complaints and queries backlog, putting pressure on resources and calling for extra help dealing with these requests. AI-powered solutions can help shoulder the load and take care of repetitive or common queries.
  4. Operational expenses. Aside from damaging your reputation, service disruptions can be expensive: you may need to compensate customers, pay staff to work additional hours, or spend money on additional services.
We need to find a way to reduce uncertainty and frustration when we have outages. A robust service status platform isn’t just about mitigating crises—it’s about demonstrating reliability and commitment to the customer.
Effie Bagourdi
Service Management Practice Lead at Adaptavist

Scoping a service management solution that can navigate service disruptions

When an outage does occur, it’s important to understand the scope and cause of the disruption and take action to fix it, while communicating with employees, stakeholders or clients on resolution time. With multi-layered visibility, teams can get a full picture of this information and look back after the event to learn from data for future outages.
Visibility, insights, and automation
To prevent or simplify incident escalations, IT teams need visibility, insights, and action automation across their IT operations. Without the right visibility and insights, it can be difficult to find where problems are arising within the network and what is causing them. Monitoring tools can meet these requirements and provide action automation as well. With a monitoring tool, IT teams gain full-stack observability and can turn monitoring data into meaningful, actionable insights in real time to manage and prevent outages.
Centralised, real-time observability
Organising recovery plans in a centralised location makes it easier to find the tools you need to bounce back if an incident recurs down the line. This helps you plan for future events and also keeps companies accountable, with an audit trail to support decisions and outputs.
Proactive monitoring
For those embracing digitalisation, sadly there is no way to completely safeguard against an outage, but with proactive monitoring you can mitigate the impact. Through proactive monitoring, teams can perform preventative maintenance and fix issues before they develop into an outage. While it’s not a silver bullet, comprehensive monitoring provides teams with continuous visibility into their IT environments, enabling issues to be fixed faster and the length of outage times to be shortened.
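To make the idea of catching issues before they become outages more concrete, here is a minimal Python sketch of a degradation check. It flags a service when its most recent health checks are all slow or failing, so a team can step in early. The `HealthSample` type, the `needs_attention` function, the 500 ms limit, and the three-check window are all hypothetical choices for illustration, not features of any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class HealthSample:
    latency_ms: float  # response time observed by one health check
    ok: bool           # whether the check succeeded at all


def needs_attention(samples: Sequence[HealthSample],
                    latency_limit_ms: float = 500.0,
                    consecutive: int = 3) -> bool:
    """Return True when the last `consecutive` samples are all degraded.

    A sample counts as degraded if the check failed or was slower than
    latency_limit_ms. Both thresholds are illustrative assumptions.
    """
    if consecutive <= 0:
        raise ValueError("consecutive must be positive")
    if len(samples) < consecutive:
        return False  # not enough data to judge yet
    recent = samples[-consecutive:]
    return all((not s.ok) or s.latency_ms > latency_limit_ms for s in recent)


if __name__ == "__main__":
    history = [
        HealthSample(120, True),
        HealthSample(650, True),   # slow
        HealthSample(700, True),   # slow
        HealthSample(0, False),    # failed outright
    ]
    print("Raise an alert:", needs_attention(history))
```

Requiring several degraded checks in a row is one simple way to avoid alerting on a single blip; a real monitoring tool will offer richer options.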
Adaptavist can help build the right solution for your needs to minimise the impact of service disruption.

Tools to take your solution to the next level

Statuspage
Atlassian’s Statuspage can support a smooth customer experience even when services are down. Using the tool’s pre-written templates, you can communicate real-time status updates of your incident management process. Additionally, seamless integration with the monitoring, chat, and help desk tools you already use means you can get the message out faster and more easily, helping to maintain customer trust and brand reputation. Plus, by being proactive with communication and getting ahead of incoming questions, you’ll reduce incoming support requests and the number of duplicate support tickets.
Jira and Jira Service Management integration
When brought together, these powerful Atlassian platforms can be used to reduce downtime and enhance your service delivery. They act as an all-in-one solution that can make a huge difference in how you identify and resolve major incidents. A Jira Service Management (JSM) solution may be the right fit to support you in managing service disruptions.
Crucially, JSM integrates with Jira, so critical issues can be automatically routed to development teams based on criteria you set, and those teams can prioritise work based on severity. Plus, incident conference call and chat channel features support collaborative working. Together, these features help reduce your mean time to resolution for major incidents.
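The routing and prioritisation logic described above can be sketched in a few lines of plain Python. This is not JSM's automation syntax or API; in practice you would configure this behaviour inside the product. The component names, team names, and severity levels below are hypothetical and only show the shape of the rules.

```python
from dataclasses import dataclass
from typing import Dict, List

# Lower rank means more urgent. The level names are illustrative.
SEVERITY_RANK: Dict[str, int] = {"critical": 0, "high": 1, "medium": 2, "low": 3}

# Hypothetical mapping from affected component to the owning development team.
ROUTING_RULES: Dict[str, str] = {"payments": "payments-dev", "login": "identity-dev"}
DEFAULT_TEAM = "service-desk"


@dataclass
class Incident:
    key: str
    component: str
    severity: str


def route(incident: Incident) -> str:
    """Send critical and high incidents on known components to the owning team."""
    if SEVERITY_RANK[incident.severity] <= SEVERITY_RANK["high"]:
        return ROUTING_RULES.get(incident.component, DEFAULT_TEAM)
    return DEFAULT_TEAM


def prioritise(incidents: List[Incident]) -> List[Incident]:
    """Order incidents from most to least severe, keeping ties in arrival order."""
    return sorted(incidents, key=lambda i: SEVERITY_RANK[i.severity])


if __name__ == "__main__":
    queue = [
        Incident("INC-1", "login", "medium"),
        Incident("INC-2", "payments", "critical"),
        Incident("INC-3", "search", "high"),
    ]
    for inc in prioritise(queue):
        print(inc.key, inc.severity, "->", route(inc))
```

Keeping the criteria in one place, as the rule table does here, makes it easier to review and adjust them after each major incident.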

Want to manage your service outages the right way?

As the saying goes, it can take 20 years to build a reputation and five minutes to ruin it. Given the potential damage of a service outage, businesses must take a proactive approach to managing and mitigating disruptions.
While it's impossible to avoid service disruptions altogether, you can mitigate the damage they cause by implementing strategies, processes, and best practices. Taking proactive action will enable you to maintain a high level of customer service, helping you to protect your reputation and customer trust.
Get in touch with us today so we can discuss how to prepare you for any future service disruptions.
Written by
Nikos Georgakopoulos
Head of Service Management Strategic Advisors
Nikos boasts nearly two decades' experience within IT, holding a bachelor’s in computer science, a master’s in business administration, and experience with the Atlassian ecosystem. This equips him to effectively solve problems and enhance the productivity of organisations within regulated industries.