4 min read

Poor major incident management could put your organisation at security risk

Rich Blunt
16 February 24 ITSM
ITSM major incident management

A major IT issue can have a major impact on your business, which is why successfully managing these events is crucial to business continuity. This is what’s known as major incident management (MIM).

In this blog we’re taking a closer look at MIM. We’ll explain why MIM is so important to avoiding and minimising the impact of security threats, the cost major incidents can have to organisations like yours, the challenges MIM poses to businesses, and the best practices you need to follow to overcome them. 

What is a major incident?

What constitutes a major incident will vary between organisations. But, simply put, these are high-impact, urgent issues, like an emergency-level outage or loss of service for customers. An IT incident that doesn’t interfere with essential tasks is probably not a major incident.

Major incidents typically require an immediate response from incident management teams. If customers can’t access services or employees aren’t able to complete their work on time, the business is likely to take a financial hit (among other impacts).

Incident Management vs Major Incident Management

While you can’t prevent major incidents entirely, how you handle them can make all the difference to the costs your organisation incurs.

But how does your MIM process differ from how you handle other IT incidents? IT service disruptions are not uncommon, and incident management is the process of managing these and making sure services are running smoothly in line with Service Level Agreements. Users typically report an issue to your service desk and an IT team member then resolves that issue.

Major incidents, on the other hand, will be reported from multiple sources. They require a special team of experts to handle and resolve them. Typically, this includes the following four stages:

  • Identification – where the major incident is declared and stakeholders are informed.
  • Containment – where a team is assembled and comes together to solve the issue.
  • Resolution – implementing the resolution plan, documenting it as a change.
  • Maintenance – reviewing the incident to check it’s been resolved, documenting the process, and taking metrics to measure the service desk’s effectiveness.
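
As a rough illustration, the four stages above can be modelled as a simple ordered lifecycle. This is a minimal Python sketch, not how any particular service desk tool works; the names `Stage` and `next_stage` are hypothetical and chosen only for this example.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The four MIM stages, in the order listed above."""
    IDENTIFICATION = 1
    CONTAINMENT = 2
    RESOLUTION = 3
    MAINTENANCE = 4


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after the last stage."""
    ordered = list(Stage)  # Enum members iterate in definition order
    position = ordered.index(current)
    if position + 1 < len(ordered):
        return ordered[position + 1]
    return None


# Walk an incident through the whole lifecycle.
stage: Optional[Stage] = Stage.IDENTIFICATION
visited = []
while stage is not None:
    visited.append(stage.name)
    stage = next_stage(stage)
```

Keeping the stages in a fixed order like this makes it harder to skip a step, such as closing an incident before the review has happened.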

It’s also important to understand that major incidents received by a service desk are often escalated to technical teams, such as Development and Operations, so they can work together to resolve the issue as quickly and effectively as possible. Let’s take a look at what a MIM team should look like.

The cost of security threats to Facebook

Major incidents mean downtime, and downtime doesn’t come cheap. According to Gartner surveys, the average cost of downtime has been estimated at $5,600 per minute, with hourly figures ranging from $100,000 to $540,000.

For some higher-risk organisations, in sectors including finance, government, healthcare, and media and communications, this cost has been put even higher – upwards of $5 million per hour. For example, in 2019, Facebook’s 14-hour downtime cost the organisation nearly $90 million.
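
To make these figures concrete, here is a back-of-the-envelope calculation in Python. It only multiplies out the numbers quoted above; the function name `downtime_cost` is hypothetical, and real costs will vary widely between organisations.

```python
def downtime_cost(minutes: float, cost_per_minute: float) -> float:
    """Estimate the cost of an outage as duration multiplied by a per-minute rate."""
    return minutes * cost_per_minute


AVERAGE_COST_PER_MINUTE = 5_600  # the average per-minute figure quoted above

# One hour at the average rate: 60 * 5,600 = 336,000
one_hour_at_average = downtime_cost(60, AVERAGE_COST_PER_MINUTE)

# Hourly rate implied by the Facebook example: $90 million over 14 hours
facebook_hourly = 90_000_000 / 14

print(f"One hour at the average rate: ${one_hour_at_average:,.0f}")
print(f"Implied Facebook hourly cost: ${facebook_hourly:,.0f}")
```

At the average rate an hour of downtime comes to $336,000, which sits inside the hourly range above, while the Facebook figure works out at roughly $6.4 million per hour, consistent with the "upwards of $5 million" estimate for higher-risk organisations.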

Of course, loss of revenue is not the only ‘cost’ organisations have to pay for downtime. Business disruptions – we’re talking reputational damage and customer churn here – can be even more costly than revenue loss. Then there might be SLA financial penalties, government fines, and the cost of lawsuits associated with the incident, as well as the knock-on effect on any physical products or equipment.

Other downtime costs include end-user productivity and internal productivity (for IT teams and those affected by the incident). The latter can also have an impact on employee happiness and staff retention, another less obvious effect of downtime.

Common challenges with major incident management and how to overcome them

With your major incident team (MIT) in place, it’s important you don’t fall foul of the missteps that organisations often make with MIM. Here we take a look at the most common ones and explain what you can do to avoid them.

1. Manual processes

Speed and efficiency are everything when it comes to MIM, and manual, repetitive processes hold your technicians back, slowing down resolution time.

How to fix it – automating your service desk processes, such as notifying stakeholders, frees up technicians’ time so they can focus on resolution.

2. Inconsistent communication

Keeping managers and key stakeholders in the loop is vital so people know about the status of the incident and what’s being done to fix it. Manual communication can be inconsistent and problem-ridden. 

How to fix it – again, automation plays a key role here, ensuring communication is structured and consistent, and everyone is notified throughout the entire ticket life cycle of what’s going on. Prompt communication keeps end users informed too so they can prepare for downtime.
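
As a hedged sketch of what structured, automated updates might look like, the Python below sends one templated message to every stakeholder. The function `broadcast_status`, the `send` callback, the incident ID and the email addresses are all hypothetical; a real service desk tool would provide its own notification mechanism.

```python
from typing import Callable, Iterable, List


def broadcast_status(
    incident_id: str,
    status: str,
    stakeholders: Iterable[str],
    send: Callable[[str, str], None],
) -> List[str]:
    """Send the same templated update to every stakeholder and return who was notified."""
    message = f"[{incident_id}] Status update: {status}"
    notified = []
    for recipient in stakeholders:
        send(recipient, message)  # delivery is delegated to the caller-supplied sender
        notified.append(recipient)
    return notified


# Usage: a stand-in sender that collects messages instead of emailing anyone.
outbox = []
broadcast_status(
    "MI-001",
    "Investigating",
    ["service-owner@example.com", "it-manager@example.com"],
    lambda recipient, message: outbox.append((recipient, message)),
)
```

Because every recipient gets the same wording from the same template, nobody is working from a different version of events.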

3. Lack of visibility

It’s common for teams involved in a major incident to lack visibility of the original issue and the progress made so far, both of which are vital to resolving the incident.

How to fix it – determine which stakeholders will be involved in major incidents, lay out a process that ensures the right stakeholders are notified, and decide which tools will support your MIM.

4. Wasted time

If MIM tasks aren’t clearly delegated and those assignments communicated to everyone else, then this can result in duplicated efforts, slowing down resolution.

How to fix it – incident management solutions like Jira Service Management can help with every step of your response process, including alerting teams to their assigned tasks so everyone can stay on the same page.

5. Poor documentation

If you don’t document every major incident, then every time one occurs you’re starting from square one. This means more downtime for your customers while you figure things out.

How to fix it – the major incident manager should record all the steps you take to fix the incident, as well as other key information, like the impact it’s had. This will give you a blueprint that helps speed up resolution in future.
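
One possible shape for such a record is sketched below in Python. The class `IncidentRecord`, its fields and the sample incident are assumptions made for illustration, not a prescribed format; in practice this information would usually live in your service desk tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple


@dataclass
class IncidentRecord:
    """A minimal write-up of one major incident."""
    title: str
    impact: str
    steps: List[Tuple[datetime, str]] = field(default_factory=list)

    def log_step(self, description: str) -> None:
        """Record a step taken, stamped with the current UTC time."""
        self.steps.append((datetime.now(timezone.utc), description))

    def summary(self) -> str:
        """Return a plain-text summary that can be reused as a blueprint."""
        lines = [f"Incident: {self.title}", f"Impact: {self.impact}", "Steps taken:"]
        lines += [f"  {n}. {text}" for n, (_, text) in enumerate(self.steps, start=1)]
        return "\n".join(lines)


# Usage with a made-up incident.
record = IncidentRecord("Payment API outage", "Customers unable to check out")
record.log_step("Rolled back the latest release")
record.log_step("Confirmed checkout is working again")
print(record.summary())
```

Timestamping each step as it is logged also gives you the raw data for measuring how long resolution took.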

What’s next?

With a better understanding of what MIM is all about and the common challenges organisations face, it’s worth considering an all-in-one solution that can help you tackle these head on. We already mentioned Jira Service Management (JSM), which can make a huge difference to how you identify and resolve major incidents.

It’s important to know that JSM integrates with Jira Software, so critical issues can be automatically routed to development teams based on criteria you set, and they can prioritise work based on severity. It also supports collaborative working, thanks to incident conference call and chat channel features, and problem management with its incident investigation view. All great for reducing your mean time to resolution when it comes to major incidents.

Want to know more about how Jira Software and JSM work together to reduce downtime and enhance your service delivery? 

About the authors

Rich Blunt

Having worked for many years in the ITSM and ESM industry, Rich has seen the good, the bad and the ugliest ways to work. He uses this experience and his knowledge of the Atlassian ecosystem to help customers avoid the many pitfalls of service management and to provide the best solutions to their problems and challenges.