Adaptavist
July 21, 2020

How to clean up Confluence in six steps

Andra Dinu 12 minute read
Keeping Confluence tidy is key to keeping it usable as a knowledge base. After years of intensive use and unregulated growth, most Confluence instances end up exhibiting common symptoms:
  
  • Outdated pages with unreliable content
  • Unnecessary content using up precious storage
  • Cluttered and confusing navigation
If ignored, these issues can develop into major problems. When users can’t find the content they’re looking for, or can’t rely on the pages they find in Confluence, they stop using it altogether and instead develop their own ‘knowledge silos’ - not something you’d want to happen if Confluence is going to be your single source of organisational truth. This last article in our ‘Confluence Content Management Automation’ series is dedicated to cleaning up. The first two parts in the series are here:

There Is No Silver Bullet

While there are common symptoms, ultimately every messy Confluence instance is messy in its own way. For some, the solution will be deleting old content; others will need a complete re-arrangement of their instance; some might need to go after add-ons and macros as part of a cleanup, while others will focus entirely on pages. 

‘What’s the best way to clean up my Confluence instance?’ is a question without a one-size-fits-all answer. However, based on our consulting experience and the feedback we get from ScriptRunner for Confluence users in organisations of all shapes and sizes, we’ve created a step-by-step plan you can adapt to your unique situation.

Step 1: Categorise all your spaces

A cleanup operation needs a lot of human input. There are apps such as ScriptRunner that will make your life easier with bulk actions and automation, but ultimately it’s you who needs to decide what to move where and what you need to delete or keep. Start by taking stock of all assets in your instance, beginning with a complete categorisation of all your spaces:
 
  1. In the header bar, click the spaces menu, and select "Space directory" at the bottom of the menu. This will give you a complete list of all spaces in your instance.


    All the spaces in my very messy demo instance listed with Space Directory

  2. Using a simple spreadsheet, identify:
  • Which spaces can be categorised by department/team/project?
  • Are there any intersections, such as spaces used by multiple departments?
  • Who are the intended users for each space?
  • Who are the space owners? Be aware that in some cases the right person to help you understand how a space is used might not be the space owner.

Use a spreadsheet to take stock of your spaces

As you classify spaces using the method above, you will start noticing patterns and see the same questions emerge for different groups of people. Some teams might have a lot of spaces that seem to be essentially the same thing - maybe they can be combined? Other spaces appear to be really old - have they been completely abandoned? Remember, at this stage you are just taking stock and understanding what belongs to whom, so don’t take any action just yet.
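If you would rather build the stock-take spreadsheet programmatically, the inventory described above could be sketched roughly as follows. This is only an illustration: the `SpaceRecord` fields, the helper names, and the sample rows are all assumptions made up for this example, and the data would in practice come from your own Space Directory.

```python
import csv
import io
from collections import defaultdict
from dataclasses import asdict, dataclass, fields


@dataclass
class SpaceRecord:
    """One spreadsheet row per space. All field names are illustrative."""
    key: str
    name: str
    category: str        # department, team, or project
    intended_users: str
    owner: str
    best_contact: str    # may differ from the space owner


def to_csv(records):
    """Serialise the inventory so it can be opened in any spreadsheet tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(SpaceRecord)],
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


def spaces_per_category(records):
    """Group space keys by category to spot teams with many similar spaces."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.category].append(record.key)
    return dict(grouped)


# Made-up sample rows, for illustration only.
inventory = [
    SpaceRecord("MKT", "Marketing", "Marketing", "Marketing team", "alice", "alice"),
    SpaceRecord("MKT2", "Marketing (old)", "Marketing", "Marketing team", "bob", "alice"),
    SpaceRecord("TS1", "Test Space 1", "Engineering", "QA engineers", "carol", "dave"),
]

print(to_csv(inventory))
print(spaces_per_category(inventory))
```

Grouping by category makes it easier to see where one team owns several spaces that might be candidates for merging later on.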

Top tip:

It’s good practice for a general cleanup operation to also take stock of the apps used on your instance. Are you really using all the apps you’re paying for? Can you save money by replacing multiple smaller apps with one app that has multiple features?

Step 2: Investigate content quality


Once you have categorised your spaces, you will have a better understanding of the main things you need to tackle. The next step is a deeper dive into your content:

Find pages and spaces that are not getting views and updates.

Looking for pages and spaces that get few or no views is a good way to find potentially unnecessary content. You can use the Confluence Usage Stats plugin from Atlassian (note that it is known to cause performance problems on large installations) or other apps available on the Marketplace. If you are already tracking your Confluence instance with Google Analytics, you can see that information in the analytics console.

If pages (and spaces) haven’t been updated in a long time, you may no longer need them. An easy way to find these pages is by using the ‘Old Content notifier’ job in ScriptRunner for Confluence. Use a CQL query to focus on the content you're interested in and find pages that haven't been modified in a long time. Then either run the job now or schedule it so you're regularly notified of pages that are getting old and less relevant to your users. This is a great way to ensure you can easily stay on top of your content in Confluence without needing to do the heavy lifting yourself.


Pages in the TS1 space that haven’t been modified in more than 104 weeks
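As a rough illustration of the kind of CQL query such a job could be given, here is a small sketch that assembles one. The helper name `stale_pages_cql` is hypothetical, and the query assumes the standard CQL fields `type`, `space` and `lastmodified` together with the `now()` function, so check it against the CQL reference for your Confluence version before relying on it.

```python
def stale_pages_cql(space_key: str, weeks: int, content_type: str = "page") -> str:
    """Build a CQL string matching content not modified in the last `weeks` weeks.

    Hypothetical helper: it only assembles text and does not contact Confluence.
    """
    if weeks <= 0:
        raise ValueError("weeks must be a positive integer")
    if '"' in space_key:
        raise ValueError("space key must not contain double quotes")
    return (
        f'type = {content_type} AND space = "{space_key}" '
        f'AND lastmodified < now("-{weeks}w")'
    )


# The TS1 space and the 104-week threshold mirror the example above.
print(stale_pages_cql("TS1", 104))
```

Keeping the threshold as a parameter makes it easy to start with a generous cut-off and tighten it once space admins have reviewed the first batch.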

The Space Statistics built-in script in ScriptRunner for Confluence will let you further compare and contrast your spaces based on the number of pages and comments, the volume of attachments, the number of labels used, the number of distinct users who commented, and the creation date. This extra information will give you a better understanding of the problems you’re facing and of which spaces are most affected.


Comparison between three spaces I selected in my demo instance using Space Statistics

Find pages and spaces that are hard to navigate

Content quality can also refer to how easy it is to navigate a space. You will likely find many spaces that entirely lack context. Sometimes, this will not be a problem:  a small space may contain meeting notes or resources for a single project, and as long as the project team understands what is in there, there is no need to fuss. However, if that space is meant to be used as a welcome space, for department info, documentation, etc., you will certainly need to reorganise it and add more contextual information. 


Label your findings

As you identify problematic pages and spaces, label them with ‘duplicate’, ‘needs-review’, etc.  Confluence offers a series of macros, such as Display Pages with Labels, that will help you display and organise the content you identified and prepare it for the next step. 


Step 3: Check in with your space admins

Now is the time to ask the space owners to get involved in the process. For example, they can tell you for sure what can be safely cleaned up and what needs to be kept, whether the four spaces which seem to contain more or less the same thing can be reorganised into a single one, or whether the huge quantity of attachments taking up storage space can safely be deleted.

If you’re using the Display Pages with Labels macro, you can create a Confluence page displaying the content you identified by label, and point your admins in that direction.

If you have ScriptRunner for Confluence, take a look at the Custom CQL Macro. It will allow you to use a CQL search to display to each space admin all the content they need to check organised by label, in a neat table on a single Confluence page.
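To give an idea of what a label-based CQL search for each space admin might look like, here is a minimal sketch. The function names and the admin-to-space mapping are invented for this example. The query assumes the standard CQL `space`, `type` and `label` fields and the `in` operator. The labels come from the examples in Step 2.

```python
REVIEW_LABELS = ["duplicate", "needs-review"]


def review_cql(space_key: str, labels=REVIEW_LABELS) -> str:
    """Build a CQL string listing pages in one space that carry a review label."""
    if not labels:
        raise ValueError("at least one label is required")
    quoted = ", ".join(f'"{label}"' for label in labels)
    return f'space = "{space_key}" AND type = page AND label in ({quoted})'


def queries_for_admins(admin_spaces: dict) -> dict:
    """Map each admin to one query per space they look after."""
    return {
        admin: [review_cql(key) for key in keys]
        for admin, keys in admin_spaces.items()
    }


# Made-up mapping of space admins to space keys, for illustration only.
print(queries_for_admins({"carol": ["TS1"], "alice": ["MKT", "MKT2"]}))
```

Each resulting string could then be pasted into the macro on the page you send to the relevant admin.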

Step 4: Removing unnecessary content

Once your space admins have given their feedback, you’re finally ready to clean up. This is when you decide what to delete and what to archive.

Archiving unnecessary content

When it comes to Confluence, the term ‘archiving’ can be a bit confusing because it can mean different things to different people. Depending on what article you are reading or what expert you are talking to, the same term is used to mean either ‘moving content to a reserved space on the same instance’ or ‘moving old content to a different server in order to save space’.

If the main purpose of your cleanup is to make it easier for people to use and find things in Confluence, it may be enough to archive old content in a separate space on the same instance so it doesn’t confuse users. 

As an admin or space admin, you can even ‘hide’ old pages in the same space by using Space tools > Reorder pages, then moving them outside the page tree and restricting them. This is a fast, easy way to ‘clean’ the space: the content sits just outside the page tree, but stays accessible.


‘Hide’ old pages from your users in the same space

Deleting content

If the main purpose of your cleanup is to make more space on your instance, you will have to either completely delete unnecessary content or take it off the instance.

Top tip: 

While you’re moving content around, you can restrict access to pages - either for moving them to a separate server later or for deleting them. If you’re using ScriptRunner, use the Add/Remove Restrictions to Parent & Child Pages built-in script to add viewing and editing restrictions to a given page and all of its descendants at the same time.

With out-of-the-box Confluence, deleting content will be very time-consuming: you will have to delete page by page, comment by comment, attachment by attachment, etc. ScriptRunner for Confluence offers a series of easy-to-use built-in scripts that will help you save a lot of time by allowing you to perform deletions in bulk:

  • Bulk Delete Attachments - deletes all attachments (or attachments of a specified age) for a page or multiple pages
  • Bulk Delete Comments - deletes all comments (or comments of a specified age) for a page or multiple pages
  • Bulk Purge Trash - permanently deletes all trash for one or more specific spaces or all spaces
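Before agreeing on an age threshold for a bulk deletion like the ones listed above, it can help to estimate how much would actually be removed. The sketch below does that on a hand-made list. The `Attachment` record, the helper names and the sample data are all assumptions for illustration, and the code does not reflect how the built-in scripts work internally.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Attachment:
    """Illustrative record; in practice this would come from your own export."""
    page_title: str
    file_name: str
    size_bytes: int
    last_modified: date


def older_than(attachments, max_age_days, today):
    """Return attachments last modified more than `max_age_days` before `today`."""
    cutoff = today - timedelta(days=max_age_days)
    return [a for a in attachments if a.last_modified < cutoff]


def reclaimable_bytes(attachments):
    """Total storage that deleting the given attachments would free."""
    return sum(a.size_bytes for a in attachments)


# Made-up sample data, for illustration only.
sample = [
    Attachment("Release notes", "build-1.0.zip", 50_000_000, date(2017, 3, 1)),
    Attachment("Release notes", "build-4.2.zip", 80_000_000, date(2020, 6, 15)),
    Attachment("Team offsite", "photos.zip", 120_000_000, date(2016, 9, 30)),
]

candidates = older_than(sample, max_age_days=730, today=date(2020, 7, 21))
print([a.file_name for a in candidates], reclaimable_bytes(candidates))
```

A dry run like this gives space admins a concrete list to approve before anything is permanently deleted.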

ScriptRunner allows Confluence admins to give access to these built-in scripts to space admins. So, once you have agreed on what needs to be deleted, you can just instruct space admins to move on to the deletion process in their own spaces.

Once enabled, space admins have access to built-in scripts from Space tools > Advanced Space Functionality

Step 5: Reorganise the remaining content

Once you get rid of the content you no longer need, it’s time to reorganise what’s left so it becomes easily accessible and readable. 

This can take one week, six months, or more, depending on complexity. Maybe you will only have to organise a few spaces in new templates, but you might have to create a whole new structure and move thousands of pages to it. Whatever you need to do, here are a few things that will make your work easier moving forward.

Top tip: 

If you’re looking at reorganising your cleaned-up content into a completely new structure, it’s easier to use the same Confluence instance rather than moving everything to a brand-new instance. Creating new spaces in the same instance and copying content into them is easier than fiddling with export and import across two instances. Once you’ve copied everything, just delete your old structure.


If you’re a ScriptRunner user, take a look at the Copy Space built-in script. You can use it to copy a new template space you created in its entirety - so that once you’ve created a space with a new, better structure, you can copy it again and again.


Add metadata to spaces and organise them using Space Categories

Your reorganised spaces should have a short description that makes it clear to unfamiliar users what the space is about. Also, grouping your spaces with Space Categories will let you locate them much more easily in the future.


Add descriptions and categorise all your Confluence spaces

Use templates and blueprints for your spaces and pages

Organising content based on a pre-defined template will ensure you have no more pages and spaces that are lacking any context, and all the information needed for easy navigation is accessible. What’s more, having a consistent style will make it simple to find and digest content. For a deep dive into how to use templates and blueprints, have a look at this article on using Confluence templates and blueprints to better structure your instance.


Basic template for a project plan


Use a standardised label system

Create and document a labelling convention, then clearly communicate it to all users. All new pages added after the cleanup will be easy to find by filtering search results based on the label. You can read more about how to use a Confluence labels system here. 

Lack of consistency in Confluence page labelling creates chaos
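One way to make a labelling convention easy to follow is to write it down as a simple rule that can be checked. The sketch below assumes a purely hypothetical convention (lowercase words joined by hyphens, starting with an agreed prefix). The prefixes, the pattern and the function name are examples only, not a recommendation from Confluence or ScriptRunner.

```python
import re

# Hypothetical convention: "<prefix>-<topic>", lowercase, hyphen-separated.
ALLOWED_PREFIXES = {"team", "project", "status", "doc"}
LABEL_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)+")


def label_problems(label: str) -> list:
    """Return a list of problems; an empty list means the label fits the convention."""
    problems = []
    if not LABEL_PATTERN.fullmatch(label):
        problems.append("must be lowercase words joined by hyphens")
    else:
        prefix = label.split("-", 1)[0]
        if prefix not in ALLOWED_PREFIXES:
            problems.append(f"unknown prefix '{prefix}'")
    return problems


for candidate in ["status-needs-review", "Needs Review", "misc-stuff"]:
    print(candidate, label_problems(candidate))
```

Whatever rules you settle on, keeping them this explicit makes them easier to document and to explain to new users.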

Step 6: Set up processes and educate your users

You’ve spent 6 months creating a clean, lean, nice-looking Confluence machine. Congratulations! Now for the most important step, which weirdly gets forgotten 2 out of 3 times: establish processes and educate your users on how to follow them. Without this, 2 years down the line you might find yourself repeating the entire thing. In real life, this step should happen at the same time as Step 5, because these are all decisions you will need to take together with your users as you tidy up. Here are a few things to consider:

  • Decide which standard templates each department needs. Can you replicate templates across departments to achieve more standardisation? For example, every home page can have the same layout and required pieces of info.
  • Besides a standardised labelling system, you might also need standardised naming conventions of spaces and pages. 
  • To prevent clutter in the future, you might want your space admins to regularly review content for archiving and deleting.
  • Clearly document all the agreed-upon processes.
Most likely, these decisions will be taken together with representatives of your various teams and departments. But once everything is agreed, make sure information reaches all users. Does each team need a separate info session about the new setup, or would a company-wide communication be more efficient?

 

Top tip: 

Standardisation doesn’t need to stop at Confluence. Does each project team need its project space to correspond to a Jira project? With ScriptRunner, you can automatically generate a Jira project when a new Confluence space is created - or the other way around.

Automate your processes
Using ScriptRunner, you can make sure that a lot of the processes you have set up are automated. You can write your own bespoke automations and customisations for your own scenarios, but here are a few examples from our users: 

We hope our step-by-step approach is useful to you, and we’d love to hear more about your own strategies and automations for cleaning up Confluence. Get in touch to tell us all about it.

ScriptRunner for Confluence is an amazingly versatile tool that lets you customise, extend and automate Confluence as much as you need. Explore what you can achieve with a 30-day free trial.