Don’t leave your data to rot. How IT Managers should deal with old data

It’s tempting to leave business data dotted around employees’ computers and your network – and for it to stay there forever. IT Manager Michael Dear explains why this is a terrible idea.

One of the disadvantages of the internet is that we don’t use dead trees to store things. Paper’s sheer physicality gives it an extra level of security: accessing it is a pain, and to steal substantial amounts of documents takes time, money and lorries. Compare and contrast with electronic versions of the same documents, which are accessible by anyone anywhere on the planet sitting in front of a computer.

In my day-to-day professional life, one of my roles is to protect the company’s assets – and this includes data. I do this in several ways, but I start with policies. And policies give me lists. 

For example, I have an asset register policy. This states what should be considered an asset, with reasons such as cost, the role it plays and whether it holds organisational data. Our accountants love the asset register as it allows them to depreciate the asset value; I love it because it tells me what I am protecting.

The same applies to what software we use. A software register makes sure we stay on the right side of the licence conditions and lets me know what I should be managing and patching. IT should be checking this regularly, with some sort of asset tool that probably doubles up as mobile device management (MDM). 

Why check regularly? Because you can only protect what you know about. 

Where is the data register?

All this is wonderful, but what doesn’t really exist is a data register.

This leads to another data problem, which can be boiled down to this simple statement: “let’s keep everything”.

A regular bill for storing physical “data” off-site at least keeps people occasionally thinking about this; sadly, the electronic version doesn’t have such a dedicated bill. It’s wrapped up in backup and storage costs, be that on-premise or in the cloud, and is therefore hidden in the general IT budget.

Sure, I can produce a list of files that haven’t been looked at in a decade, but I then need someone to review these files. This may result in permanent deletion, so this isn’t a job for a cheap summer student: it needs someone experienced, possibly quite senior, to review the data. And if they’re doing that, it’s taking them away from their golf, sorry, real job of making money. The longer it’s put off, the longer the list gets.
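
As a rough sketch of how that list might be produced, the Python below walks a folder tree and reports files that haven't been modified in ten years. It makes some assumptions that are not from this article: it uses the last-modified time rather than the last-accessed time (access times are often not recorded reliably), and the function name `find_stale_files` and the ten-year default are purely illustrative.

```python
import os
import time
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # approximation, ignores leap days


def find_stale_files(root, years=10, now=None):
    """Return (path, last_modified) pairs for files under `root` that have
    not been modified in the last `years` years, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # unreadable or vanished file: skip it
            if mtime < cutoff:
                stale.append((str(path), mtime))
    stale.sort(key=lambda item: item[1])
    return stale
```

Calling `find_stale_files("/path/to/share")` returns the candidates for review; it deliberately deletes nothing, because the decision still belongs to a person.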

The reason this matters is twofold: one legislative (UK/EU) and the second criminal.

Two reasons to tackle data

To start with the legislation, we in the UK have the Data Protection Act 2018, which sits alongside the UK GDPR and is usually shortened to just “GDPR”. If you’re reading this in the EU then you have much the same rules under the EU GDPR, but if you’re reading this elsewhere in the world then you will have different laws with similar aims.

Sticking to the UK, our duty can be boiled down to this: the data must be timely and accurate. Accurate means it’s correct and not out of date, while timely means you aren’t supposed to store everything forever. How to keep on the right side of the law? Policies, of course!

Your organisation should have created a data retention policy. In short, this must set out how long all data is to be held. The idea is that the senior executive’s “real work” doesn’t have to be interrupted by reviewing decades of stale old files, as this policy will state how long each classification of data is held for. For example, six years for accounts, two years after a contract finishes, etc.
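
To show how such a policy could be turned into something a script can act on, here is a minimal Python sketch. The two retention periods come from the examples above; everything else (the classification names, the idea of a "trigger date" such as a year end or contract end, and the function names) is an assumption made for illustration, not a statement of what any law requires.

```python
from datetime import date

# Hypothetical retention schedule: classification -> years to keep
# after the trigger date (e.g. year end, or the date a contract finishes).
RETENTION_YEARS = {
    "accounts": 6,
    "contract": 2,
}


def add_years(d, years):
    """Return the same calendar date `years` later."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February with a non-leap target year: fall back to 28 February
        return d.replace(year=d.year + years, day=28)


def delete_after(classification, trigger_date):
    """Date after which data of this classification may be deleted."""
    return add_years(trigger_date, RETENTION_YEARS[classification])


def is_expired(classification, trigger_date, today=None):
    """True if the retention period has fully passed."""
    today = today or date.today()
    return today > delete_after(classification, trigger_date)
```

The point of the lookup table is that the decision is made once, in the policy, rather than file by file by a senior reviewer.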

But I also mentioned a second, criminal reason. If you’re hit with ransomware, and the added threat of data exfiltration, you will need to tell people and other organisations what you have lost. After the event is not the time to stop and think about this. As the Scouts say, “Be prepared”.

Where to store data

Having talked about what data is stored, we need to talk about the where.

It is a good idea to have a policy on this, as to protect data you need to know where it is. I have come across people who have stored files in the weirdest places, including the Recycle Bin, the Deleted folder in Outlook or simply in files scattered across their desktop. 

There should be something in a staff handbook stating where data can and should be stored. For example, saving files to the desktop isn’t allowed, but saving them to the file server is great. If you know where the data is you can back it up, and it also means that people can’t complain if they store it elsewhere and the data is lost.

You will note I have been talking about files, which we IT types call unstructured data, as opposed to databases, which are called structured data. The automatic deletion of data is easier in a structured format as the rules can be built into the same system that people put the data into, so it happens automatically and is scalable.
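
A small illustration of why structured data is easier: if every record carries its own expiry date, one query can enforce the rule. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `customer_records`, the `delete_after` column and the sample rows are all hypothetical.

```python
import sqlite3


def purge_expired(conn, today):
    """Delete rows whose retention date has passed; return how many went."""
    cur = conn.execute(
        "DELETE FROM customer_records WHERE delete_after < ?", (today,)
    )
    conn.commit()
    return cur.rowcount


# In-memory database with made-up rows, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_records ("
    "id INTEGER PRIMARY KEY, name TEXT, delete_after TEXT)"
)
conn.executemany(
    "INSERT INTO customer_records (name, delete_after) VALUES (?, ?)",
    [
        ("Old policy holder", "2019-12-31"),
        ("Current policy holder", "2031-06-30"),
    ],
)

# ISO dates (YYYY-MM-DD) sort correctly as plain text, so "<" works here.
removed = purge_expired(conn, "2025-01-01")
```

In a real system the same purge would typically run on a schedule, which is what makes the approach scale.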

Using software to handle old data

So, your policies are in place, your staff handbook up to date. The question now becomes, how to implement them.

There are software tools that can help. These are usually sold as data loss prevention (DLP) tools: they scan unstructured storage (the files in random places), open each file up and catalogue the contents. They are normally really good for hunting out personal data such as credit card numbers and passport details.

These tools then run reports that list documents containing dodgy data, and then give options such as: “Stop this file from leaving the organisation”. Or even, “Go in and replace this data with a note it has been removed”.

But these tools have a downside: they are only as good as the rules you can give the software. Sadly, I have not found a great tool for this.
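
To give a flavour of what such a rule looks like, here is a hedged Python sketch of one way to spot likely credit card numbers in text: a regular expression finds runs of 13 to 19 digits, and the Luhn checksum weeds out most random numbers. This is an illustration of the general technique, not how any particular DLP product works, and the function names are made up.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in `text` that look like valid card numbers."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Even this simple rule shows the limitation: it will miss numbers split across lines and will occasionally flag an order reference that happens to pass the checksum.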

Ideally, you would create a policy and platform that requires all users to classify data as they create it, and then put a retention label on that document, which you can do in tools such as SharePoint. However, people tend not to like doing this and it doesn’t deal with the historical pile of data you have.

How to deal with old data

So, what do I recommend for this massive problem of digital bit rot? I suggest having multiple massive encrypted hard disks that all old data is moved to. If it’s ever needed, it can be restored. And because those disks are kept offline, a ransomware attack can’t reach them, so that data can’t be lost. In short, you don’t have to worry about what is in those old files.

And yes, I’m aware that I’m suggesting you replace the offsite paper archive with a new offsite pile of hard disks – but trust me, this is a tried and tested technique.

Michael Dear

Michael has worked for more than 20 years running IT departments, mainly for small to medium insurance firms. His primary interest is focused on security and compliance.
