By: Benjamin Kennedy, Director, Epiq
In today’s technology-driven climate, the word “data” is used everywhere. It has become a buzzword of sorts, and it means many things to many people. As a society, we create, use, and store tons of data. Like individuals, organisations create an enormous amount of data every day. There are emails, text messages, instant messages, Word documents, social media posts, and IP addresses, just to name a few.
The increased prevalence of mandatory breach notifications, the General Data Protection Regulation (GDPR) and other data privacy regulations, and increased regulatory and legal dispute activity are forcing organisations to take more control over their data. This in turn has led to increased focus on, and investment in, information governance. Information governance is about understanding the data created within your organisation and where it is located. Knowledge of the data lifecycle within an organisation enables defensible data disposition, an increasingly important way for companies of all shapes and sizes to manage their data.
Defensible Data Disposition
Most organisations are drowning in the large volume of data they create. With data growth at an all-time high, and no signs of it slowing down, dealing with massive amounts of data inside your organisation can be overwhelming. Keeping unnecessary data not only generates higher storage costs, but can also have other severe consequences. For example, data that you could have defensibly deleted may later turn into a smoking gun against you in a case. Another risk is that if your organisation is ever the victim of a breach, more data than necessary will be exposed, which could lead to fines and reputational harm.
Keeping unnecessary data is both risky and expensive. Deleting data that has no regulatory, business, or legal purpose has a number of benefits, provided the deletion follows a reasonable and enforced retention policy and schedule and respects any duties to preserve documents.
Email, shared drives, and user shares get clogged with redundant content, personal multimedia files and aged data. Determining what information has ongoing business value and what can be deleted is a complex process. Because removing data without value is so difficult, most organisations end up hoarding and stockpiling this content and expanding their storage capacity year after year, eating away at restricted IT budgets.
Organisations can often recoup significant storage capacity by following a few simple policies. These suggestions may easily uncover tons of unnecessary data:
Redundant Content: Find duplicate files by computing a hash value of each document’s content, then keep one version and eliminate the duplicate copies. A hash value is like a digital fingerprint: two documents are considered duplicates only when they are byte-for-byte identical.
Aged Data: In the file metadata, look at the last accessed and last modified dates to find data that has not been touched in more than a specified number of years. The exact time period will vary with each business unit’s jurisdiction and the associated regulations. Outside counsel may assist with navigating the legal obligations to keep data. Searches for aged data that is not subject to preservation obligations and does not contain intellectual property will isolate documents that are candidates for deletion. If review confirms that this set of documents has aged and no longer has value, it can be purged.
Abandoned Content: IT systems may store information from users who are no longer with the organisation. Electronic documents in storage retain metadata recording the original author and the last author of the content. This authorship information may also help identify documents that are candidates for deletion.
Multimedia: Music and video require much more storage than written content such as email and business documentation. Searching for large multimedia files can identify users who store non-work-related content that has no purpose on the network.
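As a rough illustration of how the redundant content, aged data and multimedia checks above might be automated, the following Python sketch walks a folder and reports candidates for review. It is a minimal sketch under stated assumptions, not a description of any particular product: the function names, the seven-year age threshold, the 50 MB size threshold and the list of media extensions are all hypothetical, and nothing is deleted. It also does not cover abandoned content, since authorship metadata is stored differently by each document format.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path

# Illustrative list only; a real policy would define its own file types.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".mp4", ".mov", ".avi", ".mkv"}


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file's content, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root, max_age_years=7, large_media_bytes=50 * 1024 * 1024, now=None):
    """Walk `root` and report duplicate, aged and large multimedia files.

    The thresholds are assumptions for this sketch; the real values depend
    on the retention schedule and any preservation obligations.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_years * 365 * 24 * 3600
    by_hash = defaultdict(list)
    aged, large_media = [], []

    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Read the timestamps before hashing, because reading the file
        # can itself update the last-accessed time on some systems.
        info = path.stat()
        by_hash[file_sha256(path)].append(path)
        # Aged: neither accessed nor modified since the cutoff. Some
        # systems do not record access times reliably.
        if max(info.st_atime, info.st_mtime) < cutoff:
            aged.append(path)
        is_media = path.suffix.lower() in MEDIA_EXTENSIONS
        if is_media and info.st_size >= large_media_bytes:
            large_media.append(path)

    # Only hashes shared by two or more files indicate duplicates.
    duplicates = {h: p for h, p in by_hash.items() if len(p) > 1}
    return {"duplicates": duplicates, "aged": aged, "large_media": large_media}
```

Calling `scan("/path/to/share")` returns three lists of candidates. These would still need review against preservation obligations before anything is purged.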
Classification
After deletion, the next step to controlling your data is to organise that data. In order to properly organise data there must be an understanding of what each document is and why it was created, also known as classification. This process can require the data owner to manually tag a document with its purpose and determine if it should be kept or deleted. Manually classifying information can be a challenge as it is extremely labour-intensive. However, today there are tools that can auto-classify data. Auto-classification is a process that eliminates the need for humans to create and/or edit individual document metadata. The manual process is replaced with an automated solution, which analyses the text of the document and applies rules to systematise the classification process.
Auto-classification can accomplish the following:
Once a document is properly classified, a business decision can be made to either archive the document or have it deleted.
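To make the idea of analysing a document’s text and applying rules more concrete, here is a minimal Python sketch of a rule-based classifier. It is an illustration only: the `Rule` structure, the category labels, the keyword patterns and the archive or delete actions are hypothetical, and commercial auto-classification tools may work quite differently.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    label: str    # hypothetical category name
    pattern: str  # regular expression matched against the document text
    action: str   # what to do with a matching document


# Hypothetical rules for illustration; real rules would be derived from
# the organisation's own retention schedule.
RULES = [
    Rule("contract", r"\b(agreement|contract|hereinafter)\b", "archive"),
    Rule("invoice", r"\b(invoice|purchase order|amount due)\b", "archive"),
    Rule("newsletter", r"\b(unsubscribe|newsletter)\b", "delete"),
]


def classify(text, rules=RULES):
    """Return (label, action) for the first rule whose pattern matches.

    Documents that match no rule are flagged for manual review rather
    than being archived or deleted automatically.
    """
    for rule in rules:
        if re.search(rule.pattern, text, flags=re.IGNORECASE):
            return rule.label, rule.action
    return "unclassified", "review"
```

For example, `classify("Invoice 1001: amount due in 30 days")` returns `("invoice", "archive")`. Rules are checked in order, so a document that matches several patterns takes the label of the first one. That is a simplification chosen to keep the sketch short.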
Together, defensible disposition and auto-classification policies are a highly effective way to control your organisation’s data and maximise its value. First, unnecessary data is eliminated, freeing up storage. Next, classification allows for better organisation and management of the data that needs to be kept for business, regulatory, or legal reasons. When organisations know where their data is located and have minimised the amount of data, they can take proactive steps to protect sensitive data. With data breaches continuing to make headlines, some with significant financial consequences, efforts to protect a company’s information assets can help to reduce negative public exposure and associated monetary losses.
About the author: Benjamin Kennedy brings fifteen years' experience consulting within government and private sectors on litigation support. His technical skills, passion for new technology and legal knowledge are valued by clients when developing cost-effective solutions for information review exercises of any size. Kennedy's primary focus is overseeing eDiscovery in Australia and New Zealand. He and his team assist in a variety of matters, including construction and commercial disputes, class actions, and forensic investigations that involve all manner of information sources. Kennedy regularly provides educational seminars for lawyers on eDiscovery, and speaks on advancements in the field.