Data Protection and Preservation (DPP)



Data needs to be protected and preserved. There are several reasons why this is desirable, even necessary, and several reasons why it isn't simple, including legislation concerning human rights.

A lot of the problems become clearer when you start thinking about meta data (data about data), particularly if you use a system which binds meta data to data. It then becomes possible to talk about data ownership, and about who is responsible for data.
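One way to picture "binding" meta data to data is to compute a single digest over both, so that changing either one changes the digest. The sketch below is only an illustration under that assumption; the `Record` class and its field names (`owner`, `created`, `content_type`) are hypothetical and not taken from any particular system.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Data plus the meta data that describes it (field names are illustrative)."""
    data: bytes
    owner: str          # who is responsible for the data
    created: str        # ISO 8601 timestamp supplied by the caller
    content_type: str   # how the bytes should be interpreted

    def digest(self) -> str:
        """SHA-256 over meta data and data together, so neither can change unnoticed."""
        meta = json.dumps(
            {"owner": self.owner, "created": self.created,
             "content_type": self.content_type},
            sort_keys=True,
        ).encode("utf-8")
        h = hashlib.sha256()
        # A length prefix keeps the boundary between meta data and data unambiguous.
        h.update(len(meta).to_bytes(8, "big"))
        h.update(meta)
        h.update(self.data)
        return h.hexdigest()


record = Record(b"hello", "alice", "2013-07-01T00:00:00Z", "text/plain")
reassigned = Record(b"hello", "mallory", "2013-07-01T00:00:00Z", "text/plain")

# Same data, different owner: the digests differ, so the change is detectable.
print(record.digest() != reassigned.digest())
```

A plain digest only detects change; it does not say who made the record. A real system would probably sign the digest or use a keyed hash, which is outside this sketch.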

Non-repudiation is an important issue, which applies both in practice, because of the difficulty of un-publishing data, and in legal situations, with regard to rights and responsibilities. It implies that data cannot be withdrawn or deleted while there remains any other data (or meta data) which refers to it. It also implies that data cannot be altered; all that can be done is to publish a new version of it.
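These two rules (no withdrawal while something refers to the data, and no alteration, only new versions) can be sketched as a small append-only store. This is a minimal illustration, not a description of any existing product; the class `PublishedStore` and its method names are invented for the example, and a new version is modelled simply as a new item that refers to the old one.

```python
import hashlib


class PublishedStore:
    """Append-only store: items are never altered, and cannot be
    withdrawn while any other item still refers to them."""

    def __init__(self):
        self._items = {}      # item id -> (data, ids this item refers to)
        self._referrers = {}  # item id -> set of ids that refer to it

    def publish(self, data, refers_to=()):
        refs = tuple(refers_to)
        missing = [r for r in refs if r not in self._items]
        if missing:
            raise KeyError(f"unknown references: {missing}")
        # The id is derived from the content, so altered content is a different item.
        h = hashlib.sha256()
        h.update(len(refs).to_bytes(4, "big"))
        for r in refs:
            h.update(bytes.fromhex(r))
        h.update(data)
        item_id = h.hexdigest()
        if item_id not in self._items:
            self._items[item_id] = (bytes(data), refs)
            self._referrers[item_id] = set()
            for r in refs:
                self._referrers[r].add(item_id)
        return item_id

    def read(self, item_id):
        return self._items[item_id][0]

    def withdraw(self, item_id):
        if self._referrers[item_id]:
            raise PermissionError("still referred to by other published data")
        _, refs = self._items.pop(item_id)
        del self._referrers[item_id]
        for r in refs:
            self._referrers[r].discard(item_id)


store = PublishedStore()
v1 = store.publish(b"first version")
v2 = store.publish(b"corrected version", refers_to=[v1])  # a new version, not an edit

try:
    store.withdraw(v1)
except PermissionError as err:
    print("cannot withdraw v1:", err)
```

Note that there is deliberately no update method: the only way to change what has been published is to publish something new.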

Privacy becomes important when you start thinking about personal data. Personal data is strongly connected with issues of identity and identification, and there are commercial and legal reasons for keeping some data private. Private data includes potentially everything an individual has experienced, and how this may be published to become shared or public data is very important.

Protecting Data

Data is useless unless you can gain access to it. Any secure system must balance this against protecting the data from being inspected or changed by unauthorised people. Any one system must be considered insecure if someone unauthorised can gain physical access to it. Multiple distributed systems must be used to remove any single point of attack.

Authorising someone implies the need for identification data, which must itself be protected from tampering or inspection. Meta data about the use of authorisation also needs protection. The authorisation process may need to be distributed.
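As an illustration of protecting meta data about the use of authorisation, the sketch below keeps an access log in which every entry includes a hash of the previous one, so that altering or removing an earlier entry breaks the chain. It is a sketch under stated assumptions: the function names (`append`, `verify`) and the event fields are hypothetical, and the approach only detects tampering if the most recent hash is also kept somewhere an attacker cannot rewrite, for example on another system.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, event):
    """Hash an event together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log, event):
    """Add an event to the log, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": entry_hash(prev_hash, event)})


def verify(log):
    """Return True only if every entry still matches its recorded hash and chain link."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {"user": "alice", "action": "read", "item": "report-1"})
append(log, {"user": "bob", "action": "publish", "item": "report-2"})
print(verify(log))

log[0]["event"]["user"] = "mallory"  # tamper with an earlier entry
print(verify(log))
```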

Preserving Data

It's not possible to guarantee that data is preserved, but you can be pretty sure data that's stored in a single location isn't safe. Distributing data greatly increases the protection. Having, for example, three copies, in different locations, makes an immense difference. But you also need to consider how reliable the archive media are, and their expected lifespan.
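The value of keeping three copies can be sketched as follows: write the same data to several locations, record a checksum as meta data, and periodically replace any missing or damaged copy from one that is still intact. This is a minimal sketch; the function names are invented for the example, and ordinary directories stand in for what would really be separate sites or separate media.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def replicate(data, name, locations):
    """Write one copy per location; return the checksum to keep as meta data."""
    for location in locations:
        location.mkdir(parents=True, exist_ok=True)
        (location / name).write_bytes(data)
    return checksum(data)


def verify_and_repair(name, expected, locations):
    """Replace missing or corrupt copies from an intact one; return how many were repaired."""
    good = None
    bad = []
    for location in locations:
        path = location / name
        if path.is_file() and checksum(path.read_bytes()) == expected:
            if good is None:
                good = path.read_bytes()
        else:
            bad.append(path)
    if good is None:
        raise RuntimeError("no intact copy survives")
    for path in bad:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(good)
    return len(bad)


with tempfile.TemporaryDirectory() as root:
    # Three directories stand in for three different locations.
    sites = [Path(root) / site for site in ("site-a", "site-b", "site-c")]
    expected = replicate(b"important data", "report.txt", sites)

    (sites[1] / "report.txt").write_bytes(b"corrupted")  # simulate damage at one site
    repaired = verify_and_repair("report.txt", expected, sites)
    restored = (sites[1] / "report.txt").read_bytes()

print(repaired, restored == b"important data")
```

With a single copy the same damage would be unrecoverable, which is the point of distributing the data. The sketch does not address media lifespan, which still has to be managed separately.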

If data needs to be kept over a long period then all the meta data which describes its structure must also be preserved, which can be a problem if proprietary data formats are used. The means to access the archival media must also be available.


DPP requires control of both data and meta data. The meta data must remain associated with the data, and both must be protected from alteration. There must be ways of keeping data and meta data private, and allowing individuals to control how this becomes more public.

Ownership is a critical piece of meta data, and must control the publication process. Once data is published it cannot be expected to be withdrawn; at best a new version can be published.

Distributing data to multiple locations is a critical part of DPP. Considering security and the limitations of hardware is also important.

(c) ROMsys Ltd, July 2013, permission given to use for non-profit making purposes
