Anonymisation

Can Data Truly Be Made Private?
Data is often called the new gold because it fuels everything from business decisions to medical research. The more data we have, the better we can predict trends, detect health outbreaks, and improve services. But there’s a problem — personal data is sensitive, and strict privacy laws (like GDPR and HIPAA) prevent companies from freely sharing it.
This creates a paradox: we need more data to solve problems, but we also need to protect people’s privacy.
One common approach to this challenge is anonymisation — hiding or altering personal details in a dataset so that individuals cannot be identified. But how effective is it really?

What is Anonymisation?

At its core, anonymisation means modifying data so that no one can trace it back to a specific person. The most common way to do this is by removing or replacing personally identifiable information (PII) like names, Social Security numbers, and phone numbers.


For example, if Jane Doe’s medical records were being anonymised, her name might be replaced with a random ID like "X538iDT12". If a hacker were to access this dataset, they wouldn’t see any names—just numbers.
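This kind of replacement can be sketched in a few lines of Python. A minimal, illustrative example, with invented record fields and the token format chosen arbitrarily:

```python
import secrets

# Toy records -- the names and fields here are invented for illustration.
records = [
    {"name": "Jane Doe", "diagnosis": "asthma"},
    {"name": "John Smith", "diagnosis": "diabetes"},
]

def pseudonymise(rows):
    mapping = {}  # name -> token; kept separately and secured
    out = []
    for row in rows:
        token = mapping.setdefault(row["name"], "X" + secrets.token_hex(4))
        out.append({"id": token, "diagnosis": row["diagnosis"]})
    return out, mapping

anon, key = pseudonymise(records)
print(anon)  # names are gone; only opaque IDs remain
```

Note that this is strictly *pseudonymisation*: whoever holds the `mapping` table can still reverse the IDs, which is exactly why the rest of this article matters.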

Seems safe, right? Not necessarily.

The Problem with Anonymisation

Many people assume that if you remove names and ID numbers from a dataset, privacy is protected. But the reality is more complicated. Even if personal identifiers are stripped away, there are still other details (like age, location, job title) that could be pieced together to re-identify someone.

Imagine a dataset that contains:

  • Age: 37

  • City: San Francisco

  • Occupation: Data Scientist

  • Marital Status: Married

On its own, each of these details reveals little, but together they may point to a specific person — especially if the attacker already knows something about Jane (like where she works).


This type of privacy risk is called a linkage attack — when someone cross-references a supposedly "anonymous" dataset with other available information (like social media or public records) to identify people.


Example: In 1997, researcher Latanya Sweeney re-identified individuals in an "anonymised" medical database — including the Governor of Massachusetts — by cross-referencing it with public voter registration records. Since then, many high-profile privacy breaches have proven that true anonymity is much harder than it seems.
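A linkage attack needs nothing more sophisticated than a join on the shared attributes. The toy sketch below, with invented data, follows the example above: an attacker matches an "anonymised" dataset against a public profile using age, city, and occupation.

```python
# All data here is invented for illustration.
anonymised = [
    {"id": "X538iDT12", "age": 37, "city": "San Francisco",
     "occupation": "Data Scientist", "diagnosis": "asthma"},
    {"id": "Q91kPa70", "age": 52, "city": "Boston",
     "occupation": "Teacher", "diagnosis": "diabetes"},
]

# What the attacker already knows, e.g. from a social media profile.
public_record = {"name": "Jane Doe", "age": 37, "city": "San Francisco",
                 "occupation": "Data Scientist"}

quasi_identifiers = ("age", "city", "occupation")
matches = [r for r in anonymised
           if all(r[k] == public_record[k] for k in quasi_identifiers)]

if len(matches) == 1:  # a unique match re-identifies the record
    print(public_record["name"], "->", matches[0]["diagnosis"])
```

No names were ever in the dataset, yet the attacker now knows Jane's diagnosis. The attack succeeds whenever the combination of quasi-identifiers is unique.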

Attempts to Improve Anonymisation

To make anonymised data safer, researchers have developed techniques to reduce the risk of re-identification. These include:

  • Grouping ages into ranges (e.g., "20–29" instead of "25")

  • Coarsening locations (e.g., listing only a person’s country instead of their city)

  • Ensuring each individual is indistinguishable from at least k−1 others on these identifying attributes (a property called k-anonymity)
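These safeguards can be combined into a simple check. A sketch, assuming illustrative bucketing rules and toy data: ages are generalised into decades, cities are dropped in favour of countries, and a dataset passes only if every resulting group contains at least k people.

```python
from collections import Counter

# Toy data, invented for illustration.
rows = [
    {"age": 25, "country": "Ireland", "city": "Dublin"},
    {"age": 28, "country": "Ireland", "city": "Cork"},
    {"age": 37, "country": "USA", "city": "San Francisco"},
]

def generalise(row):
    # Bucket age into a decade range and drop the city entirely.
    low = (row["age"] // 10) * 10
    return (f"{low}-{low + 9}", row["country"])

def is_k_anonymous(rows, k):
    # Every combination of generalised attributes must cover >= k people.
    counts = Counter(generalise(r) for r in rows)
    return all(c >= k for c in counts.values())

print(is_k_anonymous(rows, 2))  # the 37-year-old is still unique -> False
```

Even after generalisation, the third record forms a group of one, so the dataset is not 2-anonymous — a small illustration of why these techniques reduce risk rather than eliminate it.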

But even with these safeguards, there are no guarantees. If attackers have access to other data sources (which is almost always the case today), they can still link different datasets together to uncover private information.

Can Anonymisation Ever Be "Good Enough"?

In some cases, anonymisation works well enough for general data sharing — especially for non-sensitive data like survey results or website analytics. However, when it comes to highly sensitive data (like health records or financial transactions), anonymisation alone is not enough.


Many experts believe that stronger privacy methods — such as differential privacy and secure computation techniques — are needed to truly protect individuals while still allowing organisations to gain insights from data.
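Differential privacy, for instance, takes a different approach: rather than trying to scrub the data, it adds carefully calibrated random noise to query results, so that any single person's presence or absence has only a bounded effect on what is published. A minimal sketch of the Laplace mechanism for a count query, with an illustrative epsilon:

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of a Laplace(0, scale) random variable.
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def noisy_count(true_count, epsilon):
    # For a count, adding or removing one person changes the answer
    # by at most 1, so the sensitivity is 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon)

# Smaller epsilon -> more noise -> stronger privacy, less accuracy.
print(noisy_count(1000, epsilon=0.5))  # a randomised answer near 1000
```

Unlike anonymisation, the guarantee here is mathematical: it holds no matter what other datasets an attacker can link against.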

Let's Talk

Have any extra questions or need a demo? Drop us a message and let's discuss.

Or drop a message to

hello@oblivious.com
