Now possible: deterministic masking for unique values across databases

Bert Nienhuis

For organizations managing sensitive data across multiple systems, one of the most demanding requirements is ensuring that unique identifiers, such as Social Security Numbers, IBANs, and credit card numbers, are masked consistently, regardless of where or when they are processed.

DATPROF Privacy 4.23 introduces a powerful new capability that makes this requirement significantly easier to meet.

The challenge: deterministic masking at scale

Replacing unique values consistently across a chain of databases is one of the most complex problems in data masking. The core requirement is straightforward: the same original value must always produce the same masked value, no matter which database is being processed or when the masking run takes place.

This challenge intensifies in environments where datasets partially overlap, evolve over time, or are masked in separate runs. A single inconsistency, such as one Social Security Number masked differently in two systems, can break referential integrity, create compliance gaps, and undermine the reliability of your test data.

How deterministic masking worked before

DATPROF has supported deterministic masking for seed file-based generators for some time. This approach works well for values such as names, companies, streets, and products, where the same input consistently maps to the same output from a predefined list.

However, seed file-based deterministic masking is not suitable for unique identifiers. Because multiple original values can map to the same masked value (both “John” and “Peter” might become “Evan”), it cannot guarantee uniqueness. For Social Security Numbers or IBANs, that is simply not acceptable.
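To see why a fixed list cannot guarantee uniqueness, consider the minimal sketch below. It is not DATPROF's implementation: the hashing scheme, the `seed_mask` function, and the seed list are assumptions chosen only to illustrate the general idea of mapping inputs deterministically onto a predefined list.

```python
import hashlib

# Hypothetical seed file contents; a real seed file would be much longer.
SEED_NAMES = ["Evan", "Maria", "Noor"]


def seed_mask(original: str, seed_values: list[str]) -> str:
    """Pick a seed value deterministically by hashing the original value."""
    digest = hashlib.sha256(original.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(seed_values)
    return seed_values[index]


originals = ["John", "Peter", "Anna", "Luca", "Sara"]
masked = {name: seed_mask(name, SEED_NAMES) for name in originals}

# Same input, same output: the mapping is deterministic.
assert seed_mask("John", SEED_NAMES) == masked["John"]

# Five inputs but only three possible outputs, so at least two inputs
# must share a masked value. Fine for first names, not for identifiers.
print(len(set(masked.values())), "distinct masked values for", len(originals), "inputs")
```

The same collision can occur with any list-based mapping once the number of distinct inputs exceeds the number of list entries, which is why this style of masking suits names and streets but not Social Security Numbers or IBANs.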

The workaround many clients used involved conditional masking with a Value Lookup function against a dictionary table, combined with a fallback function for new values. The fallback results then had to be written back into the dictionary table and distributed across all relevant databases. While effective, this approach required significant manual effort and added considerable complexity to masking projects.
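The workaround can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Value Lookup function: the dictionary tables are modeled as plain Python dicts, and `run_masking`, `fallback`, and the sample values are hypothetical names and data.

```python
from itertools import count


def run_masking(values, dictionary, fallback):
    """Mask values via dictionary lookup; collect fallback results separately."""
    new_entries = {}
    masked = []
    for value in values:
        if value in dictionary:
            masked.append(dictionary[value])      # known value: reuse its mapping
        else:
            if value not in new_entries:
                new_entries[value] = fallback()   # new value: generate a mask
            masked.append(new_entries[value])
    return masked, new_entries


# Each database holds its own copy of the dictionary table.
dict_db_a = {"111-11-1111": "900-00-0001"}
dict_db_b = dict(dict_db_a)

_sequence = count(2)


def fallback():
    """Hypothetical fallback generator that hands out sequential values."""
    return f"900-00-{next(_sequence):04d}"


masked_a, new_entries = run_masking(["111-11-1111", "222-22-2222"], dict_db_a, fallback)

# The manual steps: write fallback results back and distribute them everywhere.
for dictionary_copy in (dict_db_a, dict_db_b):
    dictionary_copy.update(new_entries)

masked_b, new_b = run_masking(["222-22-2222"], dict_db_b, fallback)
print(masked_a, masked_b)
```

Consistency in the second run depends entirely on the write-back and distribution loop having been executed first. Skipping it for one database would leave that copy of the dictionary stale, which is the kind of manual bookkeeping the paragraph above describes.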

A simpler approach: the global translation database

DATPROF Privacy 4.23 removes this complexity with the introduction of the Global Translation Database. Within a Privacy project, users can now designate a central database to serve as the translation store for deterministic masking. Crucially, this database can be a different technology than the databases being masked, giving organizations maximum flexibility.

Deterministic mode is now available for any generator, not just those based on seed files. When configuring a generator in deterministic mode, users simply specify the schema and name of the translation table. DATPROF Privacy automatically creates this table in the Global Translation Database if it does not yet exist, or reuses it if it is already in place.

The platform handles the rest: no manual scripts, no table distribution, and no complex conditional masking logic required.
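Conceptually, a central translation store behaves like a "look up, or create and remember" table. The sketch below illustrates that idea with an in-memory SQLite database standing in for the Global Translation Database. The table layout, the column names `original` and `masked`, and the functions `ensure_translation_table` and `translate` are assumptions made for this example and do not describe DATPROF Privacy's actual schema or internals.

```python
import random
import sqlite3


def ensure_translation_table(conn, table):
    """Create the translation table if it does not exist; otherwise reuse it."""
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" ('
        "original TEXT PRIMARY KEY, masked TEXT NOT NULL UNIQUE)"
    )
    conn.commit()


def translate(conn, table, original, generate):
    """Return the stored mask for `original`, creating one on first sight."""
    ensure_translation_table(conn, table)
    row = conn.execute(
        f'SELECT masked FROM "{table}" WHERE original = ?', (original,)
    ).fetchone()
    if row is not None:
        return row[0]
    while True:
        candidate = generate()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    f'INSERT INTO "{table}" (original, masked) VALUES (?, ?)',
                    (original, candidate),
                )
            return candidate
        except sqlite3.IntegrityError:
            # In this single-connection sketch, the only possible conflict is
            # a masked value that is already taken, so draw another one.
            continue


central = sqlite3.connect(":memory:")  # stand-in for the central database
rng = random.Random(0)


def nine_digits():
    """Hypothetical generator producing a random nine-digit string."""
    return f"{rng.randrange(10**9):09d}"


# Two separate masking runs over partially overlapping datasets.
run_1 = {v: translate(central, "ssn_translation", v, nine_digits)
         for v in ["123-45-6789", "987-65-4321"]}
run_2 = {v: translate(central, "ssn_translation", v, nine_digits)
         for v in ["987-65-4321", "555-00-1111"]}

print(run_1["987-65-4321"] == run_2["987-65-4321"])
```

In this sketch, the `UNIQUE` constraint on the masked column is what keeps two different originals from sharing a mask, while the primary key on the original column keeps each original tied to a single mask across runs.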

What this means for your organization

The Global Translation Database directly addresses some of the most common operational pain points in enterprise data masking:

  • Consistent masking of unique identifiers across all systems, every time.

  • Support for partially overlapping datasets without risking value conflicts.

  • Reliable handling of datasets that evolve over time, including values added in later masking runs.

  • Reduced dependency on custom scripts and manual dictionary management.

  • Simpler, more maintainable masking projects across heterogeneous database environments.

Deterministic masking redefined

With this release, deterministic masking is no longer limited to a specific category of generators or constrained by the need for manual infrastructure. DATPROF Privacy 4.23 makes it possible to mask unique values consistently and reliably across complex, multi-technology data landscapes, without the overhead that previously came with it.

The result is a masking process that is easier to manage, safer to operate, and more scalable as your data environment grows.

Want to learn more?

If your organization deals with unique identifiers spread across multiple databases or masking environments, the Global Translation Database in DATPROF Privacy 4.23 was built for you. Get in touch with our team to see how it fits your specific data landscape and masking requirements.

Frequently asked questions

What is deterministic masking, and why does it matter for unique values?

Deterministic masking ensures that the same original value always produces the same masked value, every time it is processed. For unique identifiers like Social Security Numbers or IBANs, this is critical: if the same number appears in multiple databases, it must be masked identically in all of them to preserve referential integrity and ensure consistent, compliant test data.

What types of data benefit most from this feature?

Any data that is unique and appears across multiple systems. Common examples include Social Security Numbers, IBANs, credit card numbers, customer IDs, order numbers, and other unique identifiers that need to remain consistent across databases. 

What is the Global Translation Database, and how does it work?

The Global Translation Database is a central database you designate within your DATPROF Privacy project to store translation tables. When a unique value is masked for the first time, the mapping between the original and masked value is stored there. The next time that value is encountered, in any database, in any masking run, DATPROF looks it up and applies the same masked value automatically. 

Does the Global Translation Database need to be the same technology as the database I am masking?

No. The Global Translation Database can be a different database technology than the databases being masked. This gives organizations the flexibility to use a central, dedicated database regardless of what technologies their source systems run on. 
