File masking: beyond database boundaries
DATPROF Runtime now masks files with the same power and flexibility as database masking, opening new possibilities for data protection.
Data protection shouldn’t be limited by where your data lives. With DATPROF Runtime 4.14, you can now anonymize sensitive information in files with the same enterprise-grade capabilities you rely on for database masking.
The result is consistency, speed, and flexibility for file-based data masking across any platform.
File masking that matches your database masking
DATPROF’s new file masking engine delivers the sophisticated masking capabilities of DATPROF Privacy, but for files. This means you can apply multiple functions to a single column, implement conditional masking based on complex logic, and reuse lookup tables and generators across both databases and files—ensuring consistent data protection everywhere.
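To make the idea of chained and conditional masking concrete, here is a minimal Python sketch. It is not DATPROF's API or template format: `ColumnRule`, `mask_row`, `keep_first_char`, and the `last_name` and `country` column names are all hypothetical names chosen for illustration. It only shows the general pattern of applying several functions to one column, and only when a condition on the row holds.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]
MaskFn = Callable[[str], str]


def keep_first_char(value: str) -> str:
    """Keep the first character and replace the rest with asterisks."""
    return value[:1] + "*" * (len(value) - 1)


def upper(value: str) -> str:
    """Convert the value to upper case."""
    return value.upper()


class ColumnRule:
    """Hypothetical rule: run `functions` in order on `column` when `condition` holds."""

    def __init__(self, column: str, functions: List[MaskFn],
                 condition: Optional[Callable[[Row], bool]] = None) -> None:
        self.column = column
        self.functions = functions
        self.condition = condition

    def apply(self, row: Row) -> Row:
        # Leave the row untouched when the condition is not met.
        if self.condition is not None and not self.condition(row):
            return row
        masked = dict(row)  # copy so the input row is not modified
        for fn in self.functions:
            masked[self.column] = fn(masked[self.column])
        return masked


def mask_row(row: Row, rules: List[ColumnRule]) -> Row:
    """Apply every rule to the row, in order."""
    for rule in rules:
        row = rule.apply(row)
    return row


# Two functions on one column, applied only to rows where country is "NL".
rules = [
    ColumnRule("last_name", [keep_first_char, upper],
               condition=lambda r: r["country"] == "NL"),
]

print(mask_row({"last_name": "jansen", "country": "NL"}, rules))
print(mask_row({"last_name": "smith", "country": "US"}, rules))
```

In this sketch the first row is masked and the second is returned unchanged, because its `country` value does not satisfy the condition.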
DATPROF works with:
- Parquet files: High-performance masking for large datasets with gigabytes of data and millions of rows, ideal for data lakes and analytics platforms.
- CSV and JSON Lines: Support for common data interchange formats used across applications.
- XML and HFL formats: Specialized support for hierarchical data structures, including EI standards used in Dutch invoicing systems.
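For readers who want a feel for what masking a row-oriented file involves, the following Python sketch streams a CSV file and a JSON Lines file through a set of per-field masking functions using only the standard library. It is an illustration under stated assumptions, not how DATPROF Runtime is implemented: `mask_csv`, `mask_jsonl`, `redact`, and the `name` field are hypothetical. Parquet, XML, and HFL are left out because they need format-specific parsers.

```python
import csv
import io
import json
from typing import Callable, Dict, TextIO

Maskers = Dict[str, Callable[[str], str]]


def mask_csv(src: TextIO, dst: TextIO, maskers: Maskers) -> None:
    """Stream a CSV file row by row, masking the configured columns."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames or [],
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for column, fn in maskers.items():
            if row.get(column) is not None:
                row[column] = fn(row[column])
        writer.writerow(row)


def mask_jsonl(src: TextIO, dst: TextIO, maskers: Maskers) -> None:
    """Stream a JSON Lines file, masking configured top-level string fields."""
    for line in src:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for field, fn in maskers.items():
            if isinstance(record.get(field), str):
                record[field] = fn(record[field])
        dst.write(json.dumps(record) + "\n")


def redact(value: str) -> str:
    """Placeholder masking function: replace every character with 'X'."""
    return "X" * len(value)


maskers = {"name": redact}

csv_out = io.StringIO()
mask_csv(io.StringIO("id,name\n1,Alice\n2,Bob\n"), csv_out, maskers)
print(csv_out.getvalue())

jsonl_out = io.StringIO()
mask_jsonl(io.StringIO('{"id": 1, "name": "Alice"}\n'), jsonl_out, maskers)
print(jsonl_out.getvalue())
```

Both functions process one record at a time, so memory use stays flat regardless of file size. The same `maskers` mapping is reused for both formats, which is the property that keeps results consistent across file types.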

Platform-independent data protection
Modern data ecosystems are complex. Data flows through hundreds of different platforms, cloud storage systems, and SaaS applications. Rather than building individual connectors for each source, DATPROF’s file masking operates independently of the source system, focusing purely on anonymizing data in files.
This approach fills a critical gap. Tools like Azure Data Factory can extract data from countless sources but lack masking capabilities. DATPROF’s file masking engine serves as your processing platform, enabling you to mask data consistently across various cloud, SaaS, SQL, and NoSQL data sources—all through a unified interface.
Consistency across your entire data landscape
When the same customer appears in your CRM, data warehouse, and analytics files, their masked identity should match everywhere.
The real challenge isn’t just masking data; it’s ensuring that the same person gets the same masked identity across every system.
DATPROF ensures this through deterministic generators and lookup files. A name or postal code gets masked identically every time, across all systems and runs, maintaining referential integrity while protecting privacy.
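One common way to get this kind of determinism is to derive the replacement from a keyed hash of the original value. The Python sketch below illustrates that general technique; it is an assumption about how such a generator could work, not a description of DATPROF's generators. `deterministic_pick`, `LOOKUP_NAMES`, and `SECRET_KEY` are hypothetical names.

```python
import hashlib
import hmac
from typing import Sequence


def deterministic_pick(value: str, lookup: Sequence[str], key: bytes) -> str:
    """Map `value` to an entry of `lookup`; the same value and key always give the same entry."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(lookup)
    return lookup[index]


# Hypothetical lookup list and key; in practice the key must be kept secret.
LOOKUP_NAMES = ["Alex", "Robin", "Sam", "Kim", "Noor"]
SECRET_KEY = b"example-key-change-me"

# The same input masked in two different "systems" yields the same output.
crm_name = deterministic_pick("Jan Jansen", LOOKUP_NAMES, SECRET_KEY)
warehouse_name = deterministic_pick("Jan Jansen", LOOKUP_NAMES, SECRET_KEY)
print(crm_name == warehouse_name)
```

Because the output depends only on the input value and the key, no mapping table has to be shared between runs. With a short lookup list, different inputs will sometimes map to the same replacement, so a realistic lookup file needs to be large enough for the data it masks.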
This consistency extends beyond just matching values. You also control whether statistical distributions are preserved: for example, keeping age distributions intact for birthdates used in analytics while fully randomizing names, where distribution doesn’t matter.
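As an illustration of distribution-preserving masking, the Python sketch below replaces a birthdate with a different day in the same year. Every masked date keeps its birth year, so the age distribution is unchanged at year granularity. This is one possible approach under stated assumptions, not DATPROF's documented behavior; `mask_birthdate` and the key are hypothetical.

```python
import calendar
import hashlib
import hmac
from datetime import date, timedelta


def mask_birthdate(birthdate: date, key: bytes) -> date:
    """Return a deterministic pseudo-random day within the same birth year."""
    digest = hmac.new(key, birthdate.isoformat().encode("ascii"),
                      hashlib.sha256).digest()
    days_in_year = 366 if calendar.isleap(birthdate.year) else 365
    offset = int.from_bytes(digest[:4], "big") % days_in_year
    # offset is in [0, days_in_year - 1], so the result stays in the same year.
    return date(birthdate.year, 1, 1) + timedelta(days=offset)


KEY = b"example-key-change-me"  # hypothetical key; keep the real one secret
original = date(1984, 7, 19)
masked = mask_birthdate(original, KEY)
print(masked.year == original.year)
```

The trade-off is explicit: the exact day and month are hidden, while the birth year, and with it any analysis by age band, survives masking.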
Getting started
File masking is available now in DATPROF Runtime 4.14. The integrated UI eliminates the complexity of command-line scripts and configuration files. Simply select your file type, define masking functions, and set conditions. The process feels familiar if you’ve built privacy templates, but now works across your file-based data sources.
Every masking run is fully auditable with complete logging of which functions were applied to which files and which template version was used. Files remain local and under your control, with the masking engine operating strictly as a processing tool—no file viewers, no unnecessary data access, just secure transformation with full transparency.
To get started, read our step-by-step documentation.
Frequently Asked Questions
1. What is file masking?
File masking is the process of anonymizing or transforming sensitive information stored in files, such as Parquet, CSV, JSON Lines, XML or HFL. With DATPROF Runtime, teams can apply masking rules to file-based data with the same level of control they use for database masking.
2. Which file formats does DATPROF support for file masking?
DATPROF supports file masking for Parquet, CSV, JSON Lines, XML and HFL files. This makes it possible to protect sensitive data in data lakes, analytics platforms, integrations and file-based application flows.
3. How is DATPROF file masking different from generic ETL or data processing tools?
Generic ETL and data movement tools are often strong at extracting, transforming and loading data, but they are usually not designed specifically for privacy-safe test data. DATPROF focuses on masking logic, deterministic generators, lookup files, conditional masking, auditability and consistency across systems.
4. Can DATPROF mask the same person consistently across files and databases?
Yes. DATPROF can use deterministic generators and lookup files so the same input value receives the same masked output across multiple files, databases and runs. This helps preserve referential integrity while protecting sensitive data.
5. Can file masking be used across cloud, SaaS, SQL and NoSQL data sources?
Yes. DATPROF file masking is platform-independent because it focuses on the files themselves. Data can come from cloud platforms, SaaS systems, SQL databases or NoSQL systems, and then be masked consistently through a unified masking process.
6. Does DATPROF keep files under our control?
Yes. The files remain local and under your control, with DATPROF Runtime operating as a processing engine. This helps reduce unnecessary data exposure while still providing logging and auditability.
