
DATPROF now supports Databricks

Written by Bert Nienhuis | May 20, 2026 8:51:02 AM

Test teams can now mask sensitive data directly inside Databricks Delta Tables: no exports, no workarounds, no waiting on engineering teams.

As more organizations adopt Lakehouse architectures, test data increasingly lives in Databricks: distributed across Delta Tables, layered through bronze, silver, and gold zones, with extensive version history. Until now, masking that data meant pulling it out of Databricks entirely.

DATPROF's new Databricks support changes that. You can now mask and generate test data directly where it lives, making secure testing in modern data platforms finally practical.

What you can do when you connect DATPROF to Databricks

When you connect DATPROF to Databricks, you get:

  • In-place masking on Delta Tables without moving data
  • Synthetic data generation directly inside your lakehouse
  • Conditional masking using SQL filters for precise control
  • Automatic Delta Table version handling that prevents access to unmasked historical data

This brings test data provisioning directly into the Databricks environment, eliminating the complex dependency chains that previously slowed down testing cycles.
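Conditional masking with a SQL filter, as listed above, comes down to applying a masking rule only to rows that match a predicate. The sketch below illustrates the idea using Python's built-in sqlite3 module as a stand-in for a Delta Table. The table, the column names, and the `mask_where` helper are hypothetical and are not part of DATPROF or Databricks.

```python
import sqlite3


def mask_where(conn, table, column, replacement, where_sql, params=()):
    """Overwrite `column` with `replacement` only in rows matching `where_sql`."""
    # Table and column names cannot be bound as SQL parameters, so this sketch
    # assumes they come from trusted configuration rather than user input.
    sql = f"UPDATE {table} SET {column} = ? WHERE {where_sql}"
    cursor = conn.execute(sql, (replacement, *params))
    conn.commit()
    return cursor.rowcount  # number of rows that were masked


# In-memory SQLite table standing in for a Delta Table (illustrative data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        (1, "anna@example.com", "NL"),
        (2, "bob@example.com", "US"),
        (3, "cees@example.com", "NL"),
    ],
)

# Mask only the Dutch customers; every other row is left untouched.
masked_rows = mask_where(
    conn, "customers", "email", "masked@example.invalid", "country = ?", ("NL",)
)
emails = [row[0] for row in conn.execute("SELECT email FROM customers ORDER BY id")]
```

Here the filter `country = ?` limits the masking to two of the three rows. On Databricks the same pattern would be a filtered update against the Delta Table itself.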

Why does this matter?

For organizations running test environments on Databricks, this integration fundamentally changes how test data provisioning works. Instead of treating the lakehouse as a black box that requires engineering intervention for every test cycle, test teams can now work directly with production-like data in a secure, compliant way. The result: faster test cycles, reduced compliance risk, and fewer dependencies blocking your testing pipeline.

1. Test data without the wait

Previously, getting masked test data from Databricks involved multiple teams and days of waiting: request an export, wait for engineering capacity, get approvals, build pipelines. Masking now runs in place, cutting what used to take days down to minutes.

2. Delta history is finally secure

Delta Tables keep historical versions of their data. That is useful for time travel, but risky if old, unmasked values remain accessible. DATPROF's automated version handling ensures that no historical version contains unmasked sensitive data, eliminating a major compliance blind spot.
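To see why this matters: masking the current version of a Delta Table does not by itself remove the older data files, so time travel can still read pre-masking values until those files are cleaned up. How DATPROF handles this internally is not described in this post. The sketch below only shows the kind of Databricks SQL a manual cleanup might involve. The helper `history_cleanup_statements` and the table name `main.test.customers` are hypothetical, and the code only builds the statements; it does not run them.

```python
from typing import List

# Delta's default safety threshold for VACUUM is 7 days (168 hours).
DEFAULT_RETENTION_HOURS = 168


def history_cleanup_statements(table: str, retain_hours: int = 0) -> List[str]:
    """Build the SQL a manual cleanup of old Delta versions might use.

    Assumption: the masking update has already been committed, so the
    current table version holds only masked values.
    """
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    statements = []
    if retain_hours < DEFAULT_RETENTION_HOURS:
        # Retention below the default threshold is rejected unless this
        # safety check is switched off for the session.
        statements.append(
            "SET spark.databricks.delta.retentionDurationCheck.enabled = false"
        )
    # Remove data files no longer referenced by the current table version.
    statements.append(f"VACUUM {table} RETAIN {retain_hours} HOURS")
    # List the remaining commit history so the result can be reviewed.
    statements.append(f"DESCRIBE HISTORY {table}")
    return statements


statements = history_cleanup_statements("main.test.customers")
```

Vacuuming with zero retention is destructive and can interfere with concurrent readers and writers, so it belongs in test environments only. Tables that use deletion vectors may need an additional rewrite step, which this sketch ignores.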

3. Consistency across your entire stack

In integration testing, a customer who is masked differently in Oracle, SQL Server, and Databricks can no longer be matched across systems, so joins, lookups, and end-to-end scenarios fail. DATPROF's deterministic masking ensures values stay consistent across all platforms, making cross-system tests reliable and maintainable.
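Deterministic masking means the same input always produces the same masked output, regardless of which system holds the value. One common way to get that property is a keyed hash, sketched below. This illustrates the general idea and is not DATPROF's algorithm; the function name `mask_customer_id`, the `CUST-` prefix, and the example key are assumptions made for the example.

```python
import hashlib
import hmac


def mask_customer_id(value: str, secret: bytes) -> str:
    """Map a real identifier to a stable pseudonym using HMAC-SHA256."""
    # Normalize first so trivially different spellings in different
    # systems ("Alice " vs "alice") still map to the same pseudonym.
    normalized = value.strip().lower()
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # Keep a short prefix of the digest as the masked value.
    return "CUST-" + digest[:12]


# The same key must be used for every platform being masked.
# (Hypothetical key; in practice it would come from a secret store.)
key = b"example-masking-key"

masked_in_oracle = mask_customer_id("Alice", key)
masked_in_databricks = mask_customer_id(" alice ", key)
masked_other_customer = mask_customer_id("Bob", key)
```

Because the pseudonym depends only on the normalized value and the key, `masked_in_oracle` and `masked_in_databricks` are identical, so a join on the masked column still lines up across systems, while a different customer gets a different pseudonym.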

4. Work more independently

Test teams no longer need data engineers to create copies, views, or interim datasets for every test cycle. By masking directly in Databricks, test data specialists can work autonomously, reducing bottlenecks and speeding up delivery.

5. New testing scenarios

This capability unlocks workflows that were previously impractical:

  • Mask data as it flows through bronze → silver → gold pipelines
  • Provide safe test data for ML/AI pipelines in the lakehouse
  • Include masking in Databricks Jobs or CI/CD workflows
  • Generate synthetic data in the same environment where models train
  • Use Delta time travel without privacy concerns

Getting Started

To start masking data in Databricks with DATPROF:

  • Connect DATPROF to your Databricks workspace using your workspace URL and access token
  • Select your Delta Tables from the DATPROF interface
  • Configure your masking rules using DATPROF's built-in functions or custom logic
  • Execute masking directly on your Delta Tables: no exports, no intermediate storage

DATPROF handles Delta Table versioning automatically, ensuring compliance across your entire data history.

For customers: find detailed setup instructions and best practices in our documentation, or contact your DATPROF solutions engineer.

Not yet a customer? Schedule a product demonstration with one of our TDM experts.