Tags: PostgreSQL, Dokploy, Greenmask, infrastructure, security

A Safe Masked UAT Refresh Pipeline with Dokploy and Greenmask

How I set up a repeatable PostgreSQL UAT refresh flow that reads production, masks sensitive data with Greenmask, and restores only into a disposable UAT database.

By Criston Mascarenhas, Senior Software Engineer

UAT is useful only when it behaves enough like production to catch real problems. The awkward part is that production-like data is exactly the thing you should not casually copy into another environment.

I wanted a refresh flow with a very boring contract:

Production Postgres -> masked dump -> UAT Postgres

Production is the source. UAT is the target. Production should never be restored into, dropped, modified, or masked in place.

This post walks through the setup I use with Dokploy, PostgreSQL, and Greenmask. It is intentionally practical. The main goal is not clever masking. The main goal is a repeatable pipeline where the dangerous parts are obvious, isolated, and hard to point at the wrong database.

The Mental Model

The whole pipeline has three moving pieces:

| Piece | Role | What Greenmask does |
| --- | --- | --- |
| Production database | Live application data | Reads from it during dump |
| Greenmask runner | Transformation boundary | Creates the masked dump |
| UAT database | Disposable test copy | Receives data during restore |

The rule I keep coming back to is this:

dump.pg_dump_options.dbname       -> production
restore.pg_restore_options.dbname -> UAT

That one distinction matters more than any individual masking rule. The dump side can read production. The restore side is destructive and must only point at UAT.

Create a Real UAT Database

Start with a separate PostgreSQL service in Dokploy. Not a different schema in the same database. Not a reused database with a different application URL. A separate service with separate credentials.

For example:

Production DB service: app-prod-postgres
UAT DB service:        app-uat-postgres

A good baseline looks like this:

Production:
  Host: app-prod-postgres
  Database: app_prod
  User: prod_readonly_user
  Permission: SELECT/read-only
 
UAT:
  Host: app-uat-postgres
  Database: app_uat
  User: uat_restore_user
  Permission: owner or restore-capable

The production user should be read-only if your schema allows it. Greenmask does not need to insert, update, delete, truncate, drop, alter, or create anything in production during the dump. That permission boundary is the first real safety rail.

If someone later runs the wrong command, the production credentials should still be too weak to destroy anything.
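
A minimal sketch of what that read-only boundary might look like, assuming a single public schema and the role and database names from the baseline above. The helper name readonly_grants_sql is made up for this post. It only prints SQL, so the statements can be reviewed before they are piped into psql as an admin user. Setting the password is left out on purpose, and ADMIN_DATABASE_URL in the usage comment is a placeholder.

```sh
# Hypothetical helper: prints the grants for a read-only dump role.
# Assumes one schema ("public"); role and database names are arguments.
readonly_grants_sql() {
  role=$1
  db=$2
  cat <<SQL
CREATE ROLE ${role} LOGIN;
GRANT CONNECT ON DATABASE ${db} TO ${role};
GRANT USAGE ON SCHEMA public TO ${role};
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ${role};
GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO ${role};
-- Covers tables created later by the role that runs this statement.
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO ${role};
SQL
}

# Review the output first, then apply it as a database admin, for example:
#   readonly_grants_sql prod_readonly_user app_prod | psql "$ADMIN_DATABASE_URL"
```

Nothing in that list grants a write privilege, which is the point: the dump can read everything it needs and nothing more.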

Add a Greenmask Runner Container

Dokploy scheduled jobs run commands inside an existing container, so I keep a small Greenmask runner alive and target that container for manual and scheduled refreshes.

Here is the service:

services:
  greenmask-runner:
    image: greenmask/greenmask:latest
    container_name: app-greenmask-runner
    restart: unless-stopped
    user: "0:0"
    env_file:
      - .env
    volumes:
      - ./greenmask-uat.yml:/config/greenmask.yml:ro
      - greenmask-dumps:/dumps
    entrypoint: ["/bin/sh", "-c"]
    command: ["mkdir -p /tmp/greenmask /dumps && sleep infinity"]
 
volumes:
  greenmask-dumps:

There are a few small things in there that save a lot of time.

The container runs sleep infinity because Dokploy needs a stable running container to execute scheduled jobs in. The command also creates /tmp/greenmask and /dumps, which avoids failures like this during pg_dump:

pg_dump: error: could not create directory "/tmp/greenmask/<id>": No such file or directory

The entrypoint override is there because the Greenmask image uses the greenmask binary as its default entrypoint. Without the override, Docker passes the command to that binary as arguments, so sh is parsed as a Greenmask subcommand:

unknown command "sh" for "greenmask"

I use user: "0:0" for the first version because Dokploy volume ownership can otherwise block writes to /dumps:

mkdir /dumps/<dump-id>: permission denied

For a stricter setup, create the dump directory on the host and assign it to the UID/GID used by the Greenmask image. For the first pipeline test, I prefer getting the isolated utility container working first, then tightening filesystem permissions once the flow is proven.

Wire Greenmask in One Direction

Put the Greenmask config beside your Compose file. I usually call it:

greenmask-uat.yml

The important sections are storage, dump, restore, and transformation.

storage:
  type: "directory"
  directory:
    path: "/dumps"
 
dump:
  pg_dump_options:
    # Production source database.
    # Greenmask reads from this database during dump.
    dbname: "postgresql://prod_readonly_user:${PROD_DB_PASSWORD}@app-prod-postgres:5432/app_prod"
    no-owner: true
    no-acl: true
 
restore:
  pg_restore_options:
    # UAT target database.
    # Greenmask restores into this database.
    dbname: "postgresql://uat_restore_user:${UAT_DB_PASSWORD}@app-uat-postgres:5432/app_uat"
    clean: true
    if-exists: true
    no-owner: true
    no-acl: true

If both databases are on the same Compose network, the service names work as hosts.

The dangerous line is not hidden. It is restore.pg_restore_options.dbname. With clean: true, restore can drop existing objects in the target database before recreating them from the dump. That is exactly what you want for a clean UAT refresh, and exactly what you do not want anywhere near production.

no-owner and no-acl are also worth keeping. They stop production ownership and grant metadata from being replayed in UAT. Without them, restores often fail on role mismatches:

ERROR: role "prod_readonly_user" does not exist
Command was: GRANT SELECT ON TABLE public.users TO prod_readonly_user;

Do not create production roles in UAT just to satisfy restore metadata. Skip that metadata.

Decide What Gets Masked, Skipped, or Copied

Greenmask reads the table list from production during dump. Your config does not need to describe every table. It needs to describe the tables and columns that should be transformed, excluded, or handled carefully.

Most schemas fall into three buckets:

1. Mask sensitive columns.
2. Skip rows that should never enter UAT.
3. Copy safe lookup tables as-is.

The trick is to preserve the relational shape of production while replacing the sensitive values. Primary keys and foreign keys usually stay unchanged.

For example:

Production user id 42 -> UAT user id 42 -> uat-user-42@example.test
Production project id 80 -> UAT project id 80 -> UAT Project 80

That gives you realistic joins, permissions, activity, reports, and edge cases without leaking the real customer details.

Mask the Obvious PII First

Start with users. Email, passwords, names, API keys, and any profile fields should be treated as sensitive.

transformation:
  - schema: "public"
    name: "users"
    transformers:
      - name: "Template"
        params:
          column: "email"
          template: 'uat-user-{{- .GetColumnValue "id" -}}@example.test'
          validate: true
 
      - name: "Replace"
        resolve_env: true
        params:
          column: "password"
          value: "${UAT_PASSWORD_HASH?UAT_PASSWORD_HASH is required}"
          validate: true
 
      - name: "SetNull"
        params:
          column: "api_key"

This keeps each user stable across refreshes:

Production:
  id: 42
  email: (real customer address)
 
UAT:
  id: 42
  email: uat-user-42@example.test

The password field should be a real hash generated by your application, not something guessed by hand. Use the same password-hashing code path your app uses for normal users.

In Dokploy, the runner needs the values Greenmask resolves:

PROD_DB_PASSWORD=...
UAT_DB_PASSWORD=...
UAT_PASSWORD_HASH=...
GREENMASK_GLOBAL_SALT=...

Keep the global salt stable if you want deterministic masking across runs.
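
A missing variable tends to surface late, as an authentication error or a failed masking rule, so a small preflight check can fail fast instead. This is a sketch, not something Greenmask provides: require_env is a hypothetical helper, and it should only be called with variable names you control because it uses eval.

```sh
# Fail when any of the named environment variables is unset or empty.
# Only pass trusted, hard-coded names: the lookup goes through eval.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example, run inside the runner container before dump or restore:
#   require_env PROD_DB_PASSWORD UAT_DB_PASSWORD UAT_PASSWORD_HASH GREENMASK_GLOBAL_SALT
```

The function reports every missing name rather than stopping at the first one, which saves a round trip when a fresh environment is missing several values.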

Rewrite URLs and Customer Entities

Production URLs can leak more than people expect: customer domains, private paths, signed assets, staging links, internal tools. I usually rewrite them rather than trying to preserve anything recognizable.

  - schema: "public"
    name: "urls"
    transformers:
      - name: "Template"
        params:
          column: "url"
          template: 'https://uat-target.example.test/project-{{- .GetColumnValue "project_id" -}}/url-{{- .GetColumnValue "id" -}}'
          validate: true

Example result:

Production:
  id: 500
  project_id: 80
  url: https://customer-production-site.com/private/page
 
UAT:
  id: 500
  project_id: 80
  url: https://uat-target.example.test/project-80/url-500

The same applies to customer-facing entities:

  - schema: "public"
    name: "clients"
    transformers:
      - name: "Template"
        params:
          column: "name"
          template: 'UAT Client {{- .GetColumnValue "id" -}}'
          validate: true
 
  - schema: "public"
    name: "projects"
    transformers:
      - name: "Template"
        params:
          column: "name"
          template: 'UAT Project {{- .GetColumnValue "id" -}}'
          validate: true
 
      - name: "Template"
        params:
          column: "description"
          template: 'Masked UAT project generated from production project {{- .GetColumnValue "id" -}}'
          validate: true

The output does not need to be pretty. It needs to be safe and still useful for testing.

Treat Free Text as Guilty Until Reviewed

Free-text fields are where sensitive data hides. A column called comment or description might contain customer names, internal URLs, credentials, snippets from support tickets, error logs, or copied report content.

I treat these as sensitive by default:

comments
notes
descriptions
issue text
activity logs
error messages
scan output
support messages
custom fields
AI-generated explanations
manual testing notes

Example rules:

  - schema: "public"
    name: "issue_comments"
    transformers:
      - name: "Template"
        params:
          column: "comment"
          template: 'Masked UAT comment for issue {{- .GetColumnValue "issue_id" -}}'
          validate: true
 
  - schema: "public"
    name: "issues"
    transformers:
      - name: "Template"
        params:
          column: "title"
          template: 'UAT Issue {{- .GetColumnValue "id" -}}'
          validate: true
 
      - name: "Template"
        params:
          column: "description"
          template: 'Masked UAT issue description for issue {{- .GetColumnValue "id" -}}'
          validate: true
 
      - name: "Template"
        params:
          column: "recommendation"
          template: 'Masked remediation guidance for UAT validation.'
          validate: true

Again, the aim is not literary quality. The aim is to keep screens, filters, counts, permissions, and workflows realistic without bringing production content along for the ride.

Drop Runtime Secrets Completely

Some tables should not be masked. They should be empty.

Auth tokens, sessions, API keys, password resets, OAuth credentials, webhook secrets, and invite tokens usually have no business entering UAT.

  - schema: "public"
    name: "tokens"
    query: "select * from public.tokens where false"

That keeps the table in the dump but writes zero rows.

I use this pattern for tables like:

tokens
sessions
refresh_tokens
password_reset_tokens
email_verification_tokens
oauth_credentials
webhook_secrets
api_keys

The rule of thumb is simple: if the value can authenticate, authorize, impersonate, call an API, or access a file, it should not be restored into UAT.
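
Writing the same three lines for every secret-bearing table gets repetitive. Here is a sketch of a generator, assuming every table lives in the public schema. emit_skip_rules is a hypothetical helper that prints entries in the same shape as the tokens rule above, ready to paste under transformation.

```sh
# Print one "zero rows" rule per table name, in the same shape as the
# tokens rule. Assumes the public schema for every table.
emit_skip_rules() {
  for table in "$@"; do
    printf '  - schema: "public"\n'
    printf '    name: "%s"\n' "$table"
    printf '    query: "select * from public.%s where false"\n' "$table"
  done
}

# Example:
#   emit_skip_rules sessions refresh_tokens password_reset_tokens
```

Review the generated entries before committing them. The helper does no quoting, so it is only suitable for plain table names like the ones listed above.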

Copy Lookup Tables Carefully

Not every table needs a rule. Static lookup tables often define application behavior rather than customer data.

These are usually safe to copy as-is after a quick review:

roles
severity
guidelines
success_criteria
country
state
project_status
testing_status
reporting_status
assistive_technology
project_environment
project_platform

The review matters because teams sometimes mix customer-specific configuration into tables that look like harmless lookup data.

When I classify a schema, I use these buckets:

| Bucket | Meaning | Action |
| --- | --- | --- |
| Sensitive data | PII, customer content, secrets, or production URLs | Mask columns |
| Dangerous runtime data | Tokens, sessions, credentials, secrets | Skip rows |
| Reference data | Static application lookup data | Copy as-is |
| Mixed data | Config plus customer-specific content | Mask selectively |
| Unknown | Not reviewed yet | Treat as sensitive |

The last row is important. Unknown does not mean safe.

Validate Before You Restore Anything

Before creating a dump or restoring into UAT, validate the config from the runner container:

greenmask --config /config/greenmask.yml validate --data --diff --transformed-only

Look at the transformed output. Do not treat validation as a green checkmark you blindly accept. You are looking for evidence that the risky fields changed:

emails
names
client names
project names
URLs
comments
issue text
tokens
code fields
free-text descriptions

Only after that do I run the first dump:

greenmask --config /config/greenmask.yml dump

At this point production is still only being read. UAT has not been touched. That separation is useful because it lets you prove the masking side before allowing a destructive restore.

Then restore into UAT:

greenmask --config /config/greenmask.yml restore latest

Before a manual restore, I still check the config like a person who enjoys sleeping:

restore.pg_restore_options.dbname points to app_uat
restore.pg_restore_options.dbname does not point to app_prod
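
That manual check is easy to automate. The sketch below is an assumption-heavy guard, not a Greenmask feature: restore_dbname and check_restore_target are hypothetical helpers, the awk parsing only understands the flat layout of the config shown earlier (a top-level restore: key with an indented dbname: line), the expected database name app_uat is hard-coded, and awk has to be available in the runner image.

```sh
# Print the dbname value from the top-level "restore:" section of a
# Greenmask config. Only handles the simple layout used in this post.
restore_dbname() {
  awk '
    /^[^ #]/ { section = $1 }     # remember the current top-level key
    section == "restore:" && $1 == "dbname:" {
      value = $2
      gsub(/"/, "", value)        # strip the surrounding quotes
      print value
      exit
    }
  ' "$1"
}

# Succeed only when the restore target is clearly the UAT database.
check_restore_target() {
  target=$(restore_dbname "$1")
  safe=${target##*@}              # drop credentials before printing
  case "$target" in
    "")        echo "refusing: no restore dbname found in $1" >&2; return 1 ;;
    *prod*)    echo "refusing: restore target mentions prod: $safe" >&2; return 1 ;;
    */app_uat) echo "restore target ok: $safe" ;;
    *)         echo "refusing: unexpected restore target: $safe" >&2; return 1 ;;
  esac
}
```

Chaining it in front of the restore, for example check_restore_target /config/greenmask.yml && greenmask --config /config/greenmask.yml restore latest, means a mistyped target stops the run before anything destructive happens. It complements the UAT-only credentials rather than replacing them.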

Add Post-Restore Checks

A refresh pipeline should not end at "restore succeeded." Restores can succeed while masking is incomplete.

After the restore, run SQL checks against UAT:

SELECT COUNT(*) FROM users
WHERE email !~ '^uat-user-[0-9]+@example\.test$';

SELECT COUNT(*) FROM tokens;

SELECT COUNT(*) FROM clients
WHERE name NOT LIKE 'UAT Client %';

SELECT COUNT(*) FROM projects
WHERE name NOT LIKE 'UAT Project %';

SELECT COUNT(*) FROM issue_comments
WHERE comment NOT LIKE 'Masked UAT comment%';

SELECT COUNT(*) FROM urls
WHERE url NOT LIKE 'https://uat-target.example.test/%';

Each query should return:

0
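
To make those checks part of the pipeline rather than something you eyeball, they can be wrapped in a script that fails loudly. This is a minimal sketch, assuming psql is available wherever it runs and that UAT_DATABASE_URL (a variable name invented here) holds the UAT connection string. assert_zero and run_masking_checks are hypothetical helpers, and only a few of the queries above are wired in.

```sh
# Run one COUNT(*) query against UAT and fail unless it returns 0.
assert_zero() {
  label=$1
  sql=$2
  # -X skips psqlrc; -A and -t print the bare value without headers.
  count=$(psql -X -A -t -v ON_ERROR_STOP=1 -c "$sql" "$UAT_DATABASE_URL") || {
    echo "ERROR $label: query failed" >&2
    return 1
  }
  if [ "$count" != "0" ]; then
    echo "FAIL $label: expected 0, got $count" >&2
    return 1
  fi
  echo "ok $label"
}

# Run every check so one failure does not hide the others.
run_masking_checks() {
  failed=0
  assert_zero "users.email" \
    "SELECT COUNT(*) FROM users WHERE email !~ '^uat-user-[0-9]+@example\.test\$'" || failed=1
  assert_zero "tokens" \
    "SELECT COUNT(*) FROM tokens" || failed=1
  assert_zero "clients.name" \
    "SELECT COUNT(*) FROM clients WHERE name NOT LIKE 'UAT Client %'" || failed=1
  assert_zero "urls.url" \
    "SELECT COUNT(*) FROM urls WHERE url NOT LIKE 'https://uat-target.example.test/%'" || failed=1
  return "$failed"
}
```

A non-zero exit status from run_masking_checks is the signal to keep the UAT application away from the freshly restored database until someone has looked at the failing category.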

Add checks for your own sensitive categories:

real customer emails
phone numbers
access tokens
refresh tokens
API keys
webhook secrets
production URLs
uploaded file paths
payment identifiers
OAuth credentials
third-party integration secrets

Schema changes are the reason these checks matter. A new sensitive column can be added to production later and silently copied into UAT unless the pipeline has a second line of defense.
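
One way to build that second line of defense is to keep a reviewed list of table.column names next to the config and fail the refresh when production grows a column nobody has classified. The sketch below assumes psql and a reviewed-columns file, neither of which is part of the setup described so far. list_columns and check_schema_drift are hypothetical helpers.

```sh
# Print every column in the public schema as "table.column", one per line.
# $1 is a connection string; the read-only production user is enough.
list_columns() {
  psql -X -A -t -v ON_ERROR_STOP=1 -c \
    "SELECT table_name || '.' || column_name
       FROM information_schema.columns
      WHERE table_schema = 'public'
      ORDER BY 1" "$1"
}

# $1: file with current columns, $2: file with reviewed columns.
# Fails and lists the difference when a column has not been reviewed.
check_schema_drift() {
  # -F fixed strings, -x whole-line match, -v keep lines NOT in the reviewed file.
  drift=$(grep -F -x -v -f "$2" "$1" || true)
  if [ -n "$drift" ]; then
    echo "unreviewed columns, classify them before refreshing UAT:" >&2
    echo "$drift" >&2
    return 1
  fi
  echo "schema matches the reviewed column list"
}
```

Running this before the dump turns a new column into a failed job instead of a silent copy. Adding the column to the reviewed file then becomes the explicit act of deciding whether it is masked, skipped, or copied.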

Keep the UAT Application Isolated Too

A masked database is only one part of UAT safety. The application runtime also needs to be separate.

Example:

NODE_ENV=uat
APP_URL=https://uat.example.com
DATABASE_URL=postgresql://uat_user:uat_password@app-uat-postgres:5432/app_uat

Do not reuse production integrations for:

email providers
file storage
payment gateways
scan APIs
analytics
webhooks
third-party integrations

UAT should not send real customer emails, trigger production webhooks, charge real payments, write to production storage, or report analytics into the production stream. If the database is disposable but the integrations are live, the environment is not actually safe.

Schedule It Only After Manual Runs

Once the manual flow works, create a scheduled job in Dokploy:

1. Open Schedule Jobs.
2. Create a new job.
3. Select the greenmask-runner container.
4. Use this command:

/bin/sh -lc 'greenmask --config /config/greenmask.yml dump && greenmask --config /config/greenmask.yml restore latest'

A daily 2 AM server-time schedule looks like this:

0 2 * * *

I do not enable the schedule on day one. I run the job manually, review the output, inspect UAT, and repeat that for a few cycles. Once the process is boring, then it can be automatic.
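
The scheduled one-liner is enough to get started. If you want clearer logs when something fails, the same two steps can live in a small function. This is a sketch under the same assumptions as the scheduled command (config at /config/greenmask.yml, greenmask on the PATH). refresh_uat is a hypothetical name.

```sh
# Dump from production, then restore the latest dump into UAT.
# Stops before the restore if the dump fails, so UAT keeps its old data.
refresh_uat() {
  config=${1:-/config/greenmask.yml}

  echo "dump: reading production"
  greenmask --config "$config" dump || {
    echo "dump failed; UAT was not touched" >&2
    return 1
  }

  echo "restore: writing UAT"
  greenmask --config "$config" restore latest || {
    echo "restore failed; UAT may be partially restored" >&2
    return 1
  }

  echo "refresh finished"
}
```

The explicit failure messages matter most for the restore step: a failed dump leaves UAT as it was, while a failed restore with clean: true can leave it half rebuilt, and the job output should say which of the two happened.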

My rollout checklist is usually:

1. Create a separate UAT database.
2. Add the Greenmask runner container.
3. Add the Greenmask config.
4. Classify sensitive tables.
5. Add masking rules.
6. Validate masking output.
7. Run the first masked dump.
8. Restore into a disposable UAT database.
9. Run SQL safety checks.
10. Point the UAT app to the UAT database.
11. Test login and core UAT flows.
12. Run the refresh manually for a few cycles.
13. Enable the Dokploy scheduled job.

The Guardrails I Would Not Skip

Use a read-only production user. Greenmask only needs read access for the dump: CONNECT on the database, USAGE on the schema, and SELECT on the tables and sequences. The production user should not be able to write, truncate, drop, alter, or create.

Keep restore credentials UAT-only. The restore user should not even be valid against production.

Make the names obvious. Prefer app_prod, app_uat, app-prod-postgres, and app-uat-postgres over names like db, main, default, or postgres.

Do not reuse production integrations. Use sandbox credentials for email, storage, payments, OAuth apps, webhooks, analytics, logging, and external APIs.

Check masking after every restore. Greenmask config is not a substitute for verification, especially as the schema evolves.

The safest model is:

Production is read-only input.
Greenmask is the transformation boundary.
UAT is disposable output.

If those three statements stay true, the refresh pipeline becomes predictable enough to run on a schedule without turning every refresh into a production risk.