Common Data Masking Patterns
This guide provides masking examples for common data patterns. These examples can be used as a baseline and further developed to suit your specific masking needs.
Credit Card Numbers
Generating random 16 digit card numbers
DataMasque provides the tools to easily generate random replacement 16 digit card numbers.
The example ruleset below uses the from_random_number mask to generate random numbers between the smallest possible 16 digit number (min: 1000000000000000) and the largest possible 16 digit number (max: 9999999999999999).
This ensures that a 16 digit card number is always generated.
version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: from_random_number
            max: 9999999999999999
            min: 1000000000000000
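The reasoning behind the min and max bounds can be checked with a short Python sketch. This is only an illustration of the arithmetic, not DataMasque's implementation; the names MIN_16, MAX_16, and random_card_number are hypothetical.

```python
import random

# Bounds taken from the ruleset above.
MIN_16 = 10**15        # 1000000000000000, the smallest 16 digit number
MAX_16 = 10**16 - 1    # 9999999999999999, the largest 16 digit number


def random_card_number(rng: random.Random) -> str:
    """Return a random integer between MIN_16 and MAX_16 (inclusive) as a string."""
    return str(rng.randint(MIN_16, MAX_16))
```

Because every integer in this range has exactly 16 digits, the generated value can never be shorter or longer than a 16 digit card number.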
Generating 16 digit credit card numbers with issuer prefixes
You can generate replacement masked values with a combination of a valid, randomly selected issuer prefix and random numbers.
In the example provided below, we are concatenating 2 mask rules in order to generate a 16 digit credit card number that has a valid issuer prefix.
- The first mask selects a random prefix from this seed file DataMasque_credit-card-prefixes.csv.
- The second mask generates a random 14 digit number.
- The results of these 2 masks are concatenated together, creating a 16 digit card number with a 2 digit issuer prefix.
Please refer to our Supplementary Files user guide for detailed information on how to use seed files.
version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: concat
            masks:
              - type: from_file
                seed_column: prefix
                seed_file: DataMasque_credit-card-prefixes.csv
              - type: from_random_number
                max: 99999999999999
                min: 10000000000000
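The concatenation described above can be sketched in Python as follows. This is a minimal illustration, not DataMasque's implementation: the prefix list is a hypothetical stand-in for the rows of the seed file, and card_with_prefix is a hypothetical helper name.

```python
import random

# Hypothetical placeholder prefixes; the real values come from the seed file.
EXAMPLE_PREFIXES = ["34", "37", "51", "55"]


def card_with_prefix(rng: random.Random, prefixes=EXAMPLE_PREFIXES) -> str:
    """Concatenate a random 2 digit prefix with a random 14 digit number."""
    prefix = rng.choice(prefixes)            # plays the role of from_file
    body = rng.randint(10**13, 10**14 - 1)   # plays the role of from_random_number
    return prefix + str(body)                # plays the role of concat
```

A 2 digit prefix followed by a 14 digit number always yields 16 digits, which is why the second mask uses the 14 digit bounds rather than the 16 digit bounds from the previous example.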
Masking to Primary Account Number (PAN) format
DataMasque provides masking rules which can be used to easily generate masked replacement values in Primary Account Number (PAN) format. PAN format allows displaying the first 6 digits and the last 4 digits of the original credit card number.
In the example provided below, we are using the replace_substring mask to replace the middle 6 digits of the card number with 'x' characters, leaving the first 6 digits and the last 4 digits of the card number unchanged. Index values are zero-based: the substring to replace starts at start_index: 6, which is the 7th digit of the card number, and ends just before end_index: 12, so the last character replaced is the 12th digit (index 11). The 6 characters at index values 6, 7, 8, 9, 10, and 11 are replaced with the value xxxxxx generated by the from_fixed mask type.
version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: replace_substring
            start_index: 6
            end_index: 12
            masks:
              - type: from_fixed
                value: "xxxxxx"
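The index arithmetic can be illustrated with Python string slicing, which uses zero-based indexes and an exclusive end index, matching the description above. This is a sketch of the behaviour as described in this guide, not DataMasque's implementation; mask_to_pan is a hypothetical helper name.

```python
def mask_to_pan(card_number: str, start_index: int = 6, end_index: int = 12) -> str:
    """Replace the characters at indexes start_index..end_index-1 with 'x'."""
    replacement = "x" * (end_index - start_index)  # "xxxxxx" for 6 and 12
    return card_number[:start_index] + replacement + card_number[end_index:]


masked = mask_to_pan("1234567890123456")
# masked is "123456xxxxxx3456": first 6 and last 4 digits are preserved.
```

The sample card number here is an arbitrary 16 digit string chosen so that each position is easy to identify.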
Email Addresses
Dynamically generating email addresses using other columns
DataMasque can generate replacement email addresses based on values in other columns to ensure that the replacement email address is a realistic representation. In this example, we are generating an email address that is a combination of first name and last name columns in the same table. This will ensure that the generated email matches the names within the same table row.
In the example provided below, we are concatenating 4 mask rules in order to generate a realistic email address. After this, the entire result is converted to lower case.
The following mask rules are applied in order:
- The first mask uses from_column to take the value of the first_name column.
- The second mask uses from_fixed to generate a string consisting of the . (period) character.
- The third mask uses from_column to take the value of the last_name column.
- The fourth mask uses from_file to select a random email suffix from the email-suff column of the seed file DataMasque_email_suffixes.csv.
- The values of these 4 masks are concatenated.
- The next mask in the chain block is executed, converting the result of the concatenated masks to lower case.
Please refer to our guide on using the chain mask type with concat in our Ruleset YAML Specification user guide.
version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: email_address
        masks:
          - type: chain
            masks:
              - type: concat
                masks:
                  - type: from_column
                    source_column: first_name
                  - type: from_fixed
                    value: '.'
                  - type: from_column
                    source_column: last_name
                  - type: from_file
                    seed_file: DataMasque_email_suffixes.csv
                    seed_column: email-suff
              - type: transform_case
                transform: lowercase
The ruleset above will produce email addresses like those below, given the following first_name and last_name values present in the table. The suffix of each address is selected at random from the seed file.
first_name | last_name | email_address |
---|---|---|
Andrew | Brown | andrew.brown@hotmail.com |
Cindy | Dixon | cindy.dixon@gmail.com |
Edmund | Frank | edmund.frank@outlook.com |
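
The order of operations in the ruleset (concatenate first, then convert to lower case) can be sketched in Python. This is an illustration only, not DataMasque's implementation; build_email is a hypothetical helper, and the suffix is passed in directly rather than being read from the seed file.

```python
def build_email(first_name: str, last_name: str, suffix: str) -> str:
    """Join first name, '.', last name and suffix, then lower-case the result."""
    combined = first_name + "." + last_name + suffix  # the concat step
    return combined.lower()                           # the transform_case step


email = build_email("Andrew", "Brown", "@hotmail.com")
# email is "andrew.brown@hotmail.com", matching the first row of the table.
```

Applying the lower case conversion after concatenation means a single transform covers every part of the address, including any upper case characters in the suffix.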