Common Data Masking Patterns
This guide provides masking examples for common data patterns. These examples can be used as a baseline and further developed to suit your specific masking needs.
- Credit Card Numbers
- Email Addresses
Credit Card Numbers
Generating random 16 digit card numbers
DataMasque provides the tools to easily generate random replacement 16 digit card numbers.
The example ruleset below uses the from_random_number mask to generate random numbers between the smallest possible 16 digit number (min: 1000000000000000) and the largest possible 16 digit number (max: 9999999999999999). This ensures that a 16 digit card number is always generated.
    version: '1.0'
    tasks:
      - type: mask_table
        table: customer
        key: id
        rules:
          - column: card_number
            masks:
              - type: from_random_number
                max: 9999999999999999
                min: 1000000000000000
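To see why these bounds always yield 16 digits, the range logic can be sketched in plain Python. This is only an illustration of the arithmetic, not DataMasque's implementation; the function name random_16_digit_number is hypothetical.

```python
import random

MIN_16_DIGIT = 1000000000000000  # smallest 16 digit number (10**15)
MAX_16_DIGIT = 9999999999999999  # largest 16 digit number (10**16 - 1)


def random_16_digit_number(rng: random.Random) -> int:
    """Return a random integer that always has exactly 16 digits."""
    # Python's randint includes both endpoints of the range.
    return rng.randint(MIN_16_DIGIT, MAX_16_DIGIT)


rng = random.Random(0)  # seeded only so the sketch is repeatable
for _ in range(3):
    number = random_16_digit_number(rng)
    print(number, len(str(number)))
```

Every value in the range has exactly 16 digits because the lower bound is the first number with 16 digits and the upper bound is the last.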
Generating 16 digit credit card numbers with issuer prefixes
You can generate replacement masked values with a combination of a valid, randomly selected issuer prefix and random numbers.
In the example provided below, we are concatenating 2 mask rules in order to generate a 16 digit credit card number that has a valid issuer prefix.
- The first mask selects a random prefix from this seed file DataMasque_credit-card-prefixes.csv.
- The second mask generates a random 14 digit number.
- The results of these 2 masks are concatenated together, creating a 16 digit card number with a 2 digit issuer prefix.
Please refer to our Supplementary Files user guide for detailed information on how to use seed files.
    version: '1.0'
    tasks:
      - type: mask_table
        table: customer
        key: id
        rules:
          - column: card_number
            masks:
              - type: concat
                masks:
                  - type: from_file
                    seed_column: prefix
                    seed_file: DataMasque_credit-card-prefixes.csv
                  - type: from_random_number
                    max: 99999999999999
                    min: 10000000000000
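The concatenation described above can be sketched in Python as follows. This is a rough illustration rather than DataMasque's implementation: the prefixes in EXAMPLE_PREFIXES are hypothetical stand-ins for the contents of the seed file, and the function name is invented for this sketch.

```python
import random

# Hypothetical 2 digit prefixes for illustration only; the real values
# come from the "prefix" column of DataMasque_credit-card-prefixes.csv.
EXAMPLE_PREFIXES = ["34", "37", "51", "55"]


def card_number_with_prefix(rng: random.Random, prefixes=EXAMPLE_PREFIXES) -> str:
    """Join a randomly chosen 2 digit prefix with a random 14 digit number."""
    prefix = rng.choice(prefixes)
    # The bounds match the ruleset, so the body always has 14 digits.
    body = rng.randint(10000000000000, 99999999999999)
    return f"{prefix}{body}"


rng = random.Random(0)  # seeded only so the sketch is repeatable
print(card_number_with_prefix(rng))
```

The result is 16 digits long only when every prefix is exactly 2 digits. A seed file with longer prefixes would need a correspondingly shorter random part.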
Masking to Primary Account Number (PAN) format
DataMasque provides masking rules which can be used to easily generate masked replacement values in Primary Account Number (PAN) format. PAN format allows displaying the first 6 digits and the last 4 digits of the original credit card number.
In the example provided below, we are using the replace_substring mask to replace the middle 6 digits of the card number with 'x' characters, leaving the first 6 digits and the last 4 digits of the card number unchanged. Indexes are zero-based: the substring to replace starts at start_index: 6, which is the 7th digit of the card number, and ends just before end_index: 12, so the last replaced character is the 12th digit (index 11). The 6 characters at index values 6, 7, 8, 9, 10, and 11 are replaced with the value xxxxxx generated by the from_fixed mask.
    version: '1.0'
    tasks:
      - type: mask_table
        table: customer
        key: id
        rules:
          - column: card_number
            masks:
              - type: replace_substring
                start_index: 6
                end_index: 12
                masks:
                  - type: from_fixed
                    value: "xxxxxx"
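The index arithmetic described above maps directly onto Python string slicing, which is also zero-based with an exclusive end index. The sketch below only illustrates that behaviour and is not DataMasque's implementation; the helper function and the sample card number are made up for the example.

```python
def replace_substring(value: str, start_index: int, end_index: int, replacement: str) -> str:
    """Replace the characters from start_index up to, but not including, end_index."""
    return value[:start_index] + replacement + value[end_index:]


# A made-up 16 digit value, chosen so each position is easy to follow.
original = "1234567890123456"
masked = replace_substring(original, 6, 12, "xxxxxx")
print(masked)  # 123456xxxxxx3456
```

The first 6 digits (indexes 0 to 5) and the last 4 digits (indexes 12 to 15) pass through unchanged, while indexes 6 to 11 are replaced.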
Email Addresses
Dynamically generating email addresses using other columns
DataMasque can generate replacement email addresses based on values in other columns to ensure that the replacement email address is a realistic representation. In this example, we are generating an email address that is a combination of first name and last name columns in the same table. This will ensure that the generated email matches the names within the same table row.
In the example provided below, we are concatenating 4 mask rules in order to generate a realistic email address. After this, the entire result is converted to lower case.
The following mask rules are applied in order:
- The first mask uses from_column to take the value of the first_name column.
- The second mask uses from_fixed to generate a string consisting of the '.' character.
- The third mask uses from_column to take the value of the last_name column.
- The fourth mask uses from_file to select an email suffix from the seed file DataMasque_email_suffixes.csv.
- The values of these 4 masks are concatenated.
- The next mask in the chain block is executed, converting the result of the concatenated masks to lower case.
Please refer to our guide on using the chain mask type with concat in our Ruleset YAML Specification user guide.
    version: '1.0'
    tasks:
      - type: mask_table
        table: customer
        key: id
        rules:
          - column: email_address
            masks:
              - type: chain
                masks:
                  - type: concat
                    masks:
                      - type: from_column
                        source_column: first_name
                      - type: from_fixed
                        value: '.'
                      - type: from_column
                        source_column: last_name
                      - type: from_file
                        seed_file: DataMasque_email_suffixes.csv
                        seed_column: email-suff
                  - type: transform_case
                    transform: lowercase
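The concatenate-then-lowercase sequence in this ruleset can be sketched in Python as follows. This is an illustration only, not DataMasque's implementation. It assumes that each suffix in the seed file already begins with "@" (for example "@example.com"), which the ruleset itself does not confirm; the function name and sample values are hypothetical.

```python
def build_email(first_name: str, last_name: str, suffix: str) -> str:
    """Mimic the concat step followed by the lowercase transform."""
    # concat: first_name + '.' + last_name + suffix taken from the seed file
    combined = first_name + "." + last_name + suffix
    # transform_case: lowercase is applied to the whole concatenated value
    return combined.lower()


# Assumption: the suffix carries its own "@"; the names are made up.
print(build_email("Jane", "Doe", "@example.com"))  # jane.doe@example.com
```

Lowercasing after concatenation, rather than per column, keeps the whole address consistent even when the suffix contains capital letters.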
The ruleset above will produce the email addresses below, given the following
values present in the table.