DataMasque Portal

Common Data Masking Patterns

This guide provides masking examples for common data patterns. These examples can be used as a baseline and further developed to suit your specific masking needs.

Credit Card Numbers

Generating random 16 digit card numbers

DataMasque provides the tools to easily generate random replacement 16 digit card numbers.

The example ruleset below uses the from_random_number mask to generate random numbers between the smallest possible 16 digit number (min: 1000000000000000) and the largest possible 16 digit number (max: 9999999999999999). This ensures that a 16 digit card number is always generated.

version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: from_random_number
            max: 9999999999999999
            min: 1000000000000000

Before            After
card_number       card_number
1234567890123456  9901955563298573
2345678901234567  8481099208721166
3456789012345678  2474102508362565
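
The effect of the min and max bounds can be sketched outside DataMasque. The Python snippet below only illustrates the arithmetic and is not DataMasque's implementation; the function name random_card_number is hypothetical.

```python
import random

MIN_16_DIGIT = 10**15        # 1000000000000000, the smallest 16 digit number
MAX_16_DIGIT = 10**16 - 1    # 9999999999999999, the largest 16 digit number


def random_card_number(rng=random):
    """Return a random integer that always has exactly 16 digits (hypothetical helper)."""
    # randint includes both bounds, mirroring the min and max values in the ruleset.
    return rng.randint(MIN_16_DIGIT, MAX_16_DIGIT)


number = random_card_number()
print(number, len(str(number)))
```

Because the lower bound is 1000000000000000, a generated value can never have fewer than 16 digits. A smaller min would allow shorter numbers.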

Generating 16 digit credit card numbers with issuer prefixes

You can generate replacement masked values with a combination of a valid, randomly selected issuer prefix and random numbers.

In the example provided below, we concatenate 2 masks in order to generate a 16 digit credit card number that has a valid issuer prefix.

  1. The first mask selects a random prefix from the prefix column of the seed file DataMasque_credit-card-prefixes.csv.
  2. The second mask generates a random 14 digit number.
  3. The results of these 2 masks are concatenated, creating a 16 digit card number with a 2 digit issuer prefix.

Please refer to our Supplementary Files user guide for detailed information on how to use seed files.

version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: concat
            masks:
              - type: from_file
                seed_column: prefix
                seed_file: DataMasque_credit-card-prefixes.csv
              - type: from_random_number
                max: 99999999999999
                min: 10000000000000

Before            After
card_number       card_number
1234567890123456  4460277727115512
2345678901234567  3673999219735487
3456789012345678  8974158374290952
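
The length arithmetic behind this ruleset (a 2 digit prefix plus a 14 digit number gives 16 digits) can be sketched in Python. This is an illustration only, not DataMasque's implementation. The function name is hypothetical, and the prefix list is a placeholder taken from the first two digits of the sample results; it is not the actual contents of the seed file.

```python
import random

# Placeholder prefixes for illustration only. In the ruleset, the real values
# come from the prefix column of DataMasque_credit-card-prefixes.csv.
EXAMPLE_PREFIXES = ["44", "36", "89"]


def card_number_with_prefix(prefixes=EXAMPLE_PREFIXES, rng=random):
    """Concatenate a random 2 digit prefix with a random 14 digit number (hypothetical helper)."""
    prefix = rng.choice(prefixes)
    # Bounds match the ruleset: min 10000000000000, max 99999999999999.
    body = rng.randint(10**13, 10**14 - 1)
    return prefix + str(body)


card = card_number_with_prefix()
print(card, len(card))
```

If a seed file contained prefixes of a different length, the min and max of the random number would need to change so that the total still comes to 16 digits.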

Masking to Primary Account Number (PAN) format

DataMasque provides masks that can be used to generate replacement values in Primary Account Number (PAN) format. PAN format allows displaying only the first 6 digits and the last 4 digits of the original credit card number.

In the example provided below, we use the replace_substring mask to replace the middle 6 digits of the card number with 'x' characters, leaving the first 6 digits and the last 4 digits unchanged. Index values start at 0, so the substring to replace starts at start_index: 6, which is the 7th digit of the card number, and ends just before end_index: 12, so the last digit replaced is the 12th digit of the card number. The 6 characters at index values 6, 7, 8, 9, 10, and 11 are replaced with the value xxxxxx generated by the from_fixed mask.

version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: card_number
        masks:
          - type: replace_substring
            start_index: 6
            end_index: 12
            masks:
              - type: from_fixed
                value: "xxxxxx"

Before            After
card_number       card_number
1234567890123456  123456xxxxxx3456
2345678901234567  234567xxxxxx4567
3456789012345678  345678xxxxxx5678
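
The index behaviour described above (start included, end excluded) matches ordinary string slicing. The Python sketch below illustrates it. It is not DataMasque's implementation, and the function is a hypothetical stand-in that only borrows the mask's name and parameter names.

```python
def replace_substring(value, start_index, end_index, replacement):
    """Replace the characters from start_index up to, but not including, end_index."""
    return value[:start_index] + replacement + value[end_index:]


masked = replace_substring("1234567890123456", 6, 12, "xxxxxx")
print(masked)  # 123456xxxxxx3456
```

With start_index 6 and end_index 12, exactly 12 - 6 = 6 characters are replaced, which is why the fixed value contains 6 'x' characters.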

Email Addresses

Dynamically generating email addresses using other columns

DataMasque can generate replacement email addresses based on values in other columns to ensure that the replacement email address is a realistic representation. In this example, we are generating an email address that is a combination of first name and last name columns in the same table. This will ensure that the generated email matches the names within the same table row.

In the example provided below, we concatenate 4 masks in order to generate a realistic email address. After this, the entire result is converted to lower case.

The following mask rules are applied in order:

  1. The first mask uses from_column to take the value of the first_name column.
  2. The second mask uses from_fixed to generate a string consisting of the . (period) character.
  3. The third mask uses from_column to take the value of the last_name column.
  4. The fourth mask uses from_file to select an email suffix from the email-suff column of the seed file DataMasque_email_suffixes.csv.
  5. The values of these 4 masks are concatenated.
  6. The next mask in the chain block is executed, converting the result of the concatenated masks to lower case.

For more information on using the chain mask type with concat, please refer to our Ruleset YAML Specification user guide.

version: '1.0'
tasks:
  - type: mask_table
    table: customer
    key: id
    rules:
      - column: email_address
        masks:
          - type: chain
            masks:
              - type: concat
                masks:
                  - type: from_column
                    source_column: first_name
                  - type: from_fixed
                    value: '.'
                  - type: from_column
                    source_column: last_name
                  - type: from_file
                    seed_file: DataMasque_email_suffixes.csv
                    seed_column: email-suff
              - type: transform_case
                transform: lowercase

The ruleset above will produce the email addresses below, given the following first_name and last_name values present in the table.

first_name  last_name  email_address
Andrew      Brown      andrew.brown@hotmail.com
Cindy       Dixon      cindy.dixon@gmail.com
Edmund      Frank      edmund.frank@outlook.com
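
The order of operations in this ruleset (concatenate first, then convert to lower case) can be sketched in Python. This is an illustration only, not DataMasque's implementation. The function name build_email is hypothetical, and the sketch assumes that the suffix values in the seed file include the @ character, as the sample results suggest.

```python
def build_email(first_name, last_name, suffix):
    """Mimic the chain: concat the parts, then lower-case the whole result."""
    # Inner step (concat): join the parts in the order they are listed.
    combined = first_name + "." + last_name + suffix
    # Outer step (chain): apply the lower case transform to the concatenated value.
    return combined.lower()


print(build_email("Andrew", "Brown", "@hotmail.com"))  # andrew.brown@hotmail.com
```

Applying the case transform after the concatenation means one transform covers every part of the address, rather than each part needing its own.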