DataMasque Portal

Date/Time Masks

These masks are used for working with date or datetime values when masking.

Random date/time (from_random_datetime)

This mask replaces the column value with a random datetime. Generated datetime values have a resolution of 1 second (the millisecond component will be 0).

Parameters
  • max (required): The generated date/time is guaranteed to be less than this date/time, formatted according to ISO 8601 or you can specify now/current_date_time, which will be the date and time that the run executes.
  • min (required): This is the minimum value for the generated date/time, formatted according to ISO 8601 or you can specify now/current_date_time, which will be the date and time that the run executes.

Note A from_random_datetime mask can be used in conjunction with a Typecast mask in order to make sure the format of the generated date/datetime value satisfies any format requirements.

Example

This example replaces the order_time column with a random date time between December 1, 2019 00:00:00 UTC-0 and December 31, 2019 05:30:00 UTC-0.

version: '1.0'
tasks:
  - type: mask_table
    table: orders
    key: id
    rules:
      - column: order_time
        masks:
          - type: from_random_datetime
            min: '2019-12-01T00:00+00:00'
            max: '2019-12-31T05:30+00:00'

Show result

Before After
order_time
2019-12-07 19:50:12+00:00
2019-12-10 22:33:00+00:00
1966-12-22 07:09:01+00:00
2019-12-27 09:23:23+00:00
2019-12-29 12:06:50+00:00
2019-12-30 08:40:34+00:00
order_time
2019-12-23 20:02:50+00:00
2019-12-09 11:32:00+00:00
2019-12-12 21:31:41+00:00
2019-12-06 12:07:18+00:00
2019-12-02 05:28:45+00:00
2019-12-20 08:40:34+00:00


Random date (from_random_date)

This mask replaces the column value with a random date. Generated date values have a resolution of 1 day and do not include a time component.

Parameters
  • max (required): The generated date/time is guaranteed to be less than this date/time, formatted according to ISO 8601 or you can specify now/current_date_time, which will be the date and time that the run executes.
  • min (required): This is the minimum value for the generated date/time, formatted according to ISO 8601 or you can specify now/current_date_time, which will be the date and time that the run executes.

Note A from_random_date mask can be used in conjunction with a Typecast mask in order to make sure the format of the generated date value satisfies any format requirements.

Example

This example replaces the date_of_birth column with a random date between January 1 1950 and December 31 2000.

version: '1.0'
tasks:
  - type: mask_table
    table: employees
    key: id
    rules:
      - column: date_of_birth
        masks:
          - type: from_random_date
            min: '1950-01-01'
            max: '2000-12-31'

Show result

Before After
date_of_birth
1959-03-07
1965-11-10
1966-06-22
1979-02-27
1990-05-31
1999-07-31
date_of_birth
1997-08-25
1970-10-09
1995-01-18
1973-04-18
1961-02-26
1992-04-30

Note that this mask uses the ISO-8601 standard for date values. This allows for ISO weeks to be used for formatting dates, ISO week formats are specified with a 'W' in front of the week number and optionally followed by the day of the week. Additionally, day of the year can be specified as a three-digit number. See Date formats below.

Date formats
Format Example Equivalent
yyyy2022January 1, 2022
yyyy-mm2022-03March 1, 2022
yyyy-mm-dd2022-03-05March 5, 2022
yyyy-Www2022-W20Week 20 of 2022
(May 16, 2022)
yyyy-Www-d2022-W20-4Day 4 of Week 20 of 2022
(May 19, 2022)
yyyy-ddd2022-050Day 50 of 2022
(February 19, 2022)

Retain age (retain_age)

This mask is designed to be used on a column containing dates. It will mask by generating random dates in such a way that age (in years) is retained, calculated based on the date the mask is executed. This can be used to ensure that columns such as date of birth, transaction date, registration date or other creation dates are masked without conflicting with other rules that may rely on the age of the masked date.

Parameters
  • date_format (optional): If the column value is read as a string type, then it will be converted to a date using this format. The new randomly generated date will also be written back to the database in the same format. Format strings must be valid Python date/time format strings. Defaults to "%Y-%m-%d".
  • day_of_year (optional): For consistent masking, specify a day_of_year to calculate the ages on that day of the current year. This should be specified in the format "<month>-<day>"; for example, "8-1" for the 1st of August. The year will be set to the current year. If this option is omitted, then ages will be calculated based on the current date.
  • allow_future (optional): When set to true, this allows for the masking of future dates while preserving the age derived from the date. Dates in the future will only be masked with dates that are also in the future. If set to false and a future date is encountered, an error will be raised. Defaults to true.
Example

This example replaces the date_of_birth column values while retaining the employee's age. The dates are based on the masking run being executed on August 1st, 2022. The employees' ages are 50, 50, 49 and 22, respectively, and are maintained even with the new dates of birth.

version: '1.0'
tasks:
  - type: mask_table
    table: employees
    key: id
    rules:
      - column: date_of_birth
        masks:
          - type: retain_age

Show result

Before After
date_of_birth
1972-07-31
1972-08-01
1972-08-02
2000-02-29
date_of_birth
1972-05-30
1971-08-10
1973-04-21
1999-10-20

Note that the mask considers birthdays on the 29th of February to be celebrated on the 28th of February on non-leap-years.

If masking datetime values (rather than just dates) then the time portion of the masked value will be 00:00:00 (midnight).


Retain date component (retain_date_component)

This mask will generate a date where up to two of the year, month, or day components value will be retained.

Parameters
  • date_format (optional): If the column value is read as a string type, then it will be converted to a date using this format. The new randomly generated date will also be written back to the database in the same format. Format strings must be valid Python date/time format strings. Defaults to "%Y-%m-%d".
  • year (this OR month OR day required): This is a bool which specifies whether the year component of the date should be retained. Defaults to False.
  • month (this OR year OR day required): This is a bool which specifies whether the month component of the date should be retained. Defaults to False.
  • day (this OR year OR month required): This is a bool which specifies whether the day component of the date should be retained. Defaults to False.
  • minimum_year (optional): This is the minimum year that can be generated. Defaults to 100 years ago.
  • maximum_year(optional): This is the maximum year that can be generated. Defaults to current year.
  • force_change (optional): Ensures the newly generated date is different to the input date. On the chance that the same date is generated.
Example

In this example the month of each date is retained, while the day and year will be randomly generated. The year will be a value between 1980 and 2005 (inclusive).

version: '1.0'
tasks:
  - type: mask_table
    table: customers
    key: id
    rules:
      - column: date_of_birth
        masks:
          - type: retain_date_component
            month: True
            minimum_year: 1980
            maximum_year: 2005

Show result

Before After
date_of_birth
1972-07-31
2000-08-01
2010-10-02
1996-02-29
date_of_birth
1998-07-21
2004-08-15
1990-10-27
2000-02-12

Note: In order for retain_date_component to consistently produce the same values between runs in different years, a fixed maximum_year must be specified.


Retain year (retain_year)

This mask replaces the column value with a random date. The generated date will have the year component retained while randomising month and day.

Parameters
  • date_format (optional): If the column value is read as a string type, then it will be converted to a date using this format. The new randomly generated date will also be written back to the database in the same format. Format strings must be valid Python date/time format strings. Defaults to "%Y-%m-%d".
  • force_change (optional): Ensures the newly generated date is different to the input date. On the chance that the same date is generated.
Example

This example replaces the date_of_birth column values with new dates while retaining their year component.

version: '1.0'
tasks:
  - type: mask_table
    table: employees
    key: id
    rules:
      - column: date_of_birth
        masks:
          - type: retain_year
            force_change: True

Show result

Before After
date_of_birth
1972-07-31
1984-08-01
1996-08-02
2000-02-29
date_of_birth
1972-01-30
1984-11-10
1996-06-28
2000-02-13