Masking Functions
Masks are the basic "building-block" algorithms provided by DataMasque for generating and manipulating column values. Multiple masks can be combined in a list to create a pipeline of transformations on the data, or combined using combinator masks to build up more complex output values.
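The difference between a pipeline and a combinator can be sketched in a few lines of Python. This is a conceptual model only, not DataMasque's implementation or its ruleset syntax: the `Mask` alias, `apply_pipeline`, the `glue` argument, and the simplified helper signatures are assumptions made for illustration, borrowing only the mask names `from_fixed`, `transform_case`, and `concat` from this page.

```python
from typing import Callable, List

# Assumption: a mask is modelled as a plain function from a value to a value.
Mask = Callable[[str], str]


def from_fixed(value: str) -> Mask:
    """Illustrative generating mask: ignores the input, returns a fixed value."""
    return lambda _original: value


def transform_case_upper() -> Mask:
    """Illustrative manipulating mask: upper-cases whatever it receives."""
    return lambda original: original.upper()


def apply_pipeline(masks: List[Mask], original: str) -> str:
    """Apply masks in list order, feeding each output into the next mask."""
    result = original
    for mask in masks:
        result = mask(result)
    return result


def concat(masks: List[Mask], glue: str = "") -> Mask:
    """Illustrative combinator: run every mask on the same input, join the outputs."""
    return lambda original: glue.join(mask(original) for mask in masks)


# Pipeline: the fixed value replaces the input, then it is upper-cased.
pipeline_output = apply_pipeline([from_fixed("alex"), transform_case_upper()], "Original Name")

# Combinator: both masks see "abc"; their outputs are joined with "-".
combined_output = apply_pipeline(
    [concat([from_fixed("ID"), transform_case_upper()], glue="-")], "abc"
)
```

In this sketch a pipeline threads one value through successive masks, while a combinator builds one output from several masks that each receive the same input.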
Parameters
Mask algorithms are defined by their type parameter. This parameter is common to (and required by) all masks:
- type (required): Determines the type of mask, and therefore what other parameters can be specified.
Note: Masks operate by either manipulating the original column value, or by generating an entirely new value that replaces the original value. The behaviour depends on the mask type.
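As a rough illustration of how a required type parameter can determine which other parameters are valid, the following Python sketch checks a mask definition against a small registry. The registry, the `validate_mask` helper, and the parameter names `value` and `transformation` are hypothetical and chosen for illustration; they are not taken from this page, which only confirms the `type` parameter.

```python
from typing import Any, Dict, Set

# Hypothetical registry: mask type -> parameters accepted besides "type".
# The parameter names are illustrative assumptions, not documented ones.
ALLOWED_PARAMS: Dict[str, Set[str]] = {
    "from_fixed": {"value"},
    "transform_case": {"transformation"},
}


def validate_mask(definition: Dict[str, Any]) -> None:
    """Raise ValueError if the definition lacks a type or has unexpected parameters."""
    if "type" not in definition:
        raise ValueError("every mask requires a 'type' parameter")
    mask_type = definition["type"]
    if mask_type not in ALLOWED_PARAMS:
        raise ValueError(f"unknown mask type: {mask_type!r}")
    unexpected = set(definition) - {"type"} - ALLOWED_PARAMS[mask_type]
    if unexpected:
        raise ValueError(f"{mask_type} does not accept: {sorted(unexpected)}")


# A definition whose extra parameter matches its type passes silently.
validate_mask({"type": "from_fixed", "value": "REDACTED"})
```

A definition without `type`, or one that carries a parameter belonging to a different mask type, is rejected with a `ValueError` in this sketch.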
- Generic masks
  - Fixed value (from_fixed): Outputs a fixed value
  - From column (from_column): Outputs a value from another column (or table)
  - From file (from_file): Generates values sourced from a CSV file
  - From blob (from_blob): Outputs a value that is the entire contents of a file
  - From format string (from_format_string): Generates random values according to a format string
  - From choices (from_choices): Generates a random value picked from an (optionally weighted) set of values
- String masks
  - Imitate (imitate): Replaces characters with other characters from the same set (letters for letters, numbers for numbers)
  - Random text (from_random_text): Generates random strings of letters
  - Transform case (transform_case): Transforms the case of characters in a string
  - Substring (take_substring): Extracts a substring from a column's value
  - Replace substring (replace_substring): Applies masks to a specific portion of string values
  - Replace regular expression (replace_regex): Applies masks to parts of string values that match a given regular expression
  - Substitute (substitute) (deprecated): Use imitate instead
- Data pattern masks
  - Credit Card (credit_card): Replaces credit card values with random ones
  - Brazilian CPF (brazilian_cpf): Replaces Brazilian CPF numbers with random ones
  - Social Security Number (social_security_number): Replaces social security numbers with random ones
- Numeric masks
  - Random Number (from_random_number): Generates a random integer/decimal between two numbers (supports triangular or uniform distribution)
  - Random Boolean (from_random_boolean): Generates a random true/false or 1/0 value
  - Numeric Bucket (numeric_bucket): Generates replacement numbers whilst retaining specified ranges
- Date/time masks
  - Random date/time (from_random_datetime): Generates random dates and times
  - Random date (from_random_date): Generates random dates (without time components)
  - Retain age (retain_age): Transforms values into random date/times that preserve the age (in years)
  - Retain date component (retain_date_component): Transforms specific parts of a date to be random
  - Retain year (retain_year): Transforms a date/time randomly whilst keeping the same year
- Transformation masks
  - Typecast (typecast): Converts data to a different type
  - Do nothing (do_nothing): Prevents specific data from being masked
- Combination masks
  - Concatenate (concat): Combines the output of multiple mask operations together
  - Chain (chain): Chains multiple mask operations, passing the output of one to the next
- Unique masks
  - From Unique (from_unique): Generates random but unique strings or numbers, from a format string (databases only)
  - From Unique Imitate (from_unique_imitate): Transforms strings or numbers to be random, retaining format and uniqueness
- Document masks
  - JSON (json): Masks data inside JSON documents
  - XML (xml): Masks data inside XML documents