DataMasque glossary of terms
A connection defines the parameters and credentials that allow DataMasque to connect to and mask a target database.
By default, the mask types provided by DataMasque for generating random data (e.g.
from_random_text) produce a completely random
masked value for each row the mask is applied to. This is the most secure option for masked data generation, as the
masked values are generated independently of the original unmasked values.
However, there are many cases in data masking that require repeated generation of the same masked value for a given
input value (i.e. the data generation must be deterministic). DataMasque achieves this by using hash-based algorithms
that securely generate 'deterministic random' values. These values are uniformly distributed, but are deterministic with
respect to their input. Random mask types are made deterministic by specifying the
hash_columns parameter on the corresponding masking rule.
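The idea of hash-based 'deterministic random' generation can be sketched in a few lines of Python. This is an illustration only, not DataMasque's actual algorithm: the function names, the `secret` key, the choice of HMAC-SHA-256, and the use of `random.Random` (which is not a cryptographically secure generator) are all assumptions made for the example.

```python
import hashlib
import hmac
import random
import string


def random_text(length: int, rng: random.Random) -> str:
    """Return `length` lowercase letters drawn from `rng`."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def deterministic_text(hash_value: str, secret: bytes, length: int = 8) -> str:
    """Generate text that depends only on `hash_value` and `secret`.

    Hypothetical helper: `hash_value` stands in for the contents of the
    column(s) that a rule would name in its hash_columns parameter.
    """
    # A keyed hash turns the input into a digest that cannot be predicted
    # without knowing the secret.
    digest = hmac.new(secret, hash_value.encode("utf-8"), hashlib.sha256).digest()
    # Seeding the generator with the digest makes the output repeatable:
    # the same input and secret always yield the same masked value.
    rng = random.Random(int.from_bytes(digest, "big"))
    return random_text(length, rng)


# Non-deterministic: a fresh, unseeded generator gives unrelated values per row.
print(random_text(8, random.Random()))

# Deterministic: both calls print the same masked value.
print(deterministic_text("alice@example.com", b"example-secret"))
print(deterministic_text("alice@example.com", b"example-secret"))
```

In this sketch, rows that share the same hashed input receive the same masked value, which is the property needed to keep related columns or tables consistent after masking.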
Masks are the algorithms provided by DataMasque for generating and manipulating database column values. Some
mask types operate by modifying their input value (e.g.
take_substring), while others act as a source of values.
When masks are combined in sequence they act as a pipeline, passing the output from one mask into the input of the next. The first mask in the sequence receives the original column value as input.
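The pipeline behaviour described above can be modelled as function composition. The sketch below is a simplified Python model rather than DataMasque code: `take_substring` here is a stand-in that borrows the name of the mask type, and its `start`/`end` parameters, along with `fixed_value` and `apply_masks`, are hypothetical names chosen for the example.

```python
from functools import reduce
from typing import Callable, Sequence

# In this model, a mask is any function from a column value to a new value.
Mask = Callable[[str], str]


def take_substring(start: int, end: int) -> Mask:
    """Build a mask that modifies its input by keeping value[start:end]."""
    return lambda value: value[start:end]


def fixed_value(replacement: str) -> Mask:
    """Build a mask that ignores its input and acts as a source of values."""
    return lambda _value: replacement


def apply_masks(masks: Sequence[Mask], column_value: str) -> str:
    """Run the masks in order, feeding each output into the next mask."""
    return reduce(lambda value, mask: mask(value), masks, column_value)


# The first mask receives the original column value; the second receives
# the first mask's output.
print(apply_masks([take_substring(0, 3), str.upper], "wellington"))  # WEL
```

Because a source-style mask discards its input, placing one later in the sequence overrides whatever the earlier masks produced.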
A mask_table task requires a list of
rules. A rule describes the sequence of one or
more mask algorithms that will be applied to a single database column. In most cases you
can consider there to be a one-to-one mapping of rules to database columns.
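To make the rule-to-column mapping concrete, the following Python sketch models a rule as a column name plus an ordered list of masks and applies a list of such rules to one row. The `Rule` class and `mask_row` function are hypothetical and only illustrate the relationship; real rules are written in the ruleset YAML.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Mask = Callable[[Any], Any]


@dataclass
class Rule:
    """Hypothetical model of a rule: one column and its sequence of masks."""
    column: str
    masks: List[Mask]


def mask_row(row: Dict[str, Any], rules: List[Rule]) -> Dict[str, Any]:
    """Return a copy of `row` with each rule applied to its own column."""
    masked = dict(row)  # leave the caller's row untouched
    for rule in rules:
        value = masked[rule.column]
        for mask in rule.masks:  # masks run in order, as a pipeline
            value = mask(value)
        masked[rule.column] = value
    return masked


rules = [
    Rule("name", [lambda _value: "REDACTED"]),
    Rule("phone", [lambda value: value[-4:]]),
]
row = {"id": 1, "name": "Alice Smith", "phone": "021-555-0100"}
print(mask_row(row, rules))
# {'id': 1, 'name': 'REDACTED', 'phone': '0100'}
```

Columns without a rule, such as `id` in this example, pass through unchanged.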
A ruleset is the configuration that defines the tasks that will be executed by DataMasque during a masking run. Rulesets are created and edited using the ruleset editor, and are written in the YAML configuration language.
Tasks are the basic building blocks of a ruleset. Each task represents some action that DataMasque will take during a masking run. Different task types are available for common database masking needs:
- To create a temporary table from an SQL query, use the
- To truncate a table, use the
- To run an SQL script, use the
- To mask a table using DataMasque mask algorithms, use the
- Use the special
serial task types to group subtasks for parallel execution.
For more information on tasks, see the Ruleset YAML Specification.