Transformation Masks
Transformation masks transform the underlying data type they work on. These masks are typically used alongside other masks for more complex masking operations.
- Typecast (
typecast
)
Converts data to a different type - Do Nothing (
do_nothing
)
Prevents specific data from being masked
Typecast (typecast
)
A mask to convert a value from one data type to another. This is particularly useful when a value is stored as one data type in a database, but needs to be transformed to a different data type to apply certain mask functions within DataMasque.
If the typecast cannot be performed (e.g. because the value of the column is incompatible with the target data type), an error is raised.
Parameters
typecast_as
(required): The data type you wish to convert the column value to. Must be one of:date
,datetime
,string
,integer
,float
,decimal
, orboolean
.date_format
(optional): When converting a string to a datetime, this format string can be provided to specify the expected format of the string. Conversely, when converting a datetime to a string, this format string can be provided to determine the format of the resulting string. Format strings must be valid Python date/time format strings. See also Date Parsing in DataMasque for more information.
Notes
- With
typecast_as: integer
, thetypecast
mask can only convert floats, or strings representing floats, if the floating-point value has a zero decimal part. For example, both4.0
(float) and"4.000"
(string) are converted to the integer4
. If the value has a non-zero decimal part, or is a special float value likenan
(not a number) orinf
(infinity), then an error is raised. - With
typecast_as: integer
,typecast_as: float
ortypecast_as: decimal
, the Boolean valuestrue
andfalse
are converted to1
and0
respectively. - The
decimal
type is designed for use with Parquet columns of typedecimal128
, but is also compatible withDECIMAL
,NUMBER
(Oracle) andNUMERIC
(SQL Server) columns. If you experience failures masking data with thedecimal
typecast, try usingfloat
instead.
Example
This example will replace every start_date
in the employees
table with a fixed date value of 2010-01-01
. The
fixed date value is specified as a string and typecast as a datetime before being written to the database.
version: '1.0'
tasks:
- type: mask_table
table: employees
key: id
rules:
- column: start_date
masks:
- type: from_fixed
value: '2010-01-01'
- type: typecast
typecast_as: datetime
date_format: '%Y-%m-%d'
Manipulating date values as strings
If you wish to manipulate non-string values using their string representations, you can use the typecast
mask.
Using the pattern demonstrated below, you may convert the value to a string, apply some manipulations, and then convert
back to the original data type. For example, to truncate the day and month from a date while retaining the
year, you may do the following:
version: '1.0'
tasks:
- type: mask_table
table: employees
key: id
rules:
- column: date_of_birth
masks:
# Convert to string
- type: typecast
typecast_as: string
date_format: '%Y-%m-%d'
# Take the 'YYYY-' component and append a fixed value of
# '01-01' to create the value 'YYYY-01-01'
- type: concat
masks:
- type: replace_substring
start_index: 5
masks:
- type: from_fixed
value: '01-01'
# Convert back to original datatype
- type: typecast
typecast_as: datetime
date_format: '%Y-%m-%d'
Do nothing (do_nothing
)
A simple mask which does not mask the data instead, sets the original value. This is useful when there are certain items or elements of columns that are not intended to be masked.
For example, you can use it to retain certain known XML attributes and mask all the rest. For more information please refer to XML Documentation.