This document contains all notable changes included in each release of DataMasque.
The DataMasque versioning scheme follows the semantic versioning convention, MAJOR.MINOR.PATCH:
- MAJOR version is incremented when incompatible API/schema changes are made
- MINOR version is incremented when functionality is added in a backwards compatible manner
- PATCH version is incremented when backwards compatible bug fixes are made
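As an illustration of the scheme above, the following minimal Python sketch parses a release heading of the form used in this document and classifies the step between two versions. The names Version, parse_heading and change_kind are hypothetical and are not part of DataMasque.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    """A MAJOR.MINOR.PATCH version; tuples compare numerically, field by field."""
    major: int
    minor: int
    patch: int


# Matches headings such as "[2.14.0] - 2023-09-07".
_HEADING = re.compile(r"^\[(\d+)\.(\d+)\.(\d+)\] - \d{4}-\d{2}-\d{2}$")


def parse_heading(line: str) -> Version:
    """Extract the version from a release heading line."""
    match = _HEADING.match(line.strip())
    if match is None:
        raise ValueError(f"not a release heading: {line!r}")
    return Version(*(int(part) for part in match.groups()))


def change_kind(old: Version, new: Version) -> str:
    """Return which component changed first: 'major', 'minor' or 'patch'."""
    if new <= old:
        raise ValueError("new version must be greater than old version")
    if new.major != old.major:
        return "major"
    if new.minor != old.minor:
        return "minor"
    return "patch"
```

For example, moving from 2.13.0 to 2.14.0 is classified as a minor change, and 2.9.0 sorts before 2.10.0 because the components are compared as integers rather than as text.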
[2.14.0] - 2023-09-07
from_blob mask to replace binary data in BLOB columns or files. Support is included for Oracle databases or entire files using
- The sidebar menu can now be toggled by clicking anywhere on it.
[2.13.0] - 2023-08-18
DataMasque now supports deployments using Red Hat Podman.
json masks now support hash_sources to fetch hash data from inside the current XML or JSON document.
Support for Oracle Native Network Encryption (NNE).
Ruleset generator can filter to show sensitive columns only.
Run log filtering and searching.
value_on_missing option has been added to from_file to specify a value to be inserted instead of null when a value can't be found in a CSV seed file.
The list of run logs is server side paginated for faster loading.
The ruleset generator process is now asynchronous.
Multiple foreign keys in the ruleset generator table are displayed on separate lines for better readability.
Login sessions are limited to 12 hours maximum, with a one-hour inactivity timeout.
All configurable run options are recorded in the run logs for auditing.
The Add User panel does not hide after a user is added, so adding multiple users is easier.
Updated password policy to incorporate NIST standards.
Columns or attributes masked by mask_tabular_file tasks are now listed in the run preview and run log.
The side menu can be clicked anywhere to expand/collapse.
Schema discovery does not fail if the schema has no tables.
Using Select All in the ruleset generator now accumulates selections.
Runs can no longer be started with a ruleset of the wrong type (e.g. files for DB connections, and vice versa).
The full ruleset and connection names are now displayed on the run log screen.
Failed conditional comparisons now log the datatypes correctly.
Sorting users by role in the UI now works correctly.
Updating a user's username no longer removes their role.
Invalid data in batch size and max rows now shows validation errors.
An error is now shown if the run secret is too short.
Invalid license messages are now shown in the run log.
MySQL errors in run_sql tasks are captured and shown in the run log.
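The session limits listed for 2.13.0 (a 12-hour maximum with a one-hour inactivity timeout) can be read as two independent checks. The following is a minimal sketch of that reading only; the function and constant names are hypothetical, the treatment of the exact boundary is an assumption, and it does not reflect how DataMasque implements sessions.

```python
from datetime import datetime, timedelta

# Limits taken from the release note; the names are illustrative only.
MAX_SESSION_AGE = timedelta(hours=12)
INACTIVITY_TIMEOUT = timedelta(hours=1)


def session_is_valid(created_at: datetime, last_active_at: datetime, now: datetime) -> bool:
    """Return True while the session is within both limits.

    Assumption: a session exactly at a limit is still valid.
    """
    if now - created_at > MAX_SESSION_AGE:
        return False  # absolute lifetime exceeded, regardless of activity
    if now - last_active_at > INACTIVITY_TIMEOUT:
        return False  # idle for too long
    return True
```

Under this reading, a session that is used every few minutes still ends once it is more than 12 hours old, and a fresh session ends after more than an hour without activity.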
[2.12.0] - 2023-06-09
xml masks for consistent generation of lists of values, when using
A mixed-gendered firstname seed file (
gender column for filtering.
hash_columns now support a trim parameter to trim surrounding whitespace on hash value(s).
Admin server exceptions are now logged to a file for easier troubleshooting.
Amazon DynamoDB supports conditionals on columns that don't exist, when on_missing is set to
The region of an Amazon DynamoDB table may now be set in the connection dialogue, instead of having to be set in the ruleset.
Added support for IMDSv2 on AWS EC2.
--non-interactive flag for the DataMasque installer, for unattended updates.
Performance improvements to Ruleset Generator.
Performance improvements to JSON rule generator.
Generated rulesets now contain comments about how key columns are masked, and if extra columns have been included due to being part of a composite key.
Logs for Azure Blob Storage connections have reduced verbosity to decrease the amount of unnecessary log entries.
The run log display now supports colours for different log levels.
Improved logging of errors if the agent is unable to communicate with the admin server, for easier troubleshooting.
Warnings about rows being skipped due to skip are only shown once per table (per run).
imitate is now the default fallback mask in Ruleset Generator if no matches are found (previously it was
Amazon DynamoDB permissions are checked at the start of the masking run, rather than at the end, for quicker time to failure.
substitute mask is renamed to
substitute is still retained for backwards compatibility, but is deprecated and may be removed from a future version of DataMasque.
Amazon DynamoDB is now masked in chunks to prevent high memory usage. The file size for masking is configurable per run.
DataMasque now attempts to cast values being set on Amazon DynamoDB key columns to the correct type.
Masking of Amazon DynamoDB tables with on-demand or provisioned capacities, as well as local/global secondary indexes with either on-demand or provisioned capacities.
XML declaration is retained even if it does not contain an encoding.
XML conditions support XML documents with or without declarations.
Deselect all columns on the Ruleset Generator now works correctly.
Visibility of Add File Connection button for the Mask Builder role.
A useful error is now shown when trying to compare timezone-aware and timezone-naive times in a condition.
Public accessibility checking of S3 buckets is now more thorough, and shows more useful error messages for disabling public access.
[2.11.2] - 2023-05-15
Performance improvements to file masking on AWS S3.
Performance improvements to masking of NDJSON and Apache Avro files (on all file connection types).
Performance improvements to schema discovery with Ruleset Generator.
Masking against databases without a database ID no longer fails. This mostly affects MSSQL but the fix applies to all database types.
Fixed support for MySQL databases where the connection's hostname combined with the database name is more than 100 characters (e.g. some AWS RDS configurations).
Improved detection of client IP addresses when DataMasque is used with an HTTP(S) load balancer.
[2.11.1] - 2023-04-03
- Select All checkbox in Ruleset Generator no longer selects unselectable foreign keys that should only be masked by cascade.
[2.11.0] - 2023-03-29
Support for masking Amazon DynamoDB.
Conditional masking for files, including else rules. Conditions can be applied to XML or JSON documents.
Support for masking NDJSON files.
Support for masking Apache Avro files.
Conditions in database masking may use predicates from XML or JSON data in columns.
Support for shortcuts in date-based conditional masking, e.g.
from_unique mask to generate unique values for use in columns or databases without unique constraints.
A previous version of DataMasque can now overwrite a newer version on install, by providing the --force flag to the DataMasque installer (the
Improvements to date of birth column detection in the Ruleset Generator.
delimiter option can be specified for tabular file masking to set the delimiter of character delimited files. For example, delimiter: "\t" for tab-delimited files. When omitted, this defaults to
Support for masking UUID columns in PostgreSQL.
Added special UTF8 versions of some default seed files for use with columns or files that support UTF8 encoded input. These files have UTF8 in their names. Any default seed file without UTF8 in its name contains only ASCII characters.
Support for XML namespaces. This includes the
hash_sources and conditional masking.
include rules can now be applied to the entire path name or just the file name.
from_random_datetime masks can now specify
max value. This allows the ruleset to stay up to date with the current execution time.
now can also be used as a shorter synonym of
Support for retaining parameters when re-running a previous run.
Batch size, ruleset UUID and name are all displayed on their own line in the run log, for easier auditing.
Task configuration (ruleset) is now added to the runlog for all tasks.
Ruleset Generator now supports a wider range of unique key generators.
update_foreign_keys option to unique key masking of additional cascades.
The default value of
xml masks is now error. That is, if this option was not specified in prior versions of DataMasque, it would default to skip. Attempting to mask an XML or JSON node that did not exist would simply skip to the next mask. From 2.11 onwards, the task will now fail. Previous behaviour can be restored by explicitly setting on_missing: skip in the relevant places in the ruleset.
The same table and/or columns may be masked more than once in a single ruleset. This now produces a warning instead of an error. The exceptions to this are:
- Amazon DynamoDB: a table may only be masked once per ruleset.
- The same table cannot be masked multiple times in a
The same file cannot be masked multiple times in a single ruleset. Using include rules can prevent file double-ups. The masking run will not start if multiple masks apply to the same file.
retain_year masks have been updated to retain a
null was retrieved from an SQL column. Previously they would fail or use a fallback mask.
Tabular file masking now defaults to character delimited files (e.g. CSV) for files that don't match other known prefixes.
The file extension for fixed-width files can now be specified with or without a leading
imitate mask was renamed to
imitate is still available for backwards compatibility but may be removed in a future version of DataMasque.
json mask now returns the same type it received: for example, JSON encoded as a string will be returned as JSON encoded as a string, while already decoded JSON data will be returned as a raw object.
The minimum year in retain_date_component defaults to 100 years less than the maximum year (if not specified). Previously it was 100 years before today.
Foreign keys can no longer be selected in the ruleset generator. Instead, their foreign primary key should be masked and cascaded.
Multiple target column formats of mask_unique_key are allowed to be variable-width.
Unique key masking cascades now work for an arbitrary number of levels, and will cascade to foreign keys that reference supersets of the set of masked target key columns.
Better support for conditionals between different types, e.g. floats, ints and decimals.
Better support for conditional dates parsed from strings.
Improved support for ruleset generator when working with composite keys.
Support for file masking with a blank root directory.
Run log counts for total items masked fixed for parallel or failed tasks.
Glob matching on filename now enters into all directories to check for matching files.
Corrected file-masking IAM example roles in documentation.
Long ruleset names no longer overflow their container on the Run Preview screen.
json mask no longer fails when applied to null or scalar values.
The special schema discovery runs are not allowed to be rerun in the run log list.
Only connections that support schema discovery are listed in the Ruleset Generator connections menu.
Ruleset Generator now supports MSSQL
hash_sources are now allowed in a file masking ruleset.
Files with unsupported encodings now show a clearer error message.
skip option for skipping values in tabular file masking.
null replacements even if all rows contain null in a key column.
For MySQL, constraints added during masking are now successfully removed.
The run log now reports the actual batch size used for mask_unique_key (e.g. if the batch size is made smaller than that specified because it was bigger than the number of rows in the target table).
[2.10.1] - 2023-01-24
xml mask no longer requires a fallback_mask to be specified.
[2.10.0] - 2022-12-12
Performance improvements to MSSQL Linked Server table masking.
Performance improvements to Ruleset Generator.
mask_type parameter in Connection Test API is optional for backwards compatibility. Defaults to
Additional informational errors in file masking runs.
Link to File Ruleset Creator inside File Ruleset List.
[2.9.0] - 2022-11-14
Object file masking support:
- JSON file masking.
- XML file masking.
- Full-file redaction of other file types.
Tabular file masking support:
- CSV files.
- Parquet files.
- Fixed-width column files.
File masking in AWS S3 and Azure Blob Storage.
JSON mask generator.
retain_date_component: mask a date but retain the year, month or day.
retain_year: mask a date's month and day, retaining the year.
from_choices: select random values from a list of choices, with optional weighting. A good alternative to from_file if there are only a small number of choices or if weighting is required.
typecast support for typecasting to
Support for XML documents/columns with an encoding declaration.
mask_unique_key on MySQL autoincrement columns.
Added Rerun button to perform a run again with the same connection(s) and ruleset.
Added Edit button for ruleset snapshot in run log.
Support for composite unique keys in Ruleset Generator.
Credit card masking with an unknown prefix when using retain_prefix does not error; instead, it retains just the first digit of the credit card.
Errors within sub-masks of concat are displayed in the run log.
Default batch size is now 50,000 rows.
JSON masking of null database values no longer get converted to JSON
MSSQL connections failing to be terminated if cancelled during a
Non-generic error message if trying to create a run with connection(s) or rulesets that do not exist.
Fixed cancelling MySQL runs.
mask_table runs will not start if attempting to mask a column that's also used as a key.
Default seed files do not contain non-ASCII characters, so width counts are compatible with any columns regardless of encoding.
If no license is present, then a non-generic error is shown when executing schema discovery.
Better handling of invalid SQL in
XPath based hashing extracts the first value for single-item arrays instead of hashing on the array.
Column width is taken into account when generating from_random_text rules with Ruleset Generator.
Ruleset Generator uses , as the glue for addresses rather than
Autoscaling of Celery workers is disabled; instead, the pool size is 16 available processes.
Ruleset Editor size increase.
[2.8.1] - 2022-10-21
- Performance improvements to MySQL table masking.
[2.8.0] - 2022-09-23
MySQL database support.
MSSQL Linked Server support.
Masking of XML columns, with new
Simple and automatic character group replacement with new
Generate credit card numbers in many formats with new
Mask numeric values into the same "bucket" with new
Mask dates while retaining age with new
Elements of JSON or XML documents can be used as hash values.
Option to enforce consistency across multiple JSON elements when using the
Generated rulesets will now automatically add substring masks if masked data would exceed the length of the column.
Ruleset YAML can be uploaded through the web UI.
Primary key/unique key masking now automatically cascades through multiple tables.
Added warning that row counts might not be accurate if using
Improved column detection in ruleset generator for ages and names.
Fixed erroneous datatype mismatch error sometimes generated when using
Reduce memory usage in run log API fetch.
Empty strings (or other specified values) in CSVs can now be configured to be treated as
Included seed files have had duplicate values removed.
Fixed random generation when using hash columns with
MSSQL database constraints are no longer updated in a dry run.
Improved consistency between sensitive data discovery web display and CSV export.
Fixed expected row count check when using
Rulesets with errors no longer create additional rulesets each time they are saved.
"Select All" in the ruleset generator now selects only visible rows.
Generated database IDs now have the dm- prefix to distinguish them from non-generated IDs.
Data read from CSVs is now always treated as strings.
Hashing now differentiates correctly between null and the string
[2.7.2] - 2022-08-11
Values in CSV seed files are now treated as strings to preserve formatting and leading zeros in numeric columns.
CSV quoting added to default CSV seed files.
Added sorting of MSSQL key column names for consistency when masking tables with composite keys.
More robust MSSQL database ID retrieval.
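The 2.7.2 note about treating CSV seed values as strings can be illustrated with Python's standard csv module, which also returns every field as a string. This is a minimal sketch with made-up sample data and a hypothetical helper name; it is not DataMasque's seed file loader.

```python
import csv
import io

# Illustrative seed data only; the column names are assumptions.
SEED = 'postcode,amount\n"00123","1.50"\n"04010","20"\n'


def read_seed_as_strings(text: str) -> list:
    """Read CSV text into a list of row dicts, leaving every value as a string."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_seed_as_strings(SEED)
postcodes = [row["postcode"] for row in rows]
```

Because no numeric conversion takes place, the leading zeros in the postcode column and the trailing zero in 1.50 are kept exactly as written in the file. Converting "00123" to an integer and back would yield "123", which is the formatting loss that string treatment avoids.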
[2.7.1] - 2022-08-04
- Saving Rulesets would sometimes cause an error.
[2.7.0] - 2022-08-02
Better ruleset generation, more keywords are detected and used to generate rulesets.
Ruleset generation for Amazon Redshift databases.
Role based permissions, with Mask Builder and Mask Runner roles.
Masking of JSON columns, with new
Masking of values using format strings with the
Support for on-premise Active Directory with SAML Single Sign-On.
Script to update ALLOWED_HOSTS on DataMasque web admin.
search_path now included in run logs.
Detecting and warning of tables in multiple schemas on PostgreSQL.
Generated ruleset editor form prevents losing changes when navigating away.
Prevent deletion of built-in CSV seed files.
Connecting to MSSQL 2012 in some circumstances.
Buffer size (number of rows to fetch and mask at once) has been renamed to batch size.
Batch size may be specified on a per-table basis.
[2.6.1] - 2022-06-21
UI display issues on Dashboard.
Validation errors not being cleared after Ruleset errors fixed in Ruleset Editor.
Restore warning upon leaving Ruleset Editor page when there are unsaved changes.
List of PII categories in Ruleset Generator now displays correctly.
Add Ruleset button in Ruleset List now links directly to Ruleset Generator.
Additional cascades are no longer included in Ruleset generation.
[2.6.0] - 2022-06-09
Support for cross-schema masking.
The ruleset generation functionality to automate generating YAML rulesets.
Support to subscribe to email notifications for masking runs.
Support for format strings in
Support for UUID mask pattern that generates unique values in the Universal Unique Identifier (UUID) format.
Masking summary information to run logs.
Validation for the use of duplicate tables/columns in rulesets.
Support for the use of the wildcard character * when specifying keywords.
Support for the use of space or underscore - when specifying keywords.
New address seed files.
Cancelling tasks incorrectly labelled them as failed masking runs.
Setting ‘Continue on failure’ to true failed to allow masking runs to continue executing when there is a task failure.
For Microsoft SQL Server, page lock is disabled on target tables during masking runs.
Improved performance and decreased memory consumption of from_file masking.
Renamed custom keywords to global data classification keywords.
Renamed ignored keywords to global ignored keywords.
[2.5.0] - 2022-03-11
Support for Amazon Redshift.
Support for specifying a schema name for PostgreSQL connections.
Expose additional API endpoints. See API Reference for more details.
Validate Run secret has a minimum of 20 characters.
Improved handling of case-insensitive table and column names for PostgreSQL databases.
Prevent double masking on the same column(s) that are part of multiple foreign key constraints.
- Uniqueness validation of key column(s) in mask_table task.
[2.4.0] - 2021-11-05
- YAML Templating tool.
name attribute of the ruleset YAML is deprecated in favour of the ruleset name in the ruleset's property.
random_seed attribute of the ruleset YAML is deprecated in favour of the Run secret option.
[2.3.0] - 2021-09-30
Support for PostgreSQL 11, 12 and 13.
Support for Microsoft SQL Server named instance.
Deployment support for AWS Marketplace.
Definitions to provide reusable task, mask and rule YAML blocks in ruleset.
- Support multiple SQL statements in masking task
[2.2.0] - 2021-08-05
Support for PostgreSQL 9.6 and 10.
Sensitive data discovery with built-in, custom and ignored keywords.
- Improved parallelism implementation to further optimise data masking performance.
[2.1.0] - 2021-06-02
- SAML Single Sign-On (SSO) integration for Azure Active Directory.
- Support for deployment on Cohesity v6.3.1 App Marketplace.
[2.0.0] - 2021-05-17
mask_unique_key task type to support masking of primary keys and unique keys.
- Improved handling of case-insensitive table and column names for SQL Server databases.
- The deprecated
table_name attribute of the
truncate_table tasks. This attribute has been replaced by
- The deprecated
max_workers attribute of the
mask_table task. This attribute has been replaced by
[1.3.0] - 2021-04-30
- Support for Microsoft SQL Server 2019.
- An issue that caused an incorrect masking run status to be reported.
- Documentation improvements for the
- Licence quota consumption reduction can happen if database size reductions are sustained.
- Database size calculation for licence quota consumed by Microsoft SQL Server databases now excludes offline files.
- Simultaneous masking runs on the same connection are disallowed.
table_name attribute of the
truncate_table tasks is deprecated in favour of
max_workers attribute of the
mask_table task is deprecated in favour of
[1.2.2] - 2021-03-12
- Uniqueness validation of
mask_table task for Microsoft SQL Server connections.
- Automated quarterly usage summary email.
- An issue in detecting errors in parallel tasks.
- An issue on masking NULL for Microsoft SQL Server connections.
- Documentation improvements on database privileges requirements and installation guides.
- Upper limit on
mask_table tasks is removed.
- The use of any value other than
mask_table task is deprecated for Oracle connections.
mask_table task is deprecated.
[1.2.1] - 2020-12-11
- Support for joined table columns as hash_columns for deterministic masking.
[1.2.0] - 2020-11-16
- Additional support for Microsoft SQL Server 2012 and 2014.
- Licence quota breaches and expiry notification.
- Enhancement on Ruleset YAML editor with ruleset YAML schema validation, documentation hover display, and auto-complete.
- System audit logs to web interface.
- Deterministic / hash based masking.
- Support for multiple Oracle wallets in database connections.
- Sample input and output for each supplied mask in the user guide.
- Masks to generate random decimal numbers, booleans, and dates.
Continue on failure option in the web interface to perform masking runs that will continue on task failures.
- Deployment support for Cohesity v6.5.1 App Marketplace.
- An issue that caused the from_random_text mask to ignore the ruleset
- Timezone truncation when masking TIMESTAMP WITH TIME ZONE columns in Oracle databases.
- An issue that displayed a misleading error message in the ruleset editor.
- Migrate to a cumulative usage licensing quota.
run_sql now runs queries with 'auto commit' enabled.
- Now supports both SSL version 1.0 and 1.2 (previously 1.0 only) for Oracle Wallet.
- Improved API performance.
[1.1.0] - 2020-06-23
- Multi-user support.
[1.0.0] - 2020-06-04
DataMasque is a best-of-breed data masking solution that empowers organisations to take control of their data security and makes protecting privacy, identity and rights as secure and straightforward as possible.
DataMasque champions commitment to data privacy and is fundamentally built and designed to promote masking irreversibility.