Changelog

This document contains all notable changes included in each release of DataMasque.

The DataMasque versioning scheme follows the semantic versioning convention MAJOR.MINOR.PATCH:

  • MAJOR version is incremented when incompatible API/schema changes are made
  • MINOR version is incremented when functionality is added in a backwards compatible manner
  • PATCH version is incremented when backwards compatible bug fixes are made
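
As an illustration of the scheme above, the following minimal Python sketch parses the version labels used in this changelog and reports which component changed between two releases. The names Version, parse_version and change_kind are hypothetical helpers written for this example; they are not part of DataMasque.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' label such as '2.9.0' or '[2.9.0]'."""
    major, minor, patch = (int(part) for part in text.strip("[]").split("."))
    return Version(major, minor, patch)


def change_kind(old: str, new: str) -> str:
    """Name the most significant component that differs between two versions."""
    a, b = parse_version(old), parse_version(new)
    if a.major != b.major:
        return "major"  # incompatible API/schema changes
    if a.minor != b.minor:
        return "minor"  # backwards compatible functionality
    if a.patch != b.patch:
        return "patch"  # backwards compatible bug fixes
    return "none"
```

For example, change_kind("2.8.1", "2.9.0") reports a minor change, while change_kind("1.3.0", "2.0.0") reports a major one.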

[2.9.0] - 2022-11-14

Added

  • Object file masking support:

    • JSON file masking.
    • XML file masking.
    • Full-file redaction of other file types.
  • Tabular file masking support:

    • CSV files.
    • Parquet files.
    • Fixed-width column files.
  • File masking in AWS S3 and Azure Blob Storage.

  • JSON ruleset generator.

  • retain_date_component: mask a date but retain the year, month or day.

  • retain_year: mask a date's month and day, retaining the year.

  • from_choices: select random values from a list of choices, with optional weighting. A good alternative to from_file if there are only a small number of choices or if weighting is required.

  • typecast support for casting to date.

  • Support for XML documents/columns with encoding declaration.

  • Support for mask_unique_key on MySQL autoincrement columns.

  • Added Rerun button to perform a run again with the same connection(s) and ruleset.

  • Added Edit button for ruleset snapshot in run log.

  • Support for composite unique keys in Ruleset Generator.
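
The ideas behind retain_year and from_choices in the list above can be sketched in plain Python. This is only a conceptual illustration under stated assumptions: it is not DataMasque's implementation or ruleset syntax, and the function names mask_retaining_year and pick_weighted are hypothetical.

```python
import calendar
import datetime
import random
from typing import Mapping


def mask_retaining_year(value: datetime.date, rng: random.Random) -> datetime.date:
    """Return a date in the same year with a randomly chosen month and day."""
    month = rng.randint(1, 12)
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(value.year, month)[1]
    return datetime.date(value.year, month, rng.randint(1, last_day))


def pick_weighted(choices: Mapping[str, float], rng: random.Random) -> str:
    """Pick one key from `choices`, using the mapping values as relative weights."""
    return rng.choices(list(choices), weights=list(choices.values()), k=1)[0]
```

Passing a seeded random.Random instance keeps the sketch repeatable, and a choice with a weight of zero is never selected.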

Fixed

  • Credit card masking with an unknown prefix when using retain_prefix no longer errors; instead, it retains just the first digit of the credit card.

  • Errors within sub-masks of chain or concat are displayed in the run log.

  • Default batch size is now 50,000 rows.

  • Null database values are no longer converted to JSON nulls ("null" string) during JSON masking.

  • MSSQL connections failing to be terminated if cancelled during a run_sql task.

  • A non-generic error message is shown when trying to create a run with connection(s) or rulesets that do not exist.

  • Fixed cancelling MySQL runs.

  • mask_table runs will not start if attempting to mask a column that's also used as a key.

  • Default seed files do not contain non-ASCII characters, so width counts are compatible with any columns regardless of encoding.

  • If no license is present, then a non-generic error is shown when executing schema discovery.

  • Better handling of invalid SQL in run_sql tasks.

Changed

  • XPath-based hashing extracts the first value for single-item arrays instead of hashing on the array.

  • Column width is taken into account when generating from_random_text rules with Ruleset Generator.

  • Ruleset Generator uses , as the glue for addresses rather than ,.

  • Autoscaling of Celery workers is disabled; instead, the pool size is 16 available processes.

  • Ruleset Editor size increase.

[2.8.1] - 2022-10-21

Changed

  • Performance improvements to MySQL table masking.

[2.8.0] - 2022-09-23

Added

  • MySQL database support.

  • MSSQL Linked Server support.

  • Masking of XML columns, with new xml mask.

  • Simple and automatic character group replacement with new imitate mask.

  • Generate credit card numbers in many formats with new credit_card mask.

  • Mask numeric values into the same "bucket" with new numeric_bucket mask.

  • Mask dates while retaining age with new retain_age mask.

  • Elements of JSON or XML documents can be used as hash values.

  • Option to enforce consistency across multiple JSON elements when using the json mask.

  • Generated rulesets will now automatically add substring masks if masked data would exceed the length of the column.

  • Ruleset YAML can be uploaded through the web UI.

  • Primary key/unique key masking now automatically cascades through multiple tables.

  • Added warning that row counts might not be accurate if using skip or if.
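
To illustrate the "bucket" idea mentioned for numeric_bucket in the list above, here is a minimal Python sketch that replaces a number with a random value from the same range. It is an assumption about the general technique, not DataMasque's implementation; the function name mask_within_bucket, its parameters, and the example boundaries are all hypothetical.

```python
import bisect
import random
from typing import Sequence


def mask_within_bucket(value: int, boundaries: Sequence[int], rng: random.Random,
                       lowest: int = 0, highest: int = 1000) -> int:
    """Replace `value` with a random integer from the same bucket.

    `boundaries` are the sorted lower bounds of every bucket after the first,
    so [18, 65] yields buckets [lowest, 17], [18, 64] and [65, highest].
    """
    if not lowest <= value <= highest:
        raise ValueError("value is outside the supported range")
    edges = [lowest, *boundaries, highest + 1]
    # Index of the bucket whose lower edge is the largest one <= value.
    index = bisect.bisect_right(edges, value) - 1
    return rng.randint(edges[index], edges[index + 1] - 1)
```

With boundaries [18, 65], an input of 30 is replaced by some value between 18 and 64, so the masked number stays in the same range as the original.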

Fixed

  • Improved column detection in ruleset generator for ages and names.

  • Fixed erroneous datatype mismatch error sometimes generated when using skip rules.

  • Reduce memory usage in run log API fetch.

  • Empty strings (or other specified values) in CSVs can now be configured to be treated as null.

  • Included seed files have had duplicate values removed.

  • Fixed random generation when using hash columns with replace_regex sub masks.

  • MSSQL database constraints are no longer updated in a dry run.

  • Improved consistency between sensitive data discovery web display and CSV export.

  • Fixed expected row count check when using where.

  • Rulesets with errors no longer create additional rulesets each time they are saved.

Changed

  • "Select All" in the ruleset generator now selects only visible rows.

  • Generated database IDs now have the dm- prefix to distinguish them from non-generated.

  • Data read from CSVs is now always treated as strings.

  • Hashing now differentiates correctly between null and the string "None".

[2.7.0] - 2022-08-02

Added

  • Better ruleset generation: more keywords are detected and used to generate rulesets.

  • Ruleset generation for Amazon Redshift databases.

  • Role-based permissions, with Mask Builder and Mask Runner roles.

  • Masking of JSON columns, with new json mask.

  • Masking of values using format strings with the from_format_string mask.

  • Support for on-premise Active Directory with SAML Single Sign-On.

  • Script to update ALLOWED_HOSTS on DataMasque web admin.

  • PostgreSQL search_path now included in run logs.

  • Detecting and warning of tables in multiple schemas on PostgreSQL.

Fixed

  • Generated ruleset editor form prevents losing changes when navigating away.

  • Prevent deletion of built-in CSV seed files.

  • Connecting to MSSQL 2012 in some circumstances.

Changed

  • Buffer size (number of rows to fetch and mask at once) has been renamed to batch size.

  • Batch size may be specified on a per-table basis.

[2.6.0] - 2022-06-09

Added

  • Support for cross-schema masking.

  • Ruleset generation functionality to automate the creation of YAML rulesets.

  • Support for subscribing to email notifications for masking runs.

  • Support for format strings in mask_unique_key.

  • Support for UUID mask pattern that generates unique values in the Universally Unique Identifier (UUID) format.

  • Masking summary information to run logs.

  • Validation for the use of duplicate tables/columns in rulesets.

  • Support for the use of the wildcard character * when specifying keywords.

  • Support for the use of a space, underscore _ or dash - when specifying keywords.

  • New address seed files.

Fixed

  • Cancelling tasks incorrectly labelled them as failed masking runs.

  • Setting ‘Continue on failure’ to true failed to allow masking runs to continue executing when there was a task failure.

Changed

  • For Microsoft SQL Server, page lock is disabled on target tables during masking runs.

  • Improved performance and decreased memory consumption of from_file masking.

  • Renamed custom keywords to global data classification keywords.

  • Renamed ignored keywords to global ignored keywords.

[2.5.0] - 2022-03-11

Added

  • Support for Amazon Redshift.

  • Support for specifying a schema name for PostgreSQL connections.

Changed

  • Expose additional API endpoints. See API Reference for more details.

  • Validate Run secret has a minimum of 20 characters.

  • Improved handling of case-insensitive table and column names for PostgreSQL databases.

  • Prevent double masking on the same column(s) that are part of multiple foreign key constraints.

Removed

  • Uniqueness validation of key column(s) in mask_table task.

[2.4.0] - 2021-11-05

Added

  • YAML Templating tool.

Deprecated

  • The name attribute of the ruleset YAML is deprecated in favour of the ruleset name in the ruleset's properties.

  • The random_seed attribute of the ruleset YAML is deprecated in favour of the Run secret option.

[2.3.0] - 2021-09-30

Added

  • Support for PostgreSQL 11, 12 and 13.

  • Support for Microsoft SQL Server named instance.

  • Deployment support for AWS Marketplace.

  • Definitions to provide reusable task, mask and rule YAML blocks in ruleset.

Changed

  • Support multiple SQL statements in masking task run_sql.

[2.2.0] - 2021-08-05

Changed

  • Improved parallelism implementation to further optimise data masking performance.

[2.1.0] - 2021-06-02

Added

  • SAML Single Sign-On (SSO) integration for Azure Active Directory.

Removed

  • Support for deployment on Cohesity v6.3.1 App Marketplace.

[2.0.0] - 2021-05-17

Added

  • The mask_unique_key task type to support masking of primary keys and unique keys.

Changed

  • Improved handling of case-insensitive table and column names for SQL Server databases.

Removed

  • The deprecated table_name attribute of the mask_table and truncate_table tasks. This attribute has been replaced by table.
  • The deprecated max_workers attribute of the mask_table task. This attribute has been replaced by workers.

[1.3.0] - 2021-04-30

Added

  • Support for Microsoft SQL Server 2019.

Fixed

  • An issue that caused an incorrect masking run status to be reported.

Changed

  • Documentation improvements for the workers and key of the mask_table task.
  • Licence quota consumption can decrease if a reduction in database size is sustained.
  • Database size calculation for licence quota consumed by Microsoft SQL Server databases now excludes offline files.
  • Simultaneous masking runs on the same connection are disallowed.

Deprecated

  • The table_name attribute of the mask_table and truncate_table tasks is deprecated in favour of table.
  • The max_workers attribute of the mask_table task is deprecated in favour of workers.

[1.2.2] - 2021-03-12

Added

  • Uniqueness validation of key column(s) in mask_table task for Microsoft SQL Server connections.
  • Automated quarterly usage summary email.

Fixed

  • An issue in detecting errors in parallel tasks.
  • An issue when masking a key with a NULL value for Microsoft SQL Server connections.

Changed

  • Documentation improvements to database privilege requirements and installation guides.

Removed

  • Upper limit on max_workers for mask_table tasks is removed.

Deprecated

  • The use of any value other than ROWID for key attribute in mask_table task is deprecated for Oracle connections.
  • The where attribute for mask_table task is deprecated.

[1.2.1] - 2020-12-11

Added

  • Support for joined table columns as hash_columns for deterministic masking.

[1.2.0] - 2020-11-16

Added

  • Additional support for Microsoft SQL Server 2012 and 2014.
  • Notifications for licence quota breaches and expiry.
  • Enhancements to the Ruleset YAML editor: ruleset YAML schema validation, documentation hover display, and auto-complete.
  • System audit logs to web interface.
  • Deterministic / hash based masking.
  • Support for multiple Oracle wallets in database connections.
  • Sample input and output for each supplied mask in the user guide.
  • Masks to generate random decimal numbers, booleans, and dates.
  • A Continue on failure option in the web interface to perform masking runs that will continue on task failures.
  • Deployment support for Cohesity v6.5.1 App Marketplace.
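
The general idea of deterministic, hash-based masking listed above is that the same input value always maps to the same masked value for a given secret. The Python sketch below shows one common way to achieve this with a keyed hash (HMAC-SHA256). It is an assumption for illustration only and is not DataMasque's algorithm; the function name deterministic_choice and its parameters are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def deterministic_choice(hash_value: str, secret: str, candidates: Sequence[str]) -> str:
    """Map `hash_value` to one of `candidates`, stably for a given secret."""
    digest = hmac.new(
        secret.encode("utf-8"), hash_value.encode("utf-8"), hashlib.sha256
    ).digest()
    # Use the first 8 bytes of the digest as an index into the candidates.
    return candidates[int.from_bytes(digest[:8], "big") % len(candidates)]
```

Calling the function twice with the same hash value and secret returns the same candidate, which is what keeps related rows consistent across tables or runs in this kind of scheme.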

Fixed

  • An issue that caused the from_random_text mask to ignore the ruleset random_seed parameter.
  • Timezone truncation when masking TIMESTAMP WITH TIME ZONE columns in Oracle databases.
  • An issue that displayed a misleading error message in the ruleset editor.

Changed

  • Migrate to a cumulative usage licensing quota.
  • run_sql now runs queries with 'auto commit' enabled.
  • Now supports both SSL version 1.0 and 1.2 (previously 1.0 only) for Oracle Wallet.
  • Improved API performance.

[1.1.0] - 2020-06-23

Added

  • Multi-user support.

[1.0.0] - 2020-06-04

DataMasque is a best-of-breed data masking solution that empowers organisations to take control of their data security and makes protecting privacy, identity and rights as secure and straightforward as possible.

DataMasque champions commitment to data privacy and is fundamentally built and designed to promote masking irreversibility.