API Endpoints

Authentication
- POST /api/auth/token/login/
User Object
Profile Object
- GET /api/users/me/profile/
- POST /api/users/me/profile/
Run Object
- GET /api/runs/
- POST /api/runs/
- GET /api/runs/{id}/
- POST /api/runs/{id}/cancel/
- GET /api/runs/validate/
- GET /api/runs/{id}/sdd-report/
- GET /api/runs/{id}/run-report/
- DELETE /api/runs/{id}/db-discovery-results/
- GET /api/runs/{id}/db-discovery-results/report/
- Option Object (Referenced in the /runs/ POST Request)
Runlog Object
- GET /api/runs/{id}/log/
- GET /api/runs/{id}/log/download/
Connection Object
Connection Fileset Object
Ruleset Object
Seed Object
- GET /api/seeds/
- POST /api/seeds/
Audit Log Object
- GET /api/audit-logs/
- GET /api/audit-logs/download/
Schema Discovery
Generating Rulesets
File Data Discovery
- POST /api/run-file-data-discovery
- GET /api/runs/{id}/file-discovery-results/
Oracle Wallets
Git Related Endpoints
Exporting DataMasque Configuration
- GET /api/export/v1/
Importing DataMasque Configuration
- POST /api/import/v1/
Other API Requests

Authentication

The DataMasque API uses token authentication. Tokens are 40-character strings containing 0-9 and a-f. Tokens should be included in the Authorization HTTP header for each request, with the word Token prepended.

For example:

GET /api/runs/123/
Authorization: Token abcdef1234567890abcdef1234567890abcdef12
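
As a minimal sketch of the header format described above, the following Python helper checks that a token looks like 40 characters from 0-9 and a-f before building the Authorization header. The name `auth_header` is illustrative only and is not part of DataMasque.

```python
import re

# Tokens are described above as 40-character strings containing 0-9 and a-f.
TOKEN_PATTERN = re.compile(r"[0-9a-f]{40}")


def auth_header(token: str) -> dict:
    """Return the Authorization header for a DataMasque API request."""
    if not TOKEN_PATTERN.fullmatch(token):
        raise ValueError("token must be 40 characters from 0-9 and a-f")
    # The word "Token" is prepended, separated from the value by a space.
    return {"Authorization": f"Token {token}"}
```

Checking the format locally catches copy-and-paste mistakes (for example a truncated token) before any request is sent.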

There are two types of authentication tokens:

- A non-expiring API Token which has access to only some endpoints. You can get this token from the My Account page.
- A User Token that is valid for only 12 hours, but has access to all endpoints. User tokens are granted by posting your username and password to the /api/auth/token/login/ endpoint.

The documentation for each endpoint on this page includes the type of token that is required to access it. If an endpoint does not require the use of the Authorization header then its authorization is noted as Anonymous.

The purpose and use case of each token type is explained below.

API Token

The API Token is a long-lived credential retrieved from the My Account page. It remains valid indefinitely, unless revoked (also on the My Account page). This token is valid only for use with specific API endpoints.

It is designed for use in automated scripts whose content may not be stored securely, so its access is mainly limited to controlling masking runs and checking their status.

User Token

The User Token is exclusively issued after a successful login, either through the user interface or by making a request to /api/auth/token/login/.

This token offers enhanced security due to its limited lifetime, expiring after 12 hours, and is only accessible after a successful login. When accessing DataMasque through the UI, the token is granted as a cookie which will expire after 1 hour of inactivity.

It can be used against all API endpoints, and grants access based on the user account's permissions.
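
Because a User Token expires 12 hours after it is issued, a long-running script may want to log in again shortly before that point. The sketch below assumes the client records the time at which it logged in; the function name `needs_refresh` and the five-minute safety margin are illustrative choices, not DataMasque behaviour.

```python
from datetime import datetime, timedelta, timezone

# Lifetime stated above for User Tokens.
USER_TOKEN_LIFETIME = timedelta(hours=12)


def needs_refresh(issued_at: datetime, now: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True when a User Token is expired or about to expire."""
    return now >= issued_at + USER_TOKEN_LIFETIME - margin


# Example: a token issued 11 hours 56 minutes ago is due for a refresh.
issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
print(needs_refresh(issued, issued + timedelta(hours=11, minutes=56)))
```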

Both token types serve distinct purposes within the DataMasque API, offering a balance between security and usability.

POST /api/auth/token/login/

Authorization: Anonymous.

POST /api/auth/token/login/ Parameters

Field	Type	Required	Location	Description
`username`	`string`	Yes	Request Body	The username of the user you are logging in as.
`password`	`string`	Yes	Request Body	The password for the user.

POST /api/auth/token/login/ Responses

Status Code	Description
`200`	A JSON serialised User object, including a short-lived User Token.

POST /api/auth/token/login/ Postman example

Open Postman.
Create a new request.
Set the method to POST and the URL to https://<your-datamasque-host>/api/auth/token/login/.
Under Headers, add Content-Type as a key and set the value as application/json.
Select the Body tab then the raw button.
Include your DataMasque login details in this format in the text editor shown:

{
  "username": "<your-username>",
  "password": "<your-password>"
}

Press the blue Send button to the right of the URL bar.

POST /api/auth/token/login/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/auth/token/login/" \
     -H "Content-Type: application/json" \
     -d '{"username": "<your-username>", "password": "<your-password>"}'
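
The same login request can be sketched in Python with only the standard library. The helper below builds the request without sending it; the name `build_login_request` is hypothetical, and the host is whatever your DataMasque instance uses.

```python
import json
from urllib.request import Request


def build_login_request(host: str, username: str, password: str) -> Request:
    """Build (but do not send) the POST request for /api/auth/token/login/."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return Request(
        f"https://{host}/api/auth/token/login/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. The response is a JSON serialised User object; the name of the field that carries the token is not stated in this section, so inspect the response rather than assuming one.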

User Object

User objects have the following fields:

Field	Type	Description
`id`	`integer`	The `id` of the `User`.
`username`	`string`	The username for the `User`. Used when logging in.
`email`	`string`	The email of the `User`.
`date_joined`	`date`	The date the `User` was created.
`api_token`	`string`	The API token for the `User`.
`has_temporary_password`	`boolean`	Whether user has a temporary password or not. If true, the user has not finalised their account creation.
`is_active`	`boolean`	Whether or not the user account is active. If false, the account is disabled.
`is_staff`	`boolean`	Whether or not the user is a staff account.
`is_superuser`	`boolean`	Whether or not the account is a superuser and has admin privileges.
`is_sso_user`	`boolean`	Whether or not the account is an SSO enabled account.
`is_subscribed_to_sdd_updates`	`boolean`	Whether or not the user has subscribed to sensitive data discovery updates.
`user_roles`	`array[string]`	List of roles assigned to the user. The full list of roles can be found under User Roles.
`user_permissions`	`array[string]`	List of permissions assigned to the user.

User Roles

User objects may be assigned one or none of the below roles, as part of their user_roles array.

Role	Description
`mask_runner`	A user with this role is responsible solely for executing masking operations.
`mask_builder`	In addition to the capabilities of the `mask_runner` role, this role includes the ability to create and manage rulesets.

Requests related to User Object:

GET /api/users/

Authorization: Admin User token only.

Returns a list of user accounts.

GET /api/users/ Parameters

No parameters.

GET /api/users/ Responses

Status Code	Description
`200`	Returns a JSON serialised list of User objects.

GET /api/users/ `curl` example

curl "https://<your-datamasque-host>/api/users/" \
     -H "Authorization: Token <your-api-token>"

POST /api/users/me/profile/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/users/me/profile/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{"git_directory_path": "path/to/root"}'

POST /api/users/

Authorization: Admin User token only.

Create a new user account.

POST /api/users/ Parameters

Field	Type	Required	Location	Description
`username`	`string`	Yes	Request Body	The username of the user being created.
`password`	`string`	Yes	Request Body	The password for the new user account.
`re_password`	`string`	Yes	Request Body	The password for the new user again, to confirm the password entered above.
`email`	`string`	Yes	Request Body	The email address of the new user.
`role`	`array[string]`	No	Request Body	The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Defaults to no role, which has the same permissions as `mask_runner`.

POST /api/users/ Responses

Status Code	Description
`201`	A JSON serialized User object for the created user.
`400`	Bad Request: If the request data is invalid or user creation is disabled.
`403`	Forbidden: If the token does not have the required permissions.

POST /api/users/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/users/" \
     -H "Authorization: Token <your-admin-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "username": "<your-new-username>",
           "password": "<your-new-password>",
           "re_password": "<your-new-password>",
           "email": "<your-new-email>",
           "role": ["<your-user-role>"]
         }'
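
A script that creates users may want to catch obvious mistakes before calling the API. The following sketch builds the request body for POST /api/users/ and checks that the two password fields match and that any roles come from the User Roles table. The function name `new_user_payload` is hypothetical, and sending `role` as a JSON array is an assumption based on its documented `array[string]` type.

```python
import json
from typing import Optional, Sequence

# Role names taken from the User Roles table above.
KNOWN_ROLES = {"mask_runner", "mask_builder"}


def new_user_payload(username: str, password: str, re_password: str,
                     email: str, roles: Optional[Sequence[str]] = None) -> str:
    """Return the JSON request body for POST /api/users/."""
    if password != re_password:
        raise ValueError("password and re_password must match")
    payload = {
        "username": username,
        "password": password,
        "re_password": re_password,
        "email": email,
    }
    if roles:
        unknown = set(roles) - KNOWN_ROLES
        if unknown:
            raise ValueError(f"unknown role(s): {sorted(unknown)}")
        # Assumption: `role` is sent as a JSON array, per its array[string] type.
        payload["role"] = list(roles)
    return json.dumps(payload)
```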

GET /api/users/me/

Authorization: User token only.

Returns the details of the currently logged-in user.

GET /api/users/me/ Responses

Status Code	Description
`200`	Returns a JSON serialised User object for the user that is currently logged in.

GET /api/users/me/ `curl` example

curl "https://<your-datamasque-host>/api/users/me/" \
     -H "Authorization: Token <your-api-token>"

GET /api/users/{id}/

Authorization: Admin User token (to query any user's details) or the queried user's token.

Retrieve information about a specific user.

GET /api/users/{id}/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the user.

GET /api/users/{id}/ Responses

Status Code	Description
`200`	A JSON serialized User object for the specified user.
`403`	Forbidden: If the token does not have the required permissions.
`404`	Not Found: If the user with the specified `id` does not exist.

GET /api/users/{id}/ `curl` example

curl "https://<your-datamasque-host>/api/users/{id}/" \
     -H "Authorization: Token <your-api-token>"

PATCH /api/users/{id}/

Authorization: Admin User token (to update any user's details) or the updating user's token.

Partially update information for a specified user.

PATCH /api/users/{id}/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the user to update.
`username`	`string`	No	Request Body	The new username of the user. Only an Admin User can update this.
`email`	`string`	No	Request Body	The new email address of the user. An Admin User or the user themselves can update this.
`user_roles`	`array[string]`	No	Request Body	The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Only an Admin User can update this.

PATCH /api/users/{id}/ Responses

Status Code	Description
`200`	A JSON serialized User object for the updated user.
`400`	Bad Request: If the request data is invalid.
`403`	Forbidden: If the token does not have the required permissions.
`404`	Not Found: If the user with the specified id does not exist.

PATCH /api/users/{id}/ `curl` example

curl -X PATCH "https://<your-datamasque-host>/api/users/{id}/" \
     -H "Authorization: Token <your-admin-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "username": "<your-new-username>",
           "email": "<your-new-email>",
           "user_roles": ["<user-role>"]
         }'
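
Since PATCH is a partial update, a client only needs to send the fields it wants to change. The sketch below builds such a request in Python without sending it; `build_user_patch` is a hypothetical helper, and the set of allowed fields is taken from the parameter table above.

```python
import json
from urllib.request import Request

# Body fields listed in the PATCH /api/users/{id}/ parameter table.
PATCHABLE_FIELDS = {"username", "email", "user_roles"}


def build_user_patch(host: str, token: str, user_id: int, **changes) -> Request:
    """Build (but do not send) a PATCH request containing only changed fields."""
    if not changes:
        raise ValueError("at least one field must be provided")
    unknown = set(changes) - PATCHABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported field(s): {sorted(unknown)}")
    return Request(
        f"https://{host}/api/users/{user_id}/",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

For example, `build_user_patch(host, token, 7, email="new@example.com")` produces a body that contains only `email`, leaving the username and roles untouched.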

PUT /api/users/{id}/

Authorization: Admin User token (to update any user's details) or the updating user's token.

Update information for a specified user.

PUT /api/users/{id}/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the user to update.
`username`	`string`	No	Request Body	The new username of the user. Only an Admin User can update this.
`email`	`string`	No	Request Body	The new email address of the user. An Admin User or the user themselves can update this.
`user_roles`	`array[string]`	No	Request Body	The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Only an Admin User can update this.

PUT /api/users/{id}/ Responses

Status Code	Description
`200`	A JSON serialized User object for the updated user.
`400`	Bad Request: If the request data is invalid.
`403`	Forbidden: If the token does not have the required permissions.
`404`	Not Found: If the user with the specified id does not exist.

PUT /api/users/{id}/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/users/{id}/" \
     -H "Authorization: Token <your-admin-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "username": "<your-new-username>",
           "email": "<your-new-email>",
           "user_roles": ["<user-role>"]
         }'

POST /api/users/{id}/reset-password/

Authorization: Admin User token only.

Reset the password for a specified user.

POST /api/users/{id}/reset-password/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the user whose password is being reset.

POST /api/users/{id}/reset-password/ Responses

Status Code	Description
`200`	Returns a JSON object with the new temporary password.
`403`	Forbidden: If the token does not have the required permissions.
`404`	Not Found: If the user with the specified id does not exist.

POST /api/users/{id}/reset-password/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/users/{id}/reset-password/" \
     -H "Authorization: Token <your-admin-api-token>" \
     -H "Content-Type: application/json"

Profile Object

A Profile object stores settings for a particular user. There is a one-to-one relationship between a user and their Profile. A Profile object may only be updated by the user that it belongs to (i.e. a user can only update their own Profile, admins cannot update Profiles of other users).

Profile objects have the following fields:

Field	Type	Description
`git_directory_path`	`string`	The Git directory path for this user when pushing/pulling rulesets to/from a Git repository.

Extra Field Notes

`git_directory_path`

This overrides the global Git directory for the DataMasque instance, for this user only. This value can be set even if Git integration is disabled; in that case it simply has no effect.

GET /api/users/me/profile/

Authorization: User token only.

Returns the Profile Object for the currently logged-in user.

GET /api/users/me/profile/ Parameters

No parameters.

GET /api/users/me/profile/ Responses

Status Code	Description
`200`	Returns a JSON serialised `Profile` object, with fields as described above.

GET /api/users/me/profile/ `curl` example

curl "https://<your-datamasque-host>/api/users/me/profile/" \
     -H "Authorization: Token <your-api-token>"

POST /api/users/me/profile/

Authorization: User token only.

Updates the Profile object for the current user. Partial updates are supported: only fields that are contained in the request will be updated (i.e. if a field is not present in the request then its stored value remains unchanged).

POST /api/users/me/profile/ Responses

Status Code	Description
`204`	The `Profile` update was successful.

Run Object

Run objects have the following fields:

Field	Type	Description
`id`	`integer`	The `id` of the `Run`. Use this in API URLs that need a run `id`.
`name`	`string`	The name of the `Run`.
`status`	`string`	Indicates the `Run` status. The potential values are: `queued`, `running`, `finished`, `finished_with_warnings`, `failed`, `cancelling`, and `cancelled`. A status of `finished` or `finished_with_warnings` indicates the `Run` completed successfully; `failed` indicates an error. `finished_with_warnings` indicates there were warnings during the run; refer to the run log to view them.
`mask_type`	`string`	The masking type of the `Run`, valid options are `"database"` or `"file"`.
`connection`	`string`	Deprecated, replaced by `source_connection`.
`connection_name`	`string`	Deprecated, replaced by `source_connection_name`.
`source_connection`	`string`	A UUID identifying the source connection used for this `Run`. For database connections, the `source_connection` also acts as the destination.
`source_connection_name`	`string`	The name of the source connection of the `Run`. For database connections, the `source_connection` also acts as the destination.
`destination_connection`	`Optional[string]`	A UUID identifying the destination connection used for this `Run`. Only present for file connections, as the `source_connection` also acts as the destination for database connections.
`destination_connection_name`	`Optional[string]`	The name of the destination connection of the `Run`. Only present for file connections, as the `source_connection` also acts as the destination for database connections.
`ruleset`	`string`	A UUID identifying the ruleset used for this `Run`.
`ruleset_name`	`string`	Ruleset name of the `Run`.
`start_time`	`string`	Start time of the `Run`, in ISO 8601 format.
`end_time`	`string`	End time of the `Run`, in ISO 8601 format.
`options`	`object`	An `Option` object of configuration for the `Run`.
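
A script that polls a run typically needs to know when to stop polling and whether the run succeeded. The sketch below classifies the `status` values listed in the table. Treating `queued`, `running` and `cancelling` as still in progress is an assumption inferred from their names, and the function names are illustrative.

```python
# Status values listed in the Run object table.
ALL_STATUSES = {
    "queued", "running", "finished", "finished_with_warnings",
    "failed", "cancelling", "cancelled",
}
# Stated above: these two indicate the Run completed successfully.
SUCCESSFUL = {"finished", "finished_with_warnings"}
# Assumption: these are the statuses of a Run that has not yet ended.
IN_PROGRESS = {"queued", "running", "cancelling"}


def _check(status: str) -> str:
    if status not in ALL_STATUSES:
        raise ValueError(f"unknown run status: {status!r}")
    return status


def is_in_progress(status: str) -> bool:
    """Return True while it is worth polling the run again."""
    return _check(status) in IN_PROGRESS


def completed_successfully(status: str) -> bool:
    """Return True for finished and finished_with_warnings."""
    return _check(status) in SUCCESSFUL
```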

Requests related to Run Objects:
- GET /api/runs/
- POST /api/runs/
- GET /api/runs/{id}/
- POST /api/runs/{id}/cancel/
- GET /api/runs/validate/
- GET /api/runs/{id}/sdd-report/
- Option Object (Referenced in the /api/runs/ POST Request)

GET /api/runs/

Authorization: User token or API token.

Get a list of DataMasque Runs.

GET /api/runs/ Parameters

Field	Type	Required	Location	Description
`mask_type`	`string`	No	Query Parameter	The mask type of the `Run`. The potential values are: `database`, `file`.
`connection_ruleset_name`	`string`	No	Query Parameter	The name of the source or destination connection name or the ruleset name of the `Run`.
`status`	`string`	No	Query Parameter	The status of the `Run`. The potential values are: `queued`, `running`, `finished`, `finished_with_warnings`, `failed`, `cancelling`, and `cancelled`.

GET /api/runs/ Responses

Status Code	Description
`200`	A JSON serialised list of Run objects.

GET /api/runs/ `curl` example

curl "https://<your-datamasque-host>/api/runs/" \
     -H "Authorization: Token <your-api-token>"

POST /api/runs/

Authorization: User token or API token.

Start a new masking run.

POST /api/runs/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Request Body	The name of the `Run`.
`connection`	`string`	No	Request Body	Deprecated, replaced by `source_connection`.
`source_connection`	`string`	Yes	Request Body	A UUID identifying the source connection to be used for this `Run`. For database connections, the `source_connection` also acts as the destination.
`destination_connection`	`string`	Required only for runs on file connections.	Request Body	A UUID identifying the destination connection to be used for this `Run`.
`ruleset`	`string`	Yes	Request Body	A UUID identifying the ruleset to be used for this `Run`.
`options`	`object`	Yes	Request Body	An Option object of configuration for this `Run`.

POST /api/runs/ Responses

Status Code	Description
`200`	A JSON serialised Run object.

POST /api/runs/ `curl` example

Include `destination_connection` only for runs on file connections. The `options` object may contain any of the fields described under Option Object; `dry_run` is shown here as an example.

curl -X POST "https://<your-datamasque-host>/api/runs/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "<run-name>",
           "source_connection": "<source-connection-uuid>",
           "destination_connection": "<destination-connection-uuid>",
           "ruleset": "<ruleset-uuid>",
           "options": {
             "dry_run": false
           }
         }'
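
When the request body for POST /api/runs/ is generated by a script, the optional `destination_connection` field can be added only when it is needed. The following is a minimal sketch; `build_run_payload` is a hypothetical helper and the UUIDs in the usage lines are placeholders.

```python
import json
from typing import Optional


def build_run_payload(name: str, source_connection: str, ruleset: str,
                      options: dict,
                      destination_connection: Optional[str] = None) -> str:
    """Return the JSON request body for POST /api/runs/."""
    payload = {
        "name": name,
        "source_connection": source_connection,
        "ruleset": ruleset,
        "options": options,
    }
    # Only runs on file connections need a separate destination connection.
    if destination_connection is not None:
        payload["destination_connection"] = destination_connection
    return json.dumps(payload)


# Database run: the source connection also acts as the destination.
db_body = build_run_payload("nightly", "<source-connection-uuid>",
                            "<ruleset-uuid>", {"dry_run": True})
```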

GET /api/runs/{id}/

Authorization: User token or API token.

Retrieve information about a masking run.

GET /api/runs/{id}/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

GET /api/runs/{id}/ Responses

Status Code	Description
`200`	A JSON serialised Run object.

GET /api/runs/{id}/ `curl` example

curl "https://<your-datamasque-host>/api/runs/{id}/" \
     -H "Authorization: Token <your-api-token>"

POST /api/runs/{id}/cancel/

Authorization: User token or API token.

Cancel a masking run.

POST /api/runs/{id}/cancel/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

POST /api/runs/{id}/cancel/ Responses

Status Code	Description
`201`	Operation succeeded.

POST /api/runs/{id}/cancel/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/runs/{id}/cancel/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json"

GET /api/runs/validate/

Authorization: User token or API token.

Validate that a run actually occurred, using values recorded in the run log and in the `DATAMASQUE_RUN_HISTORY` table.

GET /api/runs/validate/ Parameters

Field	Type	Required	Location	Description
`run_hash`	`string`	Yes	Query Parameter	The hash of the run, which can be retrieved from the `run_hash` column in the `DATAMASQUE_RUN_HISTORY` table.
`run_completion_time`	`string`	Yes	Query Parameter	The finish time of the run, which can be retrieved from the run log or from the `completion_time` column in the `DATAMASQUE_RUN_HISTORY` table. It must be in the datetime format `%Y-%m-%d %H:%M:%S`.
`ruleset_content_sha256`	`string`	Yes	Query Parameter	The hash of the ruleset, which can be retrieved from the run log or from the `ruleset_content_sha256` column in the `DATAMASQUE_RUN_HISTORY` table.

GET /api/runs/validate/ `curl` example

Given the run log contains:

SHA256 hash of ruleset: 7ee08ef63db7fed2baf577f16d74427c2250ba05f6858b0a27b70e05ccbff6eb

Finished At: 2024-05-22 22:11:35 UTC

and the DATAMASQUE_RUN_HISTORY table has:

run_hash: 8d34cc930ce7eae40a633e95aef3aee5d2108511eb20ac35805f2e0834115bb9

the request is as follows. The space in run_completion_time is URL-encoded as %20:

curl -X GET "https://<your-datamasque-host>/api/runs/validate/?run_hash=8d34cc930ce7eae40a633e95aef3aee5d2108511eb20ac35805f2e0834115bb9&run_completion_time=2024-05-22%2022:11:35&ruleset_content_sha256=7ee08ef63db7fed2baf577f16d74427c2250ba05f6858b0a27b70e05ccbff6eb" \
     -H "Authorization: Token <your-api-token>"
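
The `run_completion_time` parameter of GET /api/runs/validate/ contains a space and colons, so the query string is easiest to get right when a library does the URL encoding. The sketch below uses Python's `urllib.parse`; `build_validate_url` is a hypothetical helper, and the example values are the ones from the validate example in this section.

```python
from datetime import datetime
from urllib.parse import quote, urlencode


def build_validate_url(host: str, run_hash: str, finished_at: datetime,
                       ruleset_sha256: str) -> str:
    """Return the URL for GET /api/runs/validate/ with an encoded query."""
    query = urlencode(
        {
            "run_hash": run_hash,
            # Required format: %Y-%m-%d %H:%M:%S
            "run_completion_time": finished_at.strftime("%Y-%m-%d %H:%M:%S"),
            "ruleset_content_sha256": ruleset_sha256,
        },
        quote_via=quote,  # encode the space as %20 rather than +
    )
    return f"https://{host}/api/runs/validate/?{query}"


url = build_validate_url(
    "<your-datamasque-host>",
    "8d34cc930ce7eae40a633e95aef3aee5d2108511eb20ac35805f2e0834115bb9",
    datetime(2024, 5, 22, 22, 11, 35),
    "7ee08ef63db7fed2baf577f16d74427c2250ba05f6858b0a27b70e05ccbff6eb",
)
```

The resulting query string contains `run_completion_time=2024-05-22%2022%3A11%3A35`, which decodes back to `2024-05-22 22:11:35`.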

GET /api/runs/{id}/sdd-report/

Authorization: User token only.

Download the SDD (sensitive data discovery) Report for a run.

GET /api/runs/{id}/sdd-report/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

GET /api/runs/{id}/sdd-report/ Responses

Status Code	Description
`200`	The server will return the SDD Report in the response body which can be downloaded as a CSV file.
`404`	If there is no SDD Report for a run, the server will return `404` status code.

GET /api/runs/{id}/sdd-report/ `curl` example

curl "https://<your-datamasque-host>/api/runs/{id}/sdd-report/" \
     -H "Authorization: Token <your-api-token>"

GET /api/runs/{id}/run-report/

Authorization: User token only.

Download the Run Report for a run.

GET /api/runs/{id}/run-report/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

GET /api/runs/{id}/run-report/ Responses

Status Code	Description
`200`	The server will return the Run Report in the response body which can be downloaded as a CSV file.
`404`	If there is no Run Report for a run, the server will return `404` status code.

GET /api/runs/{id}/run-report/ `curl` example

curl "https://<your-datamasque-host>/api/runs/{id}/run-report/" \
     -H "Authorization: Token <your-api-token>"

DELETE /api/runs/{id}/db-discovery-results/

Deletes the database discovery results for a run. Use this only when the results are no longer needed, for instance because you have completed another discovery run on the same database more recently.

Warning! Deletion of results is irreversible.

Note: This endpoint can only be used to delete discovery results that were created on versions of DataMasque v2.22 and later. It is not possible to delete discovery results from versions prior to v2.22.

DELETE /api/runs/{id}/db-discovery-results/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

DELETE /api/runs/{id}/db-discovery-results/ Responses

Status Code	Description
`204`	Deletion was successful.
`404`	Not Found: There are no database discovery results for this run, or a run with the specified ID does not exist.

DELETE /api/runs/{id}/db-discovery-results/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/runs/{id}/db-discovery-results/" \
     -H "Authorization: Token <your-api-token>"

GET /api/runs/{id}/db-discovery-results/report/

Downloads database schema discovery results as a CSV.

GET /api/runs/{id}/db-discovery-results/report/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

GET /api/runs/{id}/db-discovery-results/report/ Responses

Status Code	Description
`200`	The server will return the discovery results in the response body which can be downloaded as a CSV file.
`404`	Not Found: There are no database discovery results for this run, or a run with the specified ID does not exist.

GET /api/runs/{id}/db-discovery-results/report/ `curl` example

curl -o report.csv "https://<your-datamasque-host>/api/runs/{id}/db-discovery-results/report/" \
     -H "Authorization: Token <your-api-token>"

Option Object

Option objects have the following fields:

Field	Type	Description
`batch_size`	`integer`	An argument to specify the number of rows to fetch in each batch retrieved from the database for masking. This is ignored for file masking.
`dry_run`	`boolean`	Indicates a dry run where no data in the database is actually changed. Values should either be `true` to indicate a dry run, or `false` to run normally. Default value is `false`. More information on dry runs is available in the Masking runs documentation.
`max_rows`	`integer`	A parameter to specify the maximum number of rows that will be masked by each `mask_table` task¹. Defaults to no limit. This is ignored for file masking.
`continue_on_failure`	`boolean`	If there is a task failure, and this option is `false`, DataMasque will skip all remaining unstarted tasks. If this option is `true`, DataMasque will continue performing other tasks even if there is a task failure. Default value is `false`.
`run_secret`	`string`	The run secret is used in the random generation of masked values. If left unspecified, a random secret will be automatically generated and returned in the API response ². Masking runs performed on the same DataMasque instance with the same run secret will produce the same masked values for identical unmasked database inputs. You should only specify a run secret if you require consistent masking across runs, otherwise it is more secure to allow a new run secret to be automatically generated for each run. Run secrets must be at least 20 characters long.
`disable_instance_secret`	`boolean`	If this option is set to `true`, DataMasque will exclude its instance-specific secret and generate masked values based solely on the run secret. You may wish to disable the instance secret in order to achieve consistent masking across DataMasque instances. However, by disabling the instance secret, any DataMasque instance using the same `run_secret` could replicate your data masking.
`diagnostic_logging`	`boolean`	If set to `true`, the run log will include information to help diagnose errors. This includes information about the tables, columns and keys being masked, memory usage information and more verbose output. Defaults to `false`.
`buffer_size` (deprecated; will be removed in release 3.0.0)	`integer`	Replaced by `batch_size`.

¹ max_rows does not apply to mask_unique_key tasks.

² The run_secret contained in the API response can be provided in subsequent API calls to start runs, facilitating consistent masking across those runs.
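
The run secret rules above (optional, but at least 20 characters when given) can be checked on the client before a run is started. The following sketch is illustrative: `build_options` and `new_run_secret` are hypothetical helpers, and generating a secret with `secrets.token_urlsafe` assumes that DataMasque accepts URL-safe characters in a run secret.

```python
import secrets
from typing import Optional

# Stated above: run secrets must be at least 20 characters long.
MIN_RUN_SECRET_LENGTH = 20


def new_run_secret() -> str:
    """Generate a random secret comfortably above the minimum length."""
    return secrets.token_urlsafe(32)


def build_options(run_secret: Optional[str] = None, dry_run: bool = False,
                  continue_on_failure: bool = False) -> dict:
    """Return an Option object for the POST /api/runs/ request body."""
    options = {"dry_run": dry_run, "continue_on_failure": continue_on_failure}
    if run_secret is not None:
        if len(run_secret) < MIN_RUN_SECRET_LENGTH:
            raise ValueError("run_secret must be at least 20 characters long")
        options["run_secret"] = run_secret
    # When run_secret is omitted, DataMasque generates one for the run.
    return options
```

Reusing one secret across calls, for example `build_options(run_secret=shared)`, is only needed when consistent masking across runs is required; otherwise leaving it out lets a fresh secret be generated for each run.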

Additionally, the following options apply to schema discovery runs (i.e. runs that include at least one run_schema_discovery task):

Field	Type	Description
`custom_keywords`	`array[string]`	List of keywords that, where a column's name matches one or more of the keywords, indicates the column contains sensitive data. Default value is an empty list.
`ignored_keywords`	`array[string]`	List of keywords that, where a column's name matches one or more of the keywords, indicates the column should be excluded from the schema discovery results. Default value is an empty list.
`disable_global_custom_keywords`	`boolean`	If set to `true`, then the user-defined global set of custom keywords will not be used to flag columns as sensitive. Default value is `false`.
`disable_global_ignored_keywords`	`boolean`	If set to `true`, then the user-defined global set of ignored keywords will not be used to exclude columns from the schema discovery results. Default value is `false`.
`disable_built_in_keywords`	`boolean`	If set to `true`, then DataMasque's built-in list of keywords will not be used to flag columns as sensitive. Default value is `false`.
`schemas`	`array[string]`	List of schema (database for MySQL/MariaDB) names against which to perform schema discovery. Default value is an empty list, meaning schema discovery will run against the schema configured on the database connection, or the database user's default schema.

Requests related to Option Object:
- POST /api/runs/

Runlog Object

Runlog objects have the following fields:

Field	Type	Description
`run`	`integer`	ID of the `Run` this `Runlog` was generated for.
`timestamp`	`string`	Timestamp of this `Runlog`'s generation, in ISO 8601 format.
`message`	`string`	The log message passed from the masking worker.
`log_level`	`integer`	Numeric representation of the log level, values are `20` for INFO, `30` for WARNING, and `40` for ERROR.
`status`	`string`	Indicates the `Run` status. The potential values are: `queued`, `running`, `finished`, `finished_with_warnings`, `failed`, `cancelling`, and `cancelled`. A status of `finished` or `finished_with_warnings` indicates the `Run` completed successfully; `failed` indicates an error. `finished_with_warnings` indicates there were warnings during the run; refer to the run log to view them.
`is_dry_run`	`boolean`	Indicates whether the `Run` is a dry run.
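
The numeric `log_level` values listed above (20, 30 and 40) coincide with the `INFO`, `WARNING` and `ERROR` constants of Python's standard `logging` module, which makes filtering straightforward. The sketch below works on already-decoded Runlog entries; the sample entries are made up for illustration and show only two of the fields.

```python
import logging


def filter_by_level(entries: list, minimum: int = logging.WARNING) -> list:
    """Keep Runlog entries whose log_level is at or above `minimum`."""
    return [entry for entry in entries if entry["log_level"] >= minimum]


# Hypothetical sample data, not real DataMasque output.
sample = [
    {"log_level": 20, "message": "starting task"},
    {"log_level": 30, "message": "column skipped"},
    {"log_level": 40, "message": "task failed"},
]
for entry in filter_by_level(sample):
    print(logging.getLevelName(entry["log_level"]), entry["message"])
```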

Requests related to Runlog Object:
- GET /api/runs/{id}/log/

GET /api/runs/{id}/log/

Authorization: User token or API token.

List all logs for a specified Run in a JSON response.

GET /api/runs/{id}/log/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.
`limit`	`integer`	No	Query Parameter	The maximum number of `Runlog` entries to return.
`offset`	`integer`	No	Query Parameter	The starting position of the query in relation to the complete set of `Runlog` entries for this `Run`.
`ordering`	`string`	No	Query Parameter	Controls the order of the results. Available fields to order by are `id` and `timestamp`. Reverse the order by prefixing the field name with `-`. Multiple orderings may be specified, separated by commas.

GET /api/runs/{id}/log/ Responses

Status Code	Description
`200`	A JSON serialised list of Runlog objects. The default is to return all the logs for the run.

GET /api/runs/{id}/log/ `curl` examples

Fetch the complete run log:

curl "https://<your-datamasque-host>/api/runs/{id}/log/" \
     -H "Authorization: Token <your-api-token>"

Fetch the first 25 logs:

curl "https://<your-datamasque-host>/api/runs/{id}/log/?limit=25&offset=0" \
     -H "Authorization: Token <your-api-token>"

Fetch logs 51 to 100:

curl "https://<your-datamasque-host>/api/runs/{id}/log/?limit=50&offset=50" \
     -H "Authorization: Token <your-api-token>"

Order by timestamp and id descending (newest first):

curl "https://<your-datamasque-host>/api/runs/{id}/log/?ordering=-timestamp,-id" \
     -H "Authorization: Token <your-api-token>"
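
When paging through a long run log, the `limit`, `offset` and `ordering` parameters can be derived from a page number. The following is a minimal sketch; `log_page_url` is a hypothetical helper, pages are counted from zero, and the default page size of 50 is an arbitrary choice.

```python
from urllib.parse import urlencode


def log_page_url(host: str, run_id: int, page: int, page_size: int = 50,
                 ordering: tuple = ("timestamp", "id")) -> str:
    """Return the URL for one page of GET /api/runs/{id}/log/."""
    query = urlencode(
        {
            "limit": page_size,
            "offset": page * page_size,
            # Multiple orderings are separated by commas; prefix with - to reverse.
            "ordering": ",".join(ordering),
        },
        safe=",",  # keep the comma separator readable
    )
    return f"https://{host}/api/runs/{run_id}/log/?{query}"


# Second page (entries 51 to 100), oldest first.
print(log_page_url("<your-datamasque-host>", 123, page=1))
```

Using a fixed ordering while paging keeps the pages consistent with each other if new log entries arrive between requests.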

GET /api/runs/{id}/log/download/

Authorization: User token only.

Download all logs for a specified Run as a plain text file.

GET /api/runs/{id}/log/download/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.
`timezone`	`string`	Yes	Query Parameter	Timezone offset to use for the Run logs, in the format +HH:MM or -HH:MM. Examples: +07:00, -05:00.

GET /api/runs/{id}/log/download/ Responses

Status Code	Description
`200`	The server will return the Run Log content in the response body which can be downloaded as a log file.

GET /api/runs/{id}/log/download/ `curl` example

curl "https://<your-datamasque-host>/api/runs/{id}/log/download/?timezone=%2B07:00" \
     -H "Authorization: Token <your-api-token>"
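
A literal `+` in a query string is commonly decoded as a space, so a positive timezone offset is safer when percent-encoded. The Python sketch below illustrates this with the standard library. The function name `run_log_download_url` and the host are hypothetical, and no request is sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def run_log_download_url(host, run_id, timezone):
    """Build the download URL, percent-encoding the timezone offset."""
    query = urlencode({"timezone": timezone})
    return f"https://{host}/api/runs/{run_id}/log/download/?{query}"


# Hypothetical host and run id, for illustration only.
url = run_log_download_url("datamasque.example.com", 123, "+07:00")
print(url)

# Decoding the encoded query recovers the original offset.
print(parse_qs(urlsplit(url).query)["timezone"])

# An unencoded "+" is decoded as a space by standard form decoding.
print(parse_qs("timezone=+07:00")["timezone"])
```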

Connection Object

Database Connection objects have the following fields:

Field	Type	Description
`version`	`string`	The connection version. This should be set to `1.0`.
`id`	`integer`	The `id` of the `Connection`. Use this in API URLs that need a connection `id`.
`name`	`string`	The name of the `Connection`.
`user`	`string`	The name of the user in the database connection.
`db_type`	`string`	The type of database the connection is connecting to.
`database`	`string`	The database the connection is connecting to.
`host`	`string`	The hostname of the database connection.
`port`	`integer`	The database port being connected through.
`dbpassword`	`string`	The password for the user connecting to the database.
`schema`	`string`	The schema of the database to connect to.
`options`	`object`	An `Option` object of configuration for the `Run`.
`service_name`	`string`	The service name for the connection. Only used for Oracle. (Optional)
`connection_fileset`	`string`	The connection fileset attached to this connection. Currently only used for MySQL and MariaDB. (Optional)
`mask_type`	`string`	The type of masking the connection can perform, only `database` or `file` are valid. (Optional) Should be set to `database` for database `Connections`.
`last_discovery_run_date`	`string`	The created_time of the last run on this connection including a `run_schema_discovery` task, or `null` if no such run has been performed.
`last_discovery_run_id`	`string`	The ID of the last run on this connection including a `run_schema_discovery` task, or `null` if no such run has been performed.
`is_read_only`	`boolean`	Whether or not the connection to the database is read-only.
`data_encoding`	`string`	Only for Oracle, Postgres, MySQL, and MariaDB connections. An encoding to be used when retrieving data containing different character sets from the database. Should match the encoding of the data stored, not the character set of the database. The list of supported encodings can be found on the Database Connections page.
`iam_role_arn`	`string`	Only for Amazon DynamoDB connections. The ARN of the IAM role for DataMasque to assume.

File Connection objects have the following fields:

Field	Type	Description
`version`	`string`	The connection version. This should be set to `1.0`.
`id`	`integer`	The `id` of the `Connection`. Use this in API URLs that need a connection `id`.
`name`	`string`	The name of the `Connection`.
`type`	`string`	The type of file system the connection is connecting to. Valid options are `"s3_connection"`, `"azure_blob_connection"` or `"mounted_share_connection"`.
`base_directory`	`string`	The root file path where files intended to be masked are stored.
`bucket`	`string`	The name of the S3 bucket containing the `base_directory`. Only for S3 `Connections`.
`container`	`string`	The name of the Azure Blob Storage container containing the `base_directory`. Only for Azure Blob `Connections`.
`connection_string`	`string`	The connection string configured with the authorization information to access data in your Azure Storage account. Only for Azure Blob `Connections`.
`mask_type`	`string`	The type of masking the connection can perform, only `database` or `file` are valid. (Optional) Should be set to `file` for file `Connections`.
`is_file_mask_source`	`boolean`	Whether the connection is a source `Connection` for file masking. (Optional) Defaults to `false` if not provided.
`is_file_mask_destination`	`boolean`	Whether the connection is a destination `Connection` for file masking. (Optional) Defaults to `false` if not provided.

Requests related to Connection Object:

GET /api/connections/

Authorization: User token only.

Get a list of all DataMasque connections.

Optionally, you can add an {id} to the end of the request to only return the details of the connection with that specific id.

GET /api/connections/ Parameters

Optionally, append the `id` of a specific connection to the URL to return information on only that connection.

GET /api/connections/ Responses

Status Code	Description
`200`	A JSON serialised list of Connection objects, or a single Connection object if an `id` was supplied.

GET /api/connections/ `curl` example

curl "https://<your-datamasque-host>/api/connections/" \
     -H "Authorization: Token <your-api-token>"

POST /api/connections/

Authorization: User token only.

Create a new connection object.

POST /api/connections/ Parameters

Database Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`user`	`string`	Yes	Request Body	The name of the user in the database connection.
`db_type`	`string`	Yes	Request Body	The type of database the connection is connecting to.
`database`	`string`	Yes	Request Body	The database the connection is connecting to.
`host`	`string`	Yes	Request Body	The hostname of the database connection.
`port`	`integer`	Yes	Request Body	The database port being connected through.
`dbpassword`	`string`	Yes	Request Body	The password for the user connecting to the database.
`schema`	`string`	Yes	Request Body	The schema of the database to connect to.
`service_name`	`string`	No	Request Body	The service name for the connection. Only applies to Oracle.
`connection_fileset`	`string`	No	Request Body	The connection fileset attached to this connection. Only applies to MySQL and MariaDB.
`mask_type`	`string`	No, defaults to `database` if not provided.	Request Body	The type of masking the connection can perform, only `database` or `file` are valid.
`is_read_only`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether or not the connection to the database is read-only.
`data_encoding`	`string`	No, defaults to `None` if not provided.	Request Body	Only for Oracle, Postgres, MySQL, and MariaDB connections. An encoding to be used when retrieving data containing different character sets from the database. Should match the encoding of the data stored, not the character set of the database. The list of supported encodings can be found on the Database Connections page.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	Only for Amazon DynamoDB connections. The ARN of the IAM role for DataMasque to assume.

File Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`type`	`string`	Yes	Request Body	The type of file system the connection is connecting to. Valid options are `"s3_connection"`, `"azure_blob_connection"` or `"mounted_share_connection"`.
`base_directory`	`string`	Yes	Request Body	The root file path where files intended to be masked are stored.
`bucket`	`string`	Required only for S3 `Connections`.	Request Body	The name of the S3 bucket containing the `base_directory`.
`container`	`string`	Required only for Azure Blob `Connections`.	Request Body	The name of the Azure Blob Storage container containing the `base_directory`.
`connection_string`	`string`	Required only for Azure Blob `Connections`.	Request Body	The connection string configured with the authorization information to access data in your Azure Storage account.
`mask_type`	`string`	No, defaults to `database` if not provided.	Request Body	The type of masking the connection can perform, only `database` or `file` are valid.
`is_file_mask_source`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a source `Connection` for file masking.
`is_file_mask_destination`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a destination `Connection` for file masking.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	The IAM role ARN for DataMasque to assume role as for S3 connections.
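
Because the required fields for a file connection depend on its `type`, it can help to check a payload before sending it. The following Python sketch encodes the requirements from the table above. The helper name `build_file_connection` and the sample values are hypothetical, and the server may apply further validation that is not shown here.

```python
# Fields that the table above marks as required only for a given connection type.
REQUIRED_BY_TYPE = {
    "s3_connection": ["bucket"],
    "azure_blob_connection": ["container", "connection_string"],
    "mounted_share_connection": [],
}


def build_file_connection(name, conn_type, base_directory, **extra):
    """Return a request body for a file connection, checking type-specific fields."""
    if conn_type not in REQUIRED_BY_TYPE:
        raise ValueError(f"unknown connection type: {conn_type}")
    missing = [field for field in REQUIRED_BY_TYPE[conn_type] if not extra.get(field)]
    if missing:
        raise ValueError(f"{conn_type} requires: {', '.join(missing)}")
    payload = {
        "version": "1.0",
        "name": name,
        "type": conn_type,
        "base_directory": base_directory,
        # mask_type defaults to "database", so set it explicitly for file connections.
        "mask_type": "file",
    }
    payload.update(extra)
    return payload


# Hypothetical values, for illustration only.
s3_payload = build_file_connection(
    "my-s3-source", "s3_connection", "data/", bucket="my-bucket", is_file_mask_source=True
)
print(s3_payload)
```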

POST /api/connections/ Responses

Status Code	Description
`201`	A JSON serialised Connection object.

POST /api/connections/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/connections/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "version": "1.0",
           "name": "<connection_name>",
           "user": "<database_user>",
           "db_type": "<database_type>",
           "database": "<database_name>",
           "host": "<database_host>",
           "port": <database_port>,
           "dbpassword": "<database_password>",
           "schema": "<database_schema>",
           "service_name": "<oracle_service_name>",
           "connection_fileset": "<connection_fileset>",
           "mask_type": "database"
         }'
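
The same request can be prepared from Python using only the standard library. This is a sketch under stated assumptions: the function name `create_connection_request`, the host, the token and all connection values are hypothetical placeholders, and the request object is only built here, not sent.

```python
import json
from urllib.request import Request


def create_connection_request(host, token, connection):
    """Build (but do not send) a POST /api/connections/ request."""
    body = json.dumps(connection).encode("utf-8")
    return Request(
        f"https://{host}/api/connections/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
connection = {
    "version": "1.0",
    "name": "example-connection",
    "user": "app_user",
    "db_type": "oracle",
    "database": "exampledb",
    "host": "db.example.com",
    "port": 1521,  # sent as a JSON integer, not a string
    "dbpassword": "example-password",
    "schema": "APP",
    "mask_type": "database",
}
req = create_connection_request("datamasque.example.com", "<your-user-token>", connection)
print(req.get_method(), req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen`. A `201` response carries the created Connection object.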

PUT /api/connections/{id}/

Authorization: User token only.

Update a connection with a specified id with new values.

PUT /api/connections/{id}/ Parameters

Database Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`user`	`string`	Yes	Request Body	The name of the user in the database connection.
`db_type`	`string`	Yes	Request Body	The type of database the connection is connecting to.
`database`	`string`	Yes	Request Body	The database the connection is connecting to.
`host`	`string`	Yes	Request Body	The hostname of the database connection.
`port`	`integer`	Yes	Request Body	The database port being connected through.
`dbpassword`	`string`	Yes	Request Body	The password for the user connecting to the database.
`schema`	`string`	Yes	Request Body	The schema of the database to connect to.
`service_name`	`string`	No	Request Body	The service name for the connection. Only applies to Oracle.
`connection_fileset`	`string`	No	Request Body	The connection fileset attached to this connection. Only applies to MySQL and MariaDB.
`mask_type`	`string`	No, defaults to `database` if not provided.	Request Body	The type of masking the connection can perform, only `database` or `file` are valid.
`is_read_only`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether or not the connection to the database is read-only.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	Only for Amazon DynamoDB connections. The ARN of the IAM role for DataMasque to assume.

File Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`type`	`string`	Yes	Request Body	The type of file system the connection is connecting to. Valid options are `"s3_connection"`, `"azure_blob_connection"` or `"mounted_share_connection"`.
`base_directory`	`string`	Yes	Request Body	The root file path where files intended to be masked are stored.
`bucket`	`string`	Required only for S3 `Connections`.	Request Body	The name of the S3 bucket containing the `base_directory`.
`container`	`string`	Required only for Azure Blob `Connections`.	Request Body	The name of the Azure Blob Storage container containing the `base_directory`.
`connection_string`	`string`	Required only for Azure Blob `Connections`.	Request Body	The connection string configured with the authorization information to access data in your Azure Storage account.
`mask_type`	`string`	No, defaults to `database` if not provided.	Request Body	The type of masking the connection can perform, only `database` or `file` are valid.
`is_file_mask_source`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a source `Connection` for file masking.
`is_file_mask_destination`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a destination `Connection` for file masking.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	The IAM role ARN for DataMasque to assume role as for S3 connections.

PUT /api/connections/{id}/ Responses

Status Code	Description
`200`	A JSON serialised Connection object with the new updated values.

PUT /api/connections/{id}/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/connections/{connection_id}/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "version": "1.0",
           "name": "<connection_name>",
           "user": "<database_user>",
           "db_type": "<database_type>",
           "database": "<database_name>",
           "host": "<database_host>",
           "port": <database_port>,
           "dbpassword": "<database_password>",
           "schema": "<database_schema>",
           "service_name": "<oracle_service_name>",
           "connection_fileset": "<connection_fileset>",
           "mask_type": "database"
         }'

DELETE /api/connections/{id}/

Authorization: User token only.

Delete the connection with the specified id.

DELETE /api/connections/{id}/ Parameters

No parameters.

DELETE /api/connections/{id}/ Responses

Status Code	Description
`204`	Operation succeeded

DELETE /api/connections/{id}/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/connections/{id}/" \
     -H "Authorization: Token <your-api-token>"

POST /api/connections/test/

Authorization: User token only.

Test a connection to validate that it is able to successfully connect to the target database.

POST /api/connections/test/ Parameters

Database Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`user`	`string`	Yes	Request Body	The name of the user in the database connection.
`db_type`	`string`	Yes	Request Body	The type of database the connection is connecting to.
`database`	`string`	Yes	Request Body	The database the connection is connecting to.
`host`	`string`	Yes	Request Body	The hostname of the database connection.
`port`	`integer`	Yes	Request Body	The database port being connected through.
`dbpassword`	`string`	Yes	Request Body	The password for the user connecting to the database.
`schema`	`string`	Yes	Request Body	The schema of the database to connect to.
`service_name`	`string`	No	Request Body	The service name for the connection. Only applies to Oracle.
`connection_fileset`	`string`	No	Request Body	The connection fileset attached to this connection. Only applies to MySQL and MariaDB.
`is_read_only`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether or not the connection to the database is read-only.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	Only for Amazon DynamoDB connections. The ARN of the IAM role for DataMasque to assume.

File Connections

Field	Type	Required	Location	Description
`version`	`string`	Yes	Request Body	The connection version. This should be set to `1.0`.
`name`	`string`	Yes	Request Body	The name of the `Connection`.
`type`	`string`	Yes	Request Body	The type of file system the connection is connecting to. Valid options are `"s3_connection"`, `"azure_blob_connection"` or `"mounted_share_connection"`.
`base_directory`	`string`	Yes	Request Body	The root file path where files intended to be masked are stored.
`bucket`	`string`	Required only for S3 `Connections`.	Request Body	The name of the S3 bucket containing the `base_directory`.
`container`	`string`	Required only for Azure Blob `Connections`.	Request Body	The name of the Azure Blob Storage container containing the `base_directory`.
`connection_string`	`string`	Required only for Azure Blob `Connections`.	Request Body	The connection string configured with the authorization information to access data in your Azure Storage account.
`mask_type`	`string`	No, defaults to `database` if not provided.	Request Body	The type of masking the connection can perform, only `database` or `file` are valid.
`is_file_mask_source`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a source `Connection` for file masking.
`is_file_mask_destination`	`boolean`	No, defaults to `false` if not provided.	Request Body	Whether the connection is a destination `Connection` for file masking.
`iam_role_arn`	`string`	No, role assumption will only take place if provided.	Request Body	The IAM role ARN for DataMasque to assume role as for S3 connections.

POST /api/connections/test/ Responses

Status Code	Description
`200`	Operation succeeded

POST /api/connections/test/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/connections/test/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "<your-connection-name>",
           "user": "<your-connection-user>",
           "db_type": "oracle",
           "database": "<your-database>",
           "host": "<your-host>",
           "port": 1521,
           "dbpassword": "<your-password>",
           "schema": "<optional-schema>",
           "service_name": "<optional-service-name>",
           "connection_fileset": "<optional-connection-fileset>",
           "version": "1.0"
         }'

Connection Fileset Object

Connection Fileset objects have the following fields:

Field	Type	Description
`id`	`integer`	The `id` of the `Connection Fileset`. Use this in API URLs that need a connection_fileset `id`.
`name`	`string`	The name of the `Connection Fileset`.
`database_type`	`string`	The type of database the `Connection Fileset` is associated with (currently only `mysql` is supported; this will work with both MySQL and MariaDB connections).
`zip_archive`	`string`	The location of the Zip archive.

Requests related to Connection Fileset:

GET /api/connection-filesets/

Authorization: User token only.

Returns a list of Connection Filesets. These may be used to encrypt connections to MySQL and MariaDB databases.

GET /api/connection-filesets/ Parameters

No parameters.

GET /api/connection-filesets/ Responses

Status Code	Description
`200`	A JSON serialised list of Connection Filesets.

GET /api/connection-filesets/ `curl` example

curl "https://<your-datamasque-host>/api/connection-filesets/" \
     -H "Authorization: Token <your-api-token>"

POST /api/connection-filesets/

Authorization: User token only.

Create a new Connection Fileset.

POST /api/connection-filesets/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Form Field	The name of the `Connection Fileset`.
`database_type`	`string`	Yes	Form Field	The type of database the `Connection Fileset` is associated with (currently only `mysql` is supported; this will work with both MySQL and MariaDB connections).
`zip_archive`	`file`	Yes	Form Field	The Zip archive file.

POST /api/connection-filesets/ Responses

Status Code	Description
`201`	A JSON serialised object of the Connection Fileset that was created.

POST /api/connection-filesets/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/connection-filesets/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "name=<fileset_name>" \
     -F "database_type=<database_type>" \
     -F "zip_archive=@</path/to/your/file.zip>"

PUT /api/connection-filesets/{id}/

Authorization: User token only.

Update a Connection Fileset.

PUT /api/connection-filesets/{id}/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Form Field	The name of the `Connection Fileset`.
`database_type`	`string`	Yes	Form Field	The type of database the `Connection Fileset` is associated with (currently only `mysql` is supported; this will work with both MySQL and MariaDB connections).
`zip_archive`	`file`	Yes	Form Field	The Zip archive file.

PUT /api/connection-filesets/{id}/ Responses

Status Code	Description
`201`	A JSON serialised object of the Connection Fileset that was updated.

PUT /api/connection-filesets/{id}/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/connection-filesets/{id}/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "name=<fileset_name>" \
     -F "database_type=<database_type>" \
     -F "zip_archive=@</path/to/your/file.zip>"

DELETE /api/connection-filesets/{id}/

Authorization: User token only.

Deletes the Connection Fileset with the specified id. You may not delete a Connection Fileset associated to an existing connection.

DELETE /api/connection-filesets/{id}/ Parameters

No parameters.

DELETE /api/connection-filesets/{id}/ Responses

Status Code	Description
`204`	Operation succeeded.

DELETE /api/connection-filesets/{id}/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/connection-filesets/{id}/" \
     -H "Authorization: Token <your-api-token>"

Ruleset Object

Ruleset objects have the following fields:

Field	Type	Description
`id`	`integer`	The `id` of the `Ruleset`. Use this in API URLs that need a ruleset `id`.
`name`	`string`	The name of the `Ruleset`.
`config_yaml`	`string`	The contents of the `Ruleset`, including all of the masking rules.
`is_valid`	`boolean`	Whether or not the `Ruleset` is valid, and can be used for masking runs.
`mask_type`	`string`	The masking type of the `Ruleset`. This can be `"database"` or `"file"`.

Requests related to Ruleset Object:

GET /api/rulesets/

Authorization: User token only.

Returns a list of all rulesets.

GET /api/rulesets/ Parameters

No parameters.

GET /api/rulesets/ Responses

Status Code	Description
`200`	A JSON serialised list of Ruleset objects.

GET /api/rulesets/ `curl` example

curl "https://<your-datamasque-host>/api/rulesets/" \
     -H "Authorization: Token <your-api-token>"

GET /api/rulesets/{id}/

GET /api/rulesets/{id}/ Parameters

No parameters.

GET /api/rulesets/{id}/ Responses

Status Code	Description
`200`	A JSON serialised Ruleset object.

GET /api/rulesets/{id}/ `curl` example

curl "https://<your-datamasque-host>/api/rulesets/{id}/" \
     -H "Authorization: Token <your-api-token>"

POST /api/rulesets/

Authorization: User token only.

Creates a new ruleset.

POST /api/rulesets/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Request Body	The name of the `Ruleset`.
`config_yaml`	`string`	Yes	Request Body	The YAML contents of the `Ruleset`.
`mask_type`	`string`	No	Request Body	The masking type of the `Ruleset`. Valid options are `"database"` or `"file"`.

POST /api/rulesets/ Responses

Status Code	Description
`201`	A JSON serialised Ruleset object.

POST /api/rulesets/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/rulesets/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "<your-new-name>",
           "config_yaml": "version: \"1.0\"\ntasks:\n  - type: run_data_discovery"
         }'
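
Escaping a multi-line YAML ruleset by hand inside a JSON string is error-prone. One option is to let a JSON library do the escaping, as in the Python sketch below. The helper name `build_ruleset_payload` and the ruleset name are hypothetical, and the YAML is the same minimal ruleset used in the curl example.

```python
import json

# The same ruleset as in the curl example, written as ordinary multi-line YAML.
RULESET_YAML = """\
version: "1.0"
tasks:
  - type: run_data_discovery
"""


def build_ruleset_payload(name, config_yaml, mask_type=None):
    """Return the JSON request body for POST /api/rulesets/."""
    payload = {"name": name, "config_yaml": config_yaml}
    if mask_type is not None:
        if mask_type not in ("database", "file"):
            raise ValueError("mask_type must be 'database' or 'file'")
        payload["mask_type"] = mask_type
    # json.dumps escapes the quotes and newlines in the YAML string.
    return json.dumps(payload)


body = build_ruleset_payload("example-ruleset", RULESET_YAML)
print(body)
```

The resulting string can be passed to `curl -d` or sent with any HTTP client, together with the `Content-Type: application/json` header.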

PUT /api/rulesets/{id}/

Authorization: User token only.

Update an existing ruleset.

PUT /api/rulesets/{id}/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Request Body	The name of the `Ruleset`.
`config_yaml`	`string`	Yes	Request Body	The YAML contents of the `Ruleset`.
`mask_type`	`string`	No	Request Body	The masking type of the `Ruleset`. Valid options are `"database"` or `"file"`.

PUT /api/rulesets/{id}/ Responses

Status Code	Description
`200`	A JSON serialised Ruleset object with the updated values.

PUT /api/rulesets/{id}/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/rulesets/{id}/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "name": "<your-new-name>",
           "config_yaml": "version: \"1.0\"\ntasks:\n  - type: run_data_discovery"
         }'

DELETE /api/rulesets/{id}/

Authorization: User token only.

Deletes the ruleset with the specified id.

DELETE /api/rulesets/{id}/ Parameters

No parameters.

DELETE /api/rulesets/{id}/ Responses

Status Code	Description
`200`	Operation succeeded

DELETE /api/rulesets/{id}/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/rulesets/{id}/" \
     -H "Authorization: Token <your-api-token>"

Seed Object

Field	Type	Description
`id`	`integer`	The `id` of the `Seed`.
`name`	`string`	The name of the `Seed`.
`seed_file`	`string`	The location of the `Seed`.
`created date`	`datetime`	The date that the `Seed` was uploaded.
`filename`	`string`	The file name of the uploaded `Seed`.

Requests that use Seed Object:
- GET /api/seeds/
- POST /api/seeds/

GET /api/seeds/

Authorization: User token only.

Get a list of all DataMasque seed files.

Optionally, you can add an {id} to the end of the request to only return the details of the seed with that specific id.

GET /api/seeds/ Parameters

No parameters.

GET /api/seeds/ Responses

Status Code	Description
`200`	A JSON serialised list of Seed objects.

GET /api/seeds/ `curl` example

curl "https://<your-datamasque-host>/api/seeds/" \
     -H "Authorization: Token <your-api-token>"

POST /api/seeds/

Authorization: User token only.

Create a new seed from a CSV file.

POST /api/seeds/ Parameters

Field	Type	Required	Description
`name`	`string`	No	The name of the CSV file.
`description`	`string`	No	A description of the seed file to be displayed on the files menu.
`seed_file`	`file`	No	The seed file.

POST /api/seeds/ Responses

Status Code	Description
`201`	A JSON serialised Seed object.

POST /api/seeds/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/seeds/" \
     -H "Authorization: Token <your-api-token>" \
     -F "name=<seed_name>" \
     -F "seed_file=@</path/to/your/seed_file.csv>"

Audit Log Object

Field	Type	Description
`id`	`integer`	The id of the audit log.
`timestamp`	`datetime`	The timestamp of when the audit log was created.
`username`	`string`	The username which created the audit log.
`category`	`string`	The category for the audit log. One of the following: `auth`, `run`, `ruleset`, or `connection`.
`action`	`string`	The action taken. One of the following: `logged_in` or `logged_out` for `auth` actions; `started` or `cancelled` for masking `run` actions; `created`, `modified` or `deleted` for `connection` or `ruleset` actions.
`description`	`string`	A short description of what happened during the action.

Requests that use Audit Log Object:
- GET /api/audit-logs/

Audit Log CSV

A CSV representation of the Audit Log Object.

The CSV file contains the following headers:

Field	Type	Description
`timestamp`	`datetime`	The timestamp of when the audit log was created.
`username`	`string`	The username which created the audit log.
`category`	`string`	The category for the audit log. One of the following: `auth`, `run`, `ruleset`, or `connection`.
`action`	`string`	The action taken. One of the following: `logged_in` or `logged_out` for `auth` actions; `started` or `cancelled` for masking `run` actions; `created`, `modified` or `deleted` for `connection` or `ruleset` actions.
`description`	`string`	A short description of what happened during the action.

Requests that use Audit Log CSV:
- GET /api/audit-logs/download/

GET /api/audit-logs/

Authorization: User token only.

Retrieve all Audit Logs.

GET /api/audit-logs/ Parameters

No parameters.

GET /api/audit-logs/ Response

Status Code	Description
`200`	A JSON serialised list of Audit Log objects.

GET /api/audit-logs/ `curl` example

curl "https://<your-datamasque-host>/api/audit-logs/" \
     -H "Authorization: Token <your-api-token>"

GET /api/audit-logs/download/

Authorization: User token only.

Retrieve all Audit Logs as a CSV file.

GET /api/audit-logs/download/ Parameters

No parameters.

GET /api/audit-logs/download/ Response

Status Code	Description
`200`	The server will return the audit logs in the response body which can be then downloaded as a CSV file.

GET /api/audit-logs/download/ `curl` example

curl -o <your-downloads-path>/<your-download-name>.csv -X GET "https://<your-datamasque-host>/api/audit-logs/download/" \
     -H "Authorization: Token <your-api-token>"
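
Once downloaded, the CSV can be processed with any CSV reader. The Python sketch below counts audit log entries per category using the headers listed in the Audit Log CSV section. The sample rows, including the timestamp format, are invented for illustration and are not real DataMasque output. The function name `count_by_category` is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented sample data using the documented headers; the timestamp format is an assumption.
SAMPLE_CSV = """\
timestamp,username,category,action,description
2024-01-01T10:00:00Z,alice,auth,logged_in,User logged in
2024-01-01T10:05:00Z,alice,ruleset,created,Ruleset created
2024-01-01T10:06:00Z,bob,run,started,Run started
2024-01-01T10:09:00Z,bob,run,cancelled,Run cancelled
"""


def count_by_category(csv_text):
    """Count audit log rows per `category` column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["category"] for row in reader)


counts = count_by_category(SAMPLE_CSV)
print(dict(counts))
```

For a file saved by the curl command above, read its contents with `open(path, newline="")` and pass the text to the same function.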

Schema Discovery

POST /api/schema-discovery/

Authorization: User token only.

Executes schema discovery against a database connection.

POST /api/schema-discovery/ Parameters

Field	Type	Required	Description
`connection`	`string`	Yes	The `id` of the `Connection`.
`custom_keywords`	`array[string]`	Yes	List of keywords. A column whose name matches one or more of these keywords is flagged as containing sensitive data.
`disable_built_in_keywords`	`boolean`	Yes	If set to `true`, then DataMasque's built-in list of keywords will not be used to flag columns as sensitive.
`disable_global_custom_keywords`	`boolean`	Yes	If set to `true`, then the user-defined global set of custom keywords will not be used to flag columns as sensitive.
`disable_global_ignored_keywords`	`boolean`	Yes	If set to `true`, then the user-defined global set of ignored keywords will not be used to exclude columns from the discovery results.
`ignored_keywords`	`array[string]`	Yes	List of keywords. A column whose name matches one or more of these keywords is excluded from the schema discovery results.
`in_data_discovery`	`object`	No	In-data discovery options. An object containing `enabled`, `row_sample_size`, `custom_rules`, `non_sensitive_rules` and `force` options. Defaults to `{enabled: false}`.
`schemas`	`array[string]`	Yes	List of schema names (or database for MySQL/MariaDB) against which to perform schema discovery. Send an empty list to run against the schema configured on the database connection, or the database user's default schema if one is not specified for the connection.

POST /api/schema-discovery/ Responses

Status Code	Description
`201`	A JSON serialised Run object.

POST /api/schema-discovery/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/schema-discovery/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "connection": "<your-connection-id>",
           "custom_keywords": [],
           "ignored_keywords": [],
           "disable_global_custom_keywords": false,
           "disable_global_ignored_keywords": false,
           "disable_built_in_keywords": false,
           "in_data_discovery": {
             "enabled": true,
             "row_sample_size": 500,
             "custom_rules": [
               {
                 "name": "temp_staff",
                 "pattern": "temp.*"
               }
             ],
             "non_sensitive_rules": [
               {"pattern": "retired.*"}
             ]
           },
           "schemas": []
         }'
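
Building the request body in code avoids JSON syntax slips such as trailing commas. The following Python sketch assembles a body containing every field marked as required in the parameter table. The helper name `build_schema_discovery_payload` and the connection id value are hypothetical.

```python
import json


def build_schema_discovery_payload(
    connection_id,
    schemas=(),
    custom_keywords=(),
    ignored_keywords=(),
    in_data_discovery=None,
):
    """Return a request body for POST /api/schema-discovery/ as a dict."""
    payload = {
        "connection": connection_id,
        "custom_keywords": list(custom_keywords),
        "ignored_keywords": list(ignored_keywords),
        "disable_built_in_keywords": False,
        "disable_global_custom_keywords": False,
        "disable_global_ignored_keywords": False,
        # An empty list targets the schema configured on the connection.
        "schemas": list(schemas),
    }
    if in_data_discovery is not None:
        # Optional; when omitted, in-data discovery is not enabled.
        payload["in_data_discovery"] = in_data_discovery
    return payload


# Hypothetical connection id, for illustration only.
payload = build_schema_discovery_payload(
    "<your-connection-id>",
    in_data_discovery={
        "enabled": True,
        "row_sample_size": 500,
        "custom_rules": [{"name": "temp_staff", "pattern": "temp.*"}],
        "non_sensitive_rules": [{"pattern": "retired.*"}],
    },
)
print(json.dumps(payload, indent=2))
```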

GET /api/schema-discovery/{connection_id}/

Authorization: User token or API token.

Retrieve schema discovery results.

GET /api/schema-discovery/{connection_id}/ Parameters

No parameters.

GET /api/schema-discovery/{connection_id}/ Response

Status Code	Description
`200`	A JSON serialised object containing a Schema Discovery object and a Run object.

Field	Type	Description
`data`	`object`	A Schema Discovery.
`last_sdd_run`	`object`	A JSON serialised Run object.

Schema Discovery Object

Schema Discovery objects have the following fields:

Field	Type	Description
`options`	`object`	List of `ignored_keywords` and `customised_keywords`.
`schemas`	`list[object]`	List of schema objects each with `name` and list of `tables`. `tables` contain `name` and a list of `columns`.
`sd_version`	`string`	Schema discovery version e.g. "1.1.1".

Schema Discovery Column Object

Column objects have the following fields:

Field	Type	Description
`name`	`string`	The column name
`data_type`	`string`	The data type for this field e.g `varchar`, `integer`, `numeric`, `timestamp without time zone`.
`categories`	`list[string]`	A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom.
`max_length`	`number`	The column length
`description`	`string`	The reason the column was flagged as sensitive.
`foreign_keys`	`list[object]`	A list of foreign key objects containing `name` and `referenced_column`.
`is_unique_key`	`boolean`	Whether the column is a unique key.
`numeric_scale`	`number`	If the `data_type` is `numeric` this refers to the maximum number of decimal places.
`ruleset_match`	`boolean`	The type of information detected by sensitive data discovery, used internally by the ruleset generator to suggest a suitable masking rule.
`in_data_result`	`list[object]`	A list of In Data matches.
`is_primary_key`	`boolean`	Whether the column is a primary key.
`numeric_precision`	`number`	If the `data_type` is `numeric` this refers to the maximum number of digits present.
`constraint_columns`	`list[string]`	A list of column names participating in the constraint.
`pk_constraint_name`	`string`	The name of the primary key constraint.
`uk_constraint_name`	`string`	The name of the unique key constraint.
`unique_index_names`	`list[string]`	A list of index names for this column.
`allow_in_data_override`	`boolean`	A boolean representing that a Sensitive Data match can be overridden by an In Data match.
`referencing_foreign_keys`	`list[string]`	A list of foreign keys referencing this column.

GET /api/schema-discovery/v2/{run_id}/

Authorization: User token or API token.

Retrieve schema discovery results with server-side pagination, sorting, filtering and searching.

GET /api/schema-discovery/v2/{run_id}/ Parameters

Field	Type	Required	Location	Description
`limit`	`number`	No	Query Parameter	The maximum number of results to return. Defaults to 50 if not set.
`offset`	`number`	No	Query Parameter	The index of the first item to be returned within the whole set of results. Defaults to 0 if not set.
`ordering`	`string`	No	Query Parameter	Controls the sort order of results. Specify one or more columns separated by commas. To specify descending sort order, prefix the field name with '-'. Defaults to `?ordering=schema,table,column`.
`search`	`string`	No	Query Parameter	Performs a case-insensitive partial match on the schema, table or column name.
`categories`	`string`	No	Query Parameter	Filters the categories (Data Classifications) using an exact match. Valid values are `PII`, `PHI` or `PCI`.
`data_type`	`string`	No	Query Parameter	Filters the data type name (excluding the length or numeric precision/scale), e.g. `?data_type=varchar`.
`description`	`string`	No	Query Parameter	Searches the description using a case-insensitive partial match.
`flagged_by`	`string`	No	Query Parameter	Filters the Flagged By field using an exact match. Valid values are `In-Data Discovery` or `Metadata Discovery`.
`is_sensitive`	`boolean`	No	Query Parameter	Filters the results for sensitive matches. Set to `true` to return only sensitive results, or `false` for only non-sensitive.
`constraint`	`string`	No	Query Parameter	Filters for results with either Primary or Unique constraints. Valid values are `primary` or `unique` (case-insensitive).
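
As a rough illustration of how these query parameters combine, the Python sketch below assembles a request URL using only the standard library. The helper name `build_results_url`, the host name and the run id are placeholders chosen for this example and are not part of the DataMasque API.

```python
from urllib.parse import urlencode


def build_results_url(host, run_id, **params):
    """Build a v2 schema discovery URL; parameters set to None are omitted."""
    query = {}
    for key, value in params.items():
        if value is None:
            continue
        # Booleans are written as lowercase `true`/`false`, as in the table above.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    url = f"https://{host}/api/schema-discovery/v2/{run_id}/"
    return f"{url}?{urlencode(query)}" if query else url


url = build_results_url(
    "datamasque.example.com",  # hypothetical host
    123,                       # hypothetical run id
    limit=100,
    ordering="schema,-table",  # schema ascending, table descending
    categories="PII",
    is_sensitive=True,
)
print(url)
```

The comma in `ordering` is percent-encoded as `%2C` by `urlencode`, which is standard URL encoding for query strings.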

GET /api/schema-discovery/v2/{run_id}/ Response

Status Code	Description
`200`	A JSON serialised object containing pagination meta-data and a list of Schema Discovery Result objects.

Field	Type	Description
`count`	`number`	Total number of unpaginated results.
`next`	`string`	Pagination link to the next page of results.
`previous`	`string`	Pagination link to the previous page of results.
`results`	`list[object]`	A list of Schema Discovery Result objects.
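
One way to collect every result is to follow the `next` link until it is empty. The sketch below is a minimal Python illustration, assuming that `next` is `null` on the last page (this is not stated above). The `fetch_page` callable is a stand-in for whatever HTTP client you use, so the example runs without a server.

```python
def iter_results(first_url, fetch_page):
    """Yield every result, following `next` links until none remains.

    `fetch_page` is any callable that takes a URL and returns the decoded
    JSON body as a dict (for example, a thin wrapper around an HTTP client).
    """
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page["results"]
        # Assumption: `next` is null (None) on the last page.
        url = page.get("next")


# Stand-in for real HTTP calls: two fake pages keyed by URL.
fake_pages = {
    "page-1": {"count": 3, "next": "page-2", "previous": None,
               "results": [{"id": 1}, {"id": 2}]},
    "page-2": {"count": 3, "next": None, "previous": "page-1",
               "results": [{"id": 3}]},
}
ids = [result["id"] for result in iter_results("page-1", fake_pages.__getitem__)]
print(ids)
```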

Schema Discovery Result object

Schema Discovery Result objects have the following fields:

Field	Type	Description
`id`	`number`	A unique `id` for the result.
`column`	`string`	The column name.
`table`	`string`	The table name.
`schema`	`string`	The schema name.
`data`	`object`	A v2 Schema Discovery Column Object.

v2 Schema Discovery Column Object

v2 Schema Discovery Column objects have the following fields:

Field	Type	Description
`data_type`	`string`	The data type for this field, e.g. `varchar`, `integer`, `numeric`, `timestamp without time zone`, with the `max_length` or `numeric_precision` and `numeric_scale` appended.
`max_length`	`number`	The column length.
`foreign_keys`	`list[object]`	A list of foreign key objects containing `name` and `referenced_column` as a string containing `schema.table.column`.
`discovery_matches`	`list[object]`	A list of Discovery Match objects sorted by priority.
`numeric_precision`	`number`	The numeric precision of the column, the meaning of which depends on the database and data type.
`numeric_scale`	`number`	The numeric scale of the column, the meaning of which depends on the database and data type. Default is `null`.
`constraint_columns`	`list[string]`	A list of column names participating in the constraint.
`pk_constraint_name`	`string`	The name of the primary key constraint. Default is `null`.
`uk_constraint_name`	`string`	The name of the unique key constraint. Default is `null`.
`unique_index_names`	`list[string]`	A list of index names for this column.
`referencing_foreign_keys`	`list[object]`	A list of foreign keys referencing this column. The objects contain a `name` and `referencing_column` as a string containing `schema.table.column`.
`categories`	`list[string]`	A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom.
`description`	`string`	The reason the column was flagged as sensitive (blank for non-sensitive columns).
`flagged_by`	`string`	Indicates whether the column was flagged by `In-Data Discovery` or `Metadata Discovery` (or blank for non-sensitive columns).
`constraint`	`string`	Indicates whether the column is either a `Primary` or `Unique` key.

Discovery Match Object

Discovery Match objects have the following fields:

Field	Type	Description
`label`	`string`	A name for the rule that flagged the match. Can also be `custom`, `custom_non_sensitive` or `ignore` for user-defined match rules.
`categories`	`list[string]`	A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom.
`flagged_by`	`string`	Indicates whether the column was flagged by `In-Data Discovery` or `Metadata Discovery`.
`description`	`string`	The reason the column was flagged as sensitive.

Generating Rulesets

POST /api/generate-ruleset/

Authorization: User token only.

Returns a ruleset string for selected columns of a connection.

Prerequisite: Make sure you have the schema-discovery report for the connection specified in the post data.

POST /api/generate-ruleset/[v1/|v2/] `curl` example

curl -X POST "https://<your-datamasque-host>/api/generate-ruleset/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "connection": "<your-connection-id>",
           "selected_columns": {
             "schema_name": {
               "table_name": [
                 "column_name_1",
                 "column_name_2"
               ]
             }
           }
         }'

POST /api/generate-ruleset/[v1/] Response

The default response for a version 1 request is a JSON-encoded string containing the ruleset YAML. The trailing /v1/ is optional for version 1.

POST /api/generate-ruleset/v2/ Response

The version 2 response is plain text containing the ruleset YAML.

POST /api/generate-file-ruleset/

Authorization: User token only.

Returns a ruleset string for selected data of a file connection.

The selected data is a list of file groups, each of which contains:

A list of files which are the full paths relative to the base directory of the connection.
A list of locators, which are either JSON locators or strings containing a single header column name. JSON locators must be formatted as lists even if they consist of a single element.

Each file group will generate at least one task in the ruleset (either mask_file or mask_tabular_file).

Generally, only one task will be generated per file group, but in cases where files have different extensions, delimiters or encodings, multiple tasks will be generated to cater for these settings.

File groups should only contain files of the same type, that is, don't specify object files, multi-record files, or tabular files in the same file group. If multiple file types are mixed, then the generated ruleset will attempt to split into multiple tasks, but the results may be unexpected.

Prerequisite: A file discovery run must have completed on the connection specified in the POST data, so that the files can be selected from its file-discovery report.
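
The grouping rules above can be sketched in Python. This is only an illustration: the helper `build_selected_data` is hypothetical, and grouping purely by file extension is a simplifying assumption (files with the same extension but different delimiters or encodings may still produce more than one task).

```python
import json
from collections import defaultdict
from pathlib import PurePosixPath


def build_selected_data(files, locators_by_extension):
    """Group files by extension and attach the locators for each group."""
    groups = defaultdict(list)
    for path in files:
        groups[PurePosixPath(path).suffix.lower()].append(path)
    return [
        {"files": paths, "locators": locators_by_extension[ext]}
        for ext, paths in groups.items()
    ]


selected_data = build_selected_data(
    ["file1.json", "file1.csv", "file2.json", "file2.csv"],
    {
        # JSON locators are always lists, even for a single element.
        ".json": [["age"], ["users", "*", "name"]],
        # Tabular files use plain header column names.
        ".csv": ["gender", "address"],
    },
)
print(json.dumps({"selected_data": selected_data}, indent=2))
```

Keeping one file type per group avoids relying on the generator to split mixed groups into separate tasks.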

POST /api/generate-file-ruleset/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/generate-file-ruleset/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "connection": "<your-connection-id>",
           "selected_data": [
             {
               "files": ["file1.json", "file2.json"],
               "locators": [["age"], ["users", "*", "name"]]
             },
             {
               "files": ["file1.csv", "file2.csv"],
               "locators": ["gender", "address"]
             },
             [repeated for different file groups…]
           ]
         }'

POST /api/generate-file-ruleset/ Response

The response is plain text containing the ruleset YAML.

Generate Ruleset Result Object

Generate Ruleset Result objects are returned by DataMasque for the async-generate-ruleset family of APIs. They have the following fields:

Field	Type	Description
`connection`	`string`	The ID of the connection for which a ruleset is being generated.
`generated_ruleset`	`string`	The ruleset that has been generated. Not applicable if ruleset generation was started using the `from-csv` API.
`status`	`string`	The status of the ruleset generation task. One of `queued`, `running`, `finished`, `failed`, or `cancelled`.
`status_message`	`string`	A status message describing the progress of the ruleset generation task.
`error_message`	`string`	The error message when generating the ruleset has failed.
`last_updated`	`string`	The timestamp of the last update to this `Generate Ruleset Result`, in ISO 8601 format.

Requests that use Generate Ruleset Result Object:

Endpoint to query to get the generated ruleset:

When ruleset generation is started using POST /api/async-generate-ruleset/{connection_id}/ and completes successfully, the generated ruleset is available in the generated_ruleset field of the response from the GET /api/async-generate-ruleset/{connection_id}/ API endpoint.

When ruleset generation is started using POST /api/async-generate-ruleset/{connection_id}/from-csv/ and completes successfully, a ZIP file of all generated rulesets can be downloaded from the GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ API endpoint.
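
A client typically polls the GET endpoint and branches on the `status` field. The Python sketch below shows only that decision logic, with a hypothetical helper name `next_step`. It takes an already-decoded Generate Ruleset Result as a dict, so it makes no assumptions about the HTTP client in use.

```python
def next_step(result, started_from_csv=False):
    """Map a Generate Ruleset Result dict to the caller's next action."""
    status = result["status"]
    if status in ("queued", "running"):
        return "poll again"
    if status == "finished":
        # from-csv runs produce a ZIP of rulesets instead of `generated_ruleset`.
        return "download ZIP" if started_from_csv else "read generated_ruleset"
    # Remaining statuses are `failed` and `cancelled`.
    return f"stop: {result.get('error_message') or status}"


print(next_step({"status": "running"}))
print(next_step({"status": "finished"}, started_from_csv=True))
print(next_step({"status": "failed", "error_message": "connection refused"}))
```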

GET /api/async-generate-ruleset/{connection_id}/

Authorization: User token only.

Returns the progress and result of ruleset generation for a connection.

GET /api/async-generate-ruleset/{connection_id}/ Parameters

Field	Type	Required	Location	Description
`connection_id`	`string`	Yes	URL Path	The `id` of the `Connection`.

GET /api/async-generate-ruleset/{connection_id}/ Responses

Status Code	Description
`200`	A JSON serialised Generate Ruleset Result Object.
`404`	Not Found: No connection with the specified ID exists.

GET /api/async-generate-ruleset/{connection_id}/ `curl` example

curl "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
     -H "Authorization: Token <your-api-token>"

POST /api/async-generate-ruleset/{connection_id}/

Authorization: User token only.

Starts generating a ruleset for selected columns of a database connection or for selected data of a file connection.

POST /api/async-generate-ruleset/{connection_id}/ Parameters

Field	Type	Required	Location	Description
`connection_id`	`string`	Yes	URL Path	The `id` of the `Connection`.

POST /api/async-generate-ruleset/{connection_id}/ Responses

Status Code	Description
`201`	A JSON serialised Generate Ruleset Result Object.
`404`	Not Found: No connection with the specified ID exists.

POST /api/async-generate-ruleset/{connection_id}/ `curl` example

For generating rulesets on database connections:

curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "selected_columns": {
             "schema_name": {
               "table_name": [
                 "column_name_1",
                 "column_name_2"
               ]
             }
           }
         }'

For generating rulesets for file connections:

curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "selected_data": [
             {
               "files": ["file1.json", "file2.json"],
               "locators": [["age"], ["users", "*", "name"]]
             },
             {
               "files": ["file1.csv", "file2.csv"],
               "locators": ["gender", "address"]
             },
             [repeated for different file groups…]
           ]
         }'

DELETE /api/async-generate-ruleset/{connection_id}/

Authorization: User token only.

Cancels ruleset generation currently in progress for a connection. If the ruleset generation has already finished, deletes any generated ruleset.

Warning! Deletion of the generated ruleset is irreversible.

DELETE /api/async-generate-ruleset/{connection_id}/ Parameters

Field	Type	Required	Location	Description
`connection_id`	`string`	Yes	URL Path	The `id` of the `Connection`.

DELETE /api/async-generate-ruleset/{connection_id}/ Responses

Status Code	Description
`200`	Ruleset generation cancelled before any results were processed.
`204`	Ruleset generation had finished. The generated ruleset has been deleted.
`404`	Not Found: No connection with the specified ID exists.

DELETE /api/async-generate-ruleset/{connection_id}/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
     -H "Authorization: Token <your-api-token>"

POST /api/async-generate-ruleset/{connection_id}/from-csv/

Authorization: User token only.

Start generating a ruleset for selected columns of a database connection. The columns are specified by modifying the CSV report retrieved from the /api/runs/{run_id}/db-discovery-results/report/ endpoint. Specifically, there is one discovered database column detailed in each row of the CSV report, and if that column is to be included in ruleset generation, the Selected column of the CSV should be marked with 1, true, y or yes (case-insensitive).
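
Marking rows in the report can be scripted. The Python sketch below sets the `Selected` column to `yes` for chosen columns using the standard `csv` module. The `Schema`, `Table` and `Column` header names and the helper `select_columns` are assumptions made for illustration; check the headers in your own downloaded report before adapting it.

```python
import csv
import io


def select_columns(report_csv, wanted):
    """Return the report text with `Selected` set to `yes` for wanted columns.

    `wanted` is a set of (schema, table, column) tuples.
    """
    reader = csv.DictReader(io.StringIO(report_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Assumption: the report identifies columns with these three headers.
        if (row["Schema"], row["Table"], row["Column"]) in wanted:
            row["Selected"] = "yes"
        writer.writerow(row)
    return out.getvalue()


# A made-up two-row report, not real DataMasque output.
report = (
    "Schema,Table,Column,Selected\n"
    "public,users,email,\n"
    "public,users,created_at,\n"
)
marked = select_columns(report, {("public", "users", "email")})
print(marked)
```

All other columns and rows are written back unchanged, so the edited file keeps the structure of the original report.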

POST /api/async-generate-ruleset/{connection_id}/from-csv/ Parameters

Field	Type	Required	Location	Description
`connection_id`	`string`	Yes	URL Path	The `id` of the `Connection`.
`csv_or_zip_file`	`file`	Yes	Request Body	The byte content of the CSV, or the ZIP file containing one or more CSVs.
`target_size_bytes`	`int`	No	Request Body	Generate rulesets of approximately this size in bytes. Defaults to 512,000 (500 KiB).
`force_run`	`boolean`	No	Request Body	If set to `true`, cancel any existing ruleset generation and restart it. Defaults to `false`.

POST /api/async-generate-ruleset/{connection_id}/from-csv/ Responses

Status Code	Description
`201`	A JSON serialised Generate Ruleset Result Object.
`404`	Not Found: No connection with the specified ID exists.

POST /api/async-generate-ruleset/{connection_id}/from-csv/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/from-csv/" \
     -H "Authorization: Token <your-api-token>" \
     -F "csv_or_zip_file=@selected_report.csv" \
     -F "target_size_bytes=250000"

GET /api/async-generate-ruleset/{connection_id}/download-rulesets/

Authorization: User token only.

Once ruleset generation invoked via POST /api/async-generate-ruleset/{connection_id}/from-csv/ is completed, query this endpoint to download the rulesets in a ZIP file.

GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ Parameters

Field	Type	Required	Location	Description
`connection_id`	`string`	Yes	URL Path	The `id` of the `Connection`.

GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ Responses

Status Code	Description
`200`	Returns a streamed ZIP file containing the generated rulesets.
`400`	Bad Request: The ruleset generation is still in progress, or has failed.

If an error response is received, query the GET /api/async-generate-ruleset/{connection_id}/ endpoint to check the status of ruleset generation.

GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ `curl` example

curl -o rulesets.zip "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/download-rulesets/" \
     -H "Authorization: Token <your-api-token>"

File Data Discovery

POST /api/run-file-data-discovery/

Authorization: User token only.

Executes data discovery against files on a file connection. The file connection must already be configured. Use the UUID of the file connection in the request, which can be found:

at the top of the page when you view the connection in the DataMasque UI, or
in the URL when you view the connection in the DataMasque UI, or
in the id field of the Connection Object.

Discovery keywords

By default, DataMasque's extensive list of built-in keywords is used to identify which fields and attributes in the files are considered sensitive. DataMasque matches the name of the field or attribute against each keyword using a case-insensitive, partial match. For example, a field named credit_CARD_NUMBER will match the Credit card keyword.

You can use various options to refine the set of discovery keywords.

Setting disable_built_in_keywords to true means that the built-in keyword list linked above will not be used. In this case, the discovery process will use only the keywords given in custom_keywords and any configured global custom keywords.
The custom_keywords option allows you to specify a list of additional keywords to match on. Any fields or attributes whose name includes one or more of those keywords will be flagged as sensitive.
A match between a field or attribute's name and a value in the ignored_keywords list will cause a field or attribute to be completely excluded from the results, even if its name suggests that the field may contain sensitive data.
Global keywords, as configured through the Settings page of the DataMasque UI, are also considered unless disable_global_custom_keywords and/or disable_global_ignored_keywords (as appropriate) are set to true.

Warning! Ignored keywords have priority. If a field or attribute name matches a built-in, global, or custom keyword and also matches an entry in ignored_keywords or a global ignored keyword, the field or attribute will not be included in the discovery results.

Specifying files to discover

Supported filetypes for discovery are:

JSON (.json)
NDJSON (.ndjson)
Parquet (.parquet)
CSV (.csv)

Note: Files' types are determined solely by the file extension, not by their content.

Use the include, skip and recurse options to control which files are included in the discovery process. These have the same syntax and meaning as in a from_file task definition. If none of these options are included, DataMasque will run discovery against all files (of the supported filetypes) in the base directory specified on the connection, but will not recurse into subdirectories.

See also Choosing files to mask with include/skip for an exact specification of the behaviour of, and some common examples of, include and skip rules.

Warning! If a file matches both an include and a skip rule, that file will not be included in data discovery.

Note: Take care to correctly escape backslashes in include or skip regexes. For example, if you want to match a literal dot (.) in a filename, the regex needs to escape the dot with a backslash and this backslash must itself be escaped as part of JSON encoding rules, since the request body is in JSON format. So you might use the JSON object {"regex": "file\\.[0-9]+\\.csv"}, representing the regex file\.[0-9]+\.csv which will match file.53.csv but not filex53.csv.
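
The double escaping can be checked locally. The Python sketch below lets `json.dumps` produce the escaped form and then tests the pattern with Python's own `re` module. Full-string matching with `re.fullmatch` is an assumption used for the local check only and may differ from how DataMasque applies include and skip rules to paths.

```python
import json
import re

pattern = r"file\.[0-9]+\.csv"         # the regex itself, single backslashes
body = json.dumps({"regex": pattern})  # JSON encoding doubles each backslash
print(body)

# Decoding the JSON recovers the original single-backslash regex.
assert json.loads(body)["regex"] == pattern

# Local check of what the pattern matches.
print(bool(re.fullmatch(pattern, "file.53.csv")))  # True
print(bool(re.fullmatch(pattern, "filex53.csv")))  # False
```

Building the request body with a JSON library, rather than writing the escaped string by hand, avoids most escaping mistakes.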

Encoding of CSV files

The encoding option controls how DataMasque interprets CSV files. The default encoding is utf-8. Refer to Python Standard Encodings for a list of supported encodings.

Supported Parquet column types

The list of Parquet column data types supported by file data discovery is the same as the list of supported data types for Parquet masking. See the list of supported data types here.

For complex columns (those of struct, map and list type), also called nested columns, all fields of scalar data type within the columns are discovered separately. In the file discovery reports, the locators for the individual scalar fields are given as JSON paths with the column name as the first element.

Note: This differs from the syntax used for masking these fields where the column name must be specified separately from the path to the field within the column.

For example, with a column named staff of type map<string, struct<name: string, employee_id: int64, salary_history: list<float>>> (a map where the keys are strings and the values are a structure type with keys name, employee_id, and salary_history, the latter being a list), the discovered fields will all have one of the following path formats:

staff/<key value>/name
staff/<key value>/employee_id
staff/<key value>/salary_history/*

where <key value> is a key in the top-level map. Notice that all list indices are replaced with the wildcard * and treated as a single field.

Custom and ignored keywords match on the name of the individual field (such as name in the above example), not the name of the column. For list fields, they match on the last string element of the path (ignoring list indices), for example salary_history.
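
To make the path format concrete, the Python sketch below walks one nested value and produces paths in the style described above. Plain dicts and lists stand in for Parquet map, struct and list values. The helper names are hypothetical, and the sketch models only the naming scheme, not DataMasque's actual discovery code.

```python
def discovered_paths(column, value):
    """Return the set of discovery-style paths for one nested column value."""
    paths = set()

    def walk(prefix, node):
        if isinstance(node, dict):    # struct fields or map entries
            for key, child in node.items():
                walk(f"{prefix}/{key}", child)
        elif isinstance(node, list):  # all list indices collapse to `*`
            for child in node:
                walk(f"{prefix}/*", child)
        else:                         # scalar leaf
            paths.add(prefix)

    walk(column, value)
    return paths


def keyword_target(path):
    """Name that custom and ignored keywords are compared against."""
    return [part for part in path.split("/") if part != "*"][-1]


staff = {
    "alice": {"name": "Alice", "employee_id": 7, "salary_history": [50.0, 55.0]},
}
print(sorted(discovered_paths("staff", staff)))
print(keyword_target("staff/alice/salary_history/*"))
```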

In-data discovery options

The in_data_discovery parameter on the API request body allows you to control whether and how the discovery process uses in-data discovery to refine sensitive data matches. It is an object parameter with the following fields.

You must specify the enabled parameter (true or false).
Optional parameters are a row_sample_size (positive integer), force (a boolean), a list of zero or more custom_rules, and a list of zero or more non_sensitive_rules.
Each entry in custom_rules is an object with parameters name and pattern, where name is any user-defined name and pattern is a regex.
Each entry in non_sensitive_rules is an object with a pattern parameter, again a regex.
row_sample_size defaults to 1000.
force defaults to false.
custom_rules and non_sensitive_rules are empty by default.

When enabled, in-data discovery applies the built-in rules, alongside any specified custom_rules and non_sensitive_rules, matching against the data within tabular file columns, or scalar values within JSON documents or complex Parquet columns.

Warning! Non-sensitive rules have priority. If a field or attribute name matches a keyword, built-in IDD rule or custom IDD rule, and also matches a non-sensitive rule, the field or attribute will be marked in the discovery results as Custom Non-Sensitive.

The row_sample_size controls how many samples the in-data discovery process will examine to try to identify the type of data. Configure the row_sample_size according to your needs, bearing in mind that in-data discovery samples only the first <row_sample_size> rows or values encountered when processing the file (so the first 1000 rows in a CSV file, for example, with the default sample size). Use of very large sample sizes can slow down data discovery and consume a lot of RAM (see also this table of memory limits for in-data discovery).

If your files are small and/or consistent in that they have the same kind of data present in most or all rows, then a sample size of 100-500 rows is sufficient.
If you have large files with sparse data (many nulls) and/or differing data formats within a column or JSON path, use a larger sample size.

When enabled, force will run IDD on a column even if schema discovery has already flagged the column as containing sensitive data.
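
The options and defaults described above can be gathered into a small builder. The Python sketch below uses a hypothetical helper, `in_data_discovery_options`, which fills in the documented defaults and performs some local sanity checks. Compiling each pattern with Python's `re` module is an assumption made for early error detection and may not match the regex dialect that DataMasque uses.

```python
import json
import re


def in_data_discovery_options(enabled, row_sample_size=1000, force=False,
                              custom_rules=(), non_sensitive_rules=()):
    """Build an `in_data_discovery` object, applying the documented defaults."""
    if not isinstance(row_sample_size, int) or row_sample_size <= 0:
        raise ValueError("row_sample_size must be a positive integer")
    for rule in custom_rules:
        if set(rule) != {"name", "pattern"}:
            raise ValueError("each custom rule needs `name` and `pattern`")
    for rule in [*custom_rules, *non_sensitive_rules]:
        re.compile(rule["pattern"])  # raises re.error on a malformed pattern
    return {
        "enabled": bool(enabled),
        "row_sample_size": row_sample_size,
        "force": force,
        "custom_rules": list(custom_rules),
        "non_sensitive_rules": list(non_sensitive_rules),
    }


options = in_data_discovery_options(
    True,
    row_sample_size=500,
    custom_rules=[{"name": "temp_staff", "pattern": "temp.*"}],
    non_sensitive_rules=[{"pattern": "retired.*"}],
)
print(json.dumps({"in_data_discovery": options}, indent=2))
```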

POST /api/run-file-data-discovery/ Parameters

Field	Type	Required	Description
`connection`	`string`	Yes	The `id` of the `Connection`.
`in_data_discovery`	`object`	No	In-data discovery options. An object containing `enabled`, `row_sample_size`, `custom_rules`, `non_sensitive_rules` and `force` options. Defaults to `{enabled: false}`.
`custom_keywords`	`array[string]`	No	List of keywords that, where a field or attribute's name matches one or more of the keywords, indicates the column contains sensitive data. Default value is an empty list.
`ignored_keywords`	`array[string]`	No	List of keywords that, where a field or attribute's name matches one or more of the keywords, indicates the field or attribute should be excluded from the schema discovery results. Default value is an empty list.
`disable_global_custom_keywords`	`boolean`	No	If set to `true`, then the user-defined global set of custom keywords will not be used to flag fields or attributes as sensitive. Default value is `false`.
`disable_global_ignored_keywords`	`boolean`	No	If set to `true`, then the user-defined global set of ignored keywords will not be used to exclude fields or attributes from the discovery results. Default value is `false`.
`disable_built_in_keywords`	`boolean`	No	If set to `true`, then DataMasque's built-in list of keywords will not be used to flag fields or attributes as sensitive. Default value is `false`.
`include`	`array[object]`	No	Files to discover, specified as `glob` or `regex`. Default value is an empty list, meaning everything will be included.
`skip`	`array[object]`	No	Files to exclude, specified as `glob` or `regex`. Default value is an empty list, meaning nothing will be skipped.
`recurse`	`boolean`	No	Whether to recurse into subdirectories of the base directory, or of items matched by `include`. Default value is `false`.
`encoding`	`string`	No	File byte encoding. Only applies to CSV files. Default value is `utf-8`.
`workers`	`integer`	No	Number of workers. Refer to the File Ruleset Generator page for information. Allowed range is 1-32. Defaults to 1.

POST /api/run-file-data-discovery/ Responses

Data discovery runs asynchronously as a special type of masking run. This API endpoint returns a Run object which contains an id field. Use the GET /api/runs/{id}/ endpoint with this run ID to query the status of the data discovery process. To retrieve the file discovery results when the run is complete, use the GET /api/runs/{id}/file-discovery-results/ endpoint with this run ID.

Status Code	Description
`201`	A JSON serialised Run object.

POST /api/run-file-data-discovery/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/run-file-data-discovery/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "connection": "<your-connection-id>",
           "in_data_discovery": {
             "enabled": true,
             "row_sample_size": 500,
             "custom_rules": [
               {
                 "name": "temp_staff",
                 "pattern": "temp.*"
               }
             ],
             "non_sensitive_rules": [
               {"pattern": "retired.*"}
             ],
             "force": false
           },
           "custom_keywords": ["id1", "id2"],
           "ignored_keywords": ["ignore1"],
           "include": [
             {"glob": "*.ndjson"},
             {"glob": "*.json"}
           ],
           "skip": [
             {"regex": "backup/staff[0-9]+\\.json"}
           ],
           "recurse": true,
           "workers": 4
         }'

GET /api/runs/{id}/file-discovery-results/

Authorization: User token or API token.

Retrieve file discovery results.

GET /api/runs/{id}/file-discovery-results/ Parameters

Field	Type	Required	Location	Description
`id`	`integer`	Yes	URL Path	The `id` of the `Run`.

GET /api/runs/{id}/file-discovery-results/ Responses

Status Code	Description
`200`	A JSON serialised list of File Discovery Result objects.

GET /api/runs/{id}/file-discovery-results/ `curl` example

curl "https://<your-datamasque-host>/api/runs/{id}/file-discovery-results/" \
     -H "Authorization: Token <your-api-token>"

GET /api/runs/{id}/file-discovery-results/ Example response

This shows a group of results where one file was discovered with a Metadata match on Passenger ID, an In-Data match on Name and no matches on Ticket.

[
  {
    "id": 1,
    "connection": {
      "id": "f795b7f1-d654-41c8-bb7c-db741d81dc19",
      "name": "example_file_source"
    },
    "file_type": "csv",
    "files": [
      {
        "path": "example.csv",
        "delimiter": ",",
        "encoding": "utf-8",
        "file_type": "csv"
      }
    ],
    "results": [
      {
        "locator": "PassengerId",
        "matches": [
          {
            "label": "identifiers",
            "categories": ["PII", "PHI"],
            "flagged_by": "Metadata Discovery",
            "description": "Identification"
          }
        ],
        "data_types": ["int"]
      },
      {
        "locator": "Name",
        "matches": [
          {
            "label": "name",
            "categories": ["PII", "PCI", "PHI"],
            "flagged_by": "In-Data Discovery",
            "description": "Full Names"
          }
        ],
        "data_types": ["str"]
      },
      {
        "locator": "Ticket",
        "matches": [],
        "data_types": ["str"]
      }
    ]
  }
]

File Discovery Result Object

File Discovery Result objects have the following fields:

Field	Type	Description
`id`	`integer`	The `id` of the `File Discovery Result`.
`connection`	`object`	The UUID and name identifying the connection used for this `File Discovery Result`.
`file_type`	`string`	The file type (`csv`, `parquet`, `json`, or `ndjson`). File Discovery Results are grouped by file type.
`files`	`array[object]`	A list of File objects.
`results`	`array[object]`	A list of Result objects.

File Object

File objects have the following fields:

Field	Type	Description
`path`	`string`	The discovered file's path, relative to the base directory of the connection.
`file_type`	`string`	The file type (`csv`, `parquet`, `json`, or `ndjson`).
`delimiter`	`Optional[string]`	For delimited text files, the field separator, e.g. "," for CSV.
`encoding`	`Optional[string]`	The file encoding, for example "utf-8".

Result Object

Result objects have the following fields:

Field	Type	Description
`locator`	`array['string' or 'int']` or `string`	Either a JSON locator or a column name.
`matches`	`array['object']`	A list of Match objects.
`data_types`	`array['string']`	The list of data types found for this field: `int`, `long`, `str`, `date`, `time`, `year`, `timestamp`, `boolean`, `float`, or `decimal`.

Match Object

Match objects have the following fields:

Field	Type	Description
`categories`	`array['string']`	A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom.
`flagged_by`	`string`	Whether the column was flagged for sensitive data through in-data discovery or through the standard sensitive data discovery / keyword matching process. `Metadata Discovery` or `In-Data Discovery`.
`description`	`string`	The name of the rule which caused the column to be flagged for sensitive data.
`label`	`string`	Machine-readable representation of `description`.
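
As an illustration of how these nested objects fit together, the Python sketch below reduces a decoded response to the locators that have at least one match, keyed by file group. The helper name `matched_locators` is hypothetical and the sample data is abridged from the example response. A match is not necessarily a sensitive one, because non-sensitive rules also produce matches.

```python
def matched_locators(discovery_results):
    """Map each file group to {locator: [flagged_by, ...]} for matched fields."""
    summary = {}
    for group in discovery_results:
        paths = tuple(f["path"] for f in group["files"])
        flagged = {}
        for result in group["results"]:
            if not result["matches"]:
                continue
            locator = result["locator"]
            # JSON locators are lists; join them so they can be used as dict keys.
            key = "/".join(map(str, locator)) if isinstance(locator, list) else locator
            flagged[key] = [match["flagged_by"] for match in result["matches"]]
        summary[paths] = flagged
    return summary


sample = [{
    "files": [{"path": "example.csv", "file_type": "csv"}],
    "results": [
        {"locator": "PassengerId",
         "matches": [{"label": "identifiers", "flagged_by": "Metadata Discovery"}]},
        {"locator": "Name",
         "matches": [{"label": "name", "flagged_by": "In-Data Discovery"}]},
        {"locator": "Ticket", "matches": []},
    ],
}]
summary = matched_locators(sample)
print(summary)
```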

Oracle Wallets

GET /api/oracle-wallets/

Authorization: User token only.

Returns a list of Oracle wallets. These are used to connect to encrypted Oracle connections.

GET /api/oracle-wallets/ Parameters

No parameters.

GET /api/oracle-wallets/ Responses

Status Code	Description
`200`	A JSON serialised list of Oracle wallets.

GET /api/oracle-wallets/ `curl` example

curl "https://<your-datamasque-host>/api/oracle-wallets/" \
     -H "Authorization: Token <your-api-token>"

POST /api/oracle-wallets/

Authorization: User token only.

Create a new Oracle wallet.

POST /api/oracle-wallets/ Parameters

Field	Type	Required	Location	Description
`name`	`string`	Yes	Form Field	The name of the Oracle Wallet.
`zip_archive`	`file`	Yes	Form Field	The Zip archive file.

POST /api/oracle-wallets/ Responses

Status Code	Description
`201`	A JSON serialised Oracle wallet object of the wallet created.

POST /api/oracle-wallets/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/oracle-wallets/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "name=<wallet_name>" \
     -F "zip_archive=@</path/to/your/file.zip>"

DELETE /api/oracle-wallets/{id}/

Authorization: User token only.

Delete the Oracle wallet with the specified id.

DELETE /api/oracle-wallets/{id}/ Parameters

No parameters.

DELETE /api/oracle-wallets/{id}/ Responses

Status Code	Description
`204`	Operation succeeded.

DELETE /api/oracle-wallets/{id}/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/oracle-wallets/{id}/" \
     -H "Authorization: Token <your-api-token>"

Git Setting Object

Git settings are global for the DataMasque instance and can only be updated by an admin user. Git settings are updated on the Settings page in the DataMasque UI.

Git Setting objects have the following fields:

Field	Type	Description
`git_repository_url`	`string`	The URL of where the Git repository is hosted.
`git_branch`	`string`	The name of the Git branch from which DataMasque will push or pull.
`git_directory_path`	`string`	The directory that DataMasque will push and pull rulesets to, relative to the root of the repository. Note that DataMasque does not support pushing/pulling rulesets in subdirectories of this directory.

GET /api/git-setting/

Authorization: User token only.

Retrieve a Git Setting Object with information about the DataMasque instance's Git settings.

GET /api/git-setting/ Parameters

No parameters.

GET /api/git-setting/ Responses

Status Code	Description
`200`	A JSON serialized Git Setting Object for the DataMasque instance.

GET /api/git-setting/ `curl` example

curl "https://<your-datamasque-host>/api/git-setting/" \
     -H "Authorization: Token <your-api-token>"

GET /api/git-setting/user/

Authorization: User token only.

Retrieve a Git Setting Object with information about the DataMasque instance's Git settings. If the current user has specified a git_directory_path, this will be present in the response. Otherwise, the git_directory_path will be the global one for the DataMasque instance.

GET /api/git-setting/user/ Parameters

No parameters.

GET /api/git-setting/user/ Responses

Status Code	Description
`200`	A JSON serialized Git Setting Object for the DataMasque instance.

GET /api/git-setting/user/ `curl` example

curl "https://<your-datamasque-host>/api/git-setting/user/" \
     -H "Authorization: Token <your-api-token>"

SSH Key Object

SSH Key objects have the following fields:

Field	Type	Description
`name`	`string`	The specified filename of the SSH Key file.
`date_uploaded`	`string`	The ISO 8601 datetime string of when the user uploaded the SSH key.

GET /api/git-ssh-key/

Authorization: User token only.

Retrieve an SSH Key Object for information about the current user's uploaded SSH Key.

GET /api/git-ssh-key/ Parameters

No parameters.

GET /api/git-ssh-key/ Responses

Status Code	Description
`200`	A JSON serialized SSH Key Object, which is the most recent SSH Key Upload for the user who made the request.

GET /api/git-ssh-key/ `curl` example

curl "https://<your-datamasque-host>/api/git-ssh-key/" \
     -H "Authorization: Token <your-api-token>"

PUT /api/git-ssh-key/

Authorization: User token only.

Upload an SSH Key to be used to access a Git remote repository.

Warning: A user may have only one SSH key at a time, so the existing key will be deleted and replaced with the uploaded key for the user making the request.

PUT /api/git-ssh-key/ Parameters

Field	Type	Required	Location	Description
`key_file`	`file`	Yes	Form Field	The SSH Key file.
`name`	`string`	Yes	Form Field	The name of the file.

PUT /api/git-ssh-key/ Responses

Status Code	Description
`200`	A JSON serialized SSH Key Object, which is the most recent SSH Key Upload for the user making the request.

PUT /api/git-ssh-key/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/git-ssh-key/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "key_file=@</path/to/your/file>" \
     -F "name=<your-ssh-key-filename>"

DELETE /api/git-ssh-key/

Authorization: User token only.

Delete the current user's uploaded SSH key.

DELETE /api/git-ssh-key/ Parameters

No parameters.

DELETE /api/git-ssh-key/ Responses

Status Code	Description
`204`	The SSH key associated with the requesting user has been deleted.

DELETE /api/git-ssh-key/ `curl` example

curl -X DELETE "https://<your-datamasque-host>/api/git-ssh-key/" \
     -H "Authorization: Token <your-api-token>"

GET /api/ruleset-git/

Authorization: User token only.

Pull the content of a specific ruleset given its commit ID. The current user's Git SSH key is used for authentication.

How File Paths Are Built

Internally, DataMasque generates the file name by appending the specified extension to ruleset_name. The file name is then appended to git_directory_path (from the DataMasque Git Settings) to build the full file path. For example, for a ruleset_name of My Ruleset, an extension of .yml and a git_directory_path of masking/rulesets, the file masking/rulesets/My Ruleset.yml will be retrieved, with its contents as of the specified commit ID.
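
The path construction described above can be sketched in Python as follows. This is a client-side illustration only, not DataMasque's internal code. The function name `build_ruleset_path` is hypothetical, and the allowed extensions are taken from the parameter tables for the ruleset-git endpoints.

```python
import posixpath

ALLOWED_EXTENSIONS = (".yml", ".yaml")


def build_ruleset_path(git_directory_path, ruleset_name, extension=".yml"):
    """Return the repository-relative path for a ruleset file."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension must be one of {ALLOWED_EXTENSIONS}")
    # File name = ruleset name + extension; full path = directory + file name.
    return posixpath.join(git_directory_path, ruleset_name + extension)


print(build_ruleset_path("masking/rulesets", "My Ruleset"))
```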

GET /api/ruleset-git/ Parameters

Field	Type	Required	Location	Description
`commit_id`	`string`	Yes	Query Parameter	The Git commit ID for the ruleset.
`ruleset_name`	`string`	Yes	Query Parameter	The name of the ruleset. Used to build the path as per How File Paths Are Built above.
`extension`	`string`	No	Query Parameter	The file extension appended to the ruleset name. Must be `.yml` or `.yaml`. Defaults to `.yml` if missing.

GET /api/ruleset-git/ Responses

Status Code	Description
`200`	A JSON object with a single key, `config_yaml`, that contains the ruleset content

GET /api/ruleset-git/ `curl` example

curl "https://<your-datamasque-host>/api/ruleset-git/?commit_id=<your-full-commit-id>&ruleset_name=<your-ruleset-name>&extension=.yaml" \
     -H "Authorization: Token <your-api-token>"
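
Ruleset names may contain spaces (for example My Ruleset), so the query parameters need URL encoding. The following Python sketch builds the request URL with the standard library. The function name `ruleset_git_url` and the host `datamasque.example.com` are placeholders chosen for the example.

```python
from urllib.parse import urlencode


def ruleset_git_url(host, commit_id, ruleset_name, extension=None):
    """Build the GET /api/ruleset-git/ URL with encoded query parameters."""
    params = {"commit_id": commit_id, "ruleset_name": ruleset_name}
    if extension is not None:
        # Optional parameter; only sent when the caller provides it.
        params["extension"] = extension
    return f"https://{host}/api/ruleset-git/?{urlencode(params)}"


print(ruleset_git_url(
    "datamasque.example.com",
    "abcdef1234567890abcdef1234567890abcdef12",
    "My Ruleset",
    ".yaml",
))
```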

POST /api/ruleset-git/

Authorization: User token only.

Commit then push changes upstream for a specific ruleset.

POST /api/ruleset-git/ Parameters

Field	Type	Required	Location	Description
`commit_message`	`string`	Yes	Request Body	The Git commit message for the ruleset changes.
`ruleset_name`	`string`	Yes	Request Body	The name of the ruleset. Used to build the path as per How File Paths Are Built above.
`extension`	`string`	No	Request Body	The file extension to save with the ruleset name. Must be `.yml` or `.yaml`. Defaults to `.yml` if missing.
`ruleset_content`	`string`	Yes	Request Body	The YAML contents of the ruleset.

POST /api/ruleset-git/ Responses

Status Code	Description
`200`	Operation succeeded.

POST /api/ruleset-git/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/ruleset-git/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{
           "commit_message": "Update ruleset",
           "ruleset_name": "<your-ruleset-name>",
           "extension": ".yml",
           "ruleset_content": "version: \"1.0\"\ntasks:\n  - type: run_data_discovery"
         }'
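
Escaping a multi-line YAML ruleset by hand inside a JSON string is error-prone. As a sketch, the request body can instead be generated with Python's `json` module, which escapes quotes and newlines automatically. The helper name `ruleset_git_payload` is hypothetical; the field names are those in the parameter table above.

```python
import json

# The same ruleset content as in the curl example, written as ordinary text.
ruleset_yaml = 'version: "1.0"\ntasks:\n  - type: run_data_discovery'


def ruleset_git_payload(commit_message, ruleset_name, ruleset_content, extension=".yml"):
    """Serialise the POST /api/ruleset-git/ request body as a JSON string."""
    return json.dumps({
        "commit_message": commit_message,
        "ruleset_name": ruleset_name,
        "extension": extension,
        "ruleset_content": ruleset_content,
    })


payload = ruleset_git_payload("Update ruleset", "My Ruleset", ruleset_yaml)
print(payload)
```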

GET /api/ruleset-git/files/

Authorization: User token only.

This endpoint lists the git_directory_path in the remote repository configured for the DataMasque instance. It considers any files ending in .yml to be ruleset files, and will fetch the list of commits for each of them. It does not enter into subdirectories of git_directory_path.

GET /api/ruleset-git/files/ Parameters

No parameters.

GET /api/ruleset-git/files/ Responses

Example Response

The response is a JSON object in which each key is the name of a file with a .yml extension in the git_directory_path. Each file entry is an array of objects, each containing a commit ID, commit date and commit message.

{
  "Ruleset1.yml": [
    {"commit": "f061s…46756", "date": "2024-01-10 12:31:45", "message": "Added Column"},
    {"commit": "64c18…1a279", "date": "2024-01-09 10:19:13", "message": "Removed Column"}
  ],
  "Another Ruleset.yml": [
    {"commit": "377f5…b32f4", "date": "2023-12-25 12:31:45", "message": "Update rule"}
  ]
}

Response Codes

Status Code	Description
`200`	A JSON serialized list of ruleset names and their associated Git commit history.

GET /api/ruleset-git/files/ `curl` example

curl "https://<your-datamasque-host>/api/ruleset-git/files/" \
     -H "Authorization: Token <your-api-token>"
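
Once the response has been parsed, a client will often want the most recent commit for each ruleset file. The Python sketch below does this for the example response shown above. It assumes dates always use the `YYYY-MM-DD HH:MM:SS` format seen in that example, and the function name `latest_commits` is hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example response


def latest_commits(files):
    """Return {file name: commit entry with the most recent date}."""
    latest = {}
    for name, commits in files.items():
        if commits:
            latest[name] = max(
                commits, key=lambda c: datetime.strptime(c["date"], DATE_FORMAT)
            )
    return latest


# The example response from above; commit IDs are abbreviated there and kept as-is.
example = {
    "Ruleset1.yml": [
        {"commit": "f061s…46756", "date": "2024-01-10 12:31:45", "message": "Added Column"},
        {"commit": "64c18…1a279", "date": "2024-01-09 10:19:13", "message": "Removed Column"},
    ],
    "Another Ruleset.yml": [
        {"commit": "377f5…b32f4", "date": "2023-12-25 12:31:45", "message": "Update rule"},
    ],
}

for name, commit in latest_commits(example).items():
    print(name, commit["date"], commit["message"])
```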

Exporting DataMasque Configuration

To keep a backup of the data stored in DataMasque, you can export it to a Zip file. This is done by making a GET request to /api/export/v1/. Optionally, you can also specify the export_type query parameter to select which data to include in the export. The parameter may be specified multiple times to specify different types of data to include in the same Zip file.

The Zip file has the following structure. Some files or directories may be absent if the chosen export_type excluded them from the export.

Path	Type	Description
`manifest.json`	File	A JSON file containing metadata about the export and other files in the Zip.
`rulesets/database/`	Directory	A directory containing database masking rulesets in YAML format.
`rulesets/file/`	Directory	A directory containing file masking rulesets in YAML format.

Export Types

Currently, only the export of rulesets is supported, so there is no difference between specifying rulesets as the export_type and omitting the export_type parameter completely.

The following export types may be used to control the data included in the export archive:

Export Type	Description
`all`	Include all data described in this table. This is the default if no `export_type` is selected.
`rulesets`	Include only rulesets.

`manifest.json` format

The manifest.json file contains the following information:

metadata: Metadata about the export archive.
- version: The version format of the export file.
- exported_at: The UTC date and time the export was created, in ISO format.
data: Information about the files included in the export archive.
- rulesets: A list of metadata about the exported rulesets. Each object in the list contains the id, name and type (database or file) for each exported ruleset.
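
As a sketch of how this manifest might be consumed, the Python below reads manifest.json from an export archive and lists the rulesets it describes. The exact JSON nesting (`data` → `rulesets`) is inferred from the description above, and the manifest values in the sample archive (including the version string) are invented for the example.

```python
import io
import json
import zipfile


def list_exported_rulesets(zip_bytes):
    """Return (id, name, type) tuples from manifest.json, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        if "manifest.json" not in archive.namelist():
            return None
        manifest = json.loads(archive.read("manifest.json"))
    return [(r["id"], r["name"], r["type"]) for r in manifest["data"]["rulesets"]]


# Build a tiny in-memory archive with an invented manifest to exercise the function.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("manifest.json", json.dumps({
        "metadata": {"version": "1", "exported_at": "2024-02-11T09:15:07Z"},
        "data": {"rulesets": [{"id": "9d641e97-adf7-4f22-9089-afc3711bf222",
                               "name": "Ruleset 01", "type": "database"}]},
    }))
    archive.writestr("rulesets/database/Ruleset 01.yml", 'version: "1.0"\n')

print(list_exported_rulesets(buffer.getvalue()))
```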

Ruleset Export Naming

When rulesets are exported to a Zip archive, they are stored in either the rulesets/database/ directory (for database rulesets) or the rulesets/file/ directory (for file rulesets).

The name of the file is built by appending .yml to the ruleset name. For example:

The database masking ruleset named Ruleset 01 would be exported to rulesets/database/Ruleset 01.yml.
The file masking ruleset named Ruleset F would be exported to rulesets/file/Ruleset F.yml.

Note: Rulesets that have been deleted from DataMasque are not visible in the ruleset list in the DataMasque dashboard, but are still retained in the DataMasque database because runs reference them. These "archived" rulesets are not included in the Zip export.

GET /api/export/v1/

Authorization: User token only.

Export DataMasque data to a Zip archive in the Version 1 format. The filename of the archive will be based on the export type selected, and contain the current UTC date and time. For example: datamasque_export_rulesets_20240211-091507.zip.

GET /api/export/v1/ Parameters

Field	Type	Required	Location	Description
`export_type`	`string`	No	Query Parameter	The type of data to export (see Export Types for a full list). Defaults to `all`.

Multiple export types may be specified by using multiple export_type query parameters. For example, /api/export/v1/?export_type=type_a&export_type=type_b.
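
A repeated query parameter like this can be produced in Python with `urllib.parse.urlencode` and `doseq=True`. The sketch below is illustrative: the function name `export_url` and the host are placeholders, and `type_a` / `type_b` are the same stand-in values used in the sentence above.

```python
from urllib.parse import urlencode


def export_url(host, export_types=()):
    """Build the export URL, repeating export_type once per requested type."""
    query = urlencode({"export_type": list(export_types)}, doseq=True)
    url = f"https://{host}/api/export/v1/"
    # With no export types the parameter is omitted, so the server default applies.
    return f"{url}?{query}" if query else url


print(export_url("datamasque.example.com", ["type_a", "type_b"]))
print(export_url("datamasque.example.com"))
```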

GET /api/export/v1/ `curl` example

When using curl, specify the -O flag to output the response to disk, and the -J flag to allow the response to specify the name (as per the example above).

curl "https://<your-datamasque-host>/api/export/v1/" \
     -H "Authorization: Token <your-api-token>" \
     -J -O

A Zip file named like datamasque_export_all_20240211-091507.zip will be saved to the current directory.

Importing DataMasque Configuration

A DataMasque export Zip can be imported to a DataMasque install using the /api/import/v1/ API endpoint.

For the best import experience, use a Zip that has been exported from DataMasque and that contains a manifest.json file. However, a Zip with the correct folder structure may also be created manually, even if it is missing manifest.json. DataMasque will import the information, but automatic conflict resolution of duplicate rulesets will not work as well. The difference between inclusion/exclusion of manifest.json is explained below.

Zip Exports From DataMasque With `manifest.json`

Since Zip exports created by DataMasque include the UUID of each exported item, this can be used to determine which items already exist.

When importing rulesets:

If a ruleset with a given ID exists during import:
- If the ruleset is archived, then it will be restored and its name and content are updated from the imported ruleset.
- If the ruleset is not archived, then no action is taken with that ruleset. The content in the DataMasque instance is unchanged.
If a ruleset is found with a matching name, and the contents are identical, then no action is taken. The content in the DataMasque instance is unchanged.
If a ruleset is found with a matching name, but the contents are different, then a new ruleset is created by appending Copy to the name. For example, if Ruleset A exists, then the content will be uploaded to a ruleset Ruleset A Copy. An incrementing number will be added until an unused name is found, for example, Copy 1, Copy 2, etc.
If no ruleset with the given ID or name exists, then it is created.

Because of these rules, imports of the same Zip archive may be repeated multiple times without duplicating content.
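
The rules above can be modelled as a small decision function. The Python below is a sketch of the documented behaviour, not DataMasque's implementation. The function names are hypothetical, name matching considers every existing ruleset (a simplification), and the status names are those documented for the POST /api/import/v1/ response. Passing an `id` of `None` models an archive without manifest.json.

```python
def unused_copy_name(name, existing_names):
    """Return 'name Copy', then 'name Copy 1', 'name Copy 2', ... until unused."""
    candidate = f"{name} Copy"
    n = 0
    while candidate in existing_names:
        n += 1
        candidate = f"{name} Copy {n}"
    return candidate


def plan_import(incoming, existing):
    """Return (status, imported_name) for one incoming ruleset.

    incoming: dict with "id" (None when there is no manifest), "name", "content".
    existing: list of dicts with "id", "name", "content" and "archived".
    """
    by_id = {r["id"]: r for r in existing}
    by_name = {r["name"]: r for r in existing}
    match = by_id.get(incoming["id"]) if incoming["id"] else None
    if match is not None:
        # ID already known: restore if archived, otherwise leave untouched.
        return ("RESTORED" if match["archived"] else "NOT_CREATED", incoming["name"])
    same_name = by_name.get(incoming["name"])
    if same_name is None:
        return ("CREATED", incoming["name"])
    if same_name["content"] == incoming["content"]:
        return ("NOT_CREATED", incoming["name"])
    return ("CREATED_DUPLICATE", unused_copy_name(incoming["name"], set(by_name)))


existing = [
    {"id": "a", "name": "Ruleset A", "content": "x", "archived": False},
    {"id": "e", "name": "Ruleset E", "content": "old", "archived": True},
]
print(plan_import({"id": None, "name": "Ruleset A", "content": "y"}, existing))
```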

Zip Exports Created Without `manifest.json`

A Zip export archive may be created manually, provided the file structure is correct. That is, it matches the structure outlined in Exporting DataMasque Configuration. Without a manifest.json, the ID of rulesets is not known, so matching is done based on the name, using the following rules:

If a ruleset is found with a matching name, and the contents are identical, then no action is taken. The content in the DataMasque instance is unchanged.
If a ruleset is found with a matching name, but the contents are different, then a new ruleset is created by appending Copy to the name. For example, if Ruleset A exists, then the content will be uploaded to a ruleset Ruleset A Copy. An incrementing number will be added until an unused name is found, for example, Copy 1, Copy 2, etc.
If no ruleset with the given name exists, then it is created.

Because the IDs of the imported rulesets are not known, re-running an import without a manifest.json may result in duplicated rulesets with identical content.
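
A manually created archive of this kind could be assembled with Python's `zipfile` module, as sketched below. The directory layout and the `.yml` naming follow Exporting DataMasque Configuration. The function name `build_import_zip` and the sample ruleset contents are invented for the example.

```python
import io
import zipfile


def build_import_zip(rulesets):
    """Build an import archive (without manifest.json) and return its bytes.

    rulesets: iterable of (ruleset_type, name, yaml_content) tuples,
    where ruleset_type is "database" or "file".
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for ruleset_type, name, content in rulesets:
            if ruleset_type not in ("database", "file"):
                raise ValueError(f"unknown ruleset type: {ruleset_type!r}")
            # File name = ruleset name + ".yml", inside the per-type directory.
            archive.writestr(f"rulesets/{ruleset_type}/{name}.yml", content)
    return buffer.getvalue()


zip_bytes = build_import_zip([
    ("database", "Ruleset 01", 'version: "1.0"\n'),
    ("file", "Ruleset F", 'version: "1.0"\n'),
])
print(zipfile.ZipFile(io.BytesIO(zip_bytes)).namelist())
```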

POST /api/import/v1/

Authorization: User token only.

Import a DataMasque export Zip file. The response will contain a list of actions taken for each included object.

POST /api/import/v1/ Parameters

Field	Type	Required	Location	Description
`zip_archive`	`file`	Yes	Form Field	The exported Zip archive file.

POST /api/import/v1/ Responses

The response of an import request contains information about the resources that were imported, grouped by resource type. An example response is shown below.

{
  "data": {
    "rulesets": {
      "metadata": {"processed":  6, "created":  2, "restored": 1, "error":  1},
      "data": [
        {
          "exported_name": "Ruleset A", 
          "exported_id": "9d641e97-adf7-4f22-9089-afc3711bf222",
          "imported_name": "Ruleset A", 
          "imported_id": "9d641e97-adf7-4f22-9089-afc3711bf222",
          "ruleset_type": "database",
          "status": "NOT_CREATED", 
          "message": "A ruleset with ID \"9d641e97-adf7-4f22-9089-afc3711bf222\"  already exists, and was not changed."
        },
        {
          "exported_name": "Ruleset B", 
          "exported_id": null,
          "imported_name": "Ruleset B Copy", 
          "imported_id": "04ea20f0-ad4c-498e-881f-b0bc79d83ba7",
          "ruleset_type": "file",
          "status": "CREATED_DUPLICATE", 
          "message": "A ruleset named \"Ruleset B\" already exists, so ruleset \"Ruleset B Copy\" was created."
        },
        {
          "exported_name": "Ruleset C", 
          "exported_id": null,
          "imported_name": "Ruleset C", 
          "imported_id": "7d731d55-68c9-400e-a790-e052afe789cc",
          "ruleset_type": "database", 
          "status": "NOT_CREATED", 
          "message": "A ruleset named \"Ruleset C\" exists with identical content."
        },
        {
          "exported_name": "Ruleset D", 
          "exported_id": null,
          "imported_name": "Ruleset D", 
          "imported_id": "99eeffd3-3f65-4ed7-8ad1-a31a539b7b2c",
          "ruleset_type": "file",
          "status": "CREATED", 
          "message": "Ruleset named \"Ruleset D\" did not exist, and was created."
        },
        {
          "exported_name": "Ruleset E", 
          "exported_id": "c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e",
          "imported_name": "Ruleset E", 
          "imported_id": "c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e",
          "ruleset_type": "database",
          "status": "RESTORED", 
          "message": "An archived ruleset with ID \"c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e\" has been restored and overwritten with the new name and content."
        },
        {
          "exported_name": "Ruleset F", 
          "exported_id": "abc123",
          "imported_name": null, 
          "imported_id": null,
          "ruleset_type": "database",
          "status": "ERROR", 
          "message": "Import of ruleset with ID \"abc123\" due to error: invalid ID." 
        }
      ]
    }
  }
}

The metadata for each item type shows the number of items of that type processed, and how many of each one were created, restored or had an error.

Each data object contains information about the import of that item. The fields are:

exported_name: The name of the ruleset in the export Zip archive.
exported_id: The ID of the ruleset from the export Zip archive. Only available if a manifest.json file is present, otherwise this will be null.
imported_name: The name that the ruleset was imported to. Usually this will match exported_name. This will only be null on error. If the ruleset was not imported due to it already existing, this will still match exported_name.
imported_id: The ID that the ruleset was imported to. This will be generated if exported_id was null, otherwise it will be expected to match exported_id (even if the data was not changed). imported_id will be null on error.
ruleset_type: One of database or file.
status: The status of the import of this ruleset. One of:
- NOT_CREATED: Ruleset was not created due to the ID existing or content being identical.
- CREATED_DUPLICATE: A ruleset with that name existed, so it was imported with a new name (in imported_name).
- CREATED: A ruleset with that ID or name did not exist, so was created.
- RESTORED: An archived ruleset has been restored and overwritten with the new name and content from an imported ruleset.
- ERROR: There was an error creating the ruleset. Check message for details.
message: A human-readable message describing the action taken or error that occurred. Messages may change between DataMasque versions, so they should not be relied on to determine the outcome of an import. Instead, refer to the status field.

The status code of the response, as shown in the table below, indicates whether any resources were created.

Status Code	Description
`200`	The import was successful, indicating either no changes (e.g. the uploaded rulesets already existed) or the successful restoration of some rulesets.
`201`	The import was successful, and one or more rulesets were created.
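
Because messages may change between versions, a client should branch on the status field. The Python sketch below counts the outcomes in a parsed import response and collects any errors. The function name `summarise_import` is hypothetical, and the sample is a shortened version of the example response above.

```python
from collections import Counter


def summarise_import(response_json):
    """Return (status counts, [(exported_name, message)] for items with errors)."""
    items = response_json["data"]["rulesets"]["data"]
    counts = Counter(item["status"] for item in items)
    errors = [(item["exported_name"], item["message"])
              for item in items if item["status"] == "ERROR"]
    return counts, errors


# Shortened version of the example response; only the fields used here are kept.
sample_response = {"data": {"rulesets": {
    "metadata": {"processed": 3, "created": 1, "restored": 0, "error": 1},
    "data": [
        {"exported_name": "Ruleset C", "status": "NOT_CREATED",
         "message": "A ruleset named \"Ruleset C\" exists with identical content."},
        {"exported_name": "Ruleset D", "status": "CREATED",
         "message": "Ruleset named \"Ruleset D\" did not exist, and was created."},
        {"exported_name": "Ruleset F", "status": "ERROR",
         "message": "Import of ruleset with ID \"abc123\" due to error: invalid ID."},
    ],
}}}

counts, errors = summarise_import(sample_response)
print(dict(counts))
print(errors)
```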

POST /api/import/v1/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/import/v1/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "zip_archive=@</path/to/your/datamasque_export_all_20240211-091507.zip>"
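
For clients that cannot shell out to curl, the same multipart upload can be built with Python's standard library. The sketch below only constructs the request object and does not send it. The function name `build_import_request`, the default file name and the host are assumptions made for the example.

```python
import urllib.request
import uuid


def build_import_request(host, token, zip_bytes, filename="export.zip"):
    """Build (but do not send) a multipart POST carrying the zip_archive field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="zip_archive"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/zip\r\n\r\n",
        zip_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"https://{host}/api/import/v1/",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


request = build_import_request("datamasque.example.com", "<your-api-token>", b"PK")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request; the status code of the response can then be compared against the table above.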

Other API Requests

POST /api/users/admin-install/

Authorization: Anonymous, only when no user has been created.

Verify the DataMasque installation, and set up an admin account.

POST /api/users/admin-install/ Parameters

Field	Type	Required	Location	Description
`email`	`string`	Yes	Request Body	The email address for the admin account being created.
`username`	`string`	Yes	Request Body	The username for the admin account being created.
`password`	`string`	Yes	Request Body	The password for the user.
`re_password`	`string`	Yes	Request Body	The password for the user again, to confirm the password entered above.
`allowed_hosts`	`array[string]`	Yes	Request Body	A list of hostnames, IP addresses or CIDR networks that will be allowed to access DataMasque upon installation.
`aws_ec2_instance_id`	`string`	Required only for AWS Marketplace installations.	Request Body	The ID of the AWS EC2 instance.
`contract_license_type`	`string`	Required only for AWS Contract Product installations.	Request Body	For contract products, the type of product to check out. Must be either `business` or `enterprise`.

POST /api/users/admin-install/ Responses

Status Code	Description
`201`	A JSON serialised User object, with an extra `warnings`* item.

* Any non-critical warnings that were generated during installation are included in the warnings item of the response. This is an array of strings.

POST /api/users/admin-install/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/users/admin-install/" \
     -H "Content-Type: application/json" \
     -d '{
           "email": "<your-admin-email>",
           "username": "<your-username>",
           "password": "<your-admin-password>",
           "re_password": "<your-admin-password>",
           "allowed_hosts": ["masque.local"],
           "aws_ec2_instance_id": "<your-instance-id>"
         }'

Installation Info Object

A JSON object showing the state of the current installation with the following data:

Field	Type	Description
`is_aws_marketplace`	`boolean`	Whether the current installation has been installed from the AWS marketplace.
`installed`	`boolean`	If the current installation has been successfully installed.
`is_smtp_configured`	`boolean`	If SMTP has been configured on the DataMasque instance.
`is_saml_sso_configured`	`boolean`	If SAML SSO has been enabled on the DataMasque instance.

Requests that use Installation Info Object:
- GET /api/app/check/

GET /api/app/check/

Authorization: User token or API token.

Checks to verify if DataMasque has successfully been installed.

GET /api/app/check/ Parameters

No parameters.

GET /api/app/check/ Responses

Status Code	Description
`200`	A JSON serialised Installation Info Object.

GET /api/app/check/ `curl` example

curl "https://<your-datamasque-host>/api/app/check/" \
     -H "Authorization: Token <your-api-token>"

POST /api/license-upload/

Authorization: User token only.

Uploads a licence file to DataMasque.

POST /api/license-upload/ Parameters

Field	Type	Required	Location	Description
`license_file`	`file`	Yes	Form Field	The licence file to upload.

POST /api/license-upload/ Responses

Status Code	Description
`200`	Operation succeeded.

POST /api/license-upload/ `curl` example

curl -X POST "https://<your-datamasque-host>/api/license-upload/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: multipart/form-data" \
     -F "license_file=@</path/to/your/license_file.lic>"

GET /api/license/contract-type/

Authorization: User token only.

For Cloud Contract Offer licenses, retrieve the type of license that has been configured to be used.

GET /api/license/contract-type/ Parameters

No parameters.

GET /api/license/contract-type/ Responses

Status Code	Description
`200`	License type retrieved.
`400`	The licensing method is not of Cloud Contract type, so retrieving the license type is not supported.
`404`	The license type has not yet been specified.

An example response is shown below.

{
  "contract_license_type": "business"
}

contract_license_type must be one of:

business
enterprise

GET /api/license/contract-type/ `curl` example

curl "https://<your-datamasque-host>/api/license/contract-type/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json"

PUT /api/license/contract-type/

Authorization: Admin User token only.

For Cloud Contract Offer licenses, set the type of license to check out.

PUT /api/license/contract-type/ Parameters

Field	Type	Required	Location	Description
`contract_license_type`	`string`	Yes	Request Body	The type of license to check out. Must be one of `business` or `enterprise`.

PUT /api/license/contract-type/ Responses

Status Code	Description
`201`	License type updated.
`400`	The licensing method is not of Cloud Contract type, so setting the license type is not supported, or the specified license type is invalid.

PUT /api/license/contract-type/ `curl` example

curl -X PUT "https://<your-datamasque-host>/api/license/contract-type/" \
     -H "Authorization: Token <your-api-token>" \
     -H "Content-Type: application/json" \
     -d '{"contract_license_type": "business"}'

Health Check Object

Various health statistics about the DataMasque instance:

Field	Type	Description
`worker_running`	`boolean`	`true` if the masking agent worker processes are healthy, `false` if there are no available workers.
`license_expired`	`boolean`	`true` if the licence is expired, `false` if the licence is not expired.
`license_renewal_in_days`	`integer`	Remaining days until licence expiry.
`license_limit_breach`	`object`	An object describing any licence breaches that have occurred. Each property on the object is the type of breach that has occurred. Each property value is an object containing `breach_type`, `message`, and `created_date` properties.

GET /api/health-check/

Authorization: User token or API token.

Get the basic health-check status of DataMasque.

GET /api/health-check/ Parameters

No parameters.

GET /api/health-check/ Responses

Status Code	Description
`200`	A JSON serialised Health Check Object.
`500`	A server error has occurred, for example because the license file is invalid. The known `error` will be returned.

GET /api/health-check/ `curl` example

curl "https://<your-datamasque-host>/api/health-check/" \
     -H "Authorization: Token <your-api-token>"
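
A monitoring script might turn the Health Check Object into a list of problems, as in the Python sketch below. The function name `health_problems`, the 30-day warning threshold and the sample values (including the breach type name) are assumptions made for the example. Only the field names come from the Health Check Object table.

```python
def health_problems(health, warn_days=30):
    """Return human-readable problems found in a parsed Health Check Object."""
    problems = []
    if not health["worker_running"]:
        problems.append("no masking workers available")
    if health["license_expired"]:
        problems.append("licence expired")
    elif health["license_renewal_in_days"] <= warn_days:
        problems.append(f"licence expires in {health['license_renewal_in_days']} days")
    # Each property is a breach type whose value holds breach_type, message, created_date.
    for breach_type, breach in (health.get("license_limit_breach") or {}).items():
        problems.append(f"licence breach ({breach_type}): {breach['message']}")
    return problems


# Invented sample values for illustration.
sample_health = {
    "worker_running": True,
    "license_expired": False,
    "license_renewal_in_days": 12,
    "license_limit_breach": {
        "example_breach": {"breach_type": "example_breach",
                           "message": "Example limit exceeded.",
                           "created_date": "2024-02-11T09:15:07Z"},
    },
}

for problem in health_problems(sample_health):
    print(problem)
```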
