API Endpoints
- Authentication
- User Object
- Profile Object
- Run Object
- GET /api/runs/
- POST /api/runs/
- GET /api/runs/{id}/
- POST /api/runs/{id}/cancel/
- GET /api/runs/validate/
- GET /api/runs/{id}/sdd-report/
- GET /api/runs/{id}/run-report/
- DELETE /api/runs/{id}/db-discovery-results/
- GET /api/runs/{id}/db-discovery-results/report/
- Option Object (referenced in the /api/runs/ POST request)
- Runlog Object
- Connection Object
- Connection Fileset Object
- Ruleset Object
- Seed Object
- Audit Log Object
- Schema Discovery
- Generating Rulesets
- POST /api/generate-ruleset/
- POST /api/generate-file-ruleset/
- Generate Ruleset Result Object
- GET /api/async-generate-ruleset/{connection_id}/
- POST /api/async-generate-ruleset/{connection_id}/
- DELETE /api/async-generate-ruleset/{connection_id}/
- POST /api/async-generate-ruleset/{connection_id}/from-csv/
- GET /api/async-generate-ruleset/{connection_id}/download-rulesets/
- File Data Discovery
- Oracle Wallets
- Git Related Endpoints
- Exporting DataMasque Configuration
- Importing DataMasque Configuration
- Other API Requests
Authentication
The DataMasque API uses token authentication.
Tokens are 40-character strings containing 0-9 and a-f.
Tokens should be included in the Authorization HTTP header for each request, with the word Token prepended.
For example:
GET /api/runs/123/
Authorization: Token abcdef1234567890abcdef1234567890abcdef12
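The header format above can be sketched in Python as follows. This is a minimal illustration, not part of any DataMasque client: the helper name auth_header is hypothetical, and the format check simply mirrors the documented token shape (40 characters from 0-9 and a-f).

```python
import re

# Documented token shape: 40 characters drawn from 0-9 and a-f.
TOKEN_PATTERN = re.compile(r"[0-9a-f]{40}")


def auth_header(token: str) -> dict[str, str]:
    """Build the Authorization header for an API Token or a User Token.

    Hypothetical helper for illustration only.
    """
    if not TOKEN_PATTERN.fullmatch(token):
        raise ValueError("expected 40 characters from 0-9 and a-f")
    # The word "Token" is prepended to the token value.
    return {"Authorization": f"Token {token}"}
```

The returned dictionary can be passed as the headers of any HTTP client request.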
There are two types of authentication tokens:
- A non-expiring API Token which has access to only some endpoints. You can get this token from the My Account page.
- A User Token that is valid for only 12 hours, but has access to all endpoints.
User tokens are granted by posting your username and password to the /api/auth/token/login/ endpoint.
The documentation for each endpoint on this page includes the type of token that is required to access it.
If an endpoint does not require the Authorization header, its authorization is noted as Anonymous.
The purpose and use case of each token type is explained below.
API Token
The API Token is a long-lived credential retrieved from the My Account page. It remains valid indefinitely, unless revoked (also on the My Account page). This token is valid only for use with specific API endpoints.
It is designed for automated scripts whose content may not be stored securely; for that reason, its access is mainly limited to controlling masking runs and checking their status.
User Token
The User Token is issued only after a successful login, either through the user interface or by making a request to /api/auth/token/login/.
This token offers enhanced security due to its limited lifetime, expiring after 12 hours, and is only accessible after a successful login. When accessing DataMasque through the UI, the token is granted as a cookie which will expire after 1 hour of inactivity.
It can be used against all API endpoints, and grants access based on the user account's permissions.
Both token types serve distinct purposes within the DataMasque API, offering a balance between security and usability.
POST /api/auth/token/login/
Authorization: Anonymous.
Log in with a username and password to obtain a user_token.
POST /api/auth/token/login/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
username | string | Yes | Request Body | The username of the user you are logging in as. |
password | string | Yes | Request Body | The password for the user. |
POST /api/auth/token/login/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised User object, including a short-lived User Token (valid for 12 hours). |
POST /api/auth/token/login/ Postman example
- Open Postman.
- Create a new request.
- Set the method to POST and the URL to https://<your-datamasque-host>/api/auth/token/login/.
- Under Headers, add Content-Type as a key and set the value to application/json.
- Select the Body tab, then the raw button.
- Include your DataMasque login details in this format in the text editor shown:
{
"username": "<your-username>",
"password": "<your-password>"
}
- Press the blue Send button to the right of the URL bar.
POST /api/auth/token/login/ curl example
curl -X POST "https://<your-datamasque-host>/api/auth/token/login/" \
-H "Content-Type: application/json" \
-d '{"username": "<your-username>", "password": "<your-password>"}'
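The same login request can be sketched with Python's standard library. The function name build_login_request and the host value are placeholders chosen for this example; the function only constructs the request so that it can be inspected before being sent.

```python
import json
import urllib.request


def build_login_request(host: str, username: str, password: str) -> urllib.request.Request:
    """Construct (but do not send) the login request. Hypothetical helper."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"https://{host}/api/auth/token/login/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to urllib.request.urlopen sends the request. The response body is the JSON serialised user object listed in the Responses table; this sketch makes no assumption about which field carries the token.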
User Object
User objects have the following fields:
Field | Type | Description |
---|---|---|
id | integer | The id of the User. |
username | string | The username for the User. Used when logging in. |
email | string | The email of the User. |
date_joined | date | The date the User was created. |
api_token | string | The API token for the User. |
has_temporary_password | boolean | Whether the user has a temporary password. If true, the user has not finalised their account creation. |
is_active | boolean | Whether the user account is active. If false, the account is disabled. |
is_staff | boolean | Whether the user is a staff account. |
is_superuser | boolean | Whether the account is a superuser and has admin privileges. |
is_sso_user | boolean | Whether the account is an SSO-enabled account. |
is_subscribed_to_sdd_updates | boolean | Whether the user has subscribed to sensitive data discovery updates. |
user_roles | array[string] | List of roles assigned to the user. The full list of roles can be found in User Roles. |
user_permissions | array[string] | List of permissions assigned to the user. |
User Roles
User objects may be assigned one or none of the roles below, as part of their user_roles array.
Role | Description |
---|---|
mask_runner | A user with this role is responsible solely for executing masking operations. |
mask_builder | In addition to the capabilities of the mask_runner role, this role includes the ability to create and manage rulesets. |
- Requests related to User Object:
GET /api/users/
Authorization: Admin User token only.
Returns a list of user accounts.
GET /api/users/ Parameters
No parameters.
GET /api/users/ Responses
Status Code | Description |
---|---|
200 | Returns a JSON serialised list of User objects. |
GET /api/users/ curl example
curl "https://<your-datamasque-host>/api/users/" \
-H "Authorization: Token <your-api-token>"
GET /api/users/{id}/
Authorization: Admin User token (to query any user's details) or the queried user's own token.
Retrieve information about a specific user.
GET /api/users/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the user. |
GET /api/users/{id}/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized User object for the specified user. |
403 | Forbidden: If the token does not have the required permissions. |
404 | Not Found: If the user with the specified id does not exist. |
GET /api/users/{id}/ curl example
curl "https://<your-datamasque-host>/api/users/{id}/" \
-H "Authorization: Token <your-api-token>"
GET /api/users/me/
Authorization: User token only.
Returns the details of the currently logged-in user.
GET /api/users/me/ Responses
Status Code | Description |
---|---|
200 | Returns a JSON serialised User object for the user that is currently logged in. |
GET /api/users/me/ curl example
curl "https://<your-datamasque-host>/api/users/me/" \
-H "Authorization: Token <your-api-token>"
POST /api/users/me/profile/ curl example
curl -X POST "https://<your-datamasque-host>/api/users/me/profile/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{"git_directory_path": "path/to/root"}'
POST /api/users/
Authorization: Admin User token only.
Create a new user account.
POST /api/users/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
username | string | Yes | Request Body | The username of the user being created. |
password | string | Yes | Request Body | The password for the new user account. |
re_password | string | Yes | Request Body | The password for the new user again, to confirm the password entered above. |
email | string | Yes | Request Body | The email address of the new user. |
role | array[string] | No | Request Body | The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Defaults to no role, which has the same permissions as mask_runner. |
POST /api/users/ Responses
Status Code | Description |
---|---|
201 | A JSON serialized User object for the created user. |
400 | Bad Request: If the request data is invalid or user creation is disabled. |
403 | Forbidden: If the token does not have the required permissions. |
POST /api/users/ curl example
curl -X POST "https://<your-datamasque-host>/api/users/" \
-H "Authorization: Token <your-admin-api-token>" \
-H "Content-Type: application/json" \
-d '{
"username": "<your-new-username>",
"password": "<your-new-password>",
"re_password": "<your-new-password>",
"email": "<your-new-email>",
"role": ["<your-user-role>"]
}'
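As a rough sketch, the request body for user creation could be assembled in Python like this. The function name is hypothetical, the role names come from the User Roles table, and role is sent as a list because the parameters table types it as array[string]; the password comparison is a client-side convenience, not a documented server rule.

```python
# Role names taken from the User Roles table.
VALID_ROLES = {"mask_runner", "mask_builder"}


def build_create_user_payload(username, password, re_password, email, roles=None) -> dict:
    """Assemble the JSON body for POST /api/users/. Hypothetical helper."""
    if password != re_password:
        raise ValueError("password and re_password must match")
    payload = {
        "username": username,
        "password": password,
        "re_password": re_password,
        "email": email,
    }
    if roles:
        unknown = sorted(set(roles) - VALID_ROLES)
        if unknown:
            raise ValueError(f"unknown role(s): {unknown}")
        # Omitting "role" entirely leaves the user with no role.
        payload["role"] = list(roles)
    return payload
```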
PATCH /api/users/{id}/
Authorization: Admin User token (to update any user's details) or the updating user's token.
Partially update information for a specified user.
PATCH /api/users/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the user to update. |
username | string | No | Request Body | The new username of the user. Only an Admin User can update this. |
email | string | No | Request Body | The new email address of the user. An Admin User or the user themselves can update this. |
user_roles | array[string] | No | Request Body | The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Only an Admin User can update this. |
PATCH /api/users/{id}/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized User object for the updated user. |
400 | Bad Request: If the request data is invalid. |
403 | Forbidden: If the token does not have the required permissions. |
404 | Not Found: If the user with the specified id does not exist. |
PATCH /api/users/{id}/ curl example
curl -X PATCH "https://<your-datamasque-host>/api/users/{id}/" \
-H "Authorization: Token <your-admin-api-token>" \
-H "Content-Type: application/json" \
-d '{
"username": "<your-new-username>",
"email": "<your-new-email>",
"user_roles": ["<user-role>"]
}'
PUT /api/users/{id}/
Authorization: Admin User token (to update any user's details) or the updating user's token.
Update information for a specified user.
PUT /api/users/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the user to update. |
username | string | No | Request Body | The new username of the user. Only an Admin User can update this. |
email | string | No | Request Body | The new email address of the user. An Admin User or the user themselves can update this. |
user_roles | array[string] | No | Request Body | The role(s) assigned to the user. If provided, the user will be added to the specified group(s). Only an Admin User can update this. |
PUT /api/users/{id}/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized User object for the updated user. |
400 | Bad Request: If the request data is invalid. |
403 | Forbidden: If the token does not have the required permissions. |
404 | Not Found: If the user with the specified id does not exist. |
PUT /api/users/{id}/ curl example
curl -X PUT "https://<your-datamasque-host>/api/users/{id}/" \
-H "Authorization: Token <your-admin-api-token>" \
-H "Content-Type: application/json" \
-d '{
"username": "<your-new-username>",
"email": "<your-new-email>",
"user_roles": ["<user-role>"]
}'
POST /api/users/{id}/reset-password/
Authorization: Admin User token only.
Reset the password for a specified user.
POST /api/users/{id}/reset-password/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the user whose password is being reset. |
POST /api/users/{id}/reset-password/ Responses
Status Code | Description |
---|---|
200 | Returns a JSON object with the new temporary password. |
403 | Forbidden: If the token does not have the required permissions. |
404 | Not Found: If the user with the specified id does not exist. |
POST /api/users/{id}/reset-password/ curl example
curl -X POST "https://<your-datamasque-host>/api/users/{id}/reset-password/" \
-H "Authorization: Token <your-admin-api-token>" \
-H "Content-Type: application/json"
Profile Object
A Profile object stores settings for a particular user.
There is a one-to-one relationship between a user and their Profile.
A Profile object may only be updated by the user that it belongs to (i.e. a user can only update their own Profile; admins cannot update the Profiles of other users).
Profile objects have the following fields:
Field | Type | Description |
---|---|---|
git_directory_path | string | The Git directory path for this user when pushing/pulling rulesets to/from a Git repository. |
Extra Field Notes
git_directory_path
This overrides the global Git directory for the DataMasque instance, for this user only. This value can be set even if Git integration is disabled, in which case it has no effect.
GET /api/users/me/profile/
Authorization: User token only.
Returns the Profile Object for the currently logged-in user.
GET /api/users/me/profile/ Parameters
No parameters.
GET /api/users/me/profile/ Responses
Status Code | Description |
---|---|
200 | Returns a JSON serialised Profile object, with fields as described above. |
GET /api/users/me/profile/ curl example
curl "https://<your-datamasque-host>/api/users/me/profile/" \
-H "Authorization: Token <your-api-token>"
POST /api/users/me/profile/
Authorization: User token only.
Updates the Profile object for the current user.
Partial updates are supported: only fields contained in the request are updated (i.e. if a field is not present in the request, its stored value remains unchanged).
POST /api/users/me/profile/ Responses
Status Code | Description |
---|---|
204 | The Profile update was successful. |
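The partial-update behaviour of POST /api/users/me/profile/ can be illustrated with a small Python sketch. The helper name build_profile_update is hypothetical; it serialises only the fields the caller actually supplies, so anything left out keeps its stored value on the server.

```python
import json


def build_profile_update(git_directory_path=None) -> bytes:
    """Serialise a partial Profile update. Hypothetical helper."""
    changes = {}
    # Only include fields that were explicitly provided; omitted
    # fields are left unchanged by a partial update.
    if git_directory_path is not None:
        changes["git_directory_path"] = git_directory_path
    return json.dumps(changes).encode("utf-8")
```

The resulting bytes are suitable as the JSON body of the POST request, sent with a Content-Type of application/json and a User token in the Authorization header.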
Run Object
Run objects have the following fields:
Field | Type | Description |
---|---|---|
id | integer | The id of the Run. Use this in API URLs that need a run id. |
name | string | The name of the Run. |
status | string | Indicates the Run status. The potential values are: queued, running, finished, finished_with_warnings, failed, cancelling, and cancelled. A status of finished or finished_with_warnings indicates the Run completed successfully; failed indicates an error. finished_with_warnings indicates there were warnings during the run; refer to the run log to view them. |
mask_type | string | The masking type of the Run; valid options are "database" or "file". |
connection | string | Deprecated, replaced by source_connection. |
connection_name | string | Deprecated, replaced by source_connection_name. |
source_connection | string | A UUID identifying the source connection used for this Run. For database connections, the source_connection also acts as the destination. |
source_connection_name | string | The name of the source connection of the Run. For database connections, the source_connection also acts as the destination. |
destination_connection | Optional[string] | A UUID identifying the destination connection used for this Run. Only present for file connections, as the source_connection also acts as the destination for database connections. |
destination_connection_name | Optional[string] | The name of the destination connection of the Run. Only present for file connections, as the source_connection also acts as the destination for database connections. |
ruleset | string | A UUID identifying the ruleset used for this Run. |
ruleset_name | string | Ruleset name of the Run. |
start_time | string | Start time of the Run, in ISO 8601 format. |
end_time | string | End time of the Run, in ISO 8601 format. |
options | object | An Option object of configuration for the Run. |
- Requests related to Run Objects:
- GET /api/runs/
- POST /api/runs/
- GET /api/runs/{id}/
- POST /api/runs/{id}/cancel/
- GET /api/runs/validate/
- GET /api/runs/{id}/sdd-report/
- Option Object (referenced in the /api/runs/ POST request)
GET /api/runs/
Authorization: User token or API token.
Get a list of DataMasque Runs.
GET /api/runs/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
mask_type | string | No | Query Parameter | The mask type of the Run. The potential values are: database, file. |
connection_ruleset_name | string | No | Query Parameter | The source connection name, destination connection name, or ruleset name of the Run. |
status | string | No | Query Parameter | The status of the Run. The potential values are: queued, running, finished, finished_with_warnings, failed, cancelling, and cancelled. |
GET /api/runs/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised list of Run objects. |
GET /api/runs/ curl example
curl "https://<your-datamasque-host>/api/runs/" \
-H "Authorization: Token <your-api-token>"
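The optional query parameters of GET /api/runs/ can be combined as in the following Python sketch. The function name runs_list_url and the host are placeholders for this example; the allowed values are the ones listed in the parameters table.

```python
from urllib.parse import urlencode

MASK_TYPES = {"database", "file"}
RUN_STATUSES = {
    "queued", "running", "finished", "finished_with_warnings",
    "failed", "cancelling", "cancelled",
}


def runs_list_url(host, mask_type=None, status=None, connection_ruleset_name=None) -> str:
    """Build the GET /api/runs/ URL with optional filters. Hypothetical helper."""
    params = {}
    if mask_type is not None:
        if mask_type not in MASK_TYPES:
            raise ValueError(f"unknown mask_type: {mask_type}")
        params["mask_type"] = mask_type
    if status is not None:
        if status not in RUN_STATUSES:
            raise ValueError(f"unknown status: {status}")
        params["status"] = status
    if connection_ruleset_name is not None:
        params["connection_ruleset_name"] = connection_ruleset_name
    base = f"https://{host}/api/runs/"
    # urlencode percent-encodes values so names with spaces are safe.
    return f"{base}?{urlencode(params)}" if params else base
```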
POST /api/runs/
Authorization: User token or API token.
Start a new masking run.
POST /api/runs/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name | string | Yes | Request Body | The name of the Run. |
connection | string | No | Request Body | Deprecated, replaced by source_connection. |
source_connection | string | Yes | Request Body | A UUID identifying the source connection to be used for this Run. For database connections, the source_connection also acts as the destination. |
destination_connection | string | Required only for runs on file connections. | Request Body | A UUID identifying the destination connection to be used for this Run. |
ruleset | string | Yes | Request Body | A UUID identifying the ruleset to be used for this Run. |
options | object | Yes | Request Body | An Option object of configuration for this Run. |
POST /api/runs/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised Run object. |
POST /api/runs/ curl example
# Include "destination_connection" only for runs on file connections.
# "options" takes an Option object; dry_run is shown as one example field.
curl -X POST "https://<your-datamasque-host>/api/runs/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"name": "<run-name>",
"source_connection": "<source-connection-uuid>",
"destination_connection": "<destination-connection-uuid>",
"ruleset": "<ruleset-uuid>",
"options": {
"dry_run": false
}
}'
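A Python sketch of the POST /api/runs/ request body, assuming only the fields listed in the parameters table. The function name build_run_payload is hypothetical; destination_connection is added only when the caller supplies it, which matches the rule that it is required only for runs on file connections.

```python
def build_run_payload(name, source_connection, ruleset, options, destination_connection=None) -> dict:
    """Assemble the JSON body for POST /api/runs/. Hypothetical helper."""
    payload = {
        "name": name,
        "source_connection": source_connection,
        "ruleset": ruleset,
        "options": dict(options),  # an Option object
    }
    # Database runs mask in place, so no destination is sent for them.
    if destination_connection is not None:
        payload["destination_connection"] = destination_connection
    return payload
```

Serialising the result with json.dumps yields a body without inline comments, which JSON does not allow.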
GET /api/runs/{id}/
Authorization: User token or API token.
Retrieve information about a masking run.
GET /api/runs/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run.  |
GET /api/runs/{id}/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised Run object. |
GET /api/runs/{id}/ curl example
curl "https://<your-datamasque-host>/api/runs/{id}/" \
-H "Authorization: Token <your-api-token>"
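A common use of GET /api/runs/{id}/ is to poll a run until it stops changing state. The Python sketch below is one way to do that under stated assumptions: wait_for_run and fetch_run are hypothetical names, fetch_run stands for any callable that returns the parsed Run object, and the set of terminal statuses is inferred from the status values documented for the Run object (queued, running, and cancelling are treated as still in progress).

```python
import time
from typing import Callable

# Assumption: these documented statuses mean the run will not change again.
TERMINAL_STATUSES = {"finished", "finished_with_warnings", "failed", "cancelled"}


def wait_for_run(
    fetch_run: Callable[[], dict],
    interval: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the run reaches a terminal status, then return the Run object."""
    for _ in range(max_polls):
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(interval)
    raise TimeoutError(f"run still in progress after {max_polls} polls")
```

The sleep parameter exists so the loop can be exercised in tests without real delays.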
POST /api/runs/{id}/cancel/
Authorization: User token or API token.
Cancel a masking run.
POST /api/runs/{id}/cancel/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
POST /api/runs/{id}/cancel/ Responses
Status Code | Description |
---|---|
201 | Operation succeeded. |
POST /api/runs/{id}/cancel/ curl example
curl -X POST "https://<your-datamasque-host>/api/runs/{id}/cancel/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json"
GET /api/runs/validate/
Authorization: User token or API token.
Validate that a run actually occurred, using values recorded in the run log and in the DATAMASQUE_RUN_HISTORY table.
GET /api/runs/validate/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
run_hash | string | Yes | Query Parameter | The hash of the run, which can be retrieved from the run_hash column in the DATAMASQUE_RUN_HISTORY table. |
run_completion_time | string | Yes | Query Parameter | The finish time of the run, which can be retrieved from the run log or from the completion_time column in the DATAMASQUE_RUN_HISTORY table. It must be in the datetime format: %Y-%m-%d %H:%M:%S |
ruleset_content_sha256 | string | Yes | Query Parameter | The hash of the ruleset, which can be retrieved from the run log or from the ruleset_content_sha256 column in the DATAMASQUE_RUN_HISTORY table. |
GET /api/runs/validate/ curl example
Given the run log contains:
SHA256 hash of ruleset: 7ee08ef63db7fed2baf577f16d74427c2250ba05f6858b0a27b70e05ccbff6eb
Finished At: 2024-05-22 22:11:35 UTC
and the DATAMASQUE_RUN_HISTORY table has:
run_hash: 8d34cc930ce7eae40a633e95aef3aee5d2108511eb20ac35805f2e0834115bb9
the request is as follows (the space in run_completion_time is URL-encoded as %20):
curl -X GET "https://<your-datamasque-host>/api/runs/validate/?run_hash=8d34cc930ce7eae40a633e95aef3aee5d2108511eb20ac35805f2e0834115bb9&run_completion_time=2024-05-22%2022:11:35&ruleset_content_sha256=7ee08ef63db7fed2baf577f16d74427c2250ba05f6858b0a27b70e05ccbff6eb" \
-H "Authorization: Token <your-api-token>"
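For the GET /api/runs/validate/ endpoint, the three query parameters need to be URL-encoded, in particular the space inside run_completion_time. The Python sketch below builds such a URL; the function name run_validate_url is hypothetical, and the datetime is formatted with the %Y-%m-%d %H:%M:%S pattern that the parameter requires.

```python
from datetime import datetime
from urllib.parse import quote, urlencode


def run_validate_url(host: str, run_hash: str, completed_at: datetime, ruleset_sha256: str) -> str:
    """Build the GET /api/runs/validate/ URL. Hypothetical helper."""
    params = {
        "run_hash": run_hash,
        # Required format: %Y-%m-%d %H:%M:%S
        "run_completion_time": completed_at.strftime("%Y-%m-%d %H:%M:%S"),
        "ruleset_content_sha256": ruleset_sha256,
    }
    # quote_via=quote encodes the space as %20 rather than "+".
    return f"https://{host}/api/runs/validate/?" + urlencode(params, quote_via=quote)
```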
GET /api/runs/{id}/sdd-report/
Authorization: User token only.
Downloads the sensitive data discovery (SDD) Report for a run.
GET /api/runs/{id}/sdd-report/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
GET /api/runs/{id}/sdd-report/ Responses
Status Code | Description |
---|---|
200 | The server returns the SDD Report in the response body, which can be downloaded as a CSV file. |
404 | Returned if there is no SDD Report for the run. |
GET /api/runs/{id}/sdd-report/ curl example
curl "https://<your-datamasque-host>/api/runs/{id}/sdd-report/" \
-H "Authorization: Token <your-api-token>"
GET /api/runs/{id}/run-report/
Authorization: User token only.
Downloads the Run Report for a run.
GET /api/runs/{id}/run-report/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
GET /api/runs/{id}/run-report/ Responses
Status Code | Description |
---|---|
200 | The server returns the Run Report in the response body, which can be downloaded as a CSV file. |
404 | Returned if there is no Run Report for the run. |
GET /api/runs/{id}/run-report/ curl example
curl "https://<your-datamasque-host>/api/runs/{id}/run-report/" \
-H "Authorization: Token <your-api-token>"
DELETE /api/runs/{id}/db-discovery-results/
Deletes the database discovery results for a run. Use this only when the results are no longer needed, for instance because you have completed another discovery run on the same database more recently.
Warning! Deletion of results is irreversible.
Note: This endpoint can only be used to delete discovery results that were created on versions of DataMasque v2.22 and later. It is not possible to delete discovery results from versions prior to v2.22.
DELETE /api/runs/{id}/db-discovery-results/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
DELETE /api/runs/{id}/db-discovery-results/ Responses
Status Code | Description |
---|---|
204 | Deletion was successful. |
404 | Not Found: There are no database discovery results for this run, or a run with the specified ID does not exist. |
DELETE /api/runs/{id}/db-discovery-results/ curl example
curl -X DELETE "https://<your-datamasque-host>/api/runs/{id}/db-discovery-results/" \
-H "Authorization: Token <your-api-token>"
GET /api/runs/{id}/db-discovery-results/report/
Downloads database schema discovery results as a CSV.
GET /api/runs/{id}/db-discovery-results/report/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
GET /api/runs/{id}/db-discovery-results/report/ Responses
Status Code | Description |
---|---|
200 | The server returns the discovery results in the response body, which can be downloaded as a CSV file. |
404 | Not Found: There are no database discovery results for this run, or a run with the specified ID does not exist. |
GET /api/runs/{id}/db-discovery-results/report/ curl example
curl -o report.csv "https://<your-datamasque-host>/api/runs/{id}/db-discovery-results/report/" \
-H "Authorization: Token <your-api-token>"
Option Object
Option objects have the following fields:
Field | Type | Description |
---|---|---|
batch_size | integer | The number of rows to fetch in each batch retrieved from the database for masking. This is ignored for file masking. |
dry_run | boolean | Indicates a dry run where no data in the database is actually changed. Values should either be true to indicate a dry run, or false to run normally. Default value is false. More information on dry runs is available in the Masking runs documentation. |
max_rows | integer | The maximum number of rows that will be masked by each mask_table task [1]. Defaults to no limit. This is ignored for file masking. |
continue_on_failure | boolean | If there is a task failure and this option is false, DataMasque will skip all remaining unstarted tasks. If this option is true, DataMasque will continue performing other tasks even if there is a task failure. Default value is false. |
run_secret | string | The run secret is used in the random generation of masked values. If left unspecified, a random secret will be automatically generated and returned in the API response [2]. Masking runs performed on the same DataMasque instance with the same run secret will produce the same masked values for identical unmasked database inputs. You should only specify a run secret if you require consistent masking across runs; otherwise it is more secure to allow a new run secret to be automatically generated for each run. Run secrets must be at least 20 characters long. |
disable_instance_secret | boolean | If this option is set to true, DataMasque will exclude its instance-specific secret and generate masked values based solely on the run secret. You may wish to disable the instance secret in order to achieve consistent masking across DataMasque instances. However, with the instance secret disabled, any DataMasque instance using the same run_secret could replicate your data masking. |
diagnostic_logging | boolean | If set to true, the run log will include information to help diagnose errors. This includes information about the tables, columns and keys being masked, memory usage information and more verbose output. Defaults to false. |
buffer_size (deprecated; will be removed in release 3.0.0) | integer | Replaced by batch_size. |
[1] max_rows does not apply to mask_unique_key tasks.
[2] The run_secret contained in the API response can be provided in subsequent API calls to start runs, facilitating consistent masking across those runs.
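The constraints on the general Option fields can be captured in a short Python sketch. The function name build_run_options is hypothetical; the defaults and the 20-character minimum for run_secret are taken from the table above, and optional fields are left out unless supplied so that the server-side defaults apply.

```python
def build_run_options(
    dry_run: bool = False,
    continue_on_failure: bool = False,
    batch_size=None,
    max_rows=None,
    run_secret=None,
) -> dict:
    """Assemble an Option object for a masking run. Hypothetical helper."""
    options = {"dry_run": dry_run, "continue_on_failure": continue_on_failure}
    if batch_size is not None:
        options["batch_size"] = batch_size
    if max_rows is not None:
        options["max_rows"] = max_rows
    if run_secret is not None:
        # Documented rule: run secrets must be at least 20 characters long.
        if len(run_secret) < 20:
            raise ValueError("run_secret must be at least 20 characters long")
        options["run_secret"] = run_secret
    return options
```

Leaving run_secret unset lets DataMasque generate a fresh secret for the run, which the table describes as the more secure choice when consistent masking across runs is not needed.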
Additionally, the following options apply to schema discovery runs (i.e. runs that include at least one run_schema_discovery task):
Field | Type | Description |
---|---|---|
custom_keywords | array[string] | List of keywords. A column whose name matches one or more of the keywords is flagged as containing sensitive data. Default value is an empty list. |
ignored_keywords | array[string] | List of keywords. A column whose name matches one or more of the keywords is excluded from the schema discovery results. Default value is an empty list. |
disable_global_custom_keywords | boolean | If set to true, the user-defined global set of custom keywords will not be used to flag columns as sensitive. Default value is false. |
disable_global_ignored_keywords | boolean | If set to true, the user-defined global set of ignored keywords will not be used to exclude columns from the schema discovery results. Default value is false. |
disable_built_in_keywords | boolean | If set to true, DataMasque's built-in list of keywords will not be used to flag columns as sensitive. Default value is false. |
schemas | array[string] | List of schema (database for MySQL/MariaDB) names against which to perform schema discovery. Default value is an empty list, meaning schema discovery will run against the schema configured on the database connection, or the database user's default schema. |
- Requests related to Option Object:
Runlog Object
Runlog objects have the following fields:
Field | Type | Description |
---|---|---|
run | integer | ID of the Run this Runlog was generated for. |
timestamp | string | Timestamp of this Runlog's generation, in ISO 8601 format. |
message | string | The log message passed from the masking worker. |
log_level | integer | Numeric representation of the log level; values are 20 for INFO, 30 for WARNING, and 40 for ERROR. |
status | string | Indicates the Run status. The potential values are: queued, running, finished, finished_with_warnings, failed, cancelling, and cancelled. A status of finished or finished_with_warnings indicates the Run completed successfully; failed indicates an error. finished_with_warnings indicates there were warnings during the run; refer to the run log to view them. |
is_dry_run | boolean | Indicates whether the Run is a dry run. |
- Requests related to Runlog Object:
GET /api/runs/{id}/log/
Authorization: User token or API token.
List all logs for a specified Run in a JSON response.
GET /api/runs/{id}/log/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
limit | integer | No | Query Parameter | The maximum number of Runlog entries to return. |
offset | integer | No | Query Parameter | The starting position of the query in relation to the complete set of Runlog entries for this Run. |
ordering | string | No | Query Parameter | Controls the order of the results. Available fields to order by are id and timestamp. Reverse the order by prefixing the field name with -. Multiple orderings may be specified, separated by a comma. |
GET /api/runs/{id}/log/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised list of Runlog objects. The default is to return all the logs for the run. |
GET /api/runs/{id}/log/ curl examples
Fetch the complete run log:
curl "https://<your-datamasque-host>/api/runs/{id}/log/" \
-H "Authorization: Token <your-api-token>"
Fetch the first 25 logs:
curl "https://<your-datamasque-host>/api/runs/{id}/log/?limit=25&offset=0" \
-H "Authorization: Token <your-api-token>"
Fetch logs 51-100:
curl "https://<your-datamasque-host>/api/runs/{id}/log/?limit=50&offset=50" \
-H "Authorization: Token <your-api-token>"
Order by timestamp and id descending (newest first):
curl "https://<your-datamasque-host>/api/runs/{id}/log/?ordering=-timestamp,-id" \
-H "Authorization: Token <your-api-token>"
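The limit and offset parameters can be used to walk through a long run log page by page. The Python sketch below shows the loop only: iter_run_logs and fetch_page are hypothetical names, and fetch_page is assumed to return the list of Runlog entries for a given limit and offset, however the caller performs the HTTP request and unwraps the response.

```python
from typing import Callable, Iterator

# Log level numbers documented for the Runlog object.
LOG_LEVEL_NAMES = {20: "INFO", 30: "WARNING", 40: "ERROR"}


def iter_run_logs(fetch_page: Callable[[int, int], list], page_size: int = 100) -> Iterator[dict]:
    """Yield every Runlog entry by requesting successive limit/offset pages."""
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        # A short (or empty) page means the end of the log was reached.
        if len(page) < page_size:
            return
        offset += page_size
```

For example, warnings and errors could be selected with `[e for e in iter_run_logs(fetch_page) if e["log_level"] >= 30]`.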
GET /api/runs/{id}/log/download/
Authorization: User token only.
Download all logs for a specified Run
in a plain text file.
GET /api/runs/{id}/log/download/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
timezone |
string |
Yes | Query Parameter | Timezone offset to use for the Run logs in the format +HH:MM or -HH:MM. Example: +07:00, -05:00. |
GET /api/runs/{id}/log/download/ Responses
Status Code | Description |
---|---|
200 |
The server will return the Run Log content in the response body which can be downloaded as a log file. |
GET /api/runs/{id}/log/download/ curl
example
curl "https://<your-datamasque-host>/api/runs/{id}/log/download/?timezone=%2B07:00" \
-H "Authorization: Token <your-api-token>"
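A literal + in a URL query string is commonly decoded as a space, so a positive timezone offset is safest when percent-encoded. The sketch below is one way to do that in a script; the function name `encode_tz_offset` is hypothetical and it only handles the +HH:MM and -HH:MM formats described above.

```shell
# Hypothetical helper: percent-encode a +HH:MM or -HH:MM offset for use
# in a query string. "+" becomes %2B and ":" becomes %3A; "-" is left as-is.
encode_tz_offset() {
  printf '%s\n' "$1" | sed -e 's/+/%2B/g' -e 's/:/%3A/g'
}

# Example usage (not executed here):
#   tz=$(encode_tz_offset "+07:00")
#   curl "https://<your-datamasque-host>/api/runs/{id}/log/download/?timezone=${tz}" \
#     -H "Authorization: Token <your-api-token>"
```

Alternatively, curl can perform the encoding itself with `-G --data-urlencode "timezone=+07:00"`.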
Connection Object
Database Connection
objects have the following fields:
Field | Type | Description |
---|---|---|
version |
string |
The connection version. This should be set to 1.0 . |
id |
integer |
The id of the Connection . Use this in API URLs that need a connection id . |
name |
string |
The name of the Connection . |
user |
string |
The name of the user in the database connection. |
db_type |
string |
The type of database the connection is connecting to. |
database |
string |
The database the connection is connecting to. |
host |
string |
The hostname of the database connection. |
port |
integer |
The database port being connected through. |
dbpassword |
string |
The password for the user connecting to the database. |
schema |
string |
The schema of the database to connect to. |
options |
object |
An Option object of configuration for the Run |
service_name |
string |
The service name for the connection. Only used for Oracle. (Optional) |
connection_fileset |
string |
The connection fileset attached to this connection. Currently only used for MySQL and MariaDB. (Optional) |
mask_type |
string |
The type of masking the connection can perform, only database or file are valid. (Optional) Should be set to database for database Connections . |
last_discovery_run_date |
string |
The created_time of the last run on this connection including a run_schema_discovery task, or null if no such run has been performed. |
last_discovery_run_id |
string |
The ID of the last run on this connection including a run_schema_discovery task, or null if no such run has been performed. |
is_read_only |
boolean |
Whether or not the connection to the database is read-only. |
data_encoding |
string |
Only for Oracle, Postgres, MySQL, and MariaDB connections. An encoding to be used when retrieving data containing different character sets from the database. Should match the encoding of the data stored, not the character set of the database. The list of supported encodings can be found on the Database Connections page. |
iam_role_arn |
string |
Only for Amazon DynamoDB connections. The IAM role ARN for DataMasque to assume. |
File Connection
objects have the following fields:
Field | Type | Description |
---|---|---|
version |
string |
The connection version. This should be set to 1.0 . |
id |
integer |
The id of the Connection . Use this in API URLs that need a connection id . |
name |
string |
The name of the Connection . |
type |
string |
The type of file system the connection is connecting to. Valid options are "s3_connection" , "azure_blob_connection" or "mounted_share_connection" . |
base_directory |
string |
The root file path where files intended to be masked are stored. |
bucket |
string |
The name of the S3 bucket containing the base_directory . Only for S3 Connections . |
container |
string |
The name of the Azure Blob Storage container containing the base_directory . Only for Azure Blob Connections . |
connection_string |
string |
The connection string configured with the authorization information to access data in your Azure Storage account. Only for Azure Blob Connections . |
mask_type |
string |
The type of masking the connection can perform, only database or file are valid. (Optional) Should be set to file for file Connections . |
is_file_mask_source |
boolean |
Whether the connection is a source Connection for file masking. (Optional) Defaults to false if not provided. |
is_file_mask_destination |
boolean |
Whether the connection is a destination Connection for file masking. (Optional) Defaults to false if not provided. |
- Requests related to Connection Object:
GET /api/connections/
Authorization: User token only.
Get a list of all DataMasque connections.
Optionally, append an {id} to the end of the URL to return only the details of the connection with that specific id .
GET /api/connections/ Parameters
Optionally, follow the URL with the id of a specific connection to return information on only that connection.
GET /api/connections/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised list of Connection objects, or a single Connection object when an id is given. |
Quickstart example using curl
curl "https://<your-datamasque-host>/api/connections/" \
-H "Authorization: Token <your-api-token>"
POST /api/connections/
Authorization: User token only.
Create a new connection object.
POST /api/connections/ Parameters
Database Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
user |
string |
Yes | Request Body | The name of the user in the database connection. |
db_type |
string |
Yes | Request Body | The type of database the connection is connecting to. |
database |
string |
Yes | Request Body | The database the connection is connecting to. |
host |
string |
Yes | Request Body | The hostname of the database connection. |
port |
integer |
Yes | Request Body | The database port being connected through. |
dbpassword |
string |
Yes | Request Body | The password for the user connecting to the database. |
schema |
string |
Yes | Request Body | The schema of the database to connect to. |
service_name |
string |
No | Request Body | The service name for the connection. Only applies to Oracle. |
connection_fileset |
string |
No | Request Body | The connection fileset attached to this connection. Only applies to MySQL and MariaDB. |
mask_type |
string |
No, defaults to database if not provided. |
Request Body | The type of masking the connection can perform, only database or file are valid. |
is_read_only |
boolean |
No, defaults to false if not provided. |
Request Body | Whether or not the connection to the database is read-only. |
data_encoding |
string |
No, defaults to None if not provided. |
Request Body | Only for Oracle, Postgres, MySQL, and MariaDB connections. An encoding to be used when retrieving data containing different character sets from the database. Should match the encoding of the data stored, not the character set of the database. The list of supported encodings can be found on the Database Connections page. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for Amazon DynamoDB connections. The IAM role ARN for DataMasque to assume. |
File Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
type |
string |
Yes | Request Body | The type of file system the connection is connecting to. Valid options are "s3_connection" , "azure_blob_connection" or "mounted_share_connection" . |
base_directory |
string |
Yes | Request Body | The root file path where files intended to be masked are stored. |
bucket |
string |
Required only for S3 Connections . |
Request Body | The name of the S3 bucket containing the base_directory . |
container |
string |
Required only for Azure Blob Connections . |
Request Body | The name of the Azure Blob Storage container containing the base_directory . |
connection_string |
string |
Required only for Azure Blob Connections . |
Request Body | The connection string configured with the authorization information to access data in your Azure Storage account. |
mask_type |
string |
No, defaults to database if not provided. |
Request Body | The type of masking the connection can perform, only database or file are valid. |
is_file_mask_source |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a source Connection for file masking. |
is_file_mask_destination |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a destination Connection for file masking. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for S3 connections. The IAM role ARN for DataMasque to assume. |
POST /api/connections/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised Connection object. |
POST /api/connections/ curl
example
curl -X POST "https://<your-datamasque-host>/api/connections/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"version": "1.0",
"name": "<connection_name>",
"user": "<database_user>",
"db_type": "<database_type>",
"database": "<database_name>",
"host": "<database_host>",
"port": <database_port>,
"dbpassword": "<database_password>",
"schema": "<database_schema>",
"service_name": "<oracle_service_name>",
"connection_fileset": "<connection_fileset>",
"mask_type": "database"
}'
PUT /api/connections/{id}/
Authorization: User token only.
Update the connection with the specified id, replacing its values with those provided.
PUT /api/connections/{id}/ Parameters
Database Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
user |
string |
Yes | Request Body | The name of the user in the database connection. |
db_type |
string |
Yes | Request Body | The type of database the connection is connecting to. |
database |
string |
Yes | Request Body | The database the connection is connecting to. |
host |
string |
Yes | Request Body | The hostname of the database connection. |
port |
integer |
Yes | Request Body | The database port being connected through. |
dbpassword |
string |
Yes | Request Body | The password for the user connecting to the database. |
schema |
string |
Yes | Request Body | The schema of the database to connect to. |
service_name |
string |
No | Request Body | The service name for the connection. Only applies to Oracle. |
connection_fileset |
string |
No | Request Body | The connection fileset attached to this connection. Only applies to MySQL and MariaDB. |
mask_type |
string |
No, defaults to database if not provided. |
Request Body | The type of masking the connection can perform, only database or file are valid. |
is_read_only |
boolean |
No, defaults to false if not provided. |
Request Body | Whether or not the connection to the database is read-only. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for Amazon DynamoDB connections. The IAM role ARN for DataMasque to assume. |
File Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
type |
string |
Yes | Request Body | The type of file system the connection is connecting to. Valid options are "s3_connection" , "azure_blob_connection" or "mounted_share_connection" . |
base_directory |
string |
Yes | Request Body | The root file path where files intended to be masked are stored. |
bucket |
string |
Required only for S3 Connections . |
Request Body | The name of the S3 bucket containing the base_directory . |
container |
string |
Required only for Azure Blob Connections . |
Request Body | The name of the Azure Blob Storage container containing the base_directory . |
connection_string |
string |
Required only for Azure Blob Connections . |
Request Body | The connection string configured with the authorization information to access data in your Azure Storage account. |
mask_type |
string |
No, defaults to database if not provided. |
Request Body | The type of masking the connection can perform, only database or file are valid. |
is_file_mask_source |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a source Connection for file masking. |
is_file_mask_destination |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a destination Connection for file masking. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for S3 connections. The IAM role ARN for DataMasque to assume. |
PUT /api/connections/{id}/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised Connection object with the new updated values. |
PUT /api/connections/{id}/ curl
example
curl -X PUT "https://<your-datamasque-host>/api/connections/{connection_id}/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"version": "1.0",
"name": "<connection_name>",
"user": "<database_user>",
"db_type": "<database_type>",
"database": "<database_name>",
"host": "<database_host>",
"port": <database_port>,
"dbpassword": "<database_password>",
"schema": "<database_schema>",
"service_name": "<oracle_service_name>",
"connection_fileset": "<connection_fileset>",
"mask_type": "database"
}'
DELETE /api/connections/{id}/
Authorization: User token only.
Delete the connection with the specified id.
DELETE /api/connections/{id}/ Parameters
No parameters.
DELETE /api/connections/{id}/ Responses
Status Code | Description |
---|---|
204 |
Operation succeeded |
DELETE /api/connections/{id}/ curl
example
curl -X DELETE "https://<your-datamasque-host>/api/connections/{id}/" \
-H "Authorization: Token <your-api-token>"
POST /api/connections/test/
Authorization: User token only.
Test a connection to validate that it is able to successfully connect to the target database.
POST /api/connections/test/ Parameters
Database Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
user |
string |
Yes | Request Body | The name of the user in the database connection. |
db_type |
string |
Yes | Request Body | The type of database the connection is connecting to. |
database |
string |
Yes | Request Body | The database the connection is connecting to. |
host |
string |
Yes | Request Body | The hostname of the database connection. |
port |
integer |
Yes | Request Body | The database port being connected through. |
dbpassword |
string |
Yes | Request Body | The password for the user connecting to the database. |
schema |
string |
Yes | Request Body | The schema of the database to connect to. |
service_name |
string |
No | Request Body | The service name for the connection. Only applies to Oracle. |
connection_fileset |
string |
No | Request Body | The connection fileset attached to this connection. Only applies to MySQL and MariaDB. |
is_read_only |
boolean |
No, defaults to false if not provided. |
Request Body | Whether or not the connection to the database is read-only. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for Amazon DynamoDB connections. The IAM role ARN for DataMasque to assume. |
File Connections
Field | Type | Required | Location | Description |
---|---|---|---|---|
version |
string |
Yes | Request Body | The connection version. This should be set to 1.0 . |
name |
string |
Yes | Request Body | The name of the Connection . |
type |
string |
Yes | Request Body | The type of file system the connection is connecting to. Valid options are "s3_connection" , "azure_blob_connection" or "mounted_share_connection" . |
base_directory |
string |
Yes | Request Body | The root file path where files intended to be masked are stored. |
bucket |
string |
Required only for S3 Connections . |
Request Body | The name of the S3 bucket containing the base_directory . |
container |
string |
Required only for Azure Blob Connections . |
Request Body | The name of the Azure Blob Storage container containing the base_directory . |
connection_string |
string |
Required only for Azure Blob Connections . |
Request Body | The connection string configured with the authorization information to access data in your Azure Storage account. |
mask_type |
string |
No, defaults to database if not provided. |
Request Body | The type of masking the connection can perform, only database or file are valid. |
is_file_mask_source |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a source Connection for file masking. |
is_file_mask_destination |
boolean |
No, defaults to false if not provided. |
Request Body | Whether the connection is a destination Connection for file masking. |
iam_role_arn |
string |
No, role assumption will only take place if provided. | Request Body | Only for S3 connections. The IAM role ARN for DataMasque to assume. |
POST /api/connections/test/ Responses
Status Code | Description |
---|---|
200 |
Operation succeeded |
POST /api/connections/test/ curl
example
curl -X POST "https://<your-datamasque-host>/api/connections/test/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"name": "<your-connection-name>",
"user": "<your-connection-user>",
"db_type": "oracle",
"database": "<your-database>",
"host": "<your-host>",
"port": 1521,
"dbpassword": "<your-password>",
"schema": "<optional-schema>",
"service_name": "<optional-service-name>",
"connection_fileset": "<optional-connection-fileset>",
"version": "1.0"
}'
Connection Fileset Object
Connection Fileset
objects have the following fields:
Field | Type | Description |
---|---|---|
id |
integer |
The id of the Connection Fileset . Use this in API URLs that need a connection_fileset id . |
name |
string |
The name of the Connection Fileset . |
database_type |
string |
The type of database the Connection Fileset is associated with (currently only mysql is supported; this will work with both MySQL and MariaDB connections). |
zip_archive |
string |
The location of the Zip archive. |
- Requests related to Connection Fileset:
GET /api/connection-filesets/
Authorization: User token only.
Returns a list of Connection Filesets. These may be used to encrypt connections to MySQL and MariaDB databases.
GET /api/connection-filesets/ Parameters
No parameters.
GET /api/connection-filesets/ Responses
Status Code | Description |
---|---|
200 |
A list of JSON serialised Connection Filesets. |
GET /api/connection-filesets/ curl
example
curl "https://<your-datamasque-host>/api/connection-filesets/" \
-H "Authorization: Token <your-api-token>"
POST /api/connection-filesets/
Authorization: User token only.
Create a new Connection Fileset.
POST /api/connection-filesets/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name |
string |
Yes | Form Field | The name of the Connection Fileset . |
database_type |
string |
Yes | Form Field | The type of database the Connection Fileset is associated with (currently only mysql is supported; this will work with both MySQL and MariaDB connections). |
zip_archive |
file |
Yes | Form Field | The Zip archive file. |
POST /api/connection-filesets/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised object of the Connection Fileset that was created. |
POST /api/connection-filesets/ curl
example
curl -X POST "https://<your-datamasque-host>/api/connection-filesets/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "name=<fileset_name>" \
-F "database_type=<database_type>" \
-F "zip_archive=@</path/to/your/file.zip>"
PUT /api/connection-filesets/{id}/
Authorization: User token only.
Update a Connection Fileset.
PUT /api/connection-filesets/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name |
string |
Yes | Form Field | The name of the Connection Fileset . |
database_type |
string |
Yes | Form Field | The type of database the Connection Fileset is associated with (currently only mysql is supported; this will work with both MySQL and MariaDB connections). |
zip_archive |
file |
Yes | Form Field | The Zip archive file. |
PUT /api/connection-filesets/{id}/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised object of the Connection Fileset that was updated. |
PUT /api/connection-filesets/{id}/ curl
example
curl -X PUT "https://<your-datamasque-host>/api/connection-filesets/{id}/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "name=<fileset_name>" \
-F "database_type=<database_type>" \
-F "zip_archive=@</path/to/your/file.zip>"
DELETE /api/connection-filesets/{id}/
Authorization: User token only.
Deletes the Connection Fileset with the specified id . You may not delete a Connection Fileset associated with an existing connection.
DELETE /api/connection-filesets/{id}/ Parameters
No parameters.
DELETE /api/connection-filesets/{id}/ Responses
Status Code | Description |
---|---|
204 |
Operation succeeded. |
DELETE /api/connection-filesets/{id}/ curl
example
curl -X DELETE "https://<your-datamasque-host>/api/connection-filesets/{id}/" \
-H "Authorization: Token <your-api-token>"
Ruleset Object
Ruleset
objects have the following fields:
Field | Type | Description |
---|---|---|
id |
integer |
The id of the Ruleset . Use this in API URLs that need a ruleset id . |
name |
string |
The name of the Ruleset . |
config_yaml |
string |
The contents of the Ruleset , including all of the masking rules. |
is_valid |
boolean |
Whether or not the Ruleset is valid, and can be used for masking runs. |
mask_type |
string |
The masking type of the Ruleset . This can be "database" or "file" . |
- Requests related to Ruleset Object:
GET /api/rulesets/
Authorization: User token only.
Returns a list of all rulesets.
GET /api/rulesets/ Parameters
No parameters.
GET /api/rulesets/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised list of Ruleset objects. |
GET /api/rulesets/ curl
example
curl "https://<your-datamasque-host>/api/rulesets/" \
-H "Authorization: Token <your-api-token>"
GET /api/rulesets/{id}/
GET /api/rulesets/{id}/ Parameters
No parameters.
GET /api/rulesets/{id}/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised Ruleset object. |
GET /api/rulesets/{id}/ curl
example
curl "https://<your-datamasque-host>/api/rulesets/{id}/" \
-H "Authorization: Token <your-api-token>"
POST /api/rulesets/
Authorization: User token only.
Creates a new ruleset.
POST /api/rulesets/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name |
string |
Yes | Request Body | The name of the Ruleset . |
config_yaml |
string |
Yes | Request Body | The YAML contents of the Ruleset . |
mask_type |
string |
No | Request Body | The masking type of the Ruleset . Valid options are "database" or "file" . |
POST /api/rulesets/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised Ruleset object. |
POST /api/rulesets/ curl
example
curl -X POST "https://<your-datamasque-host>/api/rulesets/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"name": "<your-new-name>",
"config_yaml": "version: \"1.0\"\ntasks:\n - type: run_data_discovery"
}'
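Because config_yaml is sent as a JSON string, the ruleset YAML has to be escaped (quotes, backslashes and newlines) before it is placed in the request body. The following is a minimal sketch of doing that from a file; the function name `yaml_to_json_string` is hypothetical, and it assumes the file contains no tabs or other control characters that would need further escaping.

```shell
# Hypothetical helper: print the contents of a YAML file as the body of a
# JSON string (without the surrounding double quotes).
# Escapes backslashes and double quotes, then joins lines with a literal \n.
yaml_to_json_string() {
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' "$1" |
    awk 'NR > 1 { printf "\\n" } { printf "%s", $0 }'
}

# Example usage (not executed here); the ruleset name is a placeholder:
#   payload=$(printf '{"name": "%s", "config_yaml": "%s"}' \
#     "<your-new-name>" "$(yaml_to_json_string ruleset.yaml)")
#   curl -X POST "https://<your-datamasque-host>/api/rulesets/" \
#     -H "Authorization: Token <your-api-token>" \
#     -H "Content-Type: application/json" \
#     -d "$payload"
```

For rulesets containing unusual characters, a dedicated JSON tool such as jq is likely to be a more robust way to build the payload.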
PUT /api/rulesets/{id}/
Authorization: User token only.
Update an existing ruleset.
PUT /api/rulesets/{id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name |
string |
Yes | Request Body | The name of the Ruleset . |
config_yaml |
string |
Yes | Request Body | The YAML contents of the Ruleset . |
mask_type |
string |
No | Request Body | The masking type of the Ruleset . Valid options are "database" or "file" . |
PUT /api/rulesets/{id}/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised Ruleset object with the updated values. |
PUT /api/rulesets/{id}/ curl
example
curl -X PUT "https://<your-datamasque-host>/api/rulesets/{id}/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"name": "<your-new-name>",
"config_yaml": "version: \"1.0\"\ntasks:\n - type: run_data_discovery"
}'
DELETE /api/rulesets/{id}/
Authorization: User token only.
Deletes the ruleset with the specified id .
DELETE /api/rulesets/{id}/ Parameters
No parameters.
DELETE /api/rulesets/{id}/ Responses
Status Code | Description |
---|---|
200 |
Operation succeeded |
DELETE /api/rulesets/{id}/ curl
example
curl -X DELETE "https://<your-datamasque-host>/api/rulesets/{id}/" \
-H "Authorization: Token <your-api-token>"
Seed Object
Field | Type | Description |
---|---|---|
id |
integer |
The id of the Seed . |
name |
string |
The name of the Seed . |
seed_file |
string |
The location of the Seed . |
created date |
datetime |
The date that the Seed was uploaded. |
filename |
string |
The file name of the uploaded Seed . |
- Requests that use Seed Object:
GET /api/seeds/
Authorization: User token only.
Get a list of all DataMasque seed files.
Optionally, append an {id} to the end of the URL to return only the details of the seed with that specific id .
GET /api/seeds/ Parameters
No parameters.
GET /api/seeds/ Responses
Status Code | Description |
---|---|
200 |
A JSON serialised list of Seed objects. |
GET /api/seeds/ curl
example
curl "https://<your-datamasque-host>/api/seeds/" \
-H "Authorization: Token <your-api-token>"
POST /api/seeds/
Authorization: User token only.
Create a new seed from a CSV file.
POST /api/seeds/ Parameters
Field | Type | Required | Description |
---|---|---|---|
name |
string |
No | The name of the CSV file. |
description |
string |
No | A description of the seed file to be displayed on the files menu. |
seed_file |
file |
No | The seed file. |
POST /api/seeds/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised Seed object. |
POST /api/seeds/ curl
example
curl -X POST "https://<your-datamasque-host>/api/seeds/" \
-H "Authorization: Token <your-api-token>" \
-F "name=<seed_name>" \
-F "seed_file=@</path/to/your/seed_file.csv>"
Audit Log Object
Field | Type | Description |
---|---|---|
id |
integer |
The id of the audit log. |
timestamp |
datetime |
The timestamp of when the audit log was created. |
username |
string |
The username which created the audit log. |
category |
string |
The category for the audit log, one of the following: auth , run , ruleset , or connection |
action |
string |
The action taken. One of the following: logged_in or logged_out for auth actions; started or cancelled for masking run actions; created , modified or deleted for connection or ruleset actions. |
description |
string |
A short description of what happened during the action. |
- Requests that use Audit Log Object:
Audit Log CSV
A CSV representation of the Audit Log Object
The CSV file contains the following headers:
Field | Type | Description |
---|---|---|
timestamp |
datetime |
The timestamp of when the audit log was created. |
username |
string |
The username which created the audit log. |
category |
string |
The category for the audit log, one of the following: auth , run , ruleset , or connection |
action |
string |
The action taken. One of the following: logged_in or logged_out for auth actions; started or cancelled for masking run actions; created , modified or deleted for connection or ruleset actions. |
description |
string |
A short description of what happened during the action. |
- Requests that use Audit Log CSV:
GET /api/audit-logs/
Authorization: User token only.
Retrieve all Audit Logs.
GET /api/audit-logs/ Parameters
No parameters.
GET /api/audit-logs/ Response
Status Code | Description |
---|---|
200 |
A JSON serialised list of Audit Log objects. |
GET /api/audit-logs/ curl
example
curl "https://<your-datamasque-host>/api/audit-logs/" \
-H "Authorization: Token <your-api-token>"
GET /api/audit-logs/download/
Authorization: User token only.
Download all Audit Logs as a CSV file.
GET /api/audit-logs/download/ Parameters
No parameters.
GET /api/audit-logs/download/ Response
Status Code | Description |
---|---|
200 |
The server will return the audit logs in the response body, which can then be saved as a CSV file. |
GET /api/audit-logs/download/ curl
example
curl -o <your-downloads-path>/<your-download-name>.csv -X GET "https://<your-datamasque-host>/api/audit-logs/download/" \
-H "Authorization: Token <your-api-token>"
Schema Discovery
POST /api/schema-discovery/
Authorization: User token only.
Executes schema discovery against a database connection.
POST /api/schema-discovery/ Parameters
Field | Type | Required | Description |
---|---|---|---|
connection |
string |
Yes | The id of the Connection . |
custom_keywords |
array[string] |
Yes | List of keywords. A column whose name matches one or more of these keywords is flagged as containing sensitive data. |
disable_built_in_keywords |
boolean |
Yes | If set to true , then DataMasque's built-in list of keywords will not be used to flag columns as sensitive. |
disable_global_custom_keywords |
boolean |
Yes | If set to true , then the user-defined global set of custom keywords will not be used to flag columns as sensitive. |
disable_global_ignored_keywords |
boolean |
Yes | If set to true , then the user-defined global set of ignored keywords will not be used to exclude columns from the discovery results. |
ignored_keywords |
array[string] |
Yes | List of keywords. A column whose name matches one or more of these keywords is excluded from the schema discovery results. |
in_data_discovery |
object |
No | In-data discovery options. An object containing enabled , row_sample_size , custom_rules , non_sensitive_rules and force options. Defaults to {enabled: false} . |
schemas |
array[string] |
Yes | List of schema names (or database for MySQL/MariaDB) against which to perform schema discovery. Send an empty list to run against the schema configured on the database connection, or the database user's default schema if one is not specified for the connection. |
POST /api/schema-discovery/ Responses
Status Code | Description |
---|---|
201 |
A JSON serialised Run object. |
POST /api/schema-discovery/ Example
curl -X POST "https://<your-datamasque-host>/api/schema-discovery/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"connection": "<your-connection-id>",
"custom_keywords": [],
"ignored_keywords": [],
"disable_global_custom_keywords": false,
"disable_global_ignored_keywords": false,
"disable_built_in_keywords": false,
"in_data_discovery": {
"enabled": true,
"row_sample_size": 500,
"custom_rules": [
{
"name": "temp_staff",
"pattern": "temp.*"
}
],
"non_sensitive_rules": [
{"pattern": "retired.*"}
]
},
"schemas": []
}'
GET /api/schema-discovery/{connection_id}/
Authorization: User token or API token.
Retrieve schema discovery results.
GET /api/schema-discovery/{connection_id}/ Parameters
None
GET /api/schema-discovery/{connection_id}/ Response
Status Code | Description |
---|---|
200 |
A JSON serialised object containing a Schema Discovery object and a Run object. |
Field | Type | Description |
---|---|---|
data |
object |
A Schema Discovery object. |
last_sdd_run |
object |
A JSON serialised Run object. |
Schema Discovery Object
Schema Discovery objects have the following fields:
Field | Type | Description |
---|---|---|
options | object | List of ignored_keywords and customised_keywords. |
schemas | list[object] | List of schema objects, each with a name and a list of tables. Each table contains a name and a list of columns. |
sd_version | string | Schema discovery version, e.g. "1.1.1". |
Schema Discovery Column Object
Column objects have the following fields:
Field | Type | Description |
---|---|---|
name | string | The column name. |
data_type | string | The data type for this field, e.g. varchar, integer, numeric, timestamp without time zone. |
categories | list[string] | A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom. |
max_length | number | The column length. |
description | string | The reason the column was flagged as sensitive. |
foreign_keys | list[object] | A list of foreign key objects containing name and referenced_column. |
is_unique_key | boolean | Whether the column is a unique key. |
numeric_scale | number | If the data_type is numeric, the maximum number of decimal places. |
ruleset_match | boolean | The type of information detected by sensitive data discovery, used internally by the ruleset generator to suggest a suitable masking rule. |
in_data_result | list[object] | A list of In Data matches. |
is_primary_key | boolean | Whether the column is a primary key. |
numeric_precision | number | If the data_type is numeric, the maximum number of digits present. |
constraint_columns | list[string] | A list of column names participating in the constraint. |
pk_constraint_name | string | The name of the primary key constraint. |
uk_constraint_name | string | The name of the unique key constraint. |
unique_index_names | list[string] | A list of index names for this column. |
allow_in_data_override | boolean | Whether a Sensitive Data match can be overridden by an In Data match. |
referencing_foreign_keys | list[string] | A list of foreign keys referencing this column. |
GET /api/schema-discovery/v2/{run_id}/
Authorization: User token or API token.
Retrieve schema discovery results with server-side pagination, sorting, filtering and searching.
GET /api/schema-discovery/v2/{run_id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
limit | number | No | Query Parameter | The maximum number of results to return. Defaults to 50 if not set. |
offset | number | No | Query Parameter | The index of the first item to be returned within the whole set of results. Defaults to 0 if not set. |
ordering | string | No | Query Parameter | Controls the sort order of results. Specify one or more columns separated by commas. To specify descending sort order, prefix the field name with '-'. Defaults to ?ordering=schema,table,column. |
search | string | No | Query Parameter | Performs a case-insensitive partial match on the schema, table or column name. |
categories | string | No | Query Parameter | Filters the categories (Data Classifications) using an exact match. Valid values are PII, PHI or PCI. |
data_type | string | No | Query Parameter | Filters the data type name (excluding the length or numeric precision/scale), e.g. ?data_type=varchar. |
description | string | No | Query Parameter | Searches the description using a case-insensitive partial match. |
flagged_by | string | No | Query Parameter | Filters the Flagged By field using an exact match. Valid values are In-Data Discovery or Metadata Discovery. |
is_sensitive | boolean | No | Query Parameter | Filters the results for sensitive matches. Set to true to return only sensitive results, or false for only non-sensitive. |
constraint | string | No | Query Parameter | Filters for results with either Primary or Unique constraints. Valid values are primary or unique (case-insensitive). |
GET /api/schema-discovery/v2/{run_id}/ Response
Status Code | Description |
---|---|
200 | A JSON serialised object containing pagination metadata and a list of Schema Discovery Result objects. |
Field | Type | Description |
---|---|---|
count | number | Total number of unpaginated results. |
next | string | Pagination link to the next page of results. |
previous | string | Pagination link to the previous page of results. |
results | list[object] | A list of Schema Discovery Result objects. |
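A client will typically build the query string from the parameters above and then follow the next link until no pages remain. The sketch below illustrates that loop offline; `build_results_url`, `iter_results` and the host `dm.example.com` are hypothetical names, and it assumes next is null (or absent) on the last page. The caller supplies `fetch_page`, which performs the authenticated GET and returns the decoded JSON.

```python
from urllib.parse import urlencode


def build_results_url(host, run_id, **params):
    """Build the v2 results URL, skipping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    base = f"https://{host}/api/schema-discovery/v2/{run_id}/"
    return f"{base}?{query}" if query else base


def iter_results(first_url, fetch_page):
    """Yield every result, following `next` links until there are none.

    fetch_page is a caller-supplied callable: URL -> decoded JSON dict.
    """
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page["results"]
        url = page.get("next")  # assumed to be None on the last page


# Offline demonstration with two canned pages instead of real HTTP calls.
first = build_results_url("dm.example.com", 42, limit=2, is_sensitive="true")
second = first + "&offset=2"
pages = {
    first: {"count": 3, "next": second, "previous": None,
            "results": [{"id": 1}, {"id": 2}]},
    second: {"count": 3, "next": None, "previous": first,
             "results": [{"id": 3}]},
}
all_ids = [result["id"] for result in iter_results(first, pages.__getitem__)]
```

Injecting `fetch_page` keeps the paging logic independent of the HTTP library and makes it easy to test.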
Schema Discovery Result Object
Schema Discovery Result objects have the following fields:
Field | Type | Description |
---|---|---|
id | number | A unique id for the result. |
column | string | The column name. |
table | string | The table name. |
schema | string | The schema name. |
data | object | A v2 Schema Discovery Column Object. |
v2 Schema Discovery Column Object
v2 Schema Discovery Column objects have the following fields:
Field | Type | Description |
---|---|---|
data_type | string | The data type for this field (e.g. varchar, integer, numeric, timestamp without time zone), with the max_length or numeric_precision and numeric_scale appended. |
max_length | number | The column length. |
foreign_keys | list[object] | A list of foreign key objects containing name and referenced_column, the latter as a string of the form schema.table.column. |
discovery_matches | list[object] | A list of Discovery Match objects sorted by priority. |
numeric_precision | number | The numeric precision of the column, the meaning of which depends on the database and data type. |
numeric_scale | number | The numeric scale of the column, the meaning of which depends on the database and data type. Default is null. |
constraint_columns | list[string] | A list of column names participating in the constraint. |
pk_constraint_name | string | The name of the primary key constraint. Default is null. |
uk_constraint_name | string | The name of the unique key constraint. Default is null. |
unique_index_names | list[string] | A list of index names for this column. |
referencing_foreign_keys | list[object] | A list of foreign keys referencing this column. Each object contains a name and a referencing_column, the latter as a string of the form schema.table.column. |
categories | list[string] | A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom. |
description | string | The reason the column was flagged as sensitive (blank for non-sensitive columns). |
flagged_by | string | Indicates whether the column was flagged by In-Data Discovery or Metadata Discovery (blank for non-sensitive columns). |
constraint | string | Indicates whether the column is a Primary or Unique key. |
Discovery Match Object
Discovery Match objects have the following fields:
Field | Type | Description |
---|---|---|
label | string | A name for the rule that flagged the match. Can also be custom, custom_non_sensitive or ignore for user-defined match rules. |
categories | list[string] | A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom. |
flagged_by | string | Indicates whether the column was flagged by In-Data Discovery or Metadata Discovery. |
description | string | The reason the column was flagged as sensitive. |
Generating Rulesets
POST /api/generate-ruleset/
Authorization: User token only.
Returns a ruleset string for selected columns of a connection.
Prerequisite: Make sure a schema-discovery report exists for the connection specified in the POST data.
POST /api/generate-ruleset/[v1/|v2/] curl example
curl -X POST "https://<your-datamasque-host>/api/generate-ruleset/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"connection": "<your-connection-id>",
"selected_columns": {
"schema_name": {
"table_name": [
"column_name_1",
"column_name_2"
]
}
}
}'
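The selected_columns object is a schema → table → column-list mapping, so it can be derived from v2 Schema Discovery Result objects. The sketch below is one possible approach, not DataMasque's own logic: the helper name `build_selected_columns` is hypothetical, and it treats a result as sensitive when data.flagged_by is non-blank, as described for the v2 Schema Discovery Column Object.

```python
def build_selected_columns(results):
    """Group sensitive v2 discovery results into a selected_columns mapping."""
    selected = {}
    for result in results:
        # A blank flagged_by marks a non-sensitive column; skip it.
        if not result["data"].get("flagged_by"):
            continue
        tables = selected.setdefault(result["schema"], {})
        tables.setdefault(result["table"], []).append(result["column"])
    return selected


# Hypothetical results, trimmed to the fields this helper reads.
sample_results = [
    {"schema": "hr", "table": "staff", "column": "full_name",
     "data": {"flagged_by": "Metadata Discovery"}},
    {"schema": "hr", "table": "staff", "column": "email",
     "data": {"flagged_by": "In-Data Discovery"}},
    {"schema": "hr", "table": "staff", "column": "created_at",
     "data": {"flagged_by": ""}},
]
selected_columns = build_selected_columns(sample_results)
```

The resulting dictionary can be placed under the selected_columns key of the request body.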
POST /api/generate-ruleset/[v1/] Response
The default response for a version 1 request is a JSON-encoded string containing the ruleset YAML. The trailing /v1/ is optional for version 1.
POST /api/generate-ruleset/v2/ Response
The version 2 response is plain text containing the ruleset YAML.
POST /api/generate-file-ruleset/
Authorization: User token only.
Returns a ruleset string for selected data of a file connection.
The selected data is a list of file groups, each of which contains:
- A list of files, which are the full paths relative to the base directory of the connection.
- A list of locators, which are either JSON locators or strings containing a single header column name. JSON locators must be formatted as lists even if they consist of a single element.
Each file group will generate at least one task in the ruleset (either mask_file or mask_tabular_file).
Generally, only one task will be generated per file group, but in cases where files have different extensions, delimiters or encodings, multiple tasks will be generated to cater for these settings.
File groups should only contain files of the same type; that is, don't mix object files, multi-record files, or tabular files in the same file group. If multiple file types are mixed, the generated ruleset will attempt to split them into multiple tasks, but the results may be unexpected.
Prerequisite: A file discovery run must have completed on the connection specified in the POST data, so that the files can be selected from the file-discovery report.
POST /api/generate-file-ruleset/ curl example
curl -X POST "https://<your-datamasque-host>/api/generate-file-ruleset/" \
  -H "Authorization: Token <your-api-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "connection": "<your-connection-id>",
    "selected_data": [
      {
        "files": ["file1.json", "file2.json"],
        "locators": [["age"], ["users", "*", "name"]]
      },
      {
        "files": ["file1.csv", "file2.csv"],
        "locators": ["gender", "address"]
      },
      [repeated for different file groups…]
    ]
  }'
POST /api/generate-file-ruleset/ Response
The response is plain text containing the ruleset YAML.
Generate Ruleset Result Object
Generate Ruleset Result objects are returned by DataMasque for the async-generate-ruleset family of APIs.
They have the following fields:
Field | Type | Description |
---|---|---|
connection | string | The ID of the connection for which a ruleset is being generated. |
generated_ruleset | string | The ruleset that has been generated. Not applicable if ruleset generation was started using the from-csv API. |
status | string | The status of the ruleset generation task. One of queued, running, finished, failed, or cancelled. |
status_message | string | A status message describing the progress of the ruleset generation task. |
error_message | string | The error message when generating the ruleset has failed. |
last_updated | string | The timestamp of the last update to this Generate Ruleset Result, in ISO 8601 format. |
Requests that use the Generate Ruleset Result Object, and the endpoint to query to get the generated ruleset:
- When ruleset generation is started using POST /api/async-generate-ruleset/{connection_id}/ and completes successfully, the generated ruleset is available in the generated_ruleset field of the response from the GET /api/async-generate-ruleset/{connection_id}/ API endpoint.
- When ruleset generation is started using POST /api/async-generate-ruleset/{connection_id}/from-csv/ and completes successfully, a ZIP file of all generated rulesets can be downloaded from the GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ API endpoint.
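Because generation is asynchronous, a client usually polls the GET endpoint until the status field reaches a terminal value. The following sketch shows one way to structure that loop; `wait_for_ruleset`, its parameters and the polling interval are assumptions for illustration. The caller supplies `fetch_result`, which performs the authenticated GET and returns the decoded Generate Ruleset Result.

```python
import time

# Statuses after which no further progress is expected.
TERMINAL_STATUSES = {"finished", "failed", "cancelled"}


def wait_for_ruleset(fetch_result, poll_seconds=5.0, max_polls=120,
                     sleep=time.sleep):
    """Poll until the Generate Ruleset Result reaches a terminal status."""
    for _ in range(max_polls):
        result = fetch_result()
        if result["status"] in TERMINAL_STATUSES:
            return result
        sleep(poll_seconds)
    raise TimeoutError("ruleset generation did not reach a terminal status")


# Offline demonstration: canned responses stand in for real API calls.
responses = iter([
    {"status": "queued"},
    {"status": "running"},
    {"status": "finished", "generated_ruleset": "<ruleset yaml>"},
])
final = wait_for_ruleset(lambda: next(responses), sleep=lambda seconds: None)
```

On a failed status, the error_message field of the returned object describes what went wrong.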
GET /api/async-generate-ruleset/{connection_id}/
Authorization: User token only.
Returns the progress and result of ruleset generation for a connection.
GET /api/async-generate-ruleset/{connection_id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
connection_id | string | Yes | URL Path | The id of the Connection. |
GET /api/async-generate-ruleset/{connection_id}/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised Generate Ruleset Result Object. |
404 | Not Found: No connection with the specified ID exists. |
GET /api/async-generate-ruleset/{connection_id}/ curl example
curl "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
-H "Authorization: Token <your-api-token>"
POST /api/async-generate-ruleset/{connection_id}/
Authorization: User token only.
Starts generating a ruleset for selected columns of a database connection or for selected data of a file connection.
POST /api/async-generate-ruleset/{connection_id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
connection_id | string | Yes | URL Path | The id of the Connection. |
POST /api/async-generate-ruleset/{connection_id}/ Responses
Status Code | Description |
---|---|
201 | A JSON serialised Generate Ruleset Result Object. |
404 | Not Found: No connection with the specified ID exists. |
POST /api/async-generate-ruleset/{connection_id}/ curl example
For generating rulesets on database connections:
curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"selected_columns": {
"schema_name": {
"table_name": [
"column_name_1",
"column_name_2"
]
}
}
}'
For generating rulesets for file connections:
curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
  -H "Authorization: Token <your-api-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "selected_data": [
      {
        "files": ["file1.json", "file2.json"],
        "locators": [["age"], ["users", "*", "name"]]
      },
      {
        "files": ["file1.csv", "file2.csv"],
        "locators": ["gender", "address"]
      },
      [repeated for different file groups…]
    ]
  }'
DELETE /api/async-generate-ruleset/{connection_id}/
Authorization: User token only.
Cancels ruleset generation currently in progress for a connection. If the ruleset generation has already finished, deletes any generated ruleset.
Warning! Deletion of the generated ruleset is irreversible.
DELETE /api/async-generate-ruleset/{connection_id}/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
connection_id | string | Yes | URL Path | The id of the Connection. |
DELETE /api/async-generate-ruleset/{connection_id}/ Responses
Status Code | Description |
---|---|
200 | Ruleset generation cancelled before any results were processed. |
204 | Ruleset generation had finished. The generated ruleset has been deleted. |
404 | Not Found: No connection with the specified ID exists. |
DELETE /api/async-generate-ruleset/{connection_id}/ curl example
curl -X DELETE "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/" \
-H "Authorization: Token <your-api-token>"
POST /api/async-generate-ruleset/{connection_id}/from-csv/
Authorization: User token only.
Start generating a ruleset for selected columns of a database connection.
The columns are specified by modifying the CSV report retrieved from the /api/runs/{run_id}/db-discovery-results/report/ endpoint.
Each row of the CSV report describes one discovered database column. To include that column in ruleset generation, mark the Selected column of its row with 1, true, y or yes (case-insensitive).
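Marking rows can be scripted with Python's csv module. The sketch below is illustrative only: apart from Selected, the column names in the sample report (Schema, Table, Column) are hypothetical stand-ins for whatever headers the downloaded report actually contains, and `mark_selected` and `is_selected` are made-up helper names.

```python
import csv
import io

# Values documented as marking a row for inclusion (case-insensitive).
TRUTHY = {"1", "true", "y", "yes"}


def is_selected(value):
    """Mirror the documented rule for the Selected column."""
    return value.lower() in TRUTHY


def mark_selected(report_text, should_select):
    """Return the CSV text with Selected set to 'yes' on chosen rows."""
    reader = csv.DictReader(io.StringIO(report_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if should_select(row):
            row["Selected"] = "yes"
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row report; only the Selected header is documented.
report = "Schema,Table,Column,Selected\nhr,staff,name,\nhr,staff,id,\n"
marked = mark_selected(report, lambda row: row["Column"] == "name")
```

The modified text can then be saved and uploaded as the CSV file for this endpoint.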
POST /api/async-generate-ruleset/{connection_id}/from-csv/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
connection_id | string | Yes | URL Path | The id of the Connection. |
csv_or_zip_file | file | Yes | Request Body | The byte content of the CSV, or the ZIP file containing one or more CSVs. |
target_size_bytes | int | No | Request Body | Generate rulesets of approximately this size in bytes. Defaults to 512,000 (500 KiB). |
force_run | boolean | No | Request Body | If set to true, cancel any existing ruleset generation and restart it. Defaults to false. |
POST /api/async-generate-ruleset/{connection_id}/from-csv/ Responses
Status Code | Description |
---|---|
201 | A JSON serialised Generate Ruleset Result Object. |
404 | Not Found: No connection with the specified ID exists. |
POST /api/async-generate-ruleset/{connection_id}/from-csv/ curl example
curl -X POST "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/from-csv/" \
  -H "Authorization: Token <your-api-token>" \
  -F "csv_or_zip_file=@selected_report.csv" \
  -F "target_size_bytes=250000"
GET /api/async-generate-ruleset/{connection_id}/download-rulesets/
Authorization: User token only.
Once ruleset generation invoked via POST /api/async-generate-ruleset/{connection_id}/from-csv/ is completed, query this endpoint to download the rulesets in a ZIP file.
GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
connection_id | string | Yes | URL Path | The id of the Connection. |
GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ Responses
Status Code | Description |
---|---|
200 | Returns a streamed ZIP file containing the generated rulesets. |
400 | Bad Request: The ruleset generation is still in progress, or has failed. |
If an error response is received, query the GET /api/async-generate-ruleset/{connection_id}/ endpoint to check the status of ruleset generation.
GET /api/async-generate-ruleset/{connection_id}/download-rulesets/ curl example
curl -o rulesets.zip "https://<your-datamasque-host>/api/async-generate-ruleset/{connection_id}/download-rulesets/" \
-H "Authorization: Token <your-api-token>"
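Once downloaded, the archive can be unpacked with Python's zipfile module. This is a minimal sketch: `read_rulesets` is a hypothetical helper, the member names in the demonstration are invented, and it assumes the rulesets are UTF-8 encoded text, which this page does not state.

```python
import io
import zipfile


def read_rulesets(zip_bytes):
    """Return {member name: ruleset text} for every file in the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: archive.read(info).decode("utf-8")  # assumed UTF-8
            for info in archive.infolist()
            if not info.is_dir()
        }


# Offline demonstration: build a small archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ruleset_1.yml", "<ruleset yaml 1>")  # hypothetical name
    archive.writestr("ruleset_2.yml", "<ruleset yaml 2>")  # hypothetical name
rulesets = read_rulesets(buffer.getvalue())
```

With a real download, pass the bytes of rulesets.zip to `read_rulesets` instead of the in-memory buffer.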
File Data Discovery
POST /api/run-file-data-discovery/
Authorization: User token only.
Executes data discovery against files on a file connection. The file connection must already be configured. Use the UUID of the file connection in the request, which can be found:
- at the top of the page when you view the connection in the DataMasque UI, or
- in the URL when you view the connection in the DataMasque UI, or
- in the id field of the Connection Object.
Discovery keywords
By default, DataMasque's extensive list of built-in keywords is used to identify which fields and attributes in the files are considered sensitive.
DataMasque matches the name of the field or attribute against each keyword using a case-insensitive, partial match.
For example, a field named credit_CARD_NUMBER will match the Credit card keyword.
You can use various options to refine the set of discovery keywords.
- Setting disable_built_in_keywords to true means that the built-in keyword list linked above will not be used. In this case, the discovery process will use only the keywords given in custom_keywords and any configured global custom keywords.
- The custom_keywords option allows you to specify a list of additional keywords to match on. Any fields or attributes whose name includes one or more of those keywords will be flagged as sensitive.
- A match between a field or attribute's name and a value in the ignored_keywords list will cause a field or attribute to be completely excluded from the results, even if its name suggests that the field may contain sensitive data.
- Global keywords, as configured through the Settings page of the DataMasque UI, are also considered unless disable_global_custom_keywords and/or disable_global_ignored_keywords (as appropriate) are set to true.
Warning! Ignored keywords have priority. If a field or attribute name matches a built-in, global, or custom keyword and also matches an entry in ignored_keywords or a global ignored keyword, the field or attribute will not be included in the discovery results.
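The precedence described above can be pictured with a small sketch. This is not DataMasque's implementation: the helper names are invented, and the normalisation step (comparing lowercase letters and digits only, so that credit_CARD_NUMBER can match Credit card) is an assumption about how the partial match treats separators.

```python
import re


def _normalise(text):
    # Assumption: compare on lowercase letters and digits only.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def _matches_any(name, keywords):
    """True if any non-empty normalised keyword occurs within the name."""
    normalised = (_normalise(keyword) for keyword in keywords)
    return any(keyword and keyword in name for keyword in normalised)


def is_flagged(field_name, keywords, ignored_keywords):
    """Ignored keywords win; otherwise any keyword match flags the field."""
    name = _normalise(field_name)
    if _matches_any(name, ignored_keywords):
        return False
    return _matches_any(name, keywords)


flagged = is_flagged("credit_CARD_NUMBER", ["Credit card"], [])
ignored = is_flagged("credit_CARD_NUMBER", ["Credit card"], ["card_number"])
```

Here `flagged` is true because the keyword matches, while `ignored` is false because the ignored keyword takes priority over the same match.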
Specifying files to discover
Supported filetypes for discovery are:
- JSON (.json)
- NDJSON (.ndjson)
- Parquet (.parquet)
- CSV (.csv)
Note: File types are determined solely by the file extension, not by the file content.
Use the include, skip and recurse options to control which files are included in the discovery process.
These have the same syntax and meaning as in a from_file task definition.
If none of these options are included, DataMasque will run discovery against all files (of the supported filetypes) in the base directory specified on the connection, but will not recurse into subdirectories.
See also Choosing files to mask with include/skip for an exact specification of the behaviour of, and some common examples of, include and skip rules.
Warning! If a file matches both an include and a skip rule, that file will not be included in data discovery.
Note: Take care to correctly escape backslashes in include or skip regexes. For example, if you want to match a literal dot (.) in a filename, the regex needs to escape the dot with a backslash, and this backslash must itself be escaped as part of JSON encoding rules, since the request body is in JSON format. So you might use the JSON object {"regex": "file\\.[0-9]+\\.csv"}, representing the regex file\.[0-9]+\.csv, which will match file.53.csv but not filex53.csv.
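Letting a JSON library do the encoding avoids hand-escaping mistakes. The short sketch below shows the double escaping produced by Python's json module; the use of `re.fullmatch` to test the pattern is an assumption for illustration, since this page does not say how DataMasque anchors the regex.

```python
import json
import re

# The regex as you would write it by hand: a literal dot is "\.".
pattern = r"file\.[0-9]+\.csv"

# json.dumps adds the second level of escaping required inside JSON.
encoded = json.dumps({"regex": pattern})
print(encoded)  # {"regex": "file\\.[0-9]+\\.csv"}

# Decoding the JSON recovers the original single-backslash regex.
decoded = json.loads(encoded)["regex"]

matches_dotted = re.fullmatch(decoded, "file.53.csv") is not None
matches_other = re.fullmatch(decoded, "filex53.csv") is not None
```

`matches_dotted` is true and `matches_other` is false, consistent with the example filenames in the note above.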
Encoding of CSV files
The encoding option controls how DataMasque interprets CSV files.
The default encoding is utf-8.
Refer to Python Standard Encodings for a list of supported encodings.
Supported Parquet column types
The list of Parquet column data types supported by file data discovery is the same as the list of supported data types for Parquet masking. See the list of supported data types here.
For complex columns (those of struct, map and list type), also called nested columns, all fields of scalar data type within the columns are discovered separately.
In the file discovery reports, the locators for the individual scalar fields are given as JSON paths with the column name as the first element.
Note: This differs from the syntax used for masking these fields, where the column name must be specified separately from the path to the field within the column.
For example, with a column named staff of type map<string, struct<name: string, employee_id: int64, salary_history: list<float>>> (a map where the keys are strings and the values are a structure type with keys name, employee_id, and salary_history, the latter being a list), the discovered fields will all have one of the following path formats:
staff/<key value>/name
staff/<key value>/employee_id
staff/<key value>/salary_history/*
where <key value> is a key in the top-level map.
Notice that all list indices are replaced with the wildcard * and treated as a single field.
Custom and ignored keywords match on the name of the individual field (such as name in the above example), not the name of the column.
For list fields, they match on the last string element of the path (ignoring list indices), for example salary_history.
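The path formats above can be reproduced by walking a nested value and collapsing list indices into the wildcard. This is a sketch of the idea only, not DataMasque's code; `discovery_paths` and `keyword_field_name` are hypothetical helpers and the sample value is invented.

```python
def discovery_paths(column_name, value):
    """Return the sorted scalar-field paths for one nested column value."""
    paths = set()

    def walk(node, prefix):
        if isinstance(node, dict):
            for key, child in node.items():
                walk(child, f"{prefix}/{key}")
        elif isinstance(node, list):
            for child in node:
                walk(child, f"{prefix}/*")  # every index collapses to "*"
        else:
            paths.add(prefix)  # scalar leaf

    walk(value, column_name)
    return sorted(paths)


def keyword_field_name(path):
    """Last path element that is not a list wildcard."""
    return [part for part in path.split("/") if part != "*"][-1]


# One value of the staff column from the example, with key "alice".
staff_value = {"alice": {"name": "Alice", "employee_id": 7,
                         "salary_history": [50000.0, 52000.0]}}
paths = discovery_paths("staff", staff_value)
```

Both entries of salary_history produce the same wildcard path, so the list is reported as a single field, and `keyword_field_name` returns salary_history for it.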
In-data discovery options
The in_data_discovery parameter on the API request body allows you to control whether and how the discovery process uses in-data discovery to refine sensitive data matches.
It is an object parameter with the following fields.
- You must specify the enabled parameter (true or false).
- Optional parameters are a row_sample_size (positive integer), force (a boolean), a list of zero or more custom_rules, and a list of zero or more non_sensitive_rules.
- Each entry in custom_rules is an object with parameters name and pattern, where name is any user-defined name and pattern is a regex.
- Each entry in non_sensitive_rules is an object with a pattern parameter, again a regex.
- row_sample_size defaults to 1000. force defaults to false. custom_rules and non_sensitive_rules are empty by default.
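A client can apply these defaults and shape checks before sending a request. The helper below is a hypothetical sketch rather than part of any DataMasque client library, and compiling each pattern with Python's re module is only an approximate check, since the regex dialect DataMasque accepts is not specified here.

```python
import re


def build_in_data_discovery(enabled, row_sample_size=1000, force=False,
                            custom_rules=(), non_sensitive_rules=()):
    """Return an in_data_discovery object with the documented defaults."""
    if (isinstance(row_sample_size, bool)
            or not isinstance(row_sample_size, int)
            or row_sample_size <= 0):
        raise ValueError("row_sample_size must be a positive integer")
    for rule in custom_rules:
        if set(rule) != {"name", "pattern"}:
            raise ValueError("custom rules need 'name' and 'pattern'")
    for rule in non_sensitive_rules:
        if set(rule) != {"pattern"}:
            raise ValueError("non-sensitive rules need only 'pattern'")
    for rule in [*custom_rules, *non_sensitive_rules]:
        re.compile(rule["pattern"])  # raises re.error if invalid in Python
    return {
        "enabled": bool(enabled),
        "row_sample_size": row_sample_size,
        "force": bool(force),
        "custom_rules": list(custom_rules),
        "non_sensitive_rules": list(non_sensitive_rules),
    }


options = build_in_data_discovery(
    True,
    row_sample_size=500,
    custom_rules=[{"name": "temp_staff", "pattern": "temp.*"}],
)
```

The returned dictionary can be used as the value of in_data_discovery in the request body.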
When enabled, in-data discovery applies the built-in rules, alongside any specified custom_rules and non_sensitive_rules, matching against the data within tabular file columns, or scalar values within JSON documents or complex Parquet columns.
Warning! Non-sensitive rules have priority. If a field or attribute name matches a keyword, built-in IDD rule or custom IDD rule, and also matches a non-sensitive rule, the field or attribute will be marked in the discovery results as Custom Non-Sensitive.
The row_sample_size controls how many samples the in-data discovery process will examine to try to identify the type of data.
Configure the row_sample_size according to your needs, bearing in mind that in-data discovery samples only the first <row_sample_size> rows or values encountered when processing the file (so the first 1000 rows in a CSV file, for example, with the default sample size).
Use of very large sample sizes can slow down data discovery and consume a lot of RAM (see also this table of memory limits for in-data discovery).
- If your files are small and/or consistent in that they have the same kind of data present in most or all rows, then a sample size of 100-500 rows is sufficient.
- If you have large files with sparse data (many nulls) and/or differing data formats within a column or JSON path, use a larger sample size.
When enabled, force will run IDD on a column even if schema discovery has already flagged the column as containing sensitive data.
POST /api/run-file-data-discovery/ Parameters
Field | Type | Required | Description |
---|---|---|---|
connection | string | Yes | The id of the Connection. |
in_data_discovery | object | No | In-data discovery options. An object containing enabled, row_sample_size, custom_rules, non_sensitive_rules and force options. Defaults to {enabled: false}. |
custom_keywords | array[string] | No | List of keywords that, where a field or attribute's name matches one or more of the keywords, indicates the field or attribute contains sensitive data. Default value is an empty list. |
ignored_keywords | array[string] | No | List of keywords that, where a field or attribute's name matches one or more of the keywords, indicates the field or attribute should be excluded from the discovery results. Default value is an empty list. |
disable_global_custom_keywords | boolean | No | If set to true, then the user-defined global set of custom keywords will not be used to flag fields or attributes as sensitive. Default value is false. |
disable_global_ignored_keywords | boolean | No | If set to true, then the user-defined global set of ignored keywords will not be used to exclude fields or attributes from the discovery results. Default value is false. |
disable_built_in_keywords | boolean | No | If set to true, then DataMasque's built-in list of keywords will not be used to flag fields or attributes as sensitive. Default value is false. |
include | array[object] | No | Files to discover, specified as glob or regex. Default value is an empty list, meaning everything will be included. |
skip | array[object] | No | Files to exclude, specified as glob or regex. Default value is an empty list, meaning no files will be excluded. |
recurse | boolean | No | Whether to recurse into subdirectories of the base directory, or of items matched by include. Default value is false. |
encoding | string | No | File byte encoding. Only applies to CSV files. Default value is utf-8. |
workers | integer | No | Number of workers. Refer to the File Ruleset Generator page for information. Allowed range is 1-32. Defaults to 1. |
POST /api/run-file-data-discovery/ Responses
Data discovery runs asynchronously as a special type of masking run.
This API endpoint returns a Run object which contains an id field.
Use the GET /api/runs/{id}/ endpoint with this run ID to query the status of the data discovery process.
To retrieve the file discovery results when the run is complete, use the GET /api/runs/{id}/file-discovery-results/ endpoint with this run ID.
Status Code | Description |
---|---|
201 | A JSON serialised Run object. |
POST /api/run-file-data-discovery/ curl example
curl -X POST "https://<your-datamasque-host>/api/run-file-data-discovery/" \
  -H "Authorization: Token <your-api-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "connection": "<your-connection-id>",
    "in_data_discovery": {
      "enabled": true,
      "row_sample_size": 500,
      "custom_rules": [
        {
          "name": "temp_staff",
          "pattern": "temp.*"
        }
      ],
      "non_sensitive_rules": [
        {"pattern": "retired.*"}
      ],
      "force": false
    },
    "custom_keywords": ["id1", "id2"],
    "ignored_keywords": ["ignore1"],
    "include": [
      {"glob": "*.ndjson"},
      {"glob": "*.json"}
    ],
    "skip": [
      {"regex": "backup/staff[0-9]+\\.json"}
    ],
    "recurse": true,
    "workers": 4
  }'
GET /api/runs/{id}/file-discovery-results/
Authorization: User token or API token.
Retrieve file discovery results.
GET /api/runs/{id}/file-discovery-results/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
id | integer | Yes | URL Path | The id of the Run. |
GET /api/runs/{id}/file-discovery-results/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised list of File Discovery Result objects. |
GET /api/runs/{id}/file-discovery-results/ curl example
curl "https://<your-datamasque-host>/api/runs/{id}/file-discovery-results/" \
-H "Authorization: Token <your-api-token>"
GET /api/runs/{id}/file-discovery-results/ Example response
This shows a group of results where one file was discovered with a Metadata Discovery match on PassengerId, an In-Data Discovery match on Name and no matches on Ticket.
[
{
"id": 1,
"connection": {
"id": "f795b7f1-d654-41c8-bb7c-db741d81dc19",
"name": "example_file_source"
},
"file_type": "csv",
"files": [
{
"path": "example.csv",
"delimiter": ",",
"encoding": "utf-8",
"file_type": "csv"
}
],
"results": [
{
"locator": "PassengerId",
"matches": [
{
"label": "identifiers",
"categories": ["PII", "PHI"],
"flagged_by": "Metadata Discovery",
"description": "Identification"
}
],
"data_types": ["int"]
},
{
"locator": "Name",
"matches": [
{
"label": "name",
"categories": ["PII", "PCI", "PHI"],
"flagged_by": "In-Data Discovery",
"description": "Full Names"
}
],
"data_types": ["str"]
},
{
"locator": "Ticket",
"matches": [],
"data_types": ["str"]
}
]
}
]
File Discovery Result Object
File Discovery Result objects have the following fields:
Field | Type | Description |
---|---|---|
id | integer | The id of the File Discovery Result. |
connection | object | The UUID and name identifying the connection used for this File Discovery Result. |
file_type | string | The file type (csv, parquet, json, or ndjson). File Discovery Results are grouped by file type. |
files | array[object] | A list of File objects. |
results | array[object] | A list of Result objects. |
File Object
File objects have the following fields:
Field | Type | Description |
---|---|---|
path | string | The discovered file's path, relative to the base directory of the connection. |
file_type | string | The file type (csv, parquet, json, or ndjson). |
delimiter | Optional[string] | For delimited text files, the field separator, e.g. "," for CSV. |
encoding | Optional[string] | The file encoding, for example "utf-8". |
Result Object
Result objects have the following fields:
Field | Type | Description |
---|---|---|
locator | array['string' or 'int'] or string | Either a JSON locator or a column name. |
matches | array['object'] | A list of Match objects. |
data_types | array['string'] | The list of data types found for this field: int, long, str, date, time, year, timestamp, boolean, float, or decimal. |
Match Object
Match objects have the following fields:
Field | Type | Description |
---|---|---|
categories | array['string'] | A list of classifications for the flagged sensitive data: PII, PHI, PCI and/or Custom. |
flagged_by | string | Whether the column was flagged for sensitive data through in-data discovery or through the standard sensitive data discovery / keyword matching process. One of Metadata Discovery or In-Data Discovery. |
description | string | The name of the rule which caused the column to be flagged for sensitive data. |
label | string | Machine-readable representation of description. |
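File Discovery Results map naturally onto the selected_data file groups used by the ruleset generation endpoints. The sketch below shows one possible conversion; `to_selected_data` is a hypothetical helper, and it selects every locator that has at least one match, which may include matches you would prefer to exclude (for example, user-defined non-sensitive rules).

```python
def to_selected_data(discovery_results):
    """Convert File Discovery Results into selected_data file groups."""
    groups = []
    for group in discovery_results:
        # Keep only locators that discovery flagged with at least one match.
        locators = [result["locator"] for result in group["results"]
                    if result["matches"]]
        if locators:
            groups.append({
                "files": [file["path"] for file in group["files"]],
                "locators": locators,
            })
    return groups


# Trimmed version of the example response shown earlier in this section.
example_results = [{
    "id": 1,
    "file_type": "csv",
    "files": [{"path": "example.csv", "file_type": "csv"}],
    "results": [
        {"locator": "PassengerId", "matches": [{"label": "identifiers"}]},
        {"locator": "Name", "matches": [{"label": "name"}]},
        {"locator": "Ticket", "matches": []},
    ],
}]
selected_data = to_selected_data(example_results)
```

Because each File Discovery Result is already grouped by file type, each one becomes a single file group here.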
Oracle Wallets
GET /api/oracle-wallets/
Authorization: User token only.
Returns a list of Oracle wallets. These are used to connect to encrypted Oracle connections.
GET /api/oracle-wallets/ Parameters
No parameters.
GET /api/oracle-wallets/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised list of Oracle wallets. |
GET /api/oracle-wallets/ curl example
curl "https://<your-datamasque-host>/api/oracle-wallets/" \
-H "Authorization: Token <your-api-token>"
POST /api/oracle-wallets/
Authorization: User token only.
Create a new Oracle wallet.
POST /api/oracle-wallets/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
name | string | Yes | Form Field | The name of the Oracle Wallet. |
zip_archive | file | Yes | Form Field | The Zip archive file. |
POST /api/oracle-wallets/ Responses
Status Code | Description |
---|---|
201 | A JSON serialised Oracle wallet object of the wallet created. |
POST /api/oracle-wallets/ curl example
curl -X POST "https://<your-datamasque-host>/api/oracle-wallets/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "name=<wallet_name>" \
-F "zip_archive=@</path/to/your/file.zip>"
DELETE /api/oracle-wallets/{id}/
Authorization: User token only.
Delete the Oracle wallet with the specified id.
DELETE /api/oracle-wallets/{id}/ Parameters
No parameters.
DELETE /api/oracle-wallets/{id}/ Responses
Status Code | Description |
---|---|
204 | Operation succeeded. |
DELETE /api/oracle-wallets/{id}/ curl example
curl -X DELETE "https://<your-datamasque-host>/api/oracle-wallets/{id}/" \
-H "Authorization: Token <your-api-token>"
Git Related Endpoints
Git Setting Object
Git settings are global for the DataMasque instance and can only be updated by an admin user. Git settings are updated on the Settings page in the DataMasque UI.
Git Setting objects have the following fields:
Field | Type | Description |
---|---|---|
git_repository_url | string | The URL of where the Git repository is hosted. |
git_branch | string | The name of the Git branch from which DataMasque will push or pull. |
git_directory_path | string | The directory that DataMasque will push and pull rulesets to, relative to the root of the repository. Note that DataMasque does not support pushing/pulling rulesets in subdirectories of this directory. |
GET /api/git-setting/
Authorization: User token only.
Retrieve a Git Setting Object with information about the DataMasque instance's Git settings.
GET /api/git-setting/ Parameters
No parameters.
GET /api/git-setting/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized Git Setting Object for the DataMasque instance. |
GET /api/git-setting/ curl example
curl "https://<your-datamasque-host>/api/git-setting/" \
-H "Authorization: Token <your-api-token>"
GET /api/git-setting/user/
Authorization: User token only.
Retrieve a Git Setting Object with information about the DataMasque instance's Git settings.
If the current user has specified a git_directory_path, this will be present in the response.
Otherwise, the git_directory_path will be the global one for the DataMasque instance.
GET /api/git-setting/user/ Parameters
No parameters.
GET /api/git-setting/user/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized Git Setting Object for the DataMasque instance. |
GET /api/git-setting/user/ curl example
curl "https://<your-datamasque-host>/api/git-setting/user/" \
-H "Authorization: Token <your-api-token>"
SSH Key Object
SSH Key objects have the following fields:
Field | Type | Description |
---|---|---|
name | string | The specified filename of the SSH Key file. |
date_uploaded | string | The ISO 8601 datetime string of when the user uploaded the SSH key. |
GET /api/git-ssh-key/
Authorization: User token only.
Retrieve an SSH Key Object for information about the current user's uploaded SSH Key.
GET /api/git-ssh-key/ Parameters
No parameters.
GET /api/git-ssh-key/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized SSH Key Object which is the most recent SSH Key Upload for the user which made the request. |
GET /api/git-ssh-key/ curl example
curl "https://<your-datamasque-host>/api/git-ssh-key/" \
-H "Authorization: Token <your-api-token>"
PUT /api/git-ssh-key/
Authorization: User token only.
Upload an SSH Key to be used to access a Git remote repository.
Warning: A user may have only one SSH key at a time, so the existing key will be deleted and replaced with the uploaded key for the user making the request.
PUT /api/git-ssh-key/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
key_file | file | Yes | Form Field | The SSH Key file. |
name | string | Yes | Form Field | The name of the file. |
PUT /api/git-ssh-key/ Responses
Status Code | Description |
---|---|
200 | A JSON serialized SSH Key Object, which is the most recent SSH Key Upload for the user making the request. |
PUT /api/git-ssh-key/ curl example
curl -X PUT "https://<your-datamasque-host>/api/git-ssh-key/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "key_file=@</path/to/your/file>" \
-F "name=<your-ssh-key-filename>"
DELETE /api/git-ssh-key/
Authorization: User token only.
Delete the current user's uploaded SSH key.
DELETE /api/git-ssh-key/ Parameters
No parameters.
DELETE /api/git-ssh-key/ Responses
Status Code | Description |
---|---|
204 | The SSH key associated with the requesting user has been deleted. |
DELETE /api/git-ssh-key/ curl example
curl -X DELETE "https://<your-datamasque-host>/api/git-ssh-key/" \
-H "Authorization: Token <your-api-token>"
GET /api/ruleset-git/
Authorization: User token only.
Pull the content of a specific ruleset given its commit ID. The current user's Git SSH key is used for authentication.
How File Paths Are Built
Internally, DataMasque generates the name of the file by appending the specified extension to ruleset_name.
The file name is then appended to git_directory_path (from the DataMasque Git Settings) to build the full file path.
For example, for a ruleset_name of My Ruleset, extension of .yml and git_directory_path of masking/rulesets, the file masking/rulesets/My Ruleset.yml will be retrieved.
Its contents will be as they were at the specified commit ID.
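The path construction described above can be sketched in a few lines of Python. This is only an illustration of the documented rule, not DataMasque's internal code; the function name build_ruleset_git_path is hypothetical.

```python
import posixpath

# The API accepts only these two extensions for rulesets stored in Git.
ALLOWED_EXTENSIONS = {".yml", ".yaml"}


def build_ruleset_git_path(git_directory_path, ruleset_name, extension=".yml"):
    """Return the repository-relative path of a ruleset file."""
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension must be one of {sorted(ALLOWED_EXTENSIONS)}")
    # File name = ruleset name + extension, appended to the configured directory.
    return posixpath.join(git_directory_path, ruleset_name + extension)


print(build_ruleset_git_path("masking/rulesets", "My Ruleset", ".yml"))
```

For the values used in the example above, this prints masking/rulesets/My Ruleset.yml.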
GET /api/ruleset-git/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
commit_id | string | Yes | Query Parameter | The Git commit ID for the ruleset. |
ruleset_name | string | Yes | Query Parameter | The name of the ruleset. Used to build the path as per How File Paths Are Built above. |
extension | string | No | Query Parameter | The extension to append to the ruleset name. Must be .yml or .yaml. Defaults to .yml if omitted. |
GET /api/ruleset-git/ Responses
Status Code | Description |
---|---|
200 | A JSON object with a single key, config_yaml, that contains the ruleset content. |
GET /api/ruleset-git/ curl example
curl "https://<your-datamasque-host>/api/ruleset-git/?commit_id=<your-full-commit-id>&ruleset_name=<your-ruleset-name>&extension=.yaml" \
-H "Authorization: Token <your-api-token>"
POST /api/ruleset-git/
Authorization: User token only.
Commit then push changes upstream for a specific ruleset.
POST /api/ruleset-git/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
commit_message | string | Yes | Request Body | The Git commit message for the ruleset changes. |
ruleset_name | string | Yes | Request Body | The name of the ruleset. Used to build the path as per How File Paths Are Built above. |
extension | string | No | Request Body | The extension to append to the ruleset name. Must be .yml or .yaml. Defaults to .yml if omitted. |
ruleset_content | string | Yes | Request Body | The YAML contents of the ruleset. |
POST /api/ruleset-git/ Responses
Status Code | Description |
---|---|
200 | Operation succeeded. |
POST /api/ruleset-git/ curl example
curl -X POST "https://<your-datamasque-host>/api/ruleset-git/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{
"commit_message": "Update ruleset",
"ruleset_name": "<your-ruleset-name>",
"extension": ".yml",
"ruleset_content": "version: \"1.0\"\ntasks:\n - type: run_data_discovery"
}'
GET /api/ruleset-git/files/
Authorization: User token only.
This endpoint lists the git_directory_path in the remote repository configured for the DataMasque instance.
It considers any files ending in .yml to be ruleset files, and will fetch the list of commits for each of them.
It does not enter into subdirectories of git_directory_path.
GET /api/ruleset-git/files/ Parameters
No parameters.
GET /api/ruleset-git/files/ Responses
Example Response
The response is a JSON object with each key being the name of a file with a .yml extension in the git_directory_path.
Each file entry has an array of objects, each with a commit ID, a commit date and a commit message.
{
"Ruleset1.yml": [
{"commit": "f061s…46756", "date": "2024-01-10 12:31:45", "message": "Added Column"},
{"commit": "64c18…1a279", "date": "2024-01-09 10:19:13", "message": "Removed Column"}
],
"Another Ruleset.yml": [
{"commit": "377f5…b32f4", "date": "2023-12-25 12:31:45", "message": "Update rule"}
]
}
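A response of this shape can be reduced to the most recent commit per ruleset file with a short Python sketch. It assumes the date strings always follow the YYYY-MM-DD HH:MM:SS format seen in the example, and it does not rely on the order of the commit arrays, since that order is not stated here. The commit IDs below are placeholders.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example response


def latest_commits(files_response):
    """Map each ruleset file name to the commit ID with the newest date."""
    latest = {}
    for filename, commits in files_response.items():
        if not commits:
            continue  # no history for this file
        newest = max(commits, key=lambda c: datetime.strptime(c["date"], DATE_FORMAT))
        latest[filename] = newest["commit"]
    return latest


# Placeholder commit IDs; real responses contain full commit hashes.
sample_response = {
    "Ruleset1.yml": [
        {"commit": "<commit-id-1>", "date": "2024-01-09 10:19:13", "message": "Removed Column"},
        {"commit": "<commit-id-2>", "date": "2024-01-10 12:31:45", "message": "Added Column"},
    ],
    "Another Ruleset.yml": [
        {"commit": "<commit-id-3>", "date": "2023-12-25 12:31:45", "message": "Update rule"},
    ],
}

print(latest_commits(sample_response))
```

The commit ID returned for a file can then be passed as commit_id to GET /api/ruleset-git/.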
Response Codes
Status Code | Description |
---|---|
200 | A JSON serialized list of ruleset names and their associated Git commit history. |
GET /api/ruleset-git/files/ curl example
curl "https://<your-datamasque-host>/api/ruleset-git/files/" \
-H "Authorization: Token <your-api-token>"
Exporting DataMasque Configuration
To keep a backup of the data stored in DataMasque, you can export it to a Zip file.
This is done by making a GET request to /api/export/v1/.
Optionally, you can also specify the export_type query parameter to select which data to include in the export.
The parameter may be specified multiple times to include different types of data in the same Zip file.
The Zip file will have the following structure, but note that some files or directories may be missing if an export_type was set that excludes them.
Path | Type | Description |
---|---|---|
manifest.json | File | A JSON file containing metadata about the export and other files in the Zip. |
rulesets/database/ | Directory | A directory containing database masking rulesets in YAML format. |
rulesets/file/ | Directory | A directory containing file masking rulesets in YAML format. |
Export Types
The following export types may be used to control the data included in the export archive:
Currently, only the export of rulesets is supported, so there is no difference between specifying rulesets as the export_type and omitting the export_type parameter completely.
Export Type | Description |
---|---|
all | Include all data described in this table. This is the default if no export_type is selected. |
rulesets | Include only rulesets. |
manifest.json format
The manifest.json file contains the following information:
- metadata: Metadata about the export archive.
  - version: The version format of the export file.
  - exported_at: The UTC date and time the export was created, in ISO format.
- data: Information about the files included in the export archive.
  - rulesets: A list of metadata about the exported rulesets. Each object in the list contains the id, name and type (database or file) of an exported ruleset.
Ruleset Export Naming
When rulesets are exported to a Zip archive, they are stored in either the rulesets/database/ directory (for database rulesets) or the rulesets/file/ directory (for file rulesets).
The name of the file is built by appending .yml to the ruleset name. For example:
- The database masking ruleset named Ruleset 01 would be exported to rulesets/database/Ruleset 01.yml.
- The file masking ruleset named Ruleset F would be exported to rulesets/file/Ruleset F.yml.
Note: Rulesets that have been deleted from DataMasque are not visible in the ruleset list in the DataMasque dashboard, but are still retained in the DataMasque database because runs reference them. These "archived" rulesets are not included in the Zip export.
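To show how manifest.json and the naming rule relate, the Python sketch below reads the manifest from an export archive and checks that each listed ruleset is present at the expected path. The exact JSON nesting is assumed from the field list above, and the archive, IDs and version value here are built in memory purely for illustration.

```python
import io
import json
import zipfile


def expected_ruleset_paths(zip_bytes):
    """Return (name, expected path, present?) for each ruleset in manifest.json."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        members = set(archive.namelist())
    report = []
    for ruleset in manifest["data"]["rulesets"]:
        # Naming rule: rulesets/<type>/<ruleset name>.yml
        path = f"rulesets/{ruleset['type']}/{ruleset['name']}.yml"
        report.append((ruleset["name"], path, path in members))
    return report


# Build a tiny stand-in archive; a real one comes from GET /api/export/v1/.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("manifest.json", json.dumps({
        "metadata": {"version": "1", "exported_at": "2024-02-11T09:15:07Z"},
        "data": {"rulesets": [
            {"id": "<ruleset-id-1>", "name": "Ruleset 01", "type": "database"},
            {"id": "<ruleset-id-2>", "name": "Ruleset F", "type": "file"},
        ]},
    }))
    archive.writestr("rulesets/database/Ruleset 01.yml", 'version: "1.0"\n')
    archive.writestr("rulesets/file/Ruleset F.yml", 'version: "1.0"\n')

for name, path, present in expected_ruleset_paths(buffer.getvalue()):
    print(name, path, present)
```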
GET /api/export/v1/
Authorization: User token only.
Export DataMasque data to a Zip archive in the Version 1 format.
The filename of the archive will be based on the export type selected, and contain the current UTC date and time. For example: datamasque_export_rulesets_20240211-091507.zip.
GET /api/export/v1/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
export_type | string | No | Query Parameter | The type of data to export (see Export Types for a full list). Defaults to all. |
Multiple export types may be specified by using multiple export_type query parameters.
For example, /api/export/v1/?export_type=type_a&export_type=type_b.
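When building the URL programmatically, repeating the export_type parameter is straightforward with Python's standard library. The helper name and host below are hypothetical, and type_a and type_b are the same placeholder values used in the example above.

```python
from urllib.parse import urlencode


def build_export_url(host, export_types=()):
    """Build the export URL, repeating export_type once per requested type."""
    base = f"https://{host}/api/export/v1/"
    if not export_types:
        return base  # no parameter: the server default (all) applies
    query = urlencode([("export_type", export_type) for export_type in export_types])
    return f"{base}?{query}"


print(build_export_url("datamasque.example.com", ["type_a", "type_b"]))
```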
GET /api/export/v1/ curl example
When using curl, specify the -O flag to output the response to disk, and the -J flag to allow the response to specify the name (as per the example above).
curl "https://<your-datamasque-host>/api/export/v1/" \
-H "Authorization: Token <your-api-token>" \
-J -O
A Zip file named like datamasque_export_all_20240211-091507.zip will be saved to the current directory.
Importing DataMasque Configuration
A DataMasque export Zip can be imported to a DataMasque install using the /api/import/v1/ API endpoint.
For the best import experience, a Zip that has been exported from DataMasque and that contains a manifest.json file should be used.
However, a Zip with the correct folder structure may also be created manually, even if it is missing manifest.json.
DataMasque will import the information, but automatic conflict resolution of duplicate rulesets will not work as well.
The difference between including and excluding manifest.json is explained below.
Zip Exports From DataMasque With manifest.json
Since Zip exports created by DataMasque include the UUID of each exported item, this can be used to determine which items already exist.
When importing rulesets:
- If a ruleset with a given ID exists during import:
  - If the ruleset is archived, then it will be restored and its name and content are updated with the imported ruleset.
  - If the ruleset is not archived, then no action is taken with that ruleset. The content in the DataMasque instance is unchanged.
- If a ruleset is found with a matching name, and the contents are identical, then no action is taken. The content in the DataMasque instance is unchanged.
- If a ruleset is found with a matching name, but the contents are different, then a new ruleset is created by appending Copy to the name. For example, if Ruleset A exists, then the content will be uploaded to a ruleset Ruleset A Copy. An incrementing number will be added until an unused name is found, for example, Copy 1, Copy 2, etc.
- If no ruleset with the given ID or name exists, then it is created.
Because of these rules, imports of the same Zip archive may be repeated multiple times without duplicating content.
Zip Exports Created Without manifest.json
A Zip export archive may be created manually, provided the file structure is correct.
That is, it matches the structure outlined in Exporting DataMasque Configuration.
Without a manifest.json, the ID of rulesets is not known, so matching is done based on the name, using the following rules:
- If a ruleset is found with a matching name, and the contents are identical, then no action is taken. The content in the DataMasque instance is unchanged.
- If a ruleset is found with a matching name, but the contents are different, then a new ruleset is created by appending Copy to the name. For example, if Ruleset A exists, then the content will be uploaded to a ruleset Ruleset A Copy. An incrementing number will be added until an unused name is found, for example, Copy 1, Copy 2, etc.
- If no ruleset with the given name exists, then it is created.
Because the imported IDs of rulesets are not known, re-running an import without a manifest.json may result in duplicated rulesets with identical content.
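The matching rules for both kinds of archive can be summarised as a small decision function. The Python sketch below only mirrors the rules as documented here and is not DataMasque's implementation. It assumes that name matching considers non-archived rulesets only, and that copies are named "<name> Copy", then "<name> Copy 1", "<name> Copy 2", and so on. The function name plan_import is hypothetical.

```python
def plan_import(existing, name, content, exported_id=None):
    """Return (status, imported_name) for one incoming ruleset.

    existing: list of dicts with the keys id, name, content and archived.
    exported_id: the ID from manifest.json, or None when there is no manifest.
    """
    # Rule 1 (manifest only): match on ID first.
    if exported_id is not None:
        for ruleset in existing:
            if ruleset["id"] == exported_id:
                status = "RESTORED" if ruleset["archived"] else "NOT_CREATED"
                return status, name

    # Rule 2: fall back to matching on name (assumption: active rulesets only).
    active = {r["name"]: r for r in existing if not r["archived"]}
    if name not in active:
        return "CREATED", name
    if active[name]["content"] == content:
        return "NOT_CREATED", name

    # Same name, different content: find an unused "Copy" name.
    candidate = f"{name} Copy"
    counter = 1
    while candidate in active:
        candidate = f"{name} Copy {counter}"
        counter += 1
    return "CREATED_DUPLICATE", candidate


existing_rulesets = [
    {"id": "a", "name": "Ruleset A", "content": "x", "archived": False},
    {"id": "c", "name": "Ruleset A Copy", "content": "y", "archived": False},
    {"id": "e", "name": "Ruleset E", "content": "old", "archived": True},
]

print(plan_import(existing_rulesets, "Ruleset A", "changed"))
print(plan_import(existing_rulesets, "Ruleset E", "new", exported_id="e"))
```

The status values returned are the same ones reported in the status field of the import response.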
POST /api/import/v1/
Authorization: User token only.
Import a DataMasque export Zip file. The response will contain a list of actions taken for each included object.
POST /api/import/v1/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
zip_archive | file | Yes | Form Field | The exported Zip archive file. |
POST /api/import/v1/ Responses
The response of an import request contains information about the resources that were imported, grouped by resource type. An example response is shown below.
{
"data": {
"rulesets": {
"metadata": {"processed": 6, "created": 2, "restored": 1, "error": 1},
"data": [
{
"exported_name": "Ruleset A",
"exported_id": "9d641e97-adf7-4f22-9089-afc3711bf222",
"imported_name": "Ruleset A",
"imported_id": "9d641e97-adf7-4f22-9089-afc3711bf222",
"ruleset_type": "database",
"status": "NOT_CREATED",
"message": "A ruleset with ID \"9d641e97-adf7-4f22-9089-afc3711bf222\" already exists, and was not changed."
},
{
"exported_name": "Ruleset B",
"exported_id": null,
"imported_name": "Ruleset B Copy",
"imported_id": "04ea20f0-ad4c-498e-881f-b0bc79d83ba7",
"ruleset_type": "file",
"status": "CREATED_DUPLICATE",
"message": "A ruleset named \"Ruleset B\" already exists, so ruleset \"Ruleset B Copy\" was created."
},
{
"exported_name": "Ruleset C",
"exported_id": null,
"imported_name": "Ruleset C",
"imported_id": "7d731d55-68c9-400e-a790-e052afe789cc",
"ruleset_type": "database",
"status": "NOT_CREATED",
"message": "A ruleset named \"Ruleset C\" exists with identical content."
},
{
"exported_name": "Ruleset D",
"exported_id": null,
"imported_name": "Ruleset D",
"imported_id": "99eeffd3-3f65-4ed7-8ad1-a31a539b7b2c",
"ruleset_type": "file",
"status": "CREATED",
"message": "Ruleset named \"Ruleset D\" did not exist, and was created."
},
{
"exported_name": "Ruleset E",
"exported_id": "c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e",
"imported_name": "Ruleset E",
"imported_id": "c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e",
"ruleset_type": "database",
"status": "RESTORED",
"message": "An archived ruleset with ID \"c0f5b5bb-a2ce-4cea-9248-1b8ef6539a0e\" has been restored and overwritten with the new name and content."
},
{
"exported_name": "Ruleset F",
"exported_id": "abc123",
"imported_name": null,
"imported_id": null,
"ruleset_type": "database",
"status": "ERROR",
"message": "Import of ruleset with ID \"abc123\" due to error: invalid ID."
}
]
}
}
}
The metadata for each item type shows the number of items of that type processed, and how many of each one were created, restored or had an error.
Each data object contains information about the import of that item. The fields are:
- exported_name: The name of the ruleset in the export Zip archive.
- exported_id: The ID of the ruleset from the export Zip archive. Only available if a manifest.json file is present, otherwise this will be null.
- imported_name: The name that the ruleset was imported to. Usually this will match exported_name. This will only be null on error. If the ruleset was not imported due to it already existing, this will still match exported_name.
- imported_id: The ID that the ruleset was imported to. This will be generated if exported_id was null, otherwise it will be expected to match exported_id (even if the data was not changed). imported_id will be null on error.
- ruleset_type: One of database or file.
- status: The status of the import of this ruleset. One of:
  - NOT_CREATED: Ruleset was not created due to the ID existing or content being identical.
  - CREATED_DUPLICATE: A ruleset with that name existed, so it was imported with a new name (in imported_name).
  - CREATED: A ruleset with that ID or name did not exist, so was created.
  - RESTORED: An archived ruleset has been restored and overwritten with the new name and content from an imported ruleset.
  - ERROR: There was an error creating the ruleset. Check message for details.
- message: A human-readable message describing the action taken or error that occurred. Messages may change between DataMasque versions, so they should not be relied on to determine the outcome of an import. Instead, refer to the status field.
The status code of the response, as shown in the table below, indicates at a glance whether any resources were created.
Status Code | Description |
---|---|
200 | The import was successful, indicating either no changes (e.g. the uploaded rulesets already existed) or the successful restoration of some rulesets. |
201 | The import was successful, and one or more rulesets were created. |
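Since message text may change between versions, client code is better off keying on the status field. The following Python sketch tallies statuses per resource type and collects the names of failed items. The function name is hypothetical and the sample is a trimmed-down version of the example response above.

```python
from collections import Counter


def summarise_import(response):
    """Count statuses and list failed items for each resource type."""
    summary = {}
    for resource_type, section in response["data"].items():
        items = section["data"]
        statuses = Counter(item["status"] for item in items)
        # Rely on status, not message, to detect failures.
        errors = [item["exported_name"] for item in items if item["status"] == "ERROR"]
        summary[resource_type] = {"statuses": dict(statuses), "errors": errors}
    return summary


# Trimmed sample containing only the fields this sketch reads.
sample_import_response = {
    "data": {
        "rulesets": {
            "metadata": {"processed": 3, "created": 1, "restored": 0, "error": 1},
            "data": [
                {"exported_name": "Ruleset A", "status": "NOT_CREATED"},
                {"exported_name": "Ruleset D", "status": "CREATED"},
                {"exported_name": "Ruleset F", "status": "ERROR"},
            ],
        }
    }
}

print(summarise_import(sample_import_response))
```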
POST /api/import/v1/ curl example
curl -X POST "https://<your-datamasque-host>/api/import/v1/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "zip_archive=@</path/to/your/datamasque_export_all_20240211-091507.zip>"
Other API Requests
POST /api/users/admin-install/
Authorization: Anonymous, Only when no user has been created.
Verify the DataMasque installation, and set up an admin account.
POST /api/users/admin-install/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
email | string | Yes | Request Body | The email address for the admin account being created. |
username | string | Yes | Request Body | The username for the admin account being created. |
password | string | Yes | Request Body | The password for the user. |
re_password | string | Yes | Request Body | The password for the user again, to confirm the password entered above. |
allowed_hosts | array[string] | Yes | Request Body | A list of hostnames, IP addresses or CIDR networks that will be allowed to access DataMasque upon installation. |
aws_ec2_instance_id | string | Required only for AWS Marketplace installations. | Request Body | The ID of the AWS EC2 instance. |
contract_license_type | string | Required only for AWS Contract Product installations. | Request Body | For contract products, the type of product to check out. Must be either business or enterprise. |
POST /api/users/admin-install/ Responses
Status Code | Description |
---|---|
201 | A JSON serialised User object, with an extra warnings* item. |
* Any non-critical warnings that were generated during installation are included in the warnings item of the response. This is an array of strings.
POST /api/users/admin-install/ curl example
curl -X POST "https://<your-datamasque-host>/api/users/admin-install/" \
-H "Content-Type: application/json" \
-d '{
"email": "<your-admin-email>",
"username": "<your-username>",
"password": "<your-admin-password>",
"re_password": "<your-admin-password>",
"allowed_hosts": ["masque.local"],
"aws_ec2_instance_id": "<your-instance-id>"
}'
Installation Info Object
A JSON object showing the state of the current installation with the following data:
Field | Type | Description |
---|---|---|
is_aws_marketplace | boolean | Whether the current installation has been installed from the AWS marketplace. |
installed | boolean | Whether the current installation has been successfully installed. |
is_smtp_configured | boolean | Whether SMTP has been configured on the DataMasque instance. |
is_saml_sso_configured | boolean | Whether SAML SSO has been enabled on the DataMasque instance. |
- Requests that use Installation Info Object:
GET /api/app/check/
Authorization: User token or API token.
Checks to verify if DataMasque has successfully been installed.
GET /api/app/check/ Parameters
No parameters.
GET /api/app/check/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised Installation Info Object. |
GET /api/app/check/ curl example
curl "https://<your-datamasque-host>/api/app/check/" \
-H "Authorization: Token <your-api-token>"
POST /api/license-upload/
Authorization: User token only.
Uploads a licence file to DataMasque.
POST /api/license-upload/ Parameters
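Field | Type | Required | Location | Description |
---|---|---|---|---|
license_file | file | Yes | Form Field | The licence file to upload. |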
POST /api/license-upload/ Responses
Status Code | Description |
---|---|
200 | Operation succeeded. |
POST /api/license-upload/ curl example
curl -X POST "https://<your-datamasque-host>/api/license-upload/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: multipart/form-data" \
-F "license_file=@</path/to/your/license_file.lic>"
GET /api/license/contract-type/
Authorization: User token only.
For Cloud Contract Offer licenses, retrieve the type of license that has been configured to be used.
GET /api/license/contract-type/ Parameters
No parameters.
GET /api/license/contract-type/ Responses
Status Code | Description |
---|---|
200 | License type retrieved. |
400 | The licensing method is not of Cloud Contract type, so retrieving the license type is not supported. |
404 | The license type has not yet been specified. |
An example response is shown below.
{
"contract_license_type": "business"
}
contract_license_type must be one of:
- business
- enterprise
GET /api/license/contract-type/ curl example
curl "https://<your-datamasque-host>/api/license/contract-type/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json"
PUT /api/license/contract-type/
Authorization: Admin User token only.
For Cloud Contract Offer licenses, set the type of license to check out.
PUT /api/license/contract-type/ Parameters
Field | Type | Required | Location | Description |
---|---|---|---|---|
contract_license_type | string | Yes | Request Body | The type of license to check out. Must be one of business or enterprise. |
PUT /api/license/contract-type/ Responses
Status Code | Description |
---|---|
201 | License type updated. |
400 | The licensing method is not of Cloud Contract type, so setting the license type is not supported, or the specified license type is invalid. |
PUT /api/license/contract-type/ curl example
curl -X PUT "https://<your-datamasque-host>/api/license/contract-type/" \
-H "Authorization: Token <your-api-token>" \
-H "Content-Type: application/json" \
-d '{"contract_license_type": "business"}'
Health Check Object
Various health statistics about the DataMasque instance:
Field | Type | Description |
---|---|---|
worker_running | boolean | true if the masking agent worker processes are healthy, false if there are no available workers. |
license_expired | boolean | true if the licence is expired, false if the licence is not expired. |
license_renewal_in_days | integer | Remaining days until licence expiry. |
license_limit_breach | object | An object describing any licence breaches that have occurred. Each property on the object is the type of breach that has occurred. Each property value is an object containing breach_type, message, and created_date properties. |
GET /api/health-check/
Authorization: User token or API token.
Get the basic health-check status of DataMasque.
GET /api/health-check/ Parameters
No parameters.
GET /api/health-check/ Responses
Status Code | Description |
---|---|
200 | A JSON serialised Health Check Object. |
500 | A server error has occurred, for example because an invalid license file is present. The known error will be returned. |
GET /api/health-check/ curl example
curl "https://<your-datamasque-host>/api/health-check/" \
-H "Authorization: Token <your-api-token>"
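A monitoring script might turn a Health Check Object into a list of problems, as in the minimal Python sketch below. The 30-day renewal warning threshold and the row_limit breach type are assumptions chosen for illustration; neither is defined by the API.

```python
def health_problems(health, renewal_warning_days=30):
    """Return human-readable problems found in a Health Check Object."""
    problems = []
    if not health.get("worker_running", False):
        problems.append("no masking workers available")
    if health.get("license_expired", False):
        problems.append("licence expired")
    else:
        days = health.get("license_renewal_in_days")
        # Warn ahead of expiry; the threshold is a local choice, not an API value.
        if days is not None and days <= renewal_warning_days:
            problems.append(f"licence expires in {days} days")
    # Each property name of license_limit_breach is a breach type.
    for breach_type in health.get("license_limit_breach") or {}:
        problems.append(f"licence limit breach: {breach_type}")
    return problems


# Hypothetical sample values shaped like the Health Check Object.
sample_health = {
    "worker_running": True,
    "license_expired": False,
    "license_renewal_in_days": 12,
    "license_limit_breach": {},
}

print(health_problems(sample_health))
```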