State Machine Execution
This document explains how to use DataMasque's Automation feature to clone and mask AWS databases using state machines. You'll learn about required permissions, how to start executions, and how to access execution logs.
Overview
The Automation feature of DataMasque allows you to clone and then mask an AWS database using state machines. These state machines are created outside DataMasque by using the DataMasque AWS RDS Masking Step Function Blueprint.
This automation process:
- Creates a clone of your source database
- Applies masking rules to sensitive data
- Manages the entire workflow through AWS Step Functions
Note: This automation is currently available only for AWS. Support for Azure will be available in a future version of DataMasque.
Required Permissions
The DataMasque instance will need the following IAM permissions to list and execute state machines:
- List state machines:
states:ListStateMachines
- List state machine executions:
states:ListExecutions
- Start an execution of a state machine:
states:StartExecution
Below is an example of an IAM policy JSON document that grants these permissions:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"states:ListStateMachines",
"states:ListExecutions",
"states:StartExecution"
],
"Resource": "*"
}
]
}
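Before attaching a policy, it can help to confirm that it grants all three actions listed above. The following is a minimal sketch using only the Python standard library. The function name `missing_actions` is hypothetical, and the check is deliberately simplified: it looks only at `Allow` statements and `Action` patterns, and ignores `Deny` statements, `NotAction`, `Resource` scoping, and conditions, so it is not a substitute for IAM's own policy evaluation.

```python
import json
from fnmatch import fnmatchcase

# The three actions this document lists as required.
REQUIRED_ACTIONS = (
    "states:ListStateMachines",
    "states:ListExecutions",
    "states:StartExecution",
)


def missing_actions(policy_json: str) -> list[str]:
    """Return the required actions that no Allow statement in the policy grants.

    Simplified check (an assumption of this sketch): only Effect/Action are
    inspected; Deny, NotAction, Resource and Condition are ignored.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    allowed_patterns = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # a single action may be a bare string
            actions = [actions]
        allowed_patterns.extend(actions)

    # IAM action names are case-insensitive and may contain wildcards.
    return [
        action
        for action in REQUIRED_ACTIONS
        if not any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in allowed_patterns
        )
    ]
```

An empty list means every required action is matched by at least one `Allow` statement; any names returned are the actions still to be added.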
Start execution
On the Automation page:
Select a state machine from the Step Function dropdown, or type an existing state machine ARN if the required state machine is not listed.
Enter the Preferred Availability Zone (AZ). Selecting the correct AZ is important to reduce cross-AZ data transfer costs. Please refer to Availability zone selection. Example format: us-east-1a
Enter the Database Identifier, which is the ARN for your source database. This is the database that will be cloned and then masked.
Enter the Database Secret Identifier, which is the ARN for your database secrets in AWS Secrets Manager.
Select a masking ruleset from the dropdown box. Make sure the ruleset is compatible with your chosen database.
Click the Start button to start masking.
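Typos in ARNs or the AZ name are easy to make when filling in this form. The sketch below shows one way to sanity-check the values before entering them. It is not part of DataMasque: the function name `validate_inputs` and both regular expressions are assumptions of this example, based on the general AWS ARN layout (`arn:partition:service:region:account-id:resource`) and the standard AZ naming form shown above. It checks format only and does not confirm that the resources exist.

```python
import re

# Standard AZ names look like us-east-1a (region name plus a letter).
# Assumption: Local Zone and Wavelength Zone names are not handled.
AZ_PATTERN = re.compile(r"^[a-z]{2}(?:-[a-z]+)+-\d+[a-z]$")

# General ARN layout: arn:partition:service:region:account-id:resource
ARN_PATTERN = re.compile(
    r"^arn:(aws|aws-cn|aws-us-gov):([a-z0-9-]+):([a-z0-9-]*):(\d{12}):(.+)$"
)


def validate_inputs(state_machine_arn, availability_zone, database_arn, secret_arn):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []

    if not AZ_PATTERN.match(availability_zone):
        problems.append(
            f"Preferred Availability Zone {availability_zone!r} does not look like 'us-east-1a'"
        )

    # Each field should hold an ARN for the matching AWS service.
    expected_services = (
        ("Step Function", state_machine_arn, "states"),
        ("Database Identifier", database_arn, "rds"),
        ("Database Secret Identifier", secret_arn, "secretsmanager"),
    )
    for label, arn, service in expected_services:
        match = ARN_PATTERN.match(arn)
        if match is None:
            problems.append(f"{label}: {arn!r} is not a well-formed ARN")
        elif match.group(2) != service:
            problems.append(
                f"{label}: expected a '{service}' ARN, got '{match.group(2)}'"
            )

    return problems
```

For example, passing `us-east-1` (a region, not an AZ) as the availability zone returns a single problem describing the expected format.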
Availability zone selection
It is important to consider the availability zone (AZ) for the cloned database. When the availability zone is not set, the cloned database will be created in the same AZ as DataMasque. Which AZ that is depends on how DataMasque is deployed.
For Docker deployments on an EC2 instance, the entire DataMasque deployment runs on the same instance, so the AZ is easily determined.
For EKS deployments, the admin-server pod and the masque-agent pods may be on different EC2 instances in different AZs. The admin-server pod will initiate the state machine execution, so the cloned database will be in the same AZ as the EC2 instance running this pod. If the masque-agent, which performs the masking, is on a different EC2 instance in a different AZ, then transfer charges may be incurred during masking.
Furthermore, due to AWS capacity limits, it may not be possible to create a new RDS instance in the default AZ.
To avoid these potential issues, explicitly specifying your preferred AZ is recommended.
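To find out which AZ an EC2 instance is in, one option is to query the EC2 instance metadata service (IMDSv2) from that instance. The sketch below uses only the Python standard library. The function name `current_availability_zone` and the injectable `opener` parameter are assumptions of this example. It is intended to run on the EC2 host itself; from inside an EKS pod, access to the metadata service may be restricted by the cluster's configuration.

```python
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"


def current_availability_zone(timeout=2.0, opener=urllib.request.urlopen):
    """Return the AZ of the EC2 instance this code runs on, using IMDSv2.

    `opener` is injectable only so the function can be exercised without
    network access; by default it is urllib.request.urlopen.
    """
    # Step 1: request a short-lived session token (required by IMDSv2).
    token_request = urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with opener(token_request, timeout=timeout) as response:
        token = response.read().decode()

    # Step 2: read the placement AZ, presenting the token.
    az_request = urllib.request.Request(
        f"{IMDS_BASE}/meta-data/placement/availability-zone",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with opener(az_request, timeout=timeout) as response:
        return response.read().decode().strip()
```

The returned value has the same form as the Preferred Availability Zone field expects. For EKS deployments, the relevant host is the one running the masque-agent pods, since that is where masking traffic originates.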
Execution logs
To access the execution logs for a state machine, on the Automation page:
- Select a state machine by either:
- Choosing its ARN from the Step Function dropdown menu, or
- Manually entering the ARN.
- Scroll down to the Execution Logs section to view the logs associated with the selected state machine.
Important: If DataMasque's masking fails, the execution's status will still show as successful. Therefore, be sure to check the run logs for masking runs to ensure they have completed successfully.
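The same execution statuses can also be read outside DataMasque through the Step Functions `ListExecutions` API. The sketch below assumes a boto3 Step Functions client is passed in; the function name `summarize_executions` and the `check_masking_run_log` field are hypothetical. It reflects the caveat above: a `SUCCEEDED` status says nothing about the masking result, so those executions are flagged for a run log check.

```python
def summarize_executions(sfn_client, state_machine_arn, limit=10):
    """List recent executions and flag those whose masking run log needs checking.

    `sfn_client` is expected to behave like boto3.client("stepfunctions").
    """
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn,
        maxResults=limit,
    )

    rows = []
    for execution in response.get("executions", []):
        status = execution["status"]
        rows.append(
            {
                "name": execution["name"],
                "status": status,
                # The execution can succeed even if masking failed, so a
                # SUCCEEDED status still requires checking the run log.
                "check_masking_run_log": status == "SUCCEEDED",
            }
        )
    return rows


# Example usage (requires boto3 and AWS credentials with states:ListExecutions):
#   import boto3
#   rows = summarize_executions(boto3.client("stepfunctions"), "<state machine ARN>")
```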
Troubleshooting
An execution may fail at one of two points:
- During the clone of the RDS instance.
- During masking, after the clone has completed.
If the clone fails, check the state machine's execution in the AWS Step Functions console, which will show any errors that have occurred.
If the clone was successful, but masking failed, check the run logs in your DataMasque dashboard.
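For clone failures, the error details shown in the AWS console can also be retrieved with the Step Functions `GetExecutionHistory` API. The sketch below is one possible approach, not something DataMasque provides: the function name `failure_details` is hypothetical, and it assumes a boto3 Step Functions client whose credentials include `states:GetExecutionHistory`, a permission that is not among those DataMasque itself requires.

```python
def failure_details(sfn_client, execution_arn, max_events=100):
    """Collect error and cause from failure events of one execution.

    `sfn_client` is expected to behave like boto3.client("stepfunctions").
    Only the most recent `max_events` events are inspected (no pagination).
    """
    response = sfn_client.get_execution_history(
        executionArn=execution_arn,
        reverseOrder=True,  # newest events first, where failures usually are
        maxResults=max_events,
    )

    failures = []
    for event in response.get("events", []):
        if not event["type"].endswith(("Failed", "TimedOut", "Aborted")):
            continue
        # Each event carries its details under a key such as
        # "executionFailedEventDetails" or "taskFailedEventDetails".
        details = next(
            (
                value
                for key, value in event.items()
                if key.endswith("EventDetails") and isinstance(value, dict)
            ),
            {},
        )
        failures.append(
            {
                "type": event["type"],
                "error": details.get("error"),
                "cause": details.get("cause"),
            }
        )
    return failures
```

An empty result for an execution that completed suggests the clone succeeded, in which case the DataMasque run logs are the place to look for masking errors.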