Manage data quality rules

Experimental feature

Data quality rules are currently an experimental feature. The functionality may change in future releases.

Create data quality rules

Data quality rules can be created using three different creator methods depending on the type of rule you want to create:

  1. Column level rules (column_level_rule_creator): For rules that apply to specific columns (e.g., Freshness, Null Count)
  2. Table level rules (table_level_rule_creator): For rules that apply to entire tables (e.g., Row Count)
  3. Custom SQL rules (custom_sql_creator): For rules defined by a custom SQL query

Column level rules

7.1.5

Column level rules are used for data quality checks that apply to specific columns within an asset.

Create a column level data quality rule
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import alpha_DQRule, Column, Table
from pyatlan.model.enums import (
    alpha_DQRuleAlertPriority,
    alpha_DQRuleThresholdCompareOperator,
    alpha_DQRuleThresholdUnit,
)

client = AtlanClient()

# Create a Freshness rule for a specific column
dq_rule = alpha_DQRule.column_level_rule_creator( # (1)
    client=client, # (2)
    rule_type="Freshness", # (3)
    asset=Table.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq/weather/monitoring"), # (4)
    column=Column.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq/weather/monitoring/evaluated_at"), # (5)
    threshold_value=1, # (6)
    alert_priority=alpha_DQRuleAlertPriority.URGENT, # (7)
    threshold_unit=alpha_DQRuleThresholdUnit.DAYS # (8)
)

response = client.asset.save(dq_rule) # (9)

# Create a Null Count rule for a specific column
dq_rule_null = alpha_DQRule.column_level_rule_creator(
    client=client,
    rule_type="Null Count",
    asset=Table.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary"),
    column=Column.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary/catalog"),
    threshold_compare_operator=alpha_DQRuleThresholdCompareOperator.LESS_THAN_EQUAL, # (10)
    threshold_value=5,
    alert_priority=alpha_DQRuleAlertPriority.HIGH
)

response = client.asset.save(dq_rule_null)
  1. Use the column_level_rule_creator method to create column-level data quality rules.
  2. Provide the Atlan client instance.
  3. Specify the rule type (e.g., "Freshness", "Null Count"). The rule type must match exactly what is shown in the UI.
  4. Reference the asset to which the rule applies, using its qualified name.
  5. Reference the column to which the rule applies, using its qualified name. The column must belong to the asset above, and its data type must be compatible with the rule type (e.g., date/time columns for Freshness rules).
  6. Set the threshold value for the rule (same as you would in the UI).
  7. Set the alert priority level (same as you would in the UI).
  8. Optional: Specify the threshold unit (e.g., DAYS, HOURS) for rules that support units (such as Freshness). For rules without units (such as Null Count), omit this parameter.
  9. Save the data quality rule to Atlan.
  10. Optional: Specify the threshold compare operator (same as you would in the UI).
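
Because the column must belong to the referenced asset, a quick check on the two qualified names can catch a mismatch before the rule is saved. The following is a minimal sketch, assuming a column's qualified name is its parent table's qualified name followed by `/<column_name>`, as in the Freshness example above. `column_belongs_to` is a hypothetical helper and not part of pyatlan.

```python
def column_belongs_to(table_qualified_name: str, column_qualified_name: str) -> bool:
    """Return True if the column's qualified name sits directly under the table's.

    Assumption: column qualified names have the form "<table_qualified_name>/<column>".
    """
    prefix = table_qualified_name.rstrip("/") + "/"
    if not column_qualified_name.startswith(prefix):
        return False
    column_name = column_qualified_name[len(prefix):]
    # Exactly one extra path segment is expected: the column name itself.
    return bool(column_name) and "/" not in column_name


table_qn = "default/databricks/1750768309/dq/weather/monitoring"
column_qn = "default/databricks/1750768309/dq/weather/monitoring/evaluated_at"
print(column_belongs_to(table_qn, column_qn))
```

Running such a check before calling column_level_rule_creator is optional; it only compares strings and does not confirm that either asset exists in Atlan.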

Table level rules

7.1.5

Table level rules are used for data quality checks that apply to an entire table.

Create a table level data quality rule
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import alpha_DQRule, Table
from pyatlan.model.enums import (
    alpha_DQRuleAlertPriority,
    alpha_DQRuleThresholdCompareOperator
)

client = AtlanClient()

# Create a Row Count rule for a table
dq_rule = alpha_DQRule.table_level_rule_creator( # (1)
    client=client, # (2)
    rule_type="Row Count", # (3)
    asset=Table.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary"), # (4)
    threshold_compare_operator=alpha_DQRuleThresholdCompareOperator.EQUAL, # (5)
    threshold_value=15, # (6)
    alert_priority=alpha_DQRuleAlertPriority.URGENT # (7)
)

response = client.asset.save(dq_rule) # (8)
  1. Use the table_level_rule_creator method to create table-level data quality rules.
  2. Provide the Atlan client instance.
  3. Specify the rule type (e.g., "Row Count"). The rule type must match exactly what is shown in the UI.
  4. Reference the asset to which the rule applies, using its qualified name.
  5. Set the threshold comparison operator (e.g., EQUAL, LESS_THAN_EQUAL).
  6. Set the threshold value for the rule (same as you would in the UI).
  7. Set the alert priority level (same as you would in the UI).
  8. Save the data quality rule to Atlan.

Custom SQL rules

7.1.5

Custom SQL rules allow you to define data quality checks using custom SQL queries.

Create a Custom SQL data quality rule
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import alpha_DQRule, Table
from pyatlan.model.enums import (
    alpha_DQRuleAlertPriority,
    alpha_DQRuleThresholdCompareOperator,
    alpha_DQDimension
)

client = AtlanClient()

# Create a Custom SQL rule
dq_rule = alpha_DQRule.custom_sql_creator( # (1)
    client=client, # (2)
    rule_name="Test SQL Rule", # (3)
    asset=Table.ref_by_qualified_name(qualified_name="default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary"), # (4)
    custom_sql="SELECT count(*) FROM `dq_poc`.`accuweather`.`_quality_monitoring_summary`", # (5)
    threshold_compare_operator=alpha_DQRuleThresholdCompareOperator.LESS_THAN_EQUAL, # (6)
    threshold_value=10, # (7)
    alert_priority=alpha_DQRuleAlertPriority.URGENT, # (8)
    dimension=alpha_DQDimension.COMPLETENESS, # (9)
    description="Custom SQL rule for completeness check" # (10)
)

response = client.asset.save(dq_rule) # (11)
  1. Use the custom_sql_creator method to create custom SQL data quality rules.
  2. Provide the Atlan client instance.
  3. Provide a name for the custom rule (same as you would in the UI).
  4. Reference the asset to which the rule applies, using its qualified name.
  5. Provide the custom SQL query for the rule (same as you would in the UI).
  6. Set the threshold comparison operator (same as you would in the UI).
  7. Set the threshold value for the rule (same as you would in the UI).
  8. Set the alert priority level (same as you would in the UI).
  9. Set the data quality dimension (e.g., COMPLETENESS, ACCURACY) (same as you would in the UI).
  10. Optional: Provide a description for the rule.
  11. Save the data quality rule to Atlan.

Update data quality rules

7.1.5

To update an existing data quality rule, provide its qualified name and the Atlan client. All other parameters are optional; only the ones you provide are updated.

Update a data quality rule
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import alpha_DQRule
from pyatlan.model.enums import (
    alpha_DQRuleAlertPriority,
    alpha_DQRuleThresholdCompareOperator,
    alpha_DQDimension,
    alpha_DQRuleThresholdUnit
)

client = AtlanClient()

# Update specific fields of an existing data quality rule
updated_rule = alpha_DQRule.updater( # (1)
    client=client, # (2)
    qualified_name="default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary/rule/40e01c39-dcb8-4348-9259-041f353a8348", # (3)
    threshold_compare_operator=alpha_DQRuleThresholdCompareOperator.LESS_THAN_EQUAL, # (4)
    threshold_value=20, # (5)
    alert_priority=alpha_DQRuleAlertPriority.HIGH, # (6)
    threshold_unit=alpha_DQRuleThresholdUnit.DAYS, # (7)
    dimension=alpha_DQDimension.COMPLETENESS, # (8)
    custom_sql="SELECT count(*) FROM updated_table", # (9)
    rule_name="Updated Rule Name", # (10)
    description="Updated description for the rule" # (11)
)

response = client.asset.save(updated_rule) # (12)
  1. Use the updater method to update an existing data quality rule.
  2. Provide the Atlan client instance.
  3. Provide the qualified name of the existing rule.
  4. Optional: Update the threshold comparison operator.
  5. Optional: Update the threshold value for the rule.
  6. Optional: Update the alert priority level.
  7. Optional: Update the threshold unit.
  8. Optional: Update the data quality dimension (for custom SQL rules).
  9. Optional: Update the custom SQL query (for custom SQL rules).
  10. Optional: Update the name of the rule (for custom SQL rules).
  11. Optional: Update the description of the rule (for custom SQL rules).
  12. Save the updated data quality rule to Atlan.

UI compatibility required

When updating data quality rules, only update parameters that are applicable to your specific rule type as shown in the UI. Updating parameters that don't apply to your rule type may cause the operation to fail or produce unexpected results.
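
One way to guard against this is to check the planned update against the rule type before calling the updater. The sketch below is illustrative only: `inapplicable_fields` is a hypothetical helper, not part of pyatlan, and its two field groups are assumptions taken from the annotations above (dimension, custom SQL, name, and description apply to custom SQL rules; the threshold unit applies to rules that support units, such as Freshness). Adjust the groups to what the UI shows for your rule type.

```python
from typing import Any, List, Mapping

# Fields the annotations above mark as applying to custom SQL rules only.
CUSTOM_SQL_ONLY = frozenset({"dimension", "custom_sql", "rule_name", "description"})
# Fields that only apply to rules with a unit (such as Freshness).
UNIT_ONLY = frozenset({"threshold_unit"})


def inapplicable_fields(
    fields: Mapping[str, Any], *, is_custom_sql: bool, supports_units: bool
) -> List[str]:
    """Return the provided field names that do not apply to the rule type."""
    # Fields left as None are treated as "not provided".
    provided = {name for name, value in fields.items() if value is not None}
    rejected = set()
    if not is_custom_sql:
        rejected |= provided & CUSTOM_SQL_ONLY
    if not supports_units:
        rejected |= provided & UNIT_ONLY
    return sorted(rejected)


# Planned update for a Row Count rule (hypothetical values).
planned = {"threshold_value": 20, "custom_sql": "SELECT count(*) FROM updated_table"}
print(inapplicable_fields(planned, is_custom_sql=False, supports_units=False))
# prints ['custom_sql']
```

If the returned list is empty, the same dictionary can be unpacked into the updater call as keyword arguments.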

Retrieve data quality rules

7.1.5

To retrieve data quality rules, use fluent search.

Retrieve data quality rules
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import Asset, alpha_DQRule
from pyatlan.model.fluent_search import FluentSearch

client = AtlanClient()

# Example 1: Retrieve all data quality rules on a connection
search_request = ( # (1)
    FluentSearch()
    .select(include_archived=False)
    # Both conditions must hold, so use where() rather than where_some().
    # Rule qualified names begin with the connection's qualified name.
    .where(alpha_DQRule.QUALIFIED_NAME.startswith("default/databricks/1750768309/"))
    .where(Asset.TYPE_NAME.eq("alpha_DQRule"))
    .include_on_results(alpha_DQRule.GUID)
    .include_on_results(alpha_DQRule.QUALIFIED_NAME)
).to_request()

results = client.asset.search(search_request) # (2)
for result in results: # (3)
    print(f"Rule GUID: {result.guid}")
    print(f"Rule Qualified Name: {result.qualified_name}")

# Example 2: Retrieve all information of a specific data quality rule using its qualified name
search_request = (
    FluentSearch()
    .where(alpha_DQRule.QUALIFIED_NAME.eq("default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary/rule/a481d03a-7fb9-48c1-a752-3aad4f6a98c1"))
    .include_on_results(alpha_DQRule.GUID)
    .include_on_results(alpha_DQRule.QUALIFIED_NAME)
    .include_on_results(alpha_DQRule.ALPHADQ_RULE_BASE_COLUMN_QUALIFIED_NAME)
    .include_on_results(alpha_DQRule.ALPHADQ_RULE_ALERT_PRIORITY)
    .include_on_results(alpha_DQRule.ALPHADQ_RULE_DIMENSION)
).to_request()

results = client.asset.search(search_request)
if rules := results.current_page():  # Guard against an empty result
    rule = rules[0]
    print(f"GUID: {rule.guid}")
    print(f"Qualified Name: {rule.qualified_name}")
    print(f"Column Qualified Name: {rule.alpha_dq_rule_base_column_qualified_name}")
    print(f"Alert Priority: {rule.alpha_dq_rule_alert_priority}")
    print(f"Dimension: {rule.alpha_dq_rule_dimension}")
  1. Create a Fluent Search request to retrieve data quality rules from a specific connection.
  2. Execute the search request to retrieve the data quality rules.
  3. Iterate through all matching data quality rules and print their details.
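
When working with the results, it can help to split a rule's qualified name into the asset it is attached to and the rule's own identifier, for example to group rules by table. This is a minimal sketch, assuming rule qualified names follow the `<asset_qualified_name>/rule/<id>` pattern seen in the examples on this page. `split_rule_qualified_name` and `RuleName` are hypothetical names and not part of pyatlan.

```python
from typing import NamedTuple, Optional


class RuleName(NamedTuple):
    asset_qualified_name: str
    rule_id: str


def split_rule_qualified_name(qualified_name: str) -> Optional[RuleName]:
    """Split "<asset_qualified_name>/rule/<id>" into its two parts.

    Returns None when the qualified name does not follow that pattern.
    """
    asset, separator, rule_id = qualified_name.rpartition("/rule/")
    if not separator or not asset or not rule_id or "/" in rule_id:
        return None
    return RuleName(asset_qualified_name=asset, rule_id=rule_id)


parts = split_rule_qualified_name(
    "default/databricks/1750768309/dq_poc/accuweather/"
    "_quality_monitoring_summary/rule/a481d03a-7fb9-48c1-a752-3aad4f6a98c1"
)
if parts is not None:
    print(parts.asset_qualified_name)
    print(parts.rule_id)
```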

Delete data quality rules

7.1.5

To delete data quality rules, use the standard asset deletion method.

Soft-delete data quality rules
from pyatlan.client.atlan import AtlanClient
from pyatlan.model.assets import alpha_DQRule

client = AtlanClient()
response = client.asset.delete_by_guid(guid="b4113341-251b-4adc-81fb-2420501c30e6") # (1)
if deleted := response.assets_deleted(asset_type=alpha_DQRule): # (2)
    rule = deleted[0] # (3)
  1. Use the asset.delete_by_guid() method to delete a data quality rule. Provide the GUID of the rule you want to delete.
  2. The assets_deleted(asset_type=alpha_DQRule) method returns a list of the assets of the given type that were deleted.
  3. If an asset of the given type was deleted, then the deleted form of the asset is available.
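
Note that rules are updated by qualified name but deleted by GUID, so the two identifiers are easy to mix up. A small pre-check can reject a qualified name before it is passed to asset.delete_by_guid(). This is a sketch only, assuming GUIDs are canonical hyphenated UUID strings like the one in the example above. `is_guid` is a hypothetical helper and not part of pyatlan.

```python
import uuid


def is_guid(value: str) -> bool:
    """Return True if value is a canonical hyphenated UUID string."""
    try:
        # uuid.UUID accepts several spellings; comparing against the
        # canonical form rejects everything except "8-4-4-4-12" hex.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


print(is_guid("b4113341-251b-4adc-81fb-2420501c30e6"))
print(is_guid("default/databricks/1750768309/dq_poc/accuweather/_quality_monitoring_summary"))
```

The check validates the format only; it does not confirm that an asset with that GUID exists.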
