Data contracts specification
Backwards compatibility
While we are in a closed preview state, we are not guaranteeing backwards compatibility. Version 0.0.2 is not backwards compatible with 0.0.1.
The following is the template for a data contract, where the highlighted lines are mandatory:
---
kind: DataContract # (1)
status: draft # (2)
template_version: 0.0.2 # (3)
dataset: sale_txn # (4)
type: Table # (5)
description: This is the ... # (6)
datasource: snowflake # (7)
owners: # (8)
  users:
    - jdoe
    - jsmith
  groups:
    - data_producers_group
certification: # (9)
  status: VERIFIED # (10)
  message: Verified by data producers
announcement: # (11)
  type: Informational # (12)
  title: Informational announcement
  description: Explanation of the ...
terms: # (13)
  - Sales
  - Transactions
tags: # (14)
  - name: PII
    propagate: false
    restrict_propagation_through_lineage: true
    restrict_propagation_through_hierarchy: false
  - name: GDPR
    propagate: false
    restrict_propagation_through_lineage: true
    restrict_propagation_through_hierarchy: false
custom_metadata: # (15)
  Data Quality:
    Completeness Score: 100
    Failed Checks:
      - 884438be-82cc-4e04-bfe1-fba59276df38
      - afa0e560-a916-4862-a2f2-c491f19f39f5
columns:
  - name: txn_credit_card_number # (16)
    business_name: credit card number # (17)
    description: some description... # (18)
    data_type: NUMBER # (19)
    terms: # (20)
      - ARR
    tags: # (21)
      - name: PII
        propagate: false
        restrict_propagation_through_lineage: false
    primary: false # (22)
    required: true # (23)
    scale: 0 # (24)
    precision: 16 # (25)
  - name: txn_ref_dt
    business_name: transaction date
    description: transaction date description...
    data_type: DATE
    terms: []
    tags: []
    primary: false
    required: true
checks: # (26)
  - missing_count(txn_ref_dt) = 0
  - missing_count(txn_ref_dt) = 100
  - current_time - date(record_date) < 5
...
1. Must always be DataContract.
2. State of the contract:
   - draft: contract is still being defined (work in progress)
   - verified: contract is published and ready to be used
3. Version of the template for the data contract.
4. Name of the asset as it exists inside Atlan.
5. (Optional) Type of the dataset in Atlan:
   - Table: a database table
   - View: a database view
   - MaterialisedView: a materialized view in a database
6. (Optional) Description of this dataset, which can be synced to the asset being governed.
7. Name that must match a data source defined in your config file.
8. (Optional) Owners of the dataset, which can include users (by username) and / or groups (by internal Atlan alias), and can be synced to the asset being governed.
9. (Optional) Certification to apply to the dataset, which can be synced to the asset being governed.
10. Valid values:
    - DRAFT: dataset is still being defined (work in progress)
    - VERIFIED: dataset is trusted and ready to be used
    - DEPRECATED: dataset should no longer be trusted or used
11. (Optional) Announcement to apply to the dataset, which can be synced to the asset being governed.
12. Valid values:
    - information: something should be noted about the dataset (appears blue in the UI)
    - warning: something is problematic with the dataset (appears yellow in the UI)
    - issue: something is wrong with the dataset (appears red in the UI)
13. (Optional) Glossary terms to assign to the dataset, which can be synced to the asset being governed.
14. (Optional) List of the names of tags for this dataset, which can be synced to the asset being governed. For each tag you can optionally also specify:
    - propagate: whether the tag should propagate to other assets
    - restrict_propagation_through_lineage: if propagation is enabled, whether to prevent the tag from propagating through lineage
    - restrict_propagation_through_hierarchy: if propagation is enabled, whether to prevent the tag from propagating to child assets
15. (Optional) Dictionary of custom metadata for this dataset, which can be synced to the asset being governed. Specify the name of the custom metadata and its attributes using their human-readable names. Multi-valued attributes should have their values provided as a list.
16. Name of the column as it is defined in the source system (often technical).
17. (Optional) Alias for the column, to make its name more readable.
18. (Optional) Description of this column, for documentation purposes.
19. (Optional) Data type of values in this column (e.g. DATE, NUMBER, STRING, etc.).
20. (Optional) Glossary terms to assign to this column.
21. (Optional) List of the names of tags for this column. For each tag you can optionally also specify:
    - propagate: whether the tag should propagate to other assets
    - restrict_propagation_through_lineage: if propagation is enabled, whether to prevent the tag from propagating through lineage
    - restrict_propagation_through_hierarchy: if propagation is enabled, whether to prevent the tag from propagating to child assets
22. (Optional) When true, this column is the primary key for the table.
23. (Optional) When true, the values in this column can't be null.
24. (Optional) Number of digits allowed to the right of the decimal point, when the data_type is numeric.
25. (Optional) Total number of digits allowed, when the data_type is numeric.
26. (Optional) List of checks to run to verify data quality of this dataset.
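
The constraints in the annotations can be checked mechanically before a contract is submitted. The following is a minimal sketch in plain Python, not part of any Atlan tooling: `validate_contract` and the constant names are hypothetical, the contract is assumed to have already been loaded from YAML into a dictionary, and the set of required fields is an assumption based on which top-level annotations are not marked (Optional). The rule that `scale` may not exceed `precision`, and the rule that a `certification` block must carry a valid `status`, are inferred from the field descriptions rather than stated by the specification.

```python
from __future__ import annotations

# Hypothetical helper names; none of these come from an Atlan library.
CONTRACT_STATUSES = {"draft", "verified"}
DATASET_TYPES = {"Table", "View", "MaterialisedView"}
CERTIFICATION_STATUSES = {"DRAFT", "VERIFIED", "DEPRECATED"}

# Assumption: top-level fields whose annotations are not marked "(Optional)".
REQUIRED_FIELDS = ("kind", "status", "template_version", "dataset", "datasource")


def validate_contract(contract: dict) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    for field in REQUIRED_FIELDS:
        if field not in contract:
            problems.append(f"missing required field: {field}")

    if "kind" in contract and contract["kind"] != "DataContract":
        problems.append("kind must be DataContract")
    if "status" in contract and contract["status"] not in CONTRACT_STATUSES:
        problems.append(f"invalid status: {contract['status']}")
    if "type" in contract and contract["type"] not in DATASET_TYPES:
        problems.append(f"invalid type: {contract['type']}")

    # Assumption: when a certification block is present, it needs a valid status.
    certification = contract.get("certification")
    if certification is not None:
        if certification.get("status") not in CERTIFICATION_STATUSES:
            problems.append(
                f"invalid certification status: {certification.get('status')}"
            )

    for column in contract.get("columns", []):
        name = column.get("name")
        if not name:
            problems.append("column is missing its name")
            continue
        scale = column.get("scale")
        precision = column.get("precision")
        # Inferred rule: digits after the decimal point cannot exceed total digits.
        if scale is not None and precision is not None and scale > precision:
            problems.append(f"{name}: scale {scale} exceeds precision {precision}")

    return problems


# A trimmed-down version of the template above, expressed as a dictionary.
contract = {
    "kind": "DataContract",
    "status": "draft",
    "template_version": "0.0.2",
    "dataset": "sale_txn",
    "type": "Table",
    "datasource": "snowflake",
    "certification": {"status": "VERIFIED", "message": "Verified by data producers"},
    "columns": [
        {"name": "txn_credit_card_number", "data_type": "NUMBER", "scale": 0, "precision": 16},
        {"name": "txn_ref_dt", "data_type": "DATE"},
    ],
}

problems = validate_contract(contract)
```

Returning a list of problems rather than raising on the first one lets a contract author see every issue in a single pass. The sketch deliberately does not validate announcement types, tags, or checks.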