SparkJob¶
Instance of a Spark Job run in Atlan.
Complete reference
This is a complete reference for the SparkJob object in Atlan, showing every possible property and relationship that can exist for these objects. For an introduction, you probably want to start with:
SparkJob inherits its attributes and relationships from these other types:
classDiagram
direction RL
class SparkJob
link SparkJob "../sparkjob"
class Spark {
<<abstract>>
}
link Spark "../spark"
Spark <|-- SparkJob : extends
class Catalog {
<<abstract>>
}
link Catalog "../catalog"
Catalog <|-- Spark : extends
class Asset {
<<abstract>>
}
link Asset "../asset"
Asset <|-- Catalog : extends
class Referenceable {
<<abstract>>
}
link Referenceable "../referenceable"
Referenceable <|-- Asset : extends
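The inheritance chain in the diagram above can be mirrored with plain Python classes. This is only an illustrative sketch: the class bodies are empty and the `supertypes` helper is hypothetical, not part of any Atlan SDK.

```python
from abc import ABC
from typing import List


class Referenceable(ABC):
    """Root type in the diagram (abstract)."""


class Asset(Referenceable):
    """Abstract in the diagram; extends Referenceable."""


class Catalog(Asset):
    """Abstract in the diagram; extends Asset."""


class Spark(Catalog):
    """Abstract in the diagram; extends Catalog."""


class SparkJob(Spark):
    """Concrete type documented on this page; extends Spark."""


def supertypes(cls: type) -> List[str]:
    """Hypothetical helper: list supertype names, nearest first."""
    return [c.__name__ for c in cls.__mro__[1:] if c not in (ABC, object)]


print(supertypes(SparkJob))
```

Walking the method resolution order lists the supertypes from nearest to furthest, the same order in which the diagram chains the "extends" arrows.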
Properties¶
Inherited properties
These attributes are inherited from SparkJob's supertypes (shown above):
typeName ¶
Type of this asset.
guid ¶
Globally-unique identifier for this asset.
classifications ¶
Tags assigned to the asset.
Uses a different name in SDKs: atlanTags, atlan_tags.
For more information, see the tag assets snippets.
businessAttributes ¶
Map of custom metadata attributes and values defined on the asset.
Uses a different name in SDKs: customMetadataSets, custom_metadata.
For more information, see the change custom metadata snippets.
status ¶
Status of the asset.
createdBy ¶
User or account that created the asset.
updatedBy ¶
User or account that last updated the asset.
createTime ¶
Time (epoch) at which the asset was created, in milliseconds.
updateTime ¶
Time (epoch) at which the asset was last updated, in milliseconds.
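Timestamps such as createTime and updateTime are epoch values in milliseconds, so they need scaling before they can be read as dates. A minimal sketch of the conversion follows; the helper name is hypothetical and the sample value is simply the one used elsewhere on this page.

```python
from datetime import datetime, timedelta, timezone

# The Unix epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_datetime(millis: int) -> datetime:
    """Convert an epoch timestamp in milliseconds to a UTC datetime."""
    # Adding a timedelta avoids floating-point rounding of the milliseconds.
    return EPOCH + timedelta(milliseconds=millis)


created = epoch_millis_to_datetime(1695673598218)
print(created.isoformat())
```

Using a timezone-aware result keeps comparisons between timestamps unambiguous.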
deleteHandler ¶
Details on the handler used for deletion of the asset.
Treat as read-only: you should not try to set deleteHandler on an asset. Instead, see the asset CRUD snippets on deleting assets.
classificationNames ¶
Hashed-string names of the Atlan tags that exist on the asset.
Uses a different name in SDKs: atlanTagNames, atlan_tag_names.
Use classifications to make changes to tags.
isIncomplete ¶
Unused.
meaningNames ¶
Human-readable names of terms that have been linked to this asset.
meanings ¶
Details of terms that have been linked to this asset.
Do not use: these should not be used, as they can be inconsistent. Instead, see the link terms and assets snippets.
pendingTasks ¶
Unique identifiers (GUIDs) for any background tasks that are yet to operate on this asset.
adminGroups ¶
List of groups who administer this asset. (This is only used for certain asset types.)
adminRoles ¶
List of roles who administer this asset. (This is only used for Connection assets.)
adminUsers ¶
List of users who administer this asset. (This is only used for certain asset types.)
announcementMessage ¶
Detailed message to include in the announcement on this asset.
announcementTitle ¶
Brief title for the announcement on this asset. Required when announcementType is specified.
announcementType ¶
Type of announcement on this asset.
announcementUpdatedAt ¶
Time (epoch) at which the announcement was last updated, in milliseconds.
announcementUpdatedBy ¶
Name of the user who last updated the announcement.
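The rule that announcementTitle is required whenever announcementType is specified can be checked before sending an update. The sketch below works on a plain dictionary of attributes; the function name and the sample type value "warning" are illustrative assumptions, not taken from this reference.

```python
from typing import Any, Dict, List


def validate_announcement(attrs: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the announcement attributes (empty if none)."""
    problems = []
    has_type = bool(attrs.get("announcementType"))
    has_title = bool(attrs.get("announcementTitle"))
    if has_type and not has_title:
        problems.append(
            "announcementTitle is required when announcementType is specified"
        )
    return problems


# "warning" is a placeholder value used only for illustration.
print(validate_announcement({"announcementType": "warning"}))
print(validate_announcement({"announcementType": "warning", "announcementTitle": "Delayed"}))
```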
assetCoverImage ¶
TBC
assetDbtAccountName ¶
Name of the account in which this asset exists in dbt.
assetDbtAlias ¶
Alias of this asset in dbt.
assetDbtEnvironmentDbtVersion ¶
Version of the environment in which this asset is materialized in dbt.
assetDbtEnvironmentName ¶
Name of the environment in which this asset is materialized in dbt.
assetDbtJobLastRun ¶
Time (epoch) at which the job that materialized this asset in dbt last ran, in milliseconds.
assetDbtJobLastRunArtifactS3Path ¶
Path in S3 to the artifacts saved from the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunArtifactsSaved ¶
Whether artifacts were saved from the last run of the job that materialized this asset in dbt (true) or not (false).
assetDbtJobLastRunCreatedAt ¶
Time (epoch) at which the job that materialized this asset in dbt was last created, in milliseconds.
assetDbtJobLastRunDequedAt ¶
Time (epoch) at which the job that materialized this asset in dbt was dequeued, in milliseconds.
assetDbtJobLastRunExecutedByThreadId ¶
Thread ID of the user who executed the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunGitBranch ¶
Branch in git from which the last run of the job that materialized this asset in dbt ran.
assetDbtJobLastRunGitSha ¶
SHA hash in git for the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunHasDocsGenerated ¶
Whether docs were generated from the last run of the job that materialized this asset in dbt (true) or not (false).
assetDbtJobLastRunHasSourcesGenerated ¶
Whether sources were generated from the last run of the job that materialized this asset in dbt (true) or not (false).
assetDbtJobLastRunNotificationsSent ¶
Whether notifications were sent from the last run of the job that materialized this asset in dbt (true) or not (false).
assetDbtJobLastRunOwnerThreadId ¶
Thread ID of the owner of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunQueuedDuration ¶
Total duration the job that materialized this asset in dbt spent being queued.
assetDbtJobLastRunQueuedDurationHumanized ¶
Human-readable total duration that the last run of the job that materialized this asset in dbt spent being queued.
assetDbtJobLastRunRunDuration ¶
Run duration of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunRunDurationHumanized ¶
Human-readable run duration of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunStartedAt ¶
Time (epoch) at which the job that materialized this asset in dbt started running, in milliseconds.
assetDbtJobLastRunStatusMessage ¶
Status message of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunTotalDuration ¶
Total duration of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunTotalDurationHumanized ¶
Human-readable total duration of the last run of the job that materialized this asset in dbt.
assetDbtJobLastRunUpdatedAt ¶
Time (epoch) at which the job that materialized this asset in dbt was last updated, in milliseconds.
assetDbtJobLastRunUrl ¶
URL of the last run of the job that materialized this asset in dbt.
assetDbtJobName ¶
Name of the job that materialized this asset in dbt.
assetDbtJobNextRun ¶
Time (epoch) when the next run of the job that materializes this asset in dbt is scheduled.
assetDbtJobNextRunHumanized ¶
Human-readable time when the next run of the job that materializes this asset in dbt is scheduled.
assetDbtJobSchedule ¶
Schedule of the job that materialized this asset in dbt.
assetDbtJobScheduleCronHumanized ¶
Human-readable cron schedule of the job that materialized this asset in dbt.
assetDbtJobStatus ¶
Status of the job that materialized this asset in dbt.
assetDbtMeta ¶
Metadata for this asset in dbt, specifically everything under the 'meta' key in the dbt object.
assetDbtPackageName ¶
Name of the package in which this asset exists in dbt.
assetDbtProjectName ¶
Name of the project in which this asset exists in dbt.
assetDbtSemanticLayerProxyUrl ¶
URL of the semantic layer proxy for this asset in dbt.
assetDbtSourceFreshnessCriteria ¶
Freshness criteria for the source of this asset in dbt.
assetDbtTags ¶
List of tags attached to this asset in dbt.
assetDbtTestStatus ¶
All associated dbt test statuses.
assetDbtUniqueId ¶
Unique identifier of this asset in dbt.
assetDbtWorkflowLastUpdated ¶
Name of the dbt workflow in Atlan that last updated the asset.
assetIcon ¶
Name of the icon to use for this asset. (Only applies to glossaries, currently.)
assetMcIncidentNames ¶
List of Monte Carlo incident names attached to this asset.
assetMcIncidentQualifiedNames ¶
List of unique Monte Carlo incident names attached to this asset.
assetMcIncidentSeverities ¶
List of Monte Carlo incident severities associated with this asset.
assetMcIncidentStates ¶
List of Monte Carlo incident states associated with this asset.
assetMcIncidentSubTypes ¶
List of Monte Carlo incident sub-types associated with this asset.
assetMcIncidentTypes ¶
List of Monte Carlo incident types associated with this asset.
assetMcLastSyncRunAt ¶
Time (epoch) at which this asset was last synced from Monte Carlo.
assetMcMonitorNames ¶
List of Monte Carlo monitor names attached to this asset.
assetMcMonitorQualifiedNames ¶
List of unique Monte Carlo monitor names attached to this asset.
assetMcMonitorScheduleTypes ¶
Schedules of all associated Monte Carlo monitors.
assetMcMonitorStatuses ¶
Statuses of all associated Monte Carlo monitors.
assetMcMonitorTypes ¶
Types of all associated Monte Carlo monitors.
assetSodaCheckCount ¶
Number of checks done via Soda.
assetSodaCheckStatuses ¶
All associated Soda check statuses.
assetSodaDQStatus ¶
Status of data quality from Soda.
assetSodaLastScanAt ¶
TBC
assetSodaLastSyncRunAt ¶
TBC
assetSodaSourceURL ¶
TBC
assetTags ¶
List of tags attached to this asset.
assetThemeHex ¶
Color (in hexadecimal RGB) to use to represent this asset.
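A hexadecimal RGB color can be split into its three channels when it needs to be rendered elsewhere. This sketch assumes a six-digit value with an optional leading "#"; the exact format stored in assetThemeHex is not specified here, and both the helper and the sample color are hypothetical.

```python
from typing import Tuple


def hex_to_rgb(value: str) -> Tuple[int, int, int]:
    """Parse a six-digit hexadecimal RGB string into (red, green, blue)."""
    digits = value[1:] if value.startswith("#") else value
    if len(digits) != 6:
        raise ValueError(f"expected six hexadecimal digits, got {value!r}")
    # int(..., 16) raises ValueError if a pair is not valid hexadecimal.
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return red, green, blue


print(hex_to_rgb("#3940E1"))  # sample color chosen for illustration
```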
certificateStatus ¶
Status of this asset's certification.
certificateStatusMessage ¶
Human-readable descriptive message used to provide further detail to certificateStatus.
certificateUpdatedAt ¶
Time (epoch) at which the certification was last updated, in milliseconds.
certificateUpdatedBy ¶
Name of the user who last updated the certification of this asset.
connectionName ¶
Simple name of the connection through which this asset is accessible.
connectionQualifiedName ¶
Unique name of the connection through which this asset is accessible.
connectorName ¶
Type of the connector through which this asset is accessible.
Uses a different name in SDKs: connectorType, connector_type.
dbtQualifiedName ¶
Unique name of this asset in dbt.
description ¶
Description of this asset, for example as crawled from a source. Fallback for display purposes, if userDescription is empty.
displayName ¶
Human-readable name of this asset used for display purposes (in the user interface).
hasContract ¶
Whether this asset has a contract (true) or not (false).
__hasLineage ¶
Whether this asset has lineage (true) or not (false).
Uses a different name in SDKs: hasLineage, has_lineage.
isAIGenerated ¶
TBC
isDiscoverable ¶
Whether this asset is discoverable through the UI (true) or not (false).
isEditable ¶
Whether this asset can be edited in the UI (true) or not (false).
isPartial ¶
TBC
lastRowChangedAt ¶
Time (epoch) of the last operation that inserted, updated, or deleted rows, in milliseconds.
lastSyncRun ¶
Name of the last run of the crawler that last synchronized this asset.
lastSyncRunAt ¶
Time (epoch) at which this asset was last crawled, in milliseconds.
lastSyncWorkflowName ¶
Name of the crawler that last synchronized this asset.
name ¶
Name of this asset. Fallback for display purposes, if displayName is empty.
ownerGroups ¶
List of groups who own this asset.
ownerUsers ¶
List of users who own this asset.
popularityScore ¶
Popularity score for this asset.
sampleDataUrl ¶
URL for sample data for this asset.
sourceCostUnit ¶
The unit of measure for sourceTotalCost.
sourceCreatedAt ¶
Time (epoch) at which this asset was created in the source system, in milliseconds.
sourceCreatedBy ¶
Name of the user who created this asset, in the source system.
sourceEmbedURL ¶
URL to create an embed for a resource (for example, an image of a dashboard) within Atlan.
sourceLastReadAt ¶
Timestamp of most recent read operation.
sourceOwners ¶
List of owners of this asset, in the source system.
sourceQueryComputeCostRecordList ¶
List of most expensive warehouses with extra insights.
Uses a different name in SDKs: sourceQueryComputeCostRecords, source_query_compute_cost_records.
sourceQueryComputeCostList ¶
List of most expensive warehouse names.
Uses a different name in SDKs: sourceQueryComputeCosts, source_query_compute_costs.
sourceReadCount ¶
Total count of all read operations at source.
sourceReadExpensiveQueryRecordList ¶
List of the most expensive queries that accessed this asset.
Uses a different name in SDKs: sourceReadExpensiveQueryRecords, source_read_expensive_query_records.
sourceReadPopularQueryRecordList ¶
List of the most popular queries that accessed this asset.
Uses a different name in SDKs: sourceReadPopularQueryRecords, source_read_popular_query_records.
sourceReadQueryCost ¶
Total cost of read queries at source.
sourceReadRecentUserRecordList ¶
List of usernames with extra insights for the most recent users who read this asset.
Uses a different name in SDKs: sourceReadRecentUserRecords, source_read_recent_user_records.
sourceReadRecentUserList ¶
List of usernames of the most recent users who read this asset.
Uses a different name in SDKs: sourceReadRecentUsers, source_read_recent_users.
sourceReadSlowQueryRecordList ¶
List of the slowest queries that accessed this asset.
Uses a different name in SDKs: sourceReadSlowQueryRecords, source_read_slow_query_records.
sourceReadTopUserRecordList ¶
List of usernames with extra insights for the users who read this asset the most.
Uses a different name in SDKs: sourceReadTopUserRecords, source_read_top_user_records.
sourceReadTopUserList ¶
List of usernames of the users who read this asset the most.
Uses a different name in SDKs: sourceReadTopUsers, source_read_top_users.
sourceReadUserCount ¶
Total number of unique users that read data from this asset.
sourceTotalCost ¶
Total cost of all operations at source.
sourceURL ¶
URL to the resource within the source application, used to create a button to view this asset in the source application.
sourceUpdatedAt ¶
Time (epoch) at which this asset was last updated in the source system, in milliseconds.
sourceUpdatedBy ¶
Name of the user who last updated this asset, in the source system.
starredBy ¶
Users who have starred this asset.
starredCount ¶
Number of users who have starred this asset.
starredDetailsList ¶
List of usernames with extra information of the users who have starred an asset.
Uses a different name in SDKs: starredDetails, starred_details.
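Several attributes above are documented as using a different name in SDKs, and the renames are not always mechanical (businessAttributes becomes customMetadataSets or custom_metadata, for instance). A small lookup table is one way to keep track of them. The sketch below copies a handful of pairs from this page; the table, the helper, and the tuple layout are hypothetical, and this page does not state which SDK uses which form.

```python
from typing import Dict, Optional, Tuple

# API attribute name -> (camelCase SDK name, snake_case SDK name), as listed on this page.
SDK_NAMES: Dict[str, Tuple[str, str]] = {
    "classifications": ("atlanTags", "atlan_tags"),
    "businessAttributes": ("customMetadataSets", "custom_metadata"),
    "classificationNames": ("atlanTagNames", "atlan_tag_names"),
    "connectorName": ("connectorType", "connector_type"),
    "__hasLineage": ("hasLineage", "has_lineage"),
    "sourceReadTopUserList": ("sourceReadTopUsers", "source_read_top_users"),
    "starredDetailsList": ("starredDetails", "starred_details"),
}


def sdk_names(api_name: str) -> Optional[Tuple[str, str]]:
    """Return the documented SDK names for an API attribute, or None if none is listed."""
    return SDK_NAMES.get(api_name)


print(sdk_names("businessAttributes"))
print(sdk_names("guid"))  # not listed as renamed, so None
```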
subType ¶
Subtype of this asset.
tenantId ¶
Name of the Atlan workspace in which this asset exists.
userDescription ¶
Description of this asset, as provided by a user. If present, this will be used for the description in the user interface.
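The fallback rules described for names and descriptions (displayName before name, userDescription before description) can be sketched as two small helpers over a dictionary of attributes. The helper names are hypothetical, and treating an empty string the same as a missing value is an assumption.

```python
from typing import Any, Dict


def resolve_display_name(attrs: Dict[str, Any]) -> str:
    """Prefer displayName; fall back to name when displayName is empty."""
    return attrs.get("displayName") or attrs.get("name") or ""


def resolve_description(attrs: Dict[str, Any]) -> str:
    """Prefer userDescription; fall back to description when it is empty."""
    return attrs.get("userDescription") or attrs.get("description") or ""


job = {
    "name": "extract_raw_data",
    "displayName": "",
    "description": "Crawled from source",
    "userDescription": "Nightly extract of raw data",
}
print(resolve_display_name(job))
print(resolve_description(job))
```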
viewScore ¶
View score for this asset.
viewerGroups ¶
List of groups who can view assets contained in a collection. (This is only used for certain asset types.)
viewerUsers ¶
List of users who can view assets contained in a collection. (This is only used for certain asset types.)
sparkRunEndTime ¶
End time of the Spark Job, for example 1695673598218.
sparkRunOpenLineageState ¶
OpenLineage state of the Spark Job run, for example COMPLETE.
sparkRunOpenLineageVersion ¶
OpenLineage version of the Spark Job run, for example 1.1.0.
sparkRunStartTime ¶
Start time of the Spark Job, for example 1695673598218.
sparkRunVersion ¶
Spark version for the Spark Job run, for example 3.4.1.
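With both sparkRunStartTime and sparkRunEndTime available, the length of a run is simply their difference. The sketch below assumes both values are epoch timestamps in milliseconds, which the sample value 1695673598218 suggests but this reference does not state outright; the helper name is hypothetical.

```python
from datetime import timedelta
from typing import Any, Dict, Optional


def spark_run_duration(attrs: Dict[str, Any]) -> Optional[timedelta]:
    """Return the run duration, or None if either timestamp is missing."""
    start = attrs.get("sparkRunStartTime")
    end = attrs.get("sparkRunEndTime")
    if start is None or end is None:
        return None
    # Assumption: both values are epoch milliseconds.
    return timedelta(milliseconds=end - start)


run = {"sparkRunStartTime": 1695673598218, "sparkRunEndTime": 1695673658218}
print(spark_run_duration(run))
```

Returning None for a missing timestamp keeps a run that has not finished distinct from one with zero duration.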
These attributes are specific to instances of SparkJob (and all of its subtypes).
sparkAppName ¶
Name of the Spark app containing this Spark Job, for example extract_raw_data.
sparkMaster ¶
The Spark master URL, for example local, local[4], or spark://master:7077.
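The three sample forms of sparkMaster mean different things in Spark: local runs with a single worker thread, local[4] runs locally with four threads, and spark://master:7077 points at a standalone cluster master. A minimal sketch for telling them apart is shown below; the function and the shape of its result are hypothetical, and other master URL forms are passed through as "other".

```python
import re
from typing import Any, Dict
from urllib.parse import urlparse


def describe_spark_master(master: str) -> Dict[str, Any]:
    """Classify a Spark master URL into a small descriptive dictionary."""
    if master == "local":
        return {"mode": "local", "threads": 1}
    match = re.fullmatch(r"local\[(\d+|\*)\]", master)
    if match:
        threads = match.group(1)
        # local[*] means "as many threads as logical cores", reported here as None.
        return {"mode": "local", "threads": None if threads == "*" else int(threads)}
    parsed = urlparse(master)
    if parsed.scheme == "spark" and parsed.hostname:
        return {"mode": "standalone", "host": parsed.hostname, "port": parsed.port}
    return {"mode": "other", "raw": master}


for value in ("local", "local[4]", "spark://master:7077"):
    print(value, describe_spark_master(value))
```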
Relationships¶
Inherited relationships
These relationships are inherited from SparkJob's supertypes:
meanings (AtlasGlossaryTerm)¶
Glossary terms that are linked to this asset.
Uses a different name in SDKs: assignedTerms, assigned_terms.
dataContractLatest (DataContract)¶
Latest version of the data contract (in any status) for this asset.
dataContractLatestCertified (DataContract)¶
Latest certified version of the data contract for this asset.
files (File)¶
TBC
inputPortDataProducts (DataProduct)¶
Data products for which this asset is an input port.
links (Link)¶
Links that are attached to this asset.
mcIncidents (MCIncident)¶
TBC
mcMonitors (MCMonitor)¶
Monitors that observe this asset.
metrics (Metric)¶
TBC
outputPortDataProducts (DataProduct)¶
Data products for which this asset is an output port.
readme (Readme)¶
README that is linked to this asset.
schemaRegistrySubjects (SchemaRegistrySubject)¶
TBC
sodaChecks (SodaCheck)¶
TBC
inputToAirflowTasks (AirflowTask)¶
Tasks to which this asset provides input.
inputToProcesses (Process)¶
Processes to which this asset provides input.
inputToSparkJobs (SparkJob)¶
TBC
outputFromAirflowTasks (AirflowTask)¶
Tasks from which this asset is output.
outputFromProcesses (Process)¶
Processes from which this asset is produced as output.
outputFromSparkJobs (SparkJob)¶
TBC
These relationships are specific to instances of SparkJob (and all of its subtypes).
inputs (Catalog)¶
Assets that are inputs to this task.
outputs (Catalog)¶
Assets that are outputs from this task.
process (Process)¶
TBC