SparkJob

Instance of a Spark Job run in Atlan.

Complete reference

This is a complete reference for the SparkJob object in Atlan, showing every possible property and relationship that can exist for these objects. For an introduction, you probably want to start with:

  • Snippets — small, atomic examples of single-step use cases.
  • Patterns — walkthroughs of common multi-step implementation patterns.

SparkJob inherits its attributes and relationships from these other types:

classDiagram
    direction RL
    class SparkJob
    link SparkJob "../sparkjob"
    class Spark {
        <<abstract>>
    }
    link Spark "../spark"
    Spark <|-- SparkJob : extends
    class Catalog {
        <<abstract>>
    }
    link Catalog "../catalog"
    Catalog <|-- Spark : extends
    class Asset {
        <<abstract>>
    }
    link Asset "../asset"
    Asset <|-- Catalog : extends
    class Referenceable {
        <<abstract>>
    }
    link Referenceable "../referenceable"
    Referenceable <|-- Asset : extends

Properties

Inherited properties

These attributes are inherited from SparkJob's supertypes (shown above):

typeName

Type of this asset.

guid

Globally-unique identifier for this asset.

classifications

Tags assigned to the asset. (1)

  1. Uses a different name in SDKs

    atlanTags
    atlan_tags

    For more information, see the tag assets snippets.

businessAttributes

Map of custom metadata attributes and values defined on the asset. (1)

  1. Uses a different name in SDKs

    customMetadataSets
    custom_metadata

    For more information, see the change custom metadata snippets.

status

Status of the asset. (1)

  1. Treat as read-only

    You should not try to set status on an asset. Instead, see the asset CRUD snippets on deleting and restoring assets.

createdBy

User or account that created the asset.

updatedBy

User or account that last updated the asset.

createTime

Time (epoch) at which the asset was created, in milliseconds.

updateTime

Time (epoch) at which the asset was last updated, in milliseconds.

deleteHandler

Details on the handler used for deletion of the asset. (1)

  1. Treat as read-only

    You should not try to set deleteHandler on an asset. Instead, see the asset CRUD snippets on deleting assets.

classificationNames

Hashed-string names of the Atlan tags that exist on the asset. (1)

  1. Uses a different name in SDKs

    atlanTagNames
    atlan_tag_names

    Use classifications to make changes to tags.

isIncomplete

Unused.

meaningNames

Human-readable names of terms that have been linked to this asset.

meanings

Details of terms that have been linked to this asset. (1)

  1. Do not use

    These should not be used, as they can be inconsistent. Instead, see the link terms and assets snippets.

pendingTasks

Unique identifiers (GUIDs) for any background tasks that are yet to operate on this asset.

adminGroups

List of groups who administer this asset. (This is only used for certain asset types.)

adminRoles

List of roles who administer this asset. (This is only used for Connection assets.)

adminUsers

List of users who administer this asset. (This is only used for certain asset types.)

announcementMessage

Detailed message to include in the announcement on this asset.

announcementTitle

Brief title for the announcement on this asset. Required when announcementType is specified.

announcementType

Type of announcement on this asset.

announcementUpdatedAt

Time (epoch) at which the announcement was last updated, in milliseconds.

announcementUpdatedBy

Name of the user who last updated the announcement.

assetCoverImage

TBC

assetDbtAccountName

Name of the account in which this asset exists in dbt.

assetDbtAlias

Alias of this asset in dbt.

assetDbtEnvironmentDbtVersion

Version of the environment in which this asset is materialized in dbt.

assetDbtEnvironmentName

Name of the environment in which this asset is materialized in dbt.

assetDbtJobLastRun

Time (epoch) at which the job that materialized this asset in dbt last ran, in milliseconds.

assetDbtJobLastRunArtifactS3Path

Path in S3 to the artifacts saved from the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunArtifactsSaved

Whether artifacts were saved from the last run of the job that materialized this asset in dbt (true) or not (false).

assetDbtJobLastRunCreatedAt

Time (epoch) at which the job that materialized this asset in dbt was last created, in milliseconds.

assetDbtJobLastRunDequedAt

Time (epoch) at which the job that materialized this asset in dbt was dequeued, in milliseconds.

assetDbtJobLastRunExecutedByThreadId

Thread ID of the user who executed the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunGitBranch

Branch in git from which the last run of the job that materialized this asset in dbt ran.

assetDbtJobLastRunGitSha

SHA hash in git for the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunHasDocsGenerated

Whether docs were generated from the last run of the job that materialized this asset in dbt (true) or not (false).

assetDbtJobLastRunHasSourcesGenerated

Whether sources were generated from the last run of the job that materialized this asset in dbt (true) or not (false).

assetDbtJobLastRunNotificationsSent

Whether notifications were sent from the last run of the job that materialized this asset in dbt (true) or not (false).

assetDbtJobLastRunOwnerThreadId

Thread ID of the owner of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunQueuedDuration

Total duration the job that materialized this asset in dbt spent being queued.

assetDbtJobLastRunQueuedDurationHumanized

Human-readable total duration that the last run of the job that materialized this asset in dbt spent being queued.

assetDbtJobLastRunRunDuration

Run duration of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunRunDurationHumanized

Human-readable run duration of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunStartedAt

Time (epoch) at which the job that materialized this asset in dbt started running, in milliseconds.

assetDbtJobLastRunStatusMessage

Status message of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunTotalDuration

Total duration of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunTotalDurationHumanized

Human-readable total duration of the last run of the job that materialized this asset in dbt.

assetDbtJobLastRunUpdatedAt

Time (epoch) at which the job that materialized this asset in dbt was last updated, in milliseconds.

assetDbtJobLastRunUrl

URL of the last run of the job that materialized this asset in dbt.

assetDbtJobName

Name of the job that materialized this asset in dbt.

assetDbtJobNextRun

Time (epoch) when the next run of the job that materializes this asset in dbt is scheduled.

assetDbtJobNextRunHumanized

Human-readable time when the next run of the job that materializes this asset in dbt is scheduled.

assetDbtJobSchedule

Schedule of the job that materialized this asset in dbt.

assetDbtJobScheduleCronHumanized

Human-readable cron schedule of the job that materialized this asset in dbt.

assetDbtJobStatus

Status of the job that materialized this asset in dbt.

assetDbtMeta

Metadata for this asset in dbt, specifically everything under the 'meta' key in the dbt object.

assetDbtPackageName

Name of the package in which this asset exists in dbt.

assetDbtProjectName

Name of the project in which this asset exists in dbt.

assetDbtSemanticLayerProxyUrl

URL of the semantic layer proxy for this asset in dbt.

assetDbtSourceFreshnessCriteria

Freshness criteria for the source of this asset in dbt.

assetDbtTags

List of tags attached to this asset in dbt.

assetDbtTestStatus

All associated dbt test statuses.

assetDbtUniqueId

Unique identifier of this asset in dbt.

assetDbtWorkflowLastUpdated

Name of the dbt workflow in Atlan that last updated the asset.

assetIcon

Name of the icon to use for this asset. (Only applies to glossaries, currently.)

assetMcIncidentNames

List of Monte Carlo incident names attached to this asset.

assetMcIncidentQualifiedNames

List of unique Monte Carlo incident names attached to this asset.

assetMcIncidentSeverities

List of Monte Carlo incident severities associated with this asset.

assetMcIncidentStates

List of Monte Carlo incident states associated with this asset.

assetMcIncidentSubTypes

List of Monte Carlo incident sub-types associated with this asset.

assetMcIncidentTypes

List of Monte Carlo incident types associated with this asset.

assetMcLastSyncRunAt

Time (epoch) at which this asset was last synced from Monte Carlo.

assetMcMonitorNames

List of Monte Carlo monitor names attached to this asset.

assetMcMonitorQualifiedNames

List of unique Monte Carlo monitor names attached to this asset.

assetMcMonitorScheduleTypes

Schedules of all associated Monte Carlo monitors.

assetMcMonitorStatuses

Statuses of all associated Monte Carlo monitors.

assetMcMonitorTypes

Types of all associated Monte Carlo monitors.

assetSodaCheckCount

Number of checks done via Soda.

assetSodaCheckStatuses

All associated Soda check statuses.

assetSodaDQStatus

Status of data quality from Soda.

assetSodaLastScanAt

TBC

assetSodaLastSyncRunAt

TBC

assetSodaSourceURL

TBC

assetTags

List of tags attached to this asset.

assetThemeHex

Color (in hexadecimal RGB) to use to represent this asset.

certificateStatus

Status of this asset's certification.

certificateStatusMessage

Human-readable descriptive message used to provide further detail to certificateStatus.

certificateUpdatedAt

Time (epoch) at which the certification was last updated, in milliseconds.

certificateUpdatedBy

Name of the user who last updated the certification of this asset.

connectionName

Simple name of the connection through which this asset is accessible.

connectionQualifiedName

Unique name of the connection through which this asset is accessible.

connectorName

Type of the connector through which this asset is accessible. (1)

  1. Uses a different name in SDKs

    connectorType
    connector_type

dbtQualifiedName

Unique name of this asset in dbt.

description

Description of this asset, for example as crawled from a source. Fallback for display purposes, if userDescription is empty.

displayName

Human-readable name of this asset used for display purposes (in the user interface).

hasContract

Whether this asset has a contract (true) or not (false).

__hasLineage

Whether this asset has lineage (true) or not (false). (1)

  1. Uses a different name in SDKs

    hasLineage
    has_lineage

isAIGenerated

TBC

isDiscoverable

Whether this asset is discoverable through the UI (true) or not (false).

isEditable

Whether this asset can be edited in the UI (true) or not (false).

isPartial

TBC

lastRowChangedAt

Time (epoch) of the last operation that inserted, updated, or deleted rows, in milliseconds.

lastSyncRun

Name of the last run of the crawler that last synchronized this asset.

lastSyncRunAt

Time (epoch) at which this asset was last crawled, in milliseconds.

lastSyncWorkflowName

Name of the crawler that last synchronized this asset.

name

Name of this asset. Fallback for display purposes, if displayName is empty.

ownerGroups

List of groups who own this asset.

ownerUsers

List of users who own this asset.

popularityScore

Popularity score for this asset.

sampleDataUrl

URL for sample data for this asset.

sourceCostUnit

The unit of measure for sourceTotalCost.

sourceCreatedAt

Time (epoch) at which this asset was created in the source system, in milliseconds.

sourceCreatedBy

Name of the user who created this asset, in the source system.

sourceEmbedURL

URL to create an embed for a resource (for example, an image of a dashboard) within Atlan.

sourceLastReadAt

Timestamp of most recent read operation.

sourceOwners

List of owners of this asset, in the source system.

sourceQueryComputeCostRecordList

List of most expensive warehouses with extra insights. (1)

  1. Uses a different name in SDKs

    sourceQueryComputeCostRecords
    source_query_compute_cost_records

sourceQueryComputeCostList

List of most expensive warehouse names. (1)

  1. Uses a different name in SDKs

    sourceQueryComputeCosts
    source_query_compute_costs

sourceReadCount

Total count of all read operations at source.

sourceReadExpensiveQueryRecordList

List of the most expensive queries that accessed this asset. (1)

  1. Uses a different name in SDKs

    sourceReadExpensiveQueryRecords
    source_read_expensive_query_records

sourceReadPopularQueryRecordList

List of the most popular queries that accessed this asset. (1)

  1. Uses a different name in SDKs

    sourceReadPopularQueryRecords
    source_read_popular_query_records

sourceReadQueryCost

Total cost of read queries at source.

sourceReadRecentUserRecordList

List of usernames with extra insights for the most recent users who read this asset. (1)

  1. Uses a different name in SDKs

    sourceReadRecentUserRecords
    source_read_recent_user_records

sourceReadRecentUserList

List of usernames of the most recent users who read this asset. (1)

  1. Uses a different name in SDKs

    sourceReadRecentUsers
    source_read_recent_users

sourceReadSlowQueryRecordList

List of the slowest queries that accessed this asset. (1)

  1. Uses a different name in SDKs

    sourceReadSlowQueryRecords
    source_read_slow_query_records

sourceReadTopUserRecordList

List of usernames with extra insights for the users who read this asset the most. (1)

  1. Uses a different name in SDKs

    sourceReadTopUserRecords
    source_read_top_user_records

sourceReadTopUserList

List of usernames of the users who read this asset the most. (1)

  1. Uses a different name in SDKs

    sourceReadTopUsers
    source_read_top_users

sourceReadUserCount

Total number of unique users that read data from this asset.

sourceTotalCost

Total cost of all operations at source.

sourceURL

URL to the resource within the source application, used to create a button to view this asset in the source application.

sourceUpdatedAt

Time (epoch) at which this asset was last updated in the source system, in milliseconds.

sourceUpdatedBy

Name of the user who last updated this asset, in the source system.

starredBy

Users who have starred this asset.

starredCount

Number of users who have starred this asset.

starredDetailsList

List of usernames with extra information of the users who have starred an asset. (1)

  1. Uses a different name in SDKs

    starredDetails
    starred_details
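
Several properties above are annotated as using a different name in SDKs. As a minimal sketch, some of the pairs listed on this page can be gathered into a lookup table. The table name `SDK_NAMES` and the helper `sdk_names` are hypothetical and not part of any Atlan SDK, and reading the first listed name as the camelCase form and the second as the snake_case form is an assumption based only on how the names are spelled.

```python
from typing import Optional, Tuple

# Hypothetical lookup table built only from renames listed on this page.
# Each value is (camelCase name, snake_case name), in the order the page lists them.
SDK_NAMES = {
    "classifications": ("atlanTags", "atlan_tags"),
    "businessAttributes": ("customMetadataSets", "custom_metadata"),
    "classificationNames": ("atlanTagNames", "atlan_tag_names"),
    "connectorName": ("connectorType", "connector_type"),
    "__hasLineage": ("hasLineage", "has_lineage"),
    "sourceReadRecentUserList": ("sourceReadRecentUsers", "source_read_recent_users"),
    "sourceReadTopUserList": ("sourceReadTopUsers", "source_read_top_users"),
    "starredDetailsList": ("starredDetails", "starred_details"),
}


def sdk_names(api_name: str) -> Optional[Tuple[str, str]]:
    """Return the listed SDK names for an attribute, or None if no rename is listed."""
    return SDK_NAMES.get(api_name)


print(sdk_names("classifications"))
```

Returning None for attributes without a listed rename avoids guessing at names this page does not document.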

subType

Subtype of this asset.

tenantId

Name of the Atlan workspace in which this asset exists.

userDescription

Description of this asset, as provided by a user. If present, this will be used for the description in the user interface.
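
The fallback rules stated for description, userDescription, name and displayName can be sketched as follows. This assumes an asset is represented as a plain dictionary keyed by the attribute names on this page; the function names `display_description` and `display_name` are hypothetical and only illustrate the stated precedence.

```python
def display_description(asset: dict) -> str:
    """Prefer userDescription; fall back to description when it is empty or missing."""
    return asset.get("userDescription") or asset.get("description") or ""


def display_name(asset: dict) -> str:
    """Prefer displayName; fall back to name when it is empty or missing."""
    return asset.get("displayName") or asset.get("name") or ""


# Illustrative attribute values only, not real crawled metadata.
job = {
    "name": "extract_raw_data",
    "displayName": "",
    "description": "Crawled description",
    "userDescription": "Curated description",
}
print(display_name(job), "|", display_description(job))
```

Using `or` treats both a missing key and an empty string as "empty", which matches the wording of the fallback rules above.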

viewScore

View score for this asset.

viewerGroups

List of groups who can view assets contained in a collection. (This is only used for certain asset types.)

viewerUsers

List of users who can view assets contained in a collection. (This is only used for certain asset types.)

sparkRunEndTime

End time of the Spark Job, for example 1695673598218.

sparkRunOpenLineageState

OpenLineage state of the Spark Job run, for example COMPLETE.

sparkRunOpenLineageVersion

OpenLineage version of the Spark Job run, for example 1.1.0.

sparkRunStartTime

Start time of the Spark Job, for example 1695673598218.

sparkRunVersion

Spark version for the Spark Job run, for example 3.4.1.
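
The example values for sparkRunStartTime and sparkRunEndTime look like epoch timestamps in milliseconds, in line with the other time attributes on this page. Under that assumption, which this page does not state explicitly for the Spark attributes, a minimal sketch of converting them to datetimes and computing a run duration could look like this. The helper names `from_epoch_millis` and `run_duration` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch-based timestamps, in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=millis)


def run_duration(start_millis: int, end_millis: int) -> timedelta:
    """Elapsed time between a start and an end timestamp, both in epoch milliseconds."""
    return timedelta(milliseconds=end_millis - start_millis)


start = from_epoch_millis(1695673598218)
print(start.isoformat())
# The end value here is illustrative: the example start plus 90 seconds.
print(run_duration(1695673598218, 1695673598218 + 90_000))
```

Adding a timedelta to a fixed epoch keeps the arithmetic in integers, so the millisecond part is not subject to floating-point rounding.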

These attributes are specific to instances of SparkJob (and all of its subtypes).

sparkAppName

Name of the Spark app containing this Spark Job, for example extract_raw_data.

sparkMaster

The Spark master URL, for example local, local[4], or spark://master:7077.
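
To make the three example forms of sparkMaster concrete, here is a minimal sketch that classifies a value as local or standalone. It handles only the forms shown on this page; Spark accepts other master URLs that this sketch deliberately rejects. The `SparkMaster` type and `parse_spark_master` function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class SparkMaster(NamedTuple):
    mode: str                # "local" or "standalone"
    threads: Optional[int]   # worker threads, for local mode only
    host: Optional[str]      # master host, for standalone mode only
    port: Optional[int]      # master port, for standalone mode only


def parse_spark_master(value: str) -> SparkMaster:
    """Classify the example forms: local, local[N], and spark://host:port."""
    if value == "local":
        # Plain "local" runs Spark with a single worker thread.
        return SparkMaster("local", 1, None, None)
    match = re.fullmatch(r"local\[(\d+)\]", value)
    if match:
        return SparkMaster("local", int(match.group(1)), None, None)
    match = re.fullmatch(r"spark://([^:/]+):(\d+)", value)
    if match:
        return SparkMaster("standalone", None, match.group(1), int(match.group(2)))
    raise ValueError(f"Unrecognized sparkMaster value: {value!r}")


for example in ("local", "local[4]", "spark://master:7077"):
    print(example, "->", parse_spark_master(example))
```

Raising on unrecognized values, rather than guessing, keeps the sketch honest about the forms it does not cover.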

Relationships

Inherited relationships

These relationships are inherited from SparkJob's supertypes:

meanings (AtlasGlossaryTerm)

Glossary terms that are linked to this asset. (1)

  1. Uses a different name in SDKs

    assignedTerms
    assigned_terms

dataContractLatest (DataContract)

Latest version of the data contract (in any status) for this asset.

dataContractLatestCertified (DataContract)

Latest certified version of the data contract for this asset.

files (File)

TBC

inputPortDataProducts (DataProduct)

Data products for which this asset is an input port.

links (Link)

Links that are attached to this asset.

mcIncidents (MCIncident)

TBC

mcMonitors (MCMonitor)

Monitors that observe this asset.

metrics (Metric)

TBC

outputPortDataProducts (DataProduct)

Data products for which this asset is an output port.

readme (Readme)

README that is linked to this asset.

schemaRegistrySubjects (SchemaRegistrySubject)

TBC

sodaChecks (SodaCheck)

TBC

inputToAirflowTasks (AirflowTask)

Tasks to which this asset provides input.

inputToProcesses (Process)

Processes to which this asset provides input.

inputToSparkJobs (SparkJob)

TBC

outputFromAirflowTasks (AirflowTask)

Tasks from which this asset is output.

outputFromProcesses (Process)

Processes from which this asset is produced as output.

outputFromSparkJobs (SparkJob)

TBC

These relationships are specific to instances of SparkJob (and all of its subtypes).

inputs (Catalog)

Assets that are inputs to this Spark Job.

outputs (Catalog)

Assets that are outputs from this Spark Job.

process (Process)

TBC