Parse SQL queries¶
You can parse your own SQL queries to assist with managing lineage.
Unsupported
Unless you have a specific agreement that states otherwise, this feature is provided as-is without any support. Also note that this does not create lineage directly — the results of the SQL parser could be used to programmatically build your own lineage.
To parse a SQL query:
Parse SQL query | |
---|---|
1 2 3 4 5 6 7 |
|
- Use the
creator()
method to initialize the object with all necessary attributes. - Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
- Call the
parse()
method to actually parse the query.
Parse SQL query | |
---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
- Use the
create()
method to initialize the object with all necessary attributes. - Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
- Call the
parse_query()
method to actually parse the query.
POST /api/sql/query/parse | |
---|---|
1 2 3 4 5 6 7 8 9 10 |
|
- Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
How to interpret the response¶
The parser interprets the query and provides back details about column-level processing.
Database objects¶
Database objects include tables, views, result sets of queries, and processes.
- For objects like tables and views, the objects list includes the actual columns used.
- For result sets, the objects list may also include system-generated columns that represent changes in the number of records returned or an aggregate value
- Every object is assigned a numeric
id
value, that can be used to cross-reference the flow details in the relationships (below).
Relationships¶
Relationships returned by the parser indicate the flow of data between the database objects.
Each relationship is composed of:
- a
target
, indicating the output column of the flow - one or more
sources
, indicating the input columns used to produce that output - a
type
, which can be one of:fdd
indicates a flow of data direct from source column to target columnfdr
indicates a flow where the source column impacts the target, either:- the number of records in the target (for example, a column used in a
where
clause) - the resulting value of the target (for example, an aggregate function)
- the number of records in the target (for example, a column used in a
- Numeric
id
,parentId
andprocessId
identifiers refer to the same numericid
fields used in the objects (above). So these can be used to lookup the additional object detail.
Interpret the parsed response | |
---|---|
8 9 |
|
- You can retrieve the list of objects from the response using
getObjects()
. - You can retrieve the list of relationships (flows) between the objects using
getRelationships()
.
Interpret the parsed response | |
---|---|
14 15 |
|
- You can retrieve the list of objects from the response using the
dbobjs
property. - You can retrieve the list of relationships (flows) between the objects using the
relationships
property.
POST /api/sql/query/parse | |
---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 |
|
- The objects are returned in the
dbobjs
list. - The relationships are returned in the
relationships
list.