Parse SQL queries¶
You can parse your own SQL queries to assist with managing lineage.
Unsupported
Unless you have a specific agreement that states otherwise, this feature is provided as-is without any support. Also note that this does not create lineage directly — the results of the SQL parser could be used to programmatically build your own lineage.
To parse a SQL query:
| Parse SQL query | |
|---|---|
1 2 3 4 5 6 7 | |
- Use the
creator()method to initialize the object with all necessary attributes. - Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
- Call the
parse()method to actually parse the query.
| Parse SQL query | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 | |
- Use the
create()method to initialize the object with all necessary attributes. - Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
- Call the
parse_query()method to actually parse the query.
| POST /api/sql/query/parse | |
|---|---|
1 2 3 4 5 6 7 8 9 10 | |
- Provide the SQL code itself.
- Specify the data store for the SQL code, so it can be appropriately interpreted.
- (Optional) Provide a name to use for the database for any objects in the query that are not qualified.
- (Optional) Provide a name to use for the schema for any objects in the query that are not qualified.
How to interpret the response¶
The parser interprets the query and provides back details about column-level processing.
Database objects¶
Database objects include tables, views, result sets of queries, and processes.
- For objects like tables and views, the objects list includes the actual columns used.
- For result sets, the objects list may also include system-generated columns that represent changes in the number of records returned or an aggregate value
- Every object is assigned a numeric
idvalue, that can be used to cross-reference the flow details in the relationships (below).
Relationships¶
Relationships returned by the parser indicate the flow of data between the database objects.
Each relationship is composed of:
- a
target, indicating the output column of the flow - one or more
sources, indicating the input columns used to produce that output - a
type, which can be one of:fddindicates a flow of data direct from source column to target columnfdrindicates a flow where the source column impacts the target, either:- the number of records in the target (for example, a column used in a
whereclause) - the resulting value of the target (for example, an aggregate function)
- the number of records in the target (for example, a column used in a
- Numeric
id,parentIdandprocessIdidentifiers refer to the same numericidfields used in the objects (above). So these can be used to lookup the additional object detail.
| Interpret the parsed response | |
|---|---|
8 9 | |
- You can retrieve the list of objects from the response using
getObjects(). - You can retrieve the list of relationships (flows) between the objects using
getRelationships().
| Interpret the parsed response | |
|---|---|
14 15 | |
- You can retrieve the list of objects from the response using the
dbobjsproperty. - You can retrieve the list of relationships (flows) between the objects using the
relationshipsproperty.
| POST /api/sql/query/parse | |
|---|---|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 | |
- The objects are returned in the
dbobjslist. - The relationships are returned in the
relationshipslist.