Term-level queries
Term-level queries allow you to find results based on precise values in structured data [1], for example by asset type, status, or GUID.
Unlike full-text queries, the search input you use in a term-level query is not analyzed. This means what you search for is matched exactly against what is stored in an attribute, with no fuzzy matching applied [2].
Details
Below are the various kinds of term-level queries, sorted with the most commonly used at the top and covering their usual usage. Each one is linked to Elasticsearch's own documentation for greater detail. (In most cases there are many more options for each kind of query than are documented here.)
You will often combine these queries to create more complex criteria. For the most frequently used queries, factory methods in the `CompoundQuery` class give you the query more quickly.
Term
Term queries return results where the asset's value for that attribute matches exactly what you're searching for.
What if I want it to be a case-insensitive match?
You can still use term queries for case-insensitive matching, too.
- Java: add a second parameter of `true` to the predicate method
- Python: add a named parameter of `case_insensitive=True` to the predicate method
- Raw REST API: send through `"case_insensitive": true` to the API directly
Build the query and request (Java)
- You can search across all assets using the `select()` method of the `assets` member on any client.
- Chain a `where()` onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the `NAME` of an `Asset`. Adding the `eq()` predicate creates a term query. You can also optionally send a second parameter of `true` to do a case-insensitive match.
Equivalent query through Elastic:
    Query byTerm = TermQuery.of(t -> t
        .field("name.keyword")
        .value("some-name")
        .caseInsensitive(true))
      ._toQuery();
Build the query and request (Python)
- You can search across all assets using a `FluentSearch()` object.
- Chain a `where()` onto the fluent search object, with the class variable representing a field of the type you want to search to start a query, in this case the `NAME` of an `Asset`. Adding the `eq()` predicate creates a term query. You can also optionally send a named parameter of `case_insensitive=True` to do a case-insensitive match.
POST /api/meta/search/indexsearch (Raw REST API)
- Queries must be within the `dsl` object in the API...
- ...and within that the `query` object.
- For a term query, there needs to be a `term` object embedded within the `query` object.
- Within this object should be a key with the name of the Elasticsearch field (Atlan attribute) to match against.
- The value for this field (attribute) to match against should be given through the `value` property.
- Optionally, you can enable case-insensitive searching to have an almost exact match by setting `case_insensitive` to `true`.
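The nesting described in the notes above can be sketched as a plain Python dictionary. This is only an illustration of the request body's shape: `term_request` is a hypothetical helper (not part of any SDK), and the field and value are the ones used in the Java example (`name.keyword`, `some-name`). Nothing is sent over the network here; the resulting JSON is what you would POST to `/api/meta/search/indexsearch`.

```python
import json


def term_request(field: str, value: str, case_insensitive: bool = False) -> dict:
    """Build a request body holding a single term query (hypothetical helper)."""
    condition = {"value": value}
    if case_insensitive:
        # Only include the flag when asked for; leaving it out means an exact match.
        condition["case_insensitive"] = True
    # The term object sits inside query, which sits inside dsl.
    return {"dsl": {"query": {"term": {field: condition}}}}


request = term_request("name.keyword", "some-name", case_insensitive=True)
print(json.dumps(request, indent=2))
```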
Terms
Terms queries return results where the asset's value for that attribute exactly matches one or more of the values you're searching for.
Build the query and request (Java)
- You can search across all assets using the `select()` method of the `assets` member on any client.
- Chain a `where()` onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the `TYPE_NAME` of an asset. Adding the `in()` predicate creates a terms query.
Equivalent query through Elastic:
    Query byType = TermsQuery.of(t -> t
        .field("__typeName.keyword")
        .terms(TermsQueryField.of(f -> f
          .value(List.of(
            FieldValue.of(Table.TYPE_NAME),
            FieldValue.of(Column.TYPE_NAME))))))
      ._toQuery();
Build the query and request (Python)
- You can search across all assets using a `FluentSearch()` object.
- Chain a `where()` onto the fluent search object, with the class variable representing a field of the type you want to search to start a query, in this case the `TYPE_NAME` of an asset. Adding the `within()` predicate creates a terms query.
POST /api/meta/search/indexsearch (Raw REST API)
- The general way to construct a terms query, with all flexibility provided by Elasticsearch. This query would find all Table and Column assets, by exactly matching either the `Table` or `Column` types.
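As a rough sketch of the request body described above, a terms query maps the field name directly to a list of acceptable values. The helper name `terms_request` is hypothetical; the field `__typeName.keyword` and the two type names come from the Java example.

```python
import json


def terms_request(field: str, values: list) -> dict:
    """Build a request body holding a single terms query (hypothetical helper)."""
    if not values:
        raise ValueError("a terms query needs at least one value to match")
    # Unlike a term query, the field maps straight to the list of values.
    return {"dsl": {"query": {"terms": {field: list(values)}}}}


request = terms_request("__typeName.keyword", ["Table", "Column"])
print(json.dumps(request, indent=2))
```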
Exists
Exists queries return results where the asset contains a value for that attribute. For example, this query would find all assets that have been changed after being created:
Build the query and request (Java)
- You can search across all assets using the `select()` method of the `assets` member on any client.
- Chain a `where()` onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the person who last updated an asset. Adding the `hasAnyValue()` predicate creates an exists query. This will only match results where the field has some value on the asset.
Equivalent query through Elastic:
    Query byExistence = ExistsQuery.of(q -> q
        .field("__modifiedBy"))
      ._toQuery();
Build the query and request (Python)
- You can search across all assets using a `FluentSearch()` object.
- Chain a `where()` onto the fluent search object, with the class variable representing a field of the type you want to search to start a query, in this case the person who last updated an asset. Adding the `has_any_value()` predicate creates an exists query. This will only match results where the field has some value on the asset.
POST /api/meta/search/indexsearch (Raw REST API)
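For the raw request, an exists query only needs the name of the field to check. The sketch below assumes the same `dsl` / `query` envelope used by the other raw requests on this page and the `__modifiedBy` field from the Java example; `exists_request` is a hypothetical helper name.

```python
import json


def exists_request(field: str) -> dict:
    """Build a request body matching assets that have any value for `field`."""
    # An exists query takes the field name as a value, not as a key.
    return {"dsl": {"query": {"exists": {"field": field}}}}


request = exists_request("__modifiedBy")
print(json.dumps(request, indent=2))
```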
Range
Range queries return results where the asset's value for that attribute is within the range you're searching. (This works for numeric fields only, which for Atlan includes dates, since they are stored as epoch values.) For example, this query would find all assets that were created between January 1, 2022 and February 1, 2022:
Build the query and request (Java)
- You can search across all assets using the `select()` method of the `assets` member on any client.
- Chain a `where()` onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the time an asset was created. Adding the `between()` predicate creates a range query. In this example `between()` allows you to specify two values any matching assets should be between. You could also use:
    - `gt()` for any values strictly greater than a single number
    - `gte()` for any values greater than or equal to a single number
    - `lt()` for values strictly less than a single number
    - `lte()` for values less than or equal to a single number
    - `eq()` for values strictly equal to a single number
Equivalent query through Elastic:
    Query byRange = RangeQuery.of(r -> r
        .field("__timestamp")
        .gte(JsonData.of(1640995200000L))
        .lt(JsonData.of(1643673600000L)))
      ._toQuery();
Build the query and request (Python)
- You can search across all assets using a `FluentSearch()` object.
- Chain a `where()` onto the fluent search object, with the class variable representing a field of the type you want to search to start a query, in this case the time an asset was created. Adding the `between()` predicate creates a range query. In this example `between()` allows you to specify two values any matching assets should be between. You could also use:
    - `gt()` for any values strictly greater than a single number
    - `gte()` for any values greater than or equal to a single number
    - `lt()` for values strictly less than a single number
    - `lte()` for values less than or equal to a single number
    - `eq()` for values strictly equal to a single number
POST /api/meta/search/indexsearch (Raw REST API)
- You do not need to specify both ends of the range; you could use only a single condition.
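Because dates are stored as epoch values, the bounds of a date range have to be converted to milliseconds first. The sketch below shows that conversion and the resulting request body, using the `__timestamp` field and the `gte` / `lt` bounds from the Java example. The helpers `epoch_millis` and `range_request` are hypothetical, and the dates are assumed to be in UTC.

```python
import json
from datetime import datetime, timezone

ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def epoch_millis(year: int, month: int, day: int) -> int:
    """Midnight UTC on the given date, as milliseconds since the epoch."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


def range_request(field: str, **bounds: int) -> dict:
    """Build a range query; one bound is enough, both ends are not required."""
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown or not bounds:
        raise ValueError("use one or more of: gt, gte, lt, lte")
    return {"dsl": {"query": {"range": {field: dict(bounds)}}}}


start_ms = epoch_millis(2022, 1, 1)
end_ms = epoch_millis(2022, 2, 1)
request = range_request("__timestamp", gte=start_ms, lt=end_ms)
print(json.dumps(request, indent=2))
```

Passing a single bound, for example `range_request("__timestamp", gte=start_ms)`, gives an open-ended range.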
Prefix
Prefix queries return results where the asset's value for that attribute starts with what you're searching for. For example, this query would find all columns whose `qualifiedName` starts with `default/snowflake/1662194632` (in other words, all columns in any table, view, materialized view, schema or database in that connection):
What if I want it to be a case-insensitive match?
You can still use prefix queries for case-insensitive matching, too.
- Java: add a second parameter of `true` to the predicate method
- Python: add a named parameter of `case_insensitive=True` to the predicate method
- Raw REST API: send through `"case_insensitive": true` to the API directly
Build the query and request (Java)
- You can search across all assets using the `select()` method of the `assets` member on any client.
- Chain a `where()` onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the `QUALIFIED_NAME` of an `Asset`. Adding the `startsWith()` predicate creates a prefix query. This will only match results where the field's value starts with the provided string. You can also optionally send a second parameter of `true` to do a case-insensitive match.
Equivalent query through Elastic:
    Query byPrefix = PrefixQuery.of(p -> p
        .field("qualifiedName")
        .value("default/snowflake/1662194632"))
      ._toQuery();
Build the query and request (Python)
- You can search across all assets using a `FluentSearch()` object.
- Chain a `where()` onto the fluent search object, with the class variable representing a field of the type you want to search to start a query, in this case the `QUALIFIED_NAME` of an `Asset`. Adding the `startswith()` predicate creates a prefix query. This will only match results where the field's value starts with the provided string. You can also optionally send a named parameter of `case_insensitive=True` to do a case-insensitive match.
POST /api/meta/search/indexsearch (Raw REST API)
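A prefix query's raw body has the same shape as a term query, with `prefix` in place of `term`. The sketch below covers only the prefix condition, using the `qualifiedName` field and value from the Java example; restricting the results to columns would need an additional type condition combined with it. `prefix_request` is a hypothetical helper name.

```python
import json


def prefix_request(field: str, prefix: str, case_insensitive: bool = False) -> dict:
    """Build a request body holding a single prefix query (hypothetical helper)."""
    condition = {"value": prefix}
    if case_insensitive:
        condition["case_insensitive"] = True
    return {"dsl": {"query": {"prefix": {field: condition}}}}


request = prefix_request("qualifiedName", "default/snowflake/1662194632")
print(json.dumps(request, indent=2))

# What "starts with" means for a stored value, checked locally:
example = "default/snowflake/1662194632/DB/SCHEMA/TABLE/COLUMN"
print(example.startswith("default/snowflake/1662194632"))
```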
Wildcard
Wildcard queries return results where the asset's value for that attribute matches the wildcard pattern you're searching. This can be useful for searching based on simple naming conventions. For example, this query would find all assets whose name starts with `C_` and ends with `_SK` with any characters in between:
Build the query (Java)
Build the request (Java)
Python: coming soon
POST /api/meta/search/indexsearch (Raw REST API)
Avoid starting the search pattern with a wildcard
Using this to do an ends-with style search (such as `*_SK`) can be very slow.
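A minimal sketch of the raw request for the `C_*_SK` example follows, with a small check for the leading-wildcard case warned about above. The field `name.keyword` is an assumption borrowed from the term query example, and both helper names are hypothetical. The local check uses Python's `fnmatchcase`, which treats `*` and `?` similarly to Elasticsearch wildcards but also supports `[...]` sets, so it is only an approximation.

```python
import json
from fnmatch import fnmatchcase


def has_leading_wildcard(pattern: str) -> bool:
    """True when the pattern starts with * or ?, which tends to be slow."""
    return pattern[:1] in ("*", "?")


def wildcard_request(field: str, pattern: str) -> dict:
    """Build a request body holding a single wildcard query."""
    return {"dsl": {"query": {"wildcard": {field: {"value": pattern}}}}}


pattern = "C_*_SK"
request = wildcard_request("name.keyword", pattern)  # field name is an assumption
print(json.dumps(request, indent=2))

# Approximate local check of which names the pattern would accept.
for name in ("C_ADDR_SK", "C_CUSTOMER_ADDRESS_SK", "X_ADDR_SK"):
    print(name, fnmatchcase(name, pattern))
print(has_leading_wildcard("*_SK"))
```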
Regexp
Regexp queries return results where the asset's value for that attribute matches the regular expression you're searching. This can be useful for searching based on more complicated naming conventions. For example, this query would find all assets whose name starts with `C_` and ends with `_SK` with the characters `ADDR` somewhere in between:
Build the query (Java)
Build the request (Java)
Python: coming soon
POST /api/meta/search/indexsearch (Raw REST API)
Performance can vary widely depending on the regular expression
To achieve the best performance, avoid using wildcard patterns such as `.*` or `.*?+` without any prefix or suffix.
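The sketch below shows one possible raw request for the `ADDR` example. Both the expression `C_.*ADDR.*_SK` and the field `name.keyword` are assumptions, since the original expression is not shown here. Elasticsearch regular expressions must match the whole value, so Python's `re.fullmatch` is used for a rough local check; the two regular expression dialects differ in other respects.

```python
import json
import re


def regexp_request(field: str, expression: str) -> dict:
    """Build a request body holding a single regexp query (hypothetical helper)."""
    return {"dsl": {"query": {"regexp": {field: {"value": expression}}}}}


# Fixed prefix and suffix around the .* parts, in line with the advice above.
expression = "C_.*ADDR.*_SK"
request = regexp_request("name.keyword", expression)
print(json.dumps(request, indent=2))

# Rough local check: the expression has to cover the entire value.
for name in ("C_ADDR_SK", "C_HOME_ADDR_LINE1_SK", "C_NAME_SK"):
    print(name, re.fullmatch(expression, name) is not None)
```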
Terms set
Terms set queries return results where the asset's values for that attribute exactly match a minimum number of the values you're searching for. For example, this query would find all assets with at least two of the three specified Atlan tags:
Build the query (Java)
Build the request (Java)
Python: coming soon
POST /api/meta/search/indexsearch (Raw REST API)
- In the JSON request, we need to use Atlan's internal hashed string representation of an Atlan tag name. The SDKs can translate to this for us.
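To make the "at least two of three" rule concrete, here is a sketch of a terms set request body together with a local version of the matching rule. Several things are assumptions: the field name `__traitNames`, the use of a constant `minimum_should_match_script`, and the tag values, which are placeholders rather than real hashed tag names. Both helper names are hypothetical.

```python
import json


def terms_set_request(field: str, terms: list, minimum: int) -> dict:
    """Build a request body matching values with at least `minimum` of `terms`."""
    if not 1 <= minimum <= len(terms):
        raise ValueError("minimum must be between 1 and the number of terms")
    condition = {
        "terms": list(terms),
        # A constant script: the required number of matches is always `minimum`.
        "minimum_should_match_script": {"source": str(minimum)},
    }
    return {"dsl": {"query": {"terms_set": {field: condition}}}}


def satisfies(asset_values: list, terms: list, minimum: int) -> bool:
    """Local illustration of the rule, not how Elasticsearch evaluates it."""
    return len(set(asset_values) & set(terms)) >= minimum


# Placeholders only: real requests need Atlan's hashed tag names.
tags = ["<hashed-tag-1>", "<hashed-tag-2>", "<hashed-tag-3>"]
request = terms_set_request("__traitNames", tags, minimum=2)
print(json.dumps(request, indent=2))
print(satisfies(tags[:2], tags, 2), satisfies(tags[:1], tags, 2))
```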
Fuzzy
Fuzzy queries return results where the asset's value for that attribute is similar to the value you're searching for. This is determined by Levenshtein edit distance (the number of one-character changes needed to match what you're searching for).
Are you sure this is what you want?
This is a very simplistic fuzzy-matching algorithm, and it may end up matching both more and less than you want it to. For more advanced fuzzy matching, you probably want to use full-text queries. Since this is possible through Atlan's search, it is included here for completeness.
For example, this query would find all assets whose name is 1 edit away (so would match `block`, `clock`, `lock`, `black`, etc.):
Build the query (Java)
Build the request (Java)
Python: coming soon
POST /api/meta/search/indexsearch (Raw REST API)
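The edit-distance idea can be checked locally with a textbook Levenshtein implementation, shown below next to a sketch of the raw request. The search value `block` and the field `name.keyword` are assumptions inferred from the example words above, and the helper names are hypothetical. Elasticsearch's own fuzzy matching may also count a swap of two adjacent characters as a single edit, which this plain version does not.

```python
import json


def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def fuzzy_request(field: str, value: str, fuzziness: int) -> dict:
    """Build a request body holding a single fuzzy query."""
    condition = {"value": value, "fuzziness": fuzziness}
    return {"dsl": {"query": {"fuzzy": {field: condition}}}}


request = fuzzy_request("name.keyword", "block", fuzziness=1)
print(json.dumps(request, indent=2))

# Names within one edit of the search value, and one that is further away.
for name in ("block", "clock", "lock", "black", "brick"):
    print(name, levenshtein("block", name))
```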
[1] This page is a summary of the details in the Elasticsearch Guide's Term-level queries.
[2] Ok, that's not strictly true, since as you'll see there are some term-level queries that give very basic fuzziness. And actually, a normalizer can be applied as well, to make these searches case-insensitive. But the intent of term-level queries is to do exact matches with minimal fuzziness.