Querying overview

The query defines what you want to find in your search.

At the highest level, the key thing to understand is the context in which your query runs.

Query context

By default, Elasticsearch sorts results by a relevance score, which measures how well each document matches a query.¹ When you are running a search yourself through a UI, this is ideal: your results come back in a logical order, even with some fuzziness applied to your search terms.

This works because queries calculate these scores to rank (sort) the results.

Queries that calculate these scores run in query context. They answer the question:

"How well does this result match this query clause?"
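As a rough illustration, a clause placed directly under `query` in the Elasticsearch DSL runs in query context and contributes to each hit's `_score`. The sketch below only builds such a request body. The field name `name` and the search text are placeholders, not values taken from this page.

```python
import json

# Hypothetical field name and search text, for illustration only.
scored_query = {
    "query": {
        # A full-text clause directly under "query" runs in query context,
        # so it contributes to the relevance score of every hit.
        "match": {"name": "customer orders"}
    }
}

print(json.dumps(scored_query, indent=2))
```

With no explicit sort in the body, Elasticsearch orders the hits by that score, highest first.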

Filter context

When machines run searches, though, this kind of scoring and ranking is often unnecessary. In most cases, your program only needs to know whether a result matches what you're looking for or not: a much more binary decision.

Queries that include or exclude a result as a binary decision run in filter context. They answer the question:

"Does this result match this query clause (yes or no)?"

Because no score needs to be calculated, filter context is generally faster. In addition, Elasticsearch automatically caches frequently used filters.
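A minimal sketch of the same idea in the Elasticsearch DSL: clauses placed under `bool.filter` run in filter context, so they only include or exclude results. The helper name `to_filter_context` is hypothetical, and the `name.keyword` field is an assumption for illustration. Only `__typeName.keyword` appears elsewhere on this page.

```python
from typing import Any


def to_filter_context(*clauses: dict[str, Any]) -> dict[str, Any]:
    """Wrap clauses in bool.filter so each one is a yes/no match, not a scored one."""
    return {"bool": {"filter": list(clauses)}}


filtered_query = {
    "query": to_filter_context(
        {"term": {"__typeName.keyword": "Table"}},
        {"prefix": {"name.keyword": "abc"}},  # assumed field name
    )
}

print(filtered_query)
```

Every clause passed to the helper must match for a result to be included, and none of them affects ranking.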

Our SDK interfaces use filter context exclusively

Since programmatic searching rarely (if ever) needs relevance scoring, our SDKs use filter context exclusively. If you have a strong need for relevance scoring of your results when searching programmatically, please let us know your use case!


What this means in practice:

Examples are shown for Java, Python, Kotlin, and the raw REST API, in that order.

Build the query and request (Java)
IndexSearchRequest index = Table.select() // (1)
    .where(Table.NAME.startsWith("abc")) // (2)
    .sort(Table.UPDATE_TIME.order(SortOrder.Desc)) // (3)
    .toRequest();

1. Starting with the fluent search's select() helper constructs a query in the background that uses filters to narrow results by type (Table in this example) and to active assets only.
2. Any other conditions you chain onto the query (through a .where()) will also be translated to filters.
3. If you are sorting by some property of the results anyway, like when they were last modified, you probably do not need a score for each result, so filters will be the more performant option.

Build the query and request (Python)
from pyatlan.model.enums import SortOrder
from pyatlan.model.fluent_search import CompoundQuery, FluentSearch
from pyatlan.model.assets import Table

index = (
    FluentSearch()  # (1)
    .where(CompoundQuery.asset_type(Table))  # (2)
    .where(CompoundQuery.active_assets())
    .where(Table.NAME.startswith("abc"))
    .sort(Table.UPDATE_TIME.order(SortOrder.DESCENDING))  # (3)
).to_request()

1. Starting with FluentSearch() constructs a query.
2. Every chained .where() condition will be translated to a filter in Elasticsearch.
3. If you are sorting by some property of the results anyway, like when they were last modified, you probably do not need a score for each result, so filters will be the more performant option.

Build the query and request (Kotlin)
val index = Table.select() // (1)
    .where(Table.NAME.startsWith("abc")) // (2)
    .sort(Table.UPDATE_TIME.order(SortOrder.Desc)) // (3)
    .toRequest()

1. Starting with the fluent search's select() helper constructs a query in the background that uses filters to narrow results by type (Table in this example) and to active assets only.
2. Any other conditions you chain onto the query (through a .where()) will also be translated to filters.
3. If you are sorting by some property of the results anyway, like when they were last modified, you probably do not need a score for each result, so filters will be the more performant option.

Build the query and request (Raw REST API)
POST /api/meta/search/indexsearch
{
  "dsl": {
    "query": { // (1)
      "bool": {
        "filter": [ // (2)
          { "term": { "__typeName.keyword": "Table" }}
        ]
      }
    },
    "sort": [ // (3)
      { "__modificationTimestamp": { "order": "desc" }}
    ]
  }
}

1. Although we use a query construct (which we must, to get any results)...
2. ...if we are looking for exact matches only (and don't care about scoring), then we should put our search requirements into a filter.
3. This is particularly true if we are sorting by some property of the results anyway, like when they were last modified.
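The `// (n)` markers in the request above are annotations and are not valid JSON, so they cannot be sent as-is. As a sketch of how the same body might be assembled programmatically, the hypothetical helper below builds it as a dictionary and serializes it. Authentication, the base URL, and actually sending the request are deliberately left out.

```python
import json


def build_index_search_body(type_name: str, sort_field: str, descending: bool = True) -> str:
    """Return a JSON body that filters by asset type and sorts by one field."""
    body = {
        "dsl": {
            # Everything goes into bool.filter, so no relevance score is needed.
            "query": {
                "bool": {"filter": [{"term": {"__typeName.keyword": type_name}}]}
            },
            "sort": [{sort_field: {"order": "desc" if descending else "asc"}}],
        }
    }
    return json.dumps(body, indent=2)


payload = build_index_search_body("Table", "__modificationTimestamp")
print(payload)
```

The resulting string matches the structure of the request shown above and could be used as the body of the POST.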

¹ This page is a summary of the details in the Elasticsearch Guide's "Query and filter context" page.