Class: ElasticGraph::GraphQL::Resolvers::NestedRelationshipsSource
- Inherits: GraphQL::Dataloader::Source
- Object
- GraphQL::Dataloader::Source
- ElasticGraph::GraphQL::Resolvers::NestedRelationshipsSource
- Defined in: lib/elastic_graph/graphql/resolvers/nested_relationships_source.rb
Overview
A GraphQL dataloader responsible for solving a thorny N+1 query problem related to our `NestedRelationships` resolver. The `QuerySource` dataloader implements a basic batching optimization: multiple datastore queries are batched up into a single `msearch` call against the datastore. This is significantly better than submitting a separate request per query, but is still not optimal: the datastore must still execute N different queries, which could cause significant load.
A much better optimization is possible in one particular situation that arises in our `NestedRelationships` resolver. Here’s an example of that situation:
- `Part` documents are indexed in a `parts` index and `Manufacturer` documents are indexed in a `manufacturers` index.
- `Part.manufacturer` is defined as: `t.relates_to_one "manufacturer", "Manufacturer", via: "manufacturer_id", dir: :out`.
- We are processing a GraphQL query like this: `parts(first: 10) { nodes { manufacturer { name } } }`.
- For each of the 10 parts, the `NestedRelationships` resolver has to resolve its related `Part.manufacturer`.
- Without the optimization provided by this class, `NestedRelationships` would have to execute 10 different queries, each of which is identical except for a different filter: `{id: {equal_to_any_of: [part.manufacturer_id]}}`.
- Instead of executing this as 10 different queries, we can execute it as one query with this combined filter: `{id: {equal_to_any_of: [part1.manufacturer_id, ..., part10.manufacturer_id]}}`
- When we do this, we get a single response, but `NestedRelationships` expects a separate response for each one.
- To satisfy that, we can split the single response into 10 different responses (one per filter).
This optimization, when we can apply it, results in much less load on the datastore. It also reduces the overhead imposed by ElasticGraph itself. Profiling has shown that significant overhead is incurred when we repeatedly merge filters into a query (e.g. calling `query.merge_with(internal_filters: [{equal_to_any_of: [part.manufacturer_id]}])` 10 times to produce 10 different queries). This optimization avoids that overhead as well.
Note: while the comments discuss the examples in terms of _parent objects_, in the implementation, we deal with id sets. A set of ids is contributed by each parent object.
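The merge-then-split idea described above can be sketched in a few lines of Ruby. This is a minimal illustration, not the class's actual implementation: `Manufacturer`, `FakeDatastore`, and `fetch_with_merged_filter` are hypothetical names, and the in-memory `search` method only stands in for a datastore query with an `equal_to_any_of` filter on `id`.

```ruby
require "set"

# Hypothetical stand-in for a document in the `manufacturers` index.
Manufacturer = Struct.new(:id, :name)

# Hypothetical in-memory "datastore" that counts how many queries it runs.
class FakeDatastore
  attr_reader :query_count

  def initialize(documents)
    @documents = documents
    @query_count = 0
  end

  # Emulates a query filtered by `{id: {equal_to_any_of: ids}}`.
  def search(ids)
    @query_count += 1
    wanted = ids.to_set
    @documents.select { |doc| wanted.include?(doc.id) }
  end
end

# Merges every id set into one filter, runs a single query, and then
# splits the single response into one response per id set.
def fetch_with_merged_filter(datastore, id_sets)
  merged_ids = id_sets.reduce(Set.new) { |acc, ids| acc | ids }
  docs_by_id = datastore.search(merged_ids).group_by(&:id)

  id_sets.map do |ids|
    ids.flat_map { |id| docs_by_id.fetch(id, []) }
  end
end

datastore = FakeDatastore.new([
  Manufacturer.new("m1", "Acme"),
  Manufacturer.new("m2", "Globex")
])

# One id set is contributed by each parent `Part`.
id_sets = [Set["m1"], Set["m2"], Set["m1"]]
responses = fetch_with_merged_filter(datastore, id_sets)
```

Here three parents produce three responses, but the fake datastore sees only one query. Each caller still receives its own response, which is the contract `NestedRelationships` expects.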
Constant Summary
- MAX_OPTIMIZED_ATTEMPTS =
The optimization implemented by this class is not guaranteed to get all expected results in a single query when the sorted search results are not well-distributed among the parent objects while we’re resolving a `relates_to_many` field. (See the comments on `fetch_via_single_query_with_merged_filters` for a detailed description of when this occurs.)
To deal with this situation, we retry the query for just the parent objects which may have incomplete results. However, the attempts run serially, so we want a strict upper bound on how many are made. This constant defines the maximum number of optimized attempts we allow.
When that limit is exceeded, we fall back to building and executing a separate query for each parent object (all submitted via a single `msearch` request).
3
- EXTRA_SIZE_MULTIPLIER =
Reattempts are less likely to be needed when we execute the query with a larger `size`, because we are more likely to get back complete results for each parent object. This multiplier is applied to the requested size to achieve that.
4 was chosen somewhat arbitrarily, but it should make reattempts much less frequent while avoiding asking for an unreasonably large number of results.
Note: asking the datastore for a larger `size` is quite a bit more efficient than executing more queries. Once the datastore has gone to the spot in its inverted index with the matching documents, asking for more results isn’t particularly expensive compared to running an extra query.
4
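To make the interplay of the two constants concrete, here is a rough sketch of a bounded retry loop. It is an illustration under stated assumptions, not the class's real logic: `Component`, `search`, and `fetch_top_per_parent` are hypothetical, the merged query's limit is assumed to be `size * pending.size * EXTRA_SIZE_MULTIPLIER`, and a parent is assumed complete when it received `size` results or when the query returned fewer hits than its limit.

```ruby
MAX_OPTIMIZED_ATTEMPTS = 3
EXTRA_SIZE_MULTIPLIER = 4

# Hypothetical child document that points at its parent via `part_id`.
Component = Struct.new(:part_id, :rank)

# Hypothetical search: matching documents sorted by rank, truncated to `limit`.
def search(docs, part_ids, limit)
  docs.select { |doc| part_ids.include?(doc.part_id) }.sort_by(&:rank).first(limit)
end

# Returns the top `size` documents per parent plus the number of merged
# (optimized) attempts that were needed.
def fetch_top_per_parent(docs, part_ids, size)
  results = {}
  pending = part_ids.dup
  attempts = 0

  while pending.any? && attempts < MAX_OPTIMIZED_ATTEMPTS
    attempts += 1
    # Assumption: over-fetch so one skewed parent is less likely to crowd out the others.
    limit = size * pending.size * EXTRA_SIZE_MULTIPLIER
    hits = search(docs, pending, limit)
    exhausted = hits.size < limit # fewer hits than requested: nothing was cut off
    by_parent = hits.group_by(&:part_id)

    # Keep only the parents whose results may be incomplete for the next attempt.
    pending = pending.reject do |part_id|
      found = by_parent.fetch(part_id, [])
      next false unless exhausted || found.size >= size
      results[part_id] = found.first(size)
      true
    end
  end

  # Fallback: one separate query per remaining parent.
  pending.each { |part_id| results[part_id] = search(docs, [part_id], size) }

  [results, attempts]
end

# Skewed data: "p1" owns all the best-ranked documents and crowds out "p2".
docs = (1..10).map { |rank| Component.new("p1", rank) } + [Component.new("p2", 100)]
results, attempts = fetch_top_per_parent(docs, ["p1", "p2"], 1)
```

In this skewed example the first merged query is filled entirely by `p1` documents, so `p2` is retried alone on a second attempt. With evenly distributed data, the multiplier would usually let the first attempt satisfy every parent.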
Class Method Summary
- .execute_one(ids, query:, join:, context:, monotonic_clock:) ⇒ Object
Instance Method Summary
- #fetch(id_sets) ⇒ Object
- #initialize(query:, join:, context:, monotonic_clock:) ⇒ NestedRelationshipsSource (constructor): A new instance of NestedRelationshipsSource.
Constructor Details
#initialize(query:, join:, context:, monotonic_clock:) ⇒ NestedRelationshipsSource
Returns a new instance of NestedRelationshipsSource.
# File 'lib/elastic_graph/graphql/resolvers/nested_relationships_source.rb', line 63
def initialize(query:, join:, context:, monotonic_clock:)
  @query = query
  @join = join
  @filter_id_field_name_path = @join.filter_id_field_name.split(".")
  @context = context
  elastic_graph_schema = context.fetch(:elastic_graph_schema)
  @schema_element_names = elastic_graph_schema.element_names
  @logger = elastic_graph_schema.logger
  @monotonic_clock = monotonic_clock
end
Class Method Details
.execute_one(ids, query:, join:, context:, monotonic_clock:) ⇒ Object
# File 'lib/elastic_graph/graphql/resolvers/nested_relationships_source.rb', line 79
def self.execute_one(ids, query:, join:, context:, monotonic_clock:)
  context.dataloader.with(self, query:, join:, context:, monotonic_clock:).load(ids)
end
Instance Method Details
#fetch(id_sets) ⇒ Object
# File 'lib/elastic_graph/graphql/resolvers/nested_relationships_source.rb', line 74
def fetch(id_sets)
  return fetch_original(id_sets) unless can_merge_filters?
  fetch_optimized(id_sets)
end