Module: Elasticsearch::Persistence::Model::Find::ClassMethods

Defined in:: lib/elasticsearch/persistence/model/find.rb

Instance Method Summary collapse

#count(query_or_definition = nil, options = {}) ⇒ Integer

Returns the number of models.
#find_each(options = {}) ⇒ String, Enumerator

Iterate effectively over models using the ‘find_in_batches` method.
#find_in_batches(options = {}, &block) ⇒ String, Enumerator

Returns all models efficiently via the Elasticsearch’s scan/scroll API.

Instance Method Details

#count(query_or_definition = nil, options = {}) ⇒ `Integer`

Returns the number of models

Examples:

Return the count of all models


Person.count
# => 2

Return the count of models matching a simple query


Person.count('fox or dog')
# => 1

Return the count of models matching a query in the Elasticsearch DSL


Person.search(query: { match: { title: 'fox dog' } })
# => 1

Returns:

(Integer)



26
27
28

# File 'lib/elasticsearch/persistence/model/find.rb', line 26

def count(query_or_definition = nil, options = {})
  gateway.count(query_or_definition, options)
end

#find_each(options = {}) ⇒ `String`, `Enumerator`

Iterate effectively over models using the ‘find_in_batches` method.

All the options are passed to ‘find_in_batches` and each result is yielded to the passed block.

Examples:

Print out the people’s names by scrolling through the index


Person.find_each { |person| puts person.name }

# # GET http://localhost:9200/people/person/_search?scroll=5m&search_type=scan&size=20
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 0
# Test 1
# Test 2
# ...
# # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
# Test 20
# Test 21
# Test 22

Leave out the block to return an Enumerator instance


Person.find_each.select { |person| person.name =~ /John/ }
# => => [#<Person {id: "NkltJP5vRxqk9_RMP7SU8Q", name: "John Smith",  ...}>]

Returns:

(String, Enumerator) —

The ‘scroll_id` for the request or Enumerator when the block is not passed

# File 'lib/elasticsearch/persistence/model/find.rb', line 144

def find_each(options = {})
  return to_enum(:find_each, options) unless block_given?

  find_in_batches(options) do |batch|
    batch.each { |result| yield result }
  end
end

#find_in_batches(options = {}, &block) ⇒ `String`, `Enumerator`

Returns all models efficiently via the Elasticsearch’s scan/scroll API

You can restrict the models being returned with a query.

The Search API options are passed to the search method as parameters, all remaining options are passed as the ‘:body` parameter.

The full Repository::Response::Results instance is yielded to the passed block in each batch, so you can access any of its properties; calling ‘to_a` will convert the object to an Array of model instances.

Examples:

Return all models in batches of 20 x number of primary shards


Person.find_in_batches { |batch| puts batch.map(&:name) }

Return all models in batches of 100 x number of primary shards


Person.find_in_batches(size: 100) { |batch| puts batch.map(&:name) }

Return all models matching a specific query


Person.find_in_batches(query: { match: { name: 'test' } }) { |batch| puts batch.map(&:name) }

Return all models, fetching only the ‘name` attribute from Elasticsearch


Person.find_in_batches( _source_include: 'name') { |_| puts _.response.hits.hits.map(&:to_hash) }

Leave out the block to return an Enumerator instance


Person.find_in_batches(size: 100).map { |batch| batch.size }
# => [100, 100, 100, ... ]

Returns:

(String, Enumerator) —

The ‘scroll_id` for the request or Enumerator when the block is not passed

# File 'lib/elasticsearch/persistence/model/find.rb', line 65

def find_in_batches(options = {}, &block)
  return to_enum(:find_in_batches, options) unless block_given?

  search_params = options.slice(
    :index,
    :type,
    :scroll,
    :size,
    :explain,
    :ignore_indices,
    :ignore_unavailable,
    :allow_no_indices,
    :expand_wildcards,
    :preference,
    :q,
    :routing,
    :source,
    :_source,
    :_source_include,
    :_source_exclude,
    :stats,
    :timeout
  )

  scroll = search_params.delete(:scroll) || "5m"

  body = options

  puts "BODY: #{body}".color :red
  # Get the initial scroll_id
  #
  response = gateway.client.search({ index: gateway.index_name,
                                     type: gateway.document_type,
                                     search_type: "scan",
                                     scroll: scroll,
                                     size: 20,
                                     body: body }.merge(search_params))

  # Get the initial batch of documents
  #
  response = gateway.client.scroll({ scroll_id: response["_scroll_id"], scroll: scroll })

  # Break when receiving an empty array of hits
  #
  while response["hits"]["hits"].any?
    yield Repository::Response::Results.new(gateway, response)

    response = gateway.client.scroll({ scroll_id: response["_scroll_id"], scroll: scroll })
  end

  return response["_scroll_id"]
end

Module: Elasticsearch::Persistence::Model::Find::ClassMethods

Instance Method Summary collapse

Instance Method Details

#count(query_or_definition = nil, options = {}) ⇒ Integer

#find_each(options = {}) ⇒ String, Enumerator

#find_in_batches(options = {}, &block) ⇒ String, Enumerator

#count(query_or_definition = nil, options = {}) ⇒ `Integer`

#find_each(options = {}) ⇒ `String`, `Enumerator`

#find_in_batches(options = {}, &block) ⇒ `String`, `Enumerator`