Class: SourceMonitor::Items::BatchItemCreator
Inherits: Object
- Object
- SourceMonitor::Items::BatchItemCreator
Defined in: lib/source_monitor/items/batch_item_creator.rb
Overview
Builds a pre-fetched lookup index of existing items for a batch of entries.
Instead of N individual SELECT queries (one per feed entry) to check for existing items, this class:
1. Pre-parses all entries to collect GUIDs + fingerprints
2. Runs a single WHERE guid IN (...) query to find existing items by GUID
3. Runs a single WHERE content_fingerprint IN (...) query for the entries that had no GUID match
4. Returns an index hash that ItemCreator can use to skip per-entry SELECTs
The actual item creation/update is still done by ItemCreator.call, which accepts the index via the existing_items_index parameter.
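The query-count saving described above can be illustrated with a minimal plain-Ruby sketch. It is not the library's implementation: Item, FakeTable, where_in, and build_index_sketch are hypothetical stand-ins for the ActiveRecord model and its queries, and the sketch ignores the raw-GUID check the real method performs.

```ruby
# Hypothetical in-memory stand-in for the items table; it counts simulated queries.
Item = Struct.new(:guid, :content_fingerprint)

class FakeTable
  attr_reader :query_count

  def initialize(items)
    @items = items
    @query_count = 0
  end

  # Simulates one `WHERE column IN (values)` query.
  def where_in(column, values)
    @query_count += 1
    @items.select { |item| values.include?(item[column]) }
  end
end

# Assumed simplification of the batch lookup: two queries, however many entries.
def build_index_sketch(table, identifiers)
  guids = identifiers.filter_map { |id| id[:guid] }.uniq
  by_guid = table.where_in(:guid, guids).to_h { |item| [item.guid, item] }

  # Only entries that did not match by GUID fall back to the fingerprint lookup.
  fingerprints = identifiers
    .reject { |id| by_guid.key?(id[:guid]) }
    .filter_map { |id| id[:fingerprint] }
    .uniq
  by_fingerprint = table.where_in(:content_fingerprint, fingerprints)
    .to_h { |item| [item.content_fingerprint, item] }

  { by_guid: by_guid, by_fingerprint: by_fingerprint }
end

table = FakeTable.new([
  Item.new("a", "fp1"),
  Item.new("b", "fp2"),
  Item.new(nil, "fp3")
])
identifiers = [
  { guid: "a", fingerprint: "fpX" },
  { guid: nil, fingerprint: "fp3" },
  { guid: "new", fingerprint: "fp9" }
]
index = build_index_sketch(table, identifiers)
```

With three entries the fake table records two simulated queries rather than three, and the count stays at two as the batch grows.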
Class Method Summary
- .build_index(source:, entries:) ⇒ Object
  Builds a lookup index from a batch of feed entries.
Instance Method Summary
- #build_index ⇒ Object
- #initialize(source:, entries:) ⇒ BatchItemCreator (constructor)
  A new instance of BatchItemCreator.
Constructor Details
#initialize(source:, entries:) ⇒ BatchItemCreator
Returns a new instance of BatchItemCreator.
# File 'lib/source_monitor/items/batch_item_creator.rb', line 26
def initialize(source:, entries:)
  @source = source
  @entries = Array(entries)
end
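The constructor normalizes its entries argument with Kernel#Array. A short sketch of that core Ruby behavior follows; the variable names are illustrative only.

```ruby
# Kernel#Array converts or wraps its argument so later code can always iterate.
from_nil = Array(nil)           # nil becomes an empty array
from_single = Array("entry")    # a lone object is wrapped in an array
from_array = Array(["a", "b"])  # an existing array passes through unchanged
```

Because nil becomes an empty array, a nil entries argument ends up on the same early-return path in #build_index as an empty batch.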
Class Method Details
.build_index(source:, entries:) ⇒ Object
Builds a lookup index from a batch of feed entries. Returns a Hash with :by_guid and :by_fingerprint keys.
# File 'lib/source_monitor/items/batch_item_creator.rb', line 22
def self.build_index(source:, entries:)
  new(source: source, entries: entries).build_index
end
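A consumer of the returned Hash might check :by_guid first and fall back to :by_fingerprint. The sketch below assumes that order; find_existing is a hypothetical helper, symbols stand in for item records, and the real lookup inside ItemCreator may differ.

```ruby
# Hypothetical helper: the real lookup lives inside ItemCreator and may differ.
def find_existing(index, guid:, fingerprint:)
  index[:by_guid][guid] || index[:by_fingerprint][fingerprint]
end

# Hand-built index with the same two keys that build_index returns.
index = {
  by_guid: { "guid-1" => :item_one },
  by_fingerprint: { "abc123" => :item_two }
}

hit_by_guid = find_existing(index, guid: "guid-1", fingerprint: "zzz")
hit_by_fingerprint = find_existing(index, guid: nil, fingerprint: "abc123")
miss = find_existing(index, guid: "guid-9", fingerprint: "nope")
```

Both lookups are plain Hash reads, so no further SELECT is needed per entry once the index exists.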
Instance Method Details
#build_index ⇒ Object
# File 'lib/source_monitor/items/batch_item_creator.rb', line 31
def build_index
  return { by_guid: {}, by_fingerprint: {} } if @entries.empty?

  # Step 1: Pre-parse entries to extract GUIDs and fingerprints for bulk lookup.
  entry_identifiers = @entries.map do |entry|
    normalized_entry = NormalizedEntry.new(
      source: @source,
      entry: entry,
      content_extractor: content_extractor
    )
    {
      guid: normalized_entry.item_guid,
      fingerprint: normalized_entry.content_fingerprint,
      raw_guid_present: normalized_entry.raw_guid_present?
    }
  end

  # Step 2: Batch-fetch existing items by GUID (single query)
  guids = entry_identifiers
    .select { |ei| ei[:raw_guid_present] }
    .filter_map { |ei| ei[:guid] }
    .uniq

  existing_by_guid = if guids.any?
    @source.all_items.where(guid: guids).index_by(&:guid)
  else
    {}
  end

  # Step 3: For entries without a GUID match, batch-fetch by fingerprint
  unmatched_fingerprints = entry_identifiers.filter_map do |ei|
    guid = ei[:guid]
    next if ei[:raw_guid_present] && existing_by_guid.key?(guid)
    ei[:fingerprint].presence
  end.uniq

  existing_by_fingerprint = if unmatched_fingerprints.any?
    @source.all_items
      .where(content_fingerprint: unmatched_fingerprints)
      .index_by(&:content_fingerprint)
  else
    {}
  end

  { by_guid: existing_by_guid, by_fingerprint: existing_by_fingerprint }
end
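Step 3 decides which fingerprints still need a query. The following plain-Ruby sketch isolates that filter; presence and fingerprints_to_query are hypothetical names, with presence approximating ActiveSupport's Object#presence for strings, and the sample data is invented for illustration.

```ruby
# Plain-Ruby approximation of ActiveSupport's #presence (assumption for this sketch).
def presence(value)
  (value.nil? || value.to_s.strip.empty?) ? nil : value
end

# Mirrors the Step 3 filter: skip entries already matched by a raw GUID,
# drop blank fingerprints, and de-duplicate what is left.
def fingerprints_to_query(entry_identifiers, existing_by_guid)
  entry_identifiers.filter_map do |ei|
    next if ei[:raw_guid_present] && existing_by_guid.key?(ei[:guid])
    presence(ei[:fingerprint])
  end.uniq
end

identifiers = [
  { guid: "g1", fingerprint: "fp1", raw_guid_present: true },  # matched by GUID, skipped
  { guid: "g2", fingerprint: "fp2", raw_guid_present: true },  # no GUID match, kept
  { guid: nil,  fingerprint: "fp3", raw_guid_present: false }, # no raw GUID, kept
  { guid: "g4", fingerprint: "",    raw_guid_present: true }   # blank fingerprint, dropped
]
existing_by_guid = { "g1" => :existing_item }
pending = fingerprints_to_query(identifiers, existing_by_guid)
```

Only "fp2" and "fp3" remain in pending, so the second query covers just the entries the GUID pass could not resolve.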