Class: Perron::Site::Resource::Related
- Inherits: Object
- Defined in: lib/perron/resource/related.rb, lib/perron/resource/related/stop_words.rb
Overview
Finds related resources using TF-IDF cosine similarity.
Pre-normalizes vectors so cosine similarity reduces to a dot product, then builds a symmetric similarity matrix once per collection.
Results are cached at the class level so the O(n²) comparison is paid once, not once per resource.
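The overview's approach can be sketched in plain Ruby. This is a minimal illustration, not Perron's implementation: the helper names (tfidf_unit_vectors, dot, build_similarity_matrix), the tokenized input shape, and the exact TF-IDF weighting are all assumptions made for the example.

```ruby
# Hypothetical helpers for illustration only; none of these are Perron APIs.

# docs: { slug => [token, ...] }. Returns { slug => { term => weight } },
# with every non-empty vector scaled to unit length.
def tfidf_unit_vectors(docs)
  total = docs.size.to_f

  # Document frequency: how many documents contain each term.
  df = Hash.new(0)
  docs.each_value { |tokens| tokens.uniq.each { |term| df[term] += 1 } }

  docs.transform_values do |tokens|
    weights = tokens.tally.to_h do |term, count|
      [term, (count.to_f / tokens.size) * Math.log(total / df[term])]
    end

    norm = Math.sqrt(weights.values.sum { |w| w * w })
    # After dividing by the norm, cosine similarity is a plain dot product.
    norm.zero? ? weights : weights.transform_values { |w| w / norm }
  end
end

# Dot product of two sparse vectors, iterating over the smaller one.
def dot(a, b)
  small, large = a.size <= b.size ? [a, b] : [b, a]
  small.sum { |term, weight| weight * large.fetch(term, 0.0) }
end

# Each pair is compared once and the score is stored in both directions.
def build_similarity_matrix(vectors)
  matrix = vectors.keys.to_h { |slug| [slug, {}] }

  vectors.keys.combination(2) do |a, b|
    score = dot(vectors[a], vectors[b])
    matrix[a][b] = score
    matrix[b][a] = score
  end

  matrix
end

docs = {
  "intro"   => %w[ruby static site],
  "deploy"  => %w[ruby static generator],
  "cooking" => %w[pasta recipe sauce]
}
matrix = build_similarity_matrix(tfidf_unit_vectors(docs))
```

Because the matrix is symmetric, only n(n-1)/2 dot products are computed, and each lookup afterwards is a hash access.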
Defined Under Namespace
- Modules: StopWords
- Classes: Cache
Class Method Summary
- .cache_for(collection_name) ⇒ Object
- .clear_cache!(collection_name) ⇒ Object
- .fingerprinted(collection_name) ⇒ Object
- .stale?(collection_name) ⇒ Boolean
Instance Method Summary
- #find(limit: 5) ⇒ Object
- #initialize(resource) ⇒ Related (constructor): a new instance of Related.
Constructor Details
#initialize(resource) ⇒ Related
Returns a new instance of Related.
# File 'lib/perron/resource/related.rb', line 41
def initialize(resource)
  @resource = resource
  @collection = resource.collection
  @cache = self.class.cache_for(@collection.name)
end
Class Method Details
.cache_for(collection_name) ⇒ Object
# File 'lib/perron/resource/related.rb', line 20
def self.cache_for(collection_name)
  clear_cache!(collection_name) if stale?(collection_name)

  @collection_caches[collection_name] ||= Cache.new(nil, nil, fingerprinted(collection_name))
end
.clear_cache!(collection_name) ⇒ Object
# File 'lib/perron/resource/related.rb', line 26
def self.clear_cache!(collection_name)
  @collection_caches.delete(collection_name)
end
.fingerprinted(collection_name) ⇒ Object
# File 'lib/perron/resource/related.rb', line 34
def self.fingerprinted(collection_name)
  path = File.join(Perron.configuration.input, collection_name)
  files = Dir.glob(File.join(path, "**", "*.*"))

  [files.size, files.map { File.mtime(it) }.max]
end
.stale?(collection_name) ⇒ Boolean
# File 'lib/perron/resource/related.rb', line 30
def self.stale?(collection_name)
  @collection_caches[collection_name]&.fingerprint != fingerprinted(collection_name)
end
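The class methods above combine into a fingerprint-based invalidation scheme: a cache entry is reused only while the collection's file count and newest modification time are unchanged. The following self-contained sketch shows that pattern against a temporary directory. FingerprintCache, Entry, and directory_fingerprint are hypothetical names invented for this example and do not exist in Perron.

```ruby
require "tmpdir"

# Hypothetical stand-ins for illustration; not Perron's Cache class.
Entry = Struct.new(:value, :fingerprint)

# Same shape as the fingerprint above: [file count, newest mtime].
def directory_fingerprint(dir)
  files = Dir.glob(File.join(dir, "**", "*.*"))

  [files.size, files.map { |file| File.mtime(file) }.max]
end

class FingerprintCache
  def initialize
    @entries = {}
  end

  # Returns the cached value for dir, running the block again only when
  # the directory's fingerprint differs from the stored one.
  def fetch(dir)
    current = directory_fingerprint(dir)
    entry = @entries[dir]
    return entry.value if entry && entry.fingerprint == current

    value = yield
    @entries[dir] = Entry.new(value, current)
    value
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "one.md"), "first")

  cache = FingerprintCache.new
  builds = 0

  2.times { cache.fetch(dir) { builds += 1 } } # second call is a cache hit
  File.write(File.join(dir, "two.md"), "second")
  cache.fetch(dir) { builds += 1 }             # file count changed, so it rebuilds
end
```

The fingerprint is cheap to compute but coarse: the "*.*" glob pattern only matches names that contain a dot, so files without an extension do not contribute to it.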
Instance Method Details
#find(limit: 5) ⇒ Object
# File 'lib/perron/resource/related.rb', line 47
def find(limit: 5)
  scores = similarity_matrix[@resource.slug] || {}

  resources
    .reject { it.slug == @resource.slug }
    .sort_by { -(scores[it.slug] || 0.0) }
    .first(limit)
end
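The ranking step in #find can be tried in isolation. The sketch below mirrors its reject / sort_by / first chain with a hypothetical Page struct and a hand-written score matrix standing in for Perron's resources and its private similarity_matrix. Neither stand-in is part of the library.

```ruby
# Hypothetical stand-in for a resource; only #slug is needed here.
Page = Struct.new(:slug)

# Ranks every page except current by its similarity score, highest first.
# Pages with no recorded score are treated as 0.0.
def related_pages(current, pages, matrix, limit: 5)
  scores = matrix[current.slug] || {}

  pages
    .reject { |page| page.slug == current.slug }
    .sort_by { |page| -(scores[page.slug] || 0.0) }
    .first(limit)
end

pages = %w[intro deploy themes changelog].map { |slug| Page.new(slug) }
matrix = { "intro" => { "deploy" => 0.1, "themes" => 0.9 } }

top = related_pages(pages.first, pages, matrix, limit: 2)
```

Since missing scores fall back to 0.0 rather than being filtered out, this chain returns up to limit resources even when some of them share no terms with the current one.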