Class: Iriq::Clusterer
- Inherits:
- Object
- Defined in:
- lib/iriq/clusterer.rb
Overview
Groups many identifiers by host + path shape. Use `add` to feed inputs and `clusters` to read out the groups. `explain` annotates a single identifier against the cluster it would fall into, including which positions are stable across all observed members.
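As a rough illustration of what grouping by host + path shape can look like, the sketch below uses only Ruby's standard `URI` library rather than Iriq itself. The `shape_of` helper and the `{id}` placeholder are hypothetical stand-ins; the real shape is computed by `cluster_key` and `SegmentClassifier`, whose rules are not shown on this page.

```ruby
require "uri"

# Hypothetical stand-in for Iriq's shape computation: purely numeric
# path segments become a placeholder, every other segment is kept as-is.
def shape_of(path)
  path.split("/").reject(&:empty?).map { |seg| seg.match?(/\A\d+\z/) ? "{id}" : seg }.join("/")
end

# Group URLs under a [host, shape] key, creating each group on first use.
def group_by_host_and_shape(urls)
  urls.each_with_object({}) do |url, groups|
    uri = URI.parse(url)
    key = [uri.host, shape_of(uri.path)]
    (groups[key] ||= []) << url
  end
end

groups = group_by_host_and_shape([
  "https://example.com/users/1",
  "https://example.com/users/2",
  "https://example.com/about",
])

# One line per group: host, shape, and how many inputs fell into it.
groups.each do |(host, shape), members|
  puts "#{host} #{shape}: #{members.size}"
end
```

Under these assumptions the two `/users/...` URLs share one group and `/about` forms its own.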
Instance Method Summary
- #add(input) ⇒ Object
- #clusters ⇒ Object
-
#explain(input) ⇒ Object
Returns a per-segment explanation for the input, merging the classifier's output with what has been observed in its cluster: positions that are stable across all observed members are marked `variable: false` even if the classifier would otherwise call them variable.
-
#initialize(classifier: SegmentClassifier.new) ⇒ Clusterer
constructor
A new instance of Clusterer.
- #size ⇒ Object
Constructor Details
#initialize(classifier: SegmentClassifier.new) ⇒ Clusterer
Returns a new instance of Clusterer.
# File 'lib/iriq/clusterer.rb', line 7
def initialize(classifier: SegmentClassifier.new)
  @classifier = classifier
  @clusters = {}
end
Instance Method Details
#add(input) ⇒ Object
# File 'lib/iriq/clusterer.rb', line 12
def add(input)
  iri = coerce(input)
  key, host, scheme, shape = cluster_key(iri)
  cluster = @clusters[key] ||= Cluster.new(
    key: key,
    host: host,
    scheme: scheme,
    shape: shape,
  )
  cluster.add(iri)
  cluster
end
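The `@clusters[key] ||= Cluster.new(...)` line in `add` is a get-or-create lookup: the first input with a given key creates the cluster, and later inputs with the same key reuse it. A minimal sketch of that idiom follows. `ToyCluster` and `ToyRegistry` are hypothetical names for illustration only, since the real `Cluster` interface is not shown on this page.

```ruby
# Hypothetical stand-in for Iriq's Cluster; it only collects members.
ToyCluster = Struct.new(:key, :members) do
  def add(item)
    members << item
  end
end

class ToyRegistry
  def initialize
    @clusters = {}
  end

  # Fetch the cluster for `key`, creating it on first use, then record the item.
  # Returns the cluster, as Clusterer#add does.
  def add(key, item)
    cluster = @clusters[key] ||= ToyCluster.new(key, [])
    cluster.add(item)
    cluster
  end

  def size
    @clusters.size
  end
end

registry = ToyRegistry.new
first  = registry.add("example.com users/{id}", "/users/1")
second = registry.add("example.com users/{id}", "/users/2")

puts first.equal?(second) # same object both times
puts registry.size        # a single cluster was created
```

Because `add` returns the cluster, callers can inspect the group an input landed in without a second lookup.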
#clusters ⇒ Object
# File 'lib/iriq/clusterer.rb', line 25
def clusters
  @clusters.values
end
#explain(input) ⇒ Object
Returns a per-segment explanation for the input, merging the classifier's output with what has been observed in its cluster: positions that are stable across all observed members are marked `variable: false` even if the classifier would otherwise call them variable.
# File 'lib/iriq/clusterer.rb', line 37
def explain(input)
  iri = coerce(input)
  key, * = cluster_key(iri)
  cluster = @clusters[key]
  stats = cluster ? cluster.segment_stats : []

  iri.path_segments.each_with_index.map do |seg, i|
    type = @classifier.classify(seg)
    stable = stats[i] && stats[i][:stable]
    {
      value: seg,
      type: type,
      variable: !stable && @classifier.variable?(type),
      stable: !!stable,
    }
  end
end
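To see how observed stability overrides the classifier, the sketch below reproduces the merge expression from `explain` on plain data. `ToyClassifier` and its numeric-segment rule are hypothetical; only the `classify` / `variable?` method names and the `:stable` key in the stats come from the listing above.

```ruby
# Hypothetical classifier: numeric segments are :integer and count as variable.
class ToyClassifier
  def classify(segment)
    segment.match?(/\A\d+\z/) ? :integer : :literal
  end

  def variable?(type)
    type == :integer
  end
end

# Same merge as in #explain: a position observed as stable is never
# reported as variable, whatever the classifier says about its type.
def explain_segments(segments, stats, classifier)
  segments.each_with_index.map do |seg, i|
    type = classifier.classify(seg)
    stable = stats[i] && stats[i][:stable]
    {
      value: seg,
      type: type,
      variable: !stable && classifier.variable?(type),
      stable: !!stable,
    }
  end
end

# Path "v/2/42" where the first two positions were stable across the cluster.
stats = [{ stable: true }, { stable: true }, { stable: false }]
rows = explain_segments(%w[v 2 42], stats, ToyClassifier.new)
rows.each { |row| p row }
```

Here the segment `2` is numeric, so the toy classifier alone would call it variable, but the stable flag forces `variable: false`. With empty stats (no cluster seen yet), every position falls back to the classifier's verdict and `stable` is `false`.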
#size ⇒ Object
# File 'lib/iriq/clusterer.rb', line 29
def size
  @clusters.size
end