Class: Iriq::Clusterer

Inherits:
Object
Defined in:
lib/iriq/clusterer.rb

Overview

Groups many identifiers by host + path shape. Use `add` to feed inputs and `clusters` to read out the groups. `explain` annotates a single identifier against the cluster it would fall into, including which positions are stable across all observed members.
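
To make "host + path shape" concrete, here is a minimal sketch in plain Ruby. It does not use Iriq: the `shape_of` helper and the rule that all-digit segments collapse to `:int` are assumptions made for illustration, and the library's real classifier may treat segments differently.

```ruby
require "uri"

# Hypothetical helper (not part of Iriq): reduce a path to its "shape" by
# replacing all-digit segments with a placeholder.
def shape_of(path)
  segments = path.split("/").reject(&:empty?)
  "/" + segments.map { |s| s.match?(/\A\d+\z/) ? ":int" : s }.join("/")
end

urls = [
  "https://example.com/users/1/posts",
  "https://example.com/users/42/posts",
  "https://example.com/about",
]

# Group by [host, shape], the same kind of key the overview describes.
groups = urls.group_by do |u|
  uri = URI.parse(u)
  [uri.host, shape_of(uri.path)]
end

groups.each { |(host, shape), members| puts "#{host} #{shape}: #{members.size}" }
```

The two `/users/<id>/posts` URLs land in one group because they differ only in a position the sketch treats as variable, while `/about` forms its own group.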

Constructor Details

#initialize(classifier: SegmentClassifier::DEFAULT) ⇒ Clusterer

Returns a new instance of Clusterer.



# File 'lib/iriq/clusterer.rb', line 7

def initialize(classifier: SegmentClassifier::DEFAULT)
  @classifier = classifier
  @clusters   = {}
end

Class Method Details

.from_dump(h, classifier: SegmentClassifier::DEFAULT) ⇒ Object



# File 'lib/iriq/clusterer.rb', line 85

def self.from_dump(h, classifier: SegmentClassifier::DEFAULT)
  c = new(classifier: classifier)
  restored = h["clusters"].transform_values { |cdump| Cluster.from_dump(cdump) }
  c.instance_variable_set(:@clusters, restored)
  c
end

Instance Method Details

#add(input, shape: nil) ⇒ Object



# File 'lib/iriq/clusterer.rb', line 12

def add(input, shape: nil)
  iri = coerce(input)
  key, host, scheme, shape = cluster_key(iri, shape: shape)
  cluster = @clusters[key] ||= Cluster.new(
    key:    key,
    host:   host,
    scheme: scheme,
    shape:  shape,
  )
  cluster.add(iri)
  cluster
end
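
The `@clusters[key] ||= Cluster.new(...)` line is a get-or-create: the first input with a given key creates the cluster, and later inputs with the same key reuse it. A minimal sketch of that idiom with a stand-in struct follows. `ToyCluster` and the string keys are hypothetical, not Iriq types.

```ruby
# Hypothetical stand-in for Cluster: just a key and its members.
ToyCluster = Struct.new(:key, :members) do
  def add(item)
    members << item
  end
end

clusters = {}

# Mirrors the structure of Clusterer#add: look up or create the cluster,
# append the item, and return the cluster.
add = lambda do |key, item|
  cluster = clusters[key] ||= ToyCluster.new(key, [])
  cluster.add(item)
  cluster
end

first  = add.call("example.com/users/:int", "/users/1")
second = add.call("example.com/users/:int", "/users/2")
add.call("example.com/about", "/about")

puts first.equal?(second) # the same object is reused for the same key
puts clusters.size        # counts clusters, not items
```

Returning the cluster rather than the clusterer lets the caller inspect the group an input just joined without a second lookup.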

#clusters ⇒ Object

Returns the clusters built so far as an array, one entry per distinct cluster key.



# File 'lib/iriq/clusterer.rb', line 25

def clusters
  @clusters.values
end

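#dump ⇒ Object

Returns a Hash with a single "clusters" key that maps each cluster key to that cluster's own dump.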



# File 'lib/iriq/clusterer.rb', line 81

def dump
  { "clusters" => @clusters.transform_values(&:dump) }
end
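
`dump` and `.from_dump` are meant to be used as a pair. The sketch below shows the round trip with a stand-in class. `DumpableCluster` and its fields are hypothetical, and routing the dump through JSON is an assumption: the source only shows a string-keyed Hash and does not name a serialization format.

```ruby
require "json"

# Hypothetical stand-in for Cluster with the same dump / from_dump pairing.
DumpableCluster = Struct.new(:key, :count) do
  def dump
    { "key" => key, "count" => count }
  end

  def self.from_dump(h)
    new(h["key"], h["count"])
  end
end

clusters = {
  "example.com/users/:int" => DumpableCluster.new("example.com/users/:int", 2),
}

# Same structure as Clusterer#dump: a string-keyed hash of plain data.
dumped = { "clusters" => clusters.transform_values(&:dump) }

# String keys survive a JSON round trip unchanged.
reloaded = JSON.parse(JSON.generate(dumped))

# Same structure as Clusterer.from_dump: rebuild each value, keep the keys.
restored = reloaded["clusters"].transform_values { |c| DumpableCluster.from_dump(c) }

puts restored == clusters
```

Note that in the real `.from_dump` the classifier is not part of the dump, so a non-default classifier has to be passed again when restoring.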

#explain(input) ⇒ Object

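Returns a per-segment explanation for the input, merging the classifier's output with what has been observed in its cluster: positions that are stable across all observed members are marked variable: false, even if the classifier would otherwise call them variable. If no cluster matches the input, the classifier's verdict is returned unchanged with stable: false.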



# File 'lib/iriq/clusterer.rb', line 37

def explain(input)
  iri = coerce(input)
  key, * = cluster_key(iri)
  cluster = @clusters[key]
  stats   = cluster ? cluster.segment_stats : []
  hinted  = SegmentHints.derive(iri.path_segments, @classifier)

  hinted.each_with_index.map do |entry, i|
    stable = stats[i] && stats[i][:stable]
    entry.merge(
      variable: !stable && entry[:variable],
      stable:   !!stable,
    )
  end
end
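
The merge rule in the block above can be tried in isolation. In this sketch the `:segment` key and the sample data are hypothetical. Only the `:variable` and `:stable` keys and the merge expression come from the source.

```ruby
# Same merge expression as Clusterer#explain, lifted into a lambda.
merge_hints = lambda do |hinted, stats|
  hinted.each_with_index.map do |entry, i|
    stable = stats[i] && stats[i][:stable]
    entry.merge(variable: !stable && entry[:variable], stable: !!stable)
  end
end

# Hypothetical classifier hints for the segments of /users/42/posts.
hinted = [
  { segment: "users", variable: false },
  { segment: "42",    variable: true  },
  { segment: "posts", variable: false },
]

# Hypothetical cluster stats: every member seen so far had the same value at
# positions 0 and 1. Position 2 has no stats entry at all.
stats = [{ stable: true }, { stable: true }]

with_cluster    = merge_hints.call(hinted, stats)
without_cluster = merge_hints.call(hinted, [])

with_cluster.each    { |e| puts e.inspect }
without_cluster.each { |e| puts e.inspect }
```

With stats present, the observed stability of position 1 overrides the classifier and `"42"` is reported as not variable. With no stats, the classifier's `variable: true` passes through and every position is reported as `stable: false`.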

#size ⇒ Object

Returns the number of clusters, not the number of identifiers added.



# File 'lib/iriq/clusterer.rb', line 29

def size
  @clusters.size
end