Class: Iriq::Clusterer

Inherits:
Object
Defined in:
lib/iriq/clusterer.rb

Overview

Groups identifiers by host and path shape. Use `add` to feed inputs and `clusters` to read out the groups. `explain` annotates a single identifier against the cluster it would fall into, including which positions are stable across all observed members.
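The grouping idea can be illustrated without the library. The sketch below is not the Iriq implementation: `path_shape` and `shape_key` are hypothetical helpers, and the rule "an all-digit segment is variable" is an assumption standing in for whatever `SegmentClassifier` actually does.

```ruby
require "uri"

# Hypothetical shape function: replaces all-digit segments with a placeholder.
# The real SegmentClassifier's rules are not shown on this page.
def path_shape(path)
  path.split("/").reject(&:empty?).map { |seg| seg.match?(/\A\d+\z/) ? ":id" : seg }
end

# Hypothetical key: host plus path shape, as described in the overview.
def shape_key(url)
  uri = URI.parse(url)
  "#{uri.host}/#{path_shape(uri.path).join('/')}"
end

urls = [
  "https://example.com/users/1/posts",
  "https://example.com/users/42/posts",
  "https://example.com/about",
]

# Identifiers with the same host and shape land in the same group.
groups = urls.group_by { |url| shape_key(url) }
groups.each { |key, members| puts "#{key}: #{members.size}" }
```

Here the two `/users/<number>/posts` URLs share one key and `/about` gets its own, which is the behaviour `add` and `clusters` expose through `Cluster` objects.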

Constructor Details

#initialize(classifier: SegmentClassifier.new) ⇒ Clusterer

Returns a new instance of Clusterer.



# File 'lib/iriq/clusterer.rb', line 7

def initialize(classifier: SegmentClassifier.new)
  @classifier = classifier
  @clusters   = {}
end

Instance Method Details

#add(input) ⇒ Object



# File 'lib/iriq/clusterer.rb', line 12

def add(input)
  iri = coerce(input)
  key, host, scheme, shape = cluster_key(iri)
  cluster = @clusters[key] ||= Cluster.new(
    key:    key,
    host:   host,
    scheme: scheme,
    shape:  shape,
  )
  cluster.add(iri)
  cluster
end

#clusters ⇒ Object



# File 'lib/iriq/clusterer.rb', line 25

def clusters
  @clusters.values
end

#explain(input) ⇒ Object

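Returns a per-segment explanation for the input, merging the classifier's output with what has been observed in the input's cluster: a position that is stable across all observed members is marked `variable: false`, even if the classifier would otherwise call it variable. If no cluster exists for the input yet, no position is treated as stable and the classifier's verdict is used as-is.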



# File 'lib/iriq/clusterer.rb', line 37

def explain(input)
  iri = coerce(input)
  key, * = cluster_key(iri)
  cluster = @clusters[key]
  stats   = cluster ? cluster.segment_stats : []

  iri.path_segments.each_with_index.map do |seg, i|
    type   = @classifier.classify(seg)
    stable = stats[i] && stats[i][:stable]
    {
      value:    seg,
      type:     type,
      variable: !stable && @classifier.variable?(type),
      stable:   !!stable,
    }
  end
end
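The merge rule in `#explain` can be tried in isolation. In this sketch, `StubClassifier`, `explain_segments`, and the sample `stats` array are hypothetical; only the `classify` / `variable?` calls and the `:stable` key are taken from the source above.

```ruby
# Hypothetical classifier stub: implements only the two methods #explain calls.
class StubClassifier
  def classify(seg)
    seg.match?(/\A\d+\z/) ? :integer : :literal
  end

  def variable?(type)
    type == :integer
  end
end

# Same merge rule as #explain, with the cluster's segment stats passed in directly.
def explain_segments(segments, stats, classifier)
  segments.each_with_index.map do |seg, i|
    type   = classifier.classify(seg)
    stable = stats[i] && stats[i][:stable]
    { value: seg, type: type, variable: !stable && classifier.variable?(type), stable: !!stable }
  end
end

classifier = StubClassifier.new
segments   = %w[v 2 users 17]

# Assumed observations: the first three positions never changed, the last one did.
stats = [{ stable: true }, { stable: true }, { stable: true }, { stable: false }]

rows   = explain_segments(segments, stats, classifier)
unseen = explain_segments(segments, [], classifier) # no cluster observed yet
```

With these inputs, the segment `2` is classified as an integer but reported as not variable because its position is stable, while `17` stays variable. In the `unseen` case there are no stats, so `2` falls back to the classifier's verdict.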

#size ⇒ Object



# File 'lib/iriq/clusterer.rb', line 29

def size
  @clusters.size
end