Module: YouPlot::Aggregation

Defined in:
lib/youplot/aggregation.rb

Class Method Details

.compare_integer_strings(a, b) ⇒ Object

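Compares two digit-only strings without converting them to integers. Leading zeros are stripped first (an all-zero string becomes "0"). The stripped strings are then ordered by length, then lexicographically, which matches numeric order for digit strings of equal length. If they are still equal, the original strings are compared, so "007" and "7" get a stable order.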



# File 'lib/youplot/aggregation.rb', line 125

def compare_integer_strings(a, b)
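  # Strip leading zeros; an all-zero string collapses to "0".
  aa = a.sub(/\A0+/, '')
  bb = b.sub(/\A0+/, '')
  aa = '0' if aa.empty?
  bb = '0' if bb.empty?

  # More digits means a larger number.
  r = aa.length <=> bb.length
  return r unless r.zero?

  # Same length: lexicographic order of digit strings equals numeric order.
  r = aa <=> bb
  return r unless r.zero?

  # Same value: fall back to the original strings (e.g. "007" sorts before "7").
  a <=> b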
end
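
The ordering can be checked with a few inputs. The sketch below restates the comparator inside a hypothetical `IntStringDemo` module (the module name is an assumption for illustration, not part of YouPlot) so that it runs on its own:

```ruby
# Hypothetical stand-alone copy of the comparator, for illustration only.
module IntStringDemo
  module_function

  def compare_integer_strings(a, b)
    aa = a.sub(/\A0+/, '')
    bb = b.sub(/\A0+/, '')
    aa = '0' if aa.empty?
    bb = '0' if bb.empty?

    r = aa.length <=> bb.length
    return r unless r.zero?

    r = aa <=> bb
    return r unless r.zero?

    a <=> b
  end
end

sorted = %w[10 7 007 2].sort { |x, y| IntStringDemo.compare_integer_strings(x, y) }
p sorted # => ["2", "007", "7", "10"]
```

Because no integer conversion takes place, digit runs of any length can be compared this way.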

.count_values(arr, tally: true, reverse: false) ⇒ Object

Counts occurrences of each value in arr and returns two parallel arrays, labels and counts, ready for a bar plot. Pairs are sorted by count in descending order, with ties broken by natural_compare; reverse: true flips the final order. With tally: false, or on Rubies without Enumerable#tally, counting falls back to value_counts.



# File 'lib/youplot/aggregation.rb', line 7

def count_values(arr, tally: true, reverse: false)
  # tally was added in Ruby 2.7
  result = \
    if tally && Enumerable.method_defined?(:tally)
      arr.tally
    else
      # value_counts is provided by the enumerable-statistics gem
      arr.value_counts(dropna: false)
    end

  sort_cache = {}

  # sorting
  result = result.sort do |a, b|
    # compare values
    r = b[1] <=> a[1]
    # If the values are the same, compare by name
    r = natural_compare(a[0], b[0], sort_cache) if r.zero?
    r
  end

  # --reverse option
  result.reverse! if reverse

  # prepare for barplot
  result.transpose
end
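
The shape of the return value is easiest to see on a small input. The following is a simplified sketch, not the method above: the helper name `count_values_sketch` is hypothetical, it assumes Ruby 2.7+ for `tally`, and it breaks ties with a plain string comparison instead of `natural_compare`.

```ruby
# Simplified sketch of the counting pipeline (hypothetical helper).
# Assumption: ties are broken by plain string comparison here, whereas the
# listing above uses natural_compare.
def count_values_sketch(arr, reverse: false)
  pairs = arr.tally.sort do |a, b|
    r = b[1] <=> a[1]                      # higher counts first
    r = a[0].to_s <=> b[0].to_s if r.zero? # then by label
    r
  end
  pairs.reverse! if reverse
  pairs.transpose                          # [[labels...], [counts...]]
end

labels, counts = count_values_sketch(%w[b a b c a b])
p labels # => ["b", "a", "c"]
p counts # => [3, 2, 1]
```

Transposing the sorted pairs yields one array of labels and one of counts, which is the layout the "prepare for barplot" comment refers to.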

.ensure_natural_tokens(key) ⇒ Object

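Lazily computes the [type, token] pairs for a sort key and memoizes them in key[:tokens], so the fallback chunked comparison tokenizes each label at most once.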



# File 'lib/youplot/aggregation.rb', line 101

def ensure_natural_tokens(key)
  key[:tokens] ||= natural_tokens(key[:string])
end

.natural_compare(a, b, cache = nil) ⇒ Object

Natural-order comparison used to break ties when counts are equal. Fast paths handle pairs of text-only labels and pairs of purely numeric labels. Every other pair, including labels that mix text and digits (e.g. “chr1” vs “chr10”), goes through the chunked token comparison.



# File 'lib/youplot/aggregation.rb', line 38

def natural_compare(a, b, cache = nil)
  aa = natural_sort_key(a, cache)
  bb = natural_sort_key(b, cache)

  # Fast path: both labels are text-only, so plain string comparison is enough.
  return aa[:string] <=> bb[:string] if aa[:type] == :text && bb[:type] == :text

  # Fast path: both labels are pure numbers, so compare numerically first.
  if aa[:type] == :numeric && bb[:type] == :numeric
    r = aa[:numeric] <=> bb[:numeric]
    return r unless r.zero?

    # Tiebreaker for equivalent numeric values (e.g. "1" and "01")
    return aa[:string] <=> bb[:string]
  end

  # Fallback path: mixed labels, or a text-only label paired with a numeric one.
  ta = ensure_natural_tokens(aa)
  tb = ensure_natural_tokens(bb)
  max = [ta.size, tb.size].max

  0.upto(max - 1) do |i|
    xa = ta[i]
    xb = tb[i]

    return -1 if xa.nil?
    return 1 if xb.nil?

    r = if xa[0] == :num && xb[0] == :num
          compare_integer_strings(xa[1], xb[1])
        else
          xa[1] <=> xb[1]
        end

    return r unless r.zero?
  end

  aa[:string] <=> bb[:string]
end
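
To illustrate why chunked comparison matters for labels such as “chr1” and “chr10”, the sketch below contrasts plain string sorting with a simplified natural sort. It is a different, shorter technique than the comparator above: the helper `natural_key_sketch` is hypothetical, it converts digit runs to integers, and it always places digit chunks before text chunks.

```ruby
# Simplified natural-sort key (hypothetical helper, not the comparator above).
# Digit runs become integers; text chunks are kept as strings.
def natural_key_sketch(label)
  label.to_s.scan(/\d+|\D+/).map do |tok|
    tok.match?(/\A\d+\z/) ? [0, tok.to_i, ''] : [1, 0, tok]
  end
end

labels = %w[chr10 chr2 chr1 chrX]
p labels.sort                                  # => ["chr1", "chr10", "chr2", "chrX"]
p labels.sort_by { |l| natural_key_sketch(l) } # => ["chr1", "chr2", "chr10", "chrX"]
```

Plain string order puts "chr10" before "chr2" because it compares character by character; comparing digit runs as numbers restores the order a reader expects.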

.natural_sort_key(value, cache = nil) ⇒ Object

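Classifies a value as :numeric, :text, or :mixed for natural sorting. When a cache hash is given, the resulting key is stored under the label's string form and reused on later calls.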



# File 'lib/youplot/aggregation.rb', line 79

def natural_sort_key(value, cache = nil)
  str = value.to_s
  return cache[str] if cache && cache.key?(str)

  key = if str.match?(/\d/)
          numeric = parse_numeric(str)
          if numeric
            # Pure numeric labels get a dedicated fast path.
            { type: :numeric, string: str, numeric: numeric }
          else
            # Mixed labels fall back to chunked natural comparison.
            { type: :mixed, string: str, tokens: nil }
          end
        else
          # Text-only labels get a dedicated fast path.
          { type: :text, string: str, tokens: nil }
        end

  cache ? cache[str] = key : key
end

.natural_tokens(str) ⇒ Object

Splits a string into [type, token] pairs for natural comparison. Type is :num for digit-only chunks and :text for anything else. E.g. "chr10" => [[:text, "chr"], [:num, "10"]]



# File 'lib/youplot/aggregation.rb', line 116

def natural_tokens(str)
  str.scan(/\d+|\D+/).map do |tok|
    kind = tok.match?(/\A\d+\z/) ? :num : :text
    [kind, tok]
  end
end
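
A few sample inputs show how the scan splits labels. The sketch restates the method inside a hypothetical `TokenDemo` module (the name is an assumption for illustration) so that it can run without YouPlot:

```ruby
# Hypothetical stand-alone copy of the tokenizer, for illustration only.
module TokenDemo
  module_function

  def natural_tokens(str)
    str.scan(/\d+|\D+/).map do |tok|
      kind = tok.match?(/\A\d+\z/) ? :num : :text
      [kind, tok]
    end
  end
end

p TokenDemo.natural_tokens('chr10') # => [[:text, "chr"], [:num, "10"]]
p TokenDemo.natural_tokens('a1b22') # => [[:text, "a"], [:num, "1"], [:text, "b"], [:num, "22"]]
p TokenDemo.natural_tokens('3.14')  # => [[:num, "3"], [:text, "."], [:num, "14"]]
```

Note that a decimal point is a :text chunk here; decimal labels such as "3.14" only reach this tokenizer when they are compared against a label of a different type.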

.parse_numeric(str) ⇒ Object

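Parses a string as a Float if the whole string is a plain decimal number: an optional sign followed by digits with an optional fractional part, or a bare fraction such as ".5". Exponent notation, a trailing decimal point, and thousands separators are not accepted. Returns a Float, or nil if the string does not match.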



# File 'lib/youplot/aggregation.rb', line 107

def parse_numeric(str)
  return nil unless str.match?(/\A[+-]?(?:\d+(?:\.\d+)?|\.\d+)\z/)

  str.to_f
end
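
The accepted format can be checked against a handful of strings. The sketch restates the method inside a hypothetical `NumericDemo` module (the name is an assumption for illustration) so that it is self-contained:

```ruby
# Hypothetical stand-alone copy of the parser, for illustration only.
module NumericDemo
  module_function

  def parse_numeric(str)
    return nil unless str.match?(/\A[+-]?(?:\d+(?:\.\d+)?|\.\d+)\z/)

    str.to_f
  end
end

p NumericDemo.parse_numeric('42')   # => 42.0
p NumericDemo.parse_numeric('-3.5') # => -3.5
p NumericDemo.parse_numeric('.5')   # => 0.5
p NumericDemo.parse_numeric('1e3')  # => nil (exponent notation is rejected)
p NumericDemo.parse_numeric('5.')   # => nil (trailing decimal point)
p NumericDemo.parse_numeric('chr1') # => nil (mixed label)
```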