Module: Toy::Core::ModelScan

Defined in:
lib/toy/core/model_scan.rb

Class Method Details

.arch_prefix(meta) ⇒ Object

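Probe for the arch prefix the converter used for its metadata keys (arch.rb:159-166). Tries the known prefixes first, then falls back to general.architecture. Returns nil if neither yields an embedding_length key.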



# File 'lib/toy/core/model_scan.rb', line 152

def arch_prefix(meta)
  %w[llama olmoe gemma2 qwen2 qwen3].each do |p|
    return p if meta.kv.key?("#{p}.embedding_length")
  end
  # fall back to general.architecture if it names a prefix we see
  ga = meta.kv["general.architecture"]
  return ga if ga.is_a?(String) && meta.kv.key?("#{ga}.embedding_length")
  nil
end
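
The probe order can be exercised without a real GGUF file. The following is a minimal sketch, assuming a hypothetical `FakeMeta` struct that stands in for the metadata object (only `#kv` is needed here). The key values and the "phi3" architecture name are illustrative, not taken from any real model.

```ruby
# Hypothetical stand-in for the metadata object: only #kv is used.
FakeMeta = Struct.new(:kv)

KNOWN_PREFIXES = %w[llama olmoe gemma2 qwen2 qwen3].freeze

# Standalone copy of the probe logic shown above.
def arch_prefix(meta)
  KNOWN_PREFIXES.each do |p|
    return p if meta.kv.key?("#{p}.embedding_length")
  end
  # Fallback: accept general.architecture only if its own key exists.
  ga = meta.kv["general.architecture"]
  return ga if ga.is_a?(String) && meta.kv.key?("#{ga}.embedding_length")
  nil
end

known    = FakeMeta.new({ "qwen2.embedding_length" => 896 })
fallback = FakeMeta.new({ "general.architecture" => "phi3",
                          "phi3.embedding_length" => 3072 })
missing  = FakeMeta.new({ "general.architecture" => "gpt2" })

p arch_prefix(known)    # "qwen2"
p arch_prefix(fallback) # "phi3"
p arch_prefix(missing)  # nil
```

Note that a known prefix wins over general.architecture whenever both are present.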

.array_length(v) ⇒ Object



# File 'lib/toy/core/model_scan.rb', line 162

def array_length(v)
  v.is_a?(Hash) ? v[:length] : nil
end

.build_entry(path, name, src_kind, size) ⇒ Object



# File 'lib/toy/core/model_scan.rb', line 203

def build_entry(path, name, src_kind, size)
  begin
    meta = GGUFMeta.read(path)
    arch = read_arch(meta)
  rescue GGUFMeta::ParseError, SystemCallError
    arch = nil
  end
  if arch
    ModelEntry.new(name: name, path: path, family: arch[:family],
                   n_params: estimate_params(arch), size_b: size,
                   source: src_kind)
  else
    ModelEntry.new(name: name, path: path, family: :unknown,
                   n_params: 0, size_b: size, source: src_kind)
  end
end

.classify_path(path) ⇒ Object

Friendly name + source-kind from an absolute path. Ported from ModelIndex.classify_path (model_index.rb:108-125). Returns [source_kind, friendly_name].



# File 'lib/toy/core/model_scan.rb', line 78

def classify_path(path)
  home = ENV["HOME"] || "/"
  bn = File.basename(path)
  bn_no_gguf = bn.end_with?(".gguf") ? bn[0...-5] : bn

  if path.start_with?(File.join(home, ".cache/huggingface/hub") + "/")
    ["hf", bn_no_gguf]
  elsif path.start_with?(File.join(home, ".ollama/models") + "/")
    ["ollama", bn]
  elsif path.start_with?(File.join(home, ".lmstudio/models") + "/")
    ["lmstudio", bn_no_gguf]
  else
    ["local", bn_no_gguf]
  end
end
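
A short sketch of how paths map to [source_kind, friendly_name]. The `home:` keyword is an assumption added here so the example does not depend on the caller's environment; the documented method reads ENV["HOME"] directly. All paths are made up.

```ruby
# Standalone copy of the classification logic; `home:` is a hypothetical
# parameter added only to make the example deterministic.
def classify_path(path, home: ENV["HOME"] || "/")
  bn = File.basename(path)
  bn_no_gguf = bn.end_with?(".gguf") ? bn[0...-5] : bn

  if path.start_with?(File.join(home, ".cache/huggingface/hub") + "/")
    ["hf", bn_no_gguf]
  elsif path.start_with?(File.join(home, ".ollama/models") + "/")
    ["ollama", bn]
  elsif path.start_with?(File.join(home, ".lmstudio/models") + "/")
    ["lmstudio", bn_no_gguf]
  else
    ["local", bn_no_gguf]
  end
end

home = "/home/demo"
p classify_path("/home/demo/.cache/huggingface/hub/m/snapshots/r1/tiny.gguf", home: home)
# ["hf", "tiny"]
p classify_path("/home/demo/.ollama/models/tiny.gguf", home: home)
# ["ollama", "tiny.gguf"]
p classify_path("/srv/models/tiny.gguf", home: home)
# ["local", "tiny"]
```

The ollama branch keeps the full basename, while the other branches strip a trailing ".gguf".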

.default_sources ⇒ Object

Search-path order matters for first-found dedup. Project-local paths first, then standard caches. Ported verbatim from ModelIndex.default_sources (model_index.rb:65-77).



# File 'lib/toy/core/model_scan.rb', line 50

def default_sources
  home = ENV["HOME"] || "/"
  paths = []
  env = ENV["TOY_MODEL_DIR"]
  paths << env if env && !env.empty?
  paths << "./data"
  paths << "./models"
  paths << File.join(home, ".cache/huggingface/hub")
  paths << File.join(home, ".ollama/models")
  paths << File.join(home, ".lmstudio/models")
  paths << File.join(home, "models")
  paths
end
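
To illustrate the ordering, here is a minimal sketch that takes the environment as a plain Hash. The `env:` keyword is an assumption made for this example (the documented method reads ENV directly), and the directory names are made up.

```ruby
# Standalone sketch of the search-path order; `env:` is a hypothetical
# parameter so the example can run against a fake environment.
def default_sources(env: ENV)
  home = env["HOME"] || "/"
  paths = []
  dir = env["TOY_MODEL_DIR"]
  paths << dir if dir && !dir.empty?   # explicit override is searched first
  paths << "./data" << "./models"      # then project-local directories
  %w[.cache/huggingface/hub .ollama/models .lmstudio/models models].each do |rel|
    paths << File.join(home, rel)      # then the standard caches
  end
  paths
end

p default_sources(env: { "HOME" => "/home/demo", "TOY_MODEL_DIR" => "/mnt/models" }).first(3)
# ["/mnt/models", "./data", "./models"]
p default_sources(env: { "HOME" => "/home/demo", "TOY_MODEL_DIR" => "" }).first
# "./data"
```

An empty TOY_MODEL_DIR is ignored, so the project-local ./data directory is searched first in that case.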

.estimate_params(a) ⇒ Object

Estimate parameter count from arch dims. Ported verbatim from ModelIndex.estimate_params (model_index.rb:131-144).



# File 'lib/toy/core/model_scan.rb', line 168

def estimate_params(a)
  v = a[:vocab]; d = a[:d_model]; l = a[:n_layers]
  ff = a[:d_ff] || 0
  nq = a[:n_q]; nkv = a[:n_kv]; dh = a[:d_head]
  embed = v * d
  attn_per_layer = (d * nq * dh) + (d * nkv * dh) * 2 + (nq * dh * d)
  ffn_per_layer = 3 * d * ff
  untied = a[:untied] ? (v * d) : 0
  embed + l * (attn_per_layer + ffn_per_layer) + untied
end
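
As a worked example of the arithmetic, the sketch below breaks the estimate into its terms. `param_breakdown` is a hypothetical helper written for this illustration, and the dims are assumed values in the style of a 7B llama-family model, not read from any file. The formula counts only the embedding, attention, FFN, and (if untied) output matrices; norm weights are not included.

```ruby
# Hypothetical helper: same arithmetic as estimate_params, term by term.
def param_breakdown(a)
  d  = a[:d_model]
  dh = a[:d_head]
  q_proj  = d * a[:n_q] * dh          # query projection
  kv_proj = d * a[:n_kv] * dh * 2     # key + value projections
  o_proj  = a[:n_q] * dh * d          # attention output projection
  ffn     = 3 * d * (a[:d_ff] || 0)   # three d x d_ff matrices
  embed   = a[:vocab] * d
  head    = a[:untied] ? embed : 0    # separate output matrix, if present
  per_layer = q_proj + kv_proj + o_proj + ffn
  { embed: embed, per_layer: per_layer, head: head,
    total: embed + a[:n_layers] * per_layer + head }
end

# Assumed dims for illustration only.
dims = { vocab: 32_000, d_model: 4096, n_layers: 32, d_ff: 11_008,
         n_q: 32, n_kv: 32, d_head: 128, untied: true }

b = param_breakdown(dims)
puts b[:embed]      # 131072000
puts b[:per_layer]  # 202375168
puts b[:total]      # 6738149376
```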

.find_ggufs(root) ⇒ Object

Walk a directory tree for *.gguf files. Replaces the C shim tnn_list_ggufs with Dir.glob, which matches symlinked files (HF cache stores blobs + symlinked snapshots) but does not descend into symlinked directories. Returns absolute paths.



# File 'lib/toy/core/model_scan.rb', line 67

def find_ggufs(root)
  return [] if root.nil? || root.empty?
  return [] unless File.directory?(root)
  Dir.glob(File.join(root, "**", "*.gguf"))
     .select { |f| File.file?(f) }
     .map { |f| File.expand_path(f) }
end
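
The filtering behaviour can be checked against a throwaway directory. This is a sketch under assumptions: the layout (blobs/ plus snapshots/ with a symlinked file) only imitates an HF-style cache, the file names are made up, and it needs a platform that supports symlinks.

```ruby
require "tmpdir"
require "fileutils"

# Standalone copy of the walk shown above.
def find_ggufs(root)
  return [] if root.nil? || root.empty?
  return [] unless File.directory?(root)
  Dir.glob(File.join(root, "**", "*.gguf"))
     .select { |f| File.file?(f) }
     .map { |f| File.expand_path(f) }
end

def demo_find_ggufs
  Dir.mktmpdir do |root|
    blob = File.join(root, "blobs", "abc123")
    snap = File.join(root, "snapshots", "rev1")
    FileUtils.mkdir_p([File.dirname(blob), snap])
    File.write(blob, "GGUF")
    File.symlink(blob, File.join(snap, "tiny.gguf"))          # symlinked file: kept
    FileUtils.mkdir_p(File.join(root, "not_a_model.gguf"))     # directory: filtered out
    File.symlink(File.join(root, "gone"),
                 File.join(snap, "broken.gguf"))               # dangling link: filtered out
    # Report paths relative to the temp root for readability.
    find_ggufs(root).map { |f| f.delete_prefix(root + "/") }
  end
end

p demo_find_ggufs # ["snapshots/rev1/tiny.gguf"]
```

The blob itself is not returned because its name does not end in ".gguf"; only the snapshot symlink is.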

.read_arch(meta) ⇒ Object

Read arch dims + family from GGUF metadata via the pure-Ruby header reader. Mirrors the key list + family logic of Arch.from_gguf (arch.rb:159-218). Returns a Hash, or nil if the file can’t be parsed as a llama-family GGUF.



# File 'lib/toy/core/model_scan.rb', line 98

def read_arch(meta)
  prefix = arch_prefix(meta)
  return nil if prefix.nil?

  kv = meta.kv
  vocab = kv["#{prefix}.vocab_size"]
  vocab = array_length(kv["tokenizer.ggml.tokens"]) if vocab.nil?
  d_model  = kv["#{prefix}.embedding_length"]
  d_ff     = kv["#{prefix}.feed_forward_length"]
  n_q      = kv["#{prefix}.attention.head_count"]
  n_kv     = kv["#{prefix}.attention.head_count_kv"] || n_q
  n_layers = kv["#{prefix}.block_count"]

  return nil if vocab.nil? || d_model.nil? || n_layers.nil? || n_q.nil?
  return nil if vocab <= 0 || d_model <= 0 || n_layers <= 0 || n_q <= 0

  # Family detection. general.architecture reliably names genuinely
  # different arches (gpt2, gemma2, olmoe, ...). Our own converter
  # writes "llama" for BOTH real-llama AND real-qwen (qwen is
  # structurally llama-family in this toy), so within "llama" we
  # refine llama-vs-qwen2 by QKV-bias tensor presence (cosmetic —
  # both render the same llama-family Card). Non-llama general.arch
  # values are surfaced verbatim so `describe` can decline to render
  # a llama Card for an arch it doesn't model (fail loud, not mask).
  ga = kv["general.architecture"]
  if ga.is_a?(String) && ga != "llama"
    family = ga.to_sym
  else
    has_qkv_bias = meta.tensor?("blk.0.attn_q.bias") ||
                   meta.tensor?("blk.0.attn_q.head_0.bias")
    family = has_qkv_bias ? :qwen2 : :llama
  end

  {
    family:   family,
    vocab:    vocab,
    d_model:  d_model,
    d_ff:     d_ff,
    n_q:      n_q,
    n_kv:     n_kv,
    n_layers: n_layers,
    d_head:   d_model / n_q,
    untied:   meta.tensor?("output.weight"),
    ctx:      kv["#{prefix}.context_length"] || 8192,
    rope_base: kv["#{prefix}.rope.freq_base"],
    rms_eps:   kv["#{prefix}.attention.layer_norm_rms_epsilon"],
    moe:       meta.tensor?("blk.0.ffn_gate_inp.weight"),
    n_experts:      kv["#{prefix}.expert_count"] || 0,
    n_experts_used: kv["#{prefix}.expert_used_count"] || 0,
    arch_prefix: prefix
  }
end
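
The family branch is the least obvious part of read_arch, so here it is in isolation. `detect_family` is a hypothetical helper for this sketch; it takes the general.architecture value and a plain list of tensor names instead of the metadata object.

```ruby
# Hypothetical helper isolating the family-detection branch of read_arch.
def detect_family(general_arch, tensor_names)
  if general_arch.is_a?(String) && general_arch != "llama"
    # Non-llama architectures are surfaced verbatim.
    general_arch.to_sym
  else
    # Within "llama" (or when the key is absent), a QKV bias tensor
    # marks the model as qwen2.
    has_qkv_bias = tensor_names.include?("blk.0.attn_q.bias") ||
                   tensor_names.include?("blk.0.attn_q.head_0.bias")
    has_qkv_bias ? :qwen2 : :llama
  end
end

p detect_family("llama", ["blk.0.attn_q.weight"])                      # :llama
p detect_family("llama", ["blk.0.attn_q.weight", "blk.0.attn_q.bias"]) # :qwen2
p detect_family("gemma2", [])                                          # :gemma2
p detect_family(nil, [])                                               # :llama
```

A missing general.architecture key takes the same path as "llama", so such a file is classified by bias-tensor presence alone.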

.scan(sources = default_sources) ⇒ Object

Scan source dirs → [ModelEntry]. De-dup by CANONICAL (symlink-resolved) path, first-found wins, so a `toy fetch` model shows once under its data/ symlink (scanned first) rather than twice (the symlink AND the HF-cache blob it points at). Falls back to the literal path if realpath fails (e.g. a broken symlink). Unparseable/non-llama files degrade to family=:unknown, params=0 (NOT dropped; this is the UX delta).



# File 'lib/toy/core/model_scan.rb', line 186

def scan(sources = default_sources)
  seen = {}
  out = []
  sources.each do |src|
    find_ggufs(src).each do |path|
      canon = (File.realpath(path) rescue path)
      next if seen[canon]
      seen[canon] = true
      src_kind, name = classify_path(path)
      size = File.size(path)
      entry = build_entry(path, name, src_kind, size)
      out << entry
    end
  end
  out
end
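
The first-found-wins dedup can be sketched on its own. `dedup_by_realpath` is a hypothetical helper that keeps only the canonical-path check from scan and leaves out classification and metadata parsing. It rescues SystemCallError, which is narrower than the inline `rescue` modifier used above. The directory layout is made up and requires symlink support.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: the dedup core of scan, without entry building.
def dedup_by_realpath(paths)
  seen = {}
  paths.select do |path|
    canon = begin
      File.realpath(path)
    rescue SystemCallError
      path # e.g. a broken symlink: fall back to the literal path
    end
    first = !seen.key?(canon)
    seen[canon] = true
    first
  end
end

def demo_dedup
  Dir.mktmpdir do |root|
    cached = File.join(root, "cache", "tiny.gguf")
    link   = File.join(root, "data", "tiny.gguf")
    FileUtils.mkdir_p([File.dirname(cached), File.dirname(link)])
    File.write(cached, "GGUF")
    File.symlink(cached, link)
    # data/ is listed first, so its symlink is the entry that survives.
    dedup_by_realpath([link, cached]).map { |f| f.delete_prefix(root + "/") }
  end
end

p demo_dedup # ["data/tiny.gguf"]
```

Reversing the input order would keep cache/tiny.gguf instead, which is why the order of the sources matters.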