Class: Toy::LLM::Recipes::WarmStartMetal

Inherits:
Object
Defined in:
lib/toy/llm/recipes/warm_start_metal.rb

Overview

The warm-start training recipe. realize_scratch! realizes the random-init graph on a Toy::LLM::Engine::LlamaSeqEngineMetal and OPENS the warm window (the random-init realize self-enables full_finetune + train_embeddings, so no extra enable_* call is needed). realize_warm! (optional) uploads donor embeddings into the realize’d embed table BEFORE the graph is baked. build! CLOSES the window by baking forward+CE+backward+opt_step_adamw into the ggml graph. step! then drives one training step. The caller (fixture) owns the experiment config, the PCA GGUF read, the corpus stream, the LR schedule, and the per-step input Mats; the donor read itself lives in realize_warm!, with upload_donor! as the seam for buffers the caller has already read.
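The call order above can be sketched with a small stand-in class. This is an illustration only: WindowSketch, its state names, and its log are assumptions introduced here, and it touches no engine. The real class, as listed on this page, does not guard the order with checks; the sketch raises only to make the window rules visible.

```ruby
# Hypothetical stand-in that models only the call-order rules of the recipe.
# WindowSketch and its states are illustrative names, not part of the library.
class WindowSketch
  attr_reader :state, :log

  def initialize
    @state = :new
    @log   = []
  end

  # Opens the warm window (the real method realizes the random-init graph).
  def realize_scratch!
    raise "realize_scratch! called twice" unless @state == :new
    @state = :window_open
    @log << :realize_scratch
    nil
  end

  # Optional: only meaningful while the window is open.
  def realize_warm!
    unless @state == :window_open
      raise "realize_warm! outside the warm window (state=#{@state})"
    end
    @log << :realize_warm
    nil
  end

  # Closes the window (the real method bakes the training graph).
  def build!
    raise "build! before realize_scratch!" unless @state == :window_open
    @state = :built
    @log << :build
    nil
  end

  # One step; only meaningful after build!.
  def step!
    raise "step! before build!" unless @state == :built
    @log << :step
    0.0 # placeholder loss
  end
end

recipe = WindowSketch.new
recipe.realize_scratch!
recipe.realize_warm!   # skip this call for an INIT=scratch flow
recipe.build!
recipe.step!
```

Calling realize_warm! after build! in this sketch raises, which mirrors the documented consequence in the real recipe: the upload would arrive too late and training would run through the random init.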


Constructor Details

#initialize ⇒ WarmStartMetal

Returns a new instance of WarmStartMetal.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 75

def initialize
  @ws_cache      = Toy::LLM::Engine::LlamaSeqEngineMetal.new
  @ws_t_loss     = nil
  @ws_t_labels   = nil
  @ws_t_hp       = nil
  @ws_step_index = 0
end

Instance Attribute Details

#ws_cache ⇒ Object

Returns the value of attribute ws_cache.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 73

def ws_cache
  @ws_cache
end

#ws_step_index ⇒ Object

Returns the value of attribute ws_step_index.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 73

def ws_step_index
  @ws_step_index
end

#ws_t_hp ⇒ Object

Returns the value of attribute ws_t_hp.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 73

def ws_t_hp
  @ws_t_hp
end

#ws_t_labels ⇒ Object

Returns the value of attribute ws_t_labels.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 73

def ws_t_labels
  @ws_t_labels
end

#ws_t_loss ⇒ Object

Returns the value of attribute ws_t_loss.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 73

def ws_t_loss
  @ws_t_loss
end

Class Method Details

.donor_embed_width(donor_gguf_path) ⇒ Object

Read the donor’s embedding width (llama.embedding_length) from a GGUF path — the value the caller must put in cfg.donor_d_in BEFORE realize_scratch! (the projection lens is sized donor_d_in x d_model at realize time, so the recipe cannot learn it later). FAILS LOUD on a missing/corrupt donor or a non-llama-family GGUF. (toy#73 item 4 — the read half of the donor plumbing realize_warm! owns.)



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 107

def self.donor_embed_width(donor_gguf_path)
  if !File.exist?(donor_gguf_path)
    raise "WarmStartMetal.donor_embed_width: donor GGUF not found: " +
          donor_gguf_path
  end
  ggh = TinyNNMetal.tnn_gguf_load(donor_gguf_path)
  if ggh == nil || ggh == TinyNNMetal.tnn_null_ptr
    raise "WarmStartMetal.donor_embed_width: failed to open " +
          donor_gguf_path + " (not a GGUF?)"
  end
  donor_d = TinyNNMetal.tnn_gguf_get_u32(ggh, "llama.embedding_length")
  TinyNNMetal.tnn_gguf_free(ggh)
  if donor_d <= 0
    raise "WarmStartMetal.donor_embed_width: donor has no " +
          "llama.embedding_length key — not llama-family? (" +
          donor_gguf_path + ")"
  end
  donor_d
end
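The caller-side ordering described above (read the width, store it in cfg.donor_d_in, only then realize) might look like the sketch below. SketchCfg, prepare_cfg!, and fake_reader are hypothetical names introduced here so the snippet runs without the Metal bindings; the width 2048 and the other numbers are made up.

```ruby
# Hypothetical config stand-in; the real cfg type is not shown on this page.
SketchCfg = Struct.new(:vocab, :d_model, :donor_d_in)

# Stores the donor width in cfg BEFORE any realize call.
# width_reader stands in for WarmStartMetal.donor_embed_width.
def prepare_cfg!(cfg, donor_gguf_path, width_reader)
  cfg.donor_d_in = width_reader.call(donor_gguf_path)
  cfg
end

# Fake reader that mimics the fail-loud contract (raise on a missing file).
fake_reader = lambda do |path|
  raise "donor GGUF not found: " + path unless path == "donor.gguf"
  2048 # made-up width, for illustration only
end

cfg = prepare_cfg!(SketchCfg.new(32_000, 512, nil), "donor.gguf", fake_reader)

# With the real bindings loaded, the equivalent calls would be:
#   cfg.donor_d_in = Toy::LLM::Recipes::WarmStartMetal.donor_embed_width(path)
#   recipe.realize_scratch!(cfg, opts)
```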

Instance Method Details

#build! ⇒ Object

CLOSE the warm window: bake forward + CE + backward + opt_step_adamw into the ggml graph (no Ruby Trainer/optimizer; same rationale as FromScratch). Delegates VERBATIM to build_training_step and stashes the returned [t_loss, t_labels, t_hp] triple. Returns nil.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 203

def build!
  result       = @ws_cache.build_training_step
  @ws_t_loss   = result[0]
  @ws_t_labels = result[1]
  @ws_t_hp     = result[2]
  nil
end

#realize_scratch!(cfg, opts) ⇒ Object

Realize the random-init graph and OPEN the warm window. Delegates VERBATIM to the cache: realize_for_random_init (which self-enables full_finetune + train_embeddings). The opts argument is a Toy::LLM::RecipeOptions (toy#64 item 1) carrying the former 7 trailing positional args, unpacked here in the engine’s exact positional order (identical to FromScratch#realize! / 09 L138), so the realize is byte-identical. Does NOT bake the graph; that is build!’s job, leaving the window open for an optional realize_warm! upload in between. Returns nil.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 92

def realize_scratch!(cfg, opts)
  @ws_cache.realize_for_random_init(cfg, opts.t_seq, opts.t_batch,
                                    opts.weight_dtype, opts.untied,
                                    opts.qkv_bias, opts.seed,
                                    opts.init_scale)
  nil
end
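The positional unpack above can be illustrated with a stand-in options object. SketchOptions and positional_args are hypothetical: the sketch models only the seven readers that realize_scratch! calls, the real Toy::LLM::RecipeOptions may carry more, and every value (including the :f32 dtype) is made up.

```ruby
# Hypothetical stand-in for Toy::LLM::RecipeOptions: only the seven readers
# used by realize_scratch! are modelled here.
SketchOptions = Struct.new(:t_seq, :t_batch, :weight_dtype, :untied,
                           :qkv_bias, :seed, :init_scale)

# Unpack in the same positional order realize_scratch! passes to the engine.
def positional_args(opts)
  [opts.t_seq, opts.t_batch, opts.weight_dtype, opts.untied,
   opts.qkv_bias, opts.seed, opts.init_scale]
end

# All values below are made up for illustration.
opts = SketchOptions.new(128, 4, :f32, false, false, 42, 0.02)
args = positional_args(opts)
```

Keeping the order in one place like this is what lets an options object replace seven trailing positionals without changing what the engine receives.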

#realize_warm!(donor_gguf_path, cfg) ⇒ Object

OPTIONAL: warm the realize’d embed table from a donor GGUF. Owns the WHOLE donor read (toy#73 item 4 — was ~25 lines of bare GGUF plumbing in every consumer): open, re-read llama.embedding_length and DIM-CHECK it against cfg.donor_d_in (the width the lens was realized at — a mismatch would silently upload garbage through the wrong stride), find token_embd.weight, read the first cfg.vocab rows, upload through upload_donor!, free. Every failure raises NAMED + LOUD (which tensor, expected vs got, which path). Must be called AFTER realize_scratch! (the tensor exists) and BEFORE build! (else we train through the random init). INIT=scratch flows skip this method entirely. Returns nil.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 138

def realize_warm!(donor_gguf_path, cfg)
  if !File.exist?(donor_gguf_path)
    raise "WarmStartMetal#realize_warm!: donor GGUF not found: " +
          donor_gguf_path
  end
  ggh = TinyNNMetal.tnn_gguf_load(donor_gguf_path)
  if ggh == nil || ggh == TinyNNMetal.tnn_null_ptr
    raise "WarmStartMetal#realize_warm!: failed to open " +
          donor_gguf_path + " (not a GGUF?)"
  end
  donor_d = TinyNNMetal.tnn_gguf_get_u32(ggh, "llama.embedding_length")
  if donor_d <= 0
    TinyNNMetal.tnn_gguf_free(ggh)
    raise "WarmStartMetal#realize_warm!: donor has no " +
          "llama.embedding_length key — not llama-family? (" +
          donor_gguf_path + ")"
  end
  if donor_d != cfg.donor_d_in
    TinyNNMetal.tnn_gguf_free(ggh)
    raise "WarmStartMetal#realize_warm!: token_embd.weight width " +
          "mismatch: expected donor_d_in=" + cfg.donor_d_in.to_s +
          " (the width realize_scratch! sized the lens at) but " +
          "donor llama.embedding_length=" + donor_d.to_s + " (" +
          donor_gguf_path + ")"
  end
  te_idx = TinyNNMetal.tnn_gguf_find_index(ggh, "token_embd.weight")
  if te_idx < 0
    TinyNNMetal.tnn_gguf_free(ggh)
    raise "WarmStartMetal#realize_warm!: donor has no " +
          "token_embd.weight tensor (" + donor_gguf_path + ")"
  end
  n_floats = cfg.vocab * donor_d
  te_buf = Mat.new(1, n_floats)
  rc = TinyNNMetal.tnn_gguf_read_f32_to_doubles(ggh, te_idx,
                                           te_buf.flat, n_floats)
  if rc != 0
    TinyNNMetal.tnn_gguf_free(ggh)
    raise "WarmStartMetal#realize_warm!: token_embd.weight read failed " +
          "rc=" + rc.to_s + " — wanted " + n_floats.to_s +
          " floats (vocab " + cfg.vocab.to_s + " x donor_d " +
          donor_d.to_s + ") from " + donor_gguf_path
  end
  upload_donor!(te_buf.flat, n_floats)
  TinyNNMetal.tnn_gguf_free(ggh)
  nil
end
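Why the width check matters can be shown with plain arrays. The sketch below is not the recipe's code: row_at and the tiny 3 x 4 table are assumptions chosen so the arithmetic is easy to follow. It sizes a flat row-major buffer with the same vocab * donor_d rule and then reads one row with the right and the wrong stride.

```ruby
# Row-major lookup: the slice holding `row` of a table with `width` columns.
def row_at(flat, row, width)
  flat[row * width, width]
end

vocab    = 3
donor_d  = 4                  # width the buffer was actually written with
n_floats = vocab * donor_d    # same sizing rule as realize_warm!

# Rows are [0,1,2,3], [10,11,12,13], [20,21,22,23] laid out back to back.
flat = Array.new(n_floats) { |i| (i / donor_d) * 10.0 + (i % donor_d) }

right = row_at(flat, 1, donor_d) # => [10.0, 11.0, 12.0, 13.0]
wrong = row_at(flat, 1, 3)       # => [3.0, 10.0, 11.0], straddles two rows
```

No error is raised in the wrong-stride case, which is the silent failure the explicit comparison against cfg.donor_d_in is there to prevent.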

#step!(seq_ids, positions, m_labels, m_hp, is_first) ⇒ Object

ONE training step. Op order is COPIED VERBATIM from FromScratch#step! (from_scratch.rb:83-97) and LITERALLY IDENTICAL to LoRA#step!: graph_reset on the first step else reset_grads_only; the four uploads in order (token_ids/positions/labels/hp); compute_backward; download_row_major(t_loss, 1, 1). is_first selects the reset; the @ws_step_index accessor is carried for callers that want it but is NOT used for the reset decision, so the caller stays in full control of the step==0 branch. The per-step LR enters ONLY via the caller mutating m_hp.flat before this call — there is deliberately NO lr param here (matches the siblings; keeps schedule logic in the fixture). Returns the loss Float. Per-step input Mats are built by the caller.



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 223

def step!(seq_ids, positions, m_labels, m_hp, is_first)
  s = @ws_cache.sess
  if is_first
    TinyNNMetal.tnn_graph_reset(s)
  else
    TinyNNMetal.tnn_graph_reset_grads_only(s)
  end
  TinyNNMetal.upload_int_array(s, @ws_cache.t_seq_token_ids, seq_ids)
  TinyNNMetal.upload_int_array(s, @ws_cache.t_seq_positions, positions)
  TinyNNMetal.upload_row_major(s, @ws_t_labels, m_labels)
  TinyNNMetal.upload_row_major(s, @ws_t_hp,     m_hp)
  TinyNNMetal.tnn_compute_backward(s)
  loss_mat = TinyNNMetal.download_row_major(s, @ws_t_loss, 1, 1)
  loss_mat.flat[0]
end
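A caller-side loop that feeds the learning rate through m_hp might be sketched as follows. Everything here is hypothetical: lr_at is one possible warmup-plus-cosine schedule (the fixture's real schedule is not shown), SketchMat models only a mutable flat array, and LR_SLOT = 0 is an assumption because this page does not document the layout of the hyperparameter tensor. The step! call is left as a comment so the sketch runs without the engine.

```ruby
# Hypothetical schedule: linear warmup to `peak`, then cosine decay to zero.
def lr_at(step, warmup, total, peak)
  return peak * (step + 1) / warmup.to_f if step < warmup
  progress = (step - warmup) / (total - warmup).to_f
  0.5 * peak * (1.0 + Math.cos(Math::PI * progress))
end

# Stand-in for the hyperparameter Mat: only a mutable `flat` array is modelled.
SketchMat = Struct.new(:flat)
LR_SLOT = 0 # ASSUMPTION: which slot of m_hp holds the LR is not documented here

m_hp = SketchMat.new([0.0])
seen = []
4.times do |step|
  m_hp.flat[LR_SLOT] = lr_at(step, 2, 4, 1.0e-3) # mutate BEFORE step!
  is_first = (step == 0)                          # caller owns the reset branch
  # loss = recipe.step!(seq_ids, positions, m_labels, m_hp, is_first)
  seen << [is_first, m_hp.flat[LR_SLOT]]
end
```

Because the schedule lives entirely in the loop, swapping it for another one requires no change to the recipe.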

#upload_donor!(donor_buf_flat, n_floats) ⇒ Object

The raw upload MECHANISM realize_warm! rides (and the seam for already-read buffers — e.g. the legacy PCA-lens flow): one tnn_upload_from_float_array into the realize’d token_embed table (mirrors 09 L180). Same window rules as realize_warm!. The PCA lens W_proj upload (09 L188-229) stays caller-side through



# File 'lib/toy/llm/recipes/warm_start_metal.rb', line 191

def upload_donor!(donor_buf_flat, n_floats)
  TinyNNMetal.tnn_upload_from_float_array(@ws_cache.sess,
                                     @ws_cache.t_seq_token_embed,
                                     donor_buf_flat, n_floats)
  nil
end