Class: Toy::LLM::Recipes::VitTiny

Inherits:
Object
Defined in:
lib/toy/llm/recipes/vit_tiny.rb

Overview

Training recipe for ViT-Tiny trained from scratch with random initialization. realize! builds the random-init training graph (forward, cross-entropy (CE) loss, backward, AdamW); realize_for_random_init self-seeds every PARAM and Adam moment via Box-Muller over xorshift64, so no donor GGUF is needed. step! then drives one training step. The caller (runner) owns the ViT config and the per-step input Mats (image/labels/hp/cls_idx).

Constructor Details

#initialize ⇒ VitTiny

Returns a new instance of VitTiny.



# File 'lib/toy/llm/recipes/vit_tiny.rb', line 32

def initialize
  @vt_cache  = Toy::LLM::Engine::ViTTinyEngine.new
  @vt_t_loss = nil
end

Instance Attribute Details

#vt_cache ⇒ Object

Returns the value of attribute vt_cache.



# File 'lib/toy/llm/recipes/vit_tiny.rb', line 30

def vt_cache
  @vt_cache
end

#vt_t_loss ⇒ Object

Returns the value of attribute vt_t_loss.



# File 'lib/toy/llm/recipes/vit_tiny.rb', line 30

def vt_t_loss
  @vt_t_loss
end

Instance Method Details

#realize!(cfg, opts) ⇒ Object

Realizes the random-init graph by delegating verbatim to the cache in two calls. First, realize_for_random_init (3-arg: cfg, seed, init_scale) self-seeds via Box-Muller, with no donor. Second, build_training_step bakes forward + CE + backward + opt_step_adamw into the ggml graph and returns a single t_loss pointer, which is stored in @vt_t_loss. `opts` is a Toy::LLM::RecipeOptions (toy#64 item 1); the ViT random-init path consumes only seed and init_scale (init_scale=1.0 per 07_train_vit_tiny.rb:80). Returns nil.



# File 'lib/toy/llm/recipes/vit_tiny.rb', line 44

def realize!(cfg, opts)
  @vt_cache.realize_for_random_init(cfg, opts.seed, opts.init_scale)
  @vt_t_loss = @vt_cache.build_training_step
  nil
end
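
The self-seeding described above pairs an xorshift64 generator with the Box-Muller transform. The engine's actual implementation is not shown on this page, so the following is only a minimal sketch of that general technique. The class Xorshift64, the helper gaussian, the zero-seed fallback constant, and the way init_scale multiplies the standard deviation are all assumptions made for illustration.

```ruby
# Hypothetical illustration only; NOT the engine's actual seeding code.
class Xorshift64
  MASK64 = 0xFFFF_FFFF_FFFF_FFFF

  def initialize(seed)
    @state = seed & MASK64
    # xorshift state must be nonzero; the fallback constant is an assumption.
    @state = 0x9E37_79B9_7F4A_7C15 if @state.zero?
  end

  # One xorshift64 step (shifts 13, 7, 17), masked to 64 bits because
  # Ruby Integers are arbitrary precision.
  def next_u64
    x = @state
    x ^= (x << 13) & MASK64
    x ^= x >> 7
    x ^= (x << 17) & MASK64
    @state = x
  end

  # Uniform Float in (0, 1], built from the top 53 bits so that
  # Math.log is never called on 0.
  def next_unit
    ((next_u64 >> 11) + 1) / 9007199254740992.0 # 2**53
  end
end

# Box-Muller: two uniforms in, one normal sample out.
# Assumption: init_scale acts as a multiplier on the standard deviation.
def gaussian(rng, init_scale = 1.0)
  u1 = rng.next_unit
  u2 = rng.next_unit
  init_scale * Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

rng = Xorshift64.new(1234)
weights = Array.new(4) { gaussian(rng, 1.0) }
```

Because the generator is fully determined by its seed, two runs with the same seed and init_scale produce identical weights, which is what makes a donor-free initialization reproducible.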

#step!(m_image, cls_idx, m_labels, m_hp, is_first) ⇒ Object

Runs one training step. The op order is verbatim from examples/legacy/07_train_vit_tiny.rb:269-281: graph_reset on the first step, otherwise reset_grads_only; the four uploads in order (image, cls_idx, labels, hp), where cls_idx goes through upload_int_array and the other three through upload_row_major; compute_backward; then download_row_major(t_loss, 1, 1). is_first selects the reset, so the caller keeps full control of the step==0 branch (this matches the gate's step==0 branch exactly). The input tensors t_image/t_cls_idx/t_labels/t_hp are read from the cache accessors, not from a stashed triple. Per-step input Mats are built by the caller (runner). Returns the loss as a Float.



# File 'lib/toy/llm/recipes/vit_tiny.rb', line 59

def step!(m_image, cls_idx, m_labels, m_hp, is_first)
  s = @vt_cache.sess
  if is_first
    TinyNN.tnn_graph_reset(s)
  else
    TinyNN.tnn_graph_reset_grads_only(s)
  end
  TinyNN.upload_row_major(s, @vt_cache.t_image,   m_image)
  TinyNN.upload_int_array(s, @vt_cache.t_cls_idx, cls_idx)
  TinyNN.upload_row_major(s, @vt_cache.t_labels,  m_labels)
  TinyNN.upload_row_major(s, @vt_cache.t_hp,      m_hp)
  TinyNN.tnn_compute_backward(s)
  loss_mat = TinyNN.download_row_major(s, @vt_t_loss, 1, 1)
  loss_mat.flat[0]
end
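
The calling protocol (realize! once, then step! per iteration with is_first true only at step 0) can be sketched from the runner's side. The sketch below is self-contained and does not load the toy library: StubRecipe, Opts, and run_training are hypothetical stand-ins, the nil inputs replace the real per-step Mats, and the stub returns a placeholder loss instead of downloading one.

```ruby
# Hypothetical stand-in for Toy::LLM::RecipeOptions (only the two fields
# the ViT random-init path is documented to consume).
Opts = Struct.new(:seed, :init_scale)

# Hypothetical stand-in for Toy::LLM::Recipes::VitTiny. It records which
# reset each step would select rather than touching a real graph.
class StubRecipe
  attr_reader :log

  def initialize
    @log = []
    @realized = false
  end

  def realize!(_cfg, opts)
    @seed = opts.seed
    @realized = true
    nil
  end

  def step!(_m_image, _cls_idx, _m_labels, _m_hp, is_first)
    # Stub-only guard to make the required call order explicit.
    raise "realize! must run before step!" unless @realized
    @log << (is_first ? :graph_reset : :reset_grads_only)
    0.0 # placeholder; the real recipe returns the downloaded loss Float
  end
end

# Runner-side loop: the caller decides is_first, here as step.zero?.
def run_training(recipe, cfg, opts, n_steps)
  recipe.realize!(cfg, opts)
  Array.new(n_steps) do |step|
    # Real runners build m_image/cls_idx/m_labels/m_hp here; nil placeholders.
    recipe.step!(nil, nil, nil, nil, step.zero?)
  end
end

recipe = StubRecipe.new
losses = run_training(recipe, nil, Opts.new(1234, 1.0), 3)
```

Keeping the is_first decision in the runner means the recipe holds no step counter of its own, so a caller that restarts or replays steps chooses explicitly which reset applies.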