Class: Toy::LLM::Recipes::VitTiny
- Inherits: Object
  - Object
  - Toy::LLM::Recipes::VitTiny
- Defined in: lib/toy/llm/recipes/vit_tiny.rb
Overview
The from-scratch, random-init training recipe for ViT-Tiny. realize! builds the random-init forward + CE + backward + AdamW graph: realize_for_random_init self-seeds every PARAM and Adam moment via Box-Muller over xorshift64, so no donor GGUF is needed. step! then drives one training step. The caller (runner) owns the ViT config and the per-step input Mats (image/labels/hp/cls_idx).
Instance Attribute Summary
- #vt_cache ⇒ Object
  Returns the value of attribute vt_cache.
- #vt_t_loss ⇒ Object
  Returns the value of attribute vt_t_loss.
Instance Method Summary
- #initialize ⇒ VitTiny (constructor)
  A new instance of VitTiny.
- #realize!(cfg, opts) ⇒ Object
  Realize the random-init graph.
- #step!(m_image, cls_idx, m_labels, m_hp, is_first) ⇒ Object
  ONE training step.
Constructor Details
Instance Attribute Details
#vt_cache ⇒ Object
Returns the value of attribute vt_cache.
    # File 'lib/toy/llm/recipes/vit_tiny.rb', line 30
    def vt_cache
      @vt_cache
    end
#vt_t_loss ⇒ Object
Returns the value of attribute vt_t_loss.
    # File 'lib/toy/llm/recipes/vit_tiny.rb', line 30
    def vt_t_loss
      @vt_t_loss
    end
Instance Method Details
#realize!(cfg, opts) ⇒ Object
Realize the random-init graph. Delegates VERBATIM to the cache: first realize_for_random_init (3-arg: cfg, seed, init_scale; self-seeds via Box-Muller, NO donor), then build_training_step (forward + CE + backward + opt_step_adamw baked into the ggml graph; returns a SINGLE t_loss pointer). opts is a Toy::LLM::RecipeOptions (toy#64 item 1); the ViT random-init path consumes ONLY seed and init_scale (init_scale = 1.0 per 07_train_vit_tiny.rb:80). Returns nil.
    # File 'lib/toy/llm/recipes/vit_tiny.rb', line 44
    def realize!(cfg, opts)
      @vt_cache.realize_for_random_init(cfg, opts.seed, opts.init_scale)
      @vt_t_loss = @vt_cache.build_training_step
      nil
    end
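For readers unfamiliar with the seeding scheme named above, here is a minimal Ruby sketch of a Box-Muller transform fed by a xorshift64 generator. It is an illustration, not the library's code: the class name Xorshift64Normal, the shift triple (13, 7, 17), the zero-seed fallback constant, and the way init_scale multiplies the sample are all assumptions made for this example.

```ruby
# Illustrative only: xorshift64 uniform source + Box-Muller normal samples.
# Shift triple, seed handling, and scaling are assumptions, not the
# library's actual realize_for_random_init behaviour.
class Xorshift64Normal
  MASK64 = 0xFFFF_FFFF_FFFF_FFFF
  TWO_POW_53 = 9_007_199_254_740_992.0

  def initialize(seed)
    @state = seed & MASK64
    # xorshift must never hold the all-zero state; fall back to a
    # fixed nonzero constant if the masked seed is 0.
    @state = 0x9E37_79B9_7F4A_7C15 if @state.zero?
  end

  # Advance the 64-bit state and return it.
  def next_u64
    x = @state
    x ^= (x << 13) & MASK64
    x ^= x >> 7
    x ^= (x << 17) & MASK64
    @state = x
  end

  # Uniform Float in (0, 1]: top 53 bits plus one, so Math.log stays finite.
  def next_uniform
    ((next_u64 >> 11) + 1) / TWO_POW_53
  end

  # One normal sample (Box-Muller, cosine branch), multiplied by init_scale.
  def next_normal(init_scale = 1.0)
    u1 = next_uniform
    u2 = next_uniform
    init_scale * Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
  end
end

rng = Xorshift64Normal.new(12_345)
samples = Array.new(10_000) { rng.next_normal(1.0) }
```

Because the generator is deterministic, two instances built from the same seed produce identical parameter values, which is what makes a seed-only (no donor) initialization reproducible.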
#step!(m_image, cls_idx, m_labels, m_hp, is_first) ⇒ Object
ONE training step. Op order is VERBATIM from examples/legacy/07_train_vit_tiny.rb:269-281: graph_reset on the first step, otherwise reset_grads_only; the four uploads in order (image, cls_idx, labels, hp); compute_backward; then download_row_major(t_loss, 1, 1). is_first selects the reset, so the caller stays in full control of the step == 0 branch (this matches the gate's step == 0 branch exactly). Reads t_image/t_cls_idx/t_labels/t_hp from the CACHE accessors (NOT a stashed triple). Returns the loss as a Float. The caller (runner) builds the per-step input Mats.
    # File 'lib/toy/llm/recipes/vit_tiny.rb', line 59
    def step!(m_image, cls_idx, m_labels, m_hp, is_first)
      s = @vt_cache.sess
      if is_first
        TinyNN.tnn_graph_reset(s)
      else
        TinyNN.tnn_graph_reset_grads_only(s)
      end
      TinyNN.upload_row_major(s, @vt_cache.t_image,   m_image)
      TinyNN.upload_int_array(s, @vt_cache.t_cls_idx, cls_idx)
      TinyNN.upload_row_major(s, @vt_cache.t_labels,  m_labels)
      TinyNN.upload_row_major(s, @vt_cache.t_hp,      m_hp)
      TinyNN.tnn_compute_backward(s)
      loss_mat = TinyNN.download_row_major(s, @vt_t_loss, 1, 1)
      loss_mat.flat[0]
    end
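The calling convention for step! (the caller builds the inputs and passes is_first only on step 0) might look like the sketch below from the runner's side. FakeRecipe and run_training are hypothetical names invented for this example; FakeRecipe only mimics the step! signature so the loop runs without the real library, and the placeholder arrays stand in for real Mats.

```ruby
# Hypothetical stand-in: mimics only the step! signature, records the
# is_first flag it receives, and returns a made-up decreasing loss.
class FakeRecipe
  attr_reader :first_flags

  def initialize
    @first_flags = []
  end

  def step!(_m_image, _cls_idx, _m_labels, _m_hp, is_first)
    @first_flags << is_first
    1.0 / @first_flags.length
  end
end

# Drive `steps` training steps. The block builds the per-step inputs
# (image, cls_idx, labels, hp); is_first is true only when step == 0.
def run_training(recipe, steps)
  (0...steps).map do |step|
    m_image, cls_idx, m_labels, m_hp = yield(step)
    recipe.step!(m_image, cls_idx, m_labels, m_hp, step.zero?)
  end
end

recipe = FakeRecipe.new
# Placeholder inputs; a real runner would build Mats here.
losses = run_training(recipe, 3) { |_step| [[0.0], [0], [1.0], [1e-3]] }
```

With the real class, the same loop would be preceded by one realize!(cfg, opts) call so that the graph and t_loss exist before the first step.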