Class: Toy::LLM::Engine::ViTTinyBlockFFI
- Inherits: Object
  - Object
  - Toy::LLM::Engine::ViTTinyBlockFFI
- Defined in: lib/toy/llm/engine/vit_tiny_engine.rb
Instance Attribute Summary
NOTE on norm choice (attached to every attribute below): ggml has no backward for GGML_OP_NORM (LayerNorm); only GGML_OP_RMS_NORM has a registered backward.
- #ft_m ⇒ Object
- #ft_v ⇒ Object
- #ft_weights ⇒ Object
- #t_ln1_gamma ⇒ Object
- #t_ln2_gamma ⇒ Object
- #t_w_down ⇒ Object
- #t_w_k ⇒ Object
- #t_w_o ⇒ Object
- #t_w_q ⇒ Object
- #t_w_up ⇒ Object
- #t_w_v ⇒ Object
Instance Method Summary
- #initialize ⇒ ViTTinyBlockFFI (constructor)
  A new instance of ViTTinyBlockFFI.
Constructor Details
#initialize ⇒ ViTTinyBlockFFI
Returns a new instance of ViTTinyBlockFFI.
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 45
def initialize
  @t_ln1_gamma = TinyNN.tnn_null_ptr
  @t_ln2_gamma = TinyNN.tnn_null_ptr
  @t_w_q = [TinyNN.tnn_null_ptr]
  @t_w_k = [TinyNN.tnn_null_ptr]
  @t_w_v = [TinyNN.tnn_null_ptr]
  @t_w_o = TinyNN.tnn_null_ptr
  @t_w_up = TinyNN.tnn_null_ptr
  @t_w_down = TinyNN.tnn_null_ptr
  @ft_weights = [TinyNN.tnn_null_ptr]; @ft_weights.pop
  @ft_m = [TinyNN.tnn_null_ptr]; @ft_m.pop
  @ft_v = [TinyNN.tnn_null_ptr]; @ft_v.pop
end
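The constructor leaves its attributes in three distinct initial states: a bare null handle, a one-element array holding a null handle, and an array that is seeded with a null handle and then emptied with `pop`. The sketch below reproduces those three states so they can be inspected without the FFI layer. `TinyNNStub` and `BlockSketch` are hypothetical stand-ins introduced here for illustration; the real `TinyNN.tnn_null_ptr` is not shown in this page, and the source does not state why the seed-then-pop form is used instead of a plain `[]`.

```ruby
# Hypothetical stand-in for the TinyNN FFI module (assumption: the real
# tnn_null_ptr returns some opaque null handle; here it is a frozen sentinel).
module TinyNNStub
  NULL_PTR = Object.new.freeze

  def self.tnn_null_ptr
    NULL_PTR
  end
end

# Illustrative class mirroring three of the initial states set by #initialize.
class BlockSketch
  attr_accessor :t_w_o, :t_w_q, :ft_weights

  def initialize
    @t_w_o = TinyNNStub.tnn_null_ptr          # single handle, unset
    @t_w_q = [TinyNNStub.tnn_null_ptr]        # one-element array holding a null handle
    @ft_weights = [TinyNNStub.tnn_null_ptr]   # seed with one element...
    @ft_weights.pop                           # ...then remove it, leaving an empty array
  end
end

block = BlockSketch.new
puts block.t_w_q.length        # 1
puts block.ft_weights.empty?   # true
```

Code that consumes these attributes therefore cannot treat "array is non-empty" as "weights are loaded": in this sketch `t_w_q` starts with one null entry, while `ft_weights` starts empty.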
Instance Attribute Details
#ft_m ⇒ Object
NOTE on norm choice: ggml has no backward for GGML_OP_NORM (LayerNorm); only GGML_OP_RMS_NORM has a registered backward, so the training path uses RMSNorm. This is a documented ViT variant: timm's `vit_giant_patch14_clip_224`, for example, uses LayerScale plus other variants; LayerNorm and RMSNorm both train, and LayerNorm is the timm default but is not load-bearing for the recipe. For E1 acceptance (loss decreases on a memorisation smoke test) this swap is fine. The follow-up to ship a vendored LayerNorm backward, along with the timm-loader compatibility that goes with it, is its own issue.
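To make the difference between the two norms concrete, here is a minimal sketch in plain Ruby arrays. It is not ggml's or TinyNN's implementation: `layer_norm`, `rms_norm`, and `rms_norm_backward` are illustrative names introduced here, the learned gain (the class's `t_ln1_gamma` / `t_ln2_gamma`) is left out, and `EPS` is an assumed value. The backward function is the textbook gradient of RMSNorm, checked against a central finite difference.

```ruby
EPS = 1e-6 # assumed epsilon, chosen for illustration only

# LayerNorm without affine parameters: subtract the mean, divide by the std.
def layer_norm(x, eps = EPS)
  mean = x.sum / x.length
  var = x.sum { |v| (v - mean)**2 } / x.length
  x.map { |v| (v - mean) / Math.sqrt(var + eps) }
end

# RMSNorm without a gain: divide by the root mean square, no mean subtraction.
def rms_norm(x, eps = EPS)
  ms = x.sum { |v| v * v } / x.length
  x.map { |v| v / Math.sqrt(ms + eps) }
end

# Gradient of rms_norm with respect to x, given the upstream gradient dy:
#   dx_i = dy_i / r - x_i * (sum_j dy_j * x_j) / (n * r^3),  r = sqrt(ms + eps)
def rms_norm_backward(x, dy, eps = EPS)
  n = x.length
  r = Math.sqrt(x.sum { |v| v * v } / n + eps)
  dot = x.zip(dy).sum { |xv, dv| xv * dv }
  x.zip(dy).map { |xv, dv| dv / r - xv * dot / (n * r**3) }
end

x  = [1.0, -2.0, 3.0, 0.5]
dy = [0.1, 0.2, -0.3, 0.4]
h  = 1e-5

# Central finite difference of the scalar loss L(x) = sum_i dy_i * rms_norm(x)_i.
numeric = x.each_index.map do |k|
  xp = x.dup
  xp[k] += h
  xm = x.dup
  xm[k] -= h
  lp = rms_norm(xp).zip(dy).sum { |a, b| a * b }
  lm = rms_norm(xm).zip(dy).sum { |a, b| a * b }
  (lp - lm) / (2 * h)
end

analytic = rms_norm_backward(x, dy)
max_err = analytic.zip(numeric).map { |a, b| (a - b).abs }.max
puts max_err < 1e-6 # true when the analytic and numeric gradients agree
```

The only structural difference in the forward pass is the mean subtraction, which is why swapping one norm for the other changes the layer's output but not the shape of the surrounding graph.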
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def ft_m
  @ft_m
end
#ft_v ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def ft_v
  @ft_v
end
#ft_weights ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def ft_weights
  @ft_weights
end
#t_ln1_gamma ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_ln1_gamma
  @t_ln1_gamma
end
#t_ln2_gamma ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_ln2_gamma
  @t_ln2_gamma
end
#t_w_down ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_down
  @t_w_down
end
#t_w_k ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_k
  @t_w_k
end
#t_w_o ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_o
  @t_w_o
end
#t_w_q ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_q
  @t_w_q
end
#t_w_up ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_up
  @t_w_up
end
#t_w_v ⇒ Object
# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40
def t_w_v
  @t_w_v
end