Class: Toy::LLM::Engine::ViTTinyBlockFFI

Inherits:
Object
Defined in:
lib/toy/llm/engine/vit_tiny_engine.rb

Constructor Details

#initialize ⇒ ViTTinyBlockFFI

Returns a new instance of ViTTinyBlockFFI.

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 45

def initialize
  # Every tensor handle starts out as a null pointer.
  @t_ln1_gamma = TinyNN.tnn_null_ptr
  @t_ln2_gamma = TinyNN.tnn_null_ptr
  # Q/K/V weights are arrays, each seeded with a single null pointer.
  @t_w_q = [TinyNN.tnn_null_ptr]
  @t_w_k = [TinyNN.tnn_null_ptr]
  @t_w_v = [TinyNN.tnn_null_ptr]
  @t_w_o    = TinyNN.tnn_null_ptr
  @t_w_up   = TinyNN.tnn_null_ptr
  @t_w_down = TinyNN.tnn_null_ptr
  # Seed-then-pop leaves these arrays empty (presumably to fix the element type).
  @ft_weights = [TinyNN.tnn_null_ptr]; @ft_weights.pop
  @ft_m       = [TinyNN.tnn_null_ptr]; @ft_m.pop
  @ft_v       = [TinyNN.tnn_null_ptr]; @ft_v.pop
end
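
The state the constructor leaves behind can be illustrated with a small stand-alone sketch. `TinyNNStub` and `BlockSketch` are hypothetical stand-ins introduced only for this example; the real `TinyNN.tnn_null_ptr` is not reproduced here, and the sketch assumes only that it returns some sentinel value.

```ruby
# Hypothetical stand-in for TinyNN; only the null-pointer sentinel is modelled.
module TinyNNStub
  NULL = Object.new.freeze

  def self.tnn_null_ptr
    NULL
  end
end

# Hypothetical cut-down version of the block, mirroring three of its fields.
class BlockSketch
  attr_reader :t_w_o, :t_w_q, :ft_weights

  def initialize
    @t_w_o = TinyNNStub.tnn_null_ptr        # single handle, null
    @t_w_q = [TinyNNStub.tnn_null_ptr]      # array holding one null handle
    @ft_weights = [TinyNNStub.tnn_null_ptr] # seeded with one element...
    @ft_weights.pop                         # ...then emptied again
  end
end

block = BlockSketch.new
puts block.t_w_q.length      # 1
puts block.ft_weights.empty? # true
```

In other words, the Q/K/V arrays start with one null entry, while the `ft_*` arrays start empty and are ready to be appended to.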

Instance Attribute Details

#ft_m ⇒ Object

NOTE on norm choice: ggml has no backward pass for GGML_OP_NORM (LayerNorm); only GGML_OP_RMS_NORM has a registered backward. The training path therefore uses RMSNorm. This is a documented ViT variant (e.g. timm's `vit_giant_patch14_clip_224` uses LayerScale + variants). Both LayerNorm and RMSNorm train; LayerNorm is the timm default, but the recipe does not depend on it. For E1 acceptance (loss decreases on a memorisation smoke test) the swap is fine. Shipping a vendored LayerNorm backward, along with the timm-loader compatibility that goes with it, is a separate issue.
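
To make the difference between the two norms concrete, here is a minimal plain-Ruby sketch of the textbook forward formulas. It is reference math only, not the ggml or TinyNN implementation; the function names, the `EPS` value, and the omission of a LayerNorm bias term are assumptions made for the example.

```ruby
EPS = 1e-6 # assumed epsilon; the value used by the engine is not shown here

# LayerNorm: centre on the mean, divide by the standard deviation,
# then apply a per-channel scale (bias omitted for brevity).
def layer_norm(x, gamma, eps = EPS)
  n = x.length.to_f
  mean = x.sum / n
  var = x.sum { |v| (v - mean)**2 } / n
  inv = 1.0 / Math.sqrt(var + eps)
  x.each_with_index.map { |v, i| (v - mean) * inv * gamma[i] }
end

# RMSNorm: no mean subtraction; divide by the root mean square,
# then apply a per-channel scale.
def rms_norm(x, gamma, eps = EPS)
  n = x.length.to_f
  mean_sq = x.sum { |v| v * v } / n
  inv = 1.0 / Math.sqrt(mean_sq + eps)
  x.each_with_index.map { |v, i| v * inv * gamma[i] }
end

x = [1.0, 2.0, 3.0, 4.0]
gamma = [1.0, 1.0, 1.0, 1.0]
p layer_norm(x, gamma)
p rms_norm(x, gamma)
```

For zero-mean input the two functions return the same values, which is one way to see why swapping one for the other is a modest change for a smoke test.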

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def ft_m
  @ft_m
end

#ft_v ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def ft_v
  @ft_v
end

#ft_weights ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def ft_weights
  @ft_weights
end

#t_ln1_gamma ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_ln1_gamma
  @t_ln1_gamma
end

#t_ln2_gamma ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_ln2_gamma
  @t_ln2_gamma
end

#t_w_down ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_down
  @t_w_down
end

#t_w_k ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_k
  @t_w_k
end

#t_w_o ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_o
  @t_w_o
end

#t_w_q ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_q
  @t_w_q
end

#t_w_up ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_up
  @t_w_up
end

#t_w_v ⇒ Object

# File 'lib/toy/llm/engine/vit_tiny_engine.rb', line 40

def t_w_v
  @t_w_v
end