Class: Ignis::AI::Transformer::Model
- Inherits: NN::Module
- Ancestors: Object → NN::Module → Ignis::AI::Transformer::Model
- Defined in: lib/nnw/ai/transformer/model.rb
Overview
Full Transformer language model.
token_embedding + position_embedding → dropout → N × Block → LayerNorm → LM head
Factory methods provide standard model configurations:
.gpt2_small → 124M params
.gpt2_medium → 345M params
.gpt2_large → 774M params
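The parameter counts quoted above can be roughly checked by hand. The following is a minimal sketch, assuming a standard GPT-2 block layout (fused QKV and output projections with bias, a two-layer MLP with bias, two LayerNorms per block) and an LM head that shares the token-embedding weights. Neither assumption is confirmed by this page, and `estimate_params` is a hypothetical helper, not part of the library.

```ruby
# Rough parameter count for a GPT-2 style configuration.
# ASSUMPTIONS (not confirmed by this page): standard GPT-2 block internals,
# and an LM head tied to the token embedding unless tied_head is false.
def estimate_params(vocab_size:, embed_dim:, num_layers:, ff_dim:, max_seq_len:, tied_head: true)
  d = embed_dim
  f = ff_dim
  embeddings = vocab_size * d + max_seq_len * d   # token + position tables
  attention  = (3 * d * d + 3 * d) + (d * d + d)  # QKV projection + output projection
  mlp        = (d * f + f) + (f * d + d)          # two linear layers with bias
  norms      = 2 * (2 * d)                        # two LayerNorms (scale + shift)
  per_block  = attention + mlp + norms
  final_norm = 2 * d
  head       = tied_head ? 0 : d * vocab_size     # bias-free LM head when untied
  embeddings + num_layers * per_block + final_norm + head
end

small = estimate_params(vocab_size: 50257, embed_dim: 768, num_layers: 12,
                        ff_dim: 3072, max_seq_len: 1024)
puts format("gpt2_small estimate: %.1fM parameters", small / 1e6)
```

If the LM head holds its own weight matrix rather than sharing the embedding table, passing `tied_head: false` adds `embed_dim * vocab_size` parameters to the estimate; the value reported by `#num_parameters` is the authoritative one.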
Instance Attribute Summary
- #embed_dim ⇒ Integer (readonly)
- #max_seq_len ⇒ Integer (readonly)
- #num_heads ⇒ Integer (readonly)
- #num_layers ⇒ Integer (readonly)
- #vocab_size ⇒ Integer (readonly)
Attributes inherited from NN::Module
Class Method Summary
- .gpt2_large(device_id: 0) ⇒ Model
  GPT-2 Large: 774M parameters.
- .gpt2_medium(device_id: 0) ⇒ Model
  GPT-2 Medium: 345M parameters.
- .gpt2_small(device_id: 0) ⇒ Model
  GPT-2 Small: 124M parameters.
Instance Method Summary
- #decode_step(token_id, cache) ⇒ Tensor
  Incremental forward for one new token using a KV cache (decode path).
- #forward(input_ids, mask: nil) ⇒ Tensor
  Forward pass: returns logits.
- #initialize(vocab_size:, embed_dim:, num_heads:, num_layers:, ff_dim:, max_seq_len:, dropout: 0.0, activation: :gelu, pre_norm: true, device_id: 0) ⇒ Model (constructor)
  A new instance of Model.
- #make_kv_cache(device_id: @device_id) ⇒ KVCache
  Allocate a fresh KV cache sized for this model.
- #to_s ⇒ String
Methods inherited from NN::Module
#call, #eval!, #load_state_dict, #named_parameters, #num_parameters, #parameters, #state_dict, #to, #train!, #zero_grad!
Constructor Details
#initialize(vocab_size:, embed_dim:, num_heads:, num_layers:, ff_dim:, max_seq_len:, dropout: 0.0, activation: :gelu, pre_norm: true, device_id: 0) ⇒ Model
Returns a new instance of Model.
# File 'lib/nnw/ai/transformer/model.rb', line 28
def initialize(vocab_size:, embed_dim:, num_heads:, num_layers:, ff_dim:,
               max_seq_len:, dropout: 0.0, activation: :gelu,
               pre_norm: true, device_id: 0)
  super()
  @vocab_size = vocab_size
  @embed_dim = embed_dim
  @num_heads = num_heads
  @num_layers = num_layers
  @max_seq_len = max_seq_len
  @device_id = device_id

  @token_embedding = register_module("token_embedding",
    NN::Embedding.new(vocab_size, embed_dim, device_id: device_id))
  @position_embedding = register_module("position_embedding",
    NN::Embedding.new(max_seq_len, embed_dim, device_id: device_id))

  @blocks = []
  num_layers.times do |i|
    block = Block.new(embed_dim, num_heads, ff_dim,
                      dropout: dropout, pre_norm: pre_norm,
                      activation: activation, device_id: device_id)
    @blocks << register_module("blocks.#{i}", block)
  end

  @norm = register_module("norm", NN::LayerNorm.new(embed_dim, device_id: device_id))
  @head = register_module("head", NN::Linear.new(embed_dim, vocab_size, bias: false, device_id: device_id))
  @dropout = register_module("dropout", NN::Dropout.new(p: dropout))
end
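The constructor registers each submodule under a string name, with blocks numbered as `blocks.0`, `blocks.1`, and so on. A minimal sketch of the resulting name list follows; `registered_module_names` is a hypothetical helper written for illustration, and whether these names appear as key prefixes in `#state_dict` or `#named_parameters` is an assumption this page does not confirm.

```ruby
# Reproduces the registration order used in the constructor above.
# Hypothetical helper: it only builds the names, it does not touch the library.
def registered_module_names(num_layers)
  ["token_embedding", "position_embedding"] +
    Array.new(num_layers) { |i| "blocks.#{i}" } +
    ["norm", "head", "dropout"]
end

puts registered_module_names(2)
```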
Instance Attribute Details
#embed_dim ⇒ Integer (readonly)
# File 'lib/nnw/ai/transformer/model.rb', line 16
def embed_dim
  @embed_dim
end
#max_seq_len ⇒ Integer (readonly)
# File 'lib/nnw/ai/transformer/model.rb', line 16
def max_seq_len
  @max_seq_len
end
#num_heads ⇒ Integer (readonly)
# File 'lib/nnw/ai/transformer/model.rb', line 16
def num_heads
  @num_heads
end
#num_layers ⇒ Integer (readonly)
# File 'lib/nnw/ai/transformer/model.rb', line 16
def num_layers
  @num_layers
end
#vocab_size ⇒ Integer (readonly)
# File 'lib/nnw/ai/transformer/model.rb', line 16
def vocab_size
  @vocab_size
end
Class Method Details
.gpt2_large(device_id: 0) ⇒ Model
GPT-2 Large: 774M parameters
# File 'lib/nnw/ai/transformer/model.rb', line 165
def self.gpt2_large(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 1280,
    num_heads: 20,
    num_layers: 36,
    ff_dim: 5120,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end
.gpt2_medium(device_id: 0) ⇒ Model
GPT-2 Medium: 345M parameters
# File 'lib/nnw/ai/transformer/model.rb', line 147
def self.gpt2_medium(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 1024,
    num_heads: 16,
    num_layers: 24,
    ff_dim: 4096,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end
.gpt2_small(device_id: 0) ⇒ Model
GPT-2 Small: 124M parameters
# File 'lib/nnw/ai/transformer/model.rb', line 129
def self.gpt2_small(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 768,
    num_heads: 12,
    num_layers: 12,
    ff_dim: 3072,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end
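The three presets share the same proportions, which a short check makes visible. This is a sketch using values copied from the factory methods above; `head_dim` is a hypothetical helper, and the requirement that `embed_dim` divide evenly by `num_heads` is the usual multi-head attention convention rather than something this page states about `Block`.

```ruby
# Preset values copied from the factory methods on this page.
presets = {
  gpt2_small:  { embed_dim: 768,  num_heads: 12, num_layers: 12, ff_dim: 3072 },
  gpt2_medium: { embed_dim: 1024, num_heads: 16, num_layers: 24, ff_dim: 4096 },
  gpt2_large:  { embed_dim: 1280, num_heads: 20, num_layers: 36, ff_dim: 5120 }
}

# Per-head width in multi-head attention (hypothetical helper).
# ASSUMPTION: embed_dim must split evenly across heads.
def head_dim(embed_dim:, num_heads:, **)
  unless (embed_dim % num_heads).zero?
    raise ArgumentError, "embed_dim #{embed_dim} is not divisible by num_heads #{num_heads}"
  end
  embed_dim / num_heads
end

presets.each do |name, cfg|
  ratio = cfg[:ff_dim] / cfg[:embed_dim]
  puts "#{name}: head_dim=#{head_dim(**cfg)}, ff_dim/embed_dim=#{ratio}"
end
```

A custom configuration passed to `#initialize` can be run through the same check before allocating device memory.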
Instance Method Details
#decode_step(token_id, cache) ⇒ Tensor
Incremental forward pass for one new token using a KV cache (the decode path). The result is equivalent to the last-position logits of a full forward pass over the whole prefix, but each step costs O(prefix) instead of O(prefix²): only the new token is embedded and projected, and its query attends over the cached K/V. Must run under Tape.no_grad (no autograd). Cache entries are appended in prefix order, so callers feed the prompt token by token before sampling. Raises if the cache already holds max_seq_len positions.
# File 'lib/nnw/ai/transformer/model.rb', line 107
def decode_step(token_id, cache)
  pos = cache.length
  raise "KVCache full: position #{pos} exceeds max_seq_len #{@max_seq_len}" if pos >= @max_seq_len

  tok = Tensor.from_host([token_id], shape: [1], dtype: :int32, device_id: @device_id)
  pos_t = Tensor.from_host([pos], shape: [1], dtype: :int32, device_id: @device_id)
  x = @token_embedding.call(tok) + @position_embedding.call(pos_t) # [1, embed]

  @blocks.each_with_index { |block, i| x = block.decode_step(x, cache, i) }
  cache.advance!

  x = @norm.call(x)
  @head.call(x) # [1, vocab]
end
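The calling pattern described for `#decode_step` (feed the prompt one token at a time, then sample) can be sketched as a greedy generation loop. `StubModel` and `StubCache` below are hypothetical stand-ins so the loop runs without the library; they return plain Ruby arrays rather than a Tensor. With the real model, the logits would need to be copied to the host first, and the loop would run inside Tape.no_grad, whose calling form is not shown on this page.

```ruby
# Hypothetical stand-in for KVCache: only tracks how many positions are filled.
class StubCache
  attr_reader :length

  def initialize
    @length = 0
  end

  def advance!
    @length += 1
  end
end

# Hypothetical stand-in for Model: logits always favour (token_id + 1) % VOCAB.
class StubModel
  VOCAB = 5

  def make_kv_cache
    StubCache.new
  end

  def decode_step(token_id, cache)
    cache.advance!
    Array.new(VOCAB) { |i| i == (token_id + 1) % VOCAB ? 1.0 : 0.0 }
  end
end

# Greedy decoding: prefill the cache with the prompt, then repeatedly pick
# the highest-scoring token and feed it back in.
def greedy_generate(model, prompt_ids, max_new_tokens:)
  raise ArgumentError, "prompt must not be empty" if prompt_ids.empty?

  cache = model.make_kv_cache
  logits = nil
  prompt_ids.each { |id| logits = model.decode_step(id, cache) }

  out = prompt_ids.dup
  max_new_tokens.times do |step|
    next_id = logits.each_with_index.max_by { |value, _| value }.last
    out << next_id
    # Skip the final decode: its logits would never be used.
    logits = model.decode_step(next_id, cache) if step < max_new_tokens - 1
  end
  out
end

p greedy_generate(StubModel.new, [0, 1], max_new_tokens: 3)
```

Because every call appends one position, the prompt length plus the number of generated tokens must stay within `max_seq_len`, or `#decode_step` raises.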
#forward(input_ids, mask: nil) ⇒ Tensor
Forward pass: returns logits.
# File 'lib/nnw/ai/transformer/model.rb', line 62
def forward(input_ids, mask: nil)
  seq_len = input_ids.shape[-1]

  # Create position indices
  positions_data = (0...seq_len).to_a
  pos_nv = Ignis::Shared::NvArray.new(shape: [seq_len], dtype: :int32,
                                      device_id: input_ids.device_id)
  pos_nv.from_host(positions_data)
  positions = Tensor.new(data: pos_nv, requires_grad: false)

  # Embeddings
  tok_emb = @token_embedding.call(input_ids) # [batch, seq, embed]
  pos_emb = @position_embedding.call(positions) # [seq, embed]

  # Combine and dropout
  x = tok_emb + pos_emb
  x = @dropout.call(x)

  # Transformer blocks
  @blocks.each do |block|
    x = block.call(x, mask: mask)
  end

  # Final norm and LM head
  x = @norm.call(x)
  @head.call(x) # → logits [batch*seq, vocab]
end
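The shape comments in `#forward` imply that the `[seq, embed]` position embeddings are broadcast across the batch when added to the `[batch, seq, embed]` token embeddings. A minimal sketch of that addition on nested Ruby arrays follows; `add_position_embeddings` is a hypothetical helper, and the real Tensor `+` is assumed, not confirmed, to broadcast this way.

```ruby
# tok_emb: [batch][seq][embed], pos_emb: [seq][embed]
# Adds the same position vector to position t of every sequence in the batch.
def add_position_embeddings(tok_emb, pos_emb)
  tok_emb.map do |sequence|
    sequence.each_with_index.map do |vec, t|
      vec.zip(pos_emb[t]).map { |a, b| a + b }
    end
  end
end

tok_emb = [
  [[1, 2], [3, 4]], # sequence 0
  [[5, 6], [7, 8]]  # sequence 1
]
pos_emb = [[10, 20], [30, 40]] # rows for positions 0 and 1

p add_position_embeddings(tok_emb, pos_emb)
```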
#make_kv_cache(device_id: @device_id) ⇒ KVCache
Allocate a fresh KV cache sized for this model.
# File 'lib/nnw/ai/transformer/model.rb', line 93
def make_kv_cache(device_id: @device_id)
  KVCache.new(num_layers: @num_layers, max_seq_len: @max_seq_len,
              embed_dim: @embed_dim, device_id: device_id)
end
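Since the cache is sized from `num_layers`, `max_seq_len`, and `embed_dim`, its footprint can be estimated up front. The sketch below assumes one key vector and one value vector of width `embed_dim` per layer and position, stored as float32; the actual KVCache layout and dtype are not shown on this page, and `kv_cache_bytes` is a hypothetical helper.

```ruby
# ASSUMPTION: keys and values are each [num_layers, max_seq_len, embed_dim]
# and stored as 4-byte floats. Adjust bytes_per_element for other dtypes.
def kv_cache_bytes(num_layers:, max_seq_len:, embed_dim:, bytes_per_element: 4)
  2 * num_layers * max_seq_len * embed_dim * bytes_per_element
end

bytes = kv_cache_bytes(num_layers: 12, max_seq_len: 1024, embed_dim: 768)
puts format("gpt2_small KV cache: %.1f MiB", bytes / (1024.0 * 1024.0))
```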
#to_s ⇒ String
# File 'lib/nnw/ai/transformer/model.rb', line 181
def to_s
  "TransformerModel(vocab=#{@vocab_size}, embed=#{@embed_dim}, " \
    "heads=#{@num_heads}, layers=#{@num_layers}, " \
    "params=#{num_parameters})"
end