Class: Ignis::AI::Transformer::Model

Inherits: NN::Module

Object → NN::Module → Ignis::AI::Transformer::Model

Defined in: lib/nnw/ai/transformer/model.rb

Overview

Full Transformer language model.

token_embedding + position_embedding → Dropout → N × Block → LayerNorm → LM head

Factory methods provide standard model configurations:

.gpt2_small  → 124M params
.gpt2_medium → 345M params
.gpt2_large  → 774M params
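The 124M figure can be checked by hand. The sketch below counts parameters for a GPT-2-style stack under stated assumptions: each Block is taken to hold biased Q/K/V/output projections, a two-layer biased feed-forward network, and two LayerNorms. Block's internals are not shown on this page, so `gpt2_param_count` and its `tied_head` flag are hypothetical helpers, not part of the library.

```ruby
# Parameter-count arithmetic for a GPT-2-style stack.
# ASSUMPTION: the per-Block layout below is the standard GPT-2 one; the
# real Block class may differ.
def gpt2_param_count(vocab_size:, embed_dim:, num_layers:, ff_dim:, max_seq_len:, tied_head: true)
  embeddings = vocab_size * embed_dim + max_seq_len * embed_dim
  attention  = 4 * (embed_dim * embed_dim + embed_dim)   # Q, K, V, output (weights + biases)
  feed_fwd   = (embed_dim * ff_dim + ff_dim) + (ff_dim * embed_dim + embed_dim)
  layer_norm = 2 * embed_dim                             # weight + bias
  block      = attention + feed_fwd + 2 * layer_norm
  head       = tied_head ? 0 : embed_dim * vocab_size    # LM head has no bias
  embeddings + num_layers * block + layer_norm + head
end

SMALL = { vocab_size: 50257, embed_dim: 768, num_layers: 12, ff_dim: 3072, max_seq_len: 1024 }.freeze

puts gpt2_param_count(**SMALL)                    # head shares the embedding matrix
puts gpt2_param_count(**SMALL, tied_head: false)  # head counted separately
```

With a shared head the small configuration comes to 124,439,808 parameters, which matches the 124M label. The constructor listed on this page registers a separate NN::Linear head, so if no weight tying happens elsewhere, #num_parameters may report about 163M instead.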

Attributes inherited from NN::Module

#training

Class Method Summary

.gpt2_large(device_id: 0) ⇒ Model

GPT-2 Large: 774M parameters.

.gpt2_medium(device_id: 0) ⇒ Model

GPT-2 Medium: 345M parameters.

.gpt2_small(device_id: 0) ⇒ Model

GPT-2 Small: 124M parameters.

Instance Method Summary

#decode_step(token_id, cache) ⇒ Tensor

Incremental forward for ONE new token using a KV cache (decode path).

#forward(input_ids, mask: nil) ⇒ Tensor

Forward pass: returns logits.

#initialize(vocab_size:, embed_dim:, num_heads:, num_layers:, ff_dim:, max_seq_len:, dropout: 0.0, activation: :gelu, pre_norm: true, device_id: 0) ⇒ Model (constructor)

Returns a new instance of Model.

#make_kv_cache(device_id: @device_id) ⇒ KVCache

Allocate a fresh KV cache sized for this model.

#to_s ⇒ String

Methods inherited from NN::Module

#call, #eval!, #load_state_dict, #named_parameters, #num_parameters, #parameters, #state_dict, #to, #train!, #zero_grad!

Constructor Details

#initialize(vocab_size:, embed_dim:, num_heads:, num_layers:, ff_dim:, max_seq_len:, dropout: 0.0, activation: :gelu, pre_norm: true, device_id: 0) ⇒ `Model`

Returns a new instance of Model.

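Parameters:

vocab_size (Integer): vocabulary size
embed_dim (Integer): model dimension
num_heads (Integer): attention heads per block
num_layers (Integer): number of Transformer blocks
ff_dim (Integer): feed-forward hidden dimension
max_seq_len (Integer): maximum sequence length; also the number of rows in the position embedding table
dropout (Float) (defaults to: 0.0): dropout probability applied to the embedding sum; also passed to each Block
activation (Symbol) (defaults to: :gelu): activation passed to each Block
pre_norm (Boolean) (defaults to: true): pre-norm flag passed to each Block
device_id (Integer) (defaults to: 0): device index passed to every submodule that accepts one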

# File 'lib/nnw/ai/transformer/model.rb', line 28

def initialize(vocab_size:, embed_dim:, num_heads:, num_layers:,
               ff_dim:, max_seq_len:, dropout: 0.0,
               activation: :gelu, pre_norm: true, device_id: 0)
  super()
  @vocab_size = vocab_size
  @embed_dim = embed_dim
  @num_heads = num_heads
  @num_layers = num_layers
  @max_seq_len = max_seq_len
  @device_id = device_id

  @token_embedding = register_module("token_embedding",
                      NN::Embedding.new(vocab_size, embed_dim, device_id: device_id))
  @position_embedding = register_module("position_embedding",
                         NN::Embedding.new(max_seq_len, embed_dim, device_id: device_id))

  @blocks = []
  num_layers.times do |i|
    block = Block.new(embed_dim, num_heads, ff_dim,
                      dropout: dropout, pre_norm: pre_norm,
                      activation: activation, device_id: device_id)
    @blocks << register_module("blocks.#{i}", block)
  end

  @norm = register_module("norm", NN::LayerNorm.new(embed_dim, device_id: device_id))
  @head = register_module("head",
           NN::Linear.new(embed_dim, vocab_size, bias: false, device_id: device_id))
  @dropout = register_module("dropout", NN::Dropout.new(p: dropout))
end
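The constructor accepts embed_dim and num_heads independently. A quick sanity check, sketched below, is that the model dimension divides evenly across heads. This assumes Block splits embed_dim evenly per head, as standard multi-head attention does; Block's own validation is not shown on this page, and `head_dim` and `CONFIGS` are illustrative names only. The numbers are copied from the three factory methods.

```ruby
# Derived per-head width for the three factory configurations.
# ASSUMPTION: Block requires embed_dim % num_heads == 0.
CONFIGS = {
  gpt2_small:  { embed_dim: 768,  num_heads: 12, ff_dim: 3072 },
  gpt2_medium: { embed_dim: 1024, num_heads: 16, ff_dim: 4096 },
  gpt2_large:  { embed_dim: 1280, num_heads: 20, ff_dim: 5120 }
}.freeze

def head_dim(embed_dim:, num_heads:, **_rest)
  unless (embed_dim % num_heads).zero?
    raise ArgumentError, "embed_dim #{embed_dim} is not divisible by num_heads #{num_heads}"
  end
  embed_dim / num_heads
end

CONFIGS.each do |name, cfg|
  ratio = cfg[:ff_dim] / cfg[:embed_dim]
  puts "#{name}: head_dim=#{head_dim(**cfg)}, ff_dim/embed_dim=#{ratio}"
end
```

All three presets give a head width of 64 and a feed-forward width of 4 × embed_dim, which is a reasonable starting point when choosing values for a custom configuration.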

Instance Attribute Details

#embed_dim ⇒ `Integer` (readonly)

Returns:

(Integer)

# File 'lib/nnw/ai/transformer/model.rb', line 16

def embed_dim
  @embed_dim
end

#max_seq_len ⇒ `Integer` (readonly)

Returns:

(Integer)

# File 'lib/nnw/ai/transformer/model.rb', line 16

def max_seq_len
  @max_seq_len
end

#num_heads ⇒ `Integer` (readonly)

Returns:

(Integer)

# File 'lib/nnw/ai/transformer/model.rb', line 16

def num_heads
  @num_heads
end

#num_layers ⇒ `Integer` (readonly)

Returns:

(Integer)

# File 'lib/nnw/ai/transformer/model.rb', line 16

def num_layers
  @num_layers
end

#vocab_size ⇒ `Integer` (readonly)

Returns:

(Integer)

# File 'lib/nnw/ai/transformer/model.rb', line 16

def vocab_size
  @vocab_size
end

Class Method Details

.gpt2_large(device_id: 0) ⇒ `Model`

GPT-2 Large: 774M parameters

Parameters:

device_id (Integer) (defaults to: 0)

Returns:

(Model)

# File 'lib/nnw/ai/transformer/model.rb', line 165

def self.gpt2_large(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 1280,
    num_heads: 20,
    num_layers: 36,
    ff_dim: 5120,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end

.gpt2_medium(device_id: 0) ⇒ `Model`

GPT-2 Medium: 345M parameters

Parameters:

device_id (Integer) (defaults to: 0)

Returns:

(Model)

# File 'lib/nnw/ai/transformer/model.rb', line 147

def self.gpt2_medium(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 1024,
    num_heads: 16,
    num_layers: 24,
    ff_dim: 4096,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end

.gpt2_small(device_id: 0) ⇒ `Model`

GPT-2 Small: 124M parameters

Parameters:

device_id (Integer) (defaults to: 0)

Returns:

(Model)

# File 'lib/nnw/ai/transformer/model.rb', line 129

def self.gpt2_small(device_id: 0)
  new(
    vocab_size: 50257,
    embed_dim: 768,
    num_heads: 12,
    num_layers: 12,
    ff_dim: 3072,
    max_seq_len: 1024,
    dropout: 0.1,
    activation: :gelu,
    pre_norm: true,
    device_id: device_id
  )
end

Instance Method Details

#decode_step(token_id, cache) ⇒ `Tensor`

Incremental forward pass for ONE new token using a KV cache (decode path). The result is equivalent to the last-position logits of a full forward pass over the whole prefix, but each step costs O(prefix) instead of O(prefix²): only the new token is embedded and projected, and its query attends over the cached K/V. Must run under Tape.no_grad (no autograd). Cache entries are appended in prefix order, so callers feed the prompt token by token before sampling. Raises a RuntimeError if the cache already holds max_seq_len positions.

Parameters:

token_id (Integer): the new token’s id
cache (KVCache)

Returns:

(Tensor): logits [1, vocab]
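The calling pattern described above (feed the prompt token by token, then sample) can be sketched as a greedy decoding loop. `ToyModel`, `ToyCache`, and `greedy_generate` are hypothetical stand-ins so the example runs without Ignis; in particular, the toy `decode_step` returns a plain Array of logits rather than a Tensor, and the Tape.no_grad wrapper required by the real method is omitted.

```ruby
# Greedy decoding over a decode_step-style interface.
# ToyCache/ToyModel are HYPOTHETICAL stand-ins, not Ignis classes.
class ToyCache
  attr_reader :length

  def initialize
    @length = 0
  end

  def advance!
    @length += 1
  end
end

class ToyModel
  VOCAB = 5

  def make_kv_cache
    ToyCache.new
  end

  # Toy rule: always favour the token after the one just seen.
  def decode_step(token_id, cache)
    cache.advance!
    Array.new(VOCAB) { |i| i == (token_id + 1) % VOCAB ? 1.0 : 0.0 }
  end
end

def greedy_generate(model, prompt_ids, max_new_tokens:)
  raise ArgumentError, "prompt must not be empty" if prompt_ids.empty?

  cache = model.make_kv_cache
  logits = nil
  # Prefill: feed the prompt in order so the cache matches the prefix.
  prompt_ids.each { |id| logits = model.decode_step(id, cache) }

  out = prompt_ids.dup
  max_new_tokens.times do |step|
    next_id = logits.each_with_index.max_by { |value, _index| value }.last # argmax
    out << next_id
    # Skip the final forward step: its logits would never be used.
    logits = model.decode_step(next_id, cache) if step < max_new_tokens - 1
  end
  out
end

p greedy_generate(ToyModel.new, [0, 1], max_new_tokens: 3)
```

Each generated token costs one decode_step call, and the cache grows by one position per call, so prompt length plus generated tokens has to stay within max_seq_len.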

# File 'lib/nnw/ai/transformer/model.rb', line 107

def decode_step(token_id, cache)
  pos = cache.length
  raise "KVCache full: position #{pos} exceeds max_seq_len #{@max_seq_len}" if pos >= @max_seq_len

  tok = Tensor.from_host([token_id], shape: [1], dtype: :int32, device_id: @device_id)
  pos_t = Tensor.from_host([pos], shape: [1], dtype: :int32, device_id: @device_id)

  x = @token_embedding.call(tok) + @position_embedding.call(pos_t) # [1, embed]
  @blocks.each_with_index { |block, i| x = block.decode_step(x, cache, i) }
  cache.advance!

  x = @norm.call(x)
  @head.call(x) # [1, vocab]
end
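To illustrate why the cached path matches the last position of a full pass at lower cost, here is a minimal single-head attention sketch in plain Ruby. It is not the library's implementation: Arrays stand in for tensors, the Q/K/V projections are taken to be the identity for brevity, and `ToyKVCache`, `attend`, `full_forward_last`, and `toy_decode_step` are made-up names.

```ruby
# Full causal attention vs. cached incremental attention (toy, single head).
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

def softmax(scores)
  max = scores.max
  exps = scores.map { |s| Math.exp(s - max) }
  total = exps.sum
  exps.map { |e| e / total }
end

# Attention output for one query over the given keys and values.
def attend(query, keys, values)
  scale = Math.sqrt(query.size)
  weights = softmax(keys.map { |k| dot(query, k) / scale })
  Array.new(values.first.size) do |d|
    weights.each_with_index.sum { |w, i| w * values[i][d] }
  end
end

# Full pass: recompute every position, return the last output and the
# number of query-key scores evaluated (1 + 2 + ... + n).
def full_forward_last(xs)
  scores = 0
  outputs = xs.each_index.map do |t|
    scores += t + 1
    attend(xs[t], xs[0..t], xs[0..t])
  end
  [outputs.last, scores]
end

class ToyKVCache
  attr_reader :keys, :values

  def initialize
    @keys = []
    @values = []
  end

  def append(key, value)
    @keys << key
    @values << value
  end
end

# Incremental pass: only the new token's query runs against the cache.
def toy_decode_step(x, cache)
  cache.append(x, x) # identity K/V projections (simplification)
  [attend(x, cache.keys, cache.values), cache.keys.size]
end

xs = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [1.5, -1.0]]
full_out, full_scores = full_forward_last(xs)

cache = ToyKVCache.new
step_out = step_scores = nil
xs.each { |x| step_out, step_scores = toy_decode_step(x, cache) }

puts "outputs match: #{full_out.zip(step_out).all? { |a, b| (a - b).abs < 1e-12 }}"
puts "scores: full=#{full_scores}, last cached step=#{step_scores}"
```

For a four-token prefix the full pass evaluates 10 query-key scores, while the final cached step evaluates 4. That is the quadratic versus linear growth the method description refers to.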

#forward(input_ids, mask: nil) ⇒ `Tensor`

Forward pass: returns logits.

Parameters:

input_ids (Tensor): token indices [batch_size, seq_len] (int32)
mask (Tensor, nil) (defaults to: nil): attention mask

Returns:

(Tensor): logits [batch_size * seq_len, vocab_size]

# File 'lib/nnw/ai/transformer/model.rb', line 62

def forward(input_ids, mask: nil)
  seq_len = input_ids.shape[-1]

  # Create position indices
  positions_data = (0...seq_len).to_a
  pos_nv = Ignis::Shared::NvArray.new(shape: [seq_len], dtype: :int32,
                                     device_id: input_ids.device_id)
  pos_nv.from_host(positions_data)
  positions = Tensor.new(data: pos_nv, requires_grad: false)

  # Embeddings
  tok_emb = @token_embedding.call(input_ids)   # [batch, seq, embed]
  pos_emb = @position_embedding.call(positions) # [seq, embed]

  # Combine and dropout
  x = tok_emb + pos_emb
  x = @dropout.call(x)

  # Transformer blocks
  @blocks.each do |block|
    x = block.call(x, mask: mask)
  end

  # Final norm and LM head
  x = @norm.call(x)
  @head.call(x)  # → logits [batch*seq, vocab]
end
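The embedding step adds a [seq, embed] position tensor to a [batch, seq, embed] token tensor, so the same position vectors are reused for every sequence in the batch. A plain-Ruby sketch of that sum follows; nested Arrays stand in for tensors, the table values are invented for illustration, and `embed_with_positions` is a hypothetical helper.

```ruby
# Token + position embedding sum, position table shared across the batch.
def embed_with_positions(input_ids, token_table, position_table)
  input_ids.map do |sequence|                        # batch dimension
    sequence.each_with_index.map do |token_id, pos|  # sequence dimension
      tok_vec = token_table.fetch(token_id)
      pos_vec = position_table.fetch(pos)            # IndexError if pos >= max_seq_len
      tok_vec.zip(pos_vec).map { |t, p| t + p }
    end
  end
end

TOKEN_TABLE    = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]].freeze # vocab_size = 3, embed_dim = 2
POSITION_TABLE = [[0.5, 0.5], [0.25, 0.25]].freeze           # max_seq_len = 2

batch = [[1, 2], [2, 0]]                                     # [batch_size = 2, seq_len = 2]
p embed_with_positions(batch, TOKEN_TABLE, POSITION_TABLE)
# => [[[1.5, 0.5], [0.25, 1.25]], [[0.5, 1.5], [0.25, 0.25]]]
```

The sketch raises as soon as a sequence runs past the position table. #forward as listed does not compare seq_len with max_seq_len itself, and what NN::Embedding does with an out-of-range index is not shown here, so callers may want to check the length before calling it.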

#make_kv_cache(device_id: @device_id) ⇒ `KVCache`

Allocate a fresh KV cache sized for this model.

Parameters:

device_id (Integer) (defaults to: @device_id)

Returns:

(KVCache)

# File 'lib/nnw/ai/transformer/model.rb', line 93

def make_kv_cache(device_id: @device_id)
  KVCache.new(num_layers: @num_layers, max_seq_len: @max_seq_len,
              embed_dim: @embed_dim, device_id: device_id)
end
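Since the cache is sized from num_layers, max_seq_len, and embed_dim, its footprint can be estimated up front. The sketch below assumes one K and one V buffer per layer, each max_seq_len × embed_dim, stored as 4-byte floats with no padding. KVCache's actual layout and dtype are not shown on this page, so treat `kv_cache_bytes` as a rough, hypothetical estimate.

```ruby
# Rough KV-cache footprint.
# ASSUMPTIONS: float32 elements, one K and one V buffer per layer,
# batch size 1, no alignment padding.
def kv_cache_bytes(num_layers:, max_seq_len:, embed_dim:, bytes_per_element: 4)
  2 * num_layers * max_seq_len * embed_dim * bytes_per_element
end

small = kv_cache_bytes(num_layers: 12, max_seq_len: 1024, embed_dim: 768)
large = kv_cache_bytes(num_layers: 36, max_seq_len: 1024, embed_dim: 1280)

puts format("gpt2_small: %.1f MiB", small / 1024.0**2) # 72.0 MiB
puts format("gpt2_large: %.1f MiB", large / 1024.0**2) # 360.0 MiB
```

Under these assumptions the cache is allocated for the full max_seq_len up front, so the cost is the same whether a generation uses ten positions or a thousand.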

#to_s ⇒ `String`

Returns:

(String)

# File 'lib/nnw/ai/transformer/model.rb', line 181

def to_s
  "TransformerModel(vocab=#{@vocab_size}, embed=#{@embed_dim}, " \
  "heads=#{@num_heads}, layers=#{@num_layers}, " \
  "params=#{num_parameters})"
end
