Module: Ignis::AI::LlamaLoader

Defined in: lib/nnw/ai/llama_loader.rb

Overview

LlamaLoader — load HuggingFace Llama-family checkpoints (Llama-3.x, and other LlamaForCausalLM models: SmolLM, TinyLlama, …) into a Transformer::ModernModel.

Reads config.json to size the model (RoPE base/scaling, GQA head counts, RMSNorm eps, tied embeddings), then loads model.safetensors, upcasting bf16 weights to fp32 on-device. HF’s nn.Linear weights are stored as [out, in] and applied as x·Wᵀ, the same convention as Ignis::AI::NN::Linear, so weights map across with no transpose (unlike GPT-2’s Conv1D, which stores [in, out]).
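The two conventions above can be illustrated in plain Ruby. This is a minimal CPU-side sketch, not the loader's on-device code: the helper names bf16_to_f32 and linear_forward are hypothetical, and it assumes little-endian bf16 data as stored in a safetensors file. A bf16 value is the upper 16 bits of an fp32 value, so widening it means shifting the bits left by 16; an [out, in] weight applied as x·Wᵀ means each output is the dot product of x with one row of W.

```ruby
# Hypothetical helper: widen raw little-endian bfloat16 bytes to Ruby Floats.
# bf16 keeps the sign, exponent, and top 7 mantissa bits of fp32, so the
# fp32 bit pattern is the bf16 pattern followed by 16 zero bits.
def bf16_to_f32(raw_bytes)
  raw_bytes.unpack("S<*").map do |bits|
    [bits << 16].pack("L<").unpack1("e")
  end
end

# Hypothetical helper: apply an [out, in] weight as x·Wᵀ.
# Each row of w holds the input weights for one output feature.
def linear_forward(x, w)
  w.map { |row| row.zip(x).sum { |wi, xi| wi * xi } }
end

raw = [0x3F80, 0xC000, 0x3F00].pack("S<*") # bf16 bit patterns
values = bf16_to_f32(raw)                  # => [1.0, -2.0, 0.5]

w = [[1.0, 2.0, 3.0],                      # out = 2, in = 3
     [0.0, 1.0, 0.0]]
y = linear_forward([1.0, 1.0, 2.0], w)     # => [9.0, 1.0]
```

Because both sides store rows as output features, a tensor read from the checkpoint can be copied into the matching layer without reordering its elements.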

Class Method Summary

.from_pretrained(dir, device_id: 0) ⇒ Transformer::ModernModel

Build a ModernModel from config.json and load its weights.

.load(model, dir, device_id: 0) ⇒ Integer

Load weights from dir/model.safetensors into an existing ModernModel.

Class Method Details

.from_pretrained(dir, device_id: 0) ⇒ `Transformer::ModernModel`

Build a ModernModel from config.json and load its weights.

Parameters:

dir (String): directory containing config.json and model.safetensors
device_id (Integer) (defaults to: 0)

Returns:

(Transformer::ModernModel)

# File 'lib/nnw/ai/llama_loader.rb', line 21

def from_pretrained(dir, device_id: 0)
  cfg = JSON.parse(File.read(File.join(dir, "config.json")))
  model = Transformer::ModernModel.new(
    vocab_size:   cfg["vocab_size"],
    embed_dim:    cfg["hidden_size"],
    num_heads:    cfg["num_attention_heads"],
    num_kv_heads: cfg["num_key_value_heads"] || cfg["num_attention_heads"],
    num_layers:   cfg["num_hidden_layers"],
    ff_dim:       cfg["intermediate_size"],
    max_seq_len:  cfg["max_position_embeddings"],
    rope_base:    (cfg["rope_theta"] || 10000.0).to_f,
    rope_scaling: cfg["rope_scaling"],
    head_dim:     cfg["head_dim"],
    eps:          (cfg["rms_norm_eps"] || 1e-5).to_f,
    device_id:    device_id
  )
  load(model, dir, device_id: device_id)
  model
end
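The fallbacks in the listing above can be checked without constructing a model. The sketch below copies the optional-key handling into a hypothetical helper, llama_kwargs, and runs it on a made-up minimal config; neither the helper nor the sample values come from the library.

```ruby
require "json"

# Hypothetical helper: mirrors the optional-key handling in from_pretrained
# so the resolved values can be inspected on their own.
def llama_kwargs(cfg)
  {
    num_heads:    cfg["num_attention_heads"],
    # Plain multi-head attention when the config has no GQA head count.
    num_kv_heads: cfg["num_key_value_heads"] || cfg["num_attention_heads"],
    rope_base:    (cfg["rope_theta"] || 10000.0).to_f,
    rope_scaling: cfg["rope_scaling"], # nil when the key is absent
    head_dim:     cfg["head_dim"],     # nil when the key is absent
    eps:          (cfg["rms_norm_eps"] || 1e-5).to_f
  }
end

# Made-up config: no num_key_value_heads, rms_norm_eps, or head_dim keys.
cfg = JSON.parse('{"num_attention_heads": 32, "rope_theta": 500000}')
kwargs = llama_kwargs(cfg)

kwargs[:num_kv_heads] # => 32 (falls back to num_attention_heads)
kwargs[:rope_base]    # => 500000.0 (integer in JSON, coerced by to_f)
kwargs[:eps]          # => 1.0e-05 (default)
```

Note that rope_scaling and head_dim are passed through as nil when missing, so how they are resolved is up to Transformer::ModernModel.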

.load(model, dir, device_id: 0) ⇒ `Integer`

Load weights from dir/model.safetensors into an existing ModernModel.