Module: Ignis::AI::Safetensors

Defined in: lib/nnw/ai/safetensors.rb

Overview

Safetensors — pure Ruby loader/saver for the HuggingFace safetensors format.

Format:

8 bytes: LE uint64 header length
N bytes: JSON header {"tensor_name": {"dtype": "F16", "shape": [d1,d2], "data_offsets": [begin, end]}}
Remaining: raw tensor data (contiguous, little-endian, row-major)

Zero-copy: tensor bytes are read with IO.read and copied directly to the GPU with cudaMemcpy.
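The three-part layout above can be exercised without a GPU. The following is a minimal sketch in plain Ruby, not code from this module: it builds a one-tensor payload in memory and parses it back. The tensor name "w" and its values are made up for illustration.

```ruby
require "json"
require "stringio"

# Build a tiny safetensors payload in memory: one F32 tensor named "w" (hypothetical).
values = [1.0, 2.0, 3.0, 4.0]
data   = values.pack("e*") # little-endian float32, 16 bytes
header = JSON.generate(
  { "w" => { "dtype" => "F32", "shape" => [2, 2], "data_offsets" => [0, data.bytesize] } }
)
blob = [header.bytesize].pack("Q<") + header.b + data

# Parse it back, following the layout step by step.
io          = StringIO.new(blob)
header_size = io.read(8).unpack1("Q<")                    # 8 bytes: LE uint64 header length
header_json = io.read(header_size).force_encoding(Encoding::UTF_8)
parsed      = JSON.parse(header_json)                     # N bytes: JSON header
first, last = parsed.fetch("w")["data_offsets"]           # relative to the data section
io.seek(8 + header_size + first)
decoded     = io.read(last - first).unpack("e*")

p decoded # => [1.0, 2.0, 3.0, 4.0]
```

Note that data_offsets are relative to the start of the data section, so the absolute file position is 8 + header length + begin.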

Constant Summary

DTYPE_MAP = Dtype mapping: safetensors string → Ignis::Shared::NvArray symbol + Ruby unpack format

{
  "F16"   => { symbol: :float16,  bytes: 2, pack: "v*" },
  "BF16"  => { symbol: :bfloat16, bytes: 2, pack: "v*" },
  "F32"   => { symbol: :float32,  bytes: 4, pack: "e*" },
  "F64"   => { symbol: :float64,  bytes: 8, pack: "E*" },
  "I8"    => { symbol: :uint8,    bytes: 1, pack: "c*" },  # NOTE: signed bytes are labeled :uint8 (same symbol as U8)
  "I16"   => { symbol: :int32,    bytes: 2, pack: "s<*" }, # NOTE: 16-bit data is labeled :int32 (no 16-bit symbol appears in this map)
  "I32"   => { symbol: :int32,    bytes: 4, pack: "l<*" },
  "I64"   => { symbol: :int64,    bytes: 8, pack: "q<*" },
  "U8"    => { symbol: :uint8,    bytes: 1, pack: "C*" },
  "BOOL"  => { symbol: :uint8,    bytes: 1, pack: "C*" }
}.freeze
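To illustrate what the pack directives in this map do, here is a small sketch using only String#unpack and Array#pack. The directives "e*" and "s<*" decode directly to Ruby numbers, while "v*" returns unsigned 16-bit integers, so F16 and BF16 data arrives as raw bit patterns. The helpers half_to_float and bfloat16_to_float are hypothetical and written for this example; how the module itself interprets those bits is not shown in this page.

```ruby
# Directives that decode straight to Ruby numbers.
f32 = [1.5, -2.0].pack("e*").unpack("e*")   # little-endian float32
i16 = [-1, 300].pack("s<*").unpack("s<*")   # little-endian signed 16-bit

# Hypothetical helper: interpret one IEEE 754 binary16 bit pattern.
def half_to_float(bits)
  sign     = (bits >> 15).zero? ? 1.0 : -1.0
  exponent = (bits >> 10) & 0x1F
  fraction = bits & 0x3FF
  case exponent
  when 0    then sign * fraction * 2.0**-24 # zero and subnormals
  when 0x1F then fraction.zero? ? sign * Float::INFINITY : Float::NAN
  else           sign * (1.0 + fraction / 1024.0) * 2.0**(exponent - 15)
  end
end

# Hypothetical helper: bfloat16 is the upper 16 bits of a float32.
def bfloat16_to_float(bits)
  [bits << 16].pack("L<").unpack1("e")
end

f16_bits = [0x3C00, 0xC000].pack("v*").unpack("v*") # raw patterns, not floats
f16      = f16_bits.map { |b| half_to_float(b) }
bf16     = bfloat16_to_float(0x3F80)

p f32, i16, f16, bf16
```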

MAX_HEADER_SIZE = Maximum header size (100 MB; DoS protection per spec)

100 * 1024 * 1024

Class Method Summary

.load(path, device_id: 0) ⇒ Hash{String => Tensor}

Load all tensors from a safetensors file.
.load_model(model, path, weight_map: nil, strict: false, device_id: 0) ⇒ NN::Module

Load tensors into a model using a weight map.
.save(tensors, path, metadata: nil) ⇒ void

Save tensors to safetensors format.

Class Method Details

.load(path, device_id: 0) ⇒ `Hash{String => Tensor}`

Load all tensors from a safetensors file.

Parameters:

path (String) —

path to .safetensors file
device_id (Integer) (defaults to: 0) —

target GPU device

Returns:

(Hash{String => Tensor}) —

name → Tensor mapping

Raises:

(ArgumentError)
(RuntimeError)

# File 'lib/nnw/ai/safetensors.rb', line 38

def load(path, device_id: 0)
  raise ArgumentError, "File not found: #{path}" unless File.exist?(path)

  File.open(path, "rb") do |f|
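    header_size_bytes = f.read(8)
    # An empty or truncated file yields nil or fewer than 8 bytes
    raise "Truncated file: missing 8-byte header length" unless header_size_bytes&.bytesize == 8
    header_size = header_size_bytes.unpack1("Q<")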

    raise "Header size #{header_size} exceeds maximum #{MAX_HEADER_SIZE}" if header_size > MAX_HEADER_SIZE

    header_json = f.read(header_size)
    header = JSON.parse(header_json)

    # Data section starts after 8 + header_size bytes
    data_offset = 8 + header_size

    tensors = {}

    header.each do |name, meta|
      next if name == "__metadata__"

      dtype_str = meta["dtype"]
      shape = meta["shape"]
      offsets = meta["data_offsets"]

      dtype_info = DTYPE_MAP[dtype_str]
      raise "Unsupported dtype: #{dtype_str}" unless dtype_info

      begin_offset = offsets[0]
      end_offset = offsets[1]
      byte_count = end_offset - begin_offset

      # Read raw bytes from file
      f.seek(data_offset + begin_offset)
      raw_bytes = f.read(byte_count)
      raise "Short read for #{name}: expected #{byte_count}, got #{raw_bytes&.length}" unless raw_bytes&.length == byte_count

      # Convert to NvArray on GPU
      nv_array = bytes_to_nv_array(raw_bytes, shape, dtype_info, device_id)
      tensors[name] = Tensor.new(data: nv_array, requires_grad: false)
    end

    tensors
  end
end
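Because the header is self-describing, the tensor list can be inspected without reading any tensor data. The sketch below shows one way to do that in plain Ruby with the same size guard that .load applies. read_header and its max_header: keyword are hypothetical names for this example, not part of Ignis::AI::Safetensors.

```ruby
require "json"
require "stringio"

# Hypothetical helper: return the tensor entries of a safetensors header.
def read_header(io, max_header: 100 * 1024 * 1024)
  prefix = io.read(8)
  raise ArgumentError, "truncated file: missing 8-byte length prefix" unless prefix&.bytesize == 8

  header_size = prefix.unpack1("Q<")
  raise ArgumentError, "header size #{header_size} exceeds #{max_header}" if header_size > max_header

  json = io.read(header_size)
  raise ArgumentError, "truncated header" unless json&.bytesize == header_size

  # "__metadata__" is not a tensor entry, so drop it from the listing.
  JSON.parse(json.force_encoding(Encoding::UTF_8)).reject { |name, _| name == "__metadata__" }
end

# Usage with an in-memory payload (tensor name "w" is made up).
header = JSON.generate({ "w" => { "dtype" => "F32", "shape" => [2], "data_offsets" => [0, 8] } })
blob   = [header.bytesize].pack("Q<") + header.b + [1.0, 2.0].pack("e*")
info   = read_header(StringIO.new(blob))
info.each { |name, meta| puts "#{name}: #{meta['dtype']} #{meta['shape'].inspect}" }
```

With a real file, File.open(path, "rb") can be passed in place of the StringIO.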

.load_model(model, path, weight_map: nil, strict: false, device_id: 0) ⇒ `NN::Module`

Load tensors into a model using a weight map.

Parameters:

model (NN::Module) —

the model to load into
path (String) —

path to .safetensors file
weight_map (Hash{String => String}, nil) (defaults to: nil) —

HF name → Ignis name mapping
strict (Boolean) (defaults to: false) —

raise KeyError when a model parameter has no matching tensor in the file (extra tensors in the file are only skipped)
device_id (Integer) (defaults to: 0)

Returns:

(NN::Module)

# File 'lib/nnw/ai/safetensors.rb', line 90

def load_model(model, path, weight_map: nil, strict: false, device_id: 0)
  tensors = load(path, device_id: device_id)
  model_params = model.named_parameters

  loaded_count = 0
  skipped = []

  tensors.each do |name, tensor|
    mapped_name = weight_map ? (weight_map[name] || name) : name

    if model_params.key?(mapped_name)
      param = model_params[mapped_name]
      if param.shape == tensor.shape
        param.data.from_host(tensor.to_host)
        loaded_count += 1
      else
        Ignis.logger.warn("Shape mismatch for #{mapped_name}: " \
                           "model=#{param.shape} file=#{tensor.shape}")
        skipped << name
      end
    else
      skipped << name
    end
  end

  if strict
    missing = model_params.keys - tensors.keys.map { |n| weight_map ? (weight_map[n] || n) : n }
    unless missing.empty?
      raise KeyError, "Missing weights: #{missing.join(', ')}"
    end
  end

  Ignis.logger.info("Loaded #{loaded_count}/#{tensors.size} tensors, skipped #{skipped.size}")
  model
end
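The name-resolution rules in load_model can be summarized with plain name lists. This is a simplified sketch under stated assumptions: resolve_names is a hypothetical helper, the parameter names are invented, and shape mismatches (which the method also counts as skipped) are left out.

```ruby
# Hypothetical helper mirroring the mapping logic: a weight_map entry wins,
# otherwise the file name is used unchanged.
def resolve_names(file_names, model_names, weight_map: nil)
  mapped = file_names.map { |n| weight_map ? (weight_map[n] || n) : n }
  {
    loaded:  mapped & model_names, # present in both the file and the model
    skipped: mapped - model_names, # extra tensors in the file, never an error
    missing: model_names - mapped  # what strict: true reports via KeyError
  }
end

file_names  = ["model.embed_tokens.weight", "lm_head.weight", "rotary.inv_freq"]
model_names = ["embedding.weight", "lm_head.weight", "norm.weight"]
weight_map  = { "model.embed_tokens.weight" => "embedding.weight" }

result = resolve_names(file_names, model_names, weight_map: weight_map)
p result[:loaded]  # => ["embedding.weight", "lm_head.weight"]
p result[:skipped] # => ["rotary.inv_freq"]
p result[:missing] # => ["norm.weight"]
```

In this example strict: true would raise because "norm.weight" has no source tensor, while "rotary.inv_freq" is silently skipped in either mode.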

.save(tensors, path, metadata: nil) ⇒ `void`

This method returns an undefined value.

Save tensors to safetensors format.

Parameters:

tensors (Hash{String => Tensor}) —

name → Tensor
path (String) —

output file path
metadata (Hash, nil) (defaults to: nil) —

optional metadata, written under the "__metadata__" header key

# File 'lib/nnw/ai/safetensors.rb', line 131

def save(tensors, path, metadata: nil)
  header = {}
  header["__metadata__"] = metadata if metadata

  # Calculate offsets
  current_offset = 0
  tensor_data = []

  tensors.each do |name, tensor|
    host_data = tensor.to_host
    dtype_str = nv_dtype_to_safetensors(tensor.dtype)
    bytes = host_values_to_bytes(host_data, tensor.dtype)
    byte_count = bytes.length

    header[name] = {
      "dtype"        => dtype_str,
      "shape"        => tensor.shape,
      "data_offsets" => [current_offset, current_offset + byte_count]
    }

    tensor_data << bytes
    current_offset += byte_count
  end

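  header_json = JSON.generate(header)
  # Pad header to 8-byte alignment. Measure in bytes, not characters, so
  # non-ASCII tensor names or metadata cannot corrupt the length prefix.
  padding = (8 - (header_json.bytesize % 8)) % 8
  header_json += " " * padding

  File.open(path, "wb") do |f|
    f.write([header_json.bytesize].pack("Q<"))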
    f.write(header_json)
    tensor_data.each { |d| f.write(d) }
  end
end
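The offset bookkeeping and header padding can be checked in isolation. The sketch below uses raw byte strings in place of tensors (the names "a" and "b" are made up) and measures the header with String#bytesize, since the length prefix counts bytes. Trailing spaces are valid JSON whitespace, so the padded header still parses.

```ruby
require "json"

# Byte payloads stand in for tensors; names and dtypes are illustrative.
payloads = {
  "a" => { dtype: "F32", shape: [2], bytes: [1.0, 2.0].pack("e*") },
  "b" => { dtype: "I64", shape: [1], bytes: [7].pack("q<") }
}

offset = 0
header = {}
payloads.each do |name, t|
  size = t[:bytes].bytesize
  header[name] = {
    "dtype"        => t[:dtype],
    "shape"        => t[:shape],
    "data_offsets" => [offset, offset + size] # contiguous, relative to data start
  }
  offset += size
end

json = JSON.generate(header)
json += " " * ((8 - json.bytesize % 8) % 8) # pad with spaces to an 8-byte boundary
data_start = 8 + json.bytesize              # 8-byte prefix + padded header

p header["b"]["data_offsets"] # => [8, 16]
p data_start % 8              # => 0
```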
