Module: Ignis::AI::Safetensors

Defined in:
lib/nnw/ai/safetensors.rb

Overview

Safetensors — pure Ruby loader/saver for the HuggingFace safetensors format.

Format:

8 bytes: LE uint64 header length
N bytes: JSON header {"tensor_name": {"dtype": "F16", "shape": [d1,d2], "data_offsets": [begin, end]}}
Remaining: raw tensor data (contiguous, little-endian, row-major)
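
The layout above can be walked through in plain Ruby. The following is a minimal sketch, not the module's own code: `parse_safetensors_header` is a hypothetical helper, and the in-memory blob with a single tensor `"w"` is made up for illustration.

```ruby
require "json"
require "stringio"

# Hypothetical helper, not part of Ignis: splits a safetensors byte string
# into its parsed JSON header and the offset where tensor data begins.
def parse_safetensors_header(bytes, max_header_size: 100 * 1024 * 1024)
  io = StringIO.new(bytes)
  size_field = io.read(8)
  raise ArgumentError, "truncated length prefix" unless size_field && size_field.bytesize == 8

  header_size = size_field.unpack1("Q<") # little-endian uint64
  raise ArgumentError, "header too large: #{header_size}" if header_size > max_header_size

  header_json = io.read(header_size)
  raise ArgumentError, "truncated header" unless header_json && header_json.bytesize == header_size

  [JSON.parse(header_json), 8 + header_size]
end

# Build a tiny in-memory file: one F32 tensor of shape [2].
header = JSON.generate({ "w" => { "dtype" => "F32", "shape" => [2], "data_offsets" => [0, 8] } })
blob = [header.bytesize].pack("Q<") + header + [1.0, 2.0].pack("e*")

parsed, data_start = parse_safetensors_header(blob)
begin_offset, end_offset = parsed["w"]["data_offsets"]
values = blob.byteslice(data_start + begin_offset, end_offset - begin_offset).unpack("e*")
```

Note that `data_offsets` are relative to the start of the data section, so the absolute position of a tensor is `8 + header_size + begin`.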

Zero-copy: reads raw tensor bytes with IO#read and copies them to the GPU with a direct cudaMemcpy.

Constant Summary

DTYPE_MAP =

Dtype mapping: safetensors string → Ignis::Shared::NvArray symbol + Ruby unpack format

{
  "F16"   => { symbol: :float16,  bytes: 2, pack: "v*" },
  "BF16"  => { symbol: :bfloat16, bytes: 2, pack: "v*" },
  "F32"   => { symbol: :float32,  bytes: 4, pack: "e*" },
  "F64"   => { symbol: :float64,  bytes: 8, pack: "E*" },
  "I8"    => { symbol: :uint8,    bytes: 1, pack: "c*" },
  "I16"   => { symbol: :int32,    bytes: 2, pack: "s<*" },
  "I32"   => { symbol: :int32,    bytes: 4, pack: "l<*" },
  "I64"   => { symbol: :int64,    bytes: 8, pack: "q<*" },
  "U8"    => { symbol: :uint8,    bytes: 1, pack: "C*" },
  "BOOL"  => { symbol: :uint8,    bytes: 1, pack: "C*" }
}.freeze
MAX_HEADER_SIZE =

Maximum header size (100 MB; DoS protection per spec)

100 * 1024 * 1024
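
To illustrate how the `pack` entries in DTYPE_MAP relate to raw tensor bytes, here is a small sketch using Ruby's `String#unpack`. `PACK_FORMATS` and `decode_values` are hypothetical names for this example and cover only a subset of the dtypes. One caveat: the `"v*"` format listed for F16 and BF16 yields raw 16-bit integer patterns, not floats, so those dtypes would need a separate conversion step that is not shown here.

```ruby
# Illustrative subset of the pack formats listed in DTYPE_MAP.
PACK_FORMATS = {
  "F32" => { bytes: 4, pack: "e*" },  # little-endian single-precision float
  "I32" => { bytes: 4, pack: "l<*" }, # little-endian signed 32-bit integer
  "U8"  => { bytes: 1, pack: "C*" }   # unsigned 8-bit integer
}.freeze

# Hypothetical helper: decode raw little-endian bytes into Ruby numbers.
def decode_values(raw, dtype)
  info = PACK_FORMATS.fetch(dtype) { raise ArgumentError, "unsupported dtype: #{dtype}" }
  unless (raw.bytesize % info[:bytes]).zero?
    raise ArgumentError, "byte count #{raw.bytesize} is not a multiple of #{info[:bytes]}"
  end
  raw.unpack(info[:pack])
end

f32 = decode_values([1.5, -2.0].pack("e*"), "F32")
i32 = decode_values([-7, 42].pack("l<*"), "I32")
u8  = decode_values("\x00\xFF".b, "U8")
```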

Class Method Summary

Class Method Details

.load(path, device_id: 0) ⇒ Hash{String => Tensor}

Load all tensors from a safetensors file.

Parameters:

  • path (String)

    path to .safetensors file

  • device_id (Integer) (defaults to: 0)

    target GPU device

Returns:

  • (Hash{String => Tensor})

    name → Tensor mapping

Raises:

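  • (ArgumentError)

    if no file exists at path

  • (RuntimeError)

    if the header exceeds MAX_HEADER_SIZE, a tensor has an unsupported dtype, or a tensor's data cannot be read in full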


# File 'lib/nnw/ai/safetensors.rb', line 38

def load(path, device_id: 0)
  raise ArgumentError, "File not found: #{path}" unless File.exist?(path)

  File.open(path, "rb") do |f|
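    header_size_bytes = f.read(8)
    # An empty or truncated file yields nil or fewer than 8 bytes here
    raise "Truncated file: missing 8-byte header length" unless header_size_bytes&.bytesize == 8
    header_size = header_size_bytes.unpack1("Q<")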

    raise "Header size #{header_size} exceeds maximum #{MAX_HEADER_SIZE}" if header_size > MAX_HEADER_SIZE

    header_json = f.read(header_size)
    header = JSON.parse(header_json)

    # Data section starts after 8 + header_size bytes
    data_offset = 8 + header_size

    tensors = {}

    header.each do |name, meta|
      next if name == "__metadata__"

      dtype_str = meta["dtype"]
      shape = meta["shape"]
      offsets = meta["data_offsets"]

      dtype_info = DTYPE_MAP[dtype_str]
      raise "Unsupported dtype: #{dtype_str}" unless dtype_info

      begin_offset = offsets[0]
      end_offset = offsets[1]
      byte_count = end_offset - begin_offset

      # Read raw bytes from file
      f.seek(data_offset + begin_offset)
      raw_bytes = f.read(byte_count)
      raise "Short read for #{name}: expected #{byte_count}, got #{raw_bytes&.length}" unless raw_bytes&.length == byte_count

      # Convert to NvArray on GPU
      nv_array = bytes_to_nv_array(raw_bytes, shape, dtype_info, device_id)
      tensors[name] = Tensor.new(data: nv_array, requires_grad: false)
    end

    tensors
  end
end
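
The listing above takes `data_offsets` from the header as given. A loader could additionally check that each span matches the size implied by `shape` and the dtype's element size. The sketch below shows one way to do that; `expected_byte_count` and `offsets_consistent?` are hypothetical helpers, not part of this module.

```ruby
# Hypothetical check, not part of Ignis: the byte span given by data_offsets
# should equal the element count implied by shape times the element size.
def expected_byte_count(shape, bytes_per_element)
  # An empty shape denotes a scalar, which has exactly one element.
  shape.reduce(1) { |count, dim| count * dim } * bytes_per_element
end

def offsets_consistent?(meta, bytes_per_element)
  begin_offset, end_offset = meta["data_offsets"]
  return false unless begin_offset.is_a?(Integer) && end_offset.is_a?(Integer)
  return false if begin_offset.negative? || end_offset < begin_offset

  (end_offset - begin_offset) == expected_byte_count(meta["shape"], bytes_per_element)
end

good   = { "dtype" => "F16", "shape" => [2, 3], "data_offsets" => [0, 12] }
bad    = { "dtype" => "F16", "shape" => [2, 3], "data_offsets" => [0, 10] }
scalar = { "dtype" => "F32", "shape" => [],     "data_offsets" => [12, 16] }

good_ok   = offsets_consistent?(good, 2)
bad_ok    = offsets_consistent?(bad, 2)
scalar_ok = offsets_consistent?(scalar, 4)
```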

.load_model(model, path, weight_map: nil, strict: false, device_id: 0) ⇒ NN::Module

Load tensors into a model using a weight map.

Parameters:

  • model (NN::Module)

    the model to load into

  • path (String)

    path to .safetensors file

  • weight_map (Hash{String => String}, nil) (defaults to: nil)

    HF name → Ignis name mapping

  • strict (Boolean) (defaults to: false)

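    raise KeyError if any model parameter has no matching tensor in the file (extra tensors in the file are skipped, not rejected)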

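  • device_id (Integer) (defaults to: 0)

    target GPU device

Returns:

  • (NN::Module)

    the model passed in, with matching parameters loaded

Raises:

  • (KeyError)

    if strict is true and model parameters are missing from the file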

# File 'lib/nnw/ai/safetensors.rb', line 90

def load_model(model, path, weight_map: nil, strict: false, device_id: 0)
  tensors = load(path, device_id: device_id)
  model_params = model.named_parameters

  loaded_count = 0
  skipped = []

  tensors.each do |name, tensor|
    mapped_name = weight_map ? (weight_map[name] || name) : name

    if model_params.key?(mapped_name)
      param = model_params[mapped_name]
      if param.shape == tensor.shape
        param.data.from_host(tensor.to_host)
        loaded_count += 1
      else
        Ignis.logger.warn("Shape mismatch for #{mapped_name}: " \
                           "model=#{param.shape} file=#{tensor.shape}")
        skipped << name
      end
    else
      skipped << name
    end
  end

  if strict
    missing = model_params.keys - tensors.keys.map { |n| weight_map ? (weight_map[n] || n) : n }
    unless missing.empty?
      raise KeyError, "Missing weights: #{missing.join(', ')}"
    end
  end

  Ignis.logger.info("Loaded #{loaded_count}/#{tensors.size} tensors, skipped #{skipped.size}")
  model
end
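
The name resolution and the strict check in the listing above can be reproduced with plain hashes and arrays. This is a minimal sketch under stated assumptions: `resolve_names` and `missing_params` are hypothetical helpers, and the tensor and parameter names are invented for illustration.

```ruby
# Hypothetical helper mirroring the name resolution in load_model:
# a file key is renamed through weight_map when an entry exists,
# otherwise it is used unchanged.
def resolve_names(file_keys, weight_map)
  file_keys.to_h { |name| [name, weight_map ? (weight_map[name] || name) : name] }
end

# Parameters the model expects but that no file key resolves to
# (the condition that raises KeyError under strict: true).
def missing_params(model_keys, file_keys, weight_map)
  model_keys - resolve_names(file_keys, weight_map).values
end

# Invented names, for illustration only.
file_keys  = ["model.embed_tokens.weight", "lm_head.weight"]
weight_map = { "model.embed_tokens.weight" => "embedding.weight" }
model_keys = ["embedding.weight", "lm_head.weight", "norm.weight"]

resolved = resolve_names(file_keys, weight_map)
missing  = missing_params(model_keys, file_keys, weight_map)
```

Here `missing` contains only `"norm.weight"`, so a strict load of this hypothetical model would raise, while a non-strict load would leave that parameter at its current value.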

.save(tensors, path, metadata: nil) ⇒ void

This method returns an undefined value.

Save tensors to safetensors format.

Parameters:

  • tensors (Hash{String => Tensor})

    name → Tensor

  • path (String)

    output file path

  • metadata (Hash, nil) (defaults to: nil)

    optional metadata



# File 'lib/nnw/ai/safetensors.rb', line 131

def save(tensors, path, metadata: nil)
  header = {}
  header["__metadata__"] = metadata if metadata

  # Calculate offsets
  current_offset = 0
  tensor_data = []

  tensors.each do |name, tensor|
    host_data = tensor.to_host
    dtype_str = nv_dtype_to_safetensors(tensor.dtype)
    bytes = host_values_to_bytes(host_data, tensor.dtype)
    byte_count = bytes.length

    header[name] = {
      "dtype"        => dtype_str,
      "shape"        => tensor.shape,
      "data_offsets" => [current_offset, current_offset + byte_count]
    }

    tensor_data << bytes
    current_offset += byte_count
  end

  header_json = JSON.generate(header)
  # Pad header to 8-byte alignment (bytesize, not length, so multibyte tensor names are counted in bytes)
  padding = (8 - (header_json.bytesize % 8)) % 8
  header_json += " " * padding

  File.open(path, "wb") do |f|
    f.write([header_json.bytesize].pack("Q<"))
    f.write(header_json)
    tensor_data.each { |d| f.write(d) }
  end
end
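
The write path can be exercised without GPU types by working on already-encoded byte strings. The sketch below is a stand-alone illustration of the same layout (offsets, space padding, length prefix), not the module's implementation. `write_safetensors` and its `{ dtype:, shape:, bytes: }` entry format are assumptions made for this example.

```ruby
require "json"
require "tempfile"

# Hypothetical stand-alone writer: takes name => { dtype:, shape:, bytes: }
# with already-encoded little-endian payloads, so no GPU types are needed.
def write_safetensors(path, entries)
  header = {}
  offset = 0
  entries.each do |name, entry|
    size = entry[:bytes].bytesize
    header[name] = { "dtype" => entry[:dtype], "shape" => entry[:shape],
                     "data_offsets" => [offset, offset + size] }
    offset += size
  end

  header_json = JSON.generate(header)
  # Pad with spaces so the data section starts on an 8-byte boundary.
  header_json += " " * ((8 - header_json.bytesize % 8) % 8)

  File.open(path, "wb") do |f|
    f.write([header_json.bytesize].pack("Q<"))
    f.write(header_json)
    entries.each_value { |entry| f.write(entry[:bytes]) }
  end
end

file = Tempfile.new(["demo", ".safetensors"])
file.close
write_safetensors(file.path, { "w" => { dtype: "F32", shape: [3], bytes: [1.0, 2.0, 3.0].pack("e*") } })

# Read the file back and decode the single tensor.
raw = File.binread(file.path)
header_size = raw.unpack1("Q<")
header = JSON.parse(raw.byteslice(8, header_size))
values = raw.byteslice(8 + header_size, 12).unpack("e*")
file.unlink
```

The trailing spaces are valid JSON whitespace, so a reader can parse the padded header without trimming it first.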