Module: Ignis::AI::Safetensors
- Defined in: lib/nnw/ai/safetensors.rb
Overview
Safetensors — pure Ruby loader/saver for the HuggingFace safetensors format.
Format:
8 bytes: LE uint64 header length
N bytes: JSON header {"tensor_name": {"dtype": "F16", "shape": [d1,d2], "data_offsets": [begin, end]}}
Remaining: raw tensor data (contiguous, little-endian, row-major)
Zero-copy: uses IO.read + direct cudaMemcpy to GPU.
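The layout described above can be exercised without a GPU. The following is a minimal sketch in plain Ruby that builds a tiny in-memory blob and parses it back; the helper name `read_safetensors_header` and the tensor name `"w"` are illustrative assumptions and not part of this module.

```ruby
require "json"
require "stringio"

# Build a tiny in-memory safetensors blob: one F32 tensor "w" of shape [2].
header_json = JSON.generate(
  "w" => { "dtype" => "F32", "shape" => [2], "data_offsets" => [0, 8] }
)
payload = [1.5, -2.0].pack("e*") # two little-endian float32 values
blob = [header_json.bytesize].pack("Q<") + header_json.b + payload

# Hypothetical helper: reads the length prefix and the JSON header,
# and returns the header plus the absolute offset of the data section.
def read_safetensors_header(io)
  header_size = io.read(8).unpack1("Q<") # 8 bytes: LE uint64 header length
  json = io.read(header_size).force_encoding(Encoding::UTF_8)
  [JSON.parse(json), 8 + header_size]    # data starts right after the header
end

io = StringIO.new(blob)
header, data_start = read_safetensors_header(io)

entry = header.fetch("w")
first, last = entry["data_offsets"]      # offsets are relative to data_start
io.seek(data_start + first)
values = io.read(last - first).unpack("e*")

puts "#{entry['dtype']} #{entry['shape'].inspect} #{values.inspect}"
```

Note that `data_offsets` are relative to the start of the data section, not to the start of the file, which is why the reader adds `8 + header_size` before seeking.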
Constant Summary
- DTYPE_MAP =
Dtype mapping: safetensors string → Ignis::Shared::NvArray symbol + Ruby unpack format
{
  "F16"  => { symbol: :float16,  bytes: 2, pack: "v*" },
  "BF16" => { symbol: :bfloat16, bytes: 2, pack: "v*" },
  "F32"  => { symbol: :float32,  bytes: 4, pack: "e*" },
  "F64"  => { symbol: :float64,  bytes: 8, pack: "E*" },
  "I8"   => { symbol: :uint8,    bytes: 1, pack: "c*" },
  "I16"  => { symbol: :int32,    bytes: 2, pack: "s<*" },
  "I32"  => { symbol: :int32,    bytes: 4, pack: "l<*" },
  "I64"  => { symbol: :int64,    bytes: 8, pack: "q<*" },
  "U8"   => { symbol: :uint8,    bytes: 1, pack: "C*" },
  "BOOL" => { symbol: :uint8,    bytes: 1, pack: "C*" }
}.freeze
- MAX_HEADER_SIZE =
Maximum header size (100 MB; DoS protection per spec)
100 * 1024 * 1024
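The `pack` entries in DTYPE_MAP are `String#unpack` directives, and `bytes` is the per-element width. The sketch below shows how such a table could be used to decode raw bytes and how a header-size guard like MAX_HEADER_SIZE might be applied. The names `DTYPES`, `MAX_HEADER`, `decode`, and `checked_header_size` are assumptions made for illustration; the module's own conversion helper is not shown in this listing.

```ruby
# Hypothetical subset of a dtype table: byte width and String#unpack format.
DTYPES = {
  "F32" => { bytes: 4, pack: "e*" },  # little-endian float32
  "I32" => { bytes: 4, pack: "l<*" }, # little-endian signed int32
  "U8"  => { bytes: 1, pack: "C*" }   # unsigned byte
}.freeze

MAX_HEADER = 100 * 1024 * 1024 # same 100 MB limit as MAX_HEADER_SIZE

# Decode raw little-endian bytes into Ruby numbers, after checking that the
# byte count equals shape.product * bytes-per-element.
def decode(raw, dtype, shape)
  info = DTYPES.fetch(dtype) { raise ArgumentError, "Unsupported dtype: #{dtype}" }
  expected = shape.reduce(1, :*) * info[:bytes]
  unless raw.bytesize == expected
    raise ArgumentError, "expected #{expected} bytes, got #{raw.bytesize}"
  end
  raw.unpack(info[:pack])
end

# Reject absurd header lengths before allocating a buffer for them.
def checked_header_size(prefix)
  size = prefix.unpack1("Q<")
  raise ArgumentError, "Header size #{size} exceeds maximum #{MAX_HEADER}" if size > MAX_HEADER
  size
end

ints   = decode([1, -2, 3].pack("l<*"), "I32", [3])
floats = decode([0.5, 2.0].pack("e*"), "F32", [2, 1])
size   = checked_header_size([64].pack("Q<"))
```

Checking the length prefix before calling `read` matters because the prefix is untrusted input: a corrupted or hostile file could otherwise request a multi-gigabyte allocation.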
Class Method Summary
- .load(path, device_id: 0) ⇒ Hash{String => Tensor}
  Load all tensors from a safetensors file.
- .load_model(model, path, weight_map: nil, strict: false, device_id: 0) ⇒ NN::Module
  Load tensors into a model using a weight map.
- .save(tensors, path, metadata: nil) ⇒ void
  Save tensors to safetensors format.
Class Method Details
.load(path, device_id: 0) ⇒ Hash{String => Tensor}
Load all tensors from a safetensors file.
# File 'lib/nnw/ai/safetensors.rb', line 38
def load(path, device_id: 0)
  raise ArgumentError, "File not found: #{path}" unless File.exist?(path)

  File.open(path, "rb") do |f|
    header_size_bytes = f.read(8)
    header_size = header_size_bytes.unpack1("Q<")
    raise "Header size #{header_size} exceeds maximum #{MAX_HEADER_SIZE}" if header_size > MAX_HEADER_SIZE

    header_json = f.read(header_size)
    header = JSON.parse(header_json)

    # Data section starts after 8 + header_size bytes
    data_offset = 8 + header_size
    tensors = {}

    # NOTE: the block variable's name is missing from this listing; `info` is a stand-in.
    header.each do |name, info|
      next if name == "__metadata__"

      dtype_str = info["dtype"]
      shape = info["shape"]
      offsets = info["data_offsets"]

      dtype_info = DTYPE_MAP[dtype_str]
      raise "Unsupported dtype: #{dtype_str}" unless dtype_info

      begin_offset = offsets[0]
      end_offset = offsets[1]
      byte_count = end_offset - begin_offset

      # Read raw bytes from file
      f.seek(data_offset + begin_offset)
      raw_bytes = f.read(byte_count)
      raise "Short read for #{name}: expected #{byte_count}, got #{raw_bytes&.length}" unless raw_bytes&.length == byte_count

      # Convert to NvArray on GPU
      nv_array = bytes_to_nv_array(raw_bytes, shape, dtype_info, device_id)
      tensors[name] = Tensor.new(data: nv_array, requires_grad: false)
    end

    tensors
  end
end
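The listing above reads each tensor at the offsets given in the header and only detects a problem when the read comes up short. A caller that wants to check a parsed header before touching the data section could do something like the following. This is a sketch under stated assumptions: `validate_entries` and `BYTES_PER_ELEMENT` are hypothetical names, not part of the module, and the check is an optional extra rather than something `.load` performs.

```ruby
# Hypothetical byte widths for a few safetensors dtype strings.
BYTES_PER_ELEMENT = {
  "F16" => 2, "BF16" => 2, "F32" => 4, "F64" => 8, "I32" => 4, "U8" => 1
}.freeze

# Returns a list of human-readable problems found in a parsed header.
# An empty list means every entry is consistent with its dtype and shape
# and fits inside a data section of data_size bytes.
def validate_entries(header, data_size)
  problems = []
  header.each do |name, info|
    next if name == "__metadata__"

    width = BYTES_PER_ELEMENT[info["dtype"]]
    if width.nil?
      problems << "#{name}: unsupported dtype #{info['dtype']}"
      next
    end

    first, last = info["data_offsets"]
    expected = info["shape"].reduce(1, :*) * width
    if first.negative? || first > last || last > data_size
      problems << "#{name}: offsets out of range"
    end
    unless last - first == expected
      problems << "#{name}: expected #{expected} bytes, offsets span #{last - first}"
    end
  end
  problems
end

header = {
  "__metadata__" => { "format" => "pt" },
  "ok"  => { "dtype" => "F32", "shape" => [2, 2], "data_offsets" => [0, 16] },
  "bad" => { "dtype" => "F16", "shape" => [4],    "data_offsets" => [16, 20] }
}
problems = validate_entries(header, 20)
puts problems
```

Here the `"bad"` entry declares four F16 elements (8 bytes) but its offsets span only 4 bytes, so it is reported while `"ok"` passes.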
.load_model(model, path, weight_map: nil, strict: false, device_id: 0) ⇒ NN::Module
Load tensors into a model using a weight map.
# File 'lib/nnw/ai/safetensors.rb', line 90
def load_model(model, path, weight_map: nil, strict: false, device_id: 0)
  tensors = load(path, device_id: device_id)
  model_params = model.named_parameters
  loaded_count = 0
  skipped = []

  tensors.each do |name, tensor|
    mapped_name = weight_map ? (weight_map[name] || name) : name
    if model_params.key?(mapped_name)
      param = model_params[mapped_name]
      if param.shape == tensor.shape
        param.data.from_host(tensor.to_host)
        loaded_count += 1
      else
        Ignis.logger.warn("Shape mismatch for #{mapped_name}: " \
                          "model=#{param.shape} file=#{tensor.shape}")
        skipped << name
      end
    else
      skipped << name
    end
  end

  if strict
    missing = model_params.keys - tensors.keys.map { |n| weight_map ? (weight_map[n] || n) : n }
    unless missing.empty?
      raise KeyError, "Missing weights: #{missing.join(', ')}"
    end
  end

  Ignis.logger.info("Loaded #{loaded_count}/#{tensors.size} tensors, skipped #{skipped.size}")
  model
end
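The name resolution in the listing above can be tried out with plain arrays and hashes. The sketch below mirrors the `weight_map[name] || name` fallback and the `strict` missing-key computation; `plan_load` and the parameter names are hypothetical, and shape checking is deliberately left out.

```ruby
# Hypothetical helper: given the tensor names in a file and the parameter
# names of a model, report which mapped names would load, which file entries
# would be skipped, and which model parameters have no source tensor.
def plan_load(file_names, model_names, weight_map: nil)
  # Same fallback as the listing: unmapped names are used unchanged.
  resolve = ->(name) { weight_map ? (weight_map[name] || name) : name }
  mapped = file_names.map(&resolve)

  {
    loadable: mapped.select { |n| model_names.include?(n) },
    skipped:  file_names.reject { |n| model_names.include?(resolve.call(n)) },
    missing:  model_names - mapped
  }
end

plan = plan_load(
  ["encoder.weight", "encoder.bias", "extra.buffer"],
  ["enc.weight", "enc.bias", "head.weight"],
  weight_map: { "encoder.weight" => "enc.weight", "encoder.bias" => "enc.bias" }
)

p plan
```

With `strict: true`, a non-empty `missing` list is what triggers the `KeyError`. Because `missing` is computed from the names present in the file, a tensor that is skipped for a shape mismatch still counts as present, so `strict` as listed does not fail on shape mismatches.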
.save(tensors, path, metadata: nil) ⇒ void
This method returns an undefined value.
Save tensors to safetensors format.
# File 'lib/nnw/ai/safetensors.rb', line 131
def save(tensors, path, metadata: nil)
  header = {}
  header["__metadata__"] = metadata if metadata

  # Calculate offsets
  current_offset = 0
  tensor_data = []

  tensors.each do |name, tensor|
    host_data = tensor.to_host
    dtype_str = nv_dtype_to_safetensors(tensor.dtype)
    bytes = host_values_to_bytes(host_data, tensor.dtype)
    byte_count = bytes.length

    header[name] = {
      "dtype" => dtype_str,
      "shape" => tensor.shape,
      "data_offsets" => [current_offset, current_offset + byte_count]
    }
    tensor_data << bytes
    current_offset += byte_count
  end

  header_json = JSON.generate(header)

  # Pad header to 8-byte alignment.
  # bytesize rather than length: the length prefix counts bytes, and the two
  # differ when tensor names or metadata contain non-ASCII characters.
  padding = (8 - (header_json.bytesize % 8)) % 8
  header_json += " " * padding

  File.open(path, "wb") do |f|
    f.write([header_json.bytesize].pack("Q<"))
    f.write(header_json)
    tensor_data.each { |d| f.write(d) }
  end
end
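The write path in the listing above (offset bookkeeping, space padding of the header to an 8-byte boundary, then the length prefix, header, and data) can be reproduced with already-packed byte strings. This is a minimal round-trip sketch, assuming a hypothetical `write_safetensors` helper that takes `name => [dtype, shape, bytes]` entries instead of GPU tensors.

```ruby
require "json"
require "tempfile"

# Hypothetical helper: entries maps a tensor name to [dtype, shape, bytes],
# where bytes is an already-packed little-endian String.
def write_safetensors(path, entries)
  header = {}
  offset = 0
  entries.each do |name, (dtype, shape, bytes)|
    header[name] = {
      "dtype" => dtype,
      "shape" => shape,
      "data_offsets" => [offset, offset + bytes.bytesize]
    }
    offset += bytes.bytesize
  end

  json = JSON.generate(header)
  json += " " * ((8 - json.bytesize % 8) % 8) # pad with spaces to 8 bytes

  File.open(path, "wb") do |f|
    f.write([json.bytesize].pack("Q<"))       # LE uint64 header length
    f.write(json)
    entries.each_value { |entry| f.write(entry[2]) }
  end
end

entries = {
  "a" => ["F32", [2], [1.0, 2.0].pack("e*")],
  "b" => ["U8",  [3], [7, 8, 9].pack("C*")]
}

header = data = header_size = nil
Tempfile.create(["demo", ".safetensors"]) do |tmp|
  tmp.close
  write_safetensors(tmp.path, entries)

  raw = File.binread(tmp.path)
  header_size = raw.unpack1("Q<")
  header = JSON.parse(raw.byteslice(8, header_size).force_encoding(Encoding::UTF_8))
  data = raw.byteslice(8 + header_size, raw.bytesize)
end

first, last = header["b"]["data_offsets"]
puts data.byteslice(first, last - first).unpack("C*").inspect
```

Padding with spaces keeps the header valid JSON while letting the data section start on an 8-byte boundary, so a reader does not need to know how much padding was added.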