Class: Ignis::CUDA::Memory

Inherits:
Object
Defined in:
lib/nvruby/cuda/memory.rb

Overview

Manages GPU device memory allocation and transfers.

This class uses the Fiddle-based RuntimeAPI hot-path methods. It does not use FFI::MemoryPointer; all pointers it holds are Fiddle::Pointer instances.

Constructor Details

#initialize(size, device: nil, ptr: nil, owned: true) ⇒ Memory

Returns a new instance of Memory.

Parameters:

  • size (Integer)

    Size in bytes to allocate

  • device (Device, Integer, nil) (defaults to: nil)

    Device to allocate on (nil for current)

  • ptr (Fiddle::Pointer, nil) (defaults to: nil)

    Existing pointer to wrap (nil to allocate new)

  • owned (Boolean) (defaults to: true)

    Whether this object owns the memory and should free it



# File 'lib/nvruby/cuda/memory.rb', line 25

def initialize(size, device: nil, ptr: nil, owned: true)
  @size = size
  @device_index = resolve_device_index(device)
  @owned = owned

  if ptr
    @device_ptr = ptr.is_a?(Fiddle::Pointer) ? ptr : Fiddle::Pointer.new(ptr.to_i)
  else
    @device_ptr = allocate_device_memory
  end

  @freed = false

  if @owned
    captured_ptr = @device_ptr
    captured_device = @device_index
    ObjectSpace.define_finalizer(self, self.class.release_finalizer(captured_ptr, captured_device))
  end
end
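The `ptr:` branch of the constructor accepts either a Fiddle::Pointer or anything that answers `to_i`. The following is a minimal stand-alone sketch of that normalization, using only Ruby's Fiddle library. The helper name `normalize_pointer` is hypothetical and not part of the class.

```ruby
require 'fiddle'

# Hypothetical helper mirroring the constructor's `ptr:` handling:
# pass Fiddle::Pointer instances through, wrap anything else by address.
def normalize_pointer(ptr)
  ptr.is_a?(Fiddle::Pointer) ? ptr : Fiddle::Pointer.new(ptr.to_i)
end

existing = Fiddle::Pointer.new(0x1000)
same     = normalize_pointer(existing) # returned unchanged
wrapped  = normalize_pointer(0x2000)   # Integer address wrapped in a new pointer
```

When wrapping memory that another component allocated, passing `owned: false` keeps this object from freeing it.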

Instance Attribute Details

#device_index ⇒ Integer (readonly)

Returns the device index.

Returns:

  • (Integer)

    Device index



# File 'lib/nvruby/cuda/memory.rb', line 19

def device_index
  @device_index
end

#device_ptr ⇒ Fiddle::Pointer (readonly)

Returns the device pointer.

Returns:

  • (Fiddle::Pointer)

    Device pointer



# File 'lib/nvruby/cuda/memory.rb', line 13

def device_ptr
  @device_ptr
end

#size ⇒ Integer (readonly)

Returns the size in bytes.

Returns:

  • (Integer)

    Size in bytes



# File 'lib/nvruby/cuda/memory.rb', line 16

def size
  @size
end

Class Method Details

.release_finalizer(ptr, device_index) ⇒ Proc

Create a finalizer proc for releasing device memory.

Parameters:

  • ptr (Fiddle::Pointer)

    Device pointer to free

  • device_index (Integer)

    Device the memory is on

Returns:

  • (Proc)


# File 'lib/nvruby/cuda/memory.rb', line 230

def release_finalizer(ptr, device_index)
  ptr_addr = ptr.to_i
  proc do
    begin
      RuntimeAPI.ensure_loaded!
      current = RuntimeAPI.get_device
      RuntimeAPI.set_device(device_index) if current != device_index
      RuntimeAPI.free(Fiddle::Pointer.new(ptr_addr))
      RuntimeAPI.set_device(current) if current != device_index
    rescue StandardError
      # Silently ignore errors during finalization
    end
  end
end
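The finalizer is built in a class method from the raw address rather than inside #initialize. A proc created in an instance method closes over `self`, which would keep the object reachable and defeat the finalizer. The sketch below reproduces that pattern without a GPU. `FakeRuntime` and `TrackedBuffer` are hypothetical stand-ins, not part of the library.

```ruby
# Hypothetical stand-in for the CUDA runtime; it only records freed addresses.
module FakeRuntime
  @freed = []

  class << self
    attr_reader :freed

    def free(address)
      @freed << address
    end
  end
end

class TrackedBuffer
  # Built in a class method so the proc closes over the integer address only.
  # A proc created inside #initialize would capture `self`, so the object
  # could not be collected before the process exits.
  def self.release_finalizer(address)
    proc do
      begin
        FakeRuntime.free(address)
      rescue StandardError
        # Errors are swallowed here, as in the listing above.
      end
    end
  end

  def initialize(address)
    @address = address
    @freed = false
    ObjectSpace.define_finalizer(self, self.class.release_finalizer(address))
  end

  # Explicit release: free once, then drop the finalizer to avoid a double free.
  def free!
    return if @freed

    FakeRuntime.free(@address)
    @freed = true
    ObjectSpace.undefine_finalizer(self)
  end
end

buffer = TrackedBuffer.new(0x1000)
buffer.free!
buffer.free! # second call is a no-op
```

Calling `free!` explicitly is still preferable where possible, since Ruby gives no guarantee about when a finalizer runs.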

Instance Method Details

#address ⇒ Integer

Returns the raw device address.

Returns:

  • (Integer)

    raw address



# File 'lib/nvruby/cuda/memory.rb', line 62

def address
  @device_ptr.to_i
end

#copy_from_device(source, count: nil, stream: nil) ⇒ void

This method returns an undefined value.

Copy data from another device memory.

Parameters:

  • source (Memory)

    Source device memory

  • count (Integer, nil) (defaults to: nil)

    Number of bytes to copy (defaults to the smaller of the source size and this allocation's size)

  • stream (Stream, nil) (defaults to: nil)

    Optional stream for async copy

Raises:

  • (MemoryError)

    If either allocation has been freed, or count exceeds this allocation's size



# File 'lib/nvruby/cuda/memory.rb', line 162

def copy_from_device(source, count: nil, stream: nil)
  raise MemoryError, 'Memory has been freed' if @freed
  raise MemoryError, 'Source memory has been freed' if source.freed?

  count ||= [source.size, @size].min
  raise MemoryError, "Copy count #{count} exceeds allocation size #{@size}" if count > @size

  RuntimeAPI.ensure_loaded!

  if stream
    RuntimeAPI.memcpy_async(
      @device_ptr, source.device_ptr, count,
      RuntimeAPI::MEMCPY_DEVICE_TO_DEVICE, stream.to_ptr
    )
  else
    RuntimeAPI.memcpy(
      @device_ptr, source.device_ptr, count,
      RuntimeAPI::MEMCPY_DEVICE_TO_DEVICE
    )
  end
end
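The count handling in the listing above can be isolated as a small pure function. This is a sketch only: `resolve_copy_count` is a hypothetical name, and it raises ArgumentError in place of the library's MemoryError so that it runs on its own.

```ruby
# Hypothetical helper reproducing the count handling in #copy_from_device.
def resolve_copy_count(source_size, dest_size, count = nil)
  # With no explicit count, copy as much as fits in both allocations.
  count ||= [source_size, dest_size].min
  if count > dest_size
    raise ArgumentError, "Copy count #{count} exceeds allocation size #{dest_size}"
  end

  count
end

resolve_copy_count(1024, 4096)      # => 1024
resolve_copy_count(1024, 4096, 512) # => 512
```

As written in the listing, an explicit count is compared only with the destination size, so callers passing `count:` should make sure it also fits within the source allocation.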

#copy_from_host(host_data, count: nil, stream: nil) ⇒ void

This method returns an undefined value.

Copy data from host to device.

Parameters:

  • host_data (Fiddle::Pointer, String)

    Source data

  • count (Integer, nil) (defaults to: nil)

    Number of bytes to copy (defaults to size)

  • stream (Stream, nil) (defaults to: nil)

    Optional stream for async copy

Raises:

  • (MemoryError)

    If this memory has been freed, or count exceeds the allocation size



# File 'lib/nvruby/cuda/memory.rb', line 91

def copy_from_host(host_data, count: nil, stream: nil)
  raise MemoryError, 'Memory has been freed' if @freed

  count ||= @size
  raise MemoryError, "Copy count #{count} exceeds allocation size #{@size}" if count > @size

  RuntimeAPI.ensure_loaded!
  host_ptr = prepare_host_pointer(host_data)

  ensure_correct_device do
    if stream
      RuntimeAPI.memcpy_async(
        @device_ptr, host_ptr, count,
        RuntimeAPI::MEMCPY_HOST_TO_DEVICE, stream.to_ptr
      )
    else
      RuntimeAPI.memcpy(
        @device_ptr, host_ptr, count,
        RuntimeAPI::MEMCPY_HOST_TO_DEVICE
      )
    end
  end
end
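`host_data` may be a String, which in practice usually means packed binary data. The body of `prepare_host_pointer` is not shown on this page, so the sketch below is an assumption about one way a String can be exposed as a Fiddle::Pointer, using `Fiddle::Pointer[]` from Ruby's Fiddle library. It also shows how to compute the byte count for packed floats.

```ruby
require 'fiddle'

# Pack four single-precision floats into a binary String (4 bytes each).
values = [1.0, 2.0, 3.0, 4.0]
packed = values.pack('f*')
byte_count = packed.bytesize # 16

# Fiddle::Pointer[] points at the String's own bytes rather than a copy,
# so `packed` should not be mutated while the pointer is in use.
host_ptr = Fiddle::Pointer[packed]

# Reading the bytes back through the pointer round-trips the floats.
round_trip = host_ptr.to_str(byte_count).unpack('f*')
```

With the class documented here, the packed String would be passed as `host_data` along with `count: byte_count`, provided `byte_count` does not exceed the allocation size.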

#copy_to_host(host_buffer: nil, count: nil, stream: nil) ⇒ Fiddle::Pointer

Copy data from device to host.

Parameters:

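  • host_buffer (Fiddle::Pointer, #address, nil) (defaults to: nil)

    Destination buffer (allocated with Fiddle::Pointer.malloc if nil). An object that responds to #address, such as an FFI::MemoryPointer, is also accepted.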

  • count (Integer, nil) (defaults to: nil)

    Number of bytes to copy (defaults to size)

  • stream (Stream, nil) (defaults to: nil)

    Optional stream for async copy

Returns:

  • (Fiddle::Pointer)

    The host buffer that was passed in, or the newly allocated one, filled with the copied data

Raises:

  • (MemoryError)

    If this memory has been freed, or count exceeds the allocation size



# File 'lib/nvruby/cuda/memory.rb', line 120

def copy_to_host(host_buffer: nil, count: nil, stream: nil)
  raise MemoryError, 'Memory has been freed' if @freed

  count ||= @size
  raise MemoryError, "Copy count #{count} exceeds allocation size #{@size}" if count > @size

  RuntimeAPI.ensure_loaded!
  host_buffer ||= Fiddle::Pointer.malloc(count)

  # The destination may be an FFI::MemoryPointer (NvArray host buffer);
  # bridge it to a Fiddle address for the Fiddle-based memcpy, but return
  # the original object so the caller can read it back with its own API.
  dst_ptr = if host_buffer.is_a?(Fiddle::Pointer)
              host_buffer
            elsif host_buffer.respond_to?(:address)
              Fiddle::Pointer.new(host_buffer.address)
            else
              host_buffer
            end

  ensure_correct_device do
    if stream
      RuntimeAPI.memcpy_async(
        dst_ptr, @device_ptr, count,
        RuntimeAPI::MEMCPY_DEVICE_TO_HOST, stream.to_ptr
      )
    else
      RuntimeAPI.memcpy(
        dst_ptr, @device_ptr, count,
        RuntimeAPI::MEMCPY_DEVICE_TO_HOST
      )
    end
  end

  host_buffer
end
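The destination bridging in the listing above can be tried without a GPU. In this sketch, `ForeignBuffer` is a hypothetical stand-in for a buffer type that exposes only `#address` (the role an FFI::MemoryPointer plays in the source comment), and `to_fiddle_pointer` is a hypothetical name for the same three-way branch.

```ruby
require 'fiddle'

# Hypothetical stand-in for a foreign buffer type exposing only #address.
ForeignBuffer = Struct.new(:address)

# Same branching as in #copy_to_host: keep Fiddle pointers, wrap objects
# that expose an address, and pass anything else through untouched.
def to_fiddle_pointer(buffer)
  if buffer.is_a?(Fiddle::Pointer)
    buffer
  elsif buffer.respond_to?(:address)
    Fiddle::Pointer.new(buffer.address)
  else
    buffer
  end
end

# Host memory owned by Fiddle and released automatically via RUBY_FREE.
backing = Fiddle::Pointer.malloc(4, Fiddle::RUBY_FREE)
backing[0, 4] = 'abcd'

foreign = ForeignBuffer.new(backing.to_i)
bridged = to_fiddle_pointer(foreign) # new pointer to the same address
contents = bridged.to_str(4)
```

The bridged pointer does not own the memory, so the original buffer object has to stay alive for as long as the copy is in flight. That matters most for asynchronous copies on a stream.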

#ffi_ptr ⇒ FFI::Pointer

Device pointer wrapped as an FFI::Pointer.

The hot path (this class's own cudaMemcpy calls and similar) is Fiddle-based, but the CUDA-X library bindings (cuBLAS/cuSOLVER/cuFFT/cuRAND/cuSPARSE) are FFI-based and cannot accept a Fiddle::Pointer. Use this method when handing a device buffer to an FFI-bound library call.

Returns:

  • (FFI::Pointer)


# File 'lib/nvruby/cuda/memory.rb', line 57

def ffi_ptr
  FFI::Pointer.new(@device_ptr.to_i)
end

#free! ⇒ void

This method returns an undefined value.

Free the device memory. Does nothing if the memory has already been freed or is not owned by this object.



# File 'lib/nvruby/cuda/memory.rb', line 73

def free!
  return if @freed
  return unless @owned

  RuntimeAPI.ensure_loaded!
  ensure_correct_device do
    RuntimeAPI.free(@device_ptr)
  end

  @freed = true
  ObjectSpace.undefine_finalizer(self)
end
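#free! and the copy methods wrap their runtime calls in `ensure_correct_device`, whose body is not shown on this page. Judging from the device switching in `.release_finalizer`, it plausibly selects the allocation's device and then restores the previous one. The following is a sketch under that assumption; `FakeDeviceRuntime` and `with_device` are hypothetical names.

```ruby
# Hypothetical stand-in for the runtime's device-selection calls.
module FakeDeviceRuntime
  @current = 0

  class << self
    def get_device
      @current
    end

    def set_device(index)
      @current = index
    end
  end
end

# Run the block with `device_index` selected, then restore the previous
# device even if the block raises.
def with_device(device_index)
  previous = FakeDeviceRuntime.get_device
  FakeDeviceRuntime.set_device(device_index) if previous != device_index
  yield
ensure
  FakeDeviceRuntime.set_device(previous) if previous && previous != device_index
end

seen = nil
with_device(1) { seen = FakeDeviceRuntime.get_device }
```

Restoring in an `ensure` clause means a failed copy or free does not leave the caller on a different device than it started on.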

#freed? ⇒ Boolean

Whether the device memory has been released by #free!.

Returns:

  • (Boolean)


# File 'lib/nvruby/cuda/memory.rb', line 67

def freed?
  @freed
end

#inspect ⇒ String

Returns:

  • (String)


# File 'lib/nvruby/cuda/memory.rb', line 220

def inspect
  "#<Ignis::CUDA::Memory:#{object_id} size=#{@size} device=#{@device_index} " \
    "ptr=0x#{@device_ptr.to_i.to_s(16)} freed=#{@freed}>"
end

#memset(value, count: nil, stream: nil) ⇒ void

This method returns an undefined value.

Set memory to a value.

Parameters:

  • value (Integer)

    Byte value to set (0-255)

  • count (Integer, nil) (defaults to: nil)

    Number of bytes to set (defaults to size)

  • stream (Stream, nil) (defaults to: nil)

    Optional stream for async operation

Raises:

  • (MemoryError)

    If this memory has been freed, or count exceeds the allocation size



# File 'lib/nvruby/cuda/memory.rb', line 189

def memset(value, count: nil, stream: nil)
  raise MemoryError, 'Memory has been freed' if @freed

  count ||= @size
  raise MemoryError, "Memset count #{count} exceeds allocation size #{@size}" if count > @size

  RuntimeAPI.ensure_loaded!

  ensure_correct_device do
    if stream
      RuntimeAPI.memset_async(@device_ptr, value, count, stream.to_ptr)
    else
      RuntimeAPI.memset(@device_ptr, value, count)
    end
  end
end
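Because `value` is a byte, a nonzero fill does not set each multi-byte element to that number. The sketch below shows the effect on host memory with plain Ruby strings, assuming the device fill behaves byte-wise as the parameter description states.

```ruby
# Four bytes, each 0x01, as a byte-wise fill of a 4-byte region would leave them.
filled = "\x01".b * 4

as_bytes  = filled.unpack('C*')  # => [1, 1, 1, 1]
as_uint32 = filled.unpack1('L')  # => 16843009 (0x01010101), not 1

# Zero is the one value that reads the same at every element width.
zeroed = ("\x00".b * 4).unpack1('L') # => 0
```

To initialize typed elements to a nonzero value, the data has to be prepared on the host and copied over, or written by a kernel.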

#to_ptr ⇒ Fiddle::Pointer

Returns the device pointer for interop.

Returns:

  • (Fiddle::Pointer)

    pointer for interop



# File 'lib/nvruby/cuda/memory.rb', line 46

def to_ptr
  @device_ptr
end

#to_s ⇒ String

Returns:

  • (String)


# File 'lib/nvruby/cuda/memory.rb', line 214

def to_s
  status = @freed ? 'freed' : 'allocated'
  "DeviceMemory[#{@size} bytes, device #{@device_index}, #{status}]"
end

#zero!(stream: nil) ⇒ void

This method returns an undefined value.

Zero out the memory.

Parameters:

  • stream (Stream, nil) (defaults to: nil)

    Optional stream



# File 'lib/nvruby/cuda/memory.rb', line 209

def zero!(stream: nil)
  memset(0, stream: stream)
end