Module: Ignis::CUDA::DeviceProperties

Extended by:
FFI::Library
Defined in:
lib/nvruby/cuda/device_props.rb

Overview

FFI binding for the CUDA runtime's cudaDeviceProp struct.

The binding covers only the essential fields. For the full struct, use raw_props.

Defined Under Namespace

Classes: CudaDeviceProp

Constant Summary

CUDART_PATH =

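Path to the CUDA runtime library, resolved per platform. Ignis::Platform.cudart_path is used when Ignis::Platform is defined; otherwise the value falls back to the default CUDA 13.0 install location on Windows, or to the libcudart.so.13 soname on every other platform.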

if defined?(Ignis::Platform)
  Ignis::Platform.cudart_path
elsif RUBY_PLATFORM.match?(/mswin|mingw|cygwin/i)
  File.join('C:', 'Program Files', 'NVIDIA GPU Computing Toolkit',
            'CUDA', 'v13.0', 'bin', 'cudart64_130.dll')
else
  'libcudart.so.13'
end
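
The branch order above can be exercised without a CUDA install. The following is a minimal sketch, not part of the module: `resolve_cudart_path` and its `platform_override` parameter are hypothetical names, with the override standing in for `Ignis::Platform.cudart_path`.

```ruby
# Hypothetical helper that mirrors the CUDART_PATH branches.
# platform_override plays the role of Ignis::Platform.cudart_path.
def resolve_cudart_path(ruby_platform, platform_override: nil)
  # A platform helper, when present, always wins.
  return platform_override if platform_override

  if ruby_platform.match?(/mswin|mingw|cygwin/i)
    # Default toolkit install location on Windows.
    File.join('C:', 'Program Files', 'NVIDIA GPU Computing Toolkit',
              'CUDA', 'v13.0', 'bin', 'cudart64_130.dll')
  else
    # Bare soname; the dynamic loader searches its usual paths.
    'libcudart.so.13'
  end
end

puts resolve_cudart_path('x86_64-linux')
puts resolve_cudart_path('x64-mingw-ucrt')
puts resolve_cudart_path('x86_64-linux', platform_override: '/opt/cuda/lib64/libcudart.so')
```

Note that the fallback branch is not Linux-specific: any platform string that does not match the Windows pattern gets the `libcudart.so.13` soname.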

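Class Method Summary

  • .get(device_id) ⇒ CudaDeviceProp

    Get device properties for a given device index.

  • .summary(device_id) ⇒ Hash

    Helper to extract a human-readable property hash.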

Class Method Details

.get(device_id) ⇒ CudaDeviceProp

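Get device properties for a given device index. Calls cudaGetDeviceProperties_v2 when the binding exposes it and falls back to cudaGetDeviceProperties otherwise.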

Parameters:

  • device_id (Integer)

    GPU device index

Returns:

  • (CudaDeviceProp)

    the populated properties struct

Raises:

  • (RuntimeError)

    if neither cudaGetDeviceProperties entry point is available, or if the CUDA call returns a nonzero status



# File 'lib/nvruby/cuda/device_props.rb', line 159

def self.get(device_id)
  prop = CudaDeviceProp.new

  # Prefer the versioned _v2 entry point when the binding attached it;
  # otherwise fall back to the unversioned symbol.
  if respond_to?(:cudaGetDeviceProperties_v2)
    status = cudaGetDeviceProperties_v2(prop.pointer, device_id)
  elsif respond_to?(:cudaGetDeviceProperties)
    status = cudaGetDeviceProperties(prop.pointer, device_id)
  else
    raise 'cudaGetDeviceProperties not available'
  end

  # Any status other than 0 (cudaSuccess) is treated as a failure.
  raise "cudaGetDeviceProperties failed with status #{status}" unless status.zero?

  prop
end
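
The symbol fallback in `.get` can be illustrated without loading the CUDA runtime. The sketch below uses a stand-in module, `FakeLegacyBinding`, and a hypothetical helper, `pick_device_props_symbol`; neither exists in the library, and the stub simply returns 0 in place of a real CUDA call.

```ruby
# Stand-in for a binding where only the unversioned symbol was attached.
module FakeLegacyBinding
  def self.cudaGetDeviceProperties(_prop_pointer, _device_id)
    0 # cudaSuccess
  end
end

# Hypothetical helper: return the first entry point the binding responds to,
# trying the _v2 symbol before the unversioned one.
def pick_device_props_symbol(lib)
  candidates = %i[cudaGetDeviceProperties_v2 cudaGetDeviceProperties]
  sym = candidates.find { |name| lib.respond_to?(name) }
  raise 'cudaGetDeviceProperties not available' unless sym

  sym
end

sym = pick_device_props_symbol(FakeLegacyBinding)
status = FakeLegacyBinding.public_send(sym, nil, 0)
puts "#{sym} returned status #{status}"
```

Against the real module, and assuming a CUDA runtime and at least one device are present, the equivalent call would be `Ignis::CUDA::DeviceProperties.get(0)`, wrapped in a `rescue RuntimeError` if a failure should be handled rather than propagated.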

.summary(device_id) ⇒ Hash

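Helper that builds a human-readable property hash for a device: sizes are converted to MB and KB, the memory clock to MHz, and integer capability flags to booleans.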

Parameters:

  • device_id (Integer)

    GPU device index

Returns:

  • (Hash)

    symbol keys mapped to the converted property values


# File 'lib/nvruby/cuda/device_props.rb', line 179

def self.summary(device_id)
  prop = get(device_id)
  {
    name: prop[:name].to_s.strip,
    compute_capability: "#{prop[:major]}.#{prop[:minor]}",
    # totalGlobalMem is in bytes; float division keeps one decimal place.
    total_memory_mb: (prop[:totalGlobalMem] / (1024.0 * 1024.0)).round(1),
    multiprocessor_count: prop[:multiProcessorCount],
    max_threads_per_block: prop[:maxThreadsPerBlock],
    warp_size: prop[:warpSize],
    # memoryClockRate is in kHz; integer division truncates to whole MHz.
    memory_clock_mhz: prop[:memoryClockRate] / 1000,
    memory_bus_width: prop[:memoryBusWidth],
    # l2CacheSize is in bytes; integer division truncates to whole KB.
    l2_cache_kb: prop[:l2CacheSize] / 1024,
    # The remaining fields are C ints; nonzero means the feature is supported.
    ecc_enabled: prop[:ECCEnabled] != 0,
    unified_addressing: prop[:unifiedAddressing] != 0,
    managed_memory: prop[:managedMemory] != 0,
    cooperative_launch: prop[:cooperativeLaunch] != 0,
    gpu_direct_rdma: prop[:gpuDirectRDMASupported] != 0,
    memory_pools: prop[:memoryPoolsSupported] != 0,
    cluster_launch: prop[:clusterLaunch] != 0
  }
end
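
The unit conversions in `.summary` are easy to check in isolation. The sketch below applies the same arithmetic to a plain Hash standing in for a `CudaDeviceProp`; `summarize_units` is a hypothetical helper and the field values are made up for illustration, not read from a real GPU.

```ruby
# Hypothetical helper applying the same conversions as .summary
# to anything that supports [] lookup by field name.
def summarize_units(prop)
  {
    # bytes -> MB, kept as a Float with one decimal place
    total_memory_mb: (prop[:totalGlobalMem] / (1024.0 * 1024.0)).round(1),
    # kHz -> MHz, integer division drops the remainder
    memory_clock_mhz: prop[:memoryClockRate] / 1000,
    # bytes -> KB, integer division drops the remainder
    l2_cache_kb: prop[:l2CacheSize] / 1024,
    # C int flag -> boolean
    ecc_enabled: prop[:ECCEnabled] != 0
  }
end

# Made-up values standing in for struct fields.
fake_prop = {
  totalGlobalMem: 8 * 1024 * 1024 * 1024, # 8 GiB in bytes
  memoryClockRate: 7_000_500,             # kHz
  l2CacheSize: 4 * 1024 * 1024,           # 4 MiB in bytes
  ECCEnabled: 0
}

summarize_units(fake_prop).each { |key, value| puts "#{key}: #{value}" }
```

The memory total is the only value that keeps a fractional part; the clock and cache figures use integer division, so 7,000,500 kHz is reported as 7000 MHz.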