Module: Ignis::MathDx

Defined in:
lib/nvruby/mathdx.rb,
lib/nvruby/mathdx/fft_kernel.rb,
lib/nvruby/mathdx/gemm_kernel.rb

Overview

MathDx module for device-side extensions. Provides high-performance fused kernel generation using cuBLASDx and cuFFTDx.

Note: MathDx libraries (cuBLASDx, cuFFTDx) are header-only C++ template libraries that generate optimized device code at compile time. This module generates CUDA C++ source code that uses MathDx and compiles it via NVRTC.

Examples:

Fused GEMM + epilog kernel

kernel = Ignis::MathDx::GemmKernel.new(
  m: 64, n: 64, k: 64,
  dtype: :float16,
  epilog: :relu
)
kernel.compile!
kernel.execute(a, b, c)

Defined Under Namespace

Classes: FftKernel, GemmKernel, StateError

Constant Summary

MATHDX_PATH =

MathDx installation path, read from the MATHDX_PATH environment variable; falls back to the default shown below when the variable is unset

ENV.fetch("MATHDX_PATH", "C:/Program Files/NVIDIA/MathDx")
SUPPORTED_SM_VERSIONS =

Supported compute capabilities for MathDx

[70, 75, 80, 86, 89, 90].freeze
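
A minimal sketch of how a caller might check a device's compute capability against this list before requesting a kernel. The helper name `checked_sm_version` and its `supported:` parameter are hypothetical and not part of the module; the default list is a copy of the values documented above so that the sketch runs on its own.

```ruby
# Hypothetical helper (not part of Ignis::MathDx): converts a (major, minor)
# compute capability into the two-digit form used by SUPPORTED_SM_VERSIONS
# and raises if that value is not in the list.
def checked_sm_version(major, minor, supported: [70, 75, 80, 86, 89, 90])
  sm = major * 10 + minor
  unless supported.include?(sm)
    raise ArgumentError, "SM #{sm} is not one of #{supported.inspect}"
  end
  sm
end

sm = checked_sm_version(8, 6) # compute capability 8.6 maps to 86
```

In real code the `supported:` argument would presumably be `Ignis::MathDx::SUPPORTED_SM_VERSIONS` rather than a copied literal.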

Class Method Details

.available? ⇒ Boolean

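Check if the MathDx SDK is available. Returns true when either cublasdx.hpp or cufftdx.hpp exists under the MathDx include directory, and false otherwise (including when the check itself raises an error).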

Returns:

  • (Boolean)


# File 'lib/nvruby/mathdx.rb', line 31

def self.available?
  cublasdx_header = File.join(MATHDX_PATH, "include", "cublasdx.hpp")
  cufftdx_header = File.join(MATHDX_PATH, "include", "cufftdx.hpp")
  File.exist?(cublasdx_header) || File.exist?(cufftdx_header)
rescue StandardError
  false
end
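
Because the check uses `||`, an installation that ships only one of the two headers is still reported as available. The sketch below mirrors that logic against an arbitrary root directory and adds a per-header check for callers that need one specific library. Both helper names (`mathdx_available?`, `header_present?`) are hypothetical stand-ins, not methods of the module.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in for the documented check, parameterized on the root directory
# so it can be exercised without a real MathDx installation.
def mathdx_available?(root)
  %w[cublasdx.hpp cufftdx.hpp].any? do |header|
    File.exist?(File.join(root, "include", header))
  end
rescue StandardError
  false
end

# Hypothetical stricter check for a single named header.
def header_present?(root, header)
  File.exist?(File.join(root, "include", header))
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "include"))
  FileUtils.touch(File.join(root, "include", "cufftdx.hpp"))

  puts mathdx_available?(root)               # true: one header is enough
  puts header_present?(root, "cublasdx.hpp") # false: cuBLASDx header is missing
end
```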

.fft_kernel(size:, dtype: :complex64, direction: :forward, elements_per_thread: 8) ⇒ FftKernel

Create a fused FFT kernel with device-side MathDx acceleration

Parameters:

  • size (Integer)

    FFT size

  • dtype (Symbol) (defaults to: :complex64)

    Data type (:complex64, :complex128)

  • direction (Symbol) (defaults to: :forward)

    Direction (:forward, :inverse)

  • elements_per_thread (Integer) (defaults to: 8)

    Elements per thread

Returns:

  • (FftKernel)

# File 'lib/nvruby/mathdx.rb', line 67

def fft_kernel(size:, dtype: :complex64, direction: :forward, elements_per_thread: 8)
  FftKernel.new(size: size, dtype: dtype, direction: direction,
                elements_per_thread: elements_per_thread)
end
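
The `size` and `elements_per_thread` parameters interact: in cuFFTDx, the number of threads cooperating on one transform is typically the FFT size divided by the elements per thread. The sketch below checks that relationship up front. It is an illustration under that assumption only; the helper `fft_threads_per_transform` is hypothetical, and whether `FftKernel` performs a similar validation is not stated in this page.

```ruby
# Hypothetical pre-check (not part of Ignis::MathDx): computes how many
# threads would cooperate on one transform, assuming
# threads = size / elements_per_thread.
def fft_threads_per_transform(size:, elements_per_thread: 8)
  [size, elements_per_thread].each do |value|
    unless value.is_a?(Integer) && value.positive?
      raise ArgumentError, "expected a positive Integer, got #{value.inspect}"
    end
  end
  unless (size % elements_per_thread).zero?
    raise ArgumentError, "size #{size} is not divisible by #{elements_per_thread}"
  end
  size / elements_per_thread
end

threads = fft_threads_per_transform(size: 256, elements_per_thread: 8)
```

With the default of 8 elements per thread, a 256-point transform would use 32 threads under this assumption.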

.gemm_kernel(m:, n:, k:, dtype: :float32, epilog: nil, block_size: 128) ⇒ GemmKernel

Create a fused GEMM kernel with device-side MathDx acceleration

Parameters:

  • m (Integer)

    Output rows

  • n (Integer)

    Output columns

  • k (Integer)

    Inner dimension

  • dtype (Symbol) (defaults to: :float32)

    Data type (:float16, :float32, :float64)

  • epilog (Symbol, nil) (defaults to: nil)

    Epilog operation (:relu, :gelu, :sigmoid, nil)

  • block_size (Integer) (defaults to: 128)

    Thread block size

Returns:

  • (GemmKernel)

# File 'lib/nvruby/mathdx.rb', line 57

def gemm_kernel(m:, n:, k:, dtype: :float32, epilog: nil, block_size: 128)
  GemmKernel.new(m: m, n: n, k: k, dtype: dtype, epilog: epilog, block_size: block_size)
end
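
To make the `m`, `n`, `k`, and `epilog` parameters concrete, here is a plain-Ruby CPU reference for what a fused GEMM with an epilog is generally understood to compute: an m×k matrix times a k×n matrix, with the epilog applied to each output element before it is stored. This is only a semantic sketch for checking small cases. The name `reference_gemm` is hypothetical, it handles only `:relu` and `nil`, and it ignores any scaling factors or accumulation into an existing C that the generated kernel may support.

```ruby
# Hypothetical CPU reference (not the module's implementation):
# computes epilog(A * B) for nested-array matrices.
#   a: m x k, b: k x n, result: m x n
def reference_gemm(a, b, epilog: nil)
  unless [nil, :relu].include?(epilog)
    raise ArgumentError, "this sketch only handles :relu and nil"
  end
  k = b.length
  n = b.first.length
  a.map do |row|
    Array.new(n) do |j|
      acc = 0.0
      k.times { |p| acc += row[p] * b[p][j] }
      # The epilog is applied per output element, which is what "fused" means here.
      epilog == :relu ? [acc, 0.0].max : acc
    end
  end
end

a = [[1.0, -2.0], [0.5, 1.0]]
b = [[1.0, 0.0], [1.0, 1.0]]
plain = reference_gemm(a, b)                # [[-1.0, -2.0], [1.5, 1.0]]
fused = reference_gemm(a, b, epilog: :relu) # [[0.0, 0.0], [1.5, 1.0]]
```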

.include_path ⇒ String

Get the MathDx include directory (the include subdirectory of MATHDX_PATH) for NVRTC compilation

Returns:

  • (String)


# File 'lib/nvruby/mathdx.rb', line 41

def self.include_path
  File.join(MATHDX_PATH, "include")
end
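
A sketch of how the include path might be turned into NVRTC compile options. The option spellings (`--include-path`, `--gpu-architecture`, `--std`) are standard NVRTC flags; the helper `nvrtc_options`, the choice of C++17, and the way this module actually assembles its options are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of Ignis::MathDx): builds an NVRTC option
# list from an include directory and a two-digit SM version such as 80.
def nvrtc_options(include_dir:, sm:)
  [
    "--include-path=#{include_dir}",
    "--gpu-architecture=compute_#{sm}",
    "--std=c++17" # assumed language standard for the MathDx headers
  ]
end

# In real code include_dir would presumably be Ignis::MathDx.include_path.
options = nvrtc_options(include_dir: File.join("/opt/mathdx", "include"), sm: 80)
puts options
```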