Module: Ignis::LinAlg::Matmul

Defined in: lib/nvruby/linalg/matmul.rb

Overview

Matrix multiplication (GEMM) operations backed by cuBLAS.

Class Method Summary

.call(a, b, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil) ⇒ NvArray

Perform matrix multiplication: C = alpha * A @ B + beta * C.

.call_with_algo(a, b, algo, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil) ⇒ Object

Perform GEMM with a specific algorithm (useful for benchmarking and performance tuning).

.matmul(a, b) ⇒ NvArray

Shorthand for matrix multiplication: C = A @ B.

Class Method Details

.call(a, b, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil) ⇒ `NvArray`

Perform matrix multiplication: C = alpha * A @ B + beta * C. Inputs that are not already on the device are copied there before the GEMM runs.

Parameters:

a (NvArray) —

Left matrix
b (NvArray) —

Right matrix
c (NvArray, nil) (defaults to: nil) —

Output matrix (created if nil)
alpha (Float) (defaults to: 1.0) —

Scaling factor for A @ B
beta (Float) (defaults to: 0.0) —

Scaling factor for the existing contents of C
transpose_a (Boolean) (defaults to: false) —

Whether to use the transpose of A
transpose_b (Boolean) (defaults to: false) —

Whether to use the transpose of B
stream (CUDA::Stream, nil) (defaults to: nil) —

CUDA stream to run the operation on (optional)

Returns:

(NvArray) —

Result matrix

Raises:

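(DimensionError) raised when the inner dimensions of A and B do not match after transpose_a and transpose_b are applied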
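
The formula C = alpha * A @ B + beta * C can be checked on small inputs against a CPU reference. The following is an illustrative pure-Ruby sketch on nested arrays. `reference_gemm` is a hypothetical helper, not part of this module, and the zero-filled default for C is an assumption (the documentation only states that the output is created when nil).

```ruby
# CPU reference for C = alpha * op(A) @ op(B) + beta * C on nested Ruby arrays.
# Hypothetical helper for illustration; not part of Ignis::LinAlg::Matmul.
def reference_gemm(a, b, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false)
  a = a.transpose if transpose_a
  b = b.transpose if transpose_b

  m = a.length
  k = a.first.length
  n = b.first.length
  raise ArgumentError, "inner dimensions differ: #{k} vs #{b.length}" unless k == b.length

  # Assumption: with no C given, start from zeros.
  c ||= Array.new(m) { Array.new(n, 0.0) }

  Array.new(m) do |i|
    Array.new(n) do |j|
      dot = (0...k).sum { |p| a[i][p] * b[p][j] }
      alpha * dot + beta * c[i][j]
    end
  end
end

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]

plain  = reference_gemm(a, b)                                  # A @ B
scaled = reference_gemm(a, b, c: plain, alpha: 2.0, beta: 1.0) # 2 * (A @ B) + plain
```

With the default alpha and beta, the existing contents of C do not contribute to the result. With a non-zero beta they are accumulated into it.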

# File 'lib/nvruby/linalg/matmul.rb', line 18

def call(a, b, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil)
  validate_inputs!(a, b)

  # Get dimensions
  m, k1, k2, n = compute_dimensions(a, b, transpose_a, transpose_b)
  raise DimensionError, "Matrix dimensions incompatible: A(#{a.shape}) @ B(#{b.shape})" unless k1 == k2

  # Ensure arrays are on device
  a = a.to_device unless a.on_device?
  b = b.to_device unless b.on_device?

  # Create or validate output
  c = prepare_output(c, m, n, a)
  c = c.to_device unless c.on_device?

  # Perform GEMM
  execute_gemm(a, b, c, m, n, k1, alpha, beta, transpose_a, transpose_b, stream)

  c
end
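
The private helper `compute_dimensions` is not shown on this page. As a rough sketch of what such a helper might compute, the version below works on plain `[rows, cols]` shape arrays. The name `effective_dimensions` and its signature are hypothetical.

```ruby
# Hypothetical stand-in for the private compute_dimensions helper.
# Takes plain [rows, cols] shapes instead of NvArray objects.
def effective_dimensions(a_shape, b_shape, transpose_a: false, transpose_b: false)
  a_rows, a_cols = transpose_a ? a_shape.reverse : a_shape
  b_rows, b_cols = transpose_b ? b_shape.reverse : b_shape
  [a_rows, a_cols, b_rows, b_cols] # m, k1, k2, n
end

# A is 2x3 and B is 3x4, so the inner dimensions agree.
m, k1, k2, n = effective_dimensions([2, 3], [3, 4])

# B given as 4x3 only becomes compatible once it is transposed.
_, bad_k1, bad_k2, _ = effective_dimensions([2, 3], [4, 3])
_, ok_k1, ok_k2, ok_n = effective_dimensions([2, 3], [4, 3], transpose_b: true)
```

The multiplication is valid only when k1 equals k2, which is the condition the `DimensionError` check in `.call` enforces.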

.call_with_algo(a, b, algo, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil) ⇒ `Object`

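Perform GEMM with a specific algorithm (useful for benchmarking and performance tuning). Takes the same arguments as .call plus algo, the algorithm selector passed through to the GEMM call, and returns the output matrix c.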

Raises:

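(DimensionError) raised when the inner dimensions of A and B do not match after transpose_a and transpose_b are applied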
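
For benchmarking, one possible approach is to time the same multiplication once per candidate algorithm and keep the fastest. The helper below is a generic sketch: `fastest_candidate` is a hypothetical name, and a `sleep` call stands in for the GPU work. In real use the block would wrap a call to `.call_with_algo` and would also need to wait for the device to finish before the timer stops, since cuBLAS calls are generally asynchronous with respect to the host. How to synchronize in this library is not shown on this page.

```ruby
# Times a block several times per candidate and returns [best_candidate, seconds].
# Generic helper for illustration; not part of Ignis::LinAlg::Matmul.
def fastest_candidate(candidates, repeats: 3)
  timings = candidates.map do |candidate|
    best = Array.new(repeats) do
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      yield candidate
      Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    end.min
    [candidate, best]
  end
  timings.min_by { |_, seconds| seconds }
end

# Stand-in workload: pretend algorithm 1 is the quickest.
fake_cost = { 0 => 0.02, 1 => 0.001, 2 => 0.01 }
winner, seconds = fastest_candidate(fake_cost.keys) { |algo| sleep(fake_cost[algo]) }
```

Taking the minimum over a few repeats reduces the influence of one-off delays such as warm-up on the first call.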

# File 'lib/nvruby/linalg/matmul.rb', line 48

def call_with_algo(a, b, algo, c: nil, alpha: 1.0, beta: 0.0, transpose_a: false, transpose_b: false, stream: nil)
  validate_inputs!(a, b)
  m, k1, k2, n = compute_dimensions(a, b, transpose_a, transpose_b)
  raise DimensionError, "Matrix dimensions incompatible" unless k1 == k2

  a = a.to_device unless a.on_device?
  b = b.to_device unless b.on_device?
  c = prepare_output(c, m, n, a)
  c = c.to_device unless c.on_device?

  CuBLASBindings.ensure_loaded!
  handle = CuBLASBindings.get_handle

  if stream
    CuBLASBindings.check_status!(CuBLASBindings.cublasSetStream_v2(handle, stream.handle))
  end

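  # cuBLAS assumes column-major storage. For row-major data, C = A @ B is
  # presumably obtained as C^T = B^T @ A^T, so the operand roles are swapped:
  # op_a/lda describe B, op_b/ldb describe A, and n/m are exchanged below.
  op_a = transpose_b ? CuBLASBindings::CUBLAS_OP_T : CuBLASBindings::CUBLAS_OP_N
  op_b = transpose_a ? CuBLASBindings::CUBLAS_OP_T : CuBLASBindings::CUBLAS_OP_N

  lda = b.shape[1]
  ldb = a.shape[1]
  ldc = n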

  status = execute_gemmex(handle, op_a, op_b, n, m, k1, alpha, a, b, c, lda, ldb, ldc, algo, beta)
  CuBLASBindings.check_status!(status, "GEMM execution with algorithm #{algo}")
  c
end
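
The swapped operands, dimensions, and leading dimensions in the listing above look like the usual way of driving a column-major GEMM with row-major data: a row-major buffer read as column-major is the transpose, so computing B^T @ A^T in column-major order leaves row-major A @ B in the output buffer. The sketch below demonstrates the identity on flat Ruby arrays. `colmajor_gemm` is a hypothetical CPU stand-in for the cuBLAS routine, and whether `execute_gemmex` orders its buffers this way internally is not shown on this page.

```ruby
# Column-major GEMM on flat arrays, following the BLAS storage convention:
# C (m x n) = A (m x k) @ B (k x n), with element (i, j) stored at i + j * ld.
# Hypothetical CPU stand-in for illustration only.
def colmajor_gemm(m, n, k, a, lda, b, ldb, ldc)
  c = Array.new(ldc * n, 0.0)
  n.times do |j|
    m.times do |i|
      c[i + j * ldc] = (0...k).sum { |p| a[i + p * lda] * b[p + j * ldb] }
    end
  end
  c
end

# Row-major inputs: A is 2x3 and B is 3x2, stored row by row.
a = [1.0, 2.0, 3.0,
     4.0, 5.0, 6.0]
b = [7.0,  8.0,
     9.0,  10.0,
     11.0, 12.0]
m, k, n = 2, 3, 2

# Pass B first, swap m and n, and use the row lengths as leading dimensions,
# mirroring lda = b.shape[1], ldb = a.shape[1], ldc = n in the listing.
c = colmajor_gemm(n, m, k, b, n, a, k, n)

# The flat result is row-major C = A @ B; slice it into rows for inspection.
rows = c.each_slice(n).to_a
```

No data is transposed or copied in this scheme. Only the argument order and the reported dimensions change.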

.matmul(a, b) ⇒ `NvArray`

Shorthand for matrix multiplication: C = A @ B