Class: Ignis::AI::Tape

Inherits: Object

Defined in: lib/nnw/ai/tape.rb

Overview

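Tape: fiber-local reverse-mode automatic differentiation.

Each Ruby fiber (and therefore each thread) gets its own tape, because the tape is stored through Thread.current[], which is fiber-local. Operations record backward functions during the forward pass. backward! then walks the tape in reverse and accumulates gradients. Entries are appended in execution order, so the reverse walk is already a valid topological order and no separate sort is performed.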

Examples:

a = Tensor.from_host([2.0], shape: [1], requires_grad: true)
b = a * a
b.backward!
a.grad.to_host  # => [4.0]
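
To make the record-then-reverse-walk idea concrete, here is a minimal scalar sketch. It is not the library's implementation: Value and MiniTape are hypothetical names, plain floats stand in for tensors, and gradients are written straight onto the values rather than through a gradient map.

```ruby
# Hypothetical scalar stand-in for a tensor: a number plus its gradient.
Value = Struct.new(:data, :grad)

class MiniTape
  Entry = Struct.new(:output, :inputs, :backward_fn)

  def initialize
    @entries = []
  end

  # Forward op: compute the product and record how to push gradients back.
  def mul(a, b)
    out = Value.new(a.data * b.data, 0.0)
    @entries << Entry.new(out, [a, b], ->(g) { [g * b.data, g * a.data] })
    out
  end

  # Entries were appended in execution order, so iterating backwards
  # visits every output before the inputs that produced it.
  def backward!(output)
    output.grad = 1.0
    @entries.reverse_each do |entry|
      grads = entry.backward_fn.call(entry.output.grad)
      entry.inputs.each_with_index { |input, i| input.grad += grads[i] }
    end
    @entries.clear
  end
end

tape = MiniTape.new
a = Value.new(2.0, 0.0)
b = tape.mul(a, a)
tape.backward!(b)
puts a.grad # => 4.0, since d(a*a)/da = 2a
```

Because a appears twice as an input, it receives two contributions of 2.0 each, which is the reuse case that the real backward! handles with its gradient map.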

Defined Under Namespace

Classes: Entry

Constant Summary

TAPE_KEY = :nnw_ai_tape

Thread-local tape key.

NO_GRAD_KEY = :nnw_ai_no_grad

Class Method Summary

.backward!(tensor, grad_output) ⇒ void

Run reverse-mode AD from a tensor.
.clear! ⇒ void

Clear current thread’s tape.
.current_tape ⇒ Array<Entry>

Get current thread’s tape.
.gradient_checkpoint(inputs) { ... } ⇒ Tensor

Gradient checkpointing: recompute activations during backward.
.no_grad { ... } ⇒ Object

Disable gradient recording inside block.
.no_grad_active? ⇒ Boolean

Check if no_grad is currently active.
.record(output, inputs:) {|Ignis::Shared::NvArray| ... } ⇒ void

Record an operation on the tape.
.size ⇒ Integer

Get tape size (for debugging).

Class Method Details

.backward!(tensor, grad_output) ⇒ `void`

This method returns an undefined value.

Run reverse-mode AD from a tensor.

Parameters:

tensor (Tensor) —

the output tensor to differentiate
grad_output (Ignis::Shared::NvArray) —

initial gradient

# File 'lib/nnw/ai/tape.rb', line 44

def backward!(tensor, grad_output)
  tape = current_tape
  return if tape.empty?

  # Build a map of tensor object_id → accumulated gradient (NvArray).
  # This is the single source of truth during the reverse walk; leaf
  # .grad is written ONCE afterwards. Writing both during the walk caused
  # double-counting when a leaf was reused (e.g. x in x*x): grad_map[x]
  # and x.grad aliased the same buffer, so the second occurrence
  # accumulated into it twice.
  grad_map = {}
  grad_map[tensor.object_id] = grad_output
  leaves = {} # object_id => leaf Tensor that received gradient

  # Buffers grad_map EXCLUSIVELY OWNS, tracked by Ruby object identity.
  # accumulate_grads! mutates its dst in place, so the tape must never
  # store or accumulate a buffer that another grad_map entry also
  # references — an in-place add would silently corrupt the aliased entry.
  # Backward closures are free to return aliased buffers (e.g. `+` returns
  # [grad, grad]; `-` returns [grad, neg_grad] reusing the upstream grad).
  # We clone on the way in to restore exclusive ownership. Clones happen
  # ONLY on these aliasing paths; the common case (fresh buffer per input)
  # never clones.
  owned = {}.compare_by_identity
  owned[grad_output] = true

  # Walk tape in reverse order (topological by construction)
  tape.reverse_each do |entry|
    output = entry.output
    output_grad = grad_map[output.object_id]
    next unless output_grad

    # Call backward function to get input gradients
    input_grads = entry.backward_fn.call(output_grad)

    # Accumulate gradients for each input
    entry.inputs.each_with_index do |input_tensor, i|
      next unless input_tensor.requires_grad
      input_grad = input_grads[i]
      next unless input_grad

      if grad_map.key?(input_tensor.object_id)
        dst = grad_map[input_tensor.object_id]
        # Never accumulate a buffer into itself (would compute 2*dst):
        # clone so we add a snapshot of src's current value.
        src = input_grad.equal?(dst) ? input_grad.clone : input_grad
        accumulate_grads!(dst, src)
      else
        # Take exclusive ownership. If this exact buffer is already owned
        # by another entry (the aliasing case), clone before storing.
        input_grad = input_grad.clone if owned[input_grad]
        grad_map[input_tensor.object_id] = input_grad
        owned[input_grad] = true
      end

      leaves[input_tensor.object_id] = input_tensor if input_tensor.is_leaf
    end
  end

  # Assign accumulated gradients to leaf tensors. Accumulate into any
  # pre-existing .grad so gradient accumulation across multiple
  # backward! calls (e.g. micro-batching) still works.
  leaves.each do |oid, leaf|
    g = grad_map[oid]
    next unless g

    if leaf.grad && !leaf.grad.equal?(g)
      accumulate_grads!(leaf.grad, g)
    else
      leaf.grad = g
    end
  end

  # Clear tape after backward (each backward is a fresh computation)
  clear!
end
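
The ownership comments in the listing above can be illustrated with a small sketch. Plain Ruby arrays stand in for NvArray buffers, and add_in_place! is a hypothetical stand-in for accumulate_grads!; only the clone-on-alias idea is taken from the source.

```ruby
# Hypothetical in-place accumulation, standing in for accumulate_grads!.
def add_in_place!(dst, src)
  dst.each_index { |i| dst[i] += src[i] }
  dst
end

upstream = [1.0, 1.0]

# A backward closure for `+` may hand the very same buffer to both inputs.
grad_for_x = upstream
grad_for_y = upstream

# Identity-keyed set of buffers the gradient map already owns.
owned = {}.compare_by_identity
grad_map = {}

[[:x, grad_for_x], [:y, grad_for_y]].each do |name, grad|
  grad = grad.clone if owned[grad] # clone only when already owned elsewhere
  grad_map[name] = grad
  owned[grad] = true
end

# A later contribution to x must not leak into y.
add_in_place!(grad_map[:x], [5.0, 5.0])

p grad_map[:x] # => [6.0, 6.0]
p grad_map[:y] # => [1.0, 1.0]
```

compare_by_identity matters here: an ordinary Hash would treat two distinct buffers with equal contents as the same key, and mutating a key in place would also invalidate its stored hash.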

.clear! ⇒ `void`

This method returns an undefined value.

Clear current thread’s tape.




# File 'lib/nnw/ai/tape.rb', line 175

def clear!
  Thread.current[TAPE_KEY] = []
end

.current_tape ⇒ `Array<Entry>`

Get current thread’s tape.

Returns:

(Array<Entry>)




# File 'lib/nnw/ai/tape.rb', line 169

def current_tape
  Thread.current[TAPE_KEY] ||= []
end
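
Since the tape lives in Thread.current[], it is isolated per fiber as well as per thread. The short sketch below shows that behaviour with a hypothetical key; it does not touch the library's TAPE_KEY.

```ruby
key = :demo_tape # hypothetical key, used only for this illustration

Thread.current[key] = [:main_entry]

# Thread#[] is fiber-local storage, so a new fiber or thread starts empty.
in_fiber  = Fiber.new { Thread.current[key] }.resume
in_thread = Thread.new { Thread.current[key] }.value

p in_fiber            # => nil
p in_thread           # => nil
p Thread.current[key] # => [:main_entry]
```

One consequence is that a forward pass and its backward! call need to run in the same fiber, otherwise backward! sees an empty tape and returns early.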

.gradient_checkpoint(inputs) { ... } ⇒ `Tensor`

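Gradient checkpointing: recompute activations during backward. Stores only the inputs and the output, then reruns the forward pass during the backward pass to rebuild the intermediate activations, trading extra compute for lower memory use. Critical for large models on a GPU with 12 GB of VRAM.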

Parameters:

inputs (Array<Tensor>) —

input tensors to save

Yields:

block that computes the forward pass

Returns:

(Tensor) —

the output tensor

# File 'lib/nnw/ai/tape.rb', line 146

def gradient_checkpoint(inputs, &forward_fn)
  # Run forward with no_grad to avoid double recording
  output = no_grad { forward_fn.call }

  # Record a special tape entry that recomputes forward in backward
  if output.requires_grad
    saved_inputs = inputs.map { |t| t.data }
    record(output, inputs: inputs) do |grad|
      # Recompute forward pass to get intermediate values
      recomputed = forward_fn.call
      # Now the tape has entries for this recomputation
      # Run backward on the recomputed output
      Tape.backward!(recomputed, grad)
      # Collect input gradients
      inputs.map { |t| t.grad }
    end
  end

  output
end
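
The compute-for-memory trade can be sketched with scalars. The example below is an illustration only, with hypothetical names: it differentiates f(x) = sin(x^2), throws away the intermediate u = x^2 after the forward pass, and recomputes it when the gradient is requested.

```ruby
forward_calls = 0

# Forward pass returning the output plus the intermediate the gradient needs.
forward = lambda do |x|
  forward_calls += 1
  u = x * x
  [Math.sin(u), u]
end

# Checkpointed version: keep only the input and the output.
checkpoint = lambda do |x|
  y, _discarded = forward.call(x)
  backward = lambda do |grad|
    _y, u = forward.call(x)          # recompute the intermediate
    grad * Math.cos(u) * 2.0 * x     # chain rule: dy/du * du/dx
  end
  [y, backward]
end

y, backward = checkpoint.call(1.5)
dx = backward.call(1.0)

puts forward_calls # => 2, one forward pass plus one recomputation
```

The forward function runs twice, which is the price paid for not holding u between the two passes.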

.no_grad { ... } ⇒ `Object`

Disable gradient recording inside block.

Yields:

block where no gradients are recorded

Returns:

(Object) —

block return value

# File 'lib/nnw/ai/tape.rb', line 124

def no_grad(&block)
  prev = Thread.current[NO_GRAD_KEY]
  Thread.current[NO_GRAD_KEY] = true
  begin
    block.call
  ensure
    Thread.current[NO_GRAD_KEY] = prev
  end
end
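
The save-and-restore pattern above is what makes nested blocks and exceptions safe. The sketch below reproduces the pattern in a hypothetical Demo module with its own key, so it runs without the library.

```ruby
module Demo
  FLAG = :demo_no_grad # hypothetical key, not the library's NO_GRAD_KEY

  def self.no_grad
    prev = Thread.current[FLAG]
    Thread.current[FLAG] = true
    yield
  ensure
    Thread.current[FLAG] = prev # runs on normal exit and on exceptions
  end

  def self.active?
    Thread.current[FLAG] == true
  end
end

states = []
Demo.no_grad do
  Demo.no_grad { states << Demo.active? } # inside the inner block
  states << Demo.active?                  # inner exit restores true, not nil
end
states << Demo.active?                    # restored outside both blocks

begin
  Demo.no_grad { raise ArgumentError, "boom" }
rescue ArgumentError
  states << Demo.active?                  # restored despite the exception
end

p states # => [true, true, false, false]
```

Restoring the previous value rather than resetting to false is what keeps an outer block active after an inner one finishes.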

.no_grad_active? ⇒ `Boolean`

Check if no_grad is currently active.

Returns:

(Boolean)




# File 'lib/nnw/ai/tape.rb', line 136

def no_grad_active?
  Thread.current[NO_GRAD_KEY] == true
end

.record(output, inputs:) {|Ignis::Shared::NvArray| ... } ⇒ `void`

This method returns an undefined value.

Record an operation on the tape.

Parameters:

output (Tensor) —

the result tensor
inputs (Array<Tensor>) —

input tensors

Yields:

(Ignis::Shared::NvArray) —

receives gradient, must return Array of NvArrays

# File 'lib/nnw/ai/tape.rb', line 30

def record(output, inputs:, &backward_fn)
  return if no_grad_active?
  return unless output.requires_grad

  tape = current_tape
  entry = Entry.new(output: output, inputs: inputs, backward_fn: backward_fn)
  output._tape_id = tape.length
  tape << entry
end
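
As a rough illustration of the block contract, the sketch below shows how an operation might call record. FakeTensor and DemoTape are hypothetical stand-ins so the example runs on its own, and floats replace NvArray gradients; only the shape of the call (output, inputs:, and a block returning one gradient per input) follows the signature documented above.

```ruby
# Hypothetical stand-ins; the real Tensor and Tape are not used here.
FakeTensor = Struct.new(:data, :requires_grad)

module DemoTape
  Entry = Struct.new(:output, :inputs, :backward_fn, keyword_init: true)
  @entries = []

  class << self
    attr_reader :entries

    def record(output, inputs:, &backward_fn)
      return unless output.requires_grad
      @entries << Entry.new(output: output, inputs: inputs, backward_fn: backward_fn)
    end
  end
end

# A subtraction op: the block returns one gradient per input, in input order.
def sub(a, b)
  out = FakeTensor.new(a.data - b.data, a.requires_grad || b.requires_grad)
  DemoTape.record(out, inputs: [a, b]) { |grad| [grad, -grad] }
  out
end

x = FakeTensor.new(5.0, true)
y = FakeTensor.new(3.0, false)
z = sub(x, y)

entry = DemoTape.entries.last
p entry.backward_fn.call(1.0) # => [1.0, -1.0]
```

The order of the returned array matters, because backward! pairs gradients with inputs by index and skips inputs that do not require gradients.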

.size ⇒ `Integer`

Get tape size (for debugging).

Returns:

(Integer)




# File 'lib/nnw/ai/tape.rb', line 181

def size
  current_tape.length
end
