Class: FFNFFICache

Inherits:

Object


Defined in: lib/toy/ffi/tinynn.rb

Overview

Persistent FFI cache for one transformer block’s FFN. It is a single ggml session that holds the full chain `matmul -> gelu -> matmul`. Activations stay inside ggml between the two matmuls; only the three outputs (pre, hidden, out) are downloaded at the end.

Lazy-realized: T (sequence length) isn’t known until the first forward call. realize_for(t_seq, d_model, d_ff) sets up the graph; subsequent forward calls with the same T reuse it.
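The reuse rule above could be enforced on the caller side roughly as follows. This is a minimal sketch under stated assumptions: `LazyGraphCache` is a hypothetical stand-in that mimics only the `realized` and `t_seq` bookkeeping of FFNFFICache (it makes no ggml calls), `ensure_realized` is an illustrative helper name, and rebuilding when T changes is an assumption, since the source only describes the same-T case and does not show how an old session would be released.

```ruby
# Hypothetical stand-in for FFNFFICache: tracks the same flags,
# but counts graph builds instead of calling into ggml.
class LazyGraphCache
  attr_reader :realized, :t_seq, :builds

  def initialize
    @realized = false
    @t_seq    = 0
    @builds   = 0
  end

  def realize_for(t_seq, _d_model, _d_ff)
    @t_seq    = t_seq
    @builds  += 1
    @realized = true
  end
end

# Illustrative helper (not part of the documented API): build the graph
# on first use, and reuse it while the sequence length is unchanged.
def ensure_realized(cache, t_seq, d_model, d_ff)
  return cache if cache.realized && cache.t_seq == t_seq

  cache.realize_for(t_seq, d_model, d_ff)
  cache
end

cache = LazyGraphCache.new
ensure_realized(cache, 16, 64, 256) # first forward: builds the graph
ensure_realized(cache, 16, 64, 256) # same T: reuses it
puts cache.builds
```

Keeping the check outside realize_for mirrors the listing in this page, where realize_for itself builds unconditionally.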

Operand layout: matmul1 is fed as `matmul(t_w1_t, t_h)` so that its result has ne0=d_ff, which is the k-dim of matmul2, so the chain needs no intermediate transpose. Downloads of all three result tensors are then a straight row-major memcpy.
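The shape bookkeeping behind this layout can be checked with a few lines of plain Ruby. The sketch below assumes the ggml `mul_mat` shape rule (both operands share ne0 as the contracted dimension, and the result is `[a.ne1, b.ne1]`); `mul_mat_ne` is a hypothetical helper that works on shape arrays only, and the concrete sizes are arbitrary.

```ruby
# Hypothetical helper: result shape of a ggml-style mul_mat(a, b),
# where shapes are given as [ne0, ne1] and ne0 is the contracted dim.
def mul_mat_ne(a_ne, b_ne)
  unless a_ne[0] == b_ne[0]
    raise ArgumentError, "inner dims differ: #{a_ne[0]} vs #{b_ne[0]}"
  end

  [a_ne[1], b_ne[1]]
end

t_seq, d_model, d_ff = 4, 8, 32 # arbitrary example sizes

h_ne    = [d_model, t_seq]  # t_h
w1_t_ne = [d_model, d_ff]   # t_w1_t
w2_t_ne = [d_ff, d_model]   # t_w2_t

pre_ne    = mul_mat_ne(w1_t_ne, h_ne)      # ne0 = d_ff
hidden_ne = pre_ne                          # gelu is elementwise
out_ne    = mul_mat_ne(w2_t_ne, hidden_ne)  # ne0 = d_model

p pre_ne, hidden_ne, out_ne
```

Because the first result already has ne0=d_ff, it lines up with ne0 of t_w2_t and can be fed to the second matmul as is.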

Instance Attribute Summary

#d_ff ⇒ Object

Returns the value of attribute d_ff.
#d_model ⇒ Object

Returns the value of attribute d_model.
#realized ⇒ Object

Returns the value of attribute realized.
#sess ⇒ Object

Returns the value of attribute sess.
#t_h ⇒ Object

Returns the value of attribute t_h.
#t_hidden ⇒ Object

Returns the value of attribute t_hidden.
#t_out ⇒ Object

Returns the value of attribute t_out.
#t_pre ⇒ Object

Returns the value of attribute t_pre.
#t_seq ⇒ Object

Returns the value of attribute t_seq.
#t_w1_t ⇒ Object

Returns the value of attribute t_w1_t.
#t_w2_t ⇒ Object

Returns the value of attribute t_w2_t.

Instance Method Summary

#initialize ⇒ FFNFFICache constructor

A new instance of FFNFFICache.
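#realize_for(t_seq, d_model, d_ff) ⇒ Object

Creates the ggml session, builds the `matmul -> gelu -> matmul` graph for the given dimensions, and marks the cache as realized.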

Constructor Details

#initialize ⇒ `FFNFFICache`

Returns a new instance of FFNFFICache.

# File 'lib/toy/ffi/tinynn.rb', line 35

def initialize
  @realized = false
  @t_seq    = 0
  @d_model  = 0
  @d_ff     = 0
  # `:ptr` ivars seed with TinyNN.tnn_null_ptr (a typed NULL `void *`)
  # rather than `nil`. Post-spinel `85a4670`, mixing `nil` with `:ptr`
  # boxes the ivar as `sp_RbVal`, which then fails the `(void *)` cast
  # at every FFI call site downstream. The typed-NULL seed keeps the
  # ivar as plain `void *` end-to-end.
  @sess     = TinyNN.tnn_null_ptr
  @t_h      = TinyNN.tnn_null_ptr
  @t_w1_t   = TinyNN.tnn_null_ptr
  @t_w2_t   = TinyNN.tnn_null_ptr
  @t_pre    = TinyNN.tnn_null_ptr
  @t_hidden = TinyNN.tnn_null_ptr
  @t_out    = TinyNN.tnn_null_ptr
end

Instance Attribute Details

#d_ff ⇒ `Object`

Returns the value of attribute d_ff.



# File 'lib/toy/ffi/tinynn.rb', line 31

def d_ff
  @d_ff
end

#d_model ⇒ `Object`

Returns the value of attribute d_model.



# File 'lib/toy/ffi/tinynn.rb', line 31

def d_model
  @d_model
end

#realized ⇒ `Object`

Returns the value of attribute realized.



# File 'lib/toy/ffi/tinynn.rb', line 31

def realized
  @realized
end

#sess ⇒ `Object`

Returns the value of attribute sess.



# File 'lib/toy/ffi/tinynn.rb', line 31

def sess
  @sess
end

#t_h ⇒ `Object`

Returns the value of attribute t_h.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_h
  @t_h
end

#t_hidden ⇒ `Object`

Returns the value of attribute t_hidden.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_hidden
  @t_hidden
end

#t_out ⇒ `Object`

Returns the value of attribute t_out.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_out
  @t_out
end

#t_pre ⇒ `Object`

Returns the value of attribute t_pre.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_pre
  @t_pre
end

#t_seq ⇒ `Object`

Returns the value of attribute t_seq.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_seq
  @t_seq
end

#t_w1_t ⇒ `Object`

Returns the value of attribute t_w1_t.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_w1_t
  @t_w1_t
end

#t_w2_t ⇒ `Object`

Returns the value of attribute t_w2_t.



# File 'lib/toy/ffi/tinynn.rb', line 31

def t_w2_t
  @t_w2_t
end

Instance Method Details

#realize_for(t_seq, d_model, d_ff) ⇒ `Object`

# File 'lib/toy/ffi/tinynn.rb', line 54

def realize_for(t_seq, d_model, d_ff)
  @t_seq   = t_seq
  @d_model = d_model
  @d_ff    = d_ff

  @sess = TinyNN.tnn_session_new(0)
  # t_h:    ne=[d_model, T] -- h uploaded row-major (data[k] = h.flat[k]).
  # t_w1_t: ne=[d_model, d_ff] -- w1 uploaded transposed.
  # t_w2_t: ne=[d_ff, d_model] -- w2 uploaded transposed.
  @t_h    = TinyNN.tnn_input_2d_f32(@sess, t_seq,  d_model)
  @t_w1_t = TinyNN.tnn_input_2d_f32(@sess, d_ff,   d_model)
  @t_w2_t = TinyNN.tnn_input_2d_f32(@sess, d_model, d_ff)

  # Chain: mul_mat(w1_t, h) -> gelu -> mul_mat(w2_t, hidden).
  # Result shapes (ggml ne):  [d_ff, T] -> [d_ff, T] -> [d_model, T].
  @t_pre    = TinyNN.tnn_matmul(@sess, @t_w1_t, @t_h)
  @t_hidden = TinyNN.tnn_gelu(@sess, @t_pre)
  @t_out    = TinyNN.tnn_matmul(@sess, @t_w2_t, @t_hidden)
  # Mark intermediates as outputs so the scheduler doesn't alias
  # their buffers with later ops -- backward needs pre and hidden.
  TinyNN.tnn_set_output(@t_pre)
  TinyNN.tnn_set_output(@t_hidden)
  TinyNN.tnn_set_output(@t_out)
  TinyNN.tnn_realize(@sess, @t_out)

  @realized = true
end
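For cross-checking the three downloaded buffers, a plain-Ruby reference of the same chain may be useful. This is a sketch under assumptions, not the library's implementation: `ffn_reference` and `gelu_tanh` are hypothetical names, w1 and w2 are taken in their untransposed row-major form (`[d_model, d_ff]` and `[d_ff, d_model]`), and GELU is assumed to be the tanh approximation. The outputs are flat arrays indexed as `t * d_ff + f` (pre, hidden) and `t * d_model + j` (out), which is what a row-major memcpy of tensors with ne=[d_ff, T] and ne=[d_model, T] would yield.

```ruby
# Assumed GELU variant: tanh approximation.
def gelu_tanh(x)
  0.5 * x * (1.0 + Math.tanh(Math.sqrt(2.0 / Math::PI) * x * (1.0 + 0.044715 * x * x)))
end

# Hypothetical CPU reference for matmul -> gelu -> matmul.
# h:  flat [T * d_model], row-major
# w1: flat [d_model * d_ff], row-major (untransposed)
# w2: flat [d_ff * d_model], row-major (untransposed)
def ffn_reference(h, w1, w2, t_seq, d_model, d_ff)
  pre = Array.new(t_seq * d_ff, 0.0)
  t_seq.times do |t|
    d_ff.times do |f|
      acc = 0.0
      d_model.times { |k| acc += h[t * d_model + k] * w1[k * d_ff + f] }
      pre[t * d_ff + f] = acc
    end
  end

  hidden = pre.map { |x| gelu_tanh(x) }

  out = Array.new(t_seq * d_model, 0.0)
  t_seq.times do |t|
    d_model.times do |j|
      acc = 0.0
      d_ff.times { |f| acc += hidden[t * d_ff + f] * w2[f * d_model + j] }
      out[t * d_model + j] = acc
    end
  end

  [pre, hidden, out]
end

# Tiny usage example with identity weights (T=1, d_model=2, d_ff=2).
identity = [1.0, 0.0, 0.0, 1.0]
pre, hidden, out = ffn_reference([1.0, 2.0], identity, identity, 1, 2, 2)
p pre, hidden, out
```

Comparing against such a reference should use a small tolerance, since the ggml path runs in f32 while Ruby floats are doubles.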
