Module: Toy::LLM::Primitives::ScalableSoftmax

Defined in:
lib/toy/llm/primitives/scalable_softmax.rb

Constant Summary

NAME =
:scalable_softmax

Class Method Summary

Class Method Details

.attend(sess, scores, mask, ssmax_scale, max_bias) ⇒ Object

SSMax-scaled softmax over attention scores. scores is the raw q·kᵀ map; mask is the additive attention mask handle (or null); ssmax_scale is the block's precomputed Float (1/sqrt(d))*s*log(n); max_bias is the ALiBi bias parameter of ggml's soft_max_ext (0.0 when unused). Returns the attention-weight map. Passing the ordinary 1/sqrt(d) as ssmax_scale yields plain softmax, so this method also covers vanilla attention.



# File 'lib/toy/llm/primitives/scalable_softmax.rb', line 31

def self.attend(sess, scores, mask, ssmax_scale, max_bias)
  TinyNN.tnn_soft_max_ext(sess, scores, mask, ssmax_scale, max_bias)
end
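
To make the role of ssmax_scale concrete, here is a minimal pure-Ruby sketch of a scale-then-softmax over a single row of scores. It is an illustration only: the SsmaxSketch module and its methods are hypothetical names, it does not call TinyNN, and it assumes max_bias is 0.0 so the mask is added unscaled. The actual tnn_soft_max_ext kernel may differ in detail.

```ruby
# Reference sketch only. SsmaxSketch is a hypothetical helper, not part of
# Toy::LLM or TinyNN; it mirrors the description above on plain Ruby arrays.
module SsmaxSketch
  # (1/sqrt(d)) * s * log(n), following the documented form of ssmax_scale.
  def self.ssmax_scale(head_dim, s, n)
    (1.0 / Math.sqrt(head_dim)) * s * Math.log(n)
  end

  # softmax(scale * scores + mask) for one row; mask may be nil.
  # Assumes max_bias == 0.0, so no ALiBi slope is applied to the mask.
  def self.softmax_row(scores, mask, scale)
    logits = scores.each_with_index.map do |x, i|
      scale * x + (mask ? mask[i] : 0.0)
    end
    peak = logits.max # subtract the max for numerical stability
    exps = logits.map { |v| Math.exp(v - peak) }
    total = exps.sum
    exps.map { |e| e / total }
  end
end

scores = [1.0, 2.0, 3.0, 4.0]
causal = [0.0, 0.0, 0.0, -Float::INFINITY] # additive mask hiding the last key

# Vanilla attention: the scale is just 1/sqrt(d).
plain = SsmaxSketch.softmax_row(scores, causal, 1.0 / Math.sqrt(64))

# SSMax: the same call with the log(n)-dependent scale (s = 1.0, n = 1024 here).
scaled = SsmaxSketch.softmax_row(scores, causal, SsmaxSketch.ssmax_scale(64, 1.0, 1024))
```

Both calls go through the same code path and differ only in the scale value, which is the sense in which one primitive covers both SSMax and vanilla attention. With these inputs the SSMax row is more sharply peaked than the plain one, because log(1024) is greater than 1.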