Module: Toy::LLM::Primitives::ScalableSoftmax
- Defined in:
- lib/toy/llm/primitives/scalable_softmax.rb
Constant Summary
- NAME = :scalable_softmax
Class Method Summary
- .attend(sess, scores, mask, ssmax_scale, max_bias) ⇒ Object
SSMax-scaled softmax over attention scores.
Class Method Details
.attend(sess, scores, mask, ssmax_scale, max_bias) ⇒ Object
SSMax-scaled softmax over attention scores. scores is the raw q·kᵀ map; mask is the additive attention mask handle (or null); ssmax_scale is the block's precomputed (1/sqrt(d))*s*log(n) Float; max_bias is the ggml soft_max_ext ALiBi slope (0.0 when unused). Returns the attention-weight map. Passing the ordinary 1/sqrt(d) as ssmax_scale yields plain softmax, so this method also covers vanilla attention.
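The arithmetic described above can be sketched in plain Ruby for a single row of scores. This is an illustrative reference only, not the TinyNN implementation: the helper names ssmax_scale and reference_attend_row are hypothetical, the mask is modelled as an Array of additive Floats rather than a handle, and max_bias is assumed to be 0.0 (no ALiBi slopes).

```ruby
# Hypothetical helper: the precomputed scale (1/sqrt(d)) * s * log(n),
# where d is the head dimension, s the SSMax scaling parameter, and
# n the context length (natural logarithm).
def ssmax_scale(d, s, n)
  (1.0 / Math.sqrt(d)) * s * Math.log(n)
end

# Hypothetical reference for one row: softmax(scale * scores + mask).
# Assumes max_bias == 0.0, so the mask is added unscaled.
def reference_attend_row(scores, mask, scale)
  logits = scores.each_with_index.map do |x, i|
    x * scale + (mask ? mask[i] : 0.0)
  end
  peak = logits.max                          # subtract the max for numerical stability
  exps = logits.map { |z| Math.exp(z - peak) }
  total = exps.sum
  exps.map { |e| e / total }
end

row = [1.0, 2.0, 3.0]

# SSMax: scale depends on the context length n.
ssmax_weights = reference_attend_row(row, nil, ssmax_scale(64, 1.0, row.size))

# Vanilla attention: pass the ordinary 1/sqrt(d) as the scale.
plain_weights = reference_attend_row(row, nil, 1.0 / Math.sqrt(64))

# Additive mask: -Infinity removes a position from the distribution.
masked_weights = reference_attend_row(row, [0.0, 0.0, -Float::INFINITY], 0.125)
```

Because log(n) grows with the context length, the SSMax scale sharpens the distribution for longer inputs, whereas the 1/sqrt(d) scale stays fixed.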
# File 'lib/toy/llm/primitives/scalable_softmax.rb', line 31
def self.attend(sess, scores, mask, ssmax_scale, max_bias)
  TinyNN.tnn_soft_max_ext(sess, scores, mask, ssmax_scale, max_bias)
end