Module: ToySample

Defined in: lib/toy/train/toy_sample.rb

Class Method Summary

._read_vocab(path) ⇒ Object

Read a vocab file (one token per line; line index = token ID).

.detokenize(ids, vocab_path) ⇒ Object

IDs → space-separated word string via vocab file.

.emit_event(prompt_text, text, step, t_now) ⇒ Object

Emit one toy/v1 sample event.

.greedy_decode(sess, t_logits, t_tokens, t_positions, prompt_ids, n_new, context, vocab_size) ⇒ Object

Greedy autoregressive decode using a realized forward graph.

Class Method Details

._read_vocab(path) ⇒ `Object`

Read a vocab file (one token per line; line index = token ID). File.read is the natural “whole file as one string” primitive here; the historical Spinel d59926a interaction with File.open-with-block + FFI-init is fixed as of a03bb49. A pure while-loop walk over the split lines (no .each) keeps Array<String> dispatch monomorphic inside this module.

# File 'lib/toy/train/toy_sample.rb', line 80

def self._read_vocab(path)
  raw = File.read(path)
  parts = raw.split("\n")
  out = ["?"]; out.pop
  i = 0
  while i < parts.length
    out.push(parts[i])
    i = i + 1
  end
  out
end
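
As a small illustration of the file format this method expects, the sketch below writes a three-token vocab to a temporary file and reads it back with the same File.read + split walk. The helper name read_vocab_sketch and the sample tokens are hypothetical; only Ruby's standard library is used.

```ruby
require "tempfile"

# Hypothetical stand-in for ToySample._read_vocab: same File.read + split walk.
def read_vocab_sketch(path)
  parts = File.read(path).split("\n")
  out = []
  i = 0
  while i < parts.length
    out.push(parts[i])
    i += 1
  end
  out
end

# Sample vocab file: the line index is the token ID.
vocab_file = Tempfile.new("vocab")
vocab_file.write("<pad>\nhello\nworld\n")
vocab_file.close

vocab = read_vocab_sketch(vocab_file.path)
vocab_file.unlink

puts vocab.length  # prints 3
puts vocab[1]      # prints hello
```

Because String#split drops trailing empty fields, the final newline in the file does not produce an extra empty token at the end of the list.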

.detokenize(ids, vocab_path) ⇒ `Object`

IDs → space-separated word string via vocab file. Out-of-range IDs render as <UNK:N>; never crashes.

# File 'lib/toy/train/toy_sample.rb', line 94

def self.detokenize(ids, vocab_path)
  vocab = _read_vocab(vocab_path)
  out = ""
  i = 0
  while i < ids.length
    if i > 0; out = out + " "; end
    tid = ids[i]
    if tid >= 0 && tid < vocab.length
      out = out + vocab[tid]
    else
      out = out + "<UNK:" + tid.to_s + ">"
    end
    i = i + 1
  end
  out
end
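
The out-of-range behavior described above can be checked with a short sketch. detokenize_sketch is a hypothetical variant that takes the vocab as an Array rather than a file path, so the example needs no file on disk; it uses map/join for brevity, whereas the module itself deliberately sticks to while loops.

```ruby
# Hypothetical variant of detokenize: vocab is passed in as an Array.
def detokenize_sketch(ids, vocab)
  words = ids.map do |tid|
    if tid >= 0 && tid < vocab.length
      vocab[tid]
    else
      # Out-of-range (including negative) IDs become a visible marker.
      "<UNK:#{tid}>"
    end
  end
  words.join(" ")
end

vocab = ["<pad>", "hello", "world"]
puts detokenize_sketch([1, 2, 7, -1], vocab)
# prints: hello world <UNK:7> <UNK:-1>
```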

.emit_event(prompt_text, text, step, t_now) ⇒ `Object`

Emit one toy/v1 sample event. The caller hands in the already-detokenized prompt and text strings, the macro step, and the wall-time tick. SpinelKit::Json::Builder.j_str escapes the prompt/text bodies, replacing the old local json_escape, which missed \r, \b, \f and control bytes.

# File 'lib/toy/train/toy_sample.rb', line 115

def self.emit_event(prompt_text, text, step, t_now)
  e = SpinelKit::Json::Builder.new
  e.add_str("kind",  "sample")
  e.add_str("phase", "decode")
  e.add_num("t",      t_now)
  e.add_num("step",   step)
  e.add_str("prompt", prompt_text)
  e.add_str("text",   text)
  TinyNN.tnn_events_emit(e.dump)
end
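
To give a rough idea of the event's shape, the sketch below builds the same six fields with Ruby's standard json library. This is an assumption-laden stand-in: sample_event_sketch is a hypothetical name, and the exact serialization produced by SpinelKit::Json::Builder (field order, number formatting) is not shown in this page and may differ.

```ruby
require "json"

# Hypothetical stand-in for the builder: a plain Hash serialized with the
# standard json library. The real SpinelKit::Json::Builder output may differ.
def sample_event_sketch(prompt_text, text, step, t_now)
  event = {
    "kind"   => "sample",
    "phase"  => "decode",
    "t"      => t_now,
    "step"   => step,
    "prompt" => prompt_text,
    "text"   => text
  }
  JSON.generate(event)
end

# The newline inside the text body is escaped, so the event stays on one line.
puts sample_event_sketch("the cat", "the cat sat\non the mat", 100, 12.5)
```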

.greedy_decode(sess, t_logits, t_tokens, t_positions, prompt_ids, n_new, context, vocab_size) ⇒ `Object`

Greedy autoregressive decode using a realized forward graph.

The caller supplies the realized cache’s token-IDs / positions / logits tensors. We run the forward graph once per decode step: upload [prompt + decoded-so-far + zero-padding], compute, argmax logits[cur_len - 1, :], append, repeat.

Returns the full sequence as Array<Int>. Its length is prompt_len + n_new, or shorter if the context window fills up first, in which case decoding stops early and the tokens produced so far are returned.

# File 'lib/toy/train/toy_sample.rb', line 28

def self.greedy_decode(sess, t_logits, t_tokens, t_positions,
                         prompt_ids, n_new, context, vocab_size)
  tokens    = [0]; tokens.pop
  positions = [0]; positions.pop
  # Seed [prompt + zero-pad to context]. Positions are 0..context-1
  # (RoPE encodes each slot's position independently).
  pk = 0
  while pk < prompt_ids.length
    tokens.push(prompt_ids[pk])
    positions.push(pk)
    pk = pk + 1
  end
  while tokens.length < context
    tokens.push(0)
    positions.push(tokens.length - 1)
  end

  n_done = 0
  while n_done < n_new
    cur_len = prompt_ids.length + n_done
    if cur_len >= context
      return tokens[0...cur_len]
    end
    TinyNN.upload_int_array(sess, t_tokens,    tokens)
    TinyNN.upload_int_array(sess, t_positions, positions)
    TinyNN.tnn_compute(sess)
    # logits ggml shape [vocab, T] → row-major Mat is [T, vocab].
    logits = TinyNN.download_row_major(sess, t_logits, context, vocab_size)
    base = (cur_len - 1) * vocab_size
    best_v = 0
    best_l = logits.flat[base]
    v = 1
    while v < vocab_size
      x = logits.flat[base + v]
      if x > best_l
        best_l = x
        best_v = v
      end
      v = v + 1
    end
    tokens[cur_len] = best_v
    n_done = n_done + 1
  end
  tokens[0...(prompt_ids.length + n_new)]
end
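
The decode loop above can be exercised without a TinyNN session by swapping the upload/compute/download steps for a stub. In this sketch, fake_forward, argmax_row, and greedy_decode_sketch are hypothetical names; the stub simply makes position p favor token (tokens[p] + 1) % vocab_size, so the output is easy to predict. Like the original, it assumes a non-empty prompt, since cur_len - 1 would otherwise be -1.

```ruby
# Hypothetical stand-in for upload + tnn_compute + download_row_major:
# returns a flat row-major [context, vocab_size] Array of logits where the
# row for position pos favors token (tokens[pos] + 1) % vocab_size.
def fake_forward(tokens, context, vocab_size)
  flat = Array.new(context * vocab_size, 0.0)
  pos = 0
  while pos < context
    flat[pos * vocab_size + (tokens[pos] + 1) % vocab_size] = 1.0
    pos += 1
  end
  flat
end

# Index of the largest value in row `row` of a flat row-major buffer.
def argmax_row(flat, row, vocab_size)
  base = row * vocab_size
  best_v = 0
  best_l = flat[base]
  v = 1
  while v < vocab_size
    if flat[base + v] > best_l
      best_l = flat[base + v]
      best_v = v
    end
    v += 1
  end
  best_v
end

def greedy_decode_sketch(prompt_ids, n_new, context, vocab_size)
  # Seed [prompt + zero-pad to context].
  pad = [context - prompt_ids.length, 0].max
  tokens = prompt_ids + Array.new(pad, 0)
  n_done = 0
  while n_done < n_new
    cur_len = prompt_ids.length + n_done
    return tokens[0...cur_len] if cur_len >= context  # context window full
    flat = fake_forward(tokens, context, vocab_size)
    # The logits at the last filled position predict the next token.
    tokens[cur_len] = argmax_row(flat, cur_len - 1, vocab_size)
    n_done += 1
  end
  tokens[0...(prompt_ids.length + n_new)]
end

p greedy_decode_sketch([3, 4], 3, 8, 5)   # prints [3, 4, 0, 1, 2]
p greedy_decode_sketch([3, 4], 10, 4, 5)  # prints [3, 4, 0, 1] (context full)
```

The second call shows the early stop: only two new tokens fit in a context of 4, so the result is shorter than prompt_len + n_new.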
