Class: Rubino::Jobs::Handlers::DistillSkillJob

Inherits:
Object
Defined in:
lib/rubino/jobs/handlers/distill_skill_job.rb

Overview

Variant B — deterministic post-turn skill distillation.

Enqueued from Interaction::Lifecycle#enqueue_post_turn_jobs alongside ExtractMemoryJob. The GATE is fully deterministic (no model call):

- the run produced a non-empty final assistant answer (succeeded), AND
- the turn used >= TOOL_THRESHOLD tool calls (mirrors the reference "5+"), AND
- no existing skill already covers the work (kept simple here:
  no skill whose name/description shares a salient keyword with the
  user's task — a fresh skills dir always passes).
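The three gate conditions above can be sketched as a pure function. This is only an illustration, not the job's actual `gate_passes?`: the message shape (hashes with `:role`, `:content`, `:tool_calls`), the `gate_sketch` name, and passing the coverage result in as a keyword argument are all assumptions made for the example.

```ruby
# Hypothetical sketch of the deterministic gate (no model call).
# Assumes each message is a Hash with :role, :content and optional :tool_calls.
def gate_sketch(messages, covered_by_existing_skill:, threshold: 5)
  # 1. The run must end in a non-empty assistant answer.
  final = messages.reverse.find { |m| m[:role] == "assistant" }
  return false if final.nil? || final[:content].to_s.strip.empty?

  # 2. The turn must have used at least `threshold` tool calls.
  tool_calls = messages.sum { |m| Array(m[:tool_calls]).size }
  return false if tool_calls < threshold

  # 3. No existing skill may already cover the work.
  !covered_by_existing_skill
end
```

Because every input is plain data, a gate of this shape can be unit-tested without a model or a skills directory.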

Only on a gate-PASS do we spend ONE auxiliary-model call to distil the just-finished transcript into a SKILL.md candidate, which we then write. So: +1 LLM call per gate-pass, 0 otherwise.

Constant Summary

TOOL_THRESHOLD =
Integer(ENV.fetch("RA_DISTILL_TOOL_THRESHOLD", "5"))
NAME_RE =
/\A[a-z0-9]+(?:-[a-z0-9]+)*\z/
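To make the accepted shape concrete, here is a small check built on the same pattern. `NAME_RE_SKETCH` and `kebab_name?` are hypothetical names for this example; where the job itself applies `NAME_RE` is not shown on this page.

```ruby
# Same pattern as NAME_RE, copied so the example is self-contained.
NAME_RE_SKETCH = /\A[a-z0-9]+(?:-[a-z0-9]+)*\z/

# Hypothetical helper: true only for lowercase kebab-case strings.
def kebab_name?(name)
  name.is_a?(String) && NAME_RE_SKETCH.match?(name)
end

kebab_name?("deploy-rails-app") # => true
kebab_name?("Deploy-Rails")     # => false (uppercase)
kebab_name?("trailing-")        # => false (dangling hyphen)
kebab_name?("a--b")             # => false (empty segment)
kebab_name?("ok\n")             # => false (\z rejects a trailing newline)
```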
STOPWORDS =

Common 4+-char English / dev words that carry no topical signal. A single one of these overlapping (“file”, “code”, “this”, “with”, “rails” sitting in a skill description) must NOT count as coverage. That single-shared-word rule (#368) over-suppressed legitimately distinct tasks (“deploy workflow for Rails” was suppressed by the word “rails” appearing in ruby-expert’s description).

%w[
  this that with from your into about make made using used will would
  should could have has had been being does done when then than them
  they their there here what which while also some such only just like
  want need help please thing things file files code line lines step
  steps task tasks work works call calls user users data text time
  name names show list find each both more most less very much many
  good well over under again same other else type kind sort
].to_set.freeze
COVERAGE_JACCARD =

Coverage requires MEANINGFUL overlap (#368), not a single shared word: a name-level match, OR salient stopword-filtered tokens overlapping by at least COVERAGE_JACCARD with at least MIN_SHARED_SALIENT shared tokens.

0.4
MIN_SHARED_SALIENT =
2
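The token-overlap half of the coverage rule can be sketched as follows. This is a sketch under stated assumptions: the tokenizer (lowercased alphanumeric runs of 4+ characters), the shortened stopword list, and the helper names are illustrative, and the name-level match branch is omitted.

```ruby
require "set"

# Abbreviated, hypothetical stopword list; the real STOPWORDS is longer.
STOP_SKETCH = %w[this that with from file code task].to_set.freeze

# Assumed tokenizer: lowercase alphanumeric runs of 4+ chars, minus stopwords.
def salient_tokens(text, stopwords: STOP_SKETCH)
  text.downcase.scan(/[a-z0-9]{4,}/).reject { |t| stopwords.include?(t) }.to_set
end

# Covered only if BOTH thresholds hold: enough shared tokens AND enough Jaccard overlap.
def covered_sketch?(task, skill_text, jaccard: 0.4, min_shared: 2)
  a = salient_tokens(task)
  b = salient_tokens(skill_text)
  union = a | b
  return false if union.empty?

  shared = a & b
  shared.size >= min_shared && shared.size.to_f / union.size >= jaccard
end

covered_sketch?("deploy workflow for Rails", "Ruby expert for Rails apps")
# => false: only "rails" is shared, below the minimum of 2
covered_sketch?("deploy rails workflow to staging", "deploy rails workflow")
# => true: 3 shared tokens out of a union of 4 (Jaccard 0.75)
```

Requiring both a minimum shared count and a Jaccard ratio keeps one incidental word from suppressing a distinct task, while still letting short, closely matching descriptions count as coverage.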
DISTILL_SYSTEM =
<<~SYS
  You distil a just-finished agent task into a REUSABLE skill, or decline.
  You are given the user's task and a transcript of the tools the agent ran
  and its final answer. If — and only if — the work was a complex, multi-step,
  REPEATABLE procedure that would help future similar tasks, output a skill.
  If it was trivial, one-off, or not generalizable, decline.

  Output ONLY a JSON object, no prose:
  {"create": true, "name": "<kebab-case, <=64 chars>",
   "description": "<one line: what it's for and WHEN it applies>",
   "body": "<markdown: # Title then the proven step-by-step instructions, commands, pitfalls — generalized, not hard-coded to this one input>"}
  or {"create": false, "reason": "<why not skill-worthy>"}
SYS
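The prompt asks for a bare JSON object, so the caller has to tolerate declines and malformed output. A minimal sketch of that handling is below; `parse_candidate` is a hypothetical helper, and the real `distill` method may parse and validate differently.

```ruby
require "json"

# Hypothetical helper: returns the candidate Hash only when the model
# asked to create a skill and supplied every required field; nil otherwise.
def parse_candidate(raw)
  data = JSON.parse(raw)
  return nil unless data.is_a?(Hash) && data["create"] == true

  required = %w[name description body]
  return nil unless required.all? { |k| data[k].is_a?(String) && !data[k].strip.empty? }

  data
rescue JSON::ParserError
  # Prose or truncated output is treated the same as a decline.
  nil
end
```

Returning nil for every non-create outcome matches the `candidate && candidate["create"] == true` guard in `#perform`, so a decline and a parse failure both end the job without writing anything.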

Instance Method Summary

Instance Method Details

#perform(payload) ⇒ Object



# File 'lib/rubino/jobs/handlers/distill_skill_job.rb', line 62

def perform(payload)
  session_id = payload[:session_id] || payload["session_id"]
  return unless session_id

  messages = Session::Store.new.for_session(session_id)
  return unless gate_passes?(messages)

  candidate = distill(messages)
  return unless candidate && candidate["create"] == true

  write_skill(candidate)
rescue StandardError => e
  Rubino.logger.warn(event: "jobs.distill_skill.error", error_class: e.class.name, message: e.message)
  nil
end