Class: Rubino::LLM::InlineThinkFilter
- Inherits: Object
  - Object
  - Rubino::LLM::InlineThinkFilter
- Defined in: lib/rubino/llm/inline_think_filter.rb
Overview
Streaming filter that splits text into :content and :thinking events by recognising inline <think>…</think> sentinels emitted by MiniMax, DeepSeek-R1, Qwen, and similar reasoning models that don’t expose a dedicated reasoning channel.
Holds back up to TAG_MAX_LEN-1 chars across chunks so a tag split between chunks (e.g. “<thi” + “nk>”) still gets matched. Call #flush at end of stream to drain any tail.
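The hold-back described above can be sketched in isolation. This is a minimal illustration, not the class's private code: `split_safe_prefix` is a hypothetical helper name, and it assumes the filter simply keeps the last TAG_MAX_LEN-1 characters of the buffer until more input arrives.

```ruby
TAG_MAX_LEN = "</think>".length # 8, same value as the documented constant

# Hypothetical helper (not part of Rubino): split a buffer into the part that
# is safe to emit now and a tail that could still be the start of a tag.
def split_safe_prefix(pending, hold = TAG_MAX_LEN - 1)
  keep = [pending.length, hold].min # never hold more than we have
  cut = pending.length - keep
  [pending[0, cut], pending[cut, keep]]
end

emit, held = split_safe_prefix("Hello <thi")
# emit is "Hel" and held is "lo <thi"; once the next chunk "nk>" arrives,
# held + "nk>" contains a complete "<think>" that the matcher can find.
```

Holding back TAG_MAX_LEN-1 characters is enough because a tag of at most TAG_MAX_LEN characters that has not fully arrived can have at most that many characters already in the buffer.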
A reasoning model emits its <think> block as the FIRST thing in the turn: the reasoning precedes the answer. A LITERAL <think> a coding agent types mid-answer (echoing user input, writing docs/HTML, discussing the syntax) is content, not a control marker, and MUST survive verbatim. So we only honor an OPENING <think> as a reasoning sentinel while the turn still LEADS with it, i.e. before any visible content has been emitted and while not inside a fenced code block. Once real content (or a ``` fence) has appeared, every <think>/</think> is treated as ordinary text and is never dropped from the answer or the persisted transcript (STRM-1).
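The leading-tag rule can be expressed as a small predicate. The sketch below is an assumption-labelled illustration: `reasoning_sentinel?` is a hypothetical name, and the `content_seen` and `in_fence` keywords stand in for the filter's internal state rather than reproducing it.

```ruby
OPEN_RE = /<think>/i # same pattern as the documented constant

# Hypothetical predicate (not part of Rubino): is the first <think> in
# `pending` a reasoning sentinel, or literal text that must be kept?
def reasoning_sentinel?(pending, content_seen: false, in_fence: false)
  match = pending.match(OPEN_RE)
  return false unless match                 # no tag at all
  return false if content_seen || in_fence  # the turn no longer leads with it
  !pending[0, match.begin(0)].match?(/\S/)  # only whitespace may precede it
end

reasoning_sentinel?("\n<think>plan")                # true: whitespace-only prefix
reasoning_sentinel?("Use the <think> tag")          # false: visible text precedes it
reasoning_sentinel?("<think>", content_seen: true)  # false: content already emitted
```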
Constant Summary
- OPEN_RE = /<think>/i
- CLOSE_RE = %r{</think>}i
- FENCE_RE = /```/
  A ``` fence toggles “literal code” mode: backticks can appear mid-line (inline `code`) or open a block, so we only need to know a fence run STARTED to stop treating <think> as control inside it.
- TAG_MAX_LEN = "</think>".length
Instance Method Summary
- #feed(chunk, &block) ⇒ Object
- #flush {|sentinel, @pending| ... } ⇒ Object
- #initialize ⇒ InlineThinkFilter (constructor)
  A new instance of InlineThinkFilter.
Constructor Details
#initialize ⇒ InlineThinkFilter
Returns a new instance of InlineThinkFilter.
    # File 'lib/rubino/llm/inline_think_filter.rb', line 32

    def initialize
      @inside = false       # currently inside a <think>...</think> reasoning span
      @content_seen = false # any visible (:content) text already emitted this turn
      @in_fence = false     # inside a ``` code fence (where <think> is literal)
      @pending = +""
    end
Instance Method Details
#feed(chunk, &block) ⇒ Object
    # File 'lib/rubino/llm/inline_think_filter.rb', line 39

    def feed(chunk, &block)
      @pending << chunk
      loop do
        # Outside a reasoning span, <think> is only a CONTROL marker while the
        # turn still LEADS with it: no visible content emitted yet and not
        # inside a ``` fence. Once content (or a fence) has appeared, every
        # <think> is literal: emit the safe prefix as content and never split.
        if !@inside && (@content_seen || @in_fence)
          emit_safe_prefix(:content, &block)
          break
        end

        re, sentinel = @inside ? [CLOSE_RE, :thinking] : [OPEN_RE, :content]
        match = @pending.match(re)
        if match
          idx = match.begin(0)
          # An OPEN <think> preceded by NON-BLANK content on this turn is not a
          # reasoning sentinel; it's literal text the user must keep. Emit the
          # whole pending span (prefix INCLUDING the tag) as content and treat
          # all that follows as literal too. (Whitespace-only prefix still
          # leads, so a genuine reasoning block can start after a newline.)
          if sentinel == :content && @pending[0, idx].match?(/\S/)
            emit_safe_prefix(:content, &block)
            break
          end

          tag_len = match[0].length
          emit = @pending.slice!(0, idx)
          @pending.slice!(0, tag_len)
          unless emit.empty?
            note_content(emit) if sentinel == :content
            block.call(sentinel, emit)
          end
          @inside = !@inside
        else
          emit_safe_prefix(sentinel, &block)
          break
        end
      end
    end
#flush {|sentinel, @pending| ... } ⇒ Object
    # File 'lib/rubino/llm/inline_think_filter.rb', line 81

    def flush
      return if @pending.empty?

      sentinel = @inside ? :thinking : :content
      yield sentinel, @pending
      @pending = +""
    end
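To show how #feed and #flush cooperate over a chunked stream, here is a simplified, self-contained stand-in. `MiniThinkFilter` is hypothetical and is not the Rubino source: it keeps only the tag matching and hold-back, uses exact (case-sensitive) string matching, and omits the leading-tag and code-fence rules, since the private helpers `emit_safe_prefix` and `note_content` are not shown on this page.

```ruby
# Hypothetical stand-in for illustration only; not Rubino::LLM::InlineThinkFilter.
class MiniThinkFilter
  OPEN = "<think>"
  CLOSE = "</think>"
  HOLD = CLOSE.length - 1 # keep up to 7 chars so a split tag can still complete

  def initialize
    @inside = false
    @pending = +""
  end

  def feed(chunk)
    @pending << chunk
    loop do
      tag, kind = @inside ? [CLOSE, :thinking] : [OPEN, :content]
      idx = @pending.index(tag)
      if idx
        text = @pending.slice!(0, idx)  # everything before the tag
        @pending.slice!(0, tag.length)  # drop the tag itself
        yield kind, text unless text.empty?
        @inside = !@inside
      else
        cut = @pending.length - HOLD    # emit all but the held-back tail
        yield kind, @pending.slice!(0, cut) if cut.positive?
        break
      end
    end
  end

  def flush
    return if @pending.empty?

    kind = @inside ? :thinking : :content
    yield kind, @pending
    @pending = +""
  end
end

events = []
filter = MiniThinkFilter.new
["<thi", "nk>plan the ", "reply</th", "ink>Hello!"].each do |chunk|
  filter.feed(chunk) { |kind, text| events << [kind, text] }
end
filter.flush { |kind, text| events << [kind, text] }

thinking = events.select { |kind, _| kind == :thinking }.map(&:last).join
content  = events.select { |kind, _| kind == :content }.map(&:last).join
# thinking is "plan the reply" and content is "Hello!"
```

In this sketch the final "Hello!" is shorter than the hold-back window, so it only reaches the consumer through the flush call; that is why the overview asks callers to invoke #flush at end of stream.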