Class: Tep::Llm::OpenAI::StreamSink

Inherits: Object

Defined in: lib/tep/openai_server.rb

Overview

The per-token write surface a streaming backend uses (7.2). One method: `emit_token(piece)`. The sink formats `piece` as an OpenAI text-completion SSE frame and writes one chunked frame to the outbound stream. It also counts emitted tokens, which supply the inference event’s completion_tokens.

Why a sink object instead of a block: spinel can’t lower a block parameter across the backend call boundary; a typed object with one method does the same job through ordinary virtual dispatch.
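The sink-instead-of-block pattern can be sketched as follows. This is a minimal illustration, not the project's backend code: `CollectingSink` and `EchoBackend` are hypothetical names, and the only thing taken from the text above is that the backend receives an object and calls its single `emit_token(piece)` method.

```ruby
# Hypothetical sink: any object that responds to emit_token(piece).
class CollectingSink
  attr_reader :pieces

  def initialize
    @pieces = []
  end

  # Same contract as StreamSink#emit_token: accept one piece, return 0.
  def emit_token(piece)
    @pieces.push(piece)
    0
  end
end

# Hypothetical backend: it is handed a sink object, not a block, so each
# token crosses the call boundary as an ordinary method call.
class EchoBackend
  def generate(prompt, sink)
    words = prompt.split(" ")
    i = 0
    while i < words.length
      sink.emit_token(words[i] + " ")
      i += 1
    end
    0
  end
end

sink = CollectingSink.new
EchoBackend.new.generate("hello streaming world", sink)
puts sink.pieces.join
```

Because the backend depends only on the `emit_token` method, a test double such as `CollectingSink` can stand in for the real sink without touching the backend.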

Instance Attribute Summary

#completion_count ⇒ Object

Returns the value of attribute completion_count.
#model ⇒ Object

Returns the value of attribute model.
#out ⇒ Object

Returns the value of attribute out.

Instance Method Summary

#emit_token(piece) ⇒ Object

Write one SSE event carrying a single text delta.
#initialize ⇒ StreamSink constructor

A new instance of StreamSink.

Constructor Details

#initialize ⇒ `StreamSink`

Returns a new instance of StreamSink.

# File 'lib/tep/openai_server.rb', line 304

def initialize
  @model            = ""
  @completion_count = 0
end

Instance Attribute Details

#completion_count ⇒ `Object`

Returns the value of attribute completion_count.

# File 'lib/tep/openai_server.rb', line 302

def completion_count
  @completion_count
end

#model ⇒ `Object`

Returns the value of attribute model.

# File 'lib/tep/openai_server.rb', line 302

def model
  @model
end

#out ⇒ `Object`

Returns the value of attribute out.

# File 'lib/tep/openai_server.rb', line 302

def out
  @out
end

Instance Method Details

#emit_token(piece) ⇒ `Object`

Write one SSE event carrying a single text delta. Matches OpenAI’s text_completion streaming shape: one choices[].text per event, with finish_reason: null until the streamer sends [DONE]. created is set from Time.now.to_i (epoch seconds). Each call increments completion_count by one and returns 0.

# File 'lib/tep/openai_server.rb', line 313

def emit_token(piece)
  # Every call counts as one completion token.
  @completion_count = @completion_count + 1
  # Build the JSON payload by concatenating pre-encoded key/value pairs.
  frame = "{" +
    Tep::Json.encode_pair_str("id", "cmpl-tep") + "," +
    Tep::Json.encode_pair_str("object", "text_completion") + "," +
    Tep::Json.encode_pair_int("created", Time.now.to_i) + "," +
    Tep::Json.encode_pair_str("model", @model) + "," +
    "\"choices\":[{" +
      Tep::Json.encode_pair_int("index", 0) + "," +
      Tep::Json.encode_pair_str("text", piece) + "," +
      "\"finish_reason\":null" +
    "}]" +
  "}"
  # SSE framing: a "data: " line terminated by a blank line.
  @out.write("data: " + frame + "\n\n")
  0
end
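To see the frames this method puts on the wire, the sketch below mirrors it in a self-contained stand-in. Several things here are assumptions: `SketchSink` is a hypothetical class, the stdlib `JSON` module replaces `Tep::Json` (whose escaping behaviour is not shown on this page), `out` and `model` are assumed to be assignable by the caller, and a `StringIO` plays the role of the outbound stream.

```ruby
require "json"
require "stringio"

# Hypothetical stand-in that mirrors StreamSink#emit_token.
class SketchSink
  attr_accessor :out, :model   # assumed writable; the page lists only readers
  attr_reader :completion_count

  def initialize
    @model = ""
    @completion_count = 0
  end

  def emit_token(piece)
    @completion_count += 1
    frame = JSON.generate(
      "id" => "cmpl-tep",
      "object" => "text_completion",
      "created" => Time.now.to_i,
      "model" => @model,
      "choices" => [{ "index" => 0, "text" => piece, "finish_reason" => nil }]
    )
    # One SSE event: a "data: " line followed by a blank line.
    @out.write("data: " + frame + "\n\n")
    0
  end
end

sink = SketchSink.new
sink.out = StringIO.new
sink.model = "demo-model"
sink.emit_token("Hel")
sink.emit_token("lo")

# Client side: split the stream into events and decode each payload.
events = sink.out.string.split("\n\n")
texts = events.map do |event|
  JSON.parse(event.delete_prefix("data: "))["choices"][0]["text"]
end
puts texts.join
puts sink.completion_count
```

A client reassembles the completion by concatenating `choices[0].text` across events, and `completion_count` ends up equal to the number of `emit_token` calls.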