Class: Tep::Llm::OpenAI::ChatStreamSink

Inherits:
Object
Defined in:
lib/tep/openai_server.rb

Overview

Chat-streaming write surface (#127). Three emit_* methods cover the OpenAI chat-streaming wire shape:

1. emit_role_prelude("assistant") -> first frame carries
   `delta:{role:"assistant"}` (no content).
2. emit_token(piece) -> N content frames, each
   `delta:{content:<piece>}` with finish_reason:null.
3. emit_finish("stop") -> last frame carries an empty
   `delta:{}` with finish_reason set; the streamer then
   writes the terminating data:[DONE].

Backends typically call sink.emit_role_prelude("assistant") once, then call sink.emit_token(piece) for each generated token. The streamer invokes emit_finish after the backend returns; calling it is not the backend's responsibility.
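The call order described above can be sketched as follows. `RecordingSink` is a hypothetical stand-in that only records calls (it is not the real ChatStreamSink and writes no frames), and `run_backend` is an illustrative name for a backend entry point.

```ruby
# Hypothetical stand-in that records the calls a backend makes.
# It mirrors only the method names of ChatStreamSink, not its output.
class RecordingSink
  attr_reader :calls

  def initialize
    @calls = []
  end

  def emit_role_prelude(role)
    @calls << [:role, role]
    0
  end

  def emit_token(piece)
    @calls << [:token, piece]
    0
  end
end

# Illustrative backend: one prelude, then one emit_token per piece.
# It never calls emit_finish; the streamer does that after this returns.
def run_backend(sink, pieces)
  sink.emit_role_prelude("assistant")
  pieces.each { |piece| sink.emit_token(piece) }
  0
end

sink = RecordingSink.new
run_backend(sink, ["Hel", "lo"])
p sink.calls
# prints [[:role, "assistant"], [:token, "Hel"], [:token, "lo"]]
```

Leaving emit_finish out of the stand-in is deliberate: a backend that tried to call it would fail here, which matches the division of responsibility described above.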

Constructor Details

#initialize ⇒ ChatStreamSink

Returns a new instance of ChatStreamSink.



# File 'lib/tep/openai_server.rb', line 370

def initialize
  @model            = ""
  @completion_count = 0
end

Instance Attribute Details

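#completion_count ⇒ Object

Returns the value of attribute completion_count: the number of content frames emitted so far. It starts at 0 and is incremented by each emit_token call.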



# File 'lib/tep/openai_server.rb', line 368

def completion_count
  @completion_count
end

#model ⇒ Object

Returns the value of attribute model.



# File 'lib/tep/openai_server.rb', line 368

def model
  @model
end

#out ⇒ Object

Returns the value of attribute out.



# File 'lib/tep/openai_server.rb', line 368

def out
  @out
end

Instance Method Details

#emit_finish(reason) ⇒ Object

Final frame: empty delta plus a populated finish_reason. The streamer writes the terminating data:[DONE] after this.



# File 'lib/tep/openai_server.rb', line 417

def emit_finish(reason)
  frame = "{" +
    Tep::Json.encode_pair_str("id", "chatcmpl-tep") + "," +
    Tep::Json.encode_pair_str("object", "chat.completion.chunk") + "," +
    Tep::Json.encode_pair_int("created", Time.now.to_i) + "," +
    Tep::Json.encode_pair_str("model", @model) + "," +
    "\"choices\":[{" +
      Tep::Json.encode_pair_int("index", 0) + "," +
      "\"delta\":{}," +
      Tep::Json.encode_pair_str("finish_reason", reason) +
    "}]" +
  "}"
  @out.write("data: " + frame + "\n\n")
  0
end
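A minimal sketch of the end of a stream: the final frame followed by the terminator. It builds the frame with Ruby's stdlib JSON rather than Tep::Json, and `sketch_finish_frame` and `sketch_close_stream` are hypothetical names, not part of the library. The terminator is written as `data: [DONE]` with a space, matching the `data: ` prefix used by the frames; the exact bytes the real streamer writes are an assumption here.

```ruby
require "json"
require "stringio"

# Builds the final chunk with the same fields emit_finish writes,
# using stdlib JSON instead of Tep::Json (an assumption for this sketch).
def sketch_finish_frame(model, reason, created)
  JSON.generate(
    "id" => "chatcmpl-tep",
    "object" => "chat.completion.chunk",
    "created" => created,
    "model" => model,
    "choices" => [{ "index" => 0, "delta" => {}, "finish_reason" => reason }]
  )
end

# Hypothetical streamer tail: final frame, then the terminator.
def sketch_close_stream(out, model, reason, created = Time.now.to_i)
  out.write("data: " + sketch_finish_frame(model, reason, created) + "\n\n")
  out.write("data: [DONE]\n\n")
  0
end

out = StringIO.new
sketch_close_stream(out, "demo-model", "stop", 0)
puts out.string
```

Each event ends with a blank line, so a client can split the stream on "\n\n" and treat `[DONE]` as the end marker.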

#emit_role_prelude(role) ⇒ Object

First frame: role-only delta, no content. Per OpenAI’s wire shape, sent once before content frames.



# File 'lib/tep/openai_server.rb', line 377

def emit_role_prelude(role)
  frame = "{" +
    Tep::Json.encode_pair_str("id", "chatcmpl-tep") + "," +
    Tep::Json.encode_pair_str("object", "chat.completion.chunk") + "," +
    Tep::Json.encode_pair_int("created", Time.now.to_i) + "," +
    Tep::Json.encode_pair_str("model", @model) + "," +
    "\"choices\":[{" +
      Tep::Json.encode_pair_int("index", 0) + "," +
      "\"delta\":{" +
        Tep::Json.encode_pair_str("role", role) +
      "}," +
      "\"finish_reason\":null" +
    "}]" +
  "}"
  @out.write("data: " + frame + "\n\n")
  0
end

#emit_token(piece) ⇒ Object

Content delta frame. Called once per generated token; each call also increments completion_count.



# File 'lib/tep/openai_server.rb', line 396

def emit_token(piece)
  @completion_count = @completion_count + 1
  frame = "{" +
    Tep::Json.encode_pair_str("id", "chatcmpl-tep") + "," +
    Tep::Json.encode_pair_str("object", "chat.completion.chunk") + "," +
    Tep::Json.encode_pair_int("created", Time.now.to_i) + "," +
    Tep::Json.encode_pair_str("model", @model) + "," +
    "\"choices\":[{" +
      Tep::Json.encode_pair_int("index", 0) + "," +
      "\"delta\":{" +
        Tep::Json.encode_pair_str("content", piece) +
      "}," +
      "\"finish_reason\":null" +
    "}]" +
  "}"
  @out.write("data: " + frame + "\n\n")
  0
end
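To illustrate how the three frame kinds fit together on the receiving side, here is a sketch of a client reassembling the assistant message from a stream body. The helper names (`sketch_chunk`, `sketch_collect_stream`) and the sample body are hypothetical; the frames are hand-built with stdlib JSON to match the shapes shown above, not captured from a real server.

```ruby
require "json"

# Hand-built frame in the same shape the emit_* methods produce.
def sketch_chunk(delta, finish_reason)
  "data: " + JSON.generate(
    "id" => "chatcmpl-tep",
    "object" => "chat.completion.chunk",
    "created" => 0,
    "model" => "demo-model",
    "choices" => [{ "index" => 0, "delta" => delta, "finish_reason" => finish_reason }]
  ) + "\n\n"
end

# Reassembles the message from a stream body.
# Returns [role, content, finish_reason].
def sketch_collect_stream(body)
  role = nil
  content = +""
  finish = nil
  body.split("\n\n").each do |event|
    next unless event.start_with?("data: ")
    payload = event.delete_prefix("data: ")
    break if payload == "[DONE]"
    choice = JSON.parse(payload)["choices"][0]
    delta = choice["delta"]
    role = delta["role"] if delta.key?("role")
    content << delta["content"] if delta.key?("content")
    finish = choice["finish_reason"] if choice["finish_reason"]
  end
  [role, content, finish]
end

body = sketch_chunk({ "role" => "assistant" }, nil) +
       sketch_chunk({ "content" => "Hel" }, nil) +
       sketch_chunk({ "content" => "lo" }, nil) +
       sketch_chunk({}, "stop") +
       "data: [DONE]\n\n"

p sketch_collect_stream(body)
# prints ["assistant", "Hello", "stop"]
```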