Class: Tep::Llm::OpenAI::Server

Inherits:
Object
Defined in:
lib/tep/openai_server.rb

Overview

The mountable server. Its API consists of class methods because an app wires one backend per process at boot (use) and then mounts the standard routes (serve!).

Class Method Details

.serve!(events_jsonl = "") ⇒ Object

Mounts the standard OpenAI routes and, optionally, starts the toy/v1 events stream. events_jsonl is the path to a JSONL file; the run_start event at boot and each per-request inference event are appended to it. An empty path (the default) disables emission with zero overhead. Backwards-compatible with the 7.1a/b no-arg form.



# File 'lib/tep/openai_server.rb', line 141

def self.serve!(events_jsonl = "")
  events = Tep::Events.new(events_jsonl)
  Tep::APP.set_openai_events(events)
  host = ENV["HOSTNAME"]
  if host.length == 0
    host = "tep"
  end
  # backend.device_kind => the run_start's `backend.kind`; reads
  # the backend via APP.openai_backend so a `use`d subclass's
  # override answers (e.g. ToyBackend returning "cuda").
  backend_kind = Tep::APP.openai_backend.device_kind
  config_json = "{" +
    SpinelKit::Json.encode_pair_str("server", "tep-llm-openai") + "," +
    SpinelKit::Json.encode_pair_str("events_jsonl", events_jsonl) +
  "}"
  events.run_start(host, backend_kind, "", "", config_json)
  Tep.get("/v1/models",            Tep::Llm::OpenAI::ModelsHandler.new)
  Tep.post("/v1/completions",      Tep::Llm::OpenAI::CompletionsHandler.new)
  Tep.post("/v1/chat/completions", Tep::Llm::OpenAI::ChatCompletionsHandler.new)
  # Always mounted; the handler 501s when supports_embeddings?
  # is false (same gate shape as chat completions).
  Tep.post("/v1/embeddings",       Tep::Llm::OpenAI::EmbeddingsHandler.new)
  0
end
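
The hostname fallback and the hand-assembled config object in serve! can be sketched in plain Ruby as follows. This is an illustration, not the library's code: resolve_host and run_start_config are hypothetical helper names, the stdlib JSON encoder stands in for SpinelKit::Json, and the nil handling assumes standard Ruby semantics, where ENV["HOSTNAME"] is nil when the variable is unset.

```ruby
require "json"

# Hypothetical helper: the host label passed to run_start.
# Falls back to "tep" when HOSTNAME is unset (nil) or empty.
def resolve_host(env = ENV)
  host = env["HOSTNAME"].to_s
  host.empty? ? "tep" : host
end

# Hypothetical helper: builds the same two-key config object that
# serve! concatenates by hand, using the stdlib JSON encoder.
def run_start_config(events_jsonl)
  JSON.generate("server" => "tep-llm-openai", "events_jsonl" => events_jsonl)
end

puts resolve_host("HOSTNAME" => "")
puts run_start_config("")
```

If the runtime returns an empty string for a missing variable, the .to_s call is a harmless no-op and the behavior matches the listing above.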

.use(backend) ⇒ Object

Registers the app's backend. Pass an instance of a concrete Backend subclass; it is stored on Tep::APP and consulted on each request.



# File 'lib/tep/openai_server.rb', line 131

def self.use(backend)
  Tep::APP.set_openai_backend(backend)
  0
end
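
The register-then-look-up pattern behind use can be modelled without the library. The sketch below is a stand-in under stated assumptions: SketchApp, SketchBackend, SketchToyBackend and embeddings_status are hypothetical names, the "cpu" default and the 200 success code are invented for illustration, and only device_kind, supports_embeddings? and the 501 response come from the listings on this page.

```ruby
# Stand-in for the backend slot on Tep::APP (hypothetical, not the real class).
class SketchApp
  attr_reader :openai_backend

  def set_openai_backend(backend)
    @openai_backend = backend
    0
  end
end

# Hypothetical base backend; "cpu" and the false default are assumptions.
class SketchBackend
  def device_kind
    "cpu"
  end

  def supports_embeddings?
    false
  end
end

# Subclass override, mirroring the ToyBackend example in the serve! comments.
class SketchToyBackend < SketchBackend
  def device_kind
    "cuda"
  end
end

SKETCH_APP = SketchApp.new

# Models the embeddings gate: 501 when the registered backend opts out.
# The 200 for the supported case is an assumption.
def embeddings_status(app)
  app.openai_backend.supports_embeddings? ? 200 : 501
end

# One backend is registered at boot; later lookups see the subclass override.
SKETCH_APP.set_openai_backend(SketchToyBackend.new)
puts SKETCH_APP.openai_backend.device_kind
puts embeddings_status(SKETCH_APP)
```

In a real app the equivalent boot sequence would be Tep::Llm::OpenAI::Server.use(backend) followed by Tep::Llm::OpenAI::Server.serve!, with backend being an instance of the app's own Backend subclass.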