Class: Tep::Llm::OpenAI::EmbeddingsHandler

Inherits:
Handler
  • Object
Defined in:
lib/tep/openai_server.rb

Overview

POST /v1/embeddings – OpenAI embeddings shape. Returns 501 when backend.supports_embeddings? is false (the default). When a backend opts in, the handler parses the IDs-only `input` array, asks the backend for the pooled vector, and formats the standard embeddings envelope. An `input` that parses to an empty array is rejected with 400 (invalid_request_error). Mirrors toy's mean-pooled handler; the pooling strategy lives in the backend, not here.
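To make the request and response shapes concrete, here is a minimal client-side sketch using only Ruby's standard `json` library. The model name, token IDs, and vector values are made-up placeholders; the field names follow what `#handle` reads and emits. It builds and parses JSON locally and does not contact a server.

```ruby
require "json"

# Request body: "input" is an array of token IDs, not text.
# The model name and the IDs are illustrative placeholders.
request_body = JSON.generate({ "model" => "example-model", "input" => [101, 2023, 102] })

# Shape of a successful response as the handler formats it.
# The vector values are illustrative, not real model output.
sample_response = <<~BODY
  {"object":"list",
   "data":[{"object":"embedding","index":0,"embedding":[0.25,-0.5,1.0]}],
   "model":"example-model",
   "usage":{"prompt_tokens":3,"total_tokens":3}}
BODY

parsed = JSON.parse(sample_response)
vector = parsed["data"][0]["embedding"]       # the single pooled vector
token_count = parsed["usage"]["prompt_tokens"] # equals the number of input IDs
```

Because the server speaks token IDs only, a client would tokenize text itself before building `request_body`.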

Instance Method Summary

Methods inherited from Handler

#is_regex?, #re_capture, #re_match?

Instance Method Details

#handle(req, res) ⇒ Object



# File 'lib/tep/openai_server.rb', line 665

def handle(req, res)
  res.headers["Content-Type"] = "application/json"
  if !Tep::APP.openai_backend.supports_embeddings?
    res.set_status(501)
    return "{" +
      "\"error\":{" +
        Tep::Json.encode_pair_str("message",
          "embeddings not supported by this backend") + "," +
        Tep::Json.encode_pair_str("type", "not_implemented") +
      "}" +
    "}"
  end
  body  = req.raw_body
  model = Tep::Json.get_str(body, "model")
  ids   = Tep::Json.get_int_array(body, "input")
  if ids.length == 0
    res.set_status(400)
    return "{" +
      "\"error\":{" +
        Tep::Json.encode_pair_str("message",
          "input must be a non-empty integer array " +
          "(this server speaks token IDs only; tokenize client-side)") + "," +
        Tep::Json.encode_pair_str("type", "invalid_request_error") +
      "}" +
    "}"
  end

  vec = Tep::APP.openai_backend.generate_embeddings(model, ids)

  # Build the embedding float array by hand: Tep::Json has no
  # float-array encoder, and Float#to_s yields a JSON number
  # for finite values (NaN or Infinity would not be valid JSON).
  emb = "["
  k = 0
  while k < vec.length
    if k > 0
      emb = emb + ","
    end
    emb = emb + vec[k].to_s
    k = k + 1
  end
  emb = emb + "]"

  n = ids.length
  "{" +
    Tep::Json.encode_pair_str("object", "list") + "," +
    "\"data\":[{" +
      Tep::Json.encode_pair_str("object", "embedding") + "," +
      Tep::Json.encode_pair_int("index", 0) + "," +
      "\"embedding\":" + emb +
    "}]," +
    Tep::Json.encode_pair_str("model", model) + "," +
    "\"usage\":{" +
      Tep::Json.encode_pair_int("prompt_tokens", n) + "," +
      Tep::Json.encode_pair_int("total_tokens", n) +
    "}" +
  "}"
end
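
The handler above relies on two backend methods, `supports_embeddings?` and `generate_embeddings(model, ids)`. As a rough sketch of what an opting-in backend might look like, the following plain-Ruby class mean-pools a small lookup table. The class name `ToyEmbeddingBackend`, the table, and its values are hypothetical, and the code is ordinary Ruby that may not match any constraints Tep places on backend code.

```ruby
# Hypothetical backend: the class name, the embedding table, and its values
# are illustrative assumptions. Only the two method names come from the handler.
class ToyEmbeddingBackend
  def initialize(table)
    @table = table # token ID => Array of Floats, all the same length
  end

  # The handler answers 501 unless this returns true.
  def supports_embeddings?
    true
  end

  # Mean pooling: average the per-token vectors component by component.
  # Assumes ids is non-empty, which the handler checks before calling.
  def generate_embeddings(model, ids)
    dim = @table.values.first.length
    sum = Array.new(dim, 0.0)
    ids.each do |id|
      row = @table.fetch(id) # raises KeyError on an unknown token ID
      dim.times { |j| sum[j] += row[j] }
    end
    sum.map { |x| x / ids.length }
  end
end

backend = ToyEmbeddingBackend.new({ 1 => [1.0, 0.0], 2 => [3.0, 2.0] })
pooled = backend.generate_embeddings("example-model", [1, 2])
```

Keeping the pooling inside the backend means the handler only needs a flat array of floats back, so a different strategy (for example, last-token pooling) would not require changes to `#handle`.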