Class: ActiveHarness::Providers::GPUStack
- Defined in:
- lib/active_harness/providers/gpustack.rb
Overview
GPUStack is a self-hosted GPU inference server that exposes an OpenAI-compatible API. See docs.gpustack.ai/latest/user-guide/inference-openai-compatible-apis/
GPUSTACK_API_BASE is required (e.g. “my-gpustack-server:80”). GPUSTACK_API_KEY is optional (needed only if the server has auth enabled).
Example:
model do
use provider: :gpustack, model: "Qwen/Qwen2.5-7B-Instruct-GGUF"
end
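The two environment variables described in the overview could be resolved roughly as follows. This is a minimal sketch, not the provider's actual code: the helper name `gpustack_settings`, the trailing-slash trimming, and the choice of `ArgumentError` are all assumptions made for illustration.

```ruby
# Hypothetical helper, not part of ActiveHarness: resolves GPUStack settings
# from an environment-like object (ENV by default, or a plain Hash in tests).
def gpustack_settings(env = ENV)
  base = env["GPUSTACK_API_BASE"].to_s.strip
  # The base address is mandatory, so fail early when it is missing or blank.
  raise ArgumentError, "GPUSTACK_API_BASE is required" if base.empty?

  # The key is optional: a missing or blank value is treated as "no auth".
  key = env["GPUSTACK_API_KEY"].to_s.strip

  { api_base: base.chomp("/"), api_key: key.empty? ? nil : key }
end

settings = gpustack_settings(
  "GPUSTACK_API_BASE" => "http://my-gpustack-server:80/",
  "GPUSTACK_API_KEY"  => ""
)
puts settings.inspect
```

The sample base address includes an `http://` scheme as an assumption; whether the provider expects or adds the scheme itself is not stated here.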
Constant Summary
Constants inherited from Base
Base::HTTP, Base::STREAMING_HTTP
Instance Method Summary
Instance Method Details
#call(model:, messages:, temperature: 0.7) ⇒ Object
# File 'lib/active_harness/providers/gpustack.rb', line 16

def call(model:, messages:, temperature: 0.7)
  url = URI("#{api_base}/v1/chat/completions")
  headers = { "Content-Type" => "application/json" }
  key = api_key
  headers["Authorization"] = "Bearer #{key}" if key

  raw = post_json(url,
    headers: headers,
    body: { model: model, messages: messages, temperature: temperature }
  )

  data = parse!(raw)
  handle_error!(data)

  {
    content: data.dig("choices", 0, "message", "content").to_s.strip,
    provider: :gpustack,
    model: data["model"] || model,
    usage: extract_usage_openai(data)
  }
end
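To make the request and response shapes concrete, the sketch below mirrors the steps of `#call` without touching the network. `build_chat_request` and `extract_content` are hypothetical names introduced for illustration, the sample base address (including its `http://` scheme) is an assumption, and the sample response hash only follows the OpenAI-style layout implied by the `dig` call in the method.

```ruby
require "json"
require "uri"

# Hypothetical helper: assembles the URL, headers, and JSON body that a
# chat-completions request would carry. It performs no network I/O.
def build_chat_request(api_base:, model:, messages:, api_key: nil, temperature: 0.7)
  headers = { "Content-Type" => "application/json" }
  # The Authorization header is only added when a key is present.
  headers["Authorization"] = "Bearer #{api_key}" if api_key

  body = { model: model, messages: messages, temperature: temperature }

  {
    url: URI("#{api_base}/v1/chat/completions"),
    headers: headers,
    body: JSON.generate(body)
  }
end

# Hypothetical helper: pulls the assistant text out of a parsed response.
# A missing path yields nil from dig, which to_s turns into an empty string.
def extract_content(data)
  data.dig("choices", 0, "message", "content").to_s.strip
end

request = build_chat_request(
  api_base: "http://my-gpustack-server:80",
  model: "Qwen/Qwen2.5-7B-Instruct-GGUF",
  messages: [{ role: "user", content: "Hello" }]
)

# Hand-written sample response, not real server output.
sample = { "choices" => [{ "message" => { "content" => "  Hi there.\n" } }] }

puts request[:url].path
puts request[:headers].inspect
puts extract_content(sample)
```

Keeping request construction and response extraction as pure functions is one way to unit-test this logic without a running GPUStack server; the provider itself delegates the HTTP call to `post_json`.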