Module: OpenAI::Models::Chat::ChatCompletion::ServiceTier

Extended by:
Internal::Type::Enum
Defined in:
lib/openai/models/chat/chat_completion.rb

Overview

Specifies the latency tier to use for processing the request. This parameter is relevant for customers subscribed to the scale tier service:

  • If set to `auto`, and the Project is Scale tier enabled, the system will utilize Scale tier credits until they are exhausted.

  • If set to `auto`, and the Project is not Scale tier enabled, the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.

  • If set to `default`, the request will be processed using the default service tier with a lower uptime SLA and no latency guarantee.

  • If set to `flex`, the request will be processed with the Flex Processing service tier. [Learn more](https://platform.openai.com/docs/guides/flex-processing).

  • When not set, the default behavior is `auto`.

When this parameter is set, the response body will include the `service_tier` utilized.
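The selection rules above can be sketched in plain Ruby. This is a hypothetical illustration of the documented behavior, not code from the gem: `effective_tier` and its `:scale` return value are invented names for this sketch, and the real routing happens server-side.

```ruby
# Hypothetical sketch of the service-tier rules described above.
# Returns which processing applies for a requested tier, given whether
# the Project is Scale tier enabled.
def effective_tier(requested, scale_enabled:)
  requested ||= :auto # when not set, the default behavior is `auto`
  case requested
  when :auto
    # Scale tier credits if enabled, otherwise the default service tier
    scale_enabled ? :scale : :default
  when :default
    :default
  when :flex
    :flex
  end
end

effective_tier(nil, scale_enabled: false)   # default tier, lower uptime SLA
effective_tier(:auto, scale_enabled: true)  # consumes Scale tier credits
effective_tier(:flex, scale_enabled: false) # Flex Processing tier
```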

Constant Summary

AUTO    = :auto
DEFAULT = :default
FLEX    = :flex

Method Summary

Methods included from Internal::Type::Enum

==, ===, coerce, dump, hash, inspect, to_sorbet_type, values

Methods included from Internal::Util::SorbetRuntimeSupport

#const_missing, #define_sorbet_constant!, #sorbet_constant_defined?, #to_sorbet_type, to_sorbet_type

Methods included from Internal::Type::Converter

#coerce, coerce, #dump, dump, #inspect, inspect, type_info