Class: Google::Cloud::GkeRecommender::V1::GenerateOptimizedManifestRequest

Inherits:
Object
Extended by:
Google::Protobuf::MessageExts::ClassMethods
Includes:
Google::Protobuf::MessageExts
Defined in:
proto_docs/google/cloud/gkerecommender/v1/gkerecommender.rb

Overview

Instance Attribute Summary

Instance Attribute Details

#accelerator_type ⇒ ::String

Returns Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.

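Returns:

  • (::String)

    Required. The accelerator type. Use GkeInferenceQuickstart.FetchProfiles to find valid accelerators for a given model_server_info.
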
# File 'proto_docs/google/cloud/gkerecommender/v1/gkerecommender.rb', line 497

class GenerateOptimizedManifestRequest
  include ::Google::Protobuf::MessageExts
  extend ::Google::Protobuf::MessageExts::ClassMethods
end

#kubernetes_namespace ⇒ ::String

Returns Optional. The Kubernetes namespace to deploy the manifests in.

Returns:

  • (::String)

    Optional. The Kubernetes namespace to deploy the manifests in.




#model_server_info ⇒ ::Google::Cloud::GkeRecommender::V1::ModelServerInfo

Returns Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.

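Returns:

  • (::Google::Cloud::GkeRecommender::V1::ModelServerInfo)

    Required. The model server configuration to generate the manifest for. Use GkeInferenceQuickstart.FetchProfiles to find valid configurations.
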

#performance_requirements ⇒ ::Google::Cloud::GkeRecommender::V1::PerformanceRequirements

Returns Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.

Returns:

  • (::Google::Cloud::GkeRecommender::V1::PerformanceRequirements)

    Optional. The performance requirements to use for generating Horizontal Pod Autoscaler (HPA) resources. If provided, the manifest includes HPA resources to adjust the model server replica count to maintain the specified targets (e.g., NTPOT, TTFT) at a P50 latency. Cost targets are not currently supported for HPA generation. If the specified targets are not achievable, the HPA manifest will not be generated.
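As a rough client-side illustration of the rule above, the sketch below checks whether a set of performance requirements carries any latency target, which is the case in which HPA resources may be generated. It uses a plain Struct as a stand-in for PerformanceRequirements; the field names (target_ntpot_milliseconds, target_ttft_milliseconds, target_cost) and the helper latency_targets? are assumptions for illustration, not confirmed by this page. The service still decides whether the targets are achievable, so a true result does not guarantee an HPA manifest.

```ruby
# Stand-in for PerformanceRequirements. The field names are assumptions
# made for this sketch; check the PerformanceRequirements reference.
PerfTargets = Struct.new(
  :target_ntpot_milliseconds,
  :target_ttft_milliseconds,
  :target_cost,
  keyword_init: true
)

# Hypothetical helper: true when at least one latency target (NTPOT or TTFT)
# is set. Cost targets are ignored because they are not supported for
# HPA generation.
def latency_targets?(perf)
  return false if perf.nil?

  [perf.target_ntpot_milliseconds, perf.target_ttft_milliseconds]
    .any? { |ms| ms.to_i.positive? }
end

with_latency = PerfTargets.new(target_ttft_milliseconds: 200)
cost_only    = PerfTargets.new(target_cost: 1.5)

puts latency_targets?(with_latency)
puts latency_targets?(cost_only)
puts latency_targets?(nil)
```

A check like this is only a convenience for catching requests that cannot produce HPA resources, such as ones that set a cost target alone.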




#storage_config ⇒ ::Google::Cloud::GkeRecommender::V1::StorageConfig

Returns Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.

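Returns:

  • (::Google::Cloud::GkeRecommender::V1::StorageConfig)

    Optional. The storage configuration for the model. If not provided, the model is loaded from Hugging Face.
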
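The attributes above fall into two required fields (model_server_info, accelerator_type) and three optional ones (kubernetes_namespace, performance_requirements, storage_config). The following is a minimal sketch of assembling and checking those fields before building a request. It uses a plain Hash as a stand-in for the protobuf message so that it runs without the gem; build_manifest_request_args is a hypothetical helper, and the keys inside model_server_info and all values are placeholders, not confirmed by this page.

```ruby
# Hypothetical helper, not part of the client library. The top-level keys
# mirror the attributes documented for GenerateOptimizedManifestRequest.
REQUIRED_FIELDS = %i[model_server_info accelerator_type].freeze
OPTIONAL_FIELDS = %i[kubernetes_namespace performance_requirements storage_config].freeze

def build_manifest_request_args(**fields)
  unknown = fields.keys - (REQUIRED_FIELDS + OPTIONAL_FIELDS)
  raise ArgumentError, "unknown fields: #{unknown.join(', ')}" unless unknown.empty?

  missing = REQUIRED_FIELDS.select { |name| fields[name].nil? || fields[name] == "" }
  raise ArgumentError, "missing required fields: #{missing.join(', ')}" unless missing.empty?

  # Drop optional fields that were left unset so they are simply omitted.
  fields.reject { |_name, value| value.nil? }
end

args = build_manifest_request_args(
  # Placeholder values; the inner keys are assumptions for illustration.
  model_server_info: { model: "example-model", model_server: "example-server" },
  accelerator_type: "example-accelerator",
  kubernetes_namespace: nil
)

puts args.keys.inspect
```

Valid combinations of model server and accelerator should come from GkeInferenceQuickstart.FetchProfiles, as the attribute descriptions note; the placeholder strings here are not real profile values.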