Class: Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1MachineSpec

Inherits:: Object

Includes:: Core::Hashable, Core::JsonObjectSupport

Defined in:: lib/google/apis/aiplatform_v1beta1/classes.rb,
lib/google/apis/aiplatform_v1beta1/representations.rb

Overview

Specification of a single machine.

Instance Attribute Summary

#accelerator_count ⇒ Fixnum
The number of accelerators to attach to the machine.
#accelerator_type ⇒ String
Immutable.
#gpu_partition_size ⇒ String
Optional.
#machine_type ⇒ String
Immutable.
#min_gpu_driver_version ⇒ String
Optional.
#multihost_gpu_node_count ⇒ Fixnum
Optional.
#reservation_affinity ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReservationAffinity
A ReservationAffinity can be used to configure a Vertex AI resource (e.g., a DeployedModel) to draw its Compute Engine resources from a Shared Reservation, or exclusively from on-demand capacity.
#tpu_topology ⇒ String
Immutable.

Instance Method Summary

#initialize(**args) ⇒ GoogleCloudAiplatformV1beta1MachineSpec constructor
A new instance of GoogleCloudAiplatformV1beta1MachineSpec.
#update!(**args) ⇒ Object
Update properties of this object.

Constructor Details

#initialize(**args) ⇒ `GoogleCloudAiplatformV1beta1MachineSpec`

Returns a new instance of GoogleCloudAiplatformV1beta1MachineSpec.




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25262

def initialize(**args)
  update!(**args)
end

Instance Attribute Details

#accelerator_count ⇒ `Fixnum`

The number of accelerators to attach to the machine. For accelerator-optimized machine types (https://cloud.google.com/compute/docs/accelerator-optimized-machines), you may set accelerator_count from 1 to N for a machine with N GPUs. If accelerator_count is less than or equal to N / 2, Vertex will co-schedule the replicas of the model into the same VM to save cost. For example, if the machine type is a3-highgpu-8g, which has 8 H100 GPUs, you can set accelerator_count to any value from 1 to 8. If accelerator_count is 1, 2, 3, or 4, Vertex will co-schedule 8, 4, 2, or 2 replicas of the model, respectively, into the same VM. When co-scheduling, the CPU, memory, and storage on the VM are distributed among the replicas on that VM. For example, a co-scheduled replica requesting 2 GPUs out of an 8-GPU VM can expect to receive 25% of the VM's CPU, memory, and storage. Note that this feature is not compatible with multihost_gpu_node_count: when multihost_gpu_node_count is set, co-scheduling is not enabled. Corresponds to the JSON property acceleratorCount

Returns:

(Fixnum)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25204

def accelerator_count
  @accelerator_count
end
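
The co-scheduling arithmetic described for accelerator_count can be sketched in plain Ruby. This is only an illustration: the helper names co_scheduled_replicas and replica_resource_share are hypothetical and not part of the client library, and the floor(N / accelerator_count) rule is an assumption inferred from the a3-highgpu-8g example (1, 2, 3, 4 GPUs giving 8, 4, 2, 2 replicas). The resource share is likewise assumed to be proportional to the GPU share, which the description confirms only for the 2-of-8 case.

```ruby
# Hypothetical helper, not part of google-apis-aiplatform_v1beta1.
# ASSUMPTION: replicas per VM = floor(total_gpus / accelerator_count),
# inferred from the 8-GPU example in the attribute description.
def co_scheduled_replicas(total_gpus, accelerator_count)
  unless accelerator_count.between?(1, total_gpus)
    raise ArgumentError, "accelerator_count must be between 1 and #{total_gpus}"
  end

  # Co-scheduling applies only when accelerator_count <= N / 2.
  return 1 if accelerator_count * 2 > total_gpus

  total_gpus / accelerator_count # integer division floors the result
end

# ASSUMPTION: a replica's share of CPU, memory and storage equals its
# share of the VM's GPUs (documented only for 2 GPUs out of 8 => 25%).
def replica_resource_share(total_gpus, accelerator_count)
  Rational(accelerator_count, total_gpus)
end

(1..8).each do |count|
  replicas = co_scheduled_replicas(8, count)
  puts "accelerator_count=#{count}: #{replicas} replica(s) per VM"
end
```

Because the rule is inferred rather than documented, treat the output as a planning aid and confirm actual placement against the Vertex AI documentation.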

#accelerator_type ⇒ `String`

Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count. Corresponds to the JSON property acceleratorType

Returns:

(String)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25210

def accelerator_type
  @accelerator_type
end

#gpu_partition_size ⇒ `String`

Optional. Immutable. The NVIDIA GPU partition size. When specified, the requested accelerators will be partitioned into smaller GPU partitions. For example, if the request is for 8 units of NVIDIA A100 GPUs, and gpu_partition_size="1g.10gb", the service will create 8 * 7 = 56 partitioned MIG instances. The partition size must be a value supported by the requested accelerator. Refer to NVIDIA GPU Partitioning for the available partition sizes. If set, the accelerator_count should be set to 1. Corresponds to the JSON property gpuPartitionSize

Returns:

(String)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25223

def gpu_partition_size
  @gpu_partition_size
end
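
The 8 * 7 = 56 figure in the description is a simple product of the GPU count and the number of partitions each GPU yields. A minimal sketch follows; mig_instance_count is a hypothetical helper, and the only per-GPU figure taken from the description is 7 partitions for "1g.10gb" on an A100. Other partition sizes must be looked up in the NVIDIA partitioning documentation.

```ruby
# Hypothetical helper, not part of the client library.
# partitions_per_gpu must come from NVIDIA's documentation; the value 7
# for "1g.10gb" on an A100 is the one quoted in the description.
def mig_instance_count(gpu_count, partitions_per_gpu)
  [gpu_count, partitions_per_gpu].each do |value|
    unless value.is_a?(Integer) && value.positive?
      raise ArgumentError, "expected a positive Integer, got #{value.inspect}"
    end
  end

  gpu_count * partitions_per_gpu
end

puts mig_instance_count(8, 7)
```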

#machine_type ⇒ `String`

Immutable. The type of the machine. See the list of machine types supported for prediction and the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob, or as part of WorkerPoolSpec, this field is required. Corresponds to the JSON property machineType

Returns:

(String)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25234

def machine_type
  @machine_type
end

#min_gpu_driver_version ⇒ `String`

Optional. Immutable. The minimum GPU driver version that this machine requires. For example, "535.104.06". If not specified, the default GPU driver version will be used by the underlying infrastructure. Corresponds to the JSON property minGpuDriverVersion

Returns:

(String)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25241

def min_gpu_driver_version
  @min_gpu_driver_version
end

#multihost_gpu_node_count ⇒ `Fixnum`

Optional. Immutable. The number of nodes per replica for multihost GPU deployments. Corresponds to the JSON property multihostGpuNodeCount

Returns:

(Fixnum)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25247

def multihost_gpu_node_count
  @multihost_gpu_node_count
end

#reservation_affinity ⇒ `Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReservationAffinity`

A ReservationAffinity can be used to configure a Vertex AI resource (e.g., a DeployedModel) to draw its Compute Engine resources from a Shared Reservation, or exclusively from on-demand capacity. Corresponds to the JSON property reservationAffinity

Returns:

(Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReservationAffinity)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25254

def reservation_affinity
  @reservation_affinity
end

#tpu_topology ⇒ `String`

Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1"). Corresponds to the JSON property tpuTopology

Returns:

(String)




# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25260

def tpu_topology
  @tpu_topology
end
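
A topology string such as "2x2x1" can be checked and split into its dimensions before it is assigned to tpu_topology. The sketch below is an illustration only: parse_tpu_topology is a hypothetical helper, the accepted format (positive integers joined by "x") is an assumption based on the single example in the description, and reading the product of the dimensions as the chip count follows the usual GKE convention rather than anything stated here.

```ruby
# Hypothetical helper, not part of the client library.
# ASSUMPTION: a topology is two or more positive integers joined by "x".
def parse_tpu_topology(topology)
  unless topology.is_a?(String) && topology.match?(/\A[1-9]\d*(x[1-9]\d*)+\z/)
    raise ArgumentError, "invalid TPU topology: #{topology.inspect}"
  end

  topology.split("x").map(&:to_i)
end

dims = parse_tpu_topology("2x2x1")
puts "dimensions: #{dims.inspect}"
# ASSUMPTION: the product of the dimensions is the number of TPU chips.
puts "chips: #{dims.reduce(:*)}"
```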

Instance Method Details

#update!(**args) ⇒ `Object`

Update properties of this object.

# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 25267

def update!(**args)
  @accelerator_count = args[:accelerator_count] if args.key?(:accelerator_count)
  @accelerator_type = args[:accelerator_type] if args.key?(:accelerator_type)
  @gpu_partition_size = args[:gpu_partition_size] if args.key?(:gpu_partition_size)
  @machine_type = args[:machine_type] if args.key?(:machine_type)
  @min_gpu_driver_version = args[:min_gpu_driver_version] if args.key?(:min_gpu_driver_version)
  @multihost_gpu_node_count = args[:multihost_gpu_node_count] if args.key?(:multihost_gpu_node_count)
  @reservation_affinity = args[:reservation_affinity] if args.key?(:reservation_affinity)
  @tpu_topology = args[:tpu_topology] if args.key?(:tpu_topology)
end
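
The `args.key?` guards above mean that update! only touches attributes whose keys are actually passed: omitted keys keep their current value, while a key passed with an explicit nil clears the attribute. The sketch below demonstrates this with MachineSpecSketch, a hypothetical stand-in that copies the same pattern for two attributes so that it runs without the gem installed. It is not the real class.

```ruby
# Hypothetical stand-in mirroring the update! pattern shown above for two
# attributes; the real class lives in google-apis-aiplatform_v1beta1.
class MachineSpecSketch
  attr_reader :machine_type, :accelerator_count

  def initialize(**args)
    update!(**args)
  end

  # Assign only the attributes whose keys are present in args.
  def update!(**args)
    @machine_type = args[:machine_type] if args.key?(:machine_type)
    @accelerator_count = args[:accelerator_count] if args.key?(:accelerator_count)
  end
end

spec = MachineSpecSketch.new(machine_type: "a3-highgpu-8g", accelerator_count: 2)

spec.update!(accelerator_count: 4) # machine_type is left untouched
puts spec.machine_type.inspect
puts spec.accelerator_count.inspect

spec.update!(machine_type: nil)    # an explicit nil clears the attribute
puts spec.machine_type.inspect
```

Note that several attributes are documented as Immutable. That constraint is enforced by the service, not by update!, which assigns whatever it is given.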
