Class: Google::Apis::AiplatformV1::GoogleCloudAiplatformV1MachineSpec

Inherits:
Object
Includes:
Core::Hashable, Core::JsonObjectSupport
Defined in:
lib/google/apis/aiplatform_v1/classes.rb,
lib/google/apis/aiplatform_v1/representations.rb

Overview

Specification of a single machine.

Constructor Details

#initialize(**args) ⇒ GoogleCloudAiplatformV1MachineSpec

Returns a new instance of GoogleCloudAiplatformV1MachineSpec.



# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22439

def initialize(**args)
  update!(**args)
end

Instance Attribute Details

#accelerator_count ⇒ Fixnum

The number of accelerators to attach to the machine. For accelerator-optimized machine types, one may set accelerator_count from 1 to N for a machine with N GPUs. If accelerator_count is less than or equal to N / 2, Agent Platform co-schedules the replicas of the model into the same VM to save cost. For example, if the machine type is a3-highgpu-8g, which has 8 H100 GPUs, one can set accelerator_count from 1 to 8. If accelerator_count is 1, 2, 3, or 4, Agent Platform co-schedules 8, 4, 2, or 2 replicas of the model, respectively, into the same VM. When co-scheduling, the CPU, memory, and storage on the VM are distributed among the replicas on the VM. For example, a co-scheduled replica requesting 2 GPUs out of an 8-GPU VM can expect to receive 25% of the CPU, memory, and storage of the VM. Note that this feature is not compatible with multihost_gpu_node_count: when multihost_gpu_node_count is set, co-scheduling is not enabled. Corresponds to the JSON property acceleratorCount

Returns:

  • (Fixnum)


# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22393

def accelerator_count
  @accelerator_count
end
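
The co-scheduling arithmetic described for accelerator_count can be sketched as below. Both helpers are hypothetical and not part of the gem. The replica count uses integer division of N by accelerator_count, which is inferred from the a3-highgpu-8g figures quoted above rather than from a documented formula, and the resource share generalizes the single 25% example as accelerator_count / N.

```ruby
# Hypothetical helpers, not part of google-apis-aiplatform_v1.

# Number of replicas placed on one VM with gpus_per_vm GPUs.
# Assumption: floor(N / accelerator_count), inferred from the documented
# 8-GPU example (1, 2, 3, 4 accelerators -> 8, 4, 2, 2 replicas).
def co_scheduled_replicas(accelerator_count, gpus_per_vm)
  unless accelerator_count.between?(1, gpus_per_vm)
    raise ArgumentError, "accelerator_count must be between 1 and #{gpus_per_vm}"
  end

  # Co-scheduling only applies when accelerator_count <= N / 2.
  return 1 if accelerator_count * 2 > gpus_per_vm

  gpus_per_vm / accelerator_count
end

# Share of the VM's CPU, memory, and storage one replica can expect.
# Assumption: proportional to the GPUs requested, as in the 2-of-8 example.
def replica_resource_share(accelerator_count, gpus_per_vm)
  Rational(accelerator_count, gpus_per_vm)
end

replicas = [1, 2, 3, 4].map { |count| co_scheduled_replicas(count, 8) }
share = replica_resource_share(2, 8) # one quarter of the VM, i.e. 25%
```

With more than N / 2 accelerators requested (for example 5 of 8), the sketch returns a single replica, since the text describes co-scheduling only for counts up to N / 2.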

#accelerator_type ⇒ String

Immutable. The type of accelerator(s) that may be attached to the machine as per accelerator_count. Corresponds to the JSON property acceleratorType

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22399

def accelerator_type
  @accelerator_type
end

#gpu_partition_size ⇒ String

Optional. Immutable. The Nvidia GPU partition size. When specified, the requested accelerators will be partitioned into smaller GPU partitions. For example, if the request is for 8 units of NVIDIA A100 GPUs, and gpu_partition_size="1g.10gb", the service will create 8 * 7 = 56 partitioned MIG instances. The partition size must be a value supported by the requested accelerator. Refer to Nvidia GPU Partitioning for the available partition sizes. If set, the accelerator_count should be set to 1. Corresponds to the JSON property gpuPartitionSize

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22412

def gpu_partition_size
  @gpu_partition_size
end
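
The partition arithmetic in the gpu_partition_size description can be reproduced with a short sketch. The helper is hypothetical, and the figure of 7 partitions per GPU for "1g.10gb" is taken from the example above; other partition sizes would need their own counts from the Nvidia GPU Partitioning reference.

```ruby
# Hypothetical helper, not part of google-apis-aiplatform_v1.
# Total MIG instances created when every requested GPU is split into
# partitions_per_gpu partitions of the chosen size.
def total_mig_instances(gpu_count, partitions_per_gpu)
  unless gpu_count.positive? && partitions_per_gpu.positive?
    raise ArgumentError, "both arguments must be positive integers"
  end

  gpu_count * partitions_per_gpu
end

# 8 NVIDIA A100 GPUs with gpu_partition_size "1g.10gb"
# (7 partitions per GPU, per the description above).
total = total_mig_instances(8, 7)
```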

#machine_type ⇒ String

Immutable. The type of the machine. See the list of machine types supported for prediction and the list of machine types supported for custom training. For DeployedModel this field is optional, and the default value is n1-standard-2. For BatchPredictionJob or as part of WorkerPoolSpec this field is required. Corresponds to the JSON property machineType

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22424

def machine_type
  @machine_type
end
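
The optional-versus-required rule for machine_type can be expressed as a small client-side check. This is only a sketch: the helper and the resource-kind symbols are hypothetical, and the default for DeployedModel is presumably applied by the service itself, so the helper merely mirrors the rule stated above.

```ruby
# Hypothetical helper, not part of google-apis-aiplatform_v1.
# Returns the machine type a request would end up with, following the rule
# in the machine_type description. resource_kind is a symbol chosen for
# this sketch: :deployed_model, :batch_prediction_job, or :worker_pool_spec.
def effective_machine_type(resource_kind, machine_type)
  return machine_type unless machine_type.nil? || machine_type.empty?

  case resource_kind
  when :deployed_model
    "n1-standard-2" # documented default for DeployedModel
  when :batch_prediction_job, :worker_pool_spec
    raise ArgumentError, "machine_type is required for #{resource_kind}"
  else
    raise ArgumentError, "unknown resource kind: #{resource_kind}"
  end
end

default_type = effective_machine_type(:deployed_model, nil)
explicit_type = effective_machine_type(:worker_pool_spec, "a3-highgpu-8g")
```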

#reservation_affinity ⇒ Google::Apis::AiplatformV1::GoogleCloudAiplatformV1ReservationAffinity

A ReservationAffinity can be used to configure a Vertex AI resource (e.g., a DeployedModel) to draw its Compute Engine resources from a Shared Reservation, or exclusively from on-demand capacity. Corresponds to the JSON property reservationAffinity

Returns:

  • (Google::Apis::AiplatformV1::GoogleCloudAiplatformV1ReservationAffinity)



# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22431

def reservation_affinity
  @reservation_affinity
end

#tpu_topology ⇒ String

Immutable. The topology of the TPUs. Corresponds to the TPU topologies available from GKE. (Example: tpu_topology: "2x2x1"). Corresponds to the JSON property tpuTopology

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22437

def tpu_topology
  @tpu_topology
end
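
A topology string such as "2x2x1" can be split into its dimensions before it is assigned to tpu_topology. The helper below is a hypothetical sketch, not part of the gem. Reading the product of the dimensions as the number of TPU chips is an assumption based on how GKE topologies are usually described, and the check only validates the string's shape, not whether the service supports that topology.

```ruby
# Hypothetical helper, not part of google-apis-aiplatform_v1.
# Parses "AxB" or "AxBxC" style topology strings into integer dimensions.
def tpu_topology_dimensions(topology)
  unless topology.match?(/\A\d+(x\d+)+\z/)
    raise ArgumentError, "unexpected topology format: #{topology.inspect}"
  end

  topology.split("x").map { |dimension| Integer(dimension, 10) }
end

dims = tpu_topology_dimensions("2x2x1")
# Assumption: the product of the dimensions is the chip count.
chip_count = dims.reduce(:*)
```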

Instance Method Details

#update!(**args) ⇒ Object

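Updates the properties of this object. Only the keys present in args are assigned; attributes for omitted keys keep their current values.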



# File 'lib/google/apis/aiplatform_v1/classes.rb', line 22444

def update!(**args)
  @accelerator_count = args[:accelerator_count] if args.key?(:accelerator_count)
  @accelerator_type = args[:accelerator_type] if args.key?(:accelerator_type)
  @gpu_partition_size = args[:gpu_partition_size] if args.key?(:gpu_partition_size)
  @machine_type = args[:machine_type] if args.key?(:machine_type)
  @reservation_affinity = args[:reservation_affinity] if args.key?(:reservation_affinity)
  @tpu_topology = args[:tpu_topology] if args.key?(:tpu_topology)
end
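
Because update! guards every assignment with args.key?, a key that is present with a nil value clears the attribute, while an omitted key leaves it alone. The sketch below illustrates this with a stand-in class (MachineSpecSketch, a hypothetical name) that copies two of the assignments above so it runs without the gem installed. With the gem, the same calls would be made on GoogleCloudAiplatformV1MachineSpec.

```ruby
# Stand-in for illustration only; it mirrors the initialize/update! pattern
# shown above for two of the six attributes.
class MachineSpecSketch
  attr_accessor :accelerator_count, :machine_type

  def initialize(**args)
    update!(**args)
  end

  def update!(**args)
    @accelerator_count = args[:accelerator_count] if args.key?(:accelerator_count)
    @machine_type = args[:machine_type] if args.key?(:machine_type)
  end
end

spec = MachineSpecSketch.new(machine_type: "a3-highgpu-8g", accelerator_count: 2)

# Omitted keys are left untouched: machine_type keeps its value.
spec.update!(accelerator_count: 4)
machine_type_after_partial_update = spec.machine_type

# A key passed explicitly as nil overwrites the attribute with nil.
spec.update!(machine_type: nil)
```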