Class: Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReinforcementTuningSpec

Inherits:
Object
Includes:
Core::Hashable, Core::JsonObjectSupport
Defined in:
lib/google/apis/aiplatform_v1beta1/classes.rb,
lib/google/apis/aiplatform_v1beta1/representations.rb

Overview

Spec for Reinforcement Tuning.

Constructor Details

#initialize(**args) ⇒ GoogleCloudAiplatformV1beta1ReinforcementTuningSpec

Returns a new instance of GoogleCloudAiplatformV1beta1ReinforcementTuningSpec.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45835

def initialize(**args)
  update!(**args)
end
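
As a minimal sketch of how the constructor forwards its keyword arguments to update!, the stand-in class below mirrors the initializer shown above so that it runs without the gem installed. The class name SpecSketch and the gs:// path are hypothetical; with the real gem you would instantiate Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReinforcementTuningSpec instead.

```ruby
# Stand-in for the generated class (hypothetical name), reduced to two
# attributes so the example is self-contained.
class SpecSketch
  attr_accessor :training_dataset_uri, :validation_dataset_uri

  def initialize(**args)
    update!(**args)
  end

  def update!(**args)
    @training_dataset_uri = args[:training_dataset_uri] if args.key?(:training_dataset_uri)
    @validation_dataset_uri = args[:validation_dataset_uri] if args.key?(:validation_dataset_uri)
  end
end

# Hypothetical bucket and object names.
spec = SpecSketch.new(training_dataset_uri: 'gs://example-bucket/train.jsonl')

puts spec.training_dataset_uri
puts spec.validation_dataset_uri.inspect
```

Attributes that are not passed to the constructor are simply left unset, so reading them returns nil.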

Instance Attribute Details

#composite_reward_config ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1CompositeReinforcementTuningRewardConfig

Composite reward function configuration for reinforcement tuning. Corresponds to the JSON property compositeRewardConfig



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45804

def composite_reward_config
  @composite_reward_config
end

#hyper_parameters ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReinforcementTuningHyperParameters

Hyperparameters for Reinforcement Tuning. Corresponds to the JSON property hyperParameters



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45809

def hyper_parameters
  @hyper_parameters
end

#single_reward_config ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1SingleReinforcementTuningRewardConfig

SingleReinforcementTuningRewardConfig defines a single reward function configuration for RL tuning. Each reward calculation/evaluation consists of two stages. Stage 1: extract the relevant part of the sample response with a regex, or take the sample response unmodified. Stage 2: call the specific reward scorer to compute the reward and to report whether the sample answer is correct. Wrong and correct answers should be assigned different rewards, and correct answers may also be assigned different rewards from one another. Corresponds to the JSON property singleRewardConfig



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45821

def single_reward_config
  @single_reward_config
end
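
The two-stage evaluation described for single_reward_config can be illustrated with a small sketch. This is not the service's implementation: the method names extract_answer and score, the regex, the exact-match rule, and the reward values 1.0 and 0.0 are all assumptions chosen for illustration.

```ruby
# Stage 1: pull the answer out of the raw response. With no pattern, the
# response is used unmodified.
def extract_answer(response, pattern = nil)
  return response if pattern.nil?

  match = pattern.match(response)
  return nil if match.nil?

  # Prefer the first capture group when the pattern defines one.
  match[1] || match[0]
end

# Stage 2: hypothetical exact-match scorer that returns both the reward and
# a correctness flag. The reward values are arbitrary placeholders.
def score(answer, expected)
  correct = !answer.nil? && answer.strip == expected
  { reward: correct ? 1.0 : 0.0, correct: correct }
end

response = 'Reasoning steps... Final answer: 42'
answer = extract_answer(response, /Final answer:\s*(\S+)/)
result = score(answer, '42')

puts result.inspect
```

A scorer that grades correct answers differently, for example by rewarding shorter responses more, would still fit this shape: only the value stored under :reward changes.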

#training_dataset_uri ⇒ String

Cloud Storage path to the file containing the training dataset for tuning. The dataset must be formatted as a JSONL file. Corresponds to the JSON property trainingDatasetUri

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45827

def training_dataset_uri
  @training_dataset_uri
end

#validation_dataset_uri ⇒ String

Cloud Storage path to the file containing the validation dataset for tuning. The dataset must be formatted as a JSONL file. Corresponds to the JSON property validationDatasetUri

Returns:

  • (String)


# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45833

def validation_dataset_uri
  @validation_dataset_uri
end
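
Because both the training and validation datasets must be JSONL, it can help to check a file locally before uploading it to Cloud Storage. The sketch below only verifies that every non-blank line parses as JSON. The helper name invalid_jsonl_lines and the sample field names are hypothetical, and the per-record schema is not checked because this page does not specify it.

```ruby
require 'json'

# Returns the 1-based numbers of lines that do not parse as JSON.
# Blank lines are skipped.
def invalid_jsonl_lines(text)
  bad = []
  text.each_line.with_index(1) do |line, number|
    next if line.strip.empty?

    begin
      JSON.parse(line)
    rescue JSON::ParserError
      bad << number
    end
  end
  bad
end

# Hypothetical records; the field names are placeholders only.
# The second line is missing its closing brace.
sample = <<~JSONL
  {"prompt": "2 + 2", "reference": "4"}
  {"prompt": "3 + 3", "reference": "6"
  {"prompt": "1 + 1", "reference": "2"}
JSONL

puts invalid_jsonl_lines(sample).inspect
```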

Instance Method Details

#update!(**args) ⇒ Object

Update properties of this object



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 45840

def update!(**args)
  @composite_reward_config = args[:composite_reward_config] if args.key?(:composite_reward_config)
  @hyper_parameters = args[:hyper_parameters] if args.key?(:hyper_parameters)
  @single_reward_config = args[:single_reward_config] if args.key?(:single_reward_config)
  @training_dataset_uri = args[:training_dataset_uri] if args.key?(:training_dataset_uri)
  @validation_dataset_uri = args[:validation_dataset_uri] if args.key?(:validation_dataset_uri)
end
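
The args.key? guards in update! mean that only the keys actually passed are written, and that passing nil explicitly clears an attribute. The sketch below demonstrates both behaviors on a stand-in class that copies the guard pattern for two attributes; the class name UpdateSketch and the gs:// paths are hypothetical.

```ruby
# Stand-in for the generated class (hypothetical name) using the same
# key?-guarded assignments as the update! method shown above.
class UpdateSketch
  attr_accessor :training_dataset_uri, :validation_dataset_uri

  def initialize(**args)
    update!(**args)
  end

  def update!(**args)
    @training_dataset_uri = args[:training_dataset_uri] if args.key?(:training_dataset_uri)
    @validation_dataset_uri = args[:validation_dataset_uri] if args.key?(:validation_dataset_uri)
  end
end

sketch = UpdateSketch.new(
  training_dataset_uri: 'gs://example-bucket/train.jsonl',
  validation_dataset_uri: 'gs://example-bucket/validation.jsonl'
)

# Only the key that is passed is overwritten; the other attribute is kept.
sketch.update!(training_dataset_uri: 'gs://example-bucket/train-v2.jsonl')
kept = sketch.validation_dataset_uri

# Passing nil explicitly clears the attribute, because the guard checks for
# the presence of the key rather than the truthiness of the value.
sketch.update!(validation_dataset_uri: nil)
cleared = sketch.validation_dataset_uri

puts kept
puts cleared.inspect
```

Keys that do not match any guarded attribute are silently ignored by this pattern, so a misspelled keyword will not raise an error.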