Class: Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ValidateReinforcementTuningRewardRequest

Inherits:
Object
Includes:
Core::Hashable, Core::JsonObjectSupport
Defined in:
lib/google/apis/aiplatform_v1beta1/classes.rb,
lib/google/apis/aiplatform_v1beta1/representations.rb

Overview

Request message for GenAiTuningService.ValidateReinforcementTuningReward.

Constructor Details

#initialize(**args) ⇒ GoogleCloudAiplatformV1beta1ValidateReinforcementTuningRewardRequest

Returns a new instance of GoogleCloudAiplatformV1beta1ValidateReinforcementTuningRewardRequest.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64390

def initialize(**args)
  update!(**args)
end

Instance Attribute Details

#composite_reward_config ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1CompositeReinforcementTuningRewardConfig

Composite reward function configuration for reinforcement tuning. Corresponds to the JSON property compositeRewardConfig.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64364

def composite_reward_config
  @composite_reward_config
end

#example ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1ReinforcementTuningExample

User-facing format for Gemini Reinforcement Tuning examples on Vertex. Corresponds to the JSON property example.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64369

def example
  @example
end

#sample_response ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1Content

The structured data content of a message. A Content message contains a role field, which indicates the producer of the content, and a parts field, which contains the multi-part data of the message. Corresponds to the JSON property sampleResponse.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64376

def sample_response
  @sample_response
end

#single_reward_config ⇒ Google::Apis::AiplatformV1beta1::GoogleCloudAiplatformV1beta1SingleReinforcementTuningRewardConfig

SingleReinforcementTuningRewardConfig defines a single reward function configuration for RL tuning. Each reward calculation/evaluation consists of two stages. Stage 1: extract the relevant part of the sample response via regex, or take the sample response unmodified. Stage 2: call the specific reward scorer to compute the reward and to report whether the sample answer is correct. Wrong and correct answers should be assigned different rewards, and correct answers may also be assigned different rewards from one another. Corresponds to the JSON property singleRewardConfig.



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64388

def single_reward_config
  @single_reward_config
end
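
The two stages described for single_reward_config can be pictured with a small local sketch. This is only an illustration of the idea, not the service's implementation: the method name sketch_reward, the regex, the exact-match scorer, and the reward values are all hypothetical and are not part of the Vertex AI API.

```ruby
# Illustrative only: a local sketch of the two-stage reward evaluation.
# sketch_reward, its parameters, and the reward values are hypothetical.
def sketch_reward(sample_response, reference, pattern: nil,
                  correct_reward: 1.0, wrong_reward: 0.0)
  # Stage 1: extract the relevant part via regex, or use the response as is.
  extracted =
    if pattern
      match = sample_response.match(pattern)
      # Prefer the first capture group; fall back to the whole match.
      match ? (match[1] || match[0]) : nil
    else
      sample_response
    end

  # Stage 2: score the extracted text and report whether it is correct.
  # An exact string comparison stands in for a real reward scorer here.
  correct = !extracted.nil? && extracted.strip == reference.strip
  { reward: correct ? correct_reward : wrong_reward, correct: correct }
end

result = sketch_reward('The answer is 42.', '42', pattern: /answer is (\d+)/)
puts result.inspect
```

A scorer that hands out graded rewards for correct answers, as the description allows, would return a value derived from the extracted text rather than a fixed correct_reward.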

Instance Method Details

#update!(**args) ⇒ Object

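Updates the properties of this object. Only keys present in args are assigned; attributes whose keys are omitted keep their current values.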



# File 'lib/google/apis/aiplatform_v1beta1/classes.rb', line 64395

def update!(**args)
  @composite_reward_config = args[:composite_reward_config] if args.key?(:composite_reward_config)
  @example = args[:example] if args.key?(:example)
  @sample_response = args[:sample_response] if args.key?(:sample_response)
  @single_reward_config = args[:single_reward_config] if args.key?(:single_reward_config)
end
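
The key-presence checks in update! have a few consequences that are easy to miss, and a small stand-in class makes them visible. RewardRequestSketch below is hypothetical: it copies the initialize and update! bodies shown on this page so the example runs without the gem, and its attr_accessor line and the placeholder symbol values are assumptions made for illustration. The real attributes hold the generated object types listed above.

```ruby
# Hypothetical stand-in that mirrors the initialize/update! pattern above.
# It is not the generated class; attr_accessor is an assumption.
class RewardRequestSketch
  attr_accessor :composite_reward_config, :example,
                :sample_response, :single_reward_config

  def initialize(**args)
    update!(**args)
  end

  def update!(**args)
    @composite_reward_config = args[:composite_reward_config] if args.key?(:composite_reward_config)
    @example = args[:example] if args.key?(:example)
    @sample_response = args[:sample_response] if args.key?(:sample_response)
    @single_reward_config = args[:single_reward_config] if args.key?(:single_reward_config)
  end
end

# Placeholder symbols stand in for the real nested objects.
request = RewardRequestSketch.new(example: :example_placeholder,
                                  sample_response: :response_placeholder)

# Omitted keys leave the current values untouched.
request.update!(sample_response: :new_response)
example_after_partial_update = request.example

# A key passed explicitly as nil clears the attribute.
request.update!(example: nil)

# Unrecognized keys are accepted by **args and silently ignored,
# so a misspelled keyword does not raise.
request.update!(sample_respnse: :typo)
```

Because a misspelled keyword is ignored rather than rejected, it is worth checking attribute names against this page when a field appears to be missing from the request.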