Class: Kiribi::RuriV3::Ruri30M::Model
- Inherits: Object
  - Object
    - Kiribi::RuriV3::Ruri30M::Model
- Defined in: lib/kiribi/ruri_v3/ruri30m.rb
Instance Attribute Summary
- #onnx_model ⇒ Object (readonly): Returns the value of attribute onnx_model.
- #tokenizer ⇒ Object (readonly): Returns the value of attribute tokenizer.
Instance Method Summary
- #embedding(text) ⇒ Object
- #embedding_normalized(text) ⇒ Object
- #initialize ⇒ Model (constructor): A new instance of Model.
Constructor Details
#initialize ⇒ Model
Returns a new instance of Model.
    # File 'lib/kiribi/ruri_v3/ruri30m.rb', line 21
    def initialize
      @tokenizer = Tokenizers.from_file(TOKENIZER_FILEPATH)
      @onnx_model = OnnxRuntime::Model.new(MODEL_FILEPATH)
    end
Instance Attribute Details
#onnx_model ⇒ Object (readonly)
Returns the value of attribute onnx_model.
    # File 'lib/kiribi/ruri_v3/ruri30m.rb', line 19
    def onnx_model
      @onnx_model
    end
#tokenizer ⇒ Object (readonly)
Returns the value of attribute tokenizer.
    # File 'lib/kiribi/ruri_v3/ruri30m.rb', line 19
    def tokenizer
      @tokenizer
    end
Instance Method Details
#embedding(text) ⇒ Object
    # File 'lib/kiribi/ruri_v3/ruri30m.rb', line 26
    def embedding(text)
      # Tokenize the input text.
      encoded = tokenizer.encode(text)
      # Wrap ids and mask in an outer array to form a batch of one.
      batch = {
        input_ids: [encoded.ids],
        attention_mask: [encoded.attention_mask]
      }
      outputs = onnx_model.predict(batch)
      # Take the embedding for the single batch entry.
      outputs["sentence_embedding"][0]
    end
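The data flow in #embedding can be followed without the real model files. The sketch below is only an illustration: StubEncoding, StubTokenizer, StubModel and embed are hypothetical stand-ins, not part of Kiribi, Tokenizers or OnnxRuntime, and the numbers they produce are meaningless. It shows the batch-of-one shape passed to predict and why the result is indexed with [0].

```ruby
# Hypothetical stand-ins used only to show data shapes; not the real classes.
StubEncoding = Struct.new(:ids, :attention_mask)

class StubTokenizer
  # Pretend every character is one token and every token is attended to.
  def encode(text)
    ids = text.each_char.map(&:ord)
    StubEncoding.new(ids, Array.new(ids.size, 1))
  end
end

class StubModel
  # Return one fake 3-dimensional vector per row of the batch.
  def predict(batch)
    rows = batch[:input_ids].map { |ids| [ids.size.to_f, ids.sum.to_f, 1.0] }
    { "sentence_embedding" => rows }
  end
end

# Same steps as #embedding, with the collaborators passed in explicitly.
def embed(tokenizer, model, text)
  encoded = tokenizer.encode(text)
  # The outer arrays make this a batch containing a single sequence.
  batch = { input_ids: [encoded.ids], attention_mask: [encoded.attention_mask] }
  # One row comes back per batch entry, so [0] selects the only one.
  model.predict(batch)["sentence_embedding"][0]
end

vec = embed(StubTokenizer.new, StubModel.new, "ab")
puts vec.inspect
```

Passing the tokenizer and model as arguments is a choice made for this sketch so it runs standalone; the documented method reads them from its readonly attributes instead.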
#embedding_normalized(text) ⇒ Object
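    # File 'lib/kiribi/ruri_v3/ruri30m.rb', line 36
    def embedding_normalized(text)
      vec = embedding(text)
      # Euclidean (L2) norm of the embedding.
      norm = Math.sqrt(vec.sum { it * it })
      # Scale every component so the result has unit length.
      vec.map { it / norm }
    end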
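The normalization step is ordinary L2 scaling, so it can be tried on plain arrays. The following is a minimal sketch under stated assumptions: l2_normalize and dot are hypothetical helper names, the input vectors are made up rather than real model output, and the zero-norm guard is an addition that the documented method does not have. Explicit block parameters are used in place of `it`, which requires Ruby 3.4 or later.

```ruby
# Hypothetical helper mirroring the arithmetic in #embedding_normalized.
def l2_normalize(vec)
  norm = Math.sqrt(vec.sum { |x| x * x })
  # Assumption for this sketch: return an all-zero vector unchanged
  # instead of dividing by zero.
  return vec.dup if norm.zero?

  vec.map { |x| x / norm }
end

# Dot product of two equal-length vectors.
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Made-up vectors, each with Euclidean norm 5.0 before scaling.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])

puts a.inspect
# For unit-length vectors the dot product equals the cosine similarity.
puts dot(a, b)
```

Unit-length vectors are convenient when comparing embeddings, because cosine similarity reduces to a dot product and no norms need to be recomputed at query time.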