Class: Vivlio::Starter::CLI::IndexCommands::ScoringEngine
- Inherits: Object
  - Object
  - Vivlio::Starter::CLI::IndexCommands::ScoringEngine
- Defined in:
- lib/vivlio/starter/cli/index/scoring_engine.rb
Overview
Scoring engine.
Constant Summary
- WEIGHTS =
Scoring weights
{
  tf: 1.0,               # term frequency
  idf: 5.0,              # IDF coefficient
  definition: 30.0,      # definition-pattern bonus
  technical: 15.0,       # technical-term bonus
  heading: 20.0,         # bonus for appearing near a heading
  first_occurrence: 10.0 # bonus for appearing at the start of a chapter
}.freeze
Instance Attribute Summary
-
#scores ⇒ Object
readonly
Returns the value of attribute scores.
Instance Method Summary
-
#add_score(term, component, value) ⇒ Object
Adds a weighted score component to a term.
-
#calculate_tfidf(term, tf, df, doc_count) ⇒ Object
Calculates the TF-IDF score.
-
#debug_scores(term) ⇒ Object
For debugging: returns the score breakdown for a term.
-
#filter_by_threshold(threshold) ⇒ Hash
Returns the terms whose total score is at or above the threshold.
-
#initialize ⇒ ScoringEngine
constructor
A new instance of ScoringEngine.
-
#reset! ⇒ Object
Resets all scores.
Constructor Details
#initialize ⇒ ScoringEngine
Returns a new instance of ScoringEngine.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 39
def initialize
  @scores = Hash.new { |h, k| h[k] = { total: 0.0, components: {} } }
end
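The constructor relies on a `Hash` default block, which is easy to misread. The following is a minimal sketch of that pattern in isolation; the local variable `scores` and the sample keys are illustrative and not part of the library.

```ruby
# Same default-block pattern as the constructor above, outside the class.
scores = Hash.new { |h, k| h[k] = { total: 0.0, components: {} } }

# Reading a missing key runs the block, stores the fresh entry, and returns it.
entry = scores["Ruby"]
entry[:total] += 1.5

puts scores.key?("Ruby")     # true
puts scores["Ruby"][:total]  # 1.5

# Every key gets its own hash, so entries never share state.
puts scores["Ruby"].equal?(scores["YARD"]) # false
```

Because the block assigns into the hash, even a plain read of an unknown term leaves an empty entry behind.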
Instance Attribute Details
#scores ⇒ Object (readonly)
Returns the value of attribute scores.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 37
def scores
  @scores
end
Instance Method Details
#add_score(term, component, value) ⇒ Object
Adds a score to a term. The value is multiplied by the weight registered for the component in WEIGHTS (1.0 for unknown components) and accumulated into both the per-component entry and the total.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 47
def add_score(term, component, value)
  weight = WEIGHTS[component] || 1.0
  weighted_value = value * weight
  @scores[term][:components][component] ||= 0.0
  @scores[term][:components][component] += weighted_value
  @scores[term][:total] += weighted_value
end
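To see how the weights accumulate, here is a small sketch. `ScoringEngineSketch` is a hypothetical stand-in that copies WEIGHTS, the constructor, and `add_score` from the listings on this page so the example runs without the gem; the `:custom` component and the sample term are likewise assumptions.

```ruby
# Stand-in reproducing WEIGHTS, initialize and add_score from this page.
class ScoringEngineSketch
  WEIGHTS = {
    tf: 1.0, idf: 5.0, definition: 30.0,
    technical: 15.0, heading: 20.0, first_occurrence: 10.0
  }.freeze

  attr_reader :scores

  def initialize
    @scores = Hash.new { |h, k| h[k] = { total: 0.0, components: {} } }
  end

  def add_score(term, component, value)
    weight = WEIGHTS[component] || 1.0
    weighted_value = value * weight
    @scores[term][:components][component] ||= 0.0
    @scores[term][:components][component] += weighted_value
    @scores[term][:total] += weighted_value
  end
end

engine = ScoringEngineSketch.new
engine.add_score("TF-IDF", :definition, 1) # 1 * 30.0
engine.add_score("TF-IDF", :heading, 1)    # 1 * 20.0
engine.add_score("TF-IDF", :heading, 1)    # accumulates to 40.0 for :heading
engine.add_score("TF-IDF", :custom, 2)     # hypothetical component, weight falls back to 1.0

puts engine.scores["TF-IDF"][:total]                # 72.0
puts engine.scores["TF-IDF"][:components][:heading] # 40.0
```

Repeated calls for the same component add up rather than overwrite, so the total always equals the sum of the component values.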
#calculate_tfidf(term, tf, df, doc_count) ⇒ Object
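Calculates the TF-IDF score. Does nothing when tf is zero; otherwise it adds the raw term frequency and a smoothed IDF as separate weighted components.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 61
def calculate_tfidf(term, tf, df, doc_count)
  return if tf.zero?

  # Smoothed IDF: the +1.0 terms avoid division by zero and keep the value positive.
  idf = Math.log((doc_count + 1.0) / (df + 1.0)) + 1.0
  add_score(term, :tf, tf)
  add_score(term, :idf, idf)
end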
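A worked example may make the formula above more concrete. This sketch evaluates the same smoothed IDF expression for a hypothetical 9-document corpus and applies the `:tf` and `:idf` weights from WEIGHTS. The helper name `smoothed_idf`, the corpus size, and the frequencies are assumptions for illustration. As in the listing, the two components are summed rather than multiplied.

```ruby
# Smoothed IDF, written exactly as in calculate_tfidf above.
def smoothed_idf(df, doc_count)
  Math.log((doc_count + 1.0) / (df + 1.0)) + 1.0
end

tf_weight = 1.0   # WEIGHTS[:tf]
idf_weight = 5.0  # WEIGHTS[:idf]

# Hypothetical corpus of 9 documents.
rare_idf = smoothed_idf(1, 9)    # ln(10 / 2) + 1
common_idf = smoothed_idf(9, 9)  # ln(10 / 10) + 1 = 1.0

# Contribution for a term that occurs 3 times.
tf = 3
rare_total = tf * tf_weight + rare_idf * idf_weight
common_total = tf * tf_weight + common_idf * idf_weight

puts rare_total.round(2)   # 16.05
puts common_total.round(2) # 8.0
```

A term found in every document still receives the minimum IDF of 1.0, so with these weights the IDF component never contributes less than 5.0.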
#debug_scores(term) ⇒ Object
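For debugging: returns the score breakdown for a term, with the total and every component rounded to two decimal places.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 84
def debug_scores(term)
  # @scores has a default block, so for an unknown term this lookup
  # creates and returns an empty entry and the nil guard does not fire.
  data = @scores[term]
  return nil unless data

  {
    term: term,
    total: data[:total].round(2),
    # `it` is the implicit block parameter (Ruby 3.4+).
    components: data[:components].transform_values { it.round(2) }
  }
end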
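The sketch below illustrates the shape of the breakdown and what happens for a term that was never scored. It is a free-standing approximation, not the library method: it takes the scores hash as an extra argument and uses an explicit block parameter instead of `it` so that it also runs on Ruby versions before 3.4. The sample terms and values are made up.

```ruby
# Free-standing variant of debug_scores; `scores` is passed in explicitly.
def debug_scores(scores, term)
  data = scores[term]
  return nil unless data

  {
    term: term,
    total: data[:total].round(2),
    components: data[:components].transform_values { |v| v.round(2) }
  }
end

# Same default-block hash as the constructor on this page.
scores = Hash.new { |h, k| h[k] = { total: 0.0, components: {} } }
scores["TF-IDF"][:components][:idf] = 13.04719
scores["TF-IDF"][:total] = 13.04719

report = debug_scores(scores, "TF-IDF")
puts report[:total]            # 13.05
puts report[:components][:idf] # 13.05

# An unknown term does not yield nil: the default block creates an entry.
unknown = debug_scores(scores, "missing")
puts unknown[:total]           # 0.0
puts scores.key?("missing")    # true
```

If a nil result for unknown terms matters, checking `scores.key?(term)` before the lookup avoids the implicit insert.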
#filter_by_threshold(threshold) ⇒ Hash
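Returns the terms whose total score is at or above the threshold, sorted by total score in descending order.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 72
def filter_by_threshold(threshold)
  # _2 is the second numbered block parameter, here the score data for each term.
  @scores.select { _2[:total] >= threshold }
         .sort_by { -_2[:total] }
         .to_h
end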
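The select, sort, and rebuild chain can be tried on a plain hash. This is a sketch with made-up terms and totals; it uses named block parameters in place of `_2` for readability, which should behave the same way.

```ruby
# Hypothetical score table in the same shape the engine stores.
scores = {
  "index"   => { total: 12.0, components: {} },
  "TF-IDF"  => { total: 72.0, components: {} },
  "the"     => { total: 4.0,  components: {} },
  "heading" => { total: 35.5, components: {} }
}

threshold = 12.0

selected = scores
  .select { |_term, data| data[:total] >= threshold } # keep totals >= threshold
  .sort_by { |_term, data| -data[:total] }            # highest total first
  .to_h                                               # pairs back into a Hash

p selected.keys # ["TF-IDF", "heading", "index"]
```

The comparison is inclusive, so a term whose total equals the threshold is kept, and the returned hash preserves the descending order because Ruby hashes keep insertion order.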
#reset! ⇒ Object
Resets all scores by clearing the stored entries.
# File 'lib/vivlio/starter/cli/index/scoring_engine.rb', line 79
def reset!
  @scores.clear
end