Class: Woods::Retrieval::Ranker

Inherits:
Object
Defined in:
lib/woods/retrieval/ranker.rb

Overview

Ranks search candidates using weighted signal scoring and diversity adjustment.

Combines multiple ranking signals into a final score:

  • Semantic similarity from vector search

  • Keyword match quality

  • Recency (git change frequency)

  • Importance (PageRank / structural importance)

  • Type match (bonus when result type matches query target_type)

  • Diversity (penalty for too many results of same type/namespace)

When candidates come from multiple retrieval sources, applies Reciprocal Rank Fusion (RRF) to merge them before the weighted signals are scored.

Examples:

ranker = Ranker.new(metadata_store: store)
ranked = ranker.rank(candidates, classification: classification)

Constant Summary

WEIGHTS =

Signal weights for ranking — sum to 1.0.

{
  semantic: 0.40,
  keyword: 0.20,
  recency: 0.15,
  importance: 0.10,
  type_match: 0.10,
  diversity: 0.05
}.freeze
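
As a rough illustration of how weights like these could combine into one score, here is a minimal sketch. It assumes each signal has already been normalized to the range 0.0..1.0 and is passed in as a Hash. The `weighted_score` helper and the `signals` Hash are hypothetical and are not taken from the Ranker source.

```ruby
# Weights copied from the constant documented above.
WEIGHTS = {
  semantic: 0.40,
  keyword: 0.20,
  recency: 0.15,
  importance: 0.10,
  type_match: 0.10,
  diversity: 0.05
}.freeze

# Hypothetical helper: weighted sum of per-signal scores.
# `signals` maps signal name => score in 0.0..1.0; a missing signal counts as 0.0.
def weighted_score(signals, weights = WEIGHTS)
  weights.sum { |name, weight| weight * signals.fetch(name, 0.0) }
end

# Illustrative signal values for a single candidate (made up for this sketch).
signals = {
  semantic: 0.9,
  keyword: 0.5,
  recency: 0.2,
  importance: 0.4,
  type_match: 1.0,
  diversity: 1.0
}
score = weighted_score(signals)
```

Because the weights sum to 1.0, the combined score stays in 0.0..1.0 whenever every input signal does.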
RRF_K =

RRF constant — balances rank position vs. absolute score. Standard value from the original RRF paper (Cormack et al., 2009).

60
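
For readers unfamiliar with RRF, the following sketch shows the standard formula: each item scores the sum of 1 / (k + rank) over every ranked list it appears in, with 1-based ranks. The `rrf_scores` helper and the shape of `rankings` are assumptions made for illustration; the Ranker's own private `apply_rrf` may be structured differently.

```ruby
RRF_K = 60

# Hypothetical helper: fuse several ranked lists with Reciprocal Rank Fusion.
# `rankings` maps a source name => Array of identifiers, best first.
def rrf_scores(rankings, k: RRF_K)
  scores = Hash.new(0.0)
  rankings.each_value do |identifiers|
    identifiers.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1) # index is 0-based, rank is 1-based
    end
  end
  scores
end

# Illustrative input: two retrieval sources that partly overlap.
rankings = {
  vector: %w[User Order Invoice],
  keyword: %w[Order Invoice]
}
scores = rrf_scores(rankings)
fused = scores.sort_by { |_id, score| -score }.map(&:first)
```

In this example `Order` and `Invoice` appear in both lists, so they outrank `User` even though `User` is first in the vector results. A larger `k` flattens the gap between adjacent ranks.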

Instance Method Summary

Constructor Details

#initialize(metadata_store:, graph_store: nil) ⇒ Ranker

Returns a new instance of Ranker.

Parameters:

  • metadata_store (#find)

    Store that resolves identifiers to unit metadata

  • graph_store (#pagerank, nil) (defaults to: nil)

    Optional graph store exposing PageRank scores. When present, PageRank rank-percentile replaces the bucketed importance signal.



# File 'lib/woods/retrieval/ranker.rb', line 40

def initialize(metadata_store:, graph_store: nil)
  @metadata_store = metadata_store
  @graph_store = graph_store
end
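
The rank-percentile idea mentioned for `graph_store` can be sketched as follows. This assumes PageRank scores are available as a Hash of identifier => raw score, which is a guess about what `#pagerank` returns; the `rank_percentiles` helper is hypothetical and ignores ties for simplicity.

```ruby
# Hypothetical helper: convert raw PageRank scores into rank percentiles in 0.0..1.0.
# `pagerank` is assumed to be a Hash of identifier => raw score.
def rank_percentiles(pagerank)
  return {} if pagerank.empty?
  return pagerank.transform_values { 1.0 } if pagerank.size == 1

  # Order identifiers from lowest to highest score, then scale position to 0.0..1.0.
  ordered = pagerank.sort_by { |_id, score| score }.map(&:first)
  last = (ordered.size - 1).to_f
  ordered.each_with_index.to_h { |id, index| [id, index / last] }
end

# Illustrative scores (made up for this sketch).
percentiles = rank_percentiles(
  "User" => 0.31,
  "Order" => 0.12,
  "Invoice" => 0.05,
  "Audit" => 0.02
)
```

A percentile depends only on ordering, so a heavily skewed PageRank distribution still spreads evenly across 0.0..1.0 before it is weighted.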

Instance Method Details

#rank(candidates, classification:) ⇒ Array<Candidate>

Rank candidates by weighted signal scoring with diversity adjustment.

Parameters:

Returns:

  • (Array<Candidate>)

    Re-ranked candidates (best first)



# File 'lib/woods/retrieval/ranker.rb', line 50

def rank(candidates, classification:)
  return [] if candidates.empty?

  # Apply RRF if candidates come from multiple sources
  candidates = apply_rrf(candidates) if multi_source?(candidates)

  scored = score_candidates(candidates, classification)
  sorted = sorted_by_weighted_score(scored)
  apply_diversity_penalty(sorted)

  sorted.map { |item| item[:candidate] }
end
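
One way a diversity adjustment of this kind could work is sketched below. It is an assumption-laden illustration, not the Ranker's private `apply_diversity_penalty`: the `penalize_repeated_types` helper, the `penalty` value, and the item shape (`:id`, `:type`, `:score`) are all hypothetical. Unlike the `rank` method shown above, this sketch re-sorts after penalizing.

```ruby
# Hypothetical helper: lower the score of each item whose type has already
# appeared earlier in the list, then re-sort by the adjusted score.
# `items` must be sorted best first; it is mutated in place.
def penalize_repeated_types(items, penalty: 0.05)
  seen = Hash.new(0)
  items.each do |item|
    item[:score] -= penalty * seen[item[:type]] # no penalty on first occurrence
    seen[item[:type]] += 1
  end
  items.sort_by { |item| -item[:score] }
end

# Illustrative candidates (made up for this sketch).
items = [
  { id: "User", type: :model, score: 0.80 },
  { id: "Account", type: :model, score: 0.78 },
  { id: "UsersController", type: :controller, score: 0.76 },
  { id: "Profile", type: :model, score: 0.75 }
]
reranked = penalize_repeated_types(items).map { |item| item[:id] }
```

Here the second and third models are pushed down, which lets the controller move up to second place.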