Class: Woods::Retrieval::QueryClassifier

Inherits:
Object
Defined in:
lib/woods/retrieval/query_classifier.rb

Overview

Classifies natural-language queries to determine the retrieval strategy.

Uses heuristic pattern matching to determine:

  • Intent: what the user wants to do

  • Scope: how broad the search should be

  • Target type: what kind of code unit to look for

  • Framework context: whether the query is about Rails or gems rather than application code

Defined Under Namespace

Classes: Classification

Constant Summary

INTENTS =
%i[understand locate trace debug implement reference compare framework].freeze
SCOPES =
%i[pinpoint focused exploratory comprehensive].freeze
STOP_WORDS =
Set.new(%w[the a an is are was were be been being have has had do does did will would could
should may might can shall in on at to for of and or but not with by from as
this that these those it its how what when where why who which]).freeze
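
The stop-word list is presumably what `extract_keywords` filters against, but that method's body is not shown on this page. The following is a minimal sketch under that assumption: `extract_keywords_sketch` is a hypothetical helper, and the tokenisation (lowercasing, splitting on non-word characters, de-duplicating) is a guess rather than the library's documented behaviour.

```ruby
require "set"

# Stop-word list copied from the STOP_WORDS constant above.
STOP_WORDS = Set.new(%w[the a an is are was were be been being have has had do does did will would could
                        should may might can shall in on at to for of and or but not with by from as
                        this that these those it its how what when where why who which]).freeze

# Hypothetical helper: the real extract_keywords is not shown on this page.
def extract_keywords_sketch(query)
  query.downcase
       .scan(/[a-z0-9_]+/)                          # word-like tokens only
       .reject { |word| STOP_WORDS.include?(word) } # drop stop words
       .uniq                                        # keep first occurrence of each keyword
end

p extract_keywords_sketch("Where is the User model validation for email?")
# => ["user", "model", "validation", "email"]
```

Using a `Set` keeps each membership check constant-time, which is likely why the constant is not a plain array.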
INTENT_PATTERNS =

Intent patterns — order matters (first match wins)

{
  locate: /\b(where|find|which file|locate|look for|search for)\b/i,
  trace: /\b(trace|follow|track|call(s|ed by)|depends on|used by|who calls|what calls)\b/i,
  debug: /\b(bug|error|fix|broken|failing|wrong|issue|problem|crash|exception)\b/i,
  implement: /\b(implement|add|create|build|write|make|generate)\b/i,
  compare: /\b(compare|difference|vs|versus|between|contrast)\b/i,
  # rubocop:disable Layout/LineLength
  framework: /\b(how does rails|what does rails|rails .+ work|work.+\brails\b|in rails\b|activerecord|actioncontroller|activejob|actionmailer|actioncable|actiontext|activestorage|solid_queue|solid_cache|solid_cable|kamal|propshaft|importmap|hotwire|turbo|stimulus|zeitwerk)\b/i,
  # rubocop:enable Layout/LineLength
  reference: /\b(show me|what is|what are|list|options for|api|interface|signature)\b/i,
  understand: /\b(how|why|explain|understand|what happens|describe|overview)\b/i
}.freeze
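
To illustrate "first match wins", here is a small sketch that walks a subset of the patterns above in their declared order. `detect_intent_sketch` and `INTENT_PATTERNS_SUBSET` are hypothetical names; the real `detect_intent` is not shown on this page, so returning `nil` when nothing matches is an assumption.

```ruby
# Subset of INTENT_PATTERNS copied from above, in the same relative order.
INTENT_PATTERNS_SUBSET = {
  locate: /\b(where|find|which file|locate|look for|search for)\b/i,
  debug: /\b(bug|error|fix|broken|failing|wrong|issue|problem|crash|exception)\b/i,
  implement: /\b(implement|add|create|build|write|make|generate)\b/i,
  understand: /\b(how|why|explain|understand|what happens|describe|overview)\b/i
}.freeze

# Hypothetical helper: returns the first intent whose pattern matches.
# Returning nil on no match is an assumption; the real fallback is not shown here.
def detect_intent_sketch(query)
  match = INTENT_PATTERNS_SUBSET.find { |_intent, pattern| pattern.match?(query) }
  match&.first
end

p detect_intent_sketch("Where is the error raised?") # => :locate
p detect_intent_sketch("Why is checkout failing?")   # => :debug
p detect_intent_sketch("How does checkout work?")    # => :understand
p detect_intent_sketch("checkout")                   # => nil
```

Ruby hashes preserve insertion order, so the order of keys in the literal decides the outcome: "Why is checkout failing?" contains both "why" (understand) and "failing" (debug), and resolves to `:debug` because that key comes first.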
SCOPE_PATTERNS =

Scope patterns

{
  pinpoint: /\b(exactly|specific|this one|just the|only the)\b/i,
  comprehensive: /\b(all|every|entire|whole|complete|everything)\b/i,
  exploratory: /\b(related|around|near|similar|like|associated)\b/i
}.freeze
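
`SCOPES` lists four values but only three have patterns, which suggests that `:focused` acts as the fallback. The sketch below assumes exactly that, and also assumes the patterns are tried in declaration order; neither is confirmed by this page. `detect_scope_sketch` is a hypothetical helper.

```ruby
# Copied from the SCOPE_PATTERNS constant above.
SCOPE_PATTERNS = {
  pinpoint: /\b(exactly|specific|this one|just the|only the)\b/i,
  comprehensive: /\b(all|every|entire|whole|complete|everything)\b/i,
  exploratory: /\b(related|around|near|similar|like|associated)\b/i
}.freeze

# Hypothetical helper: first matching scope, else the default.
# The :focused default is an assumption, not taken from the source.
def detect_scope_sketch(query, default: :focused)
  SCOPE_PATTERNS.each { |scope, pattern| return scope if pattern.match?(query) }
  default
end

p detect_scope_sketch("Show exactly the method that sends the receipt") # => :pinpoint
p detect_scope_sketch("List all callbacks on Order")                    # => :comprehensive
p detect_scope_sketch("Code related to billing")                        # => :exploratory
p detect_scope_sketch("How is tax computed?")                           # => :focused
```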
TARGET_PATTERNS =

Target type patterns

{
  model: /\b(model|activerecord|association|schema|table|column|scope|validation)\b/i,
  controller: /\b(controller|action|route|endpoint|api|request|response|filter|callback)\b/i,
  service: /\b(service|interactor|operation|command|use.?case|business.?logic)\b/i,
  job: /\b(job|worker|background|async|sidekiq|queue|perform)\b/i,
  mailer: /\b(mailer|email|notification|send.?mail)\b/i,
  graphql: /\b(graphql|mutation|query|type|resolver|field|argument|schema)\b/i,
  concern: /\b(concern|mixin|module|included|extend)\b/i,
  route: /\b(route|path|url|endpoint|uri|http|get|post|put|patch|delete)\b/i,
  middleware: /\b(middleware|rack|request.?pipeline|before.?action)\b/i,
  i18n: /\b(i18n|translation|locale|internationalization|t\(|translate)\b/i,
  pundit_policy: /\b(pundit|authorize|policy|allowed|permitted)\b/i,
  configuration: /\b(config|initializer|environment|setting|configure)\b/i,
  engine: /\b(engine|mountable|mount|railtie|plugin|isolated.?namespace)\b/i,
  view_template: /\b(view|template|partial|render|erb|layout|html)\b/i,
  # rubocop:disable Layout/LineLength
  migration: /\b(migration|migrate|schema.?change|add.?column|remove.?column|create.?table|drop.?table|db.?migrate)\b/i,
  action_cable_channel: /\b(action.?cable|websocket|broadcast|cable.?channel|subscription.?channel|realtime|real.?time)\b/i,
  scheduled_job: /\b(schedule[dr]?|recurring|cron|periodic|every\s+\d|daily|hourly|weekly|solid.?queue.*recur|sidekiq.?cron|whenever)\b/i,
  rake_task: /\b(rake|rake.?task|lib.?tasks?|maintenance.?script|batch.?script)\b/i
  # rubocop:enable Layout/LineLength
}.freeze
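
Several target patterns share words: "schema" appears under both `model` and `graphql`, and "endpoint" under both `controller` and `route`. The sketch below lists every target whose pattern matches, using a subset of the patterns above. `matching_targets` is a hypothetical helper; how the real `detect_target_type` picks one target from overlapping matches is not shown on this page.

```ruby
# Subset of TARGET_PATTERNS copied from above.
TARGET_PATTERNS_SUBSET = {
  model: /\b(model|activerecord|association|schema|table|column|scope|validation)\b/i,
  controller: /\b(controller|action|route|endpoint|api|request|response|filter|callback)\b/i,
  graphql: /\b(graphql|mutation|query|type|resolver|field|argument|schema)\b/i,
  route: /\b(route|path|url|endpoint|uri|http|get|post|put|patch|delete)\b/i
}.freeze

# Hypothetical helper: every target type whose pattern matches the query.
def matching_targets(query)
  TARGET_PATTERNS_SUBSET.select { |_target, pattern| pattern.match?(query) }.keys
end

p matching_targets("Which endpoint updates the schema?")     # => [:model, :controller, :graphql, :route]
p matching_targets("Where is the User association defined?") # => [:model]
p matching_targets("Explain the billing flow")               # => []
```

Because each alternation is wrapped in `\b...\b`, only the exact word forms listed will match: a query containing "routes" does not match `route`.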

Instance Method Summary

  • #classify(query) ⇒ Classification

    Classify a query string

Instance Method Details

#classify(query) ⇒ Classification

Classify a query string

Parameters:

  • query (String)

    Natural language query

Returns:

  • (Classification)

# File 'lib/woods/retrieval/query_classifier.rb', line 75

def classify(query)
  Classification.new(
    intent: detect_intent(query),
    scope: detect_scope(query),
    target_type: detect_target_type(query),
    framework_context: framework_query?(query),
    keywords: extract_keywords(query)
  )
end