Class: Woods::Retrieval::QueryClassifier
- Inherits: Object
  - Object
  - Woods::Retrieval::QueryClassifier
- Defined in: lib/woods/retrieval/query_classifier.rb
Overview
Classifies natural language queries to determine retrieval strategy.
Uses heuristic pattern matching to determine:
- Intent: what the user wants to do
- Scope: how broad the search should be
- Target type: what kind of code unit to look for
- Framework context: whether the query is about Rails/gems vs. app code
Defined Under Namespace
Classes: Classification
Constant Summary
- INTENTS =
%i[understand locate trace debug implement reference compare framework].freeze
- SCOPES =
%i[pinpoint focused exploratory comprehensive].freeze
- STOP_WORDS =
Set.new(%w[the a an is are was were be been being have has had do does did will would could should may might can shall in on at to for of and or but not with by from as this that these those it its how what when where why who which]).freeze
- INTENT_PATTERNS =
Intent patterns — order matters (first match wins)
{
  locate: /\b(where|find|which file|locate|look for|search for)\b/i,
  trace: /\b(trace|follow|track|call(s|ed by)|depends on|used by|who calls|what calls)\b/i,
  debug: /\b(bug|error|fix|broken|failing|wrong|issue|problem|crash|exception)\b/i,
  implement: /\b(implement|add|create|build|write|make|generate)\b/i,
  compare: /\b(compare|difference|vs|versus|between|contrast)\b/i,
  # rubocop:disable Layout/LineLength
  framework: /\b(how does rails|what does rails|rails .+ work|work.+\brails\b|in rails\b|activerecord|actioncontroller|activejob|actionmailer|actioncable|actiontext|activestorage|solid_queue|solid_cache|solid_cable|kamal|propshaft|importmap|hotwire|turbo|stimulus|zeitwerk)\b/i,
  # rubocop:enable Layout/LineLength
  reference: /\b(show me|what is|what are|list|options for|api|interface|signature)\b/i,
  understand: /\b(how|why|explain|understand|what happens|describe|overview)\b/i
}.freeze
- SCOPE_PATTERNS =
Scope patterns
{
  pinpoint: /\b(exactly|specific|this one|just the|only the)\b/i,
  comprehensive: /\b(all|every|entire|whole|complete|everything)\b/i,
  exploratory: /\b(related|around|near|similar|like|associated)\b/i
}.freeze
- TARGET_PATTERNS =
Target type patterns
{
  model: /\b(model|activerecord|association|schema|table|column|scope|validation)\b/i,
  controller: /\b(controller|action|route|endpoint|api|request|response|filter|callback)\b/i,
  service: /\b(service|interactor|operation|command|use.?case|business.?logic)\b/i,
  job: /\b(job|worker|background|async|sidekiq|queue|perform)\b/i,
  mailer: /\b(mailer|email|notification|send.?mail)\b/i,
  graphql: /\b(graphql|mutation|query|type|resolver|field|argument|schema)\b/i,
  concern: /\b(concern|mixin|module|included|extend)\b/i,
  route: /\b(route|path|url|endpoint|uri|http|get|post|put|patch|delete)\b/i,
  middleware: /\b(middleware|rack|request.?pipeline|before.?action)\b/i,
  i18n: /\b(i18n|translation|locale|internationalization|t\(|translate)\b/i,
  pundit_policy: /\b(pundit|authorize|policy|allowed|permitted)\b/i,
  configuration: /\b(config|initializer|environment|setting|configure)\b/i,
  engine: /\b(engine|mountable|mount|railtie|plugin|isolated.?namespace)\b/i,
  view_template: /\b(view|template|partial|render|erb|layout|html)\b/i,
  # rubocop:disable Layout/LineLength
  migration: /\b(migration|migrate|schema.?change|add.?column|remove.?column|create.?table|drop.?table|db.?migrate)\b/i,
  action_cable_channel: /\b(action.?cable|websocket|broadcast|cable.?channel|subscription.?channel|realtime|real.?time)\b/i,
  scheduled_job: /\b(schedule[dr]?|recurring|cron|periodic|every\s+\d|daily|hourly|weekly|solid.?queue.*recur|sidekiq.?cron|whenever)\b/i,
  rake_task: /\b(rake|rake.?task|lib.?tasks?|maintenance.?script|batch.?script)\b/i
  # rubocop:enable Layout/LineLength
}.freeze
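As a rough illustration of how these pattern tables behave, the sketch below copies a few entries from INTENT_PATTERNS and TARGET_PATTERNS and runs sample queries against them. The `SKETCH_` constants, the helper names `sketch_detect_intent` and `sketch_matching_targets`, and the `default:` fallback are assumptions made for this example; the class's private detection methods are not shown on this page and may work differently.

```ruby
# Subset of INTENT_PATTERNS, copied from the constant above (insertion order preserved).
SKETCH_INTENT_PATTERNS = {
  locate: /\b(where|find|which file|locate|look for|search for)\b/i,
  debug: /\b(bug|error|fix|broken|failing|wrong|issue|problem|crash|exception)\b/i,
  understand: /\b(how|why|explain|understand|what happens|describe|overview)\b/i
}.freeze

# Subset of TARGET_PATTERNS, copied from the constant above.
SKETCH_TARGET_PATTERNS = {
  model: /\b(model|activerecord|association|schema|table|column|scope|validation)\b/i,
  controller: /\b(controller|action|route|endpoint|api|request|response|filter|callback)\b/i,
  route: /\b(route|path|url|endpoint|uri|http|get|post|put|patch|delete)\b/i
}.freeze

# Hypothetical helper: returns the first intent whose pattern matches.
# The :understand fallback is an assumption, not documented behavior.
def sketch_detect_intent(query, default: :understand)
  match = SKETCH_INTENT_PATTERNS.find { |_intent, pattern| pattern.match?(query) }
  match ? match.first : default
end

# Hypothetical helper: lists every target type whose pattern matches,
# which makes overlapping vocabularies visible.
def sketch_matching_targets(query)
  SKETCH_TARGET_PATTERNS.select { |_type, pattern| pattern.match?(query) }.keys
end

intent = sketch_detect_intent('Where is the error handled?')
targets = sketch_matching_targets('Which route handles signup?')
puts intent.inspect
puts targets.inspect
```

In the first query both `locate` ("where") and `debug` ("error") match, and `locate` is returned because it appears earlier in the hash. In the second query the word "route" appears in both the `controller` and `route` patterns, so how a single target type is chosen depends on the resolution strategy of the real detector.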
Instance Method Summary
-
#classify(query) ⇒ Classification
Classify a query string.
Instance Method Details
#classify(query) ⇒ Classification
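Classify a query string. Returns a Classification holding the detected intent, scope, target type, framework context, and extracted keywords.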
# File 'lib/woods/retrieval/query_classifier.rb', line 75
def classify(query)
  Classification.new(
    intent: detect_intent(query),
    scope: detect_scope(query),
    target_type: detect_target_type(query),
    framework_context: framework_query?(query),
    keywords: extract_keywords(query)
  )
end
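To make the shape of the result concrete, here is a minimal, self-contained sketch that mirrors the structure of classify. It is not the library's implementation: `SketchClassification` is a Struct standing in for the nested Classification class (whose definition is not shown here), the stop-word list is abbreviated, the `:focused` and `nil` fallbacks are guesses, and the `sketch_` helpers are hypothetical simplifications of the private detectors.

```ruby
require 'set'

# Stand-in for Woods::Retrieval::QueryClassifier::Classification (assumed shape;
# the field names are taken from the classify source above).
SketchClassification = Struct.new(
  :intent, :scope, :target_type, :framework_context, :keywords,
  keyword_init: true
)

# Abbreviated copy of STOP_WORDS, for illustration only.
SKETCH_STOP_WORDS = Set.new(%w[the a an is are in of to for how what where does]).freeze

# Hypothetical keyword extraction: lowercase, tokenize, drop stop words, dedupe.
def sketch_extract_keywords(query)
  query.downcase
       .scan(/[a-z0-9_]+/)
       .reject { |word| SKETCH_STOP_WORDS.include?(word) }
       .uniq
end

# Hypothetical classifier using heavily trimmed patterns.
# The :focused scope fallback and nil target fallback are assumptions.
def sketch_classify(query)
  SketchClassification.new(
    intent: query.match?(/\b(where|find|locate)\b/i) ? :locate : :understand,
    scope: query.match?(/\b(all|every|entire)\b/i) ? :comprehensive : :focused,
    target_type: query.match?(/\b(model|validation)\b/i) ? :model : nil,
    framework_context: query.match?(/\b(activerecord|in rails)\b/i),
    keywords: sketch_extract_keywords(query)
  )
end

result = sketch_classify('Where is the User model validation?')
puts result.to_h.inspect
```

Under these assumptions the sample query is classified as a `:locate` intent with `:focused` scope, a `:model` target, no framework context, and the keywords `user`, `model`, and `validation`.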