Class: Kotoshu::Suggestions::Strategies::EditDistanceStrategy
- Inherits: BaseStrategy
- Inheritance chain: Kotoshu::Suggestions::Strategies::EditDistanceStrategy < BaseStrategy < Object
- Defined in: lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb
Overview
Edit distance suggestion strategy with enhanced ranking. Generates suggestions by finding words with small edit distance, ranked by word frequency, keyboard proximity, and common typo patterns.
Multi-language support:
- Automatically selects keyboard layout based on language_code
- Loads frequency data from YAML files (Phase 1) or GitHub (Phase 2)
- Supports language-specific typo patterns
This design is more object-oriented than Spylls, which uses standalone functions for edit distance operations.
Follows the Open-Closed Principle: extend the strategy by adding YAML files, not by modifying this class.
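The edit distance computation itself is not shown on this page. As a rough illustration of what "small edit distance" means, here is a minimal sketch of the standard Levenshtein distance. The method name `levenshtein` is hypothetical, and the library's own `edit_distance` helper may differ (for example, it might also count transpositions).

```ruby
# Standard Levenshtein distance using two rolling rows.
# NOTE: illustrative only; this is not the library's edit_distance helper.
def levenshtein(a, b)
  return b.length if a.empty?
  return a.length if b.empty?

  # prev[j] holds the distance between the processed prefix of a and b[0, j]
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [
        prev[j] + 1,        # deletion
        curr[j - 1] + 1,    # insertion
        prev[j - 1] + cost  # substitution (or match)
      ].min
    end
    prev = curr
  end
  prev.last
end
```

With the default `max_distance` of 2 used in `#generate`, a candidate such as "hello" for the input "helo" (distance 1) would pass the distance filter, while "sitting" for "kitten" (distance 3) would not.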
Instance Attribute Summary
- #keyboard_layout ⇒ Object (readonly)
  Returns the value of attribute keyboard_layout.
- #language_code ⇒ Object (readonly)
  Returns the value of attribute language_code.
Attributes inherited from BaseStrategy
Instance Method Summary
- #adjacent_key_typo?(char1, char2) ⇒ Boolean
  Check if a substitution is a keyboard-adjacent typo.
- #adjacent_keys(key) ⇒ Array<String>
  Get adjacent keys for a given key.
- #frequency_bonus(word) ⇒ Integer
  Get frequency bonus for a word.
- #generate(context) ⇒ SuggestionSet
  Generate suggestions based on enhanced edit distance scoring.
- #handles?(context) ⇒ Boolean
  Check if this strategy should handle the context.
- #initialize(name: :edit_distance, language_code: 'en', keyboard_layout: nil, frequency_tiers: nil, **config) ⇒ EditDistanceStrategy (constructor)
  A new instance of EditDistanceStrategy.
- #keyboard ⇒ Keyboard::Layout
  Public method to get the keyboard layout currently in use.
- #keyboard_name ⇒ String
  Public method to get the keyboard name.
Methods inherited from BaseStrategy
#calculate_ngram_similarity, #create_suggestion, #create_suggestion_set, #enabled?, #generate_ngrams, #get_config, #has_config?, #max_results, #priority, #to_s
Constructor Details
#initialize(name: :edit_distance, language_code: 'en', keyboard_layout: nil, frequency_tiers: nil, **config) ⇒ EditDistanceStrategy
Returns a new instance of EditDistanceStrategy.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 36
    def initialize(name: :edit_distance, language_code: 'en', keyboard_layout: nil, frequency_tiers: nil, **config)
      super(name: name, **config)
      @language_code = language_code

      # Use OOP registry for keyboard layout lookup
      @keyboard_layout = resolve_keyboard_layout(keyboard_layout)

      # Use custom frequency tiers if provided, otherwise load from Kelly data
      if frequency_tiers
        @frequency_tiers = frequency_tiers
        @common_words = Set.new
      else
        # Load frequency data for the language from Kelly JSON
        # This sets @frequency_tiers internally
        load_frequency_data(language_code)
      end
    end
Instance Attribute Details
#keyboard_layout ⇒ Object (readonly)
Returns the value of attribute keyboard_layout.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 27
    def keyboard_layout
      @keyboard_layout
    end
#language_code ⇒ Object (readonly)
Returns the value of attribute language_code.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 27
    def language_code
      @language_code
    end
Instance Method Details
#adjacent_key_typo?(char1, char2) ⇒ Boolean
Check if a substitution is a keyboard-adjacent typo.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 74
    def adjacent_key_typo?(char1, char2)
      @keyboard_layout.adjacent_keys(char1).include?(char2)
    end
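To make the adjacency check concrete, the following is a minimal standalone sketch. The `QWERTY_NEIGHBORS` table is a hypothetical, partial stand-in for whatever data the real `Keyboard::Layout` object holds; only the lookup pattern mirrors the method above.

```ruby
# Hypothetical partial QWERTY adjacency table (assumption for illustration);
# a real layout would cover every key and may define neighbors differently.
QWERTY_NEIGHBORS = {
  'q' => %w[w a],
  'w' => %w[q e a s],
  'e' => %w[w r s d],
  'a' => %w[q w s z],
  's' => %w[w e a d z x],
  'd' => %w[e r s f x c]
}.freeze

# Returns the neighbors of a key, or an empty array for unknown keys.
def adjacent_keys(key)
  QWERTY_NEIGHBORS.fetch(key, [])
end

# True when char2 sits next to char1 in the table above.
def adjacent_key_typo?(char1, char2)
  adjacent_keys(char1).include?(char2)
end

puts adjacent_key_typo?('s', 'a')
puts adjacent_key_typo?('q', 'd')
```

Under this table, typing "a" where "s" was intended counts as an adjacent-key slip, while "q" to "d" does not.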
#adjacent_keys(key) ⇒ Array<String>
Get adjacent keys for a given key.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 82
    def adjacent_keys(key)
      @keyboard_layout.adjacent_keys(key)
    end
#frequency_bonus(word) ⇒ Integer
Get frequency bonus for a word.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 90
    def frequency_bonus(word)
      return 0 unless @frequency_tiers

      word_downcase = word.downcase

      # Top 50: 200 bonus
      return 200 if @frequency_tiers[:top_50]&.include?(word_downcase)

      # Top 200: 100 bonus
      return 100 if @frequency_tiers[:top_200]&.include?(word_downcase)

      # Top 1000: 50 bonus
      return 50 if @frequency_tiers[:top_1000]&.include?(word_downcase)

      # Not in common words: no bonus
      0
    end
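The tier lookup can be tried in isolation with made-up data. In the sketch below, the `TIERS` hash and its contents are assumptions chosen for illustration; the real tiers come from the language's frequency data. The hash shape (`:top_50`, `:top_200`, `:top_1000`) follows the keys used in the method above and is presumably what a custom `frequency_tiers:` argument to the constructor would look like.

```ruby
require 'set'

# Hypothetical tier data (assumption); real tiers are loaded per language.
TIERS = {
  top_50: Set['the', 'and', 'of'],
  top_200: Set['people', 'time'],
  top_1000: Set['window', 'garden']
}.freeze

# Standalone version of the tiered bonus: the first matching tier wins.
def frequency_bonus(word, tiers = TIERS)
  key = word.downcase
  return 200 if tiers[:top_50]&.include?(key)
  return 100 if tiers[:top_200]&.include?(key)
  return 50 if tiers[:top_1000]&.include?(key)

  0 # not a common word: no bonus
end

puts frequency_bonus('The')
puts frequency_bonus('garden')
```

Because of the safe-navigation operator, a missing tier is simply skipped, so a partial hash still works.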
#generate(context) ⇒ SuggestionSet
Generate suggestions based on enhanced edit distance scoring.
Scoring factors:
- Edit distance (primary factor)
- Word frequency (common words rank higher)
- Keyboard proximity (adjacent key typos rank higher)
- Common typo patterns (missing double letters, etc.)
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 118
    def generate(context)
      word = context.word
      max_dist = get_config(:max_distance, 2)
      min_confidence = get_config(:min_confidence, 0.75) # Higher threshold for quality
      min_similarity = get_config(:min_jaro_similarity, 0.70) # Minimum Jaro-Winkler similarity (0.0-1.0)
      min_results = get_config(:min_results, 3) # Always return at least 3 suggestions if available

      # Get all dictionary words
      all_words = dictionary_words(context)

      # Calculate enhanced scores for all candidates
      candidates = []
      all_words.each do |dict_word|
        next if dict_word == word

        dist = edit_distance(word, dict_word)
        next if dist > max_dist || dist <= 0

        # Calculate enhanced score (lower is better)
        score = calculate_enhanced_score(word, dict_word, dist)
        candidates << [dict_word, dist, score]
      end

      # Sort by enhanced score (lower is better)
      sorted_candidates = candidates.sort_by { |_, _, score| score }

      # Calculate confidence scores with threshold filtering
      if sorted_candidates.empty?
        return SuggestionSet.empty
      end

      max_score = sorted_candidates.map { |_, _, s| s.to_f }.max
      min_score = sorted_candidates.map { |_, _, s| s.to_f }.min
      score_range = (max_score - min_score).abs

      # Create suggestions with confidence-based filtering
      suggestions = []
      sorted_candidates.each do |dict_word, dist, score|
        # Normalize score to confidence (0.0 to 1.0)
        # Lower score = higher confidence
        if score_range > 0
          normalized = (score.to_f - min_score) / score_range # 0 to 1
          confidence = 1.0 - normalized # Invert: lower score = higher confidence
        else
          confidence = 1.0
        end

        # Calculate Jaro-Winkler similarity for additional filtering
        jaro_similarity = calculate_ngram_similarity(word, dict_word)

        # Skip low-confidence or low-similarity suggestions (unless we need more for min_results)
        if confidence < min_confidence || jaro_similarity < min_similarity
          next if suggestions.size >= min_results
        end

        suggestions << Suggestion.new(
          word: dict_word,
          distance: dist,
          confidence: confidence,
          source: @name,
          original_length: word.length,
          ngram_score: jaro_similarity, # Now stores Jaro-Winkler similarity (0.0-1.0)
          enhanced_score: score
        )

        # Stop when we have enough high-quality suggestions
        break if suggestions.size >= max_results
      end

      SuggestionSet.new(suggestions, max_size: max_results)
    end
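The confidence step in the listing above is a min-max normalization followed by a threshold filter with a minimum-results escape hatch. The sketch below isolates those two pieces; the helper names `confidences` and `keep?` are hypothetical and exist only for this illustration, with defaults copied from the `get_config` calls in the listing.

```ruby
# Map raw scores (lower is better) to confidences in 0.0..1.0 (higher is better).
# When all scores are equal, every candidate gets confidence 1.0.
def confidences(scores)
  return [] if scores.empty?

  min, max = scores.minmax.map(&:to_f)
  range = max - min
  scores.map do |s|
    range.positive? ? 1.0 - (s - min) / range : 1.0
  end
end

# Mirror of the filter: a weak candidate is still kept while fewer than
# min_results suggestions have been collected.
def keep?(confidence, similarity, kept_so_far,
          min_confidence: 0.75, min_similarity: 0.70, min_results: 3)
  weak = confidence < min_confidence || similarity < min_similarity
  !(weak && kept_so_far >= min_results)
end

p confidences([10, 20, 30])
```

One consequence of min-max normalization is that, whenever scores differ, the worst-scoring candidate always receives confidence 0.0, so it can only appear in the result as filler needed to reach `min_results`.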
#handles?(context) ⇒ Boolean
Check if this strategy should handle the context.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 194
    def handles?(context)
      return false unless enabled?

      # Only handle if the word is not in the dictionary
      !dictionary_lookup(context, context.word)
    end
#keyboard ⇒ Keyboard::Layout
Public method to get the keyboard layout currently in use.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 58
    def keyboard
      @keyboard_layout
    end
#keyboard_name ⇒ String
Public method to get the keyboard name.
    # File 'lib/kotoshu/suggestions/strategies/edit_distance_strategy.rb', line 65
    def keyboard_name
      @keyboard_layout.name
    end