Class: Kotoshu::Dictionary::PlainText
- Defined in: lib/kotoshu/dictionary/plain_text.rb
Overview
Plain text dictionary backend.
This dictionary reads from simple plain text word lists, with support for comments and various formatting options.
File format:
- One word per line
- Lines starting with # are comments
- Empty lines are ignored
- Supports multi-word phrases (e.g., “New York”)
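The format rules above can be sketched as a small standalone parser. This is an illustration only: `parse_word_list` is a hypothetical helper, not a Kotoshu method, and it assumes that surrounding whitespace on a line is not significant.

```ruby
# Hypothetical helper, not part of Kotoshu: parses word-list text
# according to the four format rules listed above.
def parse_word_list(text)
  text.each_line
      .map(&:strip)                             # trim whitespace and the newline
      .reject(&:empty?)                         # empty lines are ignored
      .reject { |line| line.start_with?("#") }  # comment lines are ignored
end

sample = <<~LIST
  # Cities
  New York

  Tokyo
LIST

p parse_word_list(sample)  # prints ["New York", "Tokyo"]
```

Because each line is kept whole, a multi-word phrase such as "New York" stays a single entry.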
Instance Attribute Summary
- #case_sensitive ⇒ Boolean (readonly)
  Whether lookups are case-sensitive.
- #path ⇒ String (readonly)
  The path to the dictionary file (or nil if created from an array).
- #word_pattern ⇒ Regexp? (readonly)
  Pattern for word filtering.
Attributes inherited from Base
#language_code, #locale, #metadata
Class Method Summary
- .from_string(text, language_code:, locale: nil, case_sensitive: false) ⇒ PlainText
  Create a dictionary from a string.
- .from_words(words, language_code:, locale: nil, case_sensitive: false) ⇒ PlainText
  Create a dictionary from an array of words.
Instance Method Summary
- #add_word(word, flags: []) ⇒ Boolean
  Add a word to the dictionary.
- #initialize(path, language_code:, locale: nil, case_sensitive: false, word_pattern: nil, metadata: {}) ⇒ PlainText (constructor)
  Create a new PlainText dictionary.
- #lookup(word) ⇒ Boolean
  Check if a word exists in the dictionary.
- #remove_word(word) ⇒ Boolean
  Remove a word from the dictionary.
- #suggest(word, max_suggestions: 10) ⇒ Array<String>
  Generate spelling suggestions.
- #words ⇒ Array<String>
  Get all words in the dictionary.
Methods inherited from Base
#each_word, #empty?, load, #lookup?, register_type, registry, #size, #to_s, #type, #words_matching, #words_with_prefix
Constructor Details
#initialize(path, language_code:, locale: nil, case_sensitive: false, word_pattern: nil, metadata: {}) ⇒ PlainText
Create a new PlainText dictionary.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 47
def initialize(path, language_code:, locale: nil, case_sensitive: false,
               word_pattern: nil, metadata: {})
  super(language_code, locale: locale, metadata: metadata)

  @original_path = path
  @path = resolve_path(path)
  @case_sensitive = case_sensitive
  @word_pattern = word_pattern
  @words = load_words(@path)
  @word_set = build_word_set

  # Register this dictionary type
  self.class.register_type(:plain_text) unless Dictionary.registry.key?(:plain_text)
end
Instance Attribute Details
#case_sensitive ⇒ Boolean (readonly)
Returns whether lookups are case-sensitive.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 34
def case_sensitive
  @case_sensitive
end
#path ⇒ String (readonly)
Returns the path to the dictionary file (or nil if created from an array).
# File 'lib/kotoshu/dictionary/plain_text.rb', line 31
def path
  @path
end
#word_pattern ⇒ Regexp? (readonly)
Returns the pattern used for word filtering.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 37
def word_pattern
  @word_pattern
end
Class Method Details
.from_string(text, language_code:, locale: nil, case_sensitive: false) ⇒ PlainText
Create a dictionary from a string.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 179
def self.from_string(text, language_code:, locale: nil, case_sensitive: false)
  words = text.split("\n").reject { |l| l.empty? || l.strip.start_with?("#") }
              .map(&:strip)
  from_words(words, language_code: language_code, locale: locale,
                    case_sensitive: case_sensitive)
end
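One detail of the filter in the listing above may be worth checking in isolation: a line containing only whitespace is not `empty?` and is not a comment, so it appears to survive the `reject` and become an empty string after `strip`. The sketch below copies that expression into a standalone method and compares it with a variant that strips first. Both method names are hypothetical and exist only for this illustration.

```ruby
# Hypothetical wrapper around the same expression used in from_string.
def split_like_from_string(text)
  text.split("\n")
      .reject { |l| l.empty? || l.strip.start_with?("#") }
      .map(&:strip)
end

# Hypothetical variant: strip each line first, then filter.
def split_stripping_first(text)
  text.split("\n")
      .map(&:strip)
      .reject { |l| l.empty? || l.start_with?("#") }
end

text = "apple\n   \n# note\n banana \n"

p split_like_from_string(text)  # prints ["apple", "", "banana"]
p split_stripping_first(text)   # prints ["apple", "banana"]
```

Whether the empty entry matters in practice depends on how the resulting word list is used. `#lookup` and `#add_word` both return false for an empty string, so the entry cannot be matched, but it would still appear in `#words`.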
.from_words(words, language_code:, locale: nil, case_sensitive: false) ⇒ PlainText
Create a dictionary from an array of words.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 150
def self.from_words(words, language_code:, locale: nil, case_sensitive: false)
  dict = allocate
  dict.instance_variable_set(:@language_code, language_code.dup.freeze)
  dict.instance_variable_set(:@locale, locale&.dup&.freeze)
  dict.instance_variable_set(:@path, nil)
  dict.instance_variable_set(:@case_sensitive, case_sensitive)
  dict.instance_variable_set(:@word_pattern, nil)
  dict.instance_variable_set(:@words, words.dup.map { |w| case_sensitive ? w : w.downcase })
  dict.instance_variable_set(:@word_set, dict.instance_variable_get(:@words).each_with_index.to_h)
  dict.instance_variable_set(:@metadata, {}.freeze)

  # Register this dictionary type (unless already registered)
  register_type(:plain_text) unless Dictionary.registry.key?(:plain_text)

  dict
end
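The two data structures built in the listing above are a normalized word array and a hash from each word to its array index. A minimal standalone sketch of that construction follows. `build_index` is a hypothetical name, and the sketch only mirrors the `map` and `each_with_index.to_h` steps, not the rest of the method.

```ruby
# Hypothetical helper mirroring how @words and @word_set are built above.
def build_index(words, case_sensitive: false)
  normalized = words.map { |w| case_sensitive ? w : w.downcase }
  [normalized, normalized.each_with_index.to_h]
end

list, index = build_index(%w[Apple banana APPLE])

p list                 # prints ["apple", "banana", "apple"]
p index["apple"]       # prints 2 (the last occurrence wins)
p index.key?("apple")  # prints true
```

With case-insensitive input, words that differ only in case collapse to one hash key while the array keeps both entries, so the array and the hash can differ in size.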
Instance Method Details
#add_word(word, flags: []) ⇒ Boolean
Add a word to the dictionary.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 105
def add_word(word, flags: [])
  return false if word.nil? || word.empty?

  lookup_word = @case_sensitive ? word : word.downcase
  return false if @word_set.key?(lookup_word)

  @words << lookup_word
  @word_set[lookup_word] = @words.length - 1
  true
end
#lookup(word) ⇒ Boolean
Check if a word exists in the dictionary.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 66
def lookup(word)
  return false if word.nil? || word.empty?

  lookup_word = @case_sensitive ? word : word.downcase
  @word_set.key?(lookup_word)
end
#remove_word(word) ⇒ Boolean
Remove a word from the dictionary.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 121
def remove_word(word)
  return false if word.nil? || word.empty?

  lookup_word = @case_sensitive ? word : word.downcase
  return false unless @word_set.key?(lookup_word)

  index = @word_set.delete(lookup_word)
  @words.delete_at(index)
  true
end
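The listing above deletes by stored index. Since `delete_at` shifts every later element down by one, the indices kept for the remaining words appear to go stale after the first removal. The standalone sketch below models just the array-plus-hash pair to show the effect, alongside one possible remedy (rebuilding the hash). Both method names are hypothetical and neither is Kotoshu code.

```ruby
# Hypothetical model of the two removal steps shown in the listing.
def remove_like_listing(words, index, word)
  return false unless index.key?(word)

  words.delete_at(index.delete(word))
  true
end

# Hypothetical alternative: rebuild the index after each removal.
def remove_and_reindex(words, index, word)
  return false unless index.key?(word)

  words.delete_at(index.delete(word))
  index.replace(words.each_with_index.to_h)  # refresh every stored position
  true
end

words = %w[alpha beta gamma]
index = words.each_with_index.to_h
remove_like_listing(words, index, "alpha")
remove_like_listing(words, index, "beta")   # stale index 1 now points at "gamma"
p words  # prints ["beta"]

words = %w[alpha beta gamma]
index = words.each_with_index.to_h
remove_and_reindex(words, index, "alpha")
remove_and_reindex(words, index, "beta")
p words  # prints ["gamma"]
```

Rebuilding the hash costs time proportional to the dictionary size on every removal. `#lookup` only calls `key?`, so the stale positions would affect later removals and the array returned by `#words`, not membership checks.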
#suggest(word, max_suggestions: 10) ⇒ Array<String>
Generate spelling suggestions.
Uses edit distance to find similar words in the dictionary. Only words that start with the input's prefix (the input minus its last character, or the whole input when it has three or fewer characters) are considered, and only those at edit distance 1 or 2 are returned, nearest first.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 80
def suggest(word, max_suggestions: 10)
  return [] if word.nil? || word.empty?

  lookup_word = @case_sensitive ? word : word.downcase

  # Find words with same prefix
  prefix_len = [lookup_word.length - 1, 3].max
  prefix = lookup_word[0...prefix_len]

  candidates = @words.select { |w| w.start_with?(prefix) }

  # Calculate edit distances
  candidates.map do |dict_word|
    dist = edit_distance(lookup_word, dict_word)
    [dict_word, dist]
  end.select { |_, dist| dist.positive? && dist <= 2 }
    .sort_by { |_, dist| dist }
    .first(max_suggestions)
    .map(&:first)
end
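The prefix filter and distance cutoff in the listing above can be tried outside the library with the sketch below. The `edit_distance` helper is not shown on this page, so the sketch substitutes a plain Levenshtein distance as an assumption. `levenshtein` and `suggest_sketch` are hypothetical names, and the input is assumed to be already case-normalized.

```ruby
# Assumed stand-in for edit_distance: classic Levenshtein distance,
# computed row by row.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Hypothetical re-creation of the pipeline in the listing.
def suggest_sketch(words, word, max_suggestions: 10)
  prefix_len = [word.length - 1, 3].max
  prefix = word[0...prefix_len]

  words.select { |w| w.start_with?(prefix) }       # prefix filter
       .map { |w| [w, levenshtein(word, w)] }      # pair each word with its distance
       .select { |_, d| d.positive? && d <= 2 }    # drop exact matches and far words
       .sort_by { |_, d| d }                       # nearest first
       .first(max_suggestions)
       .map(&:first)
end

dictionary = %w[hello helps world]

p suggest_sketch(dictionary, "helo")   # prints ["hello", "helps"]
p suggest_sketch(dictionary, "wrold")  # prints []
```

The second call returns nothing because the prefix taken from "wrold" is "wrol", which "world" does not start with. Under this scheme a typo among the early characters filters out the intended word before any distance is computed.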
#words ⇒ Array<String>
Get all words in the dictionary.
# File 'lib/kotoshu/dictionary/plain_text.rb', line 136
def words
  @words.dup
end