Class: Kotoshu::Readers::LatinScriptConditionChecker
Inherits: ConditionChecker
Inheritance hierarchy:
- Object
- ConditionChecker
- Kotoshu::Readers::LatinScriptConditionChecker
Defined in: lib/kotoshu/readers/condition_checker.rb
Overview
Condition checker for Latin-script dictionaries.
Handles Hunspell condition syntax for Latin scripts:
- ‘.’ matches any stem
- ‘y’ or ‘abc’ (single char or string) matches stems ending with that string
- ‘[abc]’ matches stems ending with ‘a’, ‘b’, or ‘c’
- ‘[^y]’ matches stems NOT ending with ‘y’
- ‘[0-9]’ matches stems ending with a digit
- ‘[aeiou]y’ matches stems ending with vowel + ‘y’ (multi-char pattern)
- ‘[^aeiou]y’ matches stems ending with consonant + ‘y’ (multi-char pattern)
This is NOT suitable for RTL scripts or CJK languages.
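The condition forms listed above can be tried out with a few lines of plain Ruby. The sketch below is not the class's own implementation: `stem_matches?` is a hypothetical helper that treats every condition except the bare ‘.’ wildcard as a regular expression anchored at the end of the stem. It assumes the condition contains no characters that would need escaping in a Ruby regex.

```ruby
# Hypothetical helper, not part of Kotoshu: every condition other than the
# bare '.' wildcard is treated as a regex anchored at the end of the stem.
def stem_matches?(stem, condition)
  return true if condition == '.'

  Regexp.new("(?:#{condition})\\z").match?(stem)
end

[
  ['.',         'anything'],
  ['y',         'happy'],
  ['[abc]',     'sofa'],
  ['[^y]',      'boy'],
  ['[0-9]',     'mp3'],
  ['[aeiou]y',  'play'],
  ['[^aeiou]y', 'play']
].each do |condition, stem|
  puts "#{condition.ljust(10)} #{stem.ljust(9)} #{stem_matches?(stem, condition)}"
end
```

With these sample stems, only ‘[^y]’ against ‘boy’ and ‘[^aeiou]y’ against ‘play’ report false.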
Instance Attribute Summary
- #condition ⇒ Object (readonly)
  Returns the value of attribute condition.
- #pattern ⇒ Object (readonly)
  Returns the value of attribute pattern.
- #type ⇒ Object (readonly)
  Returns the value of attribute type.
Class Method Summary
- .compile(condition) ⇒ LatinScriptConditionChecker
  Compile a condition string.
Instance Method Summary
- #compile_regex ⇒ Regexp?
  Compile a regex pattern for multi-character conditions.
- #initialize(condition:, type:) ⇒ LatinScriptConditionChecker (constructor)
  A new instance of LatinScriptConditionChecker.
- #matches?(stem) ⇒ Boolean
  Check if the stem matches the condition.
Constructor Details
#initialize(condition:, type:) ⇒ LatinScriptConditionChecker
Returns a new instance of LatinScriptConditionChecker.
    # File 'lib/kotoshu/readers/condition_checker.rb', line 99
    def initialize(condition:, type:)
      @condition = condition
      @type = type
      @regex_pattern = compile_regex if type == :regex
    end
Instance Attribute Details
#condition ⇒ Object (readonly)
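Returns the condition value passed to the constructor. As built by .compile, this is nil for ‘.’, the bracket contents for single-character sets, and the full condition string otherwise.

    # File 'lib/kotoshu/readers/condition_checker.rb', line 65
    def condition
      @condition
    end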
#pattern ⇒ Object (readonly)
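Returns the value of attribute pattern. The constructor shown on this page assigns @condition, @type, and @regex_pattern but not @pattern, so this reader appears to return nil unless @pattern is set elsewhere.

    # File 'lib/kotoshu/readers/condition_checker.rb', line 65
    def pattern
      @pattern
    end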
#type ⇒ Object (readonly)
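Returns the match type as a Symbol. #matches? handles :any, :ends_with, :ends_with_any, :not_ends_with, :regex, and :equals; any other value never matches.

    # File 'lib/kotoshu/readers/condition_checker.rb', line 65
    def type
      @type
    end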
Class Method Details
.compile(condition) ⇒ LatinScriptConditionChecker
Compile a condition string.
    # File 'lib/kotoshu/readers/condition_checker.rb', line 71
    def self.compile(condition)
      return new(condition: nil, type: :any) if condition == '.'

      # Check if it's a bracket expression: [abc] or [^y] or [aeiou]y or [^aeiou]y
      # Note: [aeiou]y means "ends with vowel + y", not "ends with one of [aeiou]y"
      if condition =~ /^\[([^\]]+)\]/
        content = $1
        negated = content.start_with?('^')

        # Check if this is a multi-char pattern like [aeiou]y or [^aeiou]y
        # These should be used as regex patterns directly
        if content.length > 1
          # For multi-char patterns, use the whole condition as a regex
          new(condition: condition, type: :regex)
        elsif negated
          # Single character negation: [^x]
          chars = content[1..]
          new(condition: chars, type: :not_ends_with)
        else
          # Single character set: [x]
          new(condition: content, type: :ends_with_any)
        end
      else
        # Bare character or string - matches stems ENDING with this string
        new(condition: condition, type: :ends_with)
      end
    end
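To see which type each condition form ends up with, the branching in .compile can be mirrored in a standalone function. This is a sketch based only on the listing above: `classify_condition` is a hypothetical name, and it returns just the type symbol rather than building a checker.

```ruby
# Hypothetical helper, not part of Kotoshu: mirrors the branching in
# .compile and returns only the type symbol that branch would pick.
def classify_condition(condition)
  return :any if condition == '.'

  match = condition.match(/^\[([^\]]+)\]/)
  return :ends_with unless match

  content = match[1] # text between the brackets, including a leading '^'
  if content.length > 1
    :regex
  elsif content.start_with?('^')
    :not_ends_with
  else
    :ends_with_any
  end
end

['.', 'y', 'abc', '[y]', '[abc]', '[^y]', '[aeiou]y'].each do |condition|
  puts "#{condition.ljust(10)} => #{classify_condition(condition)}"
end
```

Because the length check applies to the text inside the brackets, this reading sends ‘[abc]’ and ‘[^y]’ down the :regex branch as well; only a one-character set such as ‘[y]’ reaches :ends_with_any.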
Instance Method Details
#compile_regex ⇒ Regexp?
Compile a regex pattern for multi-character conditions.
    # File 'lib/kotoshu/readers/condition_checker.rb', line 108
    def compile_regex
      return nil unless @condition

      # Convert Hunspell condition to Ruby regex
      # [^aeiou]y -> /[^aeiou]y$/
      # [aeiou]y -> /[aeiou]y$/
      Regexp.new(@condition + '$')
    end
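The effect of appending ‘$’ can be checked in isolation. The first constant below is built the same way as in compile_regex; the second, anchored with `\z`, is an assumed alternative shown only for comparison and does not come from the source. In Ruby, `$` matches at the end of any line, while `\z` matches only at the end of the string, so the two differ when a stem contains a newline.

```ruby
# Same construction as compile_regex: append '$' to the raw condition.
LINE_ANCHORED = Regexp.new('[^aeiou]y' + '$')
# Assumed alternative (not from the source): anchor at the end of the string.
STRING_ANCHORED = Regexp.new('[^aeiou]y' + '\z')

puts LINE_ANCHORED.match?('happy')          # true: 'p' is not a vowel
puts LINE_ANCHORED.match?('play')           # false: 'a' is a vowel
puts LINE_ANCHORED.match?("happy\nrest")    # true: '$' also matches before a newline
puts STRING_ANCHORED.match?("happy\nrest")  # false: the string ends with 'rest'
```

For single-line stems the two anchors behave the same.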
#matches?(stem) ⇒ Boolean
Check if the stem matches the condition.
    # File 'lib/kotoshu/readers/condition_checker.rb', line 121
    def matches?(stem)
      case @type
      when :any
        true
      when :ends_with
        stem.end_with?(@condition)
      when :ends_with_any
        @condition.chars.any? { |char| stem.end_with?(char) }
      when :not_ends_with
        # Check that stem doesn't end with ANY of the characters in the condition
        @condition.chars.none? { |char| stem.end_with?(char) }
      when :regex
        @regex_pattern.match?(stem)
      when :equals
        stem == @condition
      else
        false
      end
    end
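The two character-set branches of #matches? are easiest to follow on concrete stems. The sketch below copies their expressions into standalone methods; the method names are hypothetical and the sample stems are made up for illustration.

```ruby
# Standalone copies of two #matches? branches (hypothetical method names).
def ends_with_any?(stem, chars)
  chars.chars.any? { |char| stem.end_with?(char) }
end

def not_ends_with?(stem, chars)
  chars.chars.none? { |char| stem.end_with?(char) }
end

puts ends_with_any?('sofa', 'abc')   # true: 'sofa' ends with 'a'
puts ends_with_any?('sofas', 'abc')  # false: 's' is not in the set
puts not_ends_with?('boy', 'y')      # false: 'boy' ends with 'y'
puts not_ends_with?('box', 'y')      # true
puts not_ends_with?('', 'y')         # true: an empty stem ends with nothing
```

The last call shows one consequence of the `none?` form: an empty stem satisfies a negated condition.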