Class: Kotoshu::Readers::LatinScriptConditionChecker

Inherits:
ConditionChecker
Defined in:
lib/kotoshu/readers/condition_checker.rb

Overview

Condition checker for Latin-script dictionaries.

Handles Hunspell condition syntax for Latin scripts:

  • ‘.’ matches any stem

  • ‘y’ or ‘abc’ (single char or string) matches stems ending with that string

  • ‘[abc]’ matches stems ending with ‘a’, ‘b’, or ‘c’

  • ‘[^y]’ matches stems NOT ending with ‘y’

  • ‘[0-9]’ matches stems ending with a digit

  • ‘[aeiou]y’ matches stems ending with vowel + ‘y’ (multi-char pattern)

  • ‘[^aeiou]y’ matches stems ending with consonant + ‘y’ (multi-char pattern)

This is NOT suitable for RTL scripts or CJK languages.
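
The condition forms listed above can be illustrated with plain Ruby, independent of this class. The following is a minimal sketch: the `CHECKS` table and the sample stems are hypothetical and exist only for illustration. Each entry pairs a Hunspell condition with an equivalent end-of-string check.

```ruby
# Hypothetical lookup table (not part of Kotoshu): each Hunspell condition
# from the list above paired with an equivalent plain-Ruby check.
CHECKS = {
  '.'         => ->(_stem) { true },
  'y'         => ->(stem) { stem.end_with?('y') },
  '[abc]'     => ->(stem) { stem.match?(/[abc]\z/) },
  '[^y]'      => ->(stem) { stem.match?(/[^y]\z/) },
  '[0-9]'     => ->(stem) { stem.match?(/[0-9]\z/) },
  '[aeiou]y'  => ->(stem) { stem.match?(/[aeiou]y\z/) },
  '[^aeiou]y' => ->(stem) { stem.match?(/[^aeiou]y\z/) }
}.freeze

# Print, for each sample stem, every condition it satisfies.
%w[play try walk room101].each do |stem|
  matching = CHECKS.select { |_condition, check| check.call(stem) }.keys
  puts "#{stem}: #{matching.join(' ')}"
end
```

Here 'play' satisfies ‘[aeiou]y’ while 'try' satisfies ‘[^aeiou]y’, which is the distinction English affix rules for plural and past-tense forms typically rely on.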

Constructor Details

#initialize(condition:, type:) ⇒ LatinScriptConditionChecker

Returns a new instance of LatinScriptConditionChecker.



# File 'lib/kotoshu/readers/condition_checker.rb', line 99

def initialize(condition:, type:)
  @condition = condition
  @type = type
  @regex_pattern = compile_regex if type == :regex
end

Instance Attribute Details

#condition ⇒ Object (readonly)

Returns the value of attribute condition.



# File 'lib/kotoshu/readers/condition_checker.rb', line 65

def condition
  @condition
end

#pattern ⇒ Object (readonly)

Returns the value of attribute pattern.



# File 'lib/kotoshu/readers/condition_checker.rb', line 65

def pattern
  @pattern
end

#type ⇒ Object (readonly)

Returns the value of attribute type.



# File 'lib/kotoshu/readers/condition_checker.rb', line 65

def type
  @type
end

Class Method Details

.compile(condition) ⇒ LatinScriptConditionChecker

Compile a condition string.

Parameters:

  • condition (String)

    The condition string (e.g., ‘[^y]’, ‘[abc]’, ‘y’, ‘.’, ‘[aeiou]y’)

Returns:

  • (LatinScriptConditionChecker)



# File 'lib/kotoshu/readers/condition_checker.rb', line 71

def self.compile(condition)
  return new(condition: nil, type: :any) if condition == '.'

  # Check if it's a bracket expression: [abc] or [^y] or [aeiou]y or [^aeiou]y
  # Note: [aeiou]y means "ends with vowel + y", not "ends with one of [aeiou]y"
  if condition =~ /^\[([^\]]+)\]/
    content = $1
    negated = content.start_with?('^')

    # Check if this is a multi-char pattern like [aeiou]y or [^aeiou]y
    # These should be used as regex patterns directly
    if content.length > 1
      # For multi-char patterns, use the whole condition as a regex
      new(condition: condition, type: :regex)
    elsif negated
      # Single character negation: [^x]
      chars = content[1..]
      new(condition: chars, type: :not_ends_with)
    else
      # Single character set: [x]
      new(condition: content, type: :ends_with_any)
    end
  else
    # Bare character or string - matches stems ENDING with this string
    new(condition: condition, type: :ends_with)
  end
end
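
To make the branch order in the listing above easier to follow, here is a minimal sketch that mirrors it but returns only the type symbol. The `classify` helper is hypothetical and not part of Kotoshu.

```ruby
# Hypothetical helper mirroring the branch order of .compile in the listing;
# it returns the type symbol instead of building a checker instance.
def classify(condition)
  return :any if condition == '.'

  if condition =~ /^\[([^\]]+)\]/
    content = Regexp.last_match(1) # text between the brackets
    if content.length > 1
      :regex
    elsif content.start_with?('^')
      :not_ends_with
    else
      :ends_with_any
    end
  else
    :ends_with
  end
end

['.', 'y', 'ied', '[x]', '[abc]', '[^y]', '[aeiou]y'].each do |condition|
  puts "#{condition.ljust(10)} #{classify(condition)}"
end
```

Because the length test applies to the text between the brackets, ‘[abc]’ and ‘[^y]’ take the :regex path as well as ‘[aeiou]y’; only a one-character bracket such as ‘[x]’ reaches :ends_with_any.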

Instance Method Details

#compile_regex ⇒ Regexp?

Compile a regex pattern for multi-character conditions.

Returns:

  • (Regexp, nil)

    Compiled regex or nil



# File 'lib/kotoshu/readers/condition_checker.rb', line 108

def compile_regex
  return nil unless @condition

  # Convert Hunspell condition to Ruby regex
  # [^aeiou]y -> /[^aeiou]y$/
  # [aeiou]y -> /[aeiou]y$/
  Regexp.new(@condition + '$')
end
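
The conversion in the listing above amounts to appending an end anchor to the condition string. The sketch below shows the resulting pattern in isolation, using only Ruby's standard `Regexp`. The comparison with `\z` is an added observation and does not describe what the library does.

```ruby
condition = '[^aeiou]y'
pattern = Regexp.new(condition + '$') # same construction as in the listing

puts pattern.match?('try')   # consonant + y
puts pattern.match?('play')  # vowel + y

# In Ruby, `$` anchors at end of line, so a stem with a trailing newline
# still matches; `\z` anchors at the very end of the string and is stricter.
puts Regexp.new(condition + '$').match?("try\n")
puts Regexp.new(condition + '\z').match?("try\n")
```

This prints true, false, true, false. If stems can ever arrive with a trailing newline, the difference between the two anchors is worth keeping in mind.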

#matches?(stem) ⇒ Boolean

Check if the stem matches the condition.

Parameters:

  • stem (String)

    The stem to check

Returns:

  • (Boolean)

    True if the stem matches



# File 'lib/kotoshu/readers/condition_checker.rb', line 121

def matches?(stem)
  case @type
  when :any
    true
  when :ends_with
    stem.end_with?(@condition)
  when :ends_with_any
    @condition.chars.any? { |char| stem.end_with?(char) }
  when :not_ends_with
    # Check that stem doesn't end with ANY of the characters in the condition
    @condition.chars.none? { |char| stem.end_with?(char) }
  when :regex
    @regex_pattern.match?(stem)
  when :equals
    stem == @condition
  else
    false
  end
end
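
The non-regex branches of the listing above can be exercised with a small stand-in. `SketchChecker` is hypothetical: a Struct with the same two fields and the same case dispatch, written only so the example runs without the library.

```ruby
# Stand-in for illustration only (not the Kotoshu class): same fields and
# the non-regex branches of #matches? copied from the listing.
SketchChecker = Struct.new(:condition, :type) do
  def matches?(stem)
    case type
    when :any           then true
    when :ends_with     then stem.end_with?(condition)
    when :ends_with_any then condition.chars.any? { |char| stem.end_with?(char) }
    when :not_ends_with then condition.chars.none? { |char| stem.end_with?(char) }
    when :equals        then stem == condition
    else false
    end
  end
end

puts SketchChecker.new('ies', :ends_with).matches?('ponies')   # suffix match
puts SketchChecker.new('sxz', :ends_with_any).matches?('box')  # ends with s, x, or z
puts SketchChecker.new('sxz', :not_ends_with).matches?('box')  # negation of the previous check
puts SketchChecker.new('box', :equals).matches?('box')         # whole-stem equality
puts SketchChecker.new('box', :unknown).matches?('box')        # unrecognised type yields false
```

This prints true, true, false, true, false. Note that the `.compile` listing shown earlier never produces the :equals type, so that branch is reachable only when a checker is constructed directly.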