Class: Kotoshu::Readers::Word

Inherits:
Struct
  • Object
Defined in:
lib/kotoshu/readers/dic_reader.rb

Overview

A word entry from the dictionary file: a stem plus its set of morphological flags.

Instance Attribute Summary

  • #flags ⇒ Set<String>

    Morphological flags

  • #stem ⇒ String

    The word stem

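Class Method Summary

  • .from_line(line, context = {}) ⇒ Word

    Create a word from a dictionary line.

  • .parse_flags(string, flag_format, flag_synonyms = {}) ⇒ Set<String>

    Parse flags from a string.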

Instance Attribute Details

#flags ⇒ Set<String>

Morphological flags

Returns:

  • (Set<String>)

    the current value of flags



# File 'lib/kotoshu/readers/dic_reader.rb', line 11

def flags
  @flags
end

#stem ⇒ String

The word stem

Returns:

  • (String)

    the current value of stem



# File 'lib/kotoshu/readers/dic_reader.rb', line 11

def stem
  @stem
end

Class Method Details

.from_line(line, context = {}) ⇒ Word

Create a word from a dictionary line.

Parameters:

  • line (String)

    The dictionary line

  • context (Hash) (defaults to: {})

    The reading context; its :flag_format and :flag_synonyms entries are used for flag parsing

Returns:

  • (Word)

    The parsed word



# File 'lib/kotoshu/readers/dic_reader.rb', line 17

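def self.from_line(line, context = {})
  # Split "stem/FLAGS" on the slash; parts[1] is nil when the line has no flags.
  parts = line.split('/')
  stem = parts[0].strip
  flags_str = parts[1]

  flags = if flags_str && context[:flag_format]
            # A flag format was declared: delegate to parse_flags.
            parse_flags(flags_str, context[:flag_format], context[:flag_synonyms])
          elsif flags_str
            # No format given: treat every character as one flag.
            flags_str.chars.to_set
          else
            Set.new
          end

  new(stem:, flags:)
end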
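
As a rough illustration of the fallback path above (no :flag_format in the context), the following sketch uses a hypothetical stand-in, DemoWord, so that it runs without the gem installed. The stand-in, the sample lines, and the explicit keyword_init: true are assumptions made for the example and are not part of Kotoshu.

```ruby
require 'set'

# Hypothetical stand-in for Kotoshu::Readers::Word. It mirrors only the
# fallback branch of from_line (every character of the flag field is a flag).
DemoWord = Struct.new(:stem, :flags, keyword_init: true) do
  def self.from_line(line)
    stem_part, flags_str = line.split('/')
    flags = flags_str ? flags_str.chars.to_set : Set.new
    new(stem: stem_part.strip, flags: flags)
  end
end

with_flags = DemoWord.from_line('walk/DGS')
without_flags = DemoWord.from_line('hello')

puts with_flags.stem                # walk
puts with_flags.flags.to_a.inspect  # ["D", "G", "S"]
puts without_flags.flags.empty?     # true
```

A line without a slash yields an empty Set rather than nil, so callers can test membership on flags without a nil check.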

.parse_flags(string, flag_format, flag_synonyms = {}) ⇒ Set<String>

Parse flags from a string.

Parameters:

  • string (String)

    Flag string

  • flag_format (String)

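    Flag format: ‘short’ (one character per flag), ‘long’ (two characters per flag), ‘num’ (one run of digits per flag), or ‘UTF-8’ (one character per flag); any other value is handled like ‘short’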

  • flag_synonyms (Hash) (defaults to: {})

    Flag synonyms map, consulted only when the flag string consists entirely of digits; the string is the lookup key

Returns:

  • (Set<String>)

    Parsed flags



# File 'lib/kotoshu/readers/dic_reader.rb', line 39

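def self.parse_flags(string, flag_format, flag_synonyms = {})
  return Set.new if string.nil? || string.empty?

  # Check flag synonyms: an all-digit string is looked up as an alias.
  # The default {} is truthy, so an unknown all-digit string yields an empty Set.
  if flag_synonyms && string =~ /^\d+$/
    return flag_synonyms[string] || Set.new
  end

  case flag_format
  when 'short'
    string.chars.to_set       # one character per flag
  when 'long'
    string.scan(/../).to_set  # two characters per flag; a trailing odd character is dropped
  when 'num'
    string.scan(/\d+/).to_set # each run of digits is one flag
  when 'UTF-8'
    string.chars.to_set       # one (possibly multibyte) character per flag
  else
    string.chars.to_set       # unknown formats fall back to 'short' behavior
  end
end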
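
To make the difference between the flag formats concrete, here is a small sketch that mirrors only the case statement in parse_flags above. The module name FlagFormatDemo, the helper split_flags, and the sample flag strings are hypothetical, and the synonym lookup is deliberately left out.

```ruby
require 'set'

# Hypothetical helper mirroring the case statement in parse_flags;
# it is not part of Kotoshu and skips the synonym lookup.
module FlagFormatDemo
  def self.split_flags(string, flag_format)
    case flag_format
    when 'long' then string.scan(/../).to_set   # two characters per flag
    when 'num'  then string.scan(/\d+/).to_set  # runs of digits
    else             string.chars.to_set        # 'short', 'UTF-8', anything else
    end
  end
end

p FlagFormatDemo.split_flags('ABC', 'short').to_a     # ["A", "B", "C"]
p FlagFormatDemo.split_flags('AaBb', 'long').to_a     # ["Aa", "Bb"]
p FlagFormatDemo.split_flags('101,2045', 'num').to_a  # ["101", "2045"]
```

The same input string can therefore produce different flag sets depending on the declared format, which is why from_line forwards context[:flag_format] when it is present.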