Class: Kotoshu::Readers::AffReader
- Inherits:
- Object
- Defined in:
- lib/kotoshu/readers/aff_reader.rb
Overview
Reader for Hunspell affix (.aff) files.
This class parses the directives in an .aff file and returns them as a Hash keyed by directive name.
Constant Summary
- BOOLEAN_DIRECTIVES =
Directives that are single boolean flags
%w[ COMPLEXPREFIXES FULLSTRIP NOSPLITSUGS CHECKSHARPS CHECKCOMPOUNDCASE CHECKCOMPOUNDDUP CHECKCOMPOUNDREP CHECKCOMPOUNDTRIPLE SIMPLIFIEDTRIPLE ONLYMAXDIFF COMPOUNDMORESUFFIXES ].freeze
- STRING_DIRECTIVES =
Directives that are single string values
%w[SET FLAG KEY TRY WORDCHARS LANG].freeze
- INTEGER_DIRECTIVES =
Directives that are single integer values
%w[MAXDIFF MAXNGRAMSUGS MAXCPDSUGS COMPOUNDMIN COMPOUNDWORDMAX].freeze
- FLAG_DIRECTIVES =
Directives that are single flag values
%w[ NOSUGGEST KEEPCASE CIRCUMFIX NEEDAFFIX FORBIDDENWORD WARN COMPOUNDFLAG COMPOUNDBEGIN COMPOUNDMIDDLE COMPOUNDEND ONLYINCOMPOUND COMPOUNDPERMITFLAG COMPOUNDFORBIDFLAG FORCEUCASE SUBSTANDARD SYLLABLENUM COMPOUNDROOT ].freeze
- SYNONYMS =
Outdated directive names mapped to their current equivalents
{ 'PSEUDOROOT' => 'NEEDAFFIX', 'COMPOUNDLAST' => 'COMPOUNDEND' }.freeze
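The constant tables above suggest a simple dispatch: normalize an outdated name through SYNONYMS, then check which list the directive belongs to in order to decide how its argument is parsed. The following is a minimal sketch of that idea, not the library's actual parsing code. The constants are copied from the summary above (lists abbreviated); `directive_kind` and `parse_simple` are hypothetical helpers introduced for illustration, and flag values are kept as raw strings rather than decoded according to the flag format.

```ruby
# Constants abbreviated from the summary above; the real lists are longer.
BOOLEAN_DIRECTIVES = %w[COMPLEXPREFIXES FULLSTRIP CHECKSHARPS].freeze
STRING_DIRECTIVES  = %w[SET FLAG KEY TRY WORDCHARS LANG].freeze
INTEGER_DIRECTIVES = %w[MAXDIFF MAXNGRAMSUGS MAXCPDSUGS COMPOUNDMIN COMPOUNDWORDMAX].freeze
FLAG_DIRECTIVES    = %w[NOSUGGEST KEEPCASE NEEDAFFIX COMPOUNDEND].freeze
SYNONYMS = { 'PSEUDOROOT' => 'NEEDAFFIX', 'COMPOUNDLAST' => 'COMPOUNDEND' }.freeze

# Hypothetical helper: map a directive name to the kind of value it carries.
def directive_kind(name)
  name = SYNONYMS.fetch(name, name)
  return :boolean if BOOLEAN_DIRECTIVES.include?(name)
  return :string  if STRING_DIRECTIVES.include?(name)
  return :integer if INTEGER_DIRECTIVES.include?(name)
  return :flag    if FLAG_DIRECTIVES.include?(name)

  nil
end

# Hypothetical helper: parse one simple (single-value) directive line.
# Returns [canonical_name, value], or nil for comments, blank lines,
# and directives outside the four simple categories (e.g. SFX/PFX tables).
def parse_simple(line)
  name, arg = line.strip.split(/\s+/, 2)
  return nil if name.nil? || name.start_with?('#')

  canonical = SYNONYMS.fetch(name, name)
  value = case directive_kind(canonical)
          when :boolean then true          # presence alone enables the option
          when :integer then Integer(arg)  # raises if the argument is not numeric
          when :string, :flag then arg     # kept as the raw string in this sketch
          else return nil
          end
  [canonical, value]
end

parse_simple('PSEUDOROOT X')   # => ["NEEDAFFIX", "X"]
parse_simple('COMPOUNDMIN 3')  # => ["COMPOUNDMIN", 3]
```

Resolving synonyms before the lookup means the rest of the parser only ever sees the current directive names.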
Instance Attribute Summary
-
#encoding ⇒ Object
readonly
Returns the value of attribute encoding.
-
#flag_format ⇒ Object
readonly
Returns the value of attribute flag_format.
-
#path ⇒ Object
readonly
Returns the value of attribute path.
Instance Method Summary
-
#initialize(path, encoding: 'UTF-8') ⇒ AffReader
constructor
Create a new AFF reader.
-
#read(source = nil) ⇒ Hash
Read the aff file and return the aff data structure.
Constructor Details
#initialize(path, encoding: 'UTF-8') ⇒ AffReader
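Creates a new AFF reader for the file at path. The encoding argument is a fallback: it is used only when detect_encoding(path) returns nil. The flag format starts as 'short' and the flag synonym table starts empty.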
# File 'lib/kotoshu/readers/aff_reader.rb', line 50
def initialize(path, encoding: 'UTF-8')
  @path = path
  @encoding = detect_encoding(path) || encoding
  @flag_format = 'short'
  @flag_synonyms = {}
end
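The constructor prefers a detected encoding over the `encoding:` keyword. `detect_encoding` is not documented on this page, so the sketch below assumes only that it returns an encoding name or nil. The hypothetical `detect_encoding_sketch` stands in for it by scanning for a Hunspell `SET` line, and `resolve_encoding` is an illustrative name for the fallback expression; neither is part of Kotoshu.

```ruby
require 'stringio'

# Hypothetical stand-in for the undocumented detect_encoding:
# returns the argument of the first SET directive, or nil if none is found.
# (Assumption: a real implementation would read the file in binary mode,
# since the encoding is not yet known at this point.)
def detect_encoding_sketch(io)
  io.each_line do |line|
    match = line.match(/\ASET\s+(\S+)/)
    return match[1] if match
  end
  nil
end

# Mirrors the constructor's `detect_encoding(path) || encoding` fallback.
def resolve_encoding(io, encoding: 'UTF-8')
  detect_encoding_sketch(io) || encoding
end

# A SET line wins over the keyword argument.
resolve_encoding(StringIO.new("SET ISO8859-1\nTRY abc\n"))       # => "ISO8859-1"
# Without a SET line, the keyword argument is used.
resolve_encoding(StringIO.new("TRY abc\n"), encoding: 'KOI8-R')  # => "KOI8-R"
```

One consequence of this ordering is that a caller cannot override an encoding the file declares for itself; the keyword only matters for files that declare none.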
Instance Attribute Details
#encoding ⇒ Object (readonly)
Returns the value of attribute encoding.
# File 'lib/kotoshu/readers/aff_reader.rb', line 43
def encoding
  @encoding
end
#flag_format ⇒ Object (readonly)
Returns the value of attribute flag_format.
# File 'lib/kotoshu/readers/aff_reader.rb', line 43
def flag_format
  @flag_format
end
#path ⇒ Object (readonly)
Returns the value of attribute path.
# File 'lib/kotoshu/readers/aff_reader.rb', line 43
def path
  @path
end
Instance Method Details
#read(source = nil) ⇒ Hash
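Reads the .aff file, or the given source if one is passed, and returns the parsed directives as a Hash keyed by directive name. SFX and PFX entries are grouped in nested hashes keyed by affix flag; every other directive maps directly to its value. The FLAG entry defaults to 'short'.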
# File 'lib/kotoshu/readers/aff_reader.rb', line 61
def read(source = nil)
  reader = source || FileReader.new(@path, @encoding)
  data = { 'SFX' => {}, 'PFX' => {}, 'FLAG' => 'short' }

  reader.each do |_line_no, line|
    dir_value = read_directive(reader, line)
    next unless dir_value

    directive, value = dir_value

    # Update flag format when FLAG directive is encountered (BEFORE using it)
    if directive == 'FLAG'
      @flag_format = value
    end

    # Re-parse FLAG directive value now that @flag_format is updated
    if directive == 'FLAG' && value.is_a?(String)
      # No re-parsing needed for FLAG, just update the format
    end

    # SFX/PFX have multiple entries
    if %w[SFX PFX].include?(directive)
      data[directive][value.first.flag] = value
    else
      data[directive] = value
    end

    # Update flag synonyms when AF directive is encountered (AFTER storing it)
    if directive == 'AF'
      @flag_synonyms = value
    end

    # Note: We don't reset_encoding during iteration because it closes
    # the file and breaks the iteration. The FileReader is initialized
    # with UTF-8 encoding which handles most cases.
  end

  data
end
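The accumulation step of `#read` can be tried in isolation. The sketch below reproduces only the storage rule from the method body: SFX and PFX values go under the first entry's flag, and everything else is stored directly. `AffixEntry` and `collect_directives` are hypothetical names introduced here; the real entry class and `read_directive` are not shown on this page, and the sketch assumes only that an entry responds to `#flag`.

```ruby
# Hypothetical stand-in for an affix entry; only #flag matters to the loop.
AffixEntry = Struct.new(:flag, :strip, :add)

# Accumulates [directive, value] pairs the way the body of #read does:
# SFX/PFX values are arrays of entries stored under the first entry's flag;
# any other directive is stored directly, so a later value overwrites an earlier one.
def collect_directives(pairs)
  data = { 'SFX' => {}, 'PFX' => {}, 'FLAG' => 'short' }
  pairs.each do |directive, value|
    if %w[SFX PFX].include?(directive)
      data[directive][value.first.flag] = value
    else
      data[directive] = value
    end
  end
  data
end

pairs = [
  ['FLAG', 'long'],
  ['SFX', [AffixEntry.new('Aa', '0', 's'), AffixEntry.new('Aa', 'y', 'ies')]],
  ['TRY', 'esianrt']
]
data = collect_directives(pairs)
data['FLAG']              # => "long"
data['SFX']['Aa'].length  # => 2
data['PFX']               # => {}
```

Because the result is seeded with empty 'SFX' and 'PFX' hashes and a 'FLAG' of 'short', callers can read those three keys without nil checks even when the file defines none of them.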