Class: Kotoshu::Algorithms::Capitalization::GermanCasing

Inherits:
Casing
  • Object
Defined in:
lib/kotoshu/algorithms/capitalization.rb

Overview

Redefines #lower because in German an uppercase “SS” can be lowercased as either “ss” or “ß”.

Example:

german = Kotoshu::Algorithms::Capitalization::GermanCasing.new
german.lower('STRASSE')  # => ['straße', 'strasse']

Instance Method Summary

Methods inherited from Casing

#capitalize, #coerce, #corrections, #lowerfirst, #upper, #variants

Instance Method Details

#guess(word) ⇒ Symbol

Guess word’s capitalization, accounting for German ß handling.

In German, a word written in capitals may keep ß (a lowercase letter that is usually uppercased as “SS”), as in “straße” => “STRAßE”. Such a word is still classified as Type::ALL.

Parameters:

  • word (String)

    The word to analyze

Returns:

  • (Symbol)

    One of the Type constants



# File 'lib/kotoshu/algorithms/capitalization.rb', line 262

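def guess(word)
  # Start from the classification produced by Casing#guess.
  result = super

  # ß has no single-letter uppercase form in this check, so a word such as
  # "STRAßE" is re-tested with every ß removed: if the rest is ALL caps,
  # the whole word is treated as ALL caps.
  if word.include?('ß')
    word_without_eszett = word.gsub('ß', '')
    return Type::ALL if super(word_without_eszett) == Type::ALL
  end

  result
end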
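
As a rough illustration of the ß check, the following self-contained sketch pairs it with a simplified stand-in for the inherited guess. The SketchCasing and SketchGermanCasing classes and the :all, :none, and :mixed symbols are assumptions made for this example only; the real Casing#guess and the Type constants may classify words differently.

```ruby
# Hypothetical stand-in for Casing; the real base class is not shown here.
class SketchCasing
  def guess(word)
    return :none if word == word.downcase
    return :all if word == word.upcase

    :mixed
  end
end

# Applies the same ß check as the documented #guess on top of the stand-in.
class SketchGermanCasing < SketchCasing
  def guess(word)
    result = super

    # Re-test the word with every ß removed.
    if word.include?('ß')
      return :all if super(word.delete('ß')) == :all
    end

    result
  end
end

german = SketchGermanCasing.new
german.guess('STRAßE')  # => :all
german.guess('straße')  # => :none
german.guess('Straße')  # => :mixed
```

Without the extra check, the stand-in would report "STRAßE" as :mixed, because Ruby upcases ß to "SS" and the word therefore does not equal its own uppercase form.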

#lower(word) ⇒ Array<String>

Lowercase word, generating both “ss” and “ß” variants where applicable.

Parameters:

  • word (String)

    The word to lowercase

Returns:

  • (Array<String>)

    List of lowercased variants



# File 'lib/kotoshu/algorithms/capitalization.rb', line 248

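def lower(word)
  # Casing#lower returns an array; its first entry is the plain lowercase form.
  lowered = super.first
  # Only an uppercase "SS" in the input can stand for ß.
  return [lowered] unless word.include?('SS')

  # ß variants come first, followed by the plain "ss" form.
  [*sharp_s_variants(lowered), lowered]
end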
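
The sketch below shows how the "SS" check affects the result. It is not the library's own setup: SketchCasing is a hypothetical stand-in whose #lower is assumed to return a one-element array holding the downcased word, and #sharp_s_variants is copied from the listing in this page so the example runs on its own.

```ruby
# Hypothetical stand-in for Casing#lower (assumed to return a one-element array).
class SketchCasing
  def lower(word)
    [word.downcase]
  end
end

class SketchGermanCasing < SketchCasing
  def lower(word)
    lowered = super.first
    return [lowered] unless word.include?('SS')

    [*sharp_s_variants(lowered), lowered]
  end

  # Same recursion as the documented #sharp_s_variants.
  def sharp_s_variants(text, start = 0)
    pos = text.index('ss', start)
    return [] unless pos

    replaced = text[0...pos] + 'ß' + text[(pos + 2)..]
    [replaced,
     *sharp_s_variants(replaced, pos + 1),
     *sharp_s_variants(text, pos + 2)]
  end
end

german = SketchGermanCasing.new
german.lower('STRASSE')  # => ["straße", "strasse"]
german.lower('Strasse')  # => ["strasse"]
german.lower('HAUS')     # => ["haus"]
```

"Strasse" yields a single variant because the check looks for an uppercase "SS" in the original word, not for "ss" in the lowercased result.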

#sharp_s_variants(text, start = 0) ⇒ Array<String>

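Generate the variants of text in which one or more “ss” occurrences at or after start are replaced with sharp S (ß). The unmodified text is not included in the result.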

Parameters:

  • text (String)

    The text to process

  • start (Integer) (defaults to: 0)

    Starting position for search

Returns:

  • (Array<String>)

    All variants with ß replacements



# File 'lib/kotoshu/algorithms/capitalization.rb', line 234

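def sharp_s_variants(text, start = 0)
  # Find the next "ss" at or after start; stop when there is none.
  pos = text.index('ss', start)
  return [] unless pos

  # Replace this occurrence with ß.
  replaced = text[0...pos] + 'ß' + text[(pos + 2)..]
  # Collect: this replacement, further replacements after it,
  # and variants that leave this occurrence as "ss".
  [replaced,
   *sharp_s_variants(replaced, pos + 1),
   *sharp_s_variants(text, pos + 2)]
end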
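
To make the recursion easier to follow, here is the same method body as a standalone top-level method, applied to a word with two “ss” occurrences. The input words are chosen for illustration only.

```ruby
# Standalone copy of the method body, so the sketch runs outside the class.
def sharp_s_variants(text, start = 0)
  pos = text.index('ss', start)
  return [] unless pos

  replaced = text[0...pos] + 'ß' + text[(pos + 2)..]
  [replaced,
   *sharp_s_variants(replaced, pos + 1),
   *sharp_s_variants(text, pos + 2)]
end

sharp_s_variants('wasserstrasse')
# => ["waßerstrasse", "waßerstraße", "wasserstraße"]

sharp_s_variants('haus')
# => []
```

Each “ss” is either replaced or kept, so two occurrences give three variants (every combination except the unchanged text). The method does not check whether a variant is a real word; "waßerstrasse" is generated alongside the valid spellings.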