Class: Kotoshu::Cache::LanguageCache

Inherits:
BaseCache
  • Object
Defined in:
lib/kotoshu/cache/language_cache.rb

Overview

Language cache for dynamic dictionary and grammar rule downloads.

Manages per-language dictionary and grammar rule downloads from a remote repository. Resources are cached locally under `$XDG_CACHE_HOME/kotoshu/languages/<code>/`, where <code> is the language code, together with metadata for versioning and expiration.

Extends BaseCache for common download, metadata, and validation logic.
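
As a rough illustration of the layout described above, the following sketch resolves the directory for one language and resource type. The helper name `language_resource_dir` is hypothetical and not part of Kotoshu, and the fallback to `~/.cache` when `$XDG_CACHE_HOME` is unset is an assumption based on the usual XDG convention.

```ruby
# Hypothetical helper, not part of Kotoshu: mirrors the documented layout
# $XDG_CACHE_HOME/kotoshu/languages/<code>/<type>.
def language_resource_dir(code, type, env: ENV)
  xdg = env["XDG_CACHE_HOME"]
  # Assumption: fall back to ~/.cache when the variable is unset or empty.
  base = (xdg.nil? || xdg.empty?) ? File.join(Dir.home, ".cache") : xdg
  File.join(base, "kotoshu", "languages", code, type)
end

puts language_resource_dir("en", "spelling", env: { "XDG_CACHE_HOME" => "/tmp/cache" })
# prints /tmp/cache/kotoshu/languages/en/spelling
```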

Examples:

Getting a cached spelling dictionary

cache = LanguageCache.new
result = cache.get_spelling('en')
# => { aff_path: "~/.cache/kotoshu/languages/en/spelling/index.aff",
#      dic_path: "~/.cache/kotoshu/languages/en/spelling/index.dic",
#      metadata: { ... } }

Checking cache statistics

stats = cache.stats
# => { hits: 5, misses: 1, hit_rate: 0.83, ... }
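
The `hit_rate` value in the example is consistent with hits divided by total lookups. The sketch below reproduces that arithmetic; the formula and the rounding to two decimals are assumptions, since the BaseCache source is not shown here.

```ruby
# Assumption: hit_rate = hits / (hits + misses), rounded to two decimals.
def hit_rate(hits, misses)
  total = hits + misses
  return 0.0 if total.zero? # avoid dividing by zero on a fresh cache

  (hits.to_f / total).round(2)
end

puts hit_rate(5, 1) # prints 0.83
```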

Constant Summary

RESOURCE_TYPES =

Supported resource types

%w[spelling grammar frequency].freeze
AVAILABLE_LANGUAGES =

Available languages

%w[de en es fr pt ru].freeze

Instance Attribute Summary

Attributes inherited from BaseCache

#cache_path, #cache_ttl, #github_url, #source_registry, #url_base

Instance Method Summary

Methods inherited from BaseCache

#available?, #clean, #clear, #clear_all, #download, #get, #initialize, #reset_stats, #stats

Constructor Details

This class inherits a constructor from Kotoshu::Cache::BaseCache

Instance Method Details

#available_languages ⇒ Array<String>

Get list of available languages.

Returns:

  • (Array<String>)

    List of supported language codes



# File 'lib/kotoshu/cache/language_cache.rb', line 117

def available_languages
  AVAILABLE_LANGUAGES.dup
end

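#cache_size ⇒ Integer

Get cache size in bytes (override for language-specific tracking). Only .dic files under the languages directory are counted; .aff files, grammar rules, frequency data, and metadata are not included in the total.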

Returns:

  • (Integer)

    Total size in bytes



# File 'lib/kotoshu/cache/language_cache.rb', line 139

def cache_size
  total = 0
  Dir.glob(File.join(@cache_path, "languages", "**", "*.dic")).each do |path|
    total += File.size(path) if File.file?(path)
  end
  total
end
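
To make the effect of the `*.dic` glob concrete, this sketch builds a temporary cache tree and compares the .dic-only total with a total over every regular file. `dic_bytes` is a standalone copy of the logic above that takes the cache root as an argument; `all_bytes` and `demo_sizes` are hypothetical names used only for illustration.

```ruby
require "tmpdir"
require "fileutils"

# Standalone copy of the glob logic above, taking the cache root as an argument.
def dic_bytes(cache_path)
  Dir.glob(File.join(cache_path, "languages", "**", "*.dic"))
     .select { |path| File.file?(path) }
     .sum { |path| File.size(path) }
end

# Hypothetical variant that counts every regular file under languages/.
def all_bytes(cache_path)
  Dir.glob(File.join(cache_path, "languages", "**", "*"))
     .select { |path| File.file?(path) }
     .sum { |path| File.size(path) }
end

# Builds a throwaway cache tree with a 10-byte .dic and a 4-byte .aff file.
def demo_sizes
  Dir.mktmpdir do |root|
    dir = File.join(root, "languages", "en", "spelling")
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "index.dic"), "x" * 10)
    File.write(File.join(dir, "index.aff"), "y" * 4)
    [dic_bytes(root), all_bytes(root)]
  end
end

p demo_sizes # prints [10, 14]
```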

#cached_resources ⇒ Array<String>

List all cached resources.

Returns:

  • (Array<String>)

    List of cached resource identifiers



# File 'lib/kotoshu/cache/language_cache.rb', line 150

def cached_resources
  Dir.glob(File.join(@cache_path, "languages", "**", "metadata.json")).map do |path|
    relative = Pathname.new(path).relative_path_from(Pathname.new(@cache_path))
    parts = relative.to_s.split("/")
    "#{parts[1]}:#{parts[2]}"
  end.uniq
end
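
The mapping step above turns a metadata path into a "language:type" identifier. A minimal sketch of that step in isolation, with a hypothetical helper name and made-up paths:

```ruby
require "pathname"

# Standalone version of the mapping step: metadata path -> "language:type".
def resource_id_from(metadata_path, cache_path)
  relative = Pathname.new(metadata_path).relative_path_from(Pathname.new(cache_path))
  # relative looks like "languages/<language>/<type>/metadata.json"
  _, language, type = relative.to_s.split("/")
  "#{language}:#{type}"
end

puts resource_id_from("/tmp/cache/languages/en/spelling/metadata.json", "/tmp/cache")
# prints en:spelling
```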

#frequency_available?(language_code) ⇒ Boolean

Check if frequency data is available for a language.

Parameters:

  • language_code (String)

    ISO 639-1 language code

Returns:

  • (Boolean)

    True if frequency data exists



# File 'lib/kotoshu/cache/language_cache.rb', line 109

def frequency_available?(language_code)
  resource_id = "#{language_code}:frequency"
  available?(resource_id)
end

#get_dictionary(language, force_download: false) ⇒ Hash

Alias for get_spelling for backward compatibility.

Parameters:

  • language (String)

    Language code

  • force_download (Boolean) (defaults to: false)

    Force re-download

Returns:

  • (Hash)

    Dictionary paths and metadata



# File 'lib/kotoshu/cache/language_cache.rb', line 90

def get_dictionary(language, force_download: false)
  # get_spelling takes force_download as a keyword argument, so forward it by name.
  get_spelling(language, force_download: force_download)
end

#get_grammar(language, force_download: false) ⇒ Hash

Get or download grammar rules for a language.

Parameters:

  • language (String)

    Language code

  • force_download (Boolean) (defaults to: false)

    Force re-download

Returns:

  • (Hash)

    Rules path and metadata



# File 'lib/kotoshu/cache/language_cache.rb', line 99

def get_grammar(language, force_download: false)
  resource_id = "#{language}:grammar"
  result = get(resource_id, force_download: force_download)
  result || download_grammar(language)
end

#get_spelling(language, force_download: false) ⇒ Hash

Get or download spelling dictionary for a language.

Parameters:

  • language (String)

    Language code (e.g., ‘en’, ‘de’)

  • force_download (Boolean) (defaults to: false)

    Force re-download even if cached

Returns:

  • (Hash)

    Dictionary paths and metadata



# File 'lib/kotoshu/cache/language_cache.rb', line 37

def get_spelling(language, force_download: false)
  resource_id = "#{language}:spelling"
  result = get(resource_id, force_download: force_download)
  result || download_spelling(language)
end
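
The get-or-download pattern used by get_spelling and get_grammar can be sketched with an in-memory stand-in. `TinyCache` is hypothetical and is not Kotoshu's implementation; it assumes that `get` returns nil on a miss and when a download is forced, which is what the `result || download_spelling(language)` fallback implies.

```ruby
# Hypothetical stand-in for the cache: not Kotoshu's implementation.
class TinyCache
  attr_reader :downloads

  def initialize
    @store = {}
    @downloads = 0
  end

  # Assumption: returns the cached entry, or nil on a miss or forced refresh.
  def get(resource_id, force_download: false)
    force_download ? nil : @store[resource_id]
  end

  def get_spelling(language, force_download: false)
    resource_id = "#{language}:spelling"
    get(resource_id, force_download: force_download) || download_spelling(language)
  end

  private

  # Stand-in for the real download: records a fake entry and counts the call.
  def download_spelling(language)
    @downloads += 1
    @store["#{language}:spelling"] = { dic_path: "#{language}/spelling/index.dic" }
  end
end

cache = TinyCache.new
cache.get_spelling("en")                       # miss: downloads
cache.get_spelling("en")                       # hit: served from the store
cache.get_spelling("en", force_download: true) # forced: downloads again
puts cache.downloads # prints 2
```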

#install_local(language, aff:, dic:, force: false) ⇒ Hash

Install a spelling dictionary from local files (no download). Used by ResourceManager.setup_from_local when the user already has .aff/.dic files on disk.

Symlinks the source files into the cache directory so that subsequent cache lookups find them. If a target already exists, whether as a symlink or as a real file, the method raises ArgumentError unless force: true is given, in which case the existing entry is removed and replaced.

Parameters:

  • language (String)

    Language code

  • aff (String)

    Path to .aff file

  • dic (String)

    Path to .dic file

  • force (Boolean) (defaults to: false)

    Overwrite existing install

Returns:

  • (Hash)

    Installed paths



# File 'lib/kotoshu/cache/language_cache.rb', line 55

def install_local(language, aff:, dic:, force: false)
  require "fileutils"

  resource_id = "#{language}:spelling"
  lang_path = resource_dir_for(resource_id)
  FileUtils.mkdir_p(lang_path)

  target_aff = File.join(lang_path, "index.aff")
  target_dic = File.join(lang_path, "index.dic")

  if File.exist?(target_aff) || File.symlink?(target_aff)
    raise ArgumentError, "#{target_aff} already exists (use force: true to overwrite)" unless force

    File.unlink(target_aff)
  end
  if File.exist?(target_dic) || File.symlink?(target_dic)
    raise ArgumentError, "#{target_dic} already exists (use force: true to overwrite)" unless force

    File.unlink(target_dic)
  end

  File.symlink(File.expand_path(aff), target_aff)
  File.symlink(File.expand_path(dic), target_dic)

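  # Metadata for the local install (source "local-source") is recorded here,
  # at metadata_path_for(resource_id); the helper names are elided in this listing.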

  { aff_path: target_aff, dic_path: target_dic, source: :local }
end
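
The replace-or-raise handling of an existing target can be tried in isolation. The sketch below reduces it to a single file inside a temporary directory; `link_into_cache` and `demo_link` are hypothetical names, and the example assumes a platform where symlinks are available.

```ruby
require "tmpdir"

# Hypothetical reduction of the replace-or-raise step to a single target.
def link_into_cache(source, target, force: false)
  # File.symlink? also catches dangling links, which File.exist? reports as absent.
  if File.exist?(target) || File.symlink?(target)
    raise ArgumentError, "#{target} already exists (use force: true to overwrite)" unless force

    File.unlink(target)
  end
  File.symlink(File.expand_path(source), target)
  target
end

def demo_link
  Dir.mktmpdir do |root|
    source = File.join(root, "my.dic")
    File.write(source, "1\nhello\n")
    target = File.join(root, "index.dic")

    link_into_cache(source, target) # first install succeeds
    refused = begin
      link_into_cache(source, target) # second install without force is refused
      false
    rescue ArgumentError
      true
    end
    link_into_cache(source, target, force: true) # force replaces the link
    [File.symlink?(target), refused, File.read(target)]
  end
end

p demo_link # prints [true, true, "1\nhello\n"]
```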

#language_info(language_code) ⇒ Hash

Get language metadata (word count, license, source).

Parameters:

  • language_code (String)

    The language code

Returns:

  • (Hash)

    Language info



# File 'lib/kotoshu/cache/language_cache.rb', line 125

def language_info(language_code)
  {
    "de" => { name: "German", word_count: 75_873, license: "GPL", source: "igerman98" },
    "en" => { name: "English", word_count: 49_568, license: "LGPL/MPL/GPL", source: "SCOWL" },
    "es" => { name: "Spanish", word_count: 57_344, license: "GPL", source: "LibreOffice" },
    "fr" => { name: "French", word_count: 84_310, license: "MPL 2.0", source: "Grammalecte" },
    "pt" => { name: "Portuguese", word_count: 312_368, license: "LGPLv3 + MPL", source: "VERO" },
    "ru" => { name: "Russian", word_count: 146_269, license: "BSD-style", source: "Alexander Lebedev" }
  }[language_code] || { name: language_code.upcase, word_count: 0, license: "Unknown", source: "Unknown" }
end
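
A short sketch of the lookup-with-fallback behaviour, using two entries copied from the table above. The constant name `LANGUAGE_INFO` and the use of `Hash#fetch` with a block are illustrative choices, not the library's code. The fallback calls `upcase` on the argument, so a String is expected.

```ruby
# Two entries copied from the table above; the constant name is hypothetical.
LANGUAGE_INFO = {
  "de" => { name: "German", word_count: 75_873, license: "GPL", source: "igerman98" },
  "en" => { name: "English", word_count: 49_568, license: "LGPL/MPL/GPL", source: "SCOWL" }
}.freeze

def language_info(code)
  # The block runs only when the code is missing from the table.
  LANGUAGE_INFO.fetch(code) do
    { name: code.upcase, word_count: 0, license: "Unknown", source: "Unknown" }
  end
end

p language_info("en")[:source] # prints "SCOWL"
p language_info("xx")[:name]   # prints "XX"
```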

#metadata_path_for(resource_id) ⇒ String

Get metadata file path for a resource.

Parameters:

  • resource_id (String)

    The resource identifier

Returns:

  • (String)

    Metadata file path



# File 'lib/kotoshu/cache/language_cache.rb', line 414

def metadata_path_for(resource_id)
  language = extract_language(resource_id)
  type = extract_type(resource_id)
  File.join(@cache_path, "languages", language, type, "metadata.json")
end

#resource_dir_for(resource_id) ⇒ String

Get resource directory path.

Parameters:

  • resource_id (String)

    The resource identifier

Returns:

  • (String)

    Resource directory path



# File 'lib/kotoshu/cache/language_cache.rb', line 424

def resource_dir_for(resource_id)
  language = extract_language(resource_id)
  type = extract_type(resource_id)
  File.join(@cache_path, "languages", language, type)
end
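
The path helpers depend on extract_language and extract_type, whose source is not shown on this page. The sketch below assumes they split a "language:type" identifier at the colon; both helper bodies are hypothetical stand-ins, and the cache root is passed as an argument instead of being read from @cache_path.

```ruby
# Assumption: identifiers have the form "<language>:<type>". These helpers are
# hypothetical stand-ins for extract_language and extract_type.
def extract_language(resource_id)
  resource_id.split(":", 2).first
end

def extract_type(resource_id)
  resource_id.split(":", 2)[1]
end

# Same join as above, with the cache root passed in explicitly.
def resource_dir_for(cache_path, resource_id)
  File.join(cache_path, "languages", extract_language(resource_id), extract_type(resource_id))
end

dir = resource_dir_for("/tmp/cache", "en:grammar")
puts dir                               # prints /tmp/cache/languages/en/grammar
puts File.join(dir, "metadata.json")   # prints /tmp/cache/languages/en/grammar/metadata.json
```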

#resource_files_exist?(resource_id) ⇒ Boolean

Check if all resource files exist.

Parameters:

  • resource_id (String)

    The resource identifier

Returns:

  • (Boolean)

    True if all files exist



# File 'lib/kotoshu/cache/language_cache.rb', line 434

def resource_files_exist?(resource_id)
  type = extract_type(resource_id)
  return false unless type

  lang_path = resource_dir_for(resource_id)

  case type
  when "spelling"
    File.exist?(File.join(lang_path, "index.aff")) &&
      File.exist?(File.join(lang_path, "index.dic"))
  when "grammar"
    File.exist?(File.join(lang_path, "rules.yaml"))
  when "frequency"
    File.exist?(File.join(lang_path, "frequency.json"))
  else
    false
  end
end

#supports_resource?(resource_id) ⇒ Boolean

Check if a resource type is supported.

Parameters:

  • resource_id (String)

    The resource identifier (e.g., “en:spelling”)

Returns:

  • (Boolean)

    True if supported



# File 'lib/kotoshu/cache/language_cache.rb', line 162

def supports_resource?(resource_id)
  parts = resource_id.split(":")
  return false unless parts.size == 2

  language, type = parts
  AVAILABLE_LANGUAGES.include?(language) && RESOURCE_TYPES.include?(type)
end
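
For a quick check of which identifiers pass, the method can be run standalone against the two constants listed in the Constant Summary:

```ruby
RESOURCE_TYPES = %w[spelling grammar frequency].freeze
AVAILABLE_LANGUAGES = %w[de en es fr pt ru].freeze

# Standalone copy of the method above.
def supports_resource?(resource_id)
  parts = resource_id.split(":")
  return false unless parts.size == 2

  language, type = parts
  AVAILABLE_LANGUAGES.include?(language) && RESOURCE_TYPES.include?(type)
end

p supports_resource?("en:spelling") # prints true
p supports_resource?("en")          # prints false (no type part)
p supports_resource?("ja:spelling") # prints false (language not listed)
p supports_resource?("en:thesaurus") # prints false (type not listed)
```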