Class: Kotoshu::Cache::LanguageCache
Defined in: lib/kotoshu/cache/language_cache.rb
Overview
Language cache for dynamic dictionary and grammar rule downloads.
Manages per-language dictionary and grammar rule downloads from a remote repository. Resources are cached locally in `$XDG_CACHE_HOME/kotoshu/languages/code/` with metadata for versioning and expiration.
Extends BaseCache for common download, metadata, and validation logic.
Constant Summary
- RESOURCE_TYPES =
Supported resource types
%w[spelling grammar frequency].freeze
- AVAILABLE_LANGUAGES =
Available languages
%w[de en es fr pt ru].freeze
Instance Attribute Summary
Attributes inherited from BaseCache
#cache_path, #cache_ttl, #github_url, #source_registry, #url_base
Instance Method Summary
- #available_languages ⇒ Array<String>
Get list of available languages.
- #cache_size ⇒ Integer
Get cache size in bytes (override for language-specific tracking).
- #cached_resources ⇒ Array<String>
List all cached resources.
- #frequency_available?(language_code) ⇒ Boolean
Check if frequency data is available for a language.
- #get_dictionary(language, force_download: false) ⇒ Hash
Alias for get_spelling for backward compatibility.
- #get_grammar(language, force_download: false) ⇒ Hash
Get or download grammar rules for a language.
- #get_spelling(language, force_download: false) ⇒ Hash
Get or download spelling dictionary for a language.
- #install_local(language, aff:, dic:, force: false) ⇒ Hash
Install a spelling dictionary from local files (no download).
- #language_info(language_code) ⇒ Hash
Get language metadata (word count, license, source).
- #metadata_path_for(resource_id) ⇒ String
Get metadata file path for a resource.
- #resource_dir_for(resource_id) ⇒ String
Get resource directory path.
- #resource_files_exist?(resource_id) ⇒ Boolean
Check if all resource files exist.
- #supports_resource?(resource_id) ⇒ Boolean
Check if a resource type is supported.
Methods inherited from BaseCache
#available?, #clean, #clear, #clear_all, #download, #get, #initialize, #reset_stats, #stats
Constructor Details
This class inherits a constructor from Kotoshu::Cache::BaseCache
Instance Method Details
#available_languages ⇒ Array<String>
Get list of available languages.
    # File 'lib/kotoshu/cache/language_cache.rb', line 117

    def available_languages
      AVAILABLE_LANGUAGES.dup
    end
#cache_size ⇒ Integer
Get cache size in bytes (override for language-specific tracking).
    # File 'lib/kotoshu/cache/language_cache.rb', line 139

    def cache_size
      total = 0
      Dir.glob(File.join(@cache_path, "languages", "**", "*.dic")).each do |path|
        total += File.size(path) if File.file?(path)
      end
      total
    end
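One consequence of the glob in #cache_size is that only `.dic` files contribute to the total. The sketch below re-creates that glob outside the class to show the effect; `dic_bytes` is a hypothetical helper name and the cache root is a throwaway temporary directory, not a real Kotoshu cache.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical standalone version of the #cache_size glob: sums the size of
# every regular *.dic file found under <cache_path>/languages.
def dic_bytes(cache_path)
  Dir.glob(File.join(cache_path, "languages", "**", "*.dic"))
     .select { |path| File.file?(path) }
     .sum { |path| File.size(path) }
end

Dir.mktmpdir do |cache_path|
  dir = File.join(cache_path, "languages", "en", "spelling")
  FileUtils.mkdir_p(dir)
  File.write(File.join(dir, "index.dic"), "12345") # 5 bytes, counted
  File.write(File.join(dir, "index.aff"), "123")   # not a .dic file, ignored
  puts dic_bytes(cache_path)
end
```

Under these assumptions the affix file, grammar rules, and metadata do not count toward the reported size.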
#cached_resources ⇒ Array<String>
List all cached resources.
    # File 'lib/kotoshu/cache/language_cache.rb', line 150

    def cached_resources
      Dir.glob(File.join(@cache_path, "languages", "**", "metadata.json")).map do |path|
        relative = Pathname.new(path).relative_path_from(Pathname.new(@cache_path))
        parts = relative.to_s.split("/")
        "#{parts[1]}:#{parts[2]}"
      end.uniq
    end
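The resource ids returned by #cached_resources are derived purely from where each `metadata.json` sits on disk. A minimal sketch of that derivation, using a temporary directory and a hypothetical `cached_resource_ids` helper that takes the cache root as an argument:

```ruby
require "tmpdir"
require "fileutils"
require "pathname"

# Hypothetical standalone version of the lookup: every metadata.json under
# <cache_path>/languages/<language>/<type>/ yields the id "<language>:<type>".
def cached_resource_ids(cache_path)
  pattern = File.join(cache_path, "languages", "**", "metadata.json")
  Dir.glob(pattern).map do |path|
    relative = Pathname.new(path).relative_path_from(Pathname.new(cache_path))
    parts = relative.to_s.split("/") # ["languages", language, type, "metadata.json"]
    "#{parts[1]}:#{parts[2]}"
  end.uniq
end

Dir.mktmpdir do |cache_path|
  [%w[en spelling], %w[de grammar]].each do |language, type|
    dir = File.join(cache_path, "languages", language, type)
    FileUtils.mkdir_p(dir)
    File.write(File.join(dir, "metadata.json"), "{}")
  end
  p cached_resource_ids(cache_path).sort
end
```

A resource directory without a `metadata.json` is not listed, even if its data files are present.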
#frequency_available?(language_code) ⇒ Boolean
Check if frequency data is available for a language.
    # File 'lib/kotoshu/cache/language_cache.rb', line 109

    def frequency_available?(language_code)
      resource_id = "#{language_code}:frequency"
      available?(resource_id)
    end
#get_dictionary(language, force_download: false) ⇒ Hash
Alias for get_spelling for backward compatibility.
    # File 'lib/kotoshu/cache/language_cache.rb', line 90

    def get_dictionary(language, force_download: false)
      # Forward the flag as a keyword; get_spelling accepts only one positional argument.
      get_spelling(language, force_download: force_download)
    end
#get_grammar(language, force_download: false) ⇒ Hash
Get or download grammar rules for a language.
    # File 'lib/kotoshu/cache/language_cache.rb', line 99

    def get_grammar(language, force_download: false)
      resource_id = "#{language}:grammar"
      result = get(resource_id, force_download: force_download)
      result || download_grammar(language)
    end
#get_spelling(language, force_download: false) ⇒ Hash
Get or download spelling dictionary for a language.
    # File 'lib/kotoshu/cache/language_cache.rb', line 37

    def get_spelling(language, force_download: false)
      resource_id = "#{language}:spelling"
      result = get(resource_id, force_download: force_download)
      result || download_spelling(language)
    end
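The get-or-download flow can be illustrated with a toy stand-in. Everything in `ToyCache` is an assumption made for illustration: in particular, it assumes that `get` returns nil on a cache miss or a forced refresh, which this page does not confirm for BaseCache#get, and its `download_spelling` merely records a call instead of touching the network.

```ruby
# Toy stand-in for the cache, used only to show the control flow.
class ToyCache
  attr_reader :downloads

  def initialize
    @store = {}
    @downloads = 0
  end

  # Assumed behavior: nil on a miss or when a refresh is forced.
  def get(resource_id, force_download: false)
    force_download ? nil : @store[resource_id]
  end

  # Pretend download: counts the call and stores a placeholder entry.
  def download_spelling(language)
    @downloads += 1
    @store["#{language}:spelling"] = { language: language, source: :remote }
  end

  # Same shape as the listing: try the cache first, then fall back to a download.
  def get_spelling(language, force_download: false)
    get("#{language}:spelling", force_download: force_download) || download_spelling(language)
  end
end

cache = ToyCache.new
cache.get_spelling("en")                       # miss: triggers a download
cache.get_spelling("en")                       # hit: served from the store
cache.get_spelling("en", force_download: true) # forced: downloads again
puts cache.downloads
```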
#install_local(language, aff:, dic:, force: false) ⇒ Hash
Install a spelling dictionary from local files (no download). Used by ResourceManager.setup_from_local when the user already has .aff/.dic files on disk. Symlinks the source files into the cache directory as index.aff and index.dic so subsequent cache lookups find them. If either target already exists, whether as a symlink or a real file, ArgumentError is raised unless force: true, in which case the existing entry is unlinked and replaced.
    # File 'lib/kotoshu/cache/language_cache.rb', line 55

    def install_local(language, aff:, dic:, force: false)
      require "fileutils"

      resource_id = "#{language}:spelling"
      lang_path = resource_dir_for(resource_id)
      FileUtils.mkdir_p(lang_path)

      target_aff = File.join(lang_path, "index.aff")
      target_dic = File.join(lang_path, "index.dic")

      if File.exist?(target_aff) || File.symlink?(target_aff)
        raise ArgumentError, "#{target_aff} already exists (use force: true to overwrite)" unless force

        File.unlink(target_aff)
      end
      if File.exist?(target_dic) || File.symlink?(target_dic)
        raise ArgumentError, "#{target_dic} already exists (use force: true to overwrite)" unless force

        File.unlink(target_dic)
      end

      File.symlink(File.(aff), target_aff)
      File.symlink(File.(dic), target_dic)

      ((resource_id), (language, "spelling", "local-source"))

      { aff_path: target_aff, dic_path: target_dic, source: :local }
    end
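The replace-or-raise rule applied to each target file can be sketched in isolation. `link_into_cache` is a hypothetical helper, and resolving the source with `File.expand_path` before linking is an assumption; the sketch runs against a temporary directory only.

```ruby
require "tmpdir"

# Hypothetical helper mirroring the rule for a single target file: an existing
# target (symlink or real file) raises unless force is true, in which case it
# is unlinked and replaced by a symlink to the source.
def link_into_cache(source, target, force: false)
  if File.exist?(target) || File.symlink?(target)
    raise ArgumentError, "#{target} already exists (use force: true to overwrite)" unless force

    File.unlink(target)
  end
  File.symlink(File.expand_path(source), target) # assumption: link to the absolute path
  target
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "en_US.dic")
  File.write(source, "1\nhello\n")
  target = File.join(dir, "index.dic")

  link_into_cache(source, target)
  link_into_cache(source, target, force: true) # replaces the existing symlink
  puts File.symlink?(target)
end
```

The `File.symlink?` check matters because `File.exist?` returns false for a dangling symlink, which would otherwise make the later `File.symlink` call fail with `Errno::EEXIST`.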
#language_info(language_code) ⇒ Hash
Get language metadata (word count, license, source).
    # File 'lib/kotoshu/cache/language_cache.rb', line 125

    def language_info(language_code)
      {
        "de" => { name: "German", word_count: 75_873, license: "GPL", source: "igerman98" },
        "en" => { name: "English", word_count: 49_568, license: "LGPL/MPL/GPL", source: "SCOWL" },
        "es" => { name: "Spanish", word_count: 57_344, license: "GPL", source: "LibreOffice" },
        "fr" => { name: "French", word_count: 84_310, license: "MPL 2.0", source: "Grammalecte" },
        "pt" => { name: "Portuguese", word_count: 312_368, license: "LGPLv3 + MPL", source: "VERO" },
        "ru" => { name: "Russian", word_count: 146_269, license: "BSD-style", source: "Alexander Lebedev" }
      }[language_code] || { name: language_code.upcase, word_count: 0, license: "Unknown", source: "Unknown" }
    end
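The fallback for unknown codes is easy to miss in the listing, so here is a cut-down sketch of the same lookup. It copies two entries from the method above and hoists them into a frozen constant, `LANGUAGE_INFO`, which is a name chosen for this example and not something the class defines.

```ruby
# Two entries copied from #language_info; the remaining languages are omitted.
LANGUAGE_INFO = {
  "en" => { name: "English", word_count: 49_568, license: "LGPL/MPL/GPL", source: "SCOWL" },
  "fr" => { name: "French", word_count: 84_310, license: "MPL 2.0", source: "Grammalecte" }
}.freeze

# Known codes return their entry; anything else gets a placeholder hash
# whose name is the upcased code.
def language_info(language_code)
  LANGUAGE_INFO[language_code] ||
    { name: language_code.upcase, word_count: 0, license: "Unknown", source: "Unknown" }
end

puts language_info("en")[:source]
puts language_info("it")[:name]
puts language_info("it")[:word_count]
```

Callers that need to distinguish a supported language from the placeholder can check `word_count` or `license` on the returned hash, since the method never returns nil.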
#metadata_path_for(resource_id) ⇒ String
Get metadata file path for a resource.
    # File 'lib/kotoshu/cache/language_cache.rb', line 414

    def metadata_path_for(resource_id)
      language = extract_language(resource_id)
      type = extract_type(resource_id)
      File.join(@cache_path, "languages", language, type, "metadata.json")
    end
#resource_dir_for(resource_id) ⇒ String
Get resource directory path.
    # File 'lib/kotoshu/cache/language_cache.rb', line 424

    def resource_dir_for(resource_id)
      language = extract_language(resource_id)
      type = extract_type(resource_id)
      File.join(@cache_path, "languages", language, type)
    end
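To make the on-disk layout concrete, the sketch below maps a resource id to its directory and metadata file. It assumes that the helpers extract_language and extract_type, which are not shown on this page, simply split the id on the colon; `resource_paths` is a hypothetical name and `/tmp/cache` is an arbitrary cache root.

```ruby
# Hypothetical helper: assumes a resource id has the form "language:type".
def resource_paths(cache_path, resource_id)
  language, type = resource_id.split(":", 2)
  dir = File.join(cache_path, "languages", language, type)
  { dir: dir, metadata: File.join(dir, "metadata.json") }
end

paths = resource_paths("/tmp/cache", "fr:grammar")
puts paths[:dir]      # /tmp/cache/languages/fr/grammar
puts paths[:metadata] # /tmp/cache/languages/fr/grammar/metadata.json
```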
#resource_files_exist?(resource_id) ⇒ Boolean
Check if all resource files exist.
    # File 'lib/kotoshu/cache/language_cache.rb', line 434

    def resource_files_exist?(resource_id)
      type = extract_type(resource_id)
      return false unless type

      lang_path = resource_dir_for(resource_id)

      case type
      when "spelling"
        File.exist?(File.join(lang_path, "index.aff")) &&
          File.exist?(File.join(lang_path, "index.dic"))
      when "grammar"
        File.exist?(File.join(lang_path, "rules.yaml"))
      when "frequency"
        File.exist?(File.join(lang_path, "frequency.json"))
      else
        false
      end
    end
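The file names each resource type requires can also be read as a lookup table. The sketch below is a table-driven variant of the same check, not how the class is written; `REQUIRED_FILES` and `files_present?` are hypothetical names, and the file names are taken from the case statement above.

```ruby
require "tmpdir"

# File names required per resource type, as listed in #resource_files_exist?.
REQUIRED_FILES = {
  "spelling" => %w[index.aff index.dic],
  "grammar" => %w[rules.yaml],
  "frequency" => %w[frequency.json]
}.freeze

# True only when the type is known and every required file exists in dir.
def files_present?(dir, type)
  names = REQUIRED_FILES[type]
  return false unless names

  names.all? { |name| File.exist?(File.join(dir, name)) }
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "index.aff"), "")
  puts files_present?(dir, "spelling") # false: index.dic is missing
  File.write(File.join(dir, "index.dic"), "")
  puts files_present?(dir, "spelling") # true
end
```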
#supports_resource?(resource_id) ⇒ Boolean
Check if a resource type is supported.
    # File 'lib/kotoshu/cache/language_cache.rb', line 162

    def supports_resource?(resource_id)
      parts = resource_id.split(":")
      return false unless parts.size == 2

      language, type = parts
      AVAILABLE_LANGUAGES.include?(language) && RESOURCE_TYPES.include?(type)
    end
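A few sample ids show how the check behaves. The sketch copies the two constants and the method body into a standalone script so it runs without the gem; it is an illustration of the logic shown above, not a substitute for calling the method on a LanguageCache instance.

```ruby
RESOURCE_TYPES = %w[spelling grammar frequency].freeze
AVAILABLE_LANGUAGES = %w[de en es fr pt ru].freeze

# Standalone copy of the check: the id must split into exactly a known
# language and a known resource type.
def supports_resource?(resource_id)
  parts = resource_id.split(":")
  return false unless parts.size == 2

  language, type = parts
  AVAILABLE_LANGUAGES.include?(language) && RESOURCE_TYPES.include?(type)
end

p supports_resource?("en:spelling")    # true
p supports_resource?("ja:spelling")    # false: "ja" is not an available language
p supports_resource?("en:thesaurus")   # false: unknown resource type
p supports_resource?("en")             # false: no type part
p supports_resource?("en:spelling:v2") # false: three parts
```

One edge case worth knowing: `String#split` drops trailing empty fields, so an id with a trailing colon such as `"en:spelling:"` still splits into two parts and is accepted.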