Module: Canon::Cache

Defined in:
lib/canon/cache.rb

Overview

Cache for expensive operations during document comparison

Provides thread-safe caching with a per-category size limit to prevent memory bloat. When a category is full, the least recently used (LRU) entry is evicted.

Examples:

Cache a parsed document

key = Cache.key_for_document(xml_string, :xml, :none)
parsed = Cache.fetch(:document_parse, key) { parse_xml(xml_string) }

Clear all caches (e.g., between test cases)

Cache.clear_all

Constant Summary

MAX_CACHE_SIZE = 100

Maximum number of entries per cache category

Class Method Summary

Class Method Details

.clear_all ⇒ Object

Clear all caches

Useful for tests or when memory needs to be freed



# File 'lib/canon/cache.rb', line 54

def clear_all
  @caches&.each_value(&:clear)
  @caches = nil
end

.clear_category(category) ⇒ Object

Clear a specific cache category

Parameters:

  • category (Symbol)

    Cache category to clear



# File 'lib/canon/cache.rb', line 62

def clear_category(category)
  return unless @caches&.key?(category)

  @caches[category]&.clear
end

.fetch(category, key) { ... } ⇒ Object

Fetch a value from cache, or compute and cache it

Parameters:

  • category (Symbol)

    Cache category (:document_parse, :format_detect, etc.)

  • key (String)

    Cache key

Yields:

  • Block to compute value if not cached

Returns:

  • (Object)

    Cached or computed value



# File 'lib/canon/cache.rb', line 28

def fetch(category, key)
  cache = cache_for(category)

  # Check if key exists
  if cache.key?(key)
    # Update access time for LRU
    cache[key][:accessed] = Time.now
    return cache[key][:value]
  end

  # Compute and cache the value
  value = yield

  # Evict oldest entry if cache is full
  if cache.size >= MAX_CACHE_SIZE
    oldest_key = cache.min_by { |_, v| v[:accessed] }&.first
    cache.delete(oldest_key) if oldest_key
  end

  cache[key] = { value: value, accessed: Time.now }
  value
end
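
The eviction path in fetch can be exercised in isolation. The following is a minimal sketch, not the library's code: SketchCache is a hypothetical stand-in, cache_for is assumed to lazily create one Hash per category (it is not shown in the listing above), the limit is lowered to 2, and a counter replaces Time.now so the ordering is deterministic.

```ruby
# Hypothetical stand-in for Canon::Cache, small enough to run on its own.
module SketchCache
  MAX_CACHE_SIZE = 2 # deliberately tiny; the documented limit is 100

  class << self
    def fetch(category, key)
      cache = cache_for(category)

      # Hit: refresh the access stamp and return the stored value
      if cache.key?(key)
        cache[key][:accessed] = tick
        return cache[key][:value]
      end

      # Miss: compute, evict the least recently used entry if full, then store
      value = yield

      if cache.size >= MAX_CACHE_SIZE
        oldest_key = cache.min_by { |_, entry| entry[:accessed] }&.first
        cache.delete(oldest_key) if oldest_key
      end

      cache[key] = { value: value, accessed: tick }
      value
    end

    # Helper for inspection only; not part of the documented API.
    def keys(category)
      cache_for(category).keys
    end

    private

    # Assumption: one Hash per category, created on first use.
    def cache_for(category)
      @caches ||= {}
      @caches[category] ||= {}
    end

    # Monotonic counter used instead of Time.now for deterministic ordering.
    def tick
      @tick = (@tick || 0) + 1
    end
  end
end

SketchCache.fetch(:demo, "a") { "A" }
SketchCache.fetch(:demo, "b") { "B" }
SketchCache.fetch(:demo, "a") { "never called" } # hit: refreshes "a"
SketchCache.fetch(:demo, "c") { "C" }            # full: evicts "b"
```

Because the hit on "a" refreshes its access stamp, "b" is the entry evicted when "c" arrives.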

.key_for_c14n(content, with_comments) ⇒ Object

Generate cache key for XML canonicalization



# File 'lib/canon/cache.rb', line 87

def key_for_c14n(content, with_comments)
  "c14n:#{with_comments}:#{content_hash(content)}"
end

.key_for_document(content, format, preprocessing) ⇒ Object

Generate cache key for document parsing



# File 'lib/canon/cache.rb', line 76

def key_for_document(content, format, preprocessing)
  "doc:#{format}:#{preprocessing}:#{content_hash(content)}"
end
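
The key builders share one pattern: a short prefix, the options that affect the result, and a hash of the content. The sketch below illustrates that pattern for document keys. content_hash is not shown in this listing, so sketch_content_hash (a SHA-256 hex digest) is an assumption, and the :normalize preprocessing value is hypothetical.

```ruby
require "digest"

# Assumption: stand-in for the library's content_hash, which is not shown here.
def sketch_content_hash(content)
  Digest::SHA256.hexdigest(content)
end

# Mirrors the string layout of key_for_document from the listing above.
def sketch_key_for_document(content, format, preprocessing)
  "doc:#{format}:#{preprocessing}:#{sketch_content_hash(content)}"
end

xml = "<root><a/></root>"

# Same content, different preprocessing option: the keys differ,
# so the two parse results would be cached separately.
key_none = sketch_key_for_document(xml, :xml, :none)
key_norm = sketch_key_for_document(xml, :xml, :normalize) # hypothetical option

puts key_none
puts key_norm
```

Putting the options in the key means that changing the format or the preprocessing step cannot return a value cached under different settings.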

.key_for_format_detection(content) ⇒ Object

Generate cache key for format detection



# File 'lib/canon/cache.rb', line 81

def key_for_format_detection(content)
  preview = content[0..100].b
  "fmt:#{content_hash(preview + content.length.to_s)}"
end
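
Unlike the other builders, this key depends only on the first 101 characters of the content plus its length. A minimal sketch of the consequence follows, assuming content_hash is a deterministic digest (SHA-256 is used here as a stand-in): two documents that share a long prefix and have the same length map to the same key.

```ruby
require "digest"

# Mirrors key_for_format_detection; the SHA-256 digest is an assumed stand-in
# for the library's content_hash.
def sketch_key_for_format_detection(content)
  preview = content[0..100].b # first 101 characters, as a binary string
  "fmt:#{Digest::SHA256.hexdigest(preview + content.length.to_s)}"
end

prefix = "x" * 101
doc_a = prefix + "<a/>"
doc_b = prefix + "<b/>"   # same prefix, same length, different tail
doc_c = prefix + "<abc/>" # same prefix, different length

puts sketch_key_for_format_detection(doc_a) == sketch_key_for_format_detection(doc_b)
puts sketch_key_for_format_detection(doc_a) == sketch_key_for_format_detection(doc_c)
```

This keeps hashing cheap for large inputs, on the premise that a format is normally recognizable from the start of a document.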

.key_for_preprocessing(content, preprocessing) ⇒ Object

Generate cache key for preprocessing



# File 'lib/canon/cache.rb', line 92

def key_for_preprocessing(content, preprocessing)
  "pre:#{preprocessing}:#{content_hash(content)}"
end

.stats ⇒ Hash

Get cache statistics

Returns:

  • (Hash)

    Number of entries per cache category; empty if nothing has been cached



# File 'lib/canon/cache.rb', line 71

def stats
  @caches&.transform_values(&:size) || {}
end
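
The shape of the returned Hash follows directly from the expression above. As a small sketch, sketch_stats below applies the same expression to a hand-built structure; the method name and the sample keys are hypothetical, and the nested layout is assumed to match what fetch stores.

```ruby
# Same expression as stats, applied to an explicit argument instead of @caches.
def sketch_stats(caches)
  caches&.transform_values(&:size) || {}
end

sample = {
  document_parse: {
    "doc:xml:none:abc" => { value: :parsed_a, accessed: Time.now },
    "doc:xml:none:def" => { value: :parsed_b, accessed: Time.now }
  },
  format_detect: {}
}

p sketch_stats(sample) # one entry count per category
p sketch_stats(nil)    # nothing cached yet, or after clear_all
```

Since clear_all resets the underlying store to nil, calling stats right after it yields an empty Hash rather than zero counts per category.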