Class: Pikuri::UrlCache
- Inherits:
- Object (hierarchy: Object → Pikuri::UrlCache)
- Defined in:
- lib/pikuri/url_cache.rb
Overview
On-disk cache for string-keyed text payloads. Used by the bundled tools to avoid re-fetching the same page or re-issuing the same web-search query within a TTL window: Tool::WebScrape.visit caches the rendered Markdown for a URL, and Tool::Search::Engines.search caches the rendered result list for a query (the query string itself acts as the key — keys are SHA-256 hashed, so any opaque string works).
Each tool wires its own UrlCache instance against a dedicated subdirectory under ROOT_DIR, so a web_search query string and a web_scrape URL string can never collide on the same cache file. There is no global default singleton — pass a fresh instance to whichever code needs caching, or use NULL to disable caching entirely.
One file per entry, named <sha256>.txt under #initialize’s dir. Freshness is tracked via the file’s mtime; there is no sidecar metadata. Stale entries are simply overwritten the next time #fetch is called with the same key. To clear the cache, rm -rf the directory.
Not thread-safe: if two callers race on the same cold key, both compute and both write the same file. That is the intended tradeoff to keep this under a few dozen lines — the worst-case cost is a duplicate fetch.
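The per-tool isolation described above can be sketched without touching the real cache. This is a minimal illustration, not pikuri's code: the `entry_path` lambda and the temporary root are hypothetical, and `web_search` is an assumed subdirectory name (only `web_scrape` is named on this page).

```ruby
require 'digest'
require 'tmpdir'

# Hypothetical root for the sketch; pikuri itself uses UrlCache::ROOT_DIR.
root = File.join(Dir.tmpdir, 'url_cache_sketch')

# Same naming rule as #path_for: <dir>/<sha256 of key>.txt
entry_path = lambda do |dir, key|
  File.join(dir, "#{Digest::SHA256.hexdigest(key)}.txt")
end

key = 'ruby url cache'
# 'web_search' is an assumed subdirectory name for illustration.
scrape_path = entry_path.call(File.join(root, 'web_scrape'), key)
search_path = entry_path.call(File.join(root, 'web_search'), key)

# Identical key, identical file name, but two distinct files on disk.
same_name = File.basename(scrape_path) == File.basename(search_path)
distinct_files = scrape_path != search_path
```

Because the directory is part of the path, two tools that happen to use the same key string still read and write separate entries.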
Constant Summary
- ROOT_DIR =
Root directory under which per-tool cache subdirectories live. Follows the XDG Base Directory spec: $XDG_CACHE_HOME/pikuri/url_cache if the env var is set to a non-empty value, else ~/.cache/pikuri/url_cache. Each tool picks its own subdir (e.g. "#{ROOT_DIR}/web_scrape") so keys from different tools cannot collide. The directory is created lazily on first cache write; pikuri does not pre-create it.
begin
  xdg = ENV['XDG_CACHE_HOME']
  cache_home = xdg && !xdg.empty? ? xdg : File.join(Dir.home, '.cache')
  File.join(cache_home, 'pikuri', 'url_cache')
end.freeze
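To see how the resolution rule behaves for a set, empty, or missing XDG_CACHE_HOME, the same expression can be wrapped in a function that takes its inputs explicitly. `cache_root` is a hypothetical helper introduced only for this sketch, and the paths are made up.

```ruby
# Hypothetical helper: the same resolution rule as ROOT_DIR, but with the
# environment and home directory passed in so no real state is consulted.
def cache_root(env, home)
  xdg = env['XDG_CACHE_HOME']
  cache_home = xdg && !xdg.empty? ? xdg : File.join(home, '.cache')
  File.join(cache_home, 'pikuri', 'url_cache')
end

with_xdg  = cache_root({ 'XDG_CACHE_HOME' => '/tmp/xdg' }, '/home/alice')
empty_xdg = cache_root({ 'XDG_CACHE_HOME' => '' }, '/home/alice')
unset_xdg = cache_root({}, '/home/alice')
```

An empty XDG_CACHE_HOME is treated the same as an unset one, so both fall back to the `.cache` directory under the home directory.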
- DEFAULT_TTL =
Default freshness window: 2 hours, in seconds.
Long enough to cover a single interactive session: revisiting a scraped page or re-running a similar search within the same working window hits the cache. Short enough that resuming the next day doesn’t serve stale news, docs, or search results. Reference points: opencode keeps no cache, the pi-web-fetch community extension uses 15 minutes, pi-web-search uses 5; 2 hours sits comfortably above the “single follow-up” window those numbers are aimed at without holding content across days.
2 * 60 * 60
- NULL =
Null cache: a drop-in replacement that always misses and never persists. Use this in tests (or anywhere else you want caching off) without giving up the #fetch contract.
Object.new
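The listing shows only `Object.new` as the value, so how NULL satisfies the #fetch contract is not visible on this page. One plausible shape, offered purely as an assumption, is an object whose `fetch` ignores the key and always yields:

```ruby
# Assumption: this is NOT taken from pikuri's source. It is one way an
# always-miss, never-persist cache could honour the #fetch contract.
null_cache = Object.new
null_cache.define_singleton_method(:fetch) { |_key, &block| block.call }

calls = 0
first = null_cache.fetch('https://example.com/') do
  calls += 1
  'payload'
end
second = null_cache.fetch('https://example.com/') do
  calls += 1
  'payload'
end
```

Under this sketch the block runs on every call, which is what "always misses" implies, and nothing is written to disk.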
Instance Method Summary
- #fetch(url) ⇒ String
Return the cached payload for url if a fresh entry exists, otherwise yield to compute it, persist the result, and return it.
- #fresh?(path) ⇒ Boolean
True when path exists and was written within the TTL window.
- #initialize(ttl:, dir:) ⇒ UrlCache
constructor
A new instance of UrlCache.
- #path_for(url) ⇒ String
Absolute path of the cache file for url.
Constructor Details
#initialize(ttl:, dir:) ⇒ UrlCache
Returns a new instance of UrlCache.
# File 'lib/pikuri/url_cache.rb', line 60
def initialize(ttl:, dir:)
  @ttl = ttl
  @dir = dir
end
Instance Method Details
#fetch(url) ⇒ String
Return the cached payload for url if a fresh entry exists, otherwise yield to compute it, persist the result, and return it.
The block is only invoked on a miss. If the block raises, no file is written — errors are not cached.
# File 'lib/pikuri/url_cache.rb', line 74
def fetch(url)
  path = path_for(url)
  return File.read(path) if fresh?(path)

  content = yield
  FileUtils.mkdir_p(@dir)
  File.write(path, content)
  content
end
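The miss, hit, and error paths can be exercised with a small stand-in. `SketchUrlCache` is a hypothetical class that copies the method bodies listed on this page so the example runs without pikuri installed; the URLs, payloads, and temporary directory are illustrative.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Stand-in mirroring the listed method bodies; not the real Pikuri::UrlCache.
class SketchUrlCache
  def initialize(ttl:, dir:)
    @ttl = ttl
    @dir = dir
  end

  def fetch(url)
    path = path_for(url)
    return File.read(path) if fresh?(path)

    content = yield
    FileUtils.mkdir_p(@dir)
    File.write(path, content)
    content
  end

  def fresh?(path)
    File.exist?(path) && Time.now - File.mtime(path) < @ttl
  end

  def path_for(url)
    File.join(@dir, "#{Digest::SHA256.hexdigest(url)}.txt")
  end
end

cache = SketchUrlCache.new(ttl: 60, dir: Dir.mktmpdir('fetch_sketch'))
calls = 0

# First call misses: the block runs and its result is written to disk.
miss = cache.fetch('https://example.com/') do
  calls += 1
  'rendered markdown'
end

# Second call hits: the block is skipped and the stored text is returned.
hit = cache.fetch('https://example.com/') do
  calls += 1
  'never computed'
end

# A raising block propagates the error and leaves no file behind.
begin
  cache.fetch('https://example.com/broken') { raise IOError, 'network down' }
rescue IOError
  # Expected: nothing was cached for this key.
end
wrote_failed_entry = File.exist?(cache.path_for('https://example.com/broken'))
```

Since the write happens only after the block returns, a failed computation is retried on the next call rather than being served from the cache.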
#fresh?(path) ⇒ Boolean
Returns true when path exists and was written within the TTL window.
# File 'lib/pikuri/url_cache.rb', line 87
def fresh?(path)
  File.exist?(path) && Time.now - File.mtime(path) < @ttl
end
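Because freshness depends only on the file's mtime, expiry can be simulated by backdating a file with `File.utime`. The sketch below restates the predicate as a lambda with an explicit TTL; the `fresh` lambda, file name, and temporary directory are hypothetical.

```ruby
require 'tmpdir'

# Same predicate as #fresh?, with the TTL passed in explicitly.
fresh = lambda do |path, ttl|
  File.exist?(path) && Time.now - File.mtime(path) < ttl
end

dir = Dir.mktmpdir('fresh_sketch')
path = File.join(dir, 'entry.txt')
ttl = 2 * 60 * 60 # two hours, matching DEFAULT_TTL

missing = fresh.call(path, ttl) # no file yet
File.write(path, 'payload')
just_written = fresh.call(path, ttl)

# Backdate atime and mtime by three hours to simulate an expired entry.
three_hours_ago = Time.now - (3 * 60 * 60)
File.utime(three_hours_ago, three_hours_ago, path)
expired = fresh.call(path, ttl)
```

The same trick is handy in tests that need a stale entry without sleeping for the full TTL.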
#path_for(url) ⇒ String
Returns the absolute path of the cache file for url.
# File 'lib/pikuri/url_cache.rb', line 93
def path_for(url)
  File.join(@dir, "#{Digest::SHA256.hexdigest(url)}.txt")
end
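Hashing the key is what lets any opaque string, whether a URL with query parameters or a free-text search query, map to a fixed-length, filesystem-safe file name. A brief sketch, with a hypothetical `name_for` lambda and made-up keys:

```ruby
require 'digest'

# Same file-name rule as #path_for, without the directory part.
name_for = ->(key) { "#{Digest::SHA256.hexdigest(key)}.txt" }

url_name   = name_for.call('https://example.com/a?b=c&d=e')
query_name = name_for.call('best ruby http clients')

# Every name is 64 hex characters plus the .txt extension,
# regardless of the key's length or the characters it contains.
well_formed = [url_name, query_name].all? { |n| n.match?(/\A\h{64}\.txt\z/) }

# The mapping is deterministic: the same key always yields the same name.
stable = name_for.call('https://example.com/a?b=c&d=e') == url_name
```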