Module: Relaton::W3c::SafeRealize
- Included in:
- DataFetcher, DataParser
- Defined in:
- lib/relaton/w3c/safe_realize.rb
Overview
Thin wrapper over lutaml-hal's `realize`. Successfully realized objects are already cached by w3c_api, keyed by URL, so this module only remembers resources that failed terminally and returns nil for them. That way one broken link doesn't abort the crawl and isn't re-fetched on every reference.
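Transient failures are retried upstream: w3c_api retries HTTP 403 (the W3C rate-limit signal) and connection/timeout errors, and lutaml-hal retries 429 and 5xx. By the time an error surfaces here, those retries are exhausted. Not-found and other HTTP-level errors are treated as terminal and the resource is skipped from then on; network-level errors return nil without being remembered, so a later reference can try again.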
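The skip-on-terminal-failure behaviour described above can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical and exists only for illustration: `NegativeCacheSketch`, `safe_fetch`, `Crawler`, `TransientError` and `TerminalError` are stand-ins, not the real lutaml-hal error classes or the actual module.

```ruby
# Hypothetical stand-ins for illustration; not the real lutaml-hal classes.
class TransientError < StandardError; end
class TerminalError < StandardError; end

module NegativeCacheSketch
  # Module-level hash of hrefs that failed terminally.
  @skipped = {}

  class << self
    attr_reader :skipped
  end

  # Runs the given block to fetch a resource and returns nil instead of
  # raising. Only terminal failures are remembered.
  def safe_fetch(href)
    return nil if NegativeCacheSketch.skipped.key?(href)

    yield
  rescue TransientError
    # Not remembered, so a later reference to the same href tries again.
    nil
  rescue TerminalError
    NegativeCacheSketch.skipped[href] = true
    nil
  end
end

# Hypothetical includer, standing in for a class such as DataFetcher.
class Crawler
  include NegativeCacheSketch
end
```

Note that only failures are stored here. Successful results are not cached by this sketch, mirroring the overview's statement that success caching is left to w3c_api.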
Class Method Summary
Instance Method Summary
Class Method Details
.skipped ⇒ Object
# File 'lib/relaton/w3c/safe_realize.rb', line 21
def self.skipped
  @skipped
end
Instance Method Details
#realize(obj, parent_resource: nil) ⇒ Object
# File 'lib/relaton/w3c/safe_realize.rb', line 29
def realize(obj, parent_resource: nil)
  href = resolve_href(obj)
  return nil if SafeRealize.skipped.key?(href)

  obj.realize(parent_resource: parent_resource)
rescue Lutaml::Hal::ConnectionError, Lutaml::Hal::TimeoutError,
       Faraday::Error, Net::OpenTimeout => e
  # Network-level failure (already retried by w3c_api). The resource itself
  # is fine, so don't skip it permanently; a later reference can try again.
  Util.warn "Failed to realize object: #{href}, error: #{e.message}"
  nil
rescue Lutaml::Hal::NotFoundError
  Util.warn "Object not found: #{href}"
  SafeRealize.skipped[href] = true
  nil
rescue Lutaml::Hal::Error => e
  # Definitive upstream error (403 rate-limit, 5xx, 429) already retried by
  # w3c_api / lutaml-hal. Skip the broken/unavailable resource rather than
  # re-hitting it for every link that references it.
  Util.warn "Skipping #{href}, upstream error after retries: #{e.message}"
  SafeRealize.skipped[href] = true
  nil
end
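Because `#realize` returns nil on any failure, callers need to tolerate nil results. A minimal caller-side sketch follows, assuming a hypothetical `realize_all` helper and a `StubRealizer` that stands in for an object including this module. Neither is part of relaton-w3c, and the example.org URLs are placeholders.

```ruby
# Hypothetical caller-side helper; `realizer` stands in for any object that
# responds to #realize the way an includer of SafeRealize does.
def realize_all(links, realizer)
  # filter_map (Ruby 2.7+) drops the nil results from failed or skipped links.
  links.filter_map { |link| realizer.realize(link) }
end

# Stand-in realizer for demonstration only: returns nil for "broken" links
# instead of performing any HTTP request.
StubRealizer = Struct.new(:broken) do
  def realize(link, parent_resource: nil)
    broken.include?(link) ? nil : "resource for #{link}"
  end
end

realizer = StubRealizer.new(["https://example.org/broken"])
links = [
  "https://example.org/a",
  "https://example.org/broken",
  "https://example.org/b"
]
resources = realize_all(links, realizer)
```

With this shape, a broken link simply drops out of the result list and the remaining links are still processed.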