Module: NEU::MODS::Canonicalize
Defined in: lib/neu/mods/canonicalize.rb
Overview
Lightweight whitespace canonicalization used by the *no-op guard*, which asks whether an edit actually changes anything or only touches insignificant whitespace. (Cerberus’s MODSMerge uses this to avoid minting an unchanged OCFL MODS version.) This is deliberately distinct from TextNormalizer below: this module only folds whitespace, whereas TextNormalizer cleans curator freetext for the access copy.
Constant Summary
- NBSP = [0xA0].pack("U")
  U+00A0 non-breaking space, built from its codepoint.
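A quick way to see why the constant exists is to test a lone NBSP against Ruby's whitespace classes. This is a minimal sketch; the local variable `nbsp` is illustrative only and simply repeats the construction shown above.

```ruby
# Illustrative local, built the same way as the NBSP constant.
nbsp = [0xA0].pack("U")

puts nbsp == "\u00A0"            # true: the packed codepoint equals the literal escape
puts nbsp.encoding               # UTF-8: pack("U") yields a UTF-8 string
puts nbsp.match?(/\s/)           # false: \s covers ASCII whitespace only
puts nbsp.match?(/[[:space:]]/)  # true: the POSIX bracket class is Unicode-aware
```

Because `\s` misses the character, a plain `gsub(/\s+/, " ")` would leave NBSP in place, which is why it has to be folded to a space separately.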
Class Method Summary
- .canonical_ws(str) ⇒ Object
  \s doesn’t match U+00A0 (NBSP) in Ruby’s default mode (it covers ASCII whitespace only), so fold NBSP to a plain space first, then collapse any whitespace run to one space and strip.
- .whitespace_equivalent?(current, incoming) ⇒ Boolean
  Treat values differing only by insignificant whitespace (NBSP vs space, collapsible runs, leading/trailing) as equal.
Class Method Details
.canonical_ws(str) ⇒ Object
\s doesn’t match U+00A0 (NBSP) in Ruby’s default mode (it covers ASCII whitespace only), so fold NBSP to a plain space first, then collapse any whitespace run to one space and strip.
# File 'lib/neu/mods/canonicalize.rb', line 17
def canonical_ws(str)
  str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
end
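The fold-then-collapse behaviour can be tried in isolation. The sketch below copies the method body into a hypothetical stand-in module, `CanonicalWsSketch`, so that it runs without the real library. The use of `module_function` is an assumption about how the real module exposes its class methods, and the sample strings are invented.

```ruby
# Hypothetical stand-in for NEU::MODS::Canonicalize so the snippet runs on its own.
module CanonicalWsSketch
  NBSP = [0xA0].pack("U")

  module_function # assumption: the real module exposes module-level methods

  def canonical_ws(str)
    str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
  end
end

samples = [
  "  Boston,\u00A0Mass.  ", # NBSP between words, padded with spaces
  "Boston,\n\tMass.",       # newline followed by a tab
  nil                       # to_s turns nil into an empty string
]

results = samples.map { |s| CanonicalWsSketch.canonical_ws(s) }
results.each { |r| p r }
# prints "Boston, Mass." twice, then ""
```

The leading `to_s` is what makes `nil` (or any non-string value) safe to pass in.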
.whitespace_equivalent?(current, incoming) ⇒ Boolean
Treat values differing only by insignificant whitespace (NBSP vs space, collapsible runs, leading/trailing) as equal.
# File 'lib/neu/mods/canonicalize.rb', line 23
def whitespace_equivalent?(current, incoming)
  canonical_ws(current) == canonical_ws(incoming)
end
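As a rough illustration of the no-op guard described in the overview, the sketch below wraps the comparison in a hypothetical helper, `value_to_write`, that returns `nil` when an incoming value differs from the current one only by whitespace. The module name `NoOpGuardSketch`, the helper, and the sample strings are assumptions made for the example; they are not part of the documented API, and this is not how MODSMerge is known to call it.

```ruby
# Hypothetical stand-in module; method bodies are copied from the listings above.
module NoOpGuardSketch
  NBSP = [0xA0].pack("U")

  module_function # assumption: the real module exposes module-level methods

  def canonical_ws(str)
    str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
  end

  def whitespace_equivalent?(current, incoming)
    canonical_ws(current) == canonical_ws(incoming)
  end

  # Hypothetical caller: the value to store, or nil when nothing real changed.
  def value_to_write(current, incoming)
    whitespace_equivalent?(current, incoming) ? nil : incoming
  end
end

p NoOpGuardSketch.whitespace_equivalent?("Title\u00A0One", " Title   One ") # true
p NoOpGuardSketch.whitespace_equivalent?("Title One", "Title Two")          # false
p NoOpGuardSketch.whitespace_equivalent?("TitleOne", "Title One")           # false
p NoOpGuardSketch.value_to_write("Title One", "Title  One")                 # nil
```

Note that only whitespace runs are folded: adding or removing whitespace between two characters, as in the third call, still counts as a real change, and the comparison stays case-sensitive.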