Module: NEU::MODS::Canonicalize

Defined in:
lib/neu/mods/canonicalize.rb

Overview

Lightweight whitespace canonicalization used by the *no-op guard*, which asks whether an edit actually changes anything or only touches insignificant whitespace. (Cerberus’s MODSMerge uses this to avoid minting an unchanged OCFL MODS version.) This is deliberately distinct from TextNormalizer below: this module only folds whitespace, whereas TextNormalizer cleans curator freetext for the access copy.

Constant Summary

NBSP =

U+00A0 non-breaking space, built from its codepoint

[0xA0].pack("U")

Class Method Summary

Class Method Details

.canonical_ws(str) ⇒ Object

\s doesn’t match U+00A0 (NBSP) in Ruby’s default mode, so fold NBSP to a plain space first, then collapse each whitespace run to a single space and strip leading and trailing whitespace.



# File 'lib/neu/mods/canonicalize.rb', line 17

def canonical_ws(str)
  str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
end
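
A minimal sketch of why the NBSP fold has to come first, and of what the method returns for a few inputs. The module name CanonicalizeSketch and the use of module_function are assumptions made so the example runs on its own; only the body of canonical_ws and the NBSP construction are taken from the source above.

```ruby
# Stand-in for the documented module. The name and `module_function`
# are assumptions for this illustration, not the real definition.
module CanonicalizeSketch
  NBSP = [0xA0].pack("U") # U+00A0, built the same way as the documented constant

  module_function

  def canonical_ws(str)
    str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
  end
end

nbsp = CanonicalizeSketch::NBSP
raw  = "  Title#{nbsp}#{nbsp}of \t the\n work "

# In a plain regexp literal, \s covers only ASCII whitespace, so NBSP slips through.
puts nbsp.match?(/\s/)          # false
# The POSIX bracket class is Unicode-aware and does match it.
puts nbsp.match?(/[[:space:]]/) # true

puts CanonicalizeSketch.canonical_ws(raw).inspect # "Title of the work"
puts CanonicalizeSketch.canonical_ws(nil).inspect # "" (to_s makes nil safe)
```

Folding NBSP with tr before the gsub keeps the regexp simple. Matching /[[:space:]]+/ instead would be an alternative, but it would also fold other Unicode spaces, which may or may not be wanted here.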

.whitespace_equivalent?(current, incoming) ⇒ Boolean

Treat values differing only by insignificant whitespace (NBSP vs space, collapsible runs, leading/trailing) as equal.

Returns:

  • (Boolean)


# File 'lib/neu/mods/canonicalize.rb', line 23

def whitespace_equivalent?(current, incoming)
  canonical_ws(current) == canonical_ws(incoming)
end
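
A hedged sketch of how a no-op guard might use this predicate. NoOpGuardSketch, module_function, and the changed_fields helper are hypothetical names introduced for illustration; the source only says that MODSMerge uses the comparison to avoid minting an unchanged version, not how.

```ruby
# Stand-in for the documented module. The name, `module_function`, and
# `changed_fields` are assumptions for this illustration.
module NoOpGuardSketch
  NBSP = [0xA0].pack("U")

  module_function

  def canonical_ws(str)
    str.to_s.tr(NBSP, " ").gsub(/\s+/, " ").strip
  end

  def whitespace_equivalent?(current, incoming)
    canonical_ws(current) == canonical_ws(incoming)
  end

  # Hypothetical helper: keys whose incoming value differs from the
  # current one by more than insignificant whitespace.
  def changed_fields(current, incoming)
    incoming.reject { |key, value| whitespace_equivalent?(current[key], value) }.keys
  end
end

current  = { title: "Annual report", note: "Draft" }
incoming = { title: " Annual#{NoOpGuardSketch::NBSP}report ", note: "Final" }

changed = NoOpGuardSketch.changed_fields(current, incoming)
puts changed.inspect # [:note]
puts "no real change, skip writing a new version" if changed.empty?
```

Only note is reported as changed: the title differs by an NBSP and surrounding spaces, which the predicate treats as equal.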