Module: NEU::MODS::Projection

Included in:
Document
Defined in:
lib/neu/mods/projection.rb

Overview

Node -> plain data. The read contract: what a MODS document *projects to* for indexing/display. Behavior-preserving with Atlas's prior `mods`-gem-based extraction (verified by the conformance corpus), reimplemented in Nokogiri so DRS depends on Nokogiri alone. Mixed into Document; operates on `doc`.

Empty-value conventions mirror Atlas: scalar fields are "" when absent (matching `.text.squish` on an empty node set), except `permanent_url` and `date_created`, which are nil when their node is absent. Arrays are [].

Class Method Details

.compose_title(parts) ⇒ Object

Pure title composition over a parts hash, factored out of #plain_title so that callers that already hold the parts (e.g. Atlas's access-copy model) can compose the display title without re-parsing XML on the read path; reaching for Nokogiri in a decorator is the smell this avoids. Keys: :non_sort, :title, :subtitle, :part_name, :part_number (nil or "" when absent). Returns "" when there is no title. Exposed as NEU::MODS.compose_title.



# File 'lib/neu/mods/projection.rb', line 43

def self.compose_title(parts)
  return "" if parts[:title].to_s.strip.empty?

  optional = { ": " => parts[:subtitle], " - " => parts[:part_name], ", " => parts[:part_number] }
  suffix = optional.filter_map { |sep, val| "#{sep}#{val}" unless val.to_s.strip.empty? }.join
  "#{parts[:non_sort]}#{parts[:title]}#{suffix}"
end
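
As a rough illustration of how the separators combine, the sketch below copies the method body into a hypothetical `TitleSketch` module so that it runs without Nokogiri or the gem loaded. The module name and the sample parts are assumptions made for this example and are not part of the library.

```ruby
# Hypothetical stand-in module; the body mirrors compose_title as listed above.
module TitleSketch
  def self.compose_title(parts)
    return "" if parts[:title].to_s.strip.empty?

    # Each optional part carries its own leading separator and is skipped when blank.
    optional = { ": " => parts[:subtitle], " - " => parts[:part_name], ", " => parts[:part_number] }
    suffix = optional.filter_map { |sep, val| "#{sep}#{val}" unless val.to_s.strip.empty? }.join
    "#{parts[:non_sort]}#{parts[:title]}#{suffix}"
  end
end

# Sample data, invented for illustration.
full = TitleSketch.compose_title(
  non_sort: "The ", title: "Atlas", subtitle: "A History", part_name: "Maps", part_number: "Vol. 2"
)
sparse = TitleSketch.compose_title(
  non_sort: nil, title: "Atlas", subtitle: "", part_name: nil, part_number: "Vol. 2"
)
untitled = TitleSketch.compose_title(subtitle: "Orphaned subtitle")

puts full             # The Atlas: A History - Maps, Vol. 2
puts sparse           # Atlas, Vol. 2
puts untitled.inspect # ""
```

Note that :non_sort is concatenated as-is, so any space between it and the title has to come from the part itself.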

Instance Method Details

#abstract ⇒ Object

Abstract / access

# File 'lib/neu/mods/projection.rb', line 53

def abstract
  join_paragraphs(abstract_nodes)
end

#access_condition ⇒ Object

# File 'lib/neu/mods/projection.rb', line 57

def access_condition
  join_paragraphs(doc.xpath("/mods:mods/mods:accessCondition", NAMESPACE))
end

#date_created ⇒ Object

Parsed dateCreated. Returns nil if there is no originInfo/dateCreated or its text is blank, and "" if the text is present but unparseable (mirrors Atlas's safe_date_parse rescue).



# File 'lib/neu/mods/projection.rb', line 122

def date_created
  node = doc.at_xpath("/mods:mods/mods:originInfo/mods:dateCreated", NAMESPACE)
  return nil unless node

  str = NEU::MODS.canonical_ws(node.text)
  return nil if str.empty?

  begin
    DateTime.parse(str)
  rescue Date::Error
    ""
  end
end
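
The three outcomes (nil, "", or a parsed DateTime) can be seen in a standalone sketch that takes the element text directly instead of a Nokogiri node. The helper name `sketch_date_created` is hypothetical, and the whitespace collapse is only an assumed stand-in for NEU::MODS.canonical_ws, whose exact behavior is not shown here.

```ruby
require "date"

# Hypothetical helper: raw_text is the element text, or nil when the element is absent.
def sketch_date_created(raw_text)
  return nil if raw_text.nil?

  # Assumed approximation of NEU::MODS.canonical_ws: trim and collapse whitespace.
  str = raw_text.strip.gsub(/\s+/, " ")
  return nil if str.empty?

  begin
    DateTime.parse(str)
  rescue Date::Error
    "" # present but unparseable
  end
end

absent      = sketch_date_created(nil)
blank       = sketch_date_created("  \n ")
parsed      = sketch_date_created(" 2019-03-14 ")
unparseable = sketch_date_created("not a date")

p absent       # nil
p blank        # nil
p parsed.year  # 2019
p unparseable  # ""
```

Callers therefore need to distinguish nil (nothing to show) from "" (something was recorded but could not be parsed) before treating the value as a date.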

#digital_origin ⇒ Object

# File 'lib/neu/mods/projection.rb', line 100

def digital_origin = text_at("/mods:mods/mods:physicalDescription/mods:digitalOrigin")

#extent ⇒ Object

# File 'lib/neu/mods/projection.rb', line 99

def extent = text_at("/mods:mods/mods:physicalDescription/mods:extent")

#format ⇒ Object

# File 'lib/neu/mods/projection.rb', line 98

def format = text_at("/mods:mods/mods:physicalDescription/mods:form")

#genres ⇒ Object

# File 'lib/neu/mods/projection.rb', line 102

def genres
  doc.xpath("/mods:mods/mods:genre", NAMESPACE).map { |g| clean(g.text) }
end

#identifiers ⇒ Object

# File 'lib/neu/mods/projection.rb', line 111

def identifiers
  doc.xpath("/mods:mods/mods:identifier", NAMESPACE).map { |i| clean(i.text) }
end

#keywords ⇒ Object

The editable free-text keyword set (Cerberus simple form): topics under the attribute-free keyword subjects only.



# File 'lib/neu/mods/projection.rb', line 65

def keywords
  keyword_subjects.flat_map { |s| s.xpath("mods:topic", NAMESPACE).map { |t| t.text.strip } }
end

#languages ⇒ Object

Scalars / simple arrays

# File 'lib/neu/mods/projection.rb', line 89

def languages
  doc.xpath("/mods:mods/mods:language", NAMESPACE).map do |lang|
    term = lang.at_xpath("mods:languageTerm[@type='text']", NAMESPACE) ||
           lang.at_xpath("mods:languageTerm", NAMESPACE)
    clean(term&.text)
  end.compact
end

#names ⇒ Object

All top-level names as { name:, role: }. `name` reproduces the `mods` gem's display_value_w_date, quirks included, so that existing Solr/display output is preserved. `role` prefers the type="text" roleTerm, falling back to the raw code (NOT MARC-relator-translated; see README).



# File 'lib/neu/mods/projection.rb', line 81

def names
  doc.xpath("/mods:mods/mods:name", NAMESPACE).map do |node|
    { name: name_display_value_w_date(node), role: name_role(node) }
  end
end

#permanent_url ⇒ Object

# File 'lib/neu/mods/projection.rb', line 115

def permanent_url
  node = doc.at_xpath("/mods:mods/mods:identifier[@type='hdl']", NAMESPACE)
  node && clean(node.text)
end

#plain_title ⇒ Object

Composed display title (the former Atlas MODSDecoration#plain_title), driven off the scoped primary title.



# File 'lib/neu/mods/projection.rb', line 33

def plain_title
  Projection.compose_title(title_parts)
end


#related_series ⇒ Object

# File 'lib/neu/mods/projection.rb', line 106

def related_series
  doc.xpath("/mods:mods/mods:relatedItem[@type='series']/mods:titleInfo/mods:title", NAMESPACE)
     .map { |t| clean(t.text) }
end

#resource_type ⇒ Object

# File 'lib/neu/mods/projection.rb', line 97

def resource_type = text_at("/mods:mods/mods:typeOfResource")

#title_parts ⇒ Object

Structured primary-title parts. A part is nil when absent (the Cerberus form treats nil as "not present"); to_h coerces nil to "" for the Atlas main_title.



# File 'lib/neu/mods/projection.rb', line 20

def title_parts
  ti = primary_title_info
  {
    non_sort: child_text(ti, "mods:nonSort"),
    subtitle: child_text(ti, "mods:subTitle"),
    title: child_text(ti, "mods:title"),
    part_name: child_text(ti, "mods:partName"),
    part_number: child_text(ti, "mods:partNumber")
  }
end

#to_h ⇒ Object

The complete read projection, keyed to Atlas's Metadata::MODS attribute names: a drop-in source for `convert_xml_to_json`.



# File 'lib/neu/mods/projection.rb', line 140

def to_h
  {
    main_title: title_parts.transform_values(&:to_s),
    names: names,
    languages: languages,
    date_created: date_created,
    resource_type: resource_type,
    genres: genres,
    format: format,
    extent: extent,
    digital_origin: digital_origin,
    abstract: abstract,
    related_series: related_series,
    topical_subjects: topical_subjects,
    identifiers: identifiers,
    permanent_url: permanent_url,
    access_condition: access_condition
  }
end
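
The main_title entry is the one place where the nil convention of #title_parts is converted to the "" convention. A minimal sketch of that coercion follows, using a hand-written parts hash whose values are invented for illustration rather than taken from a real document.

```ruby
# Sample hash shaped like the output of title_parts; values are made up.
parts = { non_sort: "The ", subtitle: nil, title: "Atlas", part_name: nil, part_number: nil }

# Same coercion as in to_h: nil.to_s is "", strings are unchanged.
main_title = parts.transform_values(&:to_s)

p parts[:subtitle]      # nil
p main_title[:subtitle] # ""
p main_title[:title]    # "Atlas"
```

Because transform_values returns a new hash, the original parts keep their nil values for callers that rely on nil meaning "not present".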

#topical_subjects ⇒ Object

Every <topic> under any top-level <subject> (the access-copy projection, equivalent to Atlas’s extract_topical_subjects).



# File 'lib/neu/mods/projection.rb', line 71

def topical_subjects
  doc.xpath("/mods:mods/mods:subject/mods:topic", NAMESPACE).map { |t| clean(t.text) }
end