Module: NEU::MODS::Projection
- Included in: Document
- Defined in: lib/neu/mods/projection.rb
Overview
Node -> plain data. The read contract: what a MODS document *projects to* for indexing/display. Behavior-preserving with Atlas’s prior `mods`-gem-based extraction (verified by the conformance corpus), reimplemented in Nokogiri so DRS depends on Nokogiri alone. Mixed into Document; operates on `doc`.
Empty-value conventions mirror Atlas: scalar fields are "" when absent (matching `.text.squish` on an empty node set), except `permanent_url` and `date_created`, which are nil when their node is absent (`date_created` is also nil when the node is present but blank). Array fields are [] when empty.
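Because absence shows up as "", nil, or [] depending on the field, a consumer may want a single predicate for "no value". The following is a minimal sketch under assumptions: `projected_value_absent?` is a hypothetical helper that is not part of NEU::MODS::Projection, and the sample hash is hand-written to follow the conventions above rather than captured from the module.

```ruby
# Hypothetical consumer-side helper (not part of NEU::MODS::Projection).
# Treats nil, "" and [] uniformly as "no value", so callers need not
# remember which convention each field follows.
def projected_value_absent?(value)
  value.nil? || (value.respond_to?(:empty?) && value.empty?)
end

# Hand-written sample shaped like the conventions described above;
# it is NOT output captured from the module.
sample = {
  resource_type: "",   # absent scalar
  permanent_url: nil,  # absent, nil by exception
  date_created: nil,   # absent, nil by exception
  genres: []           # absent array
}

absent_keys = sample.select { |_key, value| projected_value_absent?(value) }.keys
```

Note that this helper would also report an unparseable `date_created` (which is "") as absent; a caller that needs to tell "missing" from "present but unparseable" should check for nil explicitly.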
Class Method Summary
- .compose_title(parts) ⇒ Object
  Pure title composition over a parts hash, factored out of #plain_title so callers that already hold the parts can compose the display title without re-parsing XML.
Instance Method Summary
- #abstract ⇒ Object
- #access_condition ⇒ Object
- #date_created ⇒ Object
  Parsed dateCreated, or nil if no originInfo/dateCreated, or "" if present but unparseable (mirrors Atlas’s safe_date_parse rescue).
- #digital_origin ⇒ Object
- #extent ⇒ Object
- #format ⇒ Object
- #genres ⇒ Object
- #identifiers ⇒ Object
- #keywords ⇒ Object
  The editable free-text keyword set (Cerberus simple form): topics under the attribute-free keyword subjects only.
- #languages ⇒ Object
- #names ⇒ Object
  All top-level names as { name:, role: }.
- #permanent_url ⇒ Object
- #plain_title ⇒ Object
  Composed display title (the former Atlas MODSDecoration#plain_title), driven off the scoped primary title.
- #related_series ⇒ Object
- #resource_type ⇒ Object
- #title_parts ⇒ Object
  Structured primary-title parts.
- #to_h ⇒ Object
  The complete read projection, keyed to Atlas’s Metadata::MODS attribute names; a drop-in source for `convert_xml_to_json`.
- #topical_subjects ⇒ Object
  Every <topic> under any top-level <subject> (the access-copy projection, equivalent to Atlas’s extract_topical_subjects).
Class Method Details
.compose_title(parts) ⇒ Object
Pure title composition over a parts hash, factored out of #plain_title so callers that already hold the parts (e.g. Atlas’s access-copy model) can compose the display title WITHOUT re-parsing XML on the read path (reaching for Nokogiri in a decorator is the smell this avoids). Keys: :non_sort, :title, :subtitle, :part_name, :part_number (nil or "" when absent). Returns "" when there is no title. Exposed as NEU::MODS.compose_title.
# File 'lib/neu/mods/projection.rb', line 43

def self.compose_title(parts)
  return "" if parts[:title].to_s.strip.empty?

  optional = { ": " => parts[:subtitle], " - " => parts[:part_name], ", " => parts[:part_number] }
  suffix = optional.filter_map { |sep, val| "#{sep}#{val}" unless val.to_s.strip.empty? }.join
  "#{parts[:non_sort]}#{parts[:title]}#{suffix}"
end
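To make the separator rules concrete, here is a small usage sketch. It copies the method body shown above into a hypothetical module, `TitleSketch`, so it runs without the NEU::MODS library; the sample titles are invented for illustration.

```ruby
# Standalone copy of the method shown above, wrapped in a hypothetical
# module name so the example runs without the NEU::MODS library.
module TitleSketch
  def self.compose_title(parts)
    return "" if parts[:title].to_s.strip.empty?

    optional = { ": " => parts[:subtitle], " - " => parts[:part_name], ", " => parts[:part_number] }
    suffix = optional.filter_map { |sep, val| "#{sep}#{val}" unless val.to_s.strip.empty? }.join
    "#{parts[:non_sort]}#{parts[:title]}#{suffix}"
  end
end

# All parts except part_name present: subtitle joins with ": ",
# part_number with ", "; the nil part_name is skipped entirely.
full = TitleSketch.compose_title(
  non_sort: "The ", title: "Boston Globe", subtitle: "Photographs",
  part_name: nil, part_number: "Box 3"
)

# Title only: missing keys behave like nil.
bare = TitleSketch.compose_title(title: "Untitled")

# Blank title: the other parts are ignored and "" is returned.
untitled = TitleSketch.compose_title(title: "  ", subtitle: "Ignored")
```

Here `full` is "The Boston Globe: Photographs, Box 3", `bare` is "Untitled", and `untitled` is "". Note that :non_sort is concatenated without a separator, so any trailing space has to come from the source value itself.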
Instance Method Details
#abstract ⇒ Object
# File 'lib/neu/mods/projection.rb', line 53

def abstract
  join_paragraphs(abstract_nodes)
end
#access_condition ⇒ Object
# File 'lib/neu/mods/projection.rb', line 57

def access_condition
  join_paragraphs(doc.xpath("/mods:mods/mods:accessCondition", NAMESPACE))
end
#date_created ⇒ Object
Parsed dateCreated, or nil if no originInfo/dateCreated, or "" if present but unparseable (mirrors Atlas’s safe_date_parse rescue).
# File 'lib/neu/mods/projection.rb', line 122

def date_created
  node = doc.at_xpath("/mods:mods/mods:originInfo/mods:dateCreated", NAMESPACE)
  return nil unless node

  str = NEU::MODS.canonical_ws(node.text)
  return nil if str.empty?

  begin
    DateTime.parse(str)
  rescue Date::Error
    ""
  end
end
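The three possible outcomes (nil, "", or a DateTime) are easier to see in isolation. The sketch below is a hypothetical stand-in, `parse_created`, that takes the raw node text (or nil for a missing node) instead of a Nokogiri document. It assumes that `NEU::MODS.canonical_ws` collapses and trims whitespace, approximated here with `split.join(" ")`; that is an assumption, not the library's confirmed behavior.

```ruby
require "date"

# Hypothetical stand-in for the method body above. It takes the raw node
# text (or nil for a missing node) instead of a Nokogiri document.
def parse_created(raw_text)
  return nil if raw_text.nil?      # no dateCreated node at all

  # Assumed approximation of NEU::MODS.canonical_ws: collapse and trim whitespace.
  str = raw_text.split.join(" ")
  return nil if str.empty?         # node present but blank

  begin
    DateTime.parse(str)
  rescue Date::Error
    ""                             # node present but unparseable
  end
end

missing     = parse_created(nil)
blank       = parse_created("   ")
unparseable = parse_created("unknown")
parsed      = parse_created("2014-03-01")
```

Callers therefore need to handle three shapes: nil (`missing`, `blank`), the empty string (`unparseable`), and a DateTime (`parsed`).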
#digital_origin ⇒ Object
# File 'lib/neu/mods/projection.rb', line 100

def digital_origin = text_at("/mods:mods/mods:physicalDescription/mods:digitalOrigin")
#extent ⇒ Object
# File 'lib/neu/mods/projection.rb', line 99

def extent = text_at("/mods:mods/mods:physicalDescription/mods:extent")
#format ⇒ Object
# File 'lib/neu/mods/projection.rb', line 98

def format = text_at("/mods:mods/mods:physicalDescription/mods:form")
#genres ⇒ Object
# File 'lib/neu/mods/projection.rb', line 102

def genres
  doc.xpath("/mods:mods/mods:genre", NAMESPACE).map { |g| clean(g.text) }
end
#identifiers ⇒ Object
# File 'lib/neu/mods/projection.rb', line 111

def identifiers
  doc.xpath("/mods:mods/mods:identifier", NAMESPACE).map { |i| clean(i.text) }
end
#keywords ⇒ Object
The editable free-text keyword set (Cerberus simple form): topics under the attribute-free keyword subjects only.
# File 'lib/neu/mods/projection.rb', line 65

def keywords
  keyword_subjects.flat_map { |s| s.xpath("mods:topic", NAMESPACE).map { |t| t.text.strip } }
end
#languages ⇒ Object
# File 'lib/neu/mods/projection.rb', line 89

def languages
  doc.xpath("/mods:mods/mods:language", NAMESPACE).map do |lang|
    term = lang.at_xpath("mods:languageTerm[@type='text']", NAMESPACE) ||
           lang.at_xpath("mods:languageTerm", NAMESPACE)
    clean(term&.text)
  end.compact
end
#names ⇒ Object
All top-level names as { name:, role: }. `name` reproduces the `mods` gem’s display_value_w_date (faithfully, including its quirks, so existing Solr/display output is preserved). `role` prefers the type="text" roleTerm, falling back to the raw code (NOT MARC-relator-translated; see README).
# File 'lib/neu/mods/projection.rb', line 81

def names
  doc.xpath("/mods:mods/mods:name", NAMESPACE).map do |node|
    { name: name_display_value_w_date(node), role: name_role(node) }
  end
end
#permanent_url ⇒ Object
# File 'lib/neu/mods/projection.rb', line 115

def permanent_url
  node = doc.at_xpath("/mods:mods/mods:identifier[@type='hdl']", NAMESPACE)
  node && clean(node.text)
end
#plain_title ⇒ Object
Composed display title (the former Atlas MODSDecoration#plain_title), driven off the scoped primary title.
# File 'lib/neu/mods/projection.rb', line 33

def plain_title
  Projection.compose_title(title_parts)
end
#related_series ⇒ Object
# File 'lib/neu/mods/projection.rb', line 106

def related_series
  doc.xpath("/mods:mods/mods:relatedItem[@type='series']/mods:titleInfo/mods:title", NAMESPACE)
     .map { |t| clean(t.text) }
end
#resource_type ⇒ Object
# File 'lib/neu/mods/projection.rb', line 97

def resource_type = text_at("/mods:mods/mods:typeOfResource")
#title_parts ⇒ Object
Structured primary-title parts. nil for an absent part (the Cerberus form treats nil as “not present”); to_h coerces to "" for the Atlas main_title.
# File 'lib/neu/mods/projection.rb', line 20

def title_parts
  ti = primary_title_info
  {
    non_sort: child_text(ti, "mods:nonSort"),
    subtitle: child_text(ti, "mods:subTitle"),
    title: child_text(ti, "mods:title"),
    part_name: child_text(ti, "mods:partName"),
    part_number: child_text(ti, "mods:partNumber")
  }
end
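The nil-versus-"" distinction between #title_parts and the main_title entry of #to_h comes down to a single `transform_values(&:to_s)` call. A minimal sketch, using a hand-written parts hash with invented values (not taken from a real record):

```ruby
# Hand-written parts hash shaped like the return value of #title_parts;
# the values are illustrative, not taken from a real record.
parts = {
  non_sort: "A ",
  subtitle: nil,
  title: "History of Boston",
  part_name: nil,
  part_number: nil
}

# Same coercion #to_h applies for main_title: nil.to_s is "".
main_title = parts.transform_values(&:to_s)
```

`transform_values` returns a new hash, so the nil-bearing `parts` stays available for form code that relies on nil meaning "not present".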
#to_h ⇒ Object
The complete read projection, keyed to Atlas’s Metadata::MODS attribute names; a drop-in source for `convert_xml_to_json`.
# File 'lib/neu/mods/projection.rb', line 140

def to_h
  {
    main_title: title_parts.transform_values(&:to_s),
    names: names,
    languages: languages,
    date_created: date_created,
    resource_type: resource_type,
    genres: genres,
    format: format,
    extent: extent,
    digital_origin: digital_origin,
    abstract: abstract,
    related_series: related_series,
    topical_subjects: topical_subjects,
    identifiers: identifiers,
    permanent_url: permanent_url,
    access_condition: access_condition
  }
end
#topical_subjects ⇒ Object
Every <topic> under any top-level <subject> (the access-copy projection, equivalent to Atlas’s extract_topical_subjects).
# File 'lib/neu/mods/projection.rb', line 71

def topical_subjects
  doc.xpath("/mods:mods/mods:subject/mods:topic", NAMESPACE).map { |t| clean(t.text) }
end