Class: Relaton::Iso::DataParser

Inherits:
Object
Defined in:
lib/relaton/iso/data_parser.rb

Overview

Parses one ISO Open Data record (a line of `iso_deliverables_metadata.jsonl`) into a `Relaton::Iso::ItemData`.

See www.iso.org/open-data.html for the field reference.

Constant Summary

ATTRS =
%i[
  type docidentifier docnumber edition language script title status ics
  date contributor abstract copyright source relation place
  structuredidentifier ext
].freeze
DOCTYPES =
{
  "IS" => "international-standard",
  "TS" => "technical-specification",
  "TR" => "technical-report",
  "PAS" => "publicly-available-specification",
  "GUIDE" => "guide",
  "IWA" => "international-workshop-agreement",
  "R" => "recommendation",
  "ISP" => "international-standard",
  "DATA" => "international-standard",
  "TTA" => "international-standard",
}.freeze
SUPPLEMENT_DOCTYPES =
{
  "Amd" => "amendment",
  "Cor" => "technical-corrigendum",
  "Add" => "addendum",
}.freeze
DOC_URL =
"https://www.iso.org/standard/%d.html"
OBP_URL =
"https://www.iso.org/obp/ui/en/#!iso:std:%d:en"
RSS_URL =
"https://www.iso.org/contents/data/standard/%s/%s/%d.detail.rss"
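The URL constants are `format`-style templates. The following is a minimal sketch of how `DOC_URL` and `OBP_URL` expand, assuming the `%d` placeholder is filled with a record's numeric ID; the ID used here is hypothetical, and `RSS_URL` is left out because this page does not say what its two `%s` placeholders take.

```ruby
# Constant values copied from the summary above.
DOC_URL = "https://www.iso.org/standard/%d.html"
OBP_URL = "https://www.iso.org/obp/ui/en/#!iso:std:%d:en"

# Hypothetical numeric record ID, for illustration only.
id = 12345

# Kernel#format substitutes the integer into the %d placeholder.
doc_url = format(DOC_URL, id)
obp_url = format(OBP_URL, id)

puts doc_url
puts obp_url
```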

Instance Method Summary

Constructor Details

#initialize(pub, ref_index = {}, errors = {}, tc_index = {}, amend_index = {}, date_index = {}) ⇒ DataParser

Returns a new instance of DataParser.

Parameters:

  • pub (Hash)

    one Open Data record

  • ref_index (Hash{Integer=>String}) (defaults to: {})

    map of Open Data `id` -> `reference`, used to resolve `replaces` / `replacedBy` (which are numeric IDs in the source).

  • errors (Hash) (defaults to: {})

    error accumulator (`Hash.new(true)`); fields are AND-ed across all records by the `report_errors` machinery.

  • tc_index (Hash{String=>Hash}) (defaults to: {})

    map of TC/SC reference -> `{ "en" => title, "fr" => title }`, used to resolve the human committee label from the Open Data technical-committees dataset.

  • amend_index (Hash{String=>Array<String>}) (defaults to: {})

    map of base reference -> list of supplement (Amd/Cor/Add) references that target it. Open Data records the supplement -> base direction only via the reference string, so we pre-build the reverse map.

  • date_index (Hash{String=>String}) (defaults to: {})

    map of reference -> `publicationDate`, used to attach a `published` date to each emitted relation's bibitem when the related document is itself present in the Open Data feed.
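The index parameters are built by the caller before any record is parsed. The sketch below shows one way the `ref_index`, `date_index` and `amend_index` shapes described above could be derived from a JSONL feed. The three sample records are invented, and the rule for finding a supplement's base reference (split at the first `/Amd`, `/Cor` or `/Add`) is an assumption, not the library's documented behaviour.

```ruby
require "json"

# Hypothetical sample feed. Only the field names `id`, `reference` and
# `publicationDate` come from the parameter descriptions above.
jsonl = <<~JSONL
  {"id": 1, "reference": "ISO 1000:2020", "publicationDate": "2020-05-01"}
  {"id": 2, "reference": "ISO 1000:2020/Amd 1:2022", "publicationDate": "2022-03-15"}
  {"id": 3, "reference": "ISO 2000:2019", "publicationDate": "2019-11-30"}
JSONL

records = jsonl.each_line.map { |line| JSON.parse(line) }

# id -> reference, for resolving numeric `replaces` / `replacedBy` IDs.
ref_index = records.to_h { |r| [r["id"], r["reference"]] }

# reference -> publicationDate, for dating related documents.
date_index = records.to_h { |r| [r["reference"], r["publicationDate"]] }

# base reference -> supplement references (the reverse direction).
# Assumption: a supplement reference reads "<base>/<Amd|Cor|Add> ...".
amend_index = Hash.new { |hash, key| hash[key] = [] }
records.each do |r|
  base, supplement = r["reference"].split(%r{/(?=(?:Amd|Cor|Add)\b)}, 2)
  amend_index[base] << r["reference"] if supplement
end

p amend_index
```

With these in hand, each record would be passed to the constructor together with the shared indexes, in the positional order given in the signature above.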



# File 'lib/relaton/iso/data_parser.rb', line 64

def initialize(pub, ref_index = {}, errors = {}, tc_index = {}, amend_index = {}, date_index = {})
  @pub = pub
  @ref_index = ref_index
  @errors = errors
  @tc_index = tc_index
  @amend_index = amend_index
  @date_index = date_index
end
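The `errors` accumulator is described as a `Hash.new(true)` whose fields are AND-ed across records. A minimal sketch of that pattern follows, assuming each field contributes "missing in this record" so that a field stays `true` only if it was missing in every record. The field names and sample values are hypothetical, and the real `report_errors` logic is not shown on this page.

```ruby
# Every field starts out as true ("never seen") via the hash default.
errors = Hash.new(true)

# Hypothetical parsed values for two records.
records = [
  { title: "Quantities and units", abstract: nil },
  { title: "Road vehicles", abstract: nil },
]

records.each do |record|
  record.each do |field, value|
    # `&&=` reads the default true on first access, then ANDs in
    # whether the field is missing from this record.
    errors[field] &&= value.nil?
  end
end

p errors
```

Under this reading, one record that supplies a field is enough to clear its flag, so only fields that are absent from the whole feed get reported.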

Instance Method Details

#parse ⇒ Object



# File 'lib/relaton/iso/data_parser.rb', line 73

def parse
  ItemData.new(**ATTRS.each_with_object({}) { |a, h| h[a] = send(a) })
end
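`#parse` calls one method per name in `ATTRS` and passes the results to `ItemData.new` as keyword arguments. The sketch below reproduces that dispatch pattern in isolation. `ItemSketch`, `ParserSketch`, the two-entry `ATTRS` and the record keys `"kind"` and `"name"` are all hypothetical stand-ins, not part of the library.

```ruby
# Stand-in for ItemData: a struct that accepts keyword arguments.
ItemSketch = Struct.new(:type, :title, keyword_init: true)

class ParserSketch
  # Shortened attribute list; the real ATTRS has many more entries.
  ATTRS = %i[type title].freeze

  def initialize(pub)
    @pub = pub
  end

  # Build { attr => value } by calling the method named after each
  # attribute, then splat the hash into keyword arguments.
  def parse
    ItemSketch.new(**ATTRS.each_with_object({}) { |a, h| h[a] = send(a) })
  end

  private

  # `send` reaches private methods, so the per-attribute readers
  # can stay out of the public interface.
  def type
    @pub["kind"]
  end

  def title
    @pub["name"]
  end
end

item = ParserSketch.new({ "kind" => "IS", "name" => "Example title" }).parse
p item
```

One consequence of this design is that adding an attribute means adding its name to `ATTRS` and defining a method of the same name; `parse` itself does not change.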