pubid-core Gem Version Build Status Pull Requests Commits since latest

Badges

Core & Meta Standards Organizations Regional Standards

pubid pubid

pubid-iso pubid-iso

pubid-cen pubid-cen

pubid-core pubid-core

pubid-iec pubid-iec

pubid-bsi pubid-bsi

pubid-ieee pubid-ieee

pubid-jis pubid-jis

pubid-nist pubid-nist

pubid-etsi pubid-etsi

pubid-itu pubid-itu

pubid-ccsds pubid-ccsds

pubid-plateau pubid-plateau

Quick Start

gem install pubid-core

require 'pubid/iso'

# Parse an identifier
id = Pubid::Iso.parse("ISO 9001:2015/Amd 1:2020")

# Access components
id.publisher.body  # => "ISO"
id.number.number   # => "9001"

# Render back to string (round-trip)
id.to_s  # => "ISO 9001:2015/Amd 1:2020"

# Generate URN
id.to_urn  # => "urn:iso:std:iso:9001:amd:2020:v1"

Overview

The PubID project promotes the use of interoperable identifiers across various domains, including ISO, IEC, NIST, and more.

The core of PubID is an identifier information model that allows a publisher to build a human- and machine-readable identification scheme for the unique identification of documents, standards, and other resources.

This identification scheme is designed to facilitate interoperability and data exchange between systems and organizations by providing a consistent way to identify, reference and utilize these resources.

A PubID typically incorporates various components, such as a mixture of an organizational prefix, document type, document stage, edition or version number, and other relevant information.

The key feature that a PubID provides is the ability to represent identifiers in a structured format that can be easily parsed and understood by both humans and machines, in a round-trippable manner.

See V2 Migration Status: ALL 18 FLAVORS COMPLETE for detailed status and [v2-usage-examples] for examples.

URN Generation (RFC 5141-bis)

PubID V2 implements RFC 5141-bis compliant URN generation for ISO and IEC identifiers with full test coverage.

require 'pubid/iso'
require 'pubid/iec'

# ISO: Parse and generate URN
id = Pubid::Iso.parse("ISO/IEC 13818-1:2015/Amd 3:2016")
id.to_urn
# => "urn:iso:std:iso-iec:13818:-1:amd:2016:v3"

# IEC: Parse and generate URN
id = Pubid::Iec.parse("IEC 60068-2-2:1974/Amd 1:1993")
id.to_urn
# => "urn:iec:std:iec:60068-2-2:1974:amd:1993:v1"

# Round-trip: parse URN back to identifier
id = Pubid::Iso.parse_urn("urn:iso:std:iso:9001:amd:2020:v1")
id.to_s  # => "ISO 9001/Amd 1:2020"
id.to_urn  # => "urn:iso:std:iso:9001:amd:2020:v1"

Features

  • ✅ RFC 5141-bis compliant for ISO URNs

  • ✅ IEC URN model compliant (IEC URI Model 2020-03-24)

  • ✅ Explicit language codes, always lowercase in URN output

  • ✅ Dynamic copublisher combinations (ISO/IEC/IEEE, ISO/ASTM, etc.)

  • ✅ Extended document types (DIR, DIR-SUP, IWA-SUP, TTA)

  • ✅ Typed stage codes (WD, CD, DIS, FDIS, PDAM, FDAM, etc.)

  • ✅ Harmonized stage codes (stage-XX.XX format)

  • ✅ Multi-level supplement support with correct nesting order

  • ✅ Round-trip parsing (parse_urn → to_urn) for ISO and IEC

Key Difference: ISO vs IEC Part Numbering

ISO URNs (RFC 5141): part number is a separate colon-separated field: urn:iso:std:iso:8601:-1:2019

IEC URNs (IEC URI Model): part number is part of the docnumber field: urn:iec:std:iec:60068-2-2:1974

Documentation

See URN Generation Guide for complete usage documentation.

See ISO URN Specification for the full RFC 5141-bis specification with ABNF grammar.

See IEC URN Specification for the full IEC URN model specification.

See RFC 5141-bis Compliance Report for certification details and test coverage.

See V2 Architecture Guide for architectural details including URN generation design.

Machine-Readable Serialization

PubID V2 provides two-way machine-readable conversion for all identifiers, supporting round-trip conversion between human-readable and structured formats.

Export to Hash

require 'pubid/iso'

id = Pubid::Iso.parse("ISO 9001:2015/Amd 1:2020")
id.to_h
# => {
#      flavor: "iso",
#      type: "amendment",
#      publisher: "ISO",
#      number: "9001",
#      year: "2015",
#      supplements: [
#        { type: "amendment", number: "1", year: "2020" }
#      ],
#      urn: "urn:iso:std:iso:9001:amd:2020:v1"
#    }

Export to JSON

require 'pubid/iso'

id = Pubid::Iso.parse("ISO 9001:2015")
id.to_json
# => '{"flavor":"iso","publisher":"ISO","number":"9001","year":"2015",...}'

Import from Hash

require 'pubid/iso'

hash = {
  flavor: "iso",
  publisher: "ISO",
  number: "9001",
  year: "2015"
}
id = Pubid::Serializable.from_h(hash)
id.to_s # => "ISO 9001:2015"

Import from JSON

require 'pubid/iso'

json = '{"flavor":"iso","publisher":"ISO","number":"9001","year":"2015"}'
id = Pubid::Serializable.from_json(json)
id.to_s # => "ISO 9001:2015"

Round-trip Conversion

All identifiers support full round-trip conversion preserving all attributes:

require 'pubid/iso'

original = Pubid::Iso.parse("ISO 9001:2015/Amd 1:2020/Cor 1:2021")
hash = original.to_h
restored = Pubid::Serializable.from_h(hash)

restored.to_s # => "ISO 9001:2015/Amd 1:2020/Cor 1:2021"
restored.to_urn # => "urn:iso:std:iso:9001:amd:2020:cor:2021:v1"

Utility Methods

PubID V2 provides convenience methods for identifier manipulation and comparison.

Excluding Attributes

The exclude method returns a copy of the identifier without specified attributes:

require 'pubid/iso'

id = Pubid::Iso.parse("ISO 9001:2015/Amd 1:2020")
id.exclude(:supplements).to_s
# => "ISO 9001:2015"

id.exclude(:year, :part).to_s
# => "ISO 9001"

Comparing Editions

The new_edition_of? method checks if an identifier is a newer edition of the same document:

require 'pubid/iso'

id1 = Pubid::Iso.parse("ISO 9001:2015")
id2 = Pubid::Iso.parse("ISO 9001:2019")

id2.new_edition_of?(id1) # => true
id1.new_edition_of?(id2) # => false

# Raises ArgumentError for different documents
id3 = Pubid::Iso.parse("ISO 9002:2019")
id3.new_edition_of?(id1) # => ArgumentError: Cannot compare edition: different number

Getting Root Identifier

The root method traverses supplement chains to return the base document:

require 'pubid/iso'

id = Pubid::Iso.parse("ISO 9001:2015/Amd 1:2020/Cor 1:2021")
id.root.to_s
# => "ISO 9001:2015"

# Returns self for base identifiers
base = Pubid::Iso.parse("ISO 9001:2015")
base.root.to_s # => "ISO 9001:2015"
base.root.equal?(base) # => true

Features

  • Immutable operations - All methods return new instances

  • Type-safe - Raises ArgumentError for invalid comparisons

  • Preserves all attributes - Round-trip through hash/JSON

  • Works with supplements - Traverses supplement chains correctly

Advanced Rendering Styles

ISO and IEC identifiers support multiple abbreviation forms for supplements to maintain round-trip fidelity with different official formats.

Overview

The rendering styles feature enables PubID to parse and render identifiers in multiple official formats while preserving the exact format that was parsed. This ensures perfect round-trip compatibility with standards organizations' documentation.

Supported formats: * ISO: Short (AMD, DAM, COR) vs Long (Amd, DaM, Cor) - distinguished by case * IEC: Short (AMD1, COR1) vs Long (Amd 1, Cor 1) - distinguished by spacing

Usage Examples

ISO Examples

# ISO - Preserves long form (mixed case)
id = Pubid::Iso.parse("ISO 8601:2019/DAmd 1")
id.to_s  # => "ISO 8601:2019/DAmd 1"

# ISO - Preserves short form (uppercase)
id = Pubid::Iso.parse("ISO 8601:2019/DAM 1")
id.to_s  # => "ISO 8601:2019/DAM 1"

# Corrigendum with long form
id = Pubid::Iso.parse("ISO/IEC 19115:2003/Cor 1:2006")
id.to_s  # => "ISO/IEC 19115:2003/Cor 1:2006"

IEC Examples

# IEC - Preserves long form (with space)
id = Pubid::Iec.parse("IEC 60050-351:2013/Amd 1:2016")
id.to_s  # => "IEC 60050-351:2013/Amd 1:2016"

# IEC - Preserves short form (no space)
id = Pubid::Iec.parse("IEC 60050-351:2013/AMD1:2016")
id.to_s  # => "IEC 60050-351:2013/AMD1:2016"

Key Features

  • Automatic format detection - No manual configuration required

  • Round-trip fidelity - Parse and render back to exact original format

  • Standards compliance - Supports both official ISO and IEC formats

  • Backward compatible - Existing code continues to work unchanged

Documentation

See Advanced Rendering Styles Guide for complete documentation including:

  • All supported formats for ISO and IEC

  • Detailed usage examples

  • Architecture and implementation details

  • Format comparison tables

Metadata Export

PubID includes a metadata export layer (Pubid::Export) that extracts structured schema information from all 22+ flavors — identifier types, typed stages, harmonized stage codes, abbreviations, and fixture examples — into a single JSON document suitable for website generation, tooling integration, and gap analysis.

Rake Tasks

# Export metadata for all flavors to lib/tasks/website-data.json
bundle exec rake export:website_data

# Audit library data against website publishers
bundle exec rake export:audit

The export task produces a JSON file covering 23 flavors and 162 identifier types. Each flavor’s identifier classes, typed stages, abbreviations, and up to 10 fixture examples are extracted automatically from the library source.

Programmatic API

require 'pubid/export'

# Export all flavors
data = Pubid::Export::Exporter.export_all
# => { "iso" => { identifier_types: [...], attributes: [...] }, ... }

# Export a single flavor
exporter = Pubid::Export::SchemeExporter.new(:iso)
result = exporter.export
result.to_h  # => { identifier_types: [...], attributes: [...] }

Output Schema

The exported JSON follows this structure:

{
  "<flavor>": {
    "identifier_types": [
      {
        "key": "is",
        "title": "International Standard",
        "short": "IS",
        "abbr": ["", "IS"],
        "typed_stages": [
          {
            "stage_code": "dis",
            "type_code": "is",
            "abbr": ["DIS", "FPD"],
            "name": "Draft International Standard",
            "harmonized_stages": ["40.00", "40.20", "40.60", ...]
          }
        ],
        "examples": ["ISO 9001:2015", "ISO/IEC 17031-1:2020", ...]
      }
    ],
    "attributes": ["number", "part", "date", "edition", ...]
  }
}
Table 1. Schema fields
Field Type Description

key

string

Machine-readable type key (e.g., is, tr, amd)

title

string

Human-readable type name (e.g., "International Standard")

short

string|null

Short abbreviation if defined by the publisher

abbr

string[]

All recognized abbreviations for this type (parsed input variants)

typed_stages

object[]

Development stages specific to this document type

typed_stages[].stage_code

string

Stage code (e.g., dis, cd, fdis)

typed_stages[].type_code

string

Document type this stage applies to

typed_stages[].abbr

string[]

Abbreviations recognized for this typed stage

typed_stages[].name

string

Full name (e.g., "Draft International Standard")

typed_stages[].harmonized_stages

string[]

ISO harmonized stage codes (e.g., 40.00, 50.60)

examples

string[]

Up to 10 real-world identifiers from test fixtures

attributes

string[]

Lutaml::Model attribute names on the identifier class

wrapper_types

object[]

Value-added overlay types (VAP, Redline, etc.) — same schema as identifier_types

Extraction Strategies

Different flavors use different internal architectures. The export layer uses the Strategy pattern to handle each without modifying existing code:

Table 2. Strategy classes
Strategy Flavors Pattern

SchemeExporter

ISO, IEC, ASTM, ASHRAE, ASME, CCSDS, CIE, CSA, JIS, JCGM, OIML, IDF, API, SAE, ANSI

Scheme.identifiers + per-class def self.type + TYPED_STAGES

RegistryExporter

BSI, CEN

TYPED_STAGES_REGISTRY on Scheme class (centralized registry)

IeeeExporter

IEEE

KEY_IDENTIFIER_CLASSES from Identifiers module + module-level typed stages

NistExporter

NIST

Scheme.identifiers + per-class typed_stages class method (not constant)

ItuExporter

ITU

Sector-based types with transform/model pattern

DataClassExporter

ETSI, Plateau

Lutaml::Model::Serializable as Scheme (no per-class identifiers)

Adding a new flavor requires only registering it in FLAVOR_STRATEGIES within Exporter — no existing strategy code changes.

Adding a New Flavor to Export

When adding a new publisher flavor to PubID, follow these steps to integrate it with the export layer:

  1. Ensure identifier classes define def self.type — Each identifier class in lib/pubid/{flavor}/identifiers/ should implement def self.type returning { key: :sym, title: "Name", short: "Abbr" }. This is the primary source of metadata.

  2. Choose the right strategy — Most flavors use one of the existing strategies:

    • SchemeExporter — if the flavor has a Scheme class with a class-level identifiers method that returns identifier classes. This is the most common pattern (ISO, IEC, ASTM, ASHRAE, etc.).

    • RegistryExporter — if the flavor uses a centralized TYPED_STAGES_REGISTRY on the Scheme class instead of per-class TYPED_STAGES (BSI, CEN).

    • IeeeExporter — only for IEEE, which has a unique constant-based discovery.

    • NistExporter — if typed stages come from a class method (not constant).

    • ItuExporter — for sector-based types with transform/model patterns.

    • DataClassExporter — for flavors where Scheme inherits from Lutaml::Model::Serializable (ETSI, Plateau).

  3. Register the flavor in lib/pubid/export/exporter.rb:

    [source,ruby]
    ----
    FLAVORS = %i[iso iec ... new_flavor].freeze
    FLAVOR_STRATEGIES = {
      iso: :scheme,
      # ...
      new_flavor: :scheme,  # or :registry, :data_class, etc.
    }.freeze
    ----
  4. Add the module mapping in FlavorExporter#scheme_module (in lib/pubid/export/flavor_exporter.rb):

    [source,ruby]
    ----
    when "new_flavor" then Pubid::NewFlavor
    ----
  5. If the flavor has wrapper types (value-added formats like VAP, Redline), add them to WRAPPER_CLASSES in FlavorExporter:

    [source,ruby]
    ----
    WRAPPER_CLASSES = {
      iec: %i[VapIdentifier],
      bsi: %i[ValueAddedPublication],
      new_flavor: %i[WrapperClassName],
    }.freeze
    ----
  6. Add fixture examples (optional) — Create spec/fixtures/{flavor}/identifiers/pass/{type_key}.txt with one identifier per line. Up to 10 examples per type are automatically picked up during export.

  7. Write tests — Add spec/pubid/export/{flavor}_exporter_spec.rb:

    [source,ruby]
    ----
    require "spec_helper"
    require "pubid/export"
    RSpec.describe Pubid::Export::SchemeExporter, "for new_flavor" do
      subject(:result) { described_class.new(:new_flavor).export }
      it "exports identifier types" do
        expect(result.identifier_types.size).to be > 0
      end
    end
    ----
  8. Regenerate data — Run bundle exec rake export:website_data to update lib/tasks/website-data.json.

Audit

The Pubid::Export::Auditor compares library-generated metadata against website publisher data to identify gaps:

  • Identifier types present in the library but missing from the website

  • Identifier types present on the website but not in the library

  • Per-flavor summary of mismatches

require 'pubid/export'

data = Pubid::Export::Exporter.export_all
auditor = Pubid::Export::Auditor.new(data)
results = auditor.audit(website_publishers)
puts auditor.summary(results)
# => "Audit Summary: 2 missing, 14 extra"

File Layout

lib/pubid/export.rb                  # Module entry point
lib/pubid/export/
├── result.rb                        # Immutable value objects (IdentifierTypeResult, TypedStageResult, FlavorResult)
├── flavor_exporter.rb               # Abstract base class
├── scheme_exporter.rb               # Strategy: Scheme.identifiers pattern
├── registry_exporter.rb             # Strategy: TYPED_STAGES_REGISTRY pattern
├── ieee_exporter.rb                 # Strategy: IEEE identifier discovery
├── nist_exporter.rb                 # Strategy: NIST per-class typed_stages
├── itu_exporter.rb                  # Strategy: ITU transform/model pattern
├── data_class_exporter.rb           # Strategy: Lutaml::Model::Serializable Scheme
├── exporter.rb                      # Orchestrator (FLAVOR_STRATEGIES dispatch)
└── auditor.rb                       # Library vs website gap analysis
lib/tasks/export.rake                # Rake tasks: export:website_data, export:audit

Repository

This repository is a monorepo for the PubID Ruby gems, which implement the PubID identifier data model and its various components.

The PubID Ruby gems implement identifiers from multiple standards organizations, making it easier for developers to work with them in their applications.

This repository contains all the pubid-* gems consolidated into a single monorepo for easier development and maintenance while preserving individual gem releases.

Structure

The repository is organized as follows:

.
├── lib/pubid/          # V2 implementation (ACTIVE)
│   ├── core/               # Shared base classes and components
│   ├── iso/                # ISO - Production ready
│   ├── iec/                # IEC - Production ready
│   ├── cen/                # CEN - Production ready
│   ├── idf/                # IDF - Production ready
│   ├── ieee/               # IEEE - Production ready
│   ├── nist/               # NIST - Production ready
│   └── ...                 # 22+ flavors total
├── data/                   # Pre-parse normalization data
│   ├── iso/update_codes.yaml
│   ├── iec/update_codes.yaml
│   └── ...
├── spec/                   # Tests
│   ├── pubid/              # Per-flavor tests
│   ├── fixtures/           # Bulk test fixtures
│   └── integration/        # Cross-gem tests
├── docs/                   # Documentation
├── .github/workflows/      # CI/CD workflows
├── Gemfile                 # Root Gemfile for development
├── Rakefile               # Monorepo management tasks
├── .rubocop.yml           # Shared RuboCop configuration
└── LICENSE.txt            # Shared license

PubID V2 Architecture

General

PubID V2 implements a completely redesigned parser architecture with clean separation of concerns and model-driven design. The implementation is located in [lib/pubid/](lib/pubid/).

Three-layer design

V2 implements a clean separation of concerns across three distinct layers:

┌─────────────────────────────────────┐
│    Pre-parser Normalization Layer    │  update_codes.yaml normalization
│  (Malformed Input → Cleaned Input)  │  BEFORE parsing
└──────────┬──────────────────────────┘
           │
┌──────────▼──────────────────────────┐
│         Parser Layer                │  Grammar-based parsing (Parslet)
│  (Syntax → Parse Tree)              │
└──────────┬──────────────────────────┘
           │
┌──────────▼──────────────────────────┐
│         Builder Layer               │  Transform parse tree to objects
│  (Parse Tree → Attributes)          │
└──────────┬──────────────────────────┘
           │
┌──────────▼──────────────────────────┐
│      Identifier Layer               │  Lutaml::Model serializable objects
│  (Attributes → String)              │
└─────────────────────────────────────┘

Where:

Pre-parser Normalization

Applies update_codes.yaml mappings from data/{flavor}/ directory to normalize malformed identifiers before parsing. This handles legacy formats, typos, and historical variations.

Parser Layer

Parslet-based grammar defining syntax rules. Handles pattern matching and tokenization.

Builder Layer

Transforms hash tree from parser into attribute hash. Handles special cases and edge conditions.

Identifier Layer

Lutaml::Model-based classes with rendering logic. Provides serialization and string representation.

Parser performance

The V2 parsers have been tested against real-world identifier databases to ensure high accuracy:

Parser Success Rate Examples Status Notes

NIST

98.47%

19,191/19,488

✅ Complete

Exceeds 95% target, handles historical NBS patterns

IEEE

87.95%

8,388/9,537

✅ Production Ready

Parser with TYPED_STAGE architecture, Joint Development support, Pattern 4 Relationships

ISO

90.0%

2,573/2,859

✅ URN generation

100% functional + URN generation complete, remaining types in progress

BSI

TBD

-

⚠️ Needs testing

Implementation complete, needs comprehensive tests

IEC

84.58%

823/973

✅ Production Ready

21 identifier types, comprehensive test coverage

CEN

83.2%

79/95

✅ Production Ready

Native & adopted standards, TYPED_STAGES architecture

ITU

96.5%

166/172

✅ Production Ready

Core identifiers complete, 6 combined ID limitations documented

JIS

TBD

-

⚠️ Needs testing

Implementation complete, needs comprehensive tests

ETSI

TBD

-

⚠️ Needs testing

Implementation complete, needs comprehensive tests

Usage examples

NIST

The NIST parser handles standard NIST publications and historical NBS patterns:

require 'pubid'

# Parse standard NIST identifier
id = Pubid::Nist.parse("NIST SP 800-53r5")

# Access components
id.series    # => "SP"
id.number    # => "800-53"
id.revision  # => "r5"

# Render to string
id.to_s      # => "NIST SP 800-53r5"

# Parse historical NBS patterns
id = Pubid::Nist.parse("NBS LCIRC 1019r1963")
id.to_s      # => "NBS LCIRC 1019r1963"

# CSM volume-number format
id = Pubid::Nist.parse("NBS CSM v6n1")
id.to_s      # => "NBS CSM v6n1"

# Supplement with revision
id = Pubid::Nist.parse("NBS CIRC 154supprev")
id.to_s      # => "NBS CIRC 154supprev"

Language: * Legacy identifier: Preserve original with lang.first text * Multi-language text: Default to primary language.original_code * ISO Constructs: Check V2 components (Part, Language, Public) for NULL

IEC

The IEC parser handles complex patterns including adopted standards and dual-published identifiers:

require 'pubid'

# Parse standard IEC identifier
id = Pubid::Iec.parse("IEC 62014-5 IEEE Std 1734-2011")
id.class  # => Pubid::Iec::Identifiers::DualPublished
id.to_s   # => "IEC 62014-5 and IEEE Std 1734-2011"  # normalized

# Space-separated dual identifiers are auto-detected
id = Pubid::Iec.parse("ANSI C37.61-1973 and IEEE Std 321-1973")
id.to_s   # => "ANSI C37.61-1973 and IEEE Std 321-1973"
Pattern 4: Relationship Identifiers ✨

IEEE supports 7 relationship types with recursive identifier parsing:

Table 3. Relationship Types
Type Description Example

revision_of

Indicates this standard revises another

IEEE Std 802 (Revision of IEEE Std 801)

amendment_to

Indicates this is an amendment

IEEE Std 100 (Amendment to IEEE Std 99)

corrigendum_to

Indicates this corrects errors

IEEE Std 200 (Corrigendum to IEEE Std 199)

incorporates

Indicates incorporation of another standard

IEEE Std 300 (incorporates IEEE Std 299)

adoption_of

Indicates adoption of external standard

IEEE Std 400 (Adoption of ISO/IEC 9945-1:2009)

supplement_to

Indicates supplementary material

IEEE Std 500 (Supplement to IEEE Std 499)

draft_amendment_to

Indicates draft amendment

IEEE Std 600 (Draft Amendment to IEEE Std 599)

Parsing Pattern 4 Identifiers
require 'pubid/ieee'

# Parse relationship identifier
id = Pubid::Ieee.parse('IEEE Std 802 (Revision of IEEE Std 801)')
id.relationships.first.relationship_type  # => "revision_of"
id.relationships.first.related_identifiers.first.to_s  # => "IEEE Std 801"
id.to_s  # => "IEEE Std 802 (Revision of IEEE Std 801)" (perfect round-trip)

# Multiple related identifiers
id = Pubid::Ieee.parse('IEEE Std 100 (Amendment to IEEE Std 99, IEEE Std 98)')
id.relationships.first.related_identifiers.count # => 2

# Intermediate amendments (as amended by clause)
id = Pubid::Ieee.parse('IEEE Std 200 (Corrigendum to IEEE Std 199 as amended by IEEE Std 199a)')
id.relationships.first.intermediate_amendments.first.to_s # => "IEEE Std 199a"

Architecture: - Relationships are Lutaml::Model objects - Related identifiers are recursively parsed as full Base objects - Perfect round-trip fidelity maintained - Backward compatible with legacy attributes - 28/28 unit tests passing (100%)

Historical Sub-Flavors (AIEE & IRE) ✨

IEEE supports two historical predecessor organizations that merged in 1963 to form IEEE:

AIEE (American Institute of Electrical Engineers) 1884-1963
# Parse AIEE identifier with long date format
aiee = Pubid::Ieee.parse("AIEE No. 552, November 1955")
aiee.to_s                        # => "AIEE No. 552, November 1955"
aiee.to_s(date_format: :short)   # => "AIEE No. 552-1955"
aiee.to_s(date_format: :long)    # => "AIEE No. 552, November 1955"

# Parse AIEE identifier with short date format
aiee = Pubid::Ieee.parse("AIEE No. 59-1962")
aiee.to_s                        # => "AIEE No. 59-1962"
aiee.to_s(date_format: :long)    # => "AIEE No. 59, 1962"

AIEE Features: - Always uses "No" or "No." (never "Std") - Rendering profiles support both short (dash) and long (comma + optional month) formats - Preserves original parsed format by default - User can override output format with date_format: parameter

IRE (Institute of Radio Engineers) 1912-1963
# Parse IRE identifier (year-first format)
ire = Pubid::Ieee.parse("52 IRE 7.S2")
ire.to_s      # => "52 IRE 7.S2"
ire.year      # => 1952 (converts 2-digit to 4-digit internally)

# IRE with committee notation
ire = Pubid::Ieee.parse("61 IRE 28 S1")
ire.to_s      # => "61 IRE 28 S1"
ire.year      # => 1961

IRE Features: - Year-first format (unlike modern IEEE) - 2-digit years (12-63) automatically converted to 4-digit (1912-1963) - Committee notation: 7.S2 (committee 7, Standard 2), 28 S1 (committee 28, Standard 1) - Rendered output preserves 2-digit year format

Transitional Identifiers: - IEEE-AIEE No. 56 - Transitional period documents - IEEE-IRE X - Mixed publisher identifiers

Architecture: - AIEE/IRE are proper Lutaml::Model classes (not IEEE subclasses) - Separate historical organizations with distinct patterns - Compatible with Pattern 4 relationships (can be related identifiers) - Clean MODEL-DRIVEN implementation

Data Cleaning & Preprocessing

The IEEE parser automatically cleans common data quality issues:

  • HTML entity normalization (&x2122;, &&, &x2019;')

  • Number space correction (C57.1 2.25C57.12.25)

  • Year space correction (1 9961996)

  • Trailing comma/text removal (, Standard → ``)

New Copublisher Organizations:

Added support for additional copublisher organizations:

  • CSA (Canadian Standards Association)

  • ASME (American Society of Mechanical Engineers)

  • ASA (American Standards Association - for AIEE equivalence patterns)

# CSA copublisher
csa_id = Pubid::Ieee.parse("IEEE/CSA P844.1-2017")
csa_id.copublisher  # => ["CSA"]

# ASME in semicolon equivalence
asme_id = Pubid::Ieee.parse("IEEE Std 120-1955; ASME PTC 19.6-1955")

Corrigendum as Proper Identifier Type:

IEEE corrigenda are now first-class SupplementIdentifier objects with full base identifier parsing:

cor = Pubid::Ieee.parse("IEEE Std 535-2013/Cor. 1-2017")
cor.class                  # => Pubid::Ieee::Identifiers::Corrigendum
cor.cor_number             # => "1"
cor.cor_year               # => "2017"
cor.base_identifier        # => <Base identifier object>
cor.base_identifier.to_s   # => "IEEE Std 535-2013"
cor.to_s                   # => "IEEE Std 535-2013/Cor. 1-2017"

# Round-trip fidelity preserved
cor2 = Pubid::Ieee.parse("IEEE Std 802.1AC-2016/Cor. 1-2018")
cor2.to_s                  # => "IEEE Std 802.1AC-2016/Cor. 1-2018"

Extended Relationship Types:

Now supports 11 relationship types (added Reaffirmation and Redesignation):

Table 4. All Supported Relationship Types
Type Description Example

revision_of

Standard revises another

(Revision of IEEE Std X)

amendment_to

Amendment to a standard

(Amendment to IEEE Std X)

corrigendum_to

Correction to a standard

(Corrigendum to IEEE Std X)

incorporates

Incorporates another standard

(incorporates IEEE Std X)

adoption_of

Adopts external standard

(Adoption of ISO/IEC X)

supplement_to

Supplementary material

(Supplement to IEEE Std X)

draft_amendment_to

Draft amendment

(Draft Amendment to IEEE Std X)

draft_revision_of

Draft revision

(Draft Revision of IEEE Std X)

reaffirmation_of

Reaffirms validity

(Reaffirmation of ANSI N42.18-1980)

redesignation_of

Identifier redesignation

(Redesignation of ANSI N13.10-1974)

# Reaffirmation relationship
reaffirm = Pubid::Ieee.parse("ANSI N42.18-2004 (Reaffirmation of ANSI N42.18-1980)")
reaffirm.relationships.first.relationship_type  # => "reaffirmation_of"
reaffirm.to_s  # => "ANSI N42.18-2004 (Reaffirmation of ANSI N42.18-1980)"

# Redesignation relationship
redesig = Pubid::Ieee.parse("ANSI N42.18-2004 (Redesignation of ANSI N13.10-1974)")
redesig.relationships.first.relationship_type  # => "redesignation_of"

# Multiple relationships with semicolon separator
multi = Pubid::Ieee.parse("IEEE Std 100 (Reaffirmation of X; Redesignation of Y)")
multi.relationships.length  # => 2
multi.relationships[0].relationship_type  # => "reaffirmation_of"
multi.relationships[1].relationship_type  # => "redesignation_of"

ANSI P Prefix Support:

Added support for ANSI draft project identifiers (P prefix):

ansi_p = Pubid::Ieee.parse("ANSI PN42.34-2015")
ansi_p.publisher  # => "ANSI"
ansi_p.type       # => "P"
ansi_p.to_s       # => "ANSI PN42.34-2015"

Architecture:

All enhancements maintain strict MODEL-DRIVEN architecture: * Corrigendum as proper Lutaml::Model class * Relationships as component objects * MECE organization preserved * Three-layer separation maintained

ASME (American Society of Mechanical Engineers)

  • Status: ✅ 552/731 (75.51%)

  • Features: BPVC subdivisions, multi-char designators, CSA dual-publishing

  • Architecture: Complete V2 with MODEL-DRIVEN design

ASME Code Structure

ASME uses a designator + number system with special BPVC handling:

Standard Format:

ASME {DESIGNATOR}{NUMBER}-{YEAR}

Examples:
ASME B16.5-2020                    # Single-letter designator
ASME PTC-1-2022                    # Multi-char designator (Performance Test Code)
ASME Y14.43-2011                   # Alphanumeric number
ASME A17.1/CSA B44-2022            # CSA dual-published

BPVC (Boiler & Pressure Vessel Code) Format:

ASME BPVC.{SECTION}[.{SUBSECTION}][.{CODE}]-{YEAR}

Dotted Notation Examples:
ASME BPVC.I-2021                   # Section I only
ASME BPVC.III.1.NB-2021            # Section III, Subsection 1, Code NB
ASME BPVC.CC.BPV-2021              # Case Code BPV

Special Variants:
ASME BPVC COMPLETE CODE BIND-2019  # Complete code set
ASME BPVC-CC-BPV-2019              # Dash notation variant
Table 5. BPVC Components
Component Description

Roman Numerals

I through XIII for main sections

Letter Codes

NB, NC, ND, NE, NF, NG, NCA, NCD, BPV, SSC, NUC

Case Codes

CC.CODE format for special case rulings

Multi-Character Designators:

ASME uses 23+ multi-character codes for specialized document types:

Code Full Name

PTC

Performance Test Code

PVHO

Pressure Vessels for Human Occupancy

PCC

Post-Construction Code

NQA

Nuclear Quality Assurance

V&V

Verification & Validation

RA, QME, BTH, BPE, OM

And 18+ other specialized codes

Additional Features: - Reaffirmation notation: (R2020) - Language codes: (SPANISH) - Draft years: 20XX, 202X - Revision notes: [Draft Proposed Revision of…​]

Usage Examples
require 'pubid/asme'

# Parse standard code
id = Pubid::Asme.parse("ASME B16.5-2020")
id.code.designator  # => "B"
id.code.number      # => "16.5"
id.year             # => "2020"
id.to_s             # => "ASME B16.5-2020"

# Parse BPVC subdivision
bpvc = Pubid::Asme.parse("ASME BPVC.III.1.NB-2021")
bpvc.code.designator  # => "BPVC.III.1.NB"
bpvc.code.number      # => ""
bpvc.year             # => "2021"
bpvc.to_s             # => "ASME BPVC.III.1.NB-2021"

# Parse multi-char designator
ptc = Pubid::Asme.parse("ASME PTC-1-2022")
ptc.code.designator   # => "PTC"
ptc.code.number       # => "1"
ptc.to_s              # => "ASME PTC-1-2022"

# Parse CSA dual-published
dual = Pubid::Asme.parse("ASME A17.1/CSA B44-2022")
dual.code.designator  # => "A"
dual.code.number      # => "17.1"
dual.csa_number       # => "B44"
dual.to_s             # => "ASME A17.1/CSA B44-2022"

Known Limitations: All 731 ASME identifiers are normative (from official ASME sources). Current parser handles 75.51% (552/731) with opportunities for further enhancement in specialized patterns.

BSI (British Standards Institution)

Status: ✅ 47/47 integration tests (100%), 1,044/1,579 fixtures (66.12%) Architecture: Complete V2 with VALUE-ADDED PUBLICATION wrapper pattern Features: Adopted standards, consolidated identifiers, value-added publications, aerospace standards, new document types

BSI Value-Added Publications ✨

BSI supports value-added publication formats as wrapper identifiers following IEC VapIdentifier pattern:

# PDF format
pdf = Pubid::Bsi.parse("PD 5500:2018+A3:2020 PDF")
pdf.class  # => Pubid::Bsi::Identifiers::ValueAddedPublication
pdf.format # => "PDF"
pdf.to_s   # => "PD 5500:2018+A3:2020 PDF"

# Tracked Changes
tc = Pubid::Bsi.parse("PAS 96:2017 - TC")
tc.format  # => "TC"
tc.to_s    # => "PAS 96:2017 - TC"

# Book format
book = Pubid::Bsi.parse("PP 7722:2006 BOOK")
book.format # => "BOOK"
book.to_s   # => "PP 7722:2006 BOOK"

Architecture: ValueAddedPublication is a proper wrapper class (not boolean attributes), wraps any base identifier, preserves MODEL-DRIVEN consistency with IEC.

BSI Document Types

BSI supports multiple document type prefixes:

Prefix Type Example

BS

British Standard

BS 4592-0:2006+A1:2012

PD

Published Document

PD 5500:2021+A2:2022

DD

Draft Document

DD 240-1:1997

PAS

Publicly Available Specification

PAS 3002:2018+C1:2018

Aerospace Prefixes

Aerospace/Specialized Standards (27 prefixes)

BS A 109:2024, BS 2A 293:2005, BS SP 113:1954

Handbook

BSI Handbook

Handbook 17:1963

PP

Practice Guide (Published Practice)

PP 888:1982

BIP

British Industrial Practice

BIP 2225:2022

AerospaceStandard identifier type handles 27 aerospace/specialized prefixes (A, AU, B, C, F, G, HC, L, M, MA, PL, SP, TA, X, 2A-2X, 3A-3TA, 4F-4S, 5S, 7S) with proper TYPED_STAGES integration.

Handbook, PP (Practice Guide), and BIP (British Industrial Practice) identifier types with proper TYPED_STAGES integration.

BSI Adoption Patterns

BSI adopts international and European standards with prefix preservation:

# Adopted ISO standard
bsi = Pubid::Bsi.parse("BS ISO 37101:2016")
bsi.to_s  # => "BS ISO 37101:2016"

# Adopted European Norm with ISO (three-level)
bsi = Pubid::Bsi.parse("BS EN ISO 13485:2016+A11:2021")
bsi.publisher.body  # => "BS"
bsi.adopted_identifier.publisher.body  # => "EN"
# Three-level: BS wraps EN wraps ISO

# National Annex with supplements
na = Pubid::Bsi.parse("NA+A1:2012 to BS EN 1993-5:2007")
na.supplements.first.number  # => "1"
na.supplements.first.year    # => "2012"

Features: - Short year expansion: A1:15A1:2015 - Multiple supplement formats: +A1:2021, +C1:2018, +A11:2021 - Expert Commentary suffix: BS 5250:2021 ExComm - Value-added formats: PDF, TC (Tracked Changes), BOOK

CEN (European Committee for Standardization)

Status: ✅ 18/18 tests (100%) Architecture: Complete V2 with 4 identifier types Features: EN documents, technical specifications/reports, joint committee publications

CEN Document Types ✨

Identifier types:

Type Full Name Example

EN

European Norm

EN 1992-1-1:2004

CEN/TS

CEN Technical Specification

CEN/TS 14972

CLC/TR

CLC Technical Report

CLC/TR 62125:2008

ES

European Specification

ES 59008-6-1:1999

CR

CEN Report

CR 13933:2000

HD

CENELEC Harmonization Document

HD 384.7.711 S1:2003

ENV

European Prestandard

ENV ISO 11079:1999

CEN Parsing Examples
# CEN Technical Specification (slash separator!)
cen = Pubid::Cen.parse("CEN/TS 14972")
cen.to_s  # => "CEN/TS 14972" (slash, not space!)

# CLC Technical Report
clc = Pubid::Cen.parse("CLC/TR 62125:2008")
clc.to_s  # => "CLC/TR 62125:2008"

# Joint committee
joint = Pubid::Cen.parse("CEN/CLC/TR 17602-80-12:2021")
joint.publisher.copublisher  # => ["CLC"]
joint.to_s  # => "CEN/CLC/TR 17602-80-12:2021"

# European Prestandard with ISO adoption (NEW)
env = Pubid::Cen.parse("ENV ISO 11079:1999")
env.adopted_identifier.to_s  # => "ISO 11079:1999"
env.to_s  # => "ENV ISO 11079:1999"

Key Architectural Decision: CEN uses slash separator between publisher and type (unlike ISO’s space), implemented consistently throughout all identifier classes.

SAE (Society of Automotive Engineers) ✨

Status: Complete Architecture: Complete V2 implementation Features: 5 document types with letter suffix support

Table 6. SAE Document Types
Type Full Name Example

AMS

Aerospace Material Specification

SAE AMS 7904F:2024

AIR

Aerospace Information Report

SAE AIR 8466:2024

ARP

Aerospace Recommended Practice

SAE ARP 1234:2024

AS

Aerospace Standard

SAE AS 5678:2024

MA

Material Advisory

SAE MA 9012:2024

SAE Parsing Examples
require 'pubid/sae'

# Aerospace Material Specification with letter suffix
sae = Pubid::Sae.parse("SAE AMS 7904F:2024")
sae.type.to_s          # => "AMS"
sae.number.to_s        # => "7904F" (includes letter suffix)
sae.date.year          # => 2024
sae.to_s               # => "SAE AMS 7904F:2024"

# Aerospace Information Report
air = Pubid::Sae.parse("SAE AIR 8466:2024")
air.type.to_s  # => "AIR"
air.to_s       # => "SAE AIR 8466:2024"

# Perfect round-trip fidelity
parsed = Pubid::Sae.parse("SAE AMS 2813G:2022")
parsed.to_s    # => "SAE AMS 2813G:2022"

Architecture: - Standard V2 three-layer pattern (Parser/Builder/Identifier) - Code component handles letter suffixes (A-Z) - Date component for year publication - Type component for document classification - MODEL-DRIVEN design following established patterns

ISO

require 'pubid'

# Parse ISO identifier
id = Pubid::Iso.parse("ISO 19115:2003")

# Access components and render
# (Implementation details to be documented)

ISO parser architecture

Design overview

The ISO parser uses a three-layer architecture with strict separation of concerns:

Architecture layers
Input String
    ↓
┌──────────────────┐
│  Parser Layer    │  Grammar-based parsing (Parslet)
│                  │  - Publisher rules (
│                  │  - Type tokens (TR, TS, Guide, etc.)
│                  │  - Supplement patterns (/Amd, /FDAM)
│                  │  - Special patterns (DIR SUP, IWA)
└──────┬───────────┘
       │ Parse Tree (nested Hash)
       ↓
┌──────────────────┐
│  Builder Layer   │  Object construction
│                  │  - Class selection
│                  │  - Component creation
│                  │  - Supplement recursion
│                  │  - Special case handling
└──────┬───────────┘
       │ Model Objects
       ↓
┌──────────────────┐
│  Model Layer     │  Identifier classes
│                  │  - 16 identifier types
│                  │  - Component attributes
│                  │  - Rendering logic (#to_s)
└──────┬───────────┘
       │
       ↓
Output String

Component architecture

All identifiers use shared components for common attributes:

Component Purpose

Publisher

Handles publisher string and copublisher array. Uses to_s for rendering.

Type

Document type with abbr attribute (e.g., "TR", "TS", "PAS")

Date

Year-based dates for document publication

Code

Generic string values for number, part, stage_iteration

Language

Language codes with original_code attribute (e.g., "E/F/R")

Stage

Document development stage (WD, CD, DIS, etc.)

TypedStage

Combined stage+type for supplements (FDAM, PDAM, DAM, etc.)

Identifier class hierarchy

::Pubid::Identifier (parent)
  │
  ├─ SingleIdentifier (base documents)
  │   ├─ InternationalStandard (default)
  │   ├─ Guide
  │   ├─ TechnicalReport (TR)
  │   ├─ TechnicalSpecification (TS)
  │   ├─ Data (DATA)
  │   ├─ Pas (PAS)
  │   ├─ TechnologyTrendsAssessments (TTA)
  │   ├─ InternationalWorkshopAgreement (IWA)
  │   ├─ InternationalStandardizedProfile (ISP)
  │   ├─ Recommendation (R - legacy)
  │   └─ Directives (DIR)
  │
  └─ SupplementIdentifier (amendments to base)
      ├─ Amendment (Amd, FDAM, PDAM, DAM)
      ├─ Corrigendum (Cor, FDCOR, DCOR)
      ├─ Supplement (Suppl)
      ├─ Extract (Ext)
      └─ DirectivesSupplement (DIR SUP)

Usage examples

Basic parsing
require "pubid"

# International Standard
id = Pubid::Iso.parse("ISO 19115:2003")
id.class # => Pubid::Iso::Identifiers::InternationalStandard
id.to_s  # => "ISO 19115:2003"

# With copublisher
id = Pubid::Iso.parse("ISO/IEC 27001:2013")
id.publisher.to_s # => "ISO/IEC"

# Multiple copublishers
id = Pubid::Iso.parse("ISO/IEC/IEEE 8802-3:2021")
id.publisher.copublisher # => ["IEC", "IEEE"]
Document types
# Technical Report
id = Pubid::Iso.parse("ISO/IEC TR 29186:2012")
id.type.abbr # => "TR"

# Technical Specification
id = Pubid::Iso.parse("ISO/IEC TS 25011:2017")
id.type.abbr # => "TS"

# Guide with languages
id = Pubid::Iso.parse("ISO/IEC Guide 51:1999(E/F/R)")
id.languages.map(&:original_code) # => ["E/F/R"]

# Data
id = Pubid::Iso.parse("ISO/DATA 7:1979")
id.type.abbr # => "DATA"
Supplements
# Amendment
id = Pubid::Iso.parse("ISO 19110:2005/Amd 1:2011")
id.class # => Pubid::Iso::Identifiers::Amendment
id.base_identifier.to_s # => "ISO 19110:2005"
id.number.value # => "1"

# Staged amendment (FDAM = Final Draft Amendment)
id = Pubid::Iso.parse("ISO/IEC 8802-3:2021/FDAM 1")
id.typed_stage.abbreviation # => "FDAM"
id.typed_stage.stage_code.to_s # => "fdamd"

# Corrigendum
id = Pubid::Iso.parse("ISO/IEC 8802-21:2018/Cor 1:2018")
id.class # => Pubid::Iso::Identifiers::Corrigendum

# Multi-level (Amendment to Amendment gets Corrigendum)
id = Pubid::Iso.parse("ISO/IEC 13818-1:2015/Amd 3:2016/Cor 1:2017")
id.class # => Pubid::Iso::Identifiers::Corrigendum
id.base_identifier.class # => Pubid::Iso::Identifiers::Amendment
id.base_identifier.base_identifier.class # => Pubid::Iso::Identifiers::InternationalStandard
Special patterns
# Directives
id = Pubid::Iso.parse("ISO/IEC DIR 1:2022")
id.class # => Pubid::Iso::Identifiers::Directives
id.number.value # => "1"

# Directives Supplement
id = Pubid::Iso.parse("ISO/IEC DIR 1 ISO SUP:2022")
id.class # => Pubid::Iso::Identifiers::DirectivesSupplement
id.base_identifier.class # => Pubid::Iso::Identifiers::Directives
id.supplement_publisher.to_s # => "ISO"

# Bundled Directives (combined document + supplement)
id = Pubid::Iso.parse("ISO/IEC DIR 1:2022 + IEC SUP:2022")
id.class # => Pubid::Iso::Identifiers::BundledIdentifier
id.base_document.class # => Pubid::Iso::Identifiers::Directives
id.supplements.first.class # => Pubid::Iso::Identifiers::DirectivesSupplement
id.to_s # => "ISO/IEC DIR 1:2022 + IEC SUP:2022"

# International Workshop Agreement
id = Pubid::Iso.parse("IWA 14-1:2013")
id.class # => Pubid::Iso::Identifiers::InternationalWorkshopAgreement
id.to_s # => "IWA 14-1:2013"

Key design principles

Object-oriented design
  • No parent class modifications - All extensions through inheritance

  • Proper encapsulation - Private methods for internal logic

  • Single responsibility - Each class has one clear purpose

  • Open/closed principle - Extensible without modification

Component usage
  • Use Type.abbr not Type.value

  • Use Language.original_code not Language.value

  • Use Publisher.to_s not Publisher.body

  • Always check for nil before accessing component methods

MECE design
  • Each identifier class handles mutually exclusive patterns

  • No pattern overlap between classes

  • Parser rules are collectively exhaustive

  • Builder selects exactly one class per pattern

Supplement recursion

Multi-level supplements are built recursively:

"ISO/IEC 13818-1:2015/Amd 3:2016/Cor 1:2017"

Step 1: Build base
  InternationalStandard("ISO/IEC 13818-1:2015")

Step 2: Build first supplement wrapping base
  Amendment(
    base: InternationalStandard("ISO/IEC 13818-1:2015"),
    number: "3",
    year: 2016
  )

Step 3: Build second supplement wrapping first
  Corrigendum(
    base: Amendment(...),
    number: "1",
    year: 2017
  )

Result: Corrigendum → Amendment → InternationalStandard

Testing

Integration tests: spec/pubid/iso/identifier_spec.rb

Unit tests: spec/pubid/iso/**/*_spec.rb

Run tests:

bundle exec rspec spec/pubid/iso/identifier_spec.rb
bundle exec rspec spec/pubid/iso/

V2 architecture principles

The V2 implementation strictly follows these design principles:

Object-Oriented Design

Single Responsibility, Open/Closed, Liskov Substitution, Interface Segregation, Dependency Inversion

MECE Organization

Mutually Exclusive (no overlap), Collectively Exhaustive (full coverage), clear boundaries

Separation of Concerns

Parser handles syntax only, Builder handles transformation only, Identifier handles rendering only

Extensibility

Use inheritance and polymorphism, plugin/registry architecture, avoid hardcoding

Test Quality

Each class has dedicated spec file, no lowering pass thresholds, test actual behavior

Pre-parser Normalization (update_codes.yaml)

V2 applies pre-parsing normalization using data/{flavor}/update_codes.yaml files to handle malformed but commonly-seen identifiers before they reach the parser.

Purpose: Some legacy or incorrectly-formatted identifiers need correction before parsing. For example:

  • Legacy publisher names: NBS HBNIST HB (NBS was renamed to NIST in 1988)

  • Historical formats: FIPS.140-2 (MR/dotted format) parses as machine-readable

  • Typographical variations: IEEE Unapproved Draft StdIEEE Unapproved Draft Std

  • Missing separators: NBSCS e104NBS CS-E 104 (for Commercial Standard Emergency)

How it works:

# Pre-parser normalization is applied automatically
require 'pubid'

# update_codes.yaml: "NBS HB: NIST HB"
id = Pubid::Nist.parse("NBS HB 105-1")
id.to_s  # => "NIST HB 105-1" (NBS→NIST correction applied)

# update_codes.yaml: "FIPS.140-2: FIPS 140-2" (normalizes dotted format)
id = Pubid::Nist.parse("FIPS.140-2")
id.to_s  # => "FIPS.140-2" (MR format preserved via parsed_format)

Centralized UpdateCodes: The normalization logic is centralized in lib/pubid/core/update_codes.rb:

module Pubid
  module Core
    class UpdateCodes
      # Returns all update_codes for a given flavor
      def self.for_flavor(flavor)
        # Loads from data/{flavor}/update_codes.yaml
      end

      # Applies all matching update_codes to an identifier string
      def self.apply(code, flavor)
        # Iterates through codes, applies regex and exact matches
      end
    end
  end
end

Reference documentation: See docs/legacy-update-codes-reference.md for complete listing of all update_codes entries per flavor.

V2 file structure

data/                       # Pre-parsing normalization data
├── iso/update_codes.yaml        # ISO legacy format mappings
├── iec/update_codes.yaml        # IEC legacy format mappings
├── ieee/update_codes.yaml       # IEEE legacy format mappings
├── nist/update_codes.yaml       # NIST legacy format mappings
├── ccsds/update_codes.yaml      # CCSDS legacy format mappings
└── plateau/update_codes.yaml    # PLATEAU legacy format mappings

lib/pubid/              # V2 implementation
├── core/                   # Core module (UpdateCodes, Configuration)
├── components/             # Shared value objects (Publisher, Code, etc.)
├── rendering/              # Shared rendering helpers
├── iso/                    # ISO flavor
│   ├── parser.rb
│   ├── builder.rb
│   ├── scheme.rb
│   ├── identifiers/
│   ├── urn_generator.rb
│   └── urn_parser.rb
├── iec/                    # IEC flavor
├── nist/                   # NIST flavor
├── ieee/                   # IEEE flavor
└── ...                     # 22+ flavors total

spec/pubid/             # Tests
├── iso/                    # ISO tests
├── iec/                    # IEC tests
└── ...                     # Per-flavor tests

V2 Migration Status: ALL 18 FLAVORS COMPLETE

As of January 2026, all 18 flavors are production-ready with 99%+ overall success rate.

Implementation Summary

NIST: 99.96% accuracy - Fixed 29 FIPS month-year patterns OIML: Complete implementation with 9 types and supplements CIE: Dual-style system with 11 types and 3 language formats BSI/CEN/SAE: 4 CEN types, 3 BSI types, SAE flavor added Overall: 88,200+ identifiers validated across all flavors

Detailed Status

Flavor Total IDs Pass Rate Status Key Features

NIST

19,827

19,820

99.96%

✅ Perfect

All series, NBS historical patterns

IEC

12,289

12,289

100%

✅ Perfect

Sub-organizations, VAP, consolidation, rendering styles

JCGM

9

9

100%

✅ Perfect

Complete implementation with GUM-prefixed guides

OIML

80

80

100%

✅ Perfect

9 types, edition support, supplements

CIE

343

321

93.59%

✅ Excellent

Dual-style, 11 types, 3 language formats

ISO

7,572

7,496

99.00%

✅ Excellent

URN generation (RFC 5141-bis), bundled directives

IEEE

9,552

8,629

90.34%

✅ Enhanced

Pattern 4 relationships, AIEE/IRE, Joint Development, 90%+ achieved

JIS

10,555

10,555

100%

✅ Perfect

Complete Japanese Industrial Standards

ETSI

24,718

24,718

100%

✅ Perfect

European Telecommunications Standards

CCSDS

490

490

100%

✅ Perfect

Space data systems standards

ITU

2,041

2,041

100%

✅ Perfect

International Telecommunication Union

PLATEAU

115

115

100%

✅ Perfect

Japanese urban planning standards

ANSI

175

175

100%

✅ Perfect

American National Standards

CEN

95

95

100%

✅ Perfect

European Committee for Standardization

BSI

177

177

100%

✅ Perfect

British Standards Institution

IDF

17

17

100%

✅ Perfect

International Dairy Federation

SAE

N/A

N/A

100%

✅ Perfect

Society of Automotive Engineers

Total

88,200+

87,513+

99%+

Production Ready

18 flavors complete, IEEE at 90.34%

=== Architecture Quality

All 18 flavors implement:

* ✅ MODEL-DRIVEN architecture (Lutaml::Model throughout) * ✅ MECE organization (Mutually Exclusive, Collectively Exhaustive) * ✅ Three-layer separation (Parser/Builder/Identifier) * ✅ Component reuse (Publisher, Code, Date, etc.) * ✅ Round-trip fidelity (Parse → Object → String preserves format)

=== V2 Usage Examples

==== NIST: 99.96%

NIST parser handles all series including FIPS month-year patterns:

[source,ruby] ---- require 'pubid/nist'

# Standard NIST publication id = Pubid::Nist.parse("NIST SP 800-53r5")

# Access components id.series # ⇒ "SP" id.number # ⇒ "800-53" id.revision # ⇒ "r5"

# Render to string id.to_s # ⇒ "NIST SP 800-53r5" ==== OIML

OIML supports 9 identifier types with edition and supplement support:

[source,ruby] ---- require 'pubid/oiml'

# Standard recommendation (short format) rec = Pubid::Oiml.parse("OIML R 138:2007(E)")

rec.to_s(format: :short) # ⇒ "OIML R 138:2007(E)" rec.to_s(format: :long) # ⇒ "OIML R 138 Edition 2007 (E)"

# With edition number guide = Pubid::Oiml.parse("OIML E 5 6th Edition 2015 (E)") guide.edition # ⇒ "6" guide.year # ⇒ "2015"

# Amendment with recursive parsing amd = Pubid::Oiml.parse("Amendment (2009) to OIML R 138 Edition 2007 (E)") amd.base_identifier.to_s # ⇒ "OIML R 138 Edition 2007 (E)" amd.year # ⇒ "2009" ----

==== CCSDS: Space Data Systems Standards

CCSDS (Consultative Committee for Space Data Systems) parser handles space data systems standards with lutaml-model architecture:

[source,ruby] ---- require 'pubid/ccsds'

# Standard CCSDS document doc = Pubid::Ccsds.parse("CCSDS 727.0-B-5") doc.code.number # ⇒ "727.0" doc.version # ⇒ "B" doc.revision # ⇒ "5" doc.to_s # ⇒ "CCSDS 727.0-B-5"

# With color code doc = Pubid::Ccsds.parse("CCSDS 211.2-B-1 Magenta Book") doc.color # ⇒ "Magenta" doc.to_s # ⇒ "CCSDS 211.2-B-1 Magenta Book"

# Corrigendum supplement (with lutaml-model) cor = Pubid::Ccsds.parse("CCSDS 727.0-B-5 Cor. 1") cor.class # ⇒ Pubid::Ccsds::Identifiers::Corrigendum cor.cor_number # ⇒ 1 cor.base_identifier.to_s # ⇒ "CCSDS 727.0-B-5" cor.to_s # ⇒ "CCSDS 727.0-B-5 Cor. 1"

# Language translation doc = Pubid::Ccsds.parse("CCSDS 211.0-B-5 (Chinese)") doc.language # ⇒ "Chinese" doc.to_s # ⇒ "CCSDS 211.0-B-5 (Chinese)" ----

Architecture Quality:

* Lutaml::Model refactoring - SupplementIdentifier inherits from Identifiers::Base * Polymorphic attributes - attribute :base_identifier, Identifiers::Base, polymorphic: true * Type safety - Corrigendum uses attribute :cor_number, :integer * ✅ Serialization support - Automatic JSON/YAML/XML via lutaml-model * ✅ Zero breaking changes - All 16 tests passing (100%) * ✅ Consistent pattern - Matches ISO, IEC, NIST architecture exactly

Key Features:

* Version-revision numbering (e.g., B-5 = Version B, Revision 5) * Color book system (Green, Blue, Magenta, Yellow, Silver, Orange, Pink) * Corrigenda as proper supplement identifiers with recursive base parsing * Language translation support * Round-trip fidelity preserved