Class: Ucode::Commands::CanonicalBuildCommand

Inherits:
Object
Defined in:
lib/ucode/commands/canonical_build.rb

Overview

ucode canonical-build — Mode 1's canonical Unicode dataset build (TODO 21). Single pass: enrich each codepoint via Ucode::Coordinator, resolve its glyph via the 4-tier Glyphs::Resolver, write index.json + glyph.svg atomically, accumulate per-tier + per-block stats, and emit output/build-report.json.

This is the v0.2 replacement for the v0.1 cell-extractor pipeline in GlyphsCommand. The two coexist until the v0.1 pipeline is removed (TODOs 17-19); CanonicalBuildCommand is the path forward for production dataset runs.
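The overview says index.json and glyph.svg are written atomically but does not show how. One common way to get that guarantee is to write to a temporary file and then rename it, sketched below. The helper name atomic_write is hypothetical, and nothing here is confirmed to match what Repo::CodepointWriter actually does.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of ucode: write `content` to `path` so that a
# reader sees either the previous file or the complete new one.
def atomic_write(path, content)
  dir = File.dirname(path)
  FileUtils.mkdir_p(dir)
  # Keep the temp file in the target directory so the rename stays on one
  # filesystem; a same-filesystem rename is atomic on POSIX systems.
  tmp = File.join(dir, ".#{File.basename(path)}.#{Process.pid}.tmp")
  begin
    File.open(tmp, "w") do |f|
      f.write(content)
      f.flush
      f.fsync # push the bytes to disk before the rename publishes them
    end
    File.rename(tmp, path)
  ensure
    # No-op after a successful rename; cleans up if the write raised.
    FileUtils.rm_f(tmp)
  end
  path
end
```

With this approach, a build interrupted partway through a write leaves at most a stray temp file, not a truncated index.json.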

Pre-conditions (per TODO 21)

  1. UCD + Unihan fetched for version (ucode fetch ucd, ucode fetch unihan).
  2. Ucode::Database built for version (ucode db build).
  3. Tier 1 fonts resolvable via the configured SourceConfig YAML.
  4. Code Charts PDFs cached (for Pillar 1) — optional, only if pillar-1 sources are configured.
  5. Last Resort UFO cloned (for Pillar 3) — optional, only if pillar-3 fallback is configured.

Missing pre-conditions cause silent fallthrough to lower tiers; the build report's by_tier totals surface what ran.
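Because fallthrough is silent, a post-build check on the report is the practical way to notice a missing source. The sketch below lists tiers that resolved nothing. It assumes the parsed build-report.json exposes counts under "totals" and "by_tier", and the tier key names (tier1 to tier4) and sample counts are invented for illustration; check them against a real report before relying on this.

```ruby
require "json"

# Hypothetical helper: return the expected tiers that resolved zero glyphs.
# The "totals" -> "by_tier" layout and the tier names are assumptions.
def idle_tiers(report, expected_tiers)
  by_tier = report.dig("totals", "by_tier") || {}
  expected_tiers.select { |tier| by_tier.fetch(tier, 0).zero? }
end

# Illustrative data only; a real run would parse output/build-report.json.
report = JSON.parse(<<~REPORT)
  { "totals": { "by_tier": { "tier1": 1200, "tier2": 0, "tier4": 35 } } }
REPORT

puts idle_tiers(report, %w[tier1 tier2 tier3 tier4]).inspect
```

A tier that shows up as idle when its sources were meant to be configured points to one of the pre-conditions above not being met.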

Instance Method Details

#call(version, output_root:, source_config_path: nil, resolver: nil, validate: true, baseline: nil) ⇒ Hash

Returns { version:, codepoint_count:, report_path:, totals: }; when validate is true, the Hash also carries validation_report_path: and validation_passed:.

Parameters:

  • version (String)

    resolved UCD version

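  • output_root (String, Pathname)

    root directory for the build output; wrapped in a Pathname and handed to the codepoint writer and the build-report writer.
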
  • source_config_path (String, Pathname, nil) (defaults to: nil)

    override the Tier 1 font config YAML; nil uses the default (config/unicode17_tier1_fonts.yml).

  • resolver (Ucode::Glyphs::Resolver, nil) (defaults to: nil)

    inject a pre-built resolver (skips SourceBuilder); used by tests.

  • validate (Boolean) (defaults to: true)

    run Repo::BuildValidator after the build and emit validation-report.json. Default true; tests that don't care about validation pass false.

  • baseline (Hash{String=>Integer}, nil) (defaults to: nil)

    per-block expected built counts forwarded to the validator when validate: is true. nil skips the block_coverage check.

Returns:

  • (Hash)

    { version:, codepoint_count:, report_path:, totals: }, plus validation_report_path: and validation_passed: when validate is true
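As a rough illustration of consuming the return value, the sketch below turns the result Hash into a one-line summary. The helper summarize_build is hypothetical. It relies only on the keys named above and on the source's early return when validate is false, which leaves the validation keys out of the Hash. The commented-out invocation assumes the command can be built with a bare .new and uses "17.0.0" purely as an example version.

```ruby
# Hypothetical helper: summarise the Hash returned by #call.
def summarize_build(result)
  line = "Unicode #{result[:version]}: #{result[:codepoint_count]} codepoints, " \
         "report at #{result[:report_path]}"
  # With validate: false the validation keys are never merged in.
  return "#{line} (validation skipped)" unless result.key?(:validation_passed)

  status = result[:validation_passed] ? "passed" : "failed"
  "#{line}, validation #{status} (#{result[:validation_report_path]})"
end

# Assumed invocation (constructor arguments are not documented here):
# result = Ucode::Commands::CanonicalBuildCommand.new.call(
#   "17.0.0", output_root: "output", validate: false
# )
# puts summarize_build(result)
```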



# File 'lib/ucode/commands/canonical_build.rb', line 57

def call(version, output_root:, source_config_path: nil,
         resolver: nil, validate: true, baseline: nil)
  root = Pathname.new(output_root)

  resolved_resolver = resolver || build_resolver(version, source_config_path)
  accumulator = Repo::BuildReportAccumulator.new(
    unicode_version: version,
    ucode_version: Ucode::VERSION,
  )

  coordinator = Coordinator.new
  writer = Repo::CodepointWriter.new(
    root,
    parallel_workers: workers,
    resolver: resolved_resolver,
    observer: accumulator,
  )

  ucd_dir = Cache.ucd_dir(version)
  unihan_dir = Cache.unihan_dir(version)
  codepoint_count = iterate(coordinator, ucd_dir, unihan_dir, writer,
                            accumulator)

  report = accumulator.to_report
  report_path = Repo::BuildReportWriter.new(root).write(report)

  result = {
    version: version,
    codepoint_count: codepoint_count,
    report_path: report_path,
    totals: report.totals.to_hash,
  }
  return result unless validate

  merge_validation_result(result, root, version, baseline)
end