Class: Ucode::Glyphs::PageRenderer
- Inherits:
- Object
- Defined in:
- lib/ucode/glyphs/page_renderer.rb
Overview
Strategy interface for PDF-page-to-SVG rendering.
Subclasses implement renderer_name, binary_name, and
build_command. The base class handles availability check,
command execution, error handling, and the renderer registry.
OCP: a new renderer is a new subclass file + one entry in
KNOWN_RENDERERS. The base class and existing renderers are not
modified.
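As a rough illustration of the subclass contract described above, the sketch below uses a stand-in base class (SketchPageRenderer, not the real Ucode::Glyphs::PageRenderer) and a hypothetical ExampleRenderer. The binary name `example-pdf2svg` and its `--page` flag are invented for illustration and do not correspond to any shipped renderer.

```ruby
# Stand-in for the real base class so the sketch runs on its own.
# Only the three abstract hooks are reproduced here.
class SketchPageRenderer
  class << self
    def renderer_name
      raise NotImplementedError
    end

    def binary_name
      raise NotImplementedError
    end

    def build_command(pdf_path, page_num, out_path)
      raise NotImplementedError
    end
  end
end

# Hypothetical renderer: the binary name and its flags are made up.
class ExampleRenderer < SketchPageRenderer
  class << self
    def renderer_name
      :example
    end

    def binary_name
      "example-pdf2svg"
    end

    # Returns an argv Array; every element is converted to a String.
    def build_command(pdf_path, page_num, out_path)
      [binary_name, "--page", page_num.to_s, pdf_path.to_s, out_path.to_s]
    end
  end
end

cmd = ExampleRenderer.build_command("charts.pdf", 3, "out/page3.svg")
```

In the real codebase the new class would also need its entry in KNOWN_RENDERERS before the registry methods can see it.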
Vector-only requirement: every renderer here must emit SVG
<path> elements (vector data) for the Code Charts PDFs, not
raster images. Callers verify this via path_count on the output.
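One way a caller might check the vector-only requirement is to count `<path>` elements in the rendered SVG. The helper below, count_svg_paths, is a hypothetical stand-in written for this page; the library's own path_count may be implemented differently.

```ruby
# Hypothetical helper: counts opening <path ...> tags in an SVG string.
# A regex scan is enough for a smoke check; it is not a full XML parse.
def count_svg_paths(svg_text)
  svg_text.scan(/<path[\s>\/]/).size
end

vector_svg = <<~SVG
  <svg xmlns="http://www.w3.org/2000/svg">
    <path d="M0 0L10 10"/>
    <path d="M5 5L0 10"/>
  </svg>
SVG

raster_svg = <<~SVG
  <svg xmlns="http://www.w3.org/2000/svg">
    <image width="10" height="10" href="page.png"/>
  </svg>
SVG

vector_count = count_svg_paths(vector_svg)
raster_count = count_svg_paths(raster_svg)
```

A count of zero, as for the raster sample here, would indicate that the renderer embedded an image instead of vector outlines.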
Direct Known Subclasses
DvisvgmRenderer, MutoolRenderer, Pdf2svgRenderer, PdftocairoRenderer
Constant Summary
- OUTPUT_FORMAT =
  :svg
- DEFAULT_SMOKE_FIXTURE =
  Fixture used by works? to smoke-test renderers. Resolved lazily so missing-fixture environments (installed gem without spec assets) don't fail at load time.
  File.("../../../spec/fixtures/pdfs/basic_latin.pdf", __dir__)
Class Method Summary
- .all ⇒ Array<Class>
  Every known concrete renderer.
- .available ⇒ Array<Class>
  Renderers whose binary is installed.
- .available? ⇒ Boolean
  True if the binary is on PATH.
- .binary_name ⇒ String, Symbol
  The binary looked up on PATH.
- .build_command(pdf_path, page_num, out_path) ⇒ Array<String>
  Build the argv for the renderer.
- .default ⇒ Class?
  The first working renderer; falls back to the first available renderer if none have been smoke-tested yet (preserves eager-init paths).
- .find(name) ⇒ Class?
- .output_format ⇒ Symbol
  Always :svg for now; future formats (png, etc.) would warrant a separate renderer family.
- .render(pdf_path, page_num, out_path) ⇒ Symbol
  Render one page of pdf_path to out_path as SVG.
- .renderer_name ⇒ Symbol
  Short identifier (e.g. :mutool).
- .reset_working_cache! ⇒ Object
  Clear the cached working list.
- .working ⇒ Array<Class>
  Renderers that actually produce SVG in the format GridDetector consumes (smoke-tested once per process via works?, then cached).
- .works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE) ⇒ Boolean
  Smoke-test the binary by actually rendering one page of the fixture PDF AND verifying the output format is consumable by the downstream GridDetector / CellExtractor pipeline.
Class Method Details
.all ⇒ Array<Class>
Returns every known concrete renderer.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 154
    def all
      @all ||= KNOWN_RENDERERS.map { |n| Ucode::Glyphs.const_get(n) }.freeze
    end
.available ⇒ Array<Class>
Returns renderers whose binary is installed.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 159
    def available
      all.select(&:available?)
    end
.available? ⇒ Boolean
Returns true if the binary is on PATH.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 74
    def available?
      system("which", binary_name.to_s, out: "/dev/null", err: "/dev/null")
    end
.binary_name ⇒ String, Symbol
Returns the binary looked up on PATH.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 53
    def binary_name
      raise NotImplementedError
    end
.build_command(pdf_path, page_num, out_path) ⇒ Array<String>
Build the argv for the renderer. Subclasses return an Array
suitable for Open3.capture2e (no shell interpolation).
    # File 'lib/ucode/glyphs/page_renderer.rb', line 69
    def build_command(pdf_path, page_num, out_path)
      raise NotImplementedError
    end
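To illustrate why an argv Array matters, the sketch below passes an argument containing shell metacharacters through Open3.capture2e. It runs the current Ruby interpreter (located via RbConfig.ruby) as a stand-in for a renderer binary; that substitution is an assumption made only so the example runs without any PDF tool installed.

```ruby
require "open3"
require "rbconfig"

# A file name with characters a shell would treat specially.
tricky = "my file; echo injected.pdf"

# Each Array element becomes exactly one argv entry for the child process.
# No shell parses the string, so ";" and spaces are passed through literally.
cmd = [RbConfig.ruby, "-e", "print ARGV.first", tricky]
output, status = Open3.capture2e(*cmd)
```

Here the child process receives the whole string as a single argument, which is the property build_command relies on when it returns an Array instead of a command string.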
.default ⇒ Class?
Returns the first working renderer; falls back to the first available renderer if none have been smoke-tested yet (preserves eager-init paths). nil if nothing is installed.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 187
    def default
      working.first || available.first
    end
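The selection order in default can be sketched with stand-in objects. The FakeRenderer Struct and the pick_default helper below are hypothetical; they only mimic the available? and works? predicates, whereas the real method reads the cached working and available lists.

```ruby
# Hypothetical stand-in for a renderer class with fixed predicate results.
FakeRenderer = Struct.new(:renderer_name, :installed, :passes_smoke_test) do
  def available?
    installed
  end

  def works?
    installed && passes_smoke_test
  end
end

# Mirrors `working.first || available.first` over an explicit list.
def pick_default(renderers)
  working = renderers.select(&:works?)
  available = renderers.select(&:available?)
  working.first || available.first
end

missing = FakeRenderer.new(:missing, false, false)
broken  = FakeRenderer.new(:broken, true, false)
healthy = FakeRenderer.new(:healthy, true, true)

preferred = pick_default([missing, broken, healthy]) # a working renderer wins
fallback  = pick_default([missing, broken])          # none work; first installed one
nothing   = pick_default([missing])                  # nothing installed
```

Under these assumptions, an installed but non-working renderer is still returned when no other candidate passes, so callers that need a verified renderer may prefer to consult working directly.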
.find(name) ⇒ Class?
    # File 'lib/ucode/glyphs/page_renderer.rb', line 180
    def find(name)
      all.find { |r| r.renderer_name == name.to_sym }
    end
.output_format ⇒ Symbol
Always returns :svg for now; future formats (png, etc.) would warrant a separate renderer family.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 59
    def output_format
      OUTPUT_FORMAT
    end
.render(pdf_path, page_num, out_path) ⇒ Symbol
Render one page of pdf_path to out_path as SVG.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 122
    def render(pdf_path, page_num, out_path)
      unless available?
        raise PdfRenderError.new(
          "binary '#{binary_name}' not available on PATH",
          context: { renderer: name, binary: binary_name },
        )
      end

      out = Pathname.new(out_path)
      out.dirname.mkpath
      cmd = build_command(Pathname.new(pdf_path), page_num, out)
      output, status = Open3.capture2e(*cmd)

      unless status.success? && out.exist? && out.size.positive?
        raise PdfRenderError.new(
          "render failed for page #{page_num} of #{pdf_path} via '#{binary_name}'",
          context: {
            renderer: name,
            binary: binary_name,
            exit_status: status.exitstatus,
            output: output,
          },
        )
      end

      :ok
    end
.renderer_name ⇒ Symbol
Returns a short identifier (e.g. :mutool).
    # File 'lib/ucode/glyphs/page_renderer.rb', line 48
    def renderer_name
      raise NotImplementedError
    end
.reset_working_cache! ⇒ Object
Clear the cached working list. Useful when the environment
changes (e.g. a binary is installed mid-process) or in tests.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 174
    def reset_working_cache!
      @working = nil
    end
.working ⇒ Array<Class>
Returns renderers that actually produce SVG in
the format GridDetector consumes (smoke-tested once per
process via works?, then cached). Subset of available.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 166
    def working
      return @working if @working

      @working = all.select(&:works?).freeze
    end
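The interaction between the cached working list and reset_working_cache! can be sketched as follows. SketchRegistry is a hypothetical class, and its candidates method stands in for the per-renderer smoke tests; the counter exists only to make the caching visible.

```ruby
# Hypothetical registry that mimics the memoized `working` list.
class SketchRegistry
  class << self
    attr_reader :smoke_test_runs

    # Stand-in for the expensive smoke tests; counts how often it runs.
    def candidates
      @smoke_test_runs = (@smoke_test_runs || 0) + 1
      %i[pdftocairo pdf2svg]
    end

    def working
      return @working if @working

      @working = candidates.freeze
    end

    def reset_working_cache!
      @working = nil
    end
  end
end

first  = SketchRegistry.working
second = SketchRegistry.working # served from the cache
runs_before_reset = SketchRegistry.smoke_test_runs

SketchRegistry.reset_working_cache!
SketchRegistry.working # recomputed after the reset
runs_after_reset = SketchRegistry.smoke_test_runs
```

Because an empty Array is truthy in Ruby, this memoization pattern caches an empty result as well, which is one reason a reset can be useful after a binary is installed mid-process.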
.works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE) ⇒ Boolean
Smoke-test the binary by actually rendering one page of the
fixture PDF AND verifying the output format is consumable by
the downstream GridDetector / CellExtractor pipeline.
Three things can make a renderer unusable for this codebase:
1. Binary not on PATH (`available?` catches this).
2. Binary on PATH but silently broken (e.g. Ubuntu's
`mupdf-tools` is built without LCMS, so `mutool` warns
"ICC support is not available" and emits zero bytes for
ICC-profiled PDFs).
3. Binary works but emits a flat-path SVG that GridDetector
can't parse (mutool's format: `<path id="font_X_Y">`
directly in `<defs>`, no `<use>` references). The grid
detector requires the `<g id="glyph-N-M">` + `<use>` form
produced by pdftocairo / pdf2svg.
The result is memoized per-renderer for the process lifetime — the binary's capabilities don't change mid-run.
When no fixture PDF is available (e.g. installed gem without
spec assets), degrades to available? — we can't smoke-test
without input, so we trust the binary's presence on PATH.
    # File 'lib/ucode/glyphs/page_renderer.rb', line 105
    def works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE)
      if !available?
        false
      elsif !File.exist?(fixture_pdf.to_s)
        true # no fixture to verify against; trust PATH
      else
        smoke_render_ok?(fixture_pdf)
      end
    end
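The third failure mode above is a format mismatch. A check along the following lines could distinguish the two SVG shapes described; grid_detector_compatible? is a hypothetical helper, and the sample SVG strings are modelled on the description in this page, not captured from real renderer output. The body of smoke_render_ok? is not shown here, so the actual check may differ.

```ruby
# Hypothetical check: true when the SVG uses the <g id="glyph-N-M"> + <use>
# form, false for the flat <path id="font_X_Y"> form.
def grid_detector_compatible?(svg_text)
  has_glyph_groups = svg_text.match?(/<g\s[^>]*id="glyph-\d+-\d+"/)
  has_use_refs     = svg_text.match?(/<use\s/)
  has_glyph_groups && has_use_refs
end

grouped_style = <<~SVG
  <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
    <defs>
      <g id="glyph-0-1"><path d="M0 0L1 1"/></g>
    </defs>
    <use xlink:href="#glyph-0-1" x="10" y="20"/>
  </svg>
SVG

flat_style = <<~SVG
  <svg xmlns="http://www.w3.org/2000/svg">
    <defs>
      <path id="font_1_2" d="M0 0L1 1"/>
    </defs>
  </svg>
SVG

grouped_ok = grid_detector_compatible?(grouped_style)
flat_ok    = grid_detector_compatible?(flat_style)
```

A regex test keeps the sketch dependency-free; a production check would more likely parse the SVG as XML.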