Class: Ucode::Glyphs::PageRenderer
- Inherits: Object
  - Object
  - Ucode::Glyphs::PageRenderer
- Defined in: lib/ucode/glyphs/page_renderer.rb
Overview
Strategy interface for PDF-page-to-SVG rendering.
Subclasses implement `renderer_name`, `binary_name`, and `build_command`. The base class handles the availability check, command execution, error handling, and the renderer registry.
OCP: a new renderer is a new subclass file + one entry in `KNOWN_RENDERERS`. The base class and existing renderers are not modified.
**Vector-only requirement**: every renderer here must emit SVG `<path>` elements (vector data) for the Code Charts PDFs, not raster images. Callers verify this via `path_count` on the output.
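As a rough illustration of the subclass contract, the sketch below defines a hypothetical renderer against a stripped-down stand-in for the base class. `SketchBase` and `ExampleCairoRenderer` are invented names, and the real subclasses (for example `PdftocairoRenderer`) may build their commands differently. The `pdftocairo` flags `-svg`, `-f`, and `-l` are real; treating `page_num` as 1-based is an assumption.

```ruby
require "pathname"

# Stand-in for the real base class, reduced to the three hooks a subclass
# must override. This is NOT the actual Ucode::Glyphs::PageRenderer.
class SketchBase
  def self.renderer_name
    raise NotImplementedError
  end

  def self.binary_name
    raise NotImplementedError
  end

  def self.build_command(_pdf_path, _page_num, _out_path)
    raise NotImplementedError
  end
end

# Hypothetical renderer: only the three hooks are defined here.
class ExampleCairoRenderer < SketchBase
  def self.renderer_name
    :example_cairo
  end

  def self.binary_name
    "pdftocairo"
  end

  # One argv element per argument; nothing is passed through a shell.
  # -f / -l select the first and last page, so both get the same number.
  def self.build_command(pdf_path, page_num, out_path)
    page = page_num.to_s
    [binary_name, "-svg", "-f", page, "-l", page, pdf_path.to_s, out_path.to_s]
  end
end

cmd = ExampleCairoRenderer.build_command(
  Pathname.new("charts/U0000.pdf"), 3, Pathname.new("out/page3.svg")
)
```

In the real codebase the only other step would be adding the new class to `KNOWN_RENDERERS`, as described above.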
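The `path_count` check itself is not shown on this page. A minimal sketch of what such a check might look like, assuming it simply counts `<path>` elements in the rendered SVG text, is given below. `count_svg_paths` is a hypothetical helper and both SVG strings are hand-written samples.

```ruby
# Hypothetical helper: counts <path> elements in an SVG string.
# The real `path_count` may be implemented differently.
def count_svg_paths(svg_text)
  svg_text.scan(/<path\b/).size
end

# Hand-written samples: one vector SVG with two paths, one raster-only SVG.
vector_svg = '<svg><defs><g id="glyph-0-1"><path d="M0 0L1 1"/></g></defs>' \
             '<path d="M2 2L3 3"/></svg>'
raster_svg = '<svg><image href="page.png"/></svg>'

vector_paths = count_svg_paths(vector_svg)
raster_paths = count_svg_paths(raster_svg)
```

A caller could then reject any output whose count is zero, since that would indicate a raster-only page.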
Direct Known Subclasses
DvisvgmRenderer, MutoolRenderer, Pdf2svgRenderer, PdftocairoRenderer
Constant Summary
- OUTPUT_FORMAT =
:svg
- DEFAULT_SMOKE_FIXTURE =
Fixture used by `works?` to smoke-test renderers. Resolved lazily so missing-fixture environments (installed gem without spec assets) don't fail at load time.
File.expand_path("../../../spec/fixtures/pdfs/basic_latin.pdf", __dir__)
Class Method Summary
- .all ⇒ Array<Class>
  Every known concrete renderer.
- .available ⇒ Array<Class>
  Renderers whose binary is installed.
- .available? ⇒ Boolean
  True if the binary is on PATH.
- .binary_name ⇒ String, Symbol
  The binary looked up on PATH.
- .build_command(pdf_path, page_num, out_path) ⇒ Array<String>
  Build the argv for the renderer.
- .default ⇒ Class?
  The first working renderer; falls back to the first available renderer if none have been smoke-tested yet (preserves eager-init paths).
- .find(name) ⇒ Class?
- .output_format ⇒ Symbol
  Always :svg for now; future formats (png, etc.) would warrant a separate renderer family.
- .render(pdf_path, page_num, out_path) ⇒ Symbol
  Render one page of `pdf_path` to `out_path` as SVG.
- .renderer_name ⇒ Symbol
  Short identifier (e.g. :mutool).
- .reset_working_cache! ⇒ Object
  Clear the cached `working` list.
- .working ⇒ Array<Class>
  Renderers that actually produce SVG in the format `GridDetector` consumes (smoke-tested once per process via `works?`, then cached).
- .works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE) ⇒ Boolean
  Smoke-test the binary by actually rendering one page of the fixture PDF AND verifying the output format is consumable by the downstream `GridDetector` / `CellExtractor` pipeline.
Class Method Details
.all ⇒ Array<Class>
Returns every known concrete renderer.
# File 'lib/ucode/glyphs/page_renderer.rb', line 154
def all
  @all ||= KNOWN_RENDERERS.map { |n| Ucode::Glyphs.const_get(n) }.freeze
end
.available ⇒ Array<Class>
Returns renderers whose binary is installed.
# File 'lib/ucode/glyphs/page_renderer.rb', line 159
def available
  all.select(&:available?)
end
.available? ⇒ Boolean
Returns true if the binary is on PATH.
# File 'lib/ucode/glyphs/page_renderer.rb', line 74
def available?
  system("which", binary_name.to_s, out: "/dev/null", err: "/dev/null")
end
.binary_name ⇒ String, Symbol
Returns the binary looked up on PATH.
# File 'lib/ucode/glyphs/page_renderer.rb', line 53
def binary_name
  raise NotImplementedError
end
.build_command(pdf_path, page_num, out_path) ⇒ Array<String>
Build the argv for the renderer. Subclasses return an Array suitable for `Open3.capture2e` (no shell interpolation).
# File 'lib/ucode/glyphs/page_renderer.rb', line 69
def build_command(pdf_path, page_num, out_path)
  raise NotImplementedError
end
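To show why an argv Array matters, the following sketch passes a deliberately awkward argument through `Open3.capture2e`. It spawns the current Ruby interpreter (via `RbConfig.ruby`) as a stand-in for a renderer binary; the file name is made up for the demonstration.

```ruby
require "open3"
require "rbconfig"

# A made-up "path" containing a space, a semicolon, and a dollar sign.
tricky = 'my charts; $HOME.pdf'

# Each Array element reaches the child process as exactly one argument,
# so the shell metacharacters above are never interpreted.
argv = [RbConfig.ruby, "-e", "print ARGV.length, ':', ARGV.first", tricky]
output, status = Open3.capture2e(*argv)
```

Because the command is splatted as separate arguments, the child sees a single argument equal to `tricky`, with no word splitting or variable expansion.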
.default ⇒ Class?
Returns the first working renderer; falls back to the first available renderer if none have been smoke-tested yet (preserves eager-init paths). Returns nil if nothing is installed.
# File 'lib/ucode/glyphs/page_renderer.rb', line 187
def default
  working.first || available.first
end
.find(name) ⇒ Class?
# File 'lib/ucode/glyphs/page_renderer.rb', line 180
def find(name)
  all.find { |r| r.renderer_name == name.to_sym }
end
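The lookup can be sketched in isolation as follows. `SketchGlyphs`, `AlphaRenderer`, and `BetaRenderer` are hypothetical stand-ins, and storing the registry entries as Symbols is an assumption; only the `const_get` resolution and the `to_sym` comparison mirror the code above.

```ruby
# Stand-in namespace with a tiny registry; not the real Ucode::Glyphs.
module SketchGlyphs
  KNOWN_RENDERERS = %i[AlphaRenderer BetaRenderer].freeze

  class AlphaRenderer
    def self.renderer_name
      :alpha
    end
  end

  class BetaRenderer
    def self.renderer_name
      :beta
    end
  end

  # Resolve constant names to classes once, then reuse the frozen list.
  def self.all
    @all ||= KNOWN_RENDERERS.map { |n| const_get(n) }.freeze
  end

  # Accepts a String or a Symbol; returns nil for an unknown name.
  def self.find(name)
    all.find { |r| r.renderer_name == name.to_sym }
  end
end

by_string = SketchGlyphs.find("beta")
by_symbol = SketchGlyphs.find(:alpha)
unknown   = SketchGlyphs.find(:gamma)
```

Since the name is normalized with `to_sym`, a caller can pass a value read from a CLI option or config file without converting it first.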
.output_format ⇒ Symbol
Always returns :svg for now; future formats (png, etc.) would warrant a separate renderer family.
# File 'lib/ucode/glyphs/page_renderer.rb', line 59
def output_format
  OUTPUT_FORMAT
end
.render(pdf_path, page_num, out_path) ⇒ Symbol
Render one page of `pdf_path` to `out_path` as SVG.
# File 'lib/ucode/glyphs/page_renderer.rb', line 122
def render(pdf_path, page_num, out_path)
  unless available?
    raise PdfRenderError.new(
      "binary '#{binary_name}' not available on PATH",
      context: { renderer: name, binary: binary_name },
    )
  end

  out = Pathname.new(out_path)
  out.dirname.mkpath
  cmd = build_command(Pathname.new(pdf_path), page_num, out)
  output, status = Open3.capture2e(*cmd)

  unless status.success? && out.exist? && out.size.positive?
    raise PdfRenderError.new(
      "render failed for page #{page_num} of #{pdf_path} via '#{binary_name}'",
      context: {
        renderer: name,
        binary: binary_name,
        exit_status: status.exitstatus,
        output: output,
      },
    )
  end

  :ok
end
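The three-part success condition in `render` is worth a closer look: a zero exit status alone is not trusted. The sketch below isolates that condition, using the current Ruby interpreter as a stand-in for a renderer that exits 0 but writes an empty file. `render_succeeded?` is a hypothetical helper name, not part of the class.

```ruby
require "open3"
require "pathname"
require "rbconfig"
require "tmpdir"

# Same condition as in `render`: exit status, file presence, and a
# non-empty file must all hold.
def render_succeeded?(status, out)
  status.success? && out.exist? && out.size.positive?
end

results = Dir.mktmpdir do |dir|
  out = Pathname.new(dir).join("nested", "page.svg")
  out.dirname.mkpath # create parent directories, as `render` does

  # Stand-in for a silently broken binary: exits 0, writes zero bytes.
  _log, status = Open3.capture2e(RbConfig.ruby, "-e", "File.write(ARGV[0], '')", out.to_s)
  empty_ok = render_succeeded?(status, out)

  # Stand-in for a healthy binary: exits 0, writes some content.
  _log, status = Open3.capture2e(RbConfig.ruby, "-e", "File.write(ARGV[0], '<svg/>')", out.to_s)
  full_ok = render_succeeded?(status, out)

  [empty_ok, full_ok]
end
```

Here `results` is `[false, true]`: the empty file is rejected even though the process reported success.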
.renderer_name ⇒ Symbol
Returns a short identifier (e.g. :mutool).
# File 'lib/ucode/glyphs/page_renderer.rb', line 48
def renderer_name
  raise NotImplementedError
end
.reset_working_cache! ⇒ Object
Clear the cached `working` list. Useful when the environment changes (e.g. a binary is installed mid-process) or in tests.
# File 'lib/ucode/glyphs/page_renderer.rb', line 174
def reset_working_cache!
  @working = nil
end
.working ⇒ Array<Class>
Returns renderers that actually produce SVG in the format `GridDetector` consumes (smoke-tested once per process via `works?`, then cached). Subset of `available`.
# File 'lib/ucode/glyphs/page_renderer.rb', line 166
def working
  return @working if @working

  @working = all.select(&:works?).freeze
end
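How the cache and `reset_working_cache!` interact can be sketched with a stand-in class. `SketchRegistry`, `probe`, and `probe_count` are hypothetical; `probe` takes the place of the real smoke tests and only counts how often it runs.

```ruby
# Stand-in registry showing the memoize-then-reset pattern.
class SketchRegistry
  class << self
    attr_reader :probe_count

    # Placeholder for the expensive smoke tests; counts its invocations.
    def probe
      @probe_count = (@probe_count || 0) + 1
      %i[alpha beta]
    end

    # Computed once, then served from the instance variable.
    def working
      return @working if @working

      @working = probe.freeze
    end

    # Drops the cached list so the next call to `working` probes again.
    def reset_working_cache!
      @working = nil
    end
  end
end

SketchRegistry.working
SketchRegistry.working
count_before_reset = SketchRegistry.probe_count

SketchRegistry.reset_working_cache!
SketchRegistry.working
count_after_reset = SketchRegistry.probe_count
```

Note that an empty Array is truthy in Ruby, so with this pattern an empty result is cached as well until the cache is reset.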
.works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE) ⇒ Boolean
Smoke-test the binary by actually rendering one page of the fixture PDF AND verifying the output format is consumable by the downstream `GridDetector` / `CellExtractor` pipeline.
Three things can make a renderer unusable for this codebase:
1. Binary not on PATH (`available?` catches this).
2. Binary on PATH but silently broken (e.g. Ubuntu's
`mupdf-tools` is built without LCMS, so `mutool` warns
"ICC support is not available" and emits zero bytes for
ICC-profiled PDFs).
3. Binary works but emits a flat-path SVG that GridDetector
can't parse (mutool's format: `<path id="font_X_Y">`
directly in `<defs>`, no `<use>` references). The grid
detector requires the `<g id="glyph-N-M">` + `<use>` form
produced by pdftocairo / pdf2svg.
The result is memoized per renderer for the process lifetime, since the binary's capabilities don't change mid-run.
When no fixture PDF is available (e.g. installed gem without spec assets), this degrades to `available?`: we can't smoke-test without input, so we trust the binary's presence on PATH.
# File 'lib/ucode/glyphs/page_renderer.rb', line 105
def works?(fixture_pdf: DEFAULT_SMOKE_FIXTURE)
  if !available?
    false
  elsif !File.exist?(fixture_pdf.to_s)
    true # no fixture to verify against; trust PATH
  else
    smoke_render_ok?(fixture_pdf)
  end
end
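The body of `smoke_render_ok?` is not shown on this page. As a rough sketch of the format check it might perform for failure modes 2 and 3, assuming a simple text match is enough, the helper below looks for the `<g id="glyph-N-M">` + `<use>` form. `grid_compatible_svg?` and both SVG strings are hypothetical, hand-written for illustration.

```ruby
# Patterns for the structure the grid detector is described as requiring.
GLYPH_GROUP = /<g\s+id="glyph-\d+-\d+"/
USE_REF     = /<use\b/

# Hypothetical check; the real `smoke_render_ok?` may differ.
def grid_compatible_svg?(svg_text)
  return false if svg_text.empty? # failure mode 2: zero-byte output

  # Failure mode 3: flat paths without glyph groups and <use> references.
  svg_text.match?(GLYPH_GROUP) && svg_text.match?(USE_REF)
end

# Hand-written samples imitating the two output styles described above.
cairo_style  = '<svg><defs><g id="glyph-0-1"><path d="M0 0"/></g></defs>' \
               '<use xlink:href="#glyph-0-1" x="10" y="20"/></svg>'
mutool_style = '<svg><defs><path id="font_1_2" d="M0 0"/></defs></svg>'

cairo_ok  = grid_compatible_svg?(cairo_style)
mutool_ok = grid_compatible_svg?(mutool_style)
empty_ok  = grid_compatible_svg?("")
```

A production check would more likely parse the SVG than pattern-match it; the regular expressions here only keep the sketch short.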