Module: Rpdfium::Table::Debugger

Defined in:
lib/rpdfium/table/debugger.rb

Overview

Generates a debug visualization: the page rendered to PNG with the detected edges, edge intersections, and table bounding boxes overlaid. Comparable to pdfplumber's Page.to_image().debug_tablefinder().

Implemented in pure Ruby: rasterizes the page via render(), then draws over the bitmap by manipulating the RGBA bytes, and finally saves to PNG.
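As an illustration of what "manipulating the RGBA bytes" can look like, here is a minimal sketch of source-over alpha blending on a raw byte string. The helper name blend_pixel! is hypothetical and is not part of Rpdfium; the sketch also assumes a tightly packed buffer (4 bytes per pixel, no row padding) and an opaque result. The Canvas class used by the debugger may work differently.

```ruby
# Hypothetical helper (not Rpdfium's API): blends one RGBA color onto the
# pixel at (x, y) of a tightly packed RGBA byte string, in place.
def blend_pixel!(bytes, width, x, y, color)
  r, g, b, a = color
  offset = (y * width + x) * 4          # 4 bytes per pixel, no row padding assumed
  alpha  = a / 255.0
  [r, g, b].each_with_index do |channel, i|
    dst = bytes.getbyte(offset + i)
    # Source-over: weight the overlay by alpha, the existing pixel by (1 - alpha).
    bytes.setbyte(offset + i, (channel * alpha + dst * (1.0 - alpha)).round)
  end
  bytes.setbyte(offset + 3, 255)        # keep the result opaque
  bytes
end

# A 2x1 white bitmap; blend the debugger's RED constant onto the second pixel.
width  = 2
height = 1
bytes  = ([255, 255, 255, 255] * (width * height)).pack("C*")
blend_pixel!(bytes, width, 1, 0, [255, 0, 0, 200])
pixel = bytes.byteslice(4, 4).unpack("C*")
```

With an alpha of 200 the overlay dominates but the underlying page still shows through, which is why the constants below use alpha values under 255.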

Constant Summary

RED =
[255, 0, 0, 200].freeze
GREEN =
[0, 200, 0, 200].freeze
BLUE =
[80, 80, 255, 120].freeze

Class Method Summary

Class Method Details

.visualize(page, output_path, scale: 2.0, **table_opts) ⇒ Object



# File 'lib/rpdfium/table/debugger.rb', line 18

def visualize(page, output_path, scale: 2.0, **table_opts)
  extractor = Extractor.new(page, **table_opts)
  edges = extractor.edges
  intersections = extractor.intersections
  tables = extractor.tables

  w, h, bytes, _stride = page.render(scale: scale, output: :rgba)
  canvas = Canvas.new(w, h, bytes)

  # Draw edges in red. Each edge has an orientation plus x0/x1/top/bottom;
  # a horizontal edge has top == bottom, a vertical one has x0 == x1.
  edges.each do |e|
    canvas.line((e[:x0] * scale).to_i, (e[:top]    * scale).to_i,
                 (e[:x1] * scale).to_i, (e[:bottom] * scale).to_i, RED)
  end

  # Draw intersections as 4px green dots. `intersections` is a Hash keyed by [x, y].
  intersections.each_key do |(x, y)|
    canvas.dot((x * scale).to_i, (y * scale).to_i, GREEN, 4)
  end

  # Fill each table's bbox with translucent blue. Table#bbox is [x0, top, x1, bottom].
  tables.each do |t|
    x0, top, x1, bottom = t.bbox
    canvas.rect_fill((x0 * scale).to_i, (top    * scale).to_i,
                      (x1 * scale).to_i, (bottom * scale).to_i, BLUE)
  end

  # Written with a stride of w * 4; the stride returned by render is not used.
  Rpdfium::IO::PNG.write(output_path, w, h, canvas.bytes, stride: w * 4)
  output_path
end
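
The listing above converts PDF-space coordinates to pixel coordinates by multiplying by scale and truncating with to_i. A minimal sketch of that mapping for a single edge follows; the helper name edge_to_pixels and the sample edge values are made up for illustration and are not part of Rpdfium.

```ruby
# Hypothetical helper (not Rpdfium's API): maps an edge hash in PDF units to
# the pixel endpoints [x0, y0, x1, y1], the same way visualize does.
def edge_to_pixels(edge, scale)
  [
    (edge[:x0]     * scale).to_i,
    (edge[:top]    * scale).to_i,
    (edge[:x1]     * scale).to_i,
    (edge[:bottom] * scale).to_i
  ]
end

# Sample horizontal edge (top == bottom); the values are illustrative only.
edge = { orientation: :h, x0: 72.0, x1: 540.25, top: 100.5, bottom: 100.5 }
pixels = edge_to_pixels(edge, 2.0)
# pixels => [144, 201, 1080, 201]
```

Because to_i truncates rather than rounds, 1080.5 becomes 1080, so overlays can land up to one pixel short of the true position at non-integer products.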