Module: Pdfsink::Cli Private

Defined in:
lib/pdfsink/cli.rb

Overview

This module is part of a private API. You should avoid using this module if possible, as it may be removed or be changed in the future.

Low-level runner for the pdfsink-rs CLI binary.

This module is not intended for direct use – see the public API on Pdfsink, Document, and Page instead. Every method shells out to the binary, returning either its raw stdout (for text) or the parsed JSON it prints.
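
The shell-out pattern described above can be sketched roughly as follows. This is only an illustration, not the gem's code: the module name CliSketch, its Error class, and the explicit binary argument are hypothetical, and the Ruby interpreter stands in for pdfsink-rs so the snippet runs without the binary installed.

```ruby
require "json"
require "open3"
require "rbconfig"

# Hypothetical stand-in for the private runner; the real helpers may differ.
module CliSketch
  class Error < StandardError; end

  # Run `binary` with `args` (no shell involved) and return its raw stdout.
  def self.run(binary, *args)
    stdout, stderr, status = Open3.capture3(binary, *args)
    raise Error, "#{binary} failed: #{stderr.strip}" unless status.success?

    stdout
  end

  # Same as run, but parse stdout as JSON.
  def self.run_json(binary, *args)
    JSON.parse(run(binary, *args))
  end
end

# The Ruby interpreter acts as the executable here (an assumption made
# purely for demonstration).
ruby = RbConfig.ruby
text = CliSketch.run(ruby, "-e", "print 'hello'")
data = CliSketch.run_json(ruby, "-e", "print '{\"pages\": 2}'")
```

Passing the arguments as separate strings avoids shell interpolation, which matters when a path or search pattern contains spaces or quotes.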

Constant Summary

BINARY =

This constant is part of a private API. You should avoid using this constant if possible, as it may be removed or be changed in the future.

"pdfsink-rs"

Class Attribute Details

.binary ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Absolute path to the pdfsink-rs binary. Search order:

1. PDFSINK_BIN environment variable (explicit override)
2. lib/pdfsink/ inside the gem (where extconf.rb copies the build)
3. ext/pdfsink/bin/ (dev / cargo-install location)
4. The bare name, resolved against PATH at exec time

Returns:

  • (String)


# File 'lib/pdfsink/cli.rb', line 26

def binary
  @binary ||= find_binary
end
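
The search order listed above could be implemented along these lines. This is a sketch under assumptions: find_binary_sketch, its gem_root parameter, and the injectable env are hypothetical names, and the real find_binary may differ in details such as how it checks for executability.

```ruby
# Hypothetical reconstruction of the documented search order; not the
# gem's actual find_binary.
def find_binary_sketch(gem_root, env: ENV, name: "pdfsink-rs")
  # 1. Explicit override through the environment.
  override = env["PDFSINK_BIN"]
  return override if override && !override.empty?

  # 2. and 3. Known locations inside the gem, in priority order.
  candidates = [
    File.join(gem_root, "lib", "pdfsink", name),
    File.join(gem_root, "ext", "pdfsink", "bin", name)
  ]
  found = candidates.find { |path| File.file?(path) && File.executable?(path) }

  # 4. Fall back to the bare name and let PATH resolve it at exec time.
  found || name
end
```

Because the result is memoized in @binary, changing PDFSINK_BIN after the first call would have no effect in the same process.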

Class Method Details

.info(path) ⇒ Hash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Document-level metadata for every page: dimensions, rotation, bbox, and per-page object counts.

Parameters:

  • path (String)

Returns:

  • (Hash)


# File 'lib/pdfsink/cli.rb', line 48

def info(path)
  run_json("info", path)
end

.links(path, page) ⇒ Array<Hash>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Hyperlinks on a single page.

Returns:

  • (Array<Hash>)


# File 'lib/pdfsink/cli.rb', line 85

def links(path, page)
  run_json("links", path, page.to_s)
end

.objects(path, page) ⇒ Hash

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

All page objects (chars, lines, rects, curves, images, …) as a Hash.

Returns:

  • (Hash)


# File 'lib/pdfsink/cli.rb', line 78

def objects(path, page)
  run_json("objects", path, page.to_s)
end

.search(path, page, pattern) ⇒ Array<Hash>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Regex search matches for a single page.

Returns:

  • (Array<Hash>)


# File 'lib/pdfsink/cli.rb', line 71

def search(path, page, pattern)
  run_json("search", path, page.to_s, pattern)
end

.table(path, page, strategy) ⇒ Array<Array>?

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Extracted table for a single page, or nil if none was found.

Parameters:

  • strategy (String)

    one of “lines”, “lines_strict”, “text”, “explicit”

Returns:

  • (Array<Array>, nil)


# File 'lib/pdfsink/cli.rb', line 93

def table(path, page, strategy)
  run_json("table", path, page.to_s, strategy)
end
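
Since strategy is forwarded to the binary as a plain string, a caller may want to check it against the four documented values before shelling out. The guard below is a hypothetical caller-side helper; nothing in the source says the runner performs this validation itself.

```ruby
# The four strategy names documented for .table.
TABLE_STRATEGIES = %w[lines lines_strict text explicit].freeze

# Hypothetical guard: return the strategy as a String, or raise if it is
# not one of the documented values.
def validate_strategy(strategy)
  value = strategy.to_s
  return value if TABLE_STRATEGIES.include?(value)

  raise ArgumentError,
        "unknown table strategy #{value.inspect}; " \
        "expected one of: #{TABLE_STRATEGIES.join(', ')}"
end
```

Failing early with an ArgumentError keeps a typo in the strategy name from surfacing later as an opaque CLI error.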

.text(path, page) ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Extracted text for a single page.

Parameters:

  • path (String)
  • page (Integer)

    1-based page number

Returns:

  • (String)


# File 'lib/pdfsink/cli.rb', line 57

def text(path, page)
  run("text", path, page.to_s)
end

.version ⇒ String

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

The pdfsink-rs version string, e.g. “pdfsink-rs 0.2.8”.

The CLI has no version subcommand, so this reports the crate version the gem was built against.

Returns:

  • (String)


# File 'lib/pdfsink/cli.rb', line 39

def version
  Pdfsink::PDFSINK_RS_VERSION
end
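
A string in the documented form can be split into a name and a comparable version number. The snippet below is a sketch that uses the example value from the description above rather than calling the module; the minimum version "0.2.0" is an arbitrary value chosen for illustration.

```ruby
require "rubygems" # provides Gem::Version (normally loaded by default)

# Example value copied from the method description.
version_string = "pdfsink-rs 0.2.8"

# Split "name number" into its two parts.
name, number = version_string.split(" ", 2)

# Compare against a minimum version chosen only for illustration.
new_enough = Gem::Version.new(number) >= Gem::Version.new("0.2.0")
```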

.words(path, page) ⇒ Array<Hash>

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Words with positions for a single page.

Returns:

  • (Array<Hash>)


# File 'lib/pdfsink/cli.rb', line 64

def words(path, page)
  run_json("words", path, page.to_s)
end