Module: Pdfsink
Defined in:
- lib/pdfsink.rb
- lib/pdfsink/cli.rb
- lib/pdfsink/page.rb
- lib/pdfsink/error.rb
- lib/pdfsink/railtie.rb
- lib/pdfsink/version.rb
- lib/pdfsink/document.rb
- lib/pdfsink/table_strategy.rb
Overview
Pdfsink wraps the pdfsink-rs CLI, a fast pure-Rust PDF extraction tool, exposing text, word, object, table, link, and search extraction to Ruby.
Defined Under Namespace
Modules: Cli, TableStrategy
Classes: BinaryNotFoundError, CommandError, Configuration, Document, Error, Page, ParseError, Railtie
Constant Summary
- VERSION = "0.1.0"
- PDFSINK_RS_VERSION = "0.2.8"
  Version of the pdfsink-rs crate this gem builds and wraps.
Class Method Summary
- .configuration ⇒ Configuration
- .configure {|configuration| ... } ⇒ Object
  Yields the configuration object for modification.
- .extract_text(path, page: 1) ⇒ String
  Extract the text of a single page in one call.
- .open(path) ⇒ Document
  Open a PDF document.
- .version ⇒ String
  The version of the underlying pdfsink-rs binary the gem was built with.
Class Method Details
.configuration ⇒ Configuration
# File 'lib/pdfsink.rb', line 41
def configuration
  @configuration ||= Configuration.new
end
.configure {|configuration| ... } ⇒ Object
Yields the configuration object for modification.
# File 'lib/pdfsink.rb', line 51
def configure
  yield(configuration)
end
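The two methods above work as a pair: `.configuration` memoizes a single `Configuration` instance, and `.configure` yields that same instance to a block. The following is a minimal, self-contained sketch of that pattern using a stand-in module. `PdfsinkSketch` and the `binary_path` setting are hypothetical names chosen for illustration, since the attributes of the real `Configuration` class are not listed in this section.

```ruby
# Stand-in for illustration only; this is not the gem's source.
module PdfsinkSketch
  class Configuration
    # Hypothetical setting; the real Configuration's attributes are not shown here.
    attr_accessor :binary_path

    def initialize
      @binary_path = nil
    end
  end

  class << self
    # Memoized: the same Configuration instance is returned on every call.
    def configuration
      @configuration ||= Configuration.new
    end

    # Yields the shared Configuration so callers can modify it in place.
    def configure
      yield(configuration)
    end
  end
end

PdfsinkSketch.configure do |config|
  config.binary_path = "/usr/local/bin/pdfsink-rs" # hypothetical path
end

puts PdfsinkSketch.configuration.binary_path
```

Because the instance is memoized, anything set inside the block remains visible to later calls to `.configuration`.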
.extract_text(path, page: 1) ⇒ String
Extract the text of a single page in one call.
# File 'lib/pdfsink.rb', line 70
def extract_text(path, page: 1)
  Cli.text(File.(path), page)
end
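The calling convention of `.extract_text` can be sketched as follows: the path is positional, and `page` is a keyword argument that defaults to 1. This sketch is self-contained and does not invoke the real binary. `ExtractSketch` and its stubbed `Cli.text` are hypothetical stand-ins, and the path handling of the real method is not reproduced.

```ruby
# Stand-in for illustration only; this is not the gem's source.
module ExtractSketch
  # Stub in place of the real Cli module; it does not shell out to anything.
  module Cli
    def self.text(path, page)
      "text of #{File.basename(path)}, page #{page}"
    end
  end

  # Same shape as the listing: positional path, keyword page defaulting to 1.
  def self.extract_text(path, page: 1)
    Cli.text(path, page)
  end
end

puts ExtractSketch.extract_text("report.pdf")          # page defaults to 1
puts ExtractSketch.extract_text("report.pdf", page: 3) # explicit page
```

With the real gem, the equivalent calls would presumably be `Pdfsink.extract_text(path)` and `Pdfsink.extract_text(path, page: 3)`, per the signature documented above.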