Ocrb
OCR
Installation
Install the gem and add it to the application's Gemfile by executing:
bundle add ocrb
If bundler is not being used to manage dependencies, install the gem by executing:
gem install ocrb
Usage
Require the gem and call Ocrb.run with an image path and prompt:
require "ocrb"
text = Ocrb.run("receipt.jpg", "Extract the text from this image.")
puts text
By default, Ocrb.run uses the ollama CLI with the glm-ocr:bf16 model:
Ocrb.run("receipt.jpg", "Summarize the line items.")
If you want to use an OpenAI-compatible API instead, pass the built-in extractor explicitly:
require "ocrb"
text = Ocrb.run(
  "receipt.jpg",
  "Recognize total amount.",
  extractor: Ocrb::Extractors::OpenAi.new(
    url: "http://127.0.0.1:1234/v1",
    model: "zai-org/glm-4.6v-flash",
    api_key: ENV.fetch("OPENAI_API_KEY", "asdf"),
    # json can be nil, true, or the schema for response_format.json_schema.schema
    json: {type: "object", properties: {amount: {type: "string"}}}
  )
)
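When a JSON schema is passed, the result is presumably most useful once parsed. The sketch below assumes that Ocrb.run returns the model's reply as a JSON string; this README does not state that, so check it against the gem before relying on it. The helper name parse_amount and the sample string are hypothetical and stand in for a real call.

```ruby
require "json"

# Hypothetical helper: pulls "amount" out of a JSON reply.
# Assumption: Ocrb.run returns the model output as a String.
def parse_amount(text)
  data = JSON.parse(text)
  # Models can return valid JSON that is not an object, so check the type.
  data.is_a?(Hash) ? data["amount"] : nil
rescue JSON::ParserError
  # The reply was not valid JSON at all.
  nil
end

# Stand-in for the value returned by Ocrb.run in the example above.
sample_reply = '{"amount": "12.50"}'
puts parse_amount(sample_reply).inspect
```

Returning nil on malformed output keeps the caller simple; raising may be a better fit if a missing amount should stop processing.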
You can also resize the image before OCR by passing a resizer:
require "ocrb"
text = Ocrb.run(
  "receipt.jpg",
  "Extract all visible text.",
  resizer: Ocrb::Resizers::Sips.new(resample_width: 1024)
)
Both extractor and resizer are duck-typed. Any object that responds to extract(image_path, prompt) can be passed as the extractor, and any object that responds to resize(image_path) can be passed as the resizer.
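As an illustration of the duck-typed interface, here is a minimal sketch of a custom extractor and resizer. The class names FixedTextExtractor and PassthroughResizer are hypothetical. The return values are assumptions as well: extract is assumed to return the recognized text, and resize the path of the image to run OCR on. The sketch calls the objects directly so that it runs without the gem or a model.

```ruby
# Hypothetical extractor: returns canned text instead of calling a model.
# Assumption: extract returns the recognized text as a String.
class FixedTextExtractor
  def initialize(text)
    @text = text
  end

  def extract(image_path, prompt)
    "#{@text} (image: #{File.basename(image_path)}, prompt: #{prompt})"
  end
end

# Hypothetical resizer: leaves the image untouched.
# Assumption: resize returns the path of the image to run OCR on.
class PassthroughResizer
  def resize(image_path)
    image_path
  end
end

extractor = FixedTextExtractor.new("TOTAL 12.50")
resizer = PassthroughResizer.new

# With the gem loaded, the objects would be passed as:
#   Ocrb.run("receipt.jpg", "Recognize total amount.",
#            extractor: extractor, resizer: resizer)
puts extractor.extract(resizer.resize("receipt.jpg"), "Recognize total amount.")
```

Stubs like these can be handy in tests, where calling a real model would be slow or non-deterministic.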
License
The gem is available as open source under the terms of the MIT License.