Module: Kreuzberg

Defined in:
lib/kreuzberg.rb,
lib/kreuzberg/cli.rb,
lib/kreuzberg/types.rb,
lib/kreuzberg/config.rb,
lib/kreuzberg/errors.rb,
lib/kreuzberg/result.rb,
lib/kreuzberg/version.rb,
lib/kreuzberg/api_proxy.rb,
lib/kreuzberg/cache_api.rb,
lib/kreuzberg/cli_proxy.rb,
lib/kreuzberg/mcp_proxy.rb,
lib/kreuzberg/djot_content.rb,
lib/kreuzberg/error_context.rb,
lib/kreuzberg/extraction_api.rb,
lib/kreuzberg/setup_lib_path.rb,
lib/kreuzberg/document_structure.rb,
lib/kreuzberg/validator_protocol.rb,
lib/kreuzberg/ocr_backend_protocol.rb,
lib/kreuzberg/post_processor_protocol.rb

Overview

Kreuzberg provides Ruby bindings for the Rust core library, which handles document extraction, text extraction, and OCR.

Defined Under Namespace

Modules: APIProxy, CLI, CLIProxy, CacheAPI, Config, ContentLayer, ElementType, ErrorContext, Errors, ExtractionAPI, KeywordAlgorithm, MCPProxy, OcrBackendProtocol, OcrElementLevel, OutputFormat, PageUnitType, PdfAnnotationType, PostProcessorProtocol, RelationshipKind, ResultFormat, SetupLibPath, UriKind, ValidatorProtocol

Classes: ArchiveEntry, BoundingBox, DocumentAnnotation, DocumentBoundingBox, DocumentNode, DocumentStructure, Element, ElementMetadata, ExtractedKeyword, HeaderMetadata, HtmlMetadata, ImageMetadata, Keyword, LinkMetadata, PdfAnnotation, PdfAnnotationBoundingBox, ProcessingWarning, Result, StructuredData, Table, Uri

Constant Summary

ExtractionConfig = Config::Extraction
PageConfig = Config::PageConfig
ERROR_CODE_SUCCESS = 0
ERROR_CODE_GENERIC = 1
ERROR_CODE_PANIC = 2
ERROR_CODE_INVALID_ARGUMENT = 3
ERROR_CODE_IO = 4
ERROR_CODE_PARSING = 5
ERROR_CODE_OCR = 6
ERROR_CODE_MISSING_DEPENDENCY = 7
ERROR_CODE_EMBEDDING = 8
VERSION = '4.9.2'
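
The numeric error codes are easier to read in logs when mapped back to names. The following is a minimal sketch of such a reverse lookup, not something the library is documented to provide. `KreuzbergErrorCodes`, `NAMES`, and `name_for` are hypothetical names; the constants are redefined locally (mirroring the values listed in the constant summary) so the snippet runs without the gem installed.

```ruby
# Stand-in module so the sketch runs without the gem installed.
# Names and values mirror the ERROR_CODE_* constants in the summary.
module KreuzbergErrorCodes
  ERROR_CODE_SUCCESS = 0
  ERROR_CODE_GENERIC = 1
  ERROR_CODE_PANIC = 2
  ERROR_CODE_INVALID_ARGUMENT = 3
  ERROR_CODE_IO = 4
  ERROR_CODE_PARSING = 5
  ERROR_CODE_OCR = 6
  ERROR_CODE_MISSING_DEPENDENCY = 7
  ERROR_CODE_EMBEDDING = 8

  # Hypothetical helper table (not part of Kreuzberg): maps each numeric
  # code to a short symbol derived from the constant name.
  NAMES = constants(false)
          .select { |c| c.to_s.start_with?('ERROR_CODE_') }
          .to_h { |c| [const_get(c), c.to_s.delete_prefix('ERROR_CODE_').downcase.to_sym] }
          .freeze

  # Returns the symbolic name for a code, or :unknown if it is not listed.
  def self.name_for(code)
    NAMES.fetch(code, :unknown)
  end
end

puts KreuzbergErrorCodes.name_for(6)
puts KreuzbergErrorCodes.name_for(42)
```

With the gem loaded, the same table could presumably be built from the `Kreuzberg` module's own constants instead of the local copies, assuming they are defined directly on the module as this page suggests.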

Class Method Summary

Class Method Details

.clear_post_processors ⇒ Object

.clear_validators ⇒ Object

.detect_mime_type ⇒ Object

.detect_mime_type_from_path ⇒ Object

.embed ⇒ Object
Also known as: native_embed

.embed_sync ⇒ Object
Also known as: native_embed_sync

.get_extensions_for_mime ⇒ Object

.list_ocr_backends ⇒ Object

.list_post_processors ⇒ Object

.list_validators ⇒ Object

.register_ocr_backend ⇒ Object

.register_post_processor ⇒ Object

.register_validator ⇒ Object

.unregister_ocr_backend ⇒ Object

.unregister_post_processor ⇒ Object

.unregister_validator ⇒ Object

.validate_mime_type ⇒ Object
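
The plugin-related methods follow a register / list / unregister / clear naming pattern for OCR backends, post-processors, and validators. This page gives no signatures, so the toy class below only illustrates that lifecycle in plain Ruby. `ToyRegistry` and its argument shapes are assumptions made for illustration and are not Kreuzberg's implementation or API.

```ruby
# Toy registry, NOT Kreuzberg's implementation. It only mirrors the
# register / list / unregister / clear naming pattern of the module methods.
class ToyRegistry
  def initialize
    @entries = {}
  end

  # Store a callable under a name (the argument shape is an assumption).
  def register(name, callable)
    raise ArgumentError, 'callable must respond to #call' unless callable.respond_to?(:call)

    @entries[name.to_s] = callable
    nil
  end

  # Remove a single entry; unknown names are ignored.
  def unregister(name)
    @entries.delete(name.to_s)
    nil
  end

  # Names of all registered entries, in insertion order.
  def list
    @entries.keys
  end

  # Remove every entry.
  def clear
    @entries.clear
    nil
  end
end

validators = ToyRegistry.new
validators.register('non_empty', ->(text) { !text.strip.empty? })
validators.register('short', ->(text) { text.length < 1000 })
p validators.list
validators.unregister('short')
p validators.list
validators.clear
p validators.list
```

Keeping entries in a hash keyed by name makes re-registering under the same name an overwrite rather than a duplicate; whether the real library behaves that way is not stated on this page.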