Module: Kreuzberg
Defined in:
lib/kreuzberg.rb,
lib/kreuzberg/cli.rb,
lib/kreuzberg/types.rb,
lib/kreuzberg/config.rb,
lib/kreuzberg/errors.rb,
lib/kreuzberg/result.rb,
lib/kreuzberg/version.rb,
lib/kreuzberg/api_proxy.rb,
lib/kreuzberg/cache_api.rb,
lib/kreuzberg/cli_proxy.rb,
lib/kreuzberg/mcp_proxy.rb,
lib/kreuzberg/djot_content.rb,
lib/kreuzberg/error_context.rb,
lib/kreuzberg/extraction_api.rb,
lib/kreuzberg/setup_lib_path.rb,
lib/kreuzberg/document_structure.rb,
lib/kreuzberg/validator_protocol.rb,
lib/kreuzberg/ocr_backend_protocol.rb,
lib/kreuzberg/post_processor_protocol.rb
Overview
Kreuzberg is a Ruby binding for the Rust core library providing document extraction, text extraction, and OCR capabilities.
Defined Under Namespace
Modules: APIProxy, CLI, CLIProxy, CacheAPI, Config, ContentLayer, ElementType, ErrorContext, Errors, ExtractionAPI, KeywordAlgorithm, MCPProxy, OcrBackendProtocol, OcrElementLevel, OutputFormat, PageUnitType, PdfAnnotationType, PostProcessorProtocol, RelationshipKind, ResultFormat, SetupLibPath, UriKind, ValidatorProtocol
Classes: ArchiveEntry, BoundingBox, DocumentAnnotation, DocumentBoundingBox, DocumentNode, DocumentStructure, Element, ElementMetadata, ExtractedKeyword, HeaderMetadata, HtmlMetadata, ImageMetadata, Keyword, LinkMetadata, PdfAnnotation, PdfAnnotationBoundingBox, ProcessingWarning, Result, StructuredData, Table, Uri
Constant Summary
- ExtractionConfig = Config::Extraction
- PageConfig = Config::PageConfig
- ERROR_CODE_SUCCESS = 0
- ERROR_CODE_GENERIC = 1
- ERROR_CODE_PANIC = 2
- ERROR_CODE_INVALID_ARGUMENT = 3
- ERROR_CODE_IO = 4
- ERROR_CODE_PARSING = 5
- ERROR_CODE_OCR = 6
- ERROR_CODE_MISSING_DEPENDENCY = 7
- ERROR_CODE_EMBEDDING = 8
- VERSION = '4.9.2'
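The ERROR_CODE_* constants listed above are plain integers, so a reverse lookup from code to name can be handy when logging. The following is a minimal sketch, not part of the library: `ErrorCodeSketch`, `NAMES`, and `name_for` are hypothetical names, and the module only mirrors the values listed here so that it runs without the gem installed.

```ruby
# Hypothetical stand-in that copies the constant values listed above.
# With the gem loaded, the same values should be reachable as
# Kreuzberg::ERROR_CODE_SUCCESS and so on (an assumption based on this page).
module ErrorCodeSketch
  ERROR_CODE_SUCCESS            = 0
  ERROR_CODE_GENERIC            = 1
  ERROR_CODE_PANIC              = 2
  ERROR_CODE_INVALID_ARGUMENT   = 3
  ERROR_CODE_IO                 = 4
  ERROR_CODE_PARSING            = 5
  ERROR_CODE_OCR                = 6
  ERROR_CODE_MISSING_DEPENDENCY = 7
  ERROR_CODE_EMBEDDING          = 8

  # Reverse lookup table: integer code => constant name (as a Symbol).
  NAMES = constants.grep(/\AERROR_CODE_/).to_h { |name| [const_get(name), name] }.freeze

  # Returns the constant name for a code, or :UNKNOWN for unlisted codes.
  def self.name_for(code)
    NAMES.fetch(code, :UNKNOWN)
  end
end

puts ErrorCodeSketch.name_for(6)   # prints ERROR_CODE_OCR
puts ErrorCodeSketch.name_for(99)  # prints UNKNOWN
```

Building the table from `constants` rather than writing it by hand keeps the lookup in sync if more codes are added to the module.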
Class Method Summary
- .clear_post_processors ⇒ Object
- .clear_validators ⇒ Object
- .detect_mime_type ⇒ Object
- .detect_mime_type_from_path ⇒ Object
- .embed ⇒ Object (also: native_embed)
- .embed_sync ⇒ Object (also: native_embed_sync)
- .get_extensions_for_mime ⇒ Object
- .list_ocr_backends ⇒ Object
- .list_post_processors ⇒ Object
- .list_validators ⇒ Object
- .register_ocr_backend ⇒ Object
- .register_post_processor ⇒ Object
- .register_validator ⇒ Object
- .unregister_ocr_backend ⇒ Object
- .unregister_post_processor ⇒ Object
- .unregister_validator ⇒ Object
- .validate_mime_type ⇒ Object