Module: Kreuzberg::FormatMetadata

Overview

Format-specific metadata (discriminated union).

Only one format type can exist per extraction result, so the metadata is a single typed value selected by its `format_type` key rather than a set of nested optionals, one per format.

Class Method Summary

Class Method Details

.from_hash(hash) ⇒ Object



# File 'lib/kreuzberg/native.rb', line 2539

def self.from_hash(hash)
  discriminator = hash[:format_type] || hash["format_type"]
  case discriminator
  when "pdf" then FormatMetadataPdf.from_hash(hash)
  when "docx" then FormatMetadataDocx.from_hash(hash)
  when "excel" then FormatMetadataExcel.from_hash(hash)
  when "email" then FormatMetadataEmail.from_hash(hash)
  when "pptx" then FormatMetadataPptx.from_hash(hash)
  when "archive" then FormatMetadataArchive.from_hash(hash)
  when "image" then FormatMetadataImage.from_hash(hash)
  when "xml" then FormatMetadataXml.from_hash(hash)
  when "text" then FormatMetadataText.from_hash(hash)
  when "html" then FormatMetadataHtml.from_hash(hash)
  when "ocr" then FormatMetadataOcr.from_hash(hash)
  when "csv" then FormatMetadataCsv.from_hash(hash)
  when "bibtex" then FormatMetadataBibtex.from_hash(hash)
  when "citation" then FormatMetadataCitation.from_hash(hash)
  when "fiction_book" then FormatMetadataFictionBook.from_hash(hash)
  when "dbf" then FormatMetadataDbf.from_hash(hash)
  when "jats" then FormatMetadataJats.from_hash(hash)
  when "epub" then FormatMetadataEpub.from_hash(hash)
  when "pst" then FormatMetadataPst.from_hash(hash)
  when "code" then FormatMetadataCode.from_hash(hash)
  else raise "Unknown discriminator: #{discriminator}"
  end
end
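
The dispatch above can be tried in isolation with a reduced sketch. The `Sketch` module, the `PdfMeta` and `CsvMeta` structs, and the `page_count` and `row_count` fields are hypothetical stand-ins invented for illustration; they are not the library's variant classes or fields. The sketch only mirrors the lookup and `case` logic shown in the listing: the discriminator may be stored under a symbol or a string key, but its value has to be a string for a branch to match.

```ruby
# Hypothetical stand-ins for two variant classes. The real ones are defined
# in lib/kreuzberg/native.rb and are not reproduced here.
module Sketch
  PdfMeta = Struct.new(:page_count) do
    def self.from_hash(hash)
      new(hash[:page_count] || hash["page_count"])
    end
  end

  CsvMeta = Struct.new(:row_count) do
    def self.from_hash(hash)
      new(hash[:row_count] || hash["row_count"])
    end
  end

  # Same dispatch shape as FormatMetadata.from_hash, reduced to two variants.
  def self.from_hash(hash)
    discriminator = hash[:format_type] || hash["format_type"]
    case discriminator
    when "pdf" then PdfMeta.from_hash(hash)
    when "csv" then CsvMeta.from_hash(hash)
    else raise "Unknown discriminator: #{discriminator}"
    end
  end
end

# String keys and symbol keys are both accepted for the lookup.
pdf = Sketch.from_hash({ "format_type" => "pdf", "page_count" => 3 })
csv = Sketch.from_hash({ format_type: "csv", row_count: 10 })

# A symbol *value* (:pdf) matches none of the string branches, so the
# else branch raises a RuntimeError.
error = begin
  Sketch.from_hash({ format_type: :pdf })
rescue RuntimeError => e
  e
end
```

Because the fallthrough uses a bare `raise`, an unrecognized or missing discriminator surfaces as a `RuntimeError`; callers that parse untrusted hashes may want to rescue it explicitly.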