Class: Uniword::Infrastructure::MimeParser
- Inherits: Object
  - Object
  - Uniword::Infrastructure::MimeParser
- Defined in: lib/uniword/infrastructure/mime_parser.rb
Overview
Parses MHTML (MIME HTML) files into an Mhtml::Document model.
The parser reads the MIME multipart structure, decodes content transfer encodings, creates typed MimePart objects, and populates an Mhtml::Document with them.
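To illustrate the "decodes content transfer encodings" step, here is a minimal sketch using only core Ruby. The helper name `decode_transfer_encoding` is hypothetical and is not part of Uniword; the parser's actual decoding code is not shown on this page and may differ.

```ruby
# Hypothetical helper, NOT part of Uniword: decodes a MIME part body
# according to the value of its Content-Transfer-Encoding header.
def decode_transfer_encoding(body, encoding)
  case encoding.to_s.strip.downcase
  when "base64"
    body.unpack1("m")   # Base64 decode (String#unpack1 with the "m" directive)
  when "quoted-printable"
    body.unpack1("M")   # quoted-printable decode ("M" directive)
  else
    body                # 7bit, 8bit, binary, or missing header: unchanged
  end
end

puts decode_transfer_encoding("SGVsbG8=\n", "base64")  # prints "Hello"
```

Both directives return binary (ASCII-8BIT) strings, so a caller would typically relabel text parts with the charset declared in the part's Content-Type header.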
Instance Method Summary
- #parse(path) ⇒ Mhtml::Document
  Parse MHTML file and return a populated Mhtml::Document.
- #parse_content(content) ⇒ Mhtml::Document
  Parse MHTML content string.
Instance Method Details
#parse(path) ⇒ Mhtml::Document
Parse MHTML file and return a populated Mhtml::Document.
    # File 'lib/uniword/infrastructure/mime_parser.rb', line 21

    def parse(path)
      raise ArgumentError, "Path cannot be nil" if path.nil?
      raise ArgumentError, "File not found: #{path}" unless File.exist?(path)

      content = File.binread(path).force_encoding("UTF-8")
      parse_content(content)
    end
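The read step combines `File.binread` with `force_encoding("UTF-8")`. The sketch below shows what that combination does to a string. The helper `read_as_utf8` and the temporary file are illustrative assumptions, not Uniword code.

```ruby
require "tempfile"

# Hypothetical helper mirroring the read step in #parse.
def read_as_utf8(path)
  raw = File.binread(path)      # bytes exactly as stored, tagged ASCII-8BIT
  raw.force_encoding("UTF-8")   # relabels the same bytes; nothing is transcoded
end

Tempfile.create("sample") do |file|
  file.binmode
  file.write("caf\xC3\xA9".b)   # the UTF-8 bytes for "café"
  file.flush

  text = read_as_utf8(file.path)
  puts text.encoding            # prints "UTF-8"
  puts text.valid_encoding?     # prints "true"
end
```

Because `force_encoding` only changes the label, a file whose bytes are not valid UTF-8 still loads without error; `String#valid_encoding?` is one way to detect that case.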
#parse_content(content) ⇒ Mhtml::Document
Parse MHTML content string.
    # File 'lib/uniword/infrastructure/mime_parser.rb', line 33

    def parse_content(content)
      @content = content
      @boundary = extract_boundary
      @raw_parts = split_parts

      document = Mhtml::Document.new
      document.boundary = @boundary

      @raw_parts.each do |part|
        mime_part = parse_mime_part(part)
        next unless mime_part

        document.html_part = mime_part if mime_part.is_a?(Mhtml::HtmlPart) && !document.html_part
        document.add_part(mime_part)
      end

      (document)
      document
    end
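The private helpers `extract_boundary`, `split_parts`, and `parse_mime_part` are not shown on this page. As a rough sketch of what boundary extraction and part splitting can look like for a multipart message, the following uses hypothetical stand-in methods (all names ending in `_sketch`, plus `split_headers_and_body` and `SAMPLE`, are assumptions) and should not be read as Uniword's implementation.

```ruby
# Hypothetical stand-ins; the real private helpers may work differently.

# Pull the boundary token out of the top-level Content-Type header.
def extract_boundary_sketch(content)
  match = content.match(/boundary="?([^";\r\n]+)"?/i)
  match && match[1]
end

# Split on delimiter lines ("--boundary" and the closing "--boundary--"),
# dropping the preamble that precedes the first delimiter.
def split_parts_sketch(content, boundary)
  delimiter = /^--#{Regexp.escape(boundary)}(?:--)?[ \t]*\r?\n?/
  content.split(delimiter).drop(1).map(&:strip).reject(&:empty?)
end

# Separate a part's header block from its body at the first blank line.
def split_headers_and_body(part)
  part.split(/\r?\n\r?\n/, 2)
end

SAMPLE = <<~MHTML
  MIME-Version: 1.0
  Content-Type: multipart/related; boundary="XYZ"

  --XYZ
  Content-Type: text/html

  <html></html>
  --XYZ
  Content-Type: image/png
  Content-Transfer-Encoding: base64

  aGk=
  --XYZ--
MHTML

boundary = extract_boundary_sketch(SAMPLE)
parts = split_parts_sketch(SAMPLE, boundary)
puts parts.length  # prints "2"
```

In `parse_content` itself, only the first part that is an `Mhtml::HtmlPart` is assigned to `document.html_part`; later HTML parts are still added through `add_part`.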