Class: Uniword::Diff::PackageDiffer
- Inherits: Object
  - Object
  - Uniword::Diff::PackageDiffer
- Defined in: lib/uniword/diff/package_differ.rb
Overview
Compares two DOCX files at the ZIP/XML/OPC structural level.
Detects differences in:
- ZIP entries (added/removed parts)
- ZIP entry metadata (compression, text/binary flag, timestamps)
- XML content (semantic equivalence via Canon, element structure)
- OPC validation (content types, relationships, required parts)
Unlike DocumentDiffer (which compares loaded DocumentRoot models), PackageDiffer works on raw DOCX ZIP contents, detecting what Word or other applications changed during repair.
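The added/removed/modified/unchanged split described above can be illustrated with a minimal sketch. It assumes each package has already been read into a hash of part name to raw content. `classify_parts` is a hypothetical helper for illustration only, not part of Uniword, and it compares raw strings rather than doing the semantic XML comparison that PackageDiffer performs.

```ruby
# Hypothetical helper (not Uniword's API): classify part names given two
# hashes that map part name => raw content.
def classify_parts(old_parts, new_parts)
  old_names = old_parts.keys
  new_names = new_parts.keys

  # Parts present on both sides are either modified or unchanged.
  common = old_names & new_names
  modified, unchanged = common.partition do |name|
    old_parts[name] != new_parts[name]
  end

  {
    added: (new_names - old_names).sort,
    removed: (old_names - new_names).sort,
    modified: modified.sort,
    unchanged: unchanged.sort,
  }
end

old_parts = {
  "word/document.xml" => "<w:document/>",
  "word/styles.xml" => "<w:styles/>",
  "docProps/app.xml" => "<Properties/>",
}
new_parts = {
  "word/document.xml" => "<w:document><w:body/></w:document>",
  "word/styles.xml" => "<w:styles/>",
  "word/webSettings.xml" => "<w:webSettings/>",
}

p classify_parts(old_parts, new_parts)
```

A byte-level comparison like this flags any re-serialization as a modification, which is presumably why the real class offers the `canon:` option for semantic comparison.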
Constant Summary
- REQUIRED_PARTS =
  Required parts for a valid OOXML DOCX package.
    %w[
      [Content_Types].xml
      _rels/.rels
      word/document.xml
    ].freeze
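As a rough illustration of how a required-parts check could use this constant, the sketch below subtracts the entry names found in a package from the required list. `missing_required_parts` is a hypothetical helper, and the way PackageDiffer reports the result is not shown in this page, so treat this as an assumption about the general approach.

```ruby
# Same value as the constant documented above.
REQUIRED_PARTS = %w[
  [Content_Types].xml
  _rels/.rels
  word/document.xml
].freeze

# Hypothetical helper: return the required parts that are absent
# from a list of ZIP entry names.
def missing_required_parts(entry_names)
  REQUIRED_PARTS - entry_names
end

entries = ["[Content_Types].xml", "word/document.xml", "word/styles.xml"]
p missing_required_parts(entries)
# => ["_rels/.rels"]
```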
- STANDARD_CONTENT_TYPES =
  Standard DOCX parts and their expected content types.
    {
      "word/document.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
      "word/styles.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml",
      "word/settings.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.settings+xml",
      "word/fontTable.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+xml",
      "word/webSettings.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.webSettings+xml",
      "word/theme/theme1.xml" => "application/vnd.openxmlformats-officedocument.theme+xml",
      "docProps/core.xml" => "application/vnd.openxmlformats-package.core-properties+xml",
      "docProps/app.xml" => "application/vnd.openxmlformats-officedocument.extended-properties+xml",
    }.freeze
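One way such a table might be used is to compare the content types a package declares against the expected ones. The sketch below assumes the declared types have already been extracted from `[Content_Types].xml` into a hash keyed by part name without a leading slash. `EXPECTED_TYPES` is a two-entry excerpt of the constant above, and `content_type_mismatches` is a hypothetical helper, not Uniword's actual validation code.

```ruby
# Excerpt of STANDARD_CONTENT_TYPES, shortened for the example.
EXPECTED_TYPES = {
  "word/document.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
  "word/styles.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.styles+xml",
}.freeze

# Hypothetical helper: report parts whose declared content type differs
# from the expected one. Parts with no declaration are skipped here.
def content_type_mismatches(declared, expected = EXPECTED_TYPES)
  expected.each_with_object([]) do |(part, type), issues|
    next unless declared.key?(part)

    actual = declared[part]
    issues << { part: part, expected: type, actual: actual } unless actual == type
  end
end

declared = {
  "word/document.xml" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
  "word/styles.xml" => "application/xml",
}

content_type_mismatches(declared).each do |issue|
  puts "#{issue[:part]}: expected #{issue[:expected]}, got #{issue[:actual]}"
end
```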
Instance Method Summary
- #diff ⇒ PackageDiffResult
  Perform structural diff and return a PackageDiffResult.
- #initialize(old_path, new_path, canon: false, canon_profile: :spec_friendly) ⇒ PackageDiffer (constructor)
  Initialize with two DOCX file paths.
Constructor Details
#initialize(old_path, new_path, canon: false, canon_profile: :spec_friendly) ⇒ PackageDiffer
Initialize with two DOCX file paths.
    # File 'lib/uniword/diff/package_differ.rb', line 59
    def initialize(old_path, new_path, canon: false, canon_profile: :spec_friendly)
      @old_path = old_path
      @new_path = new_path
      @canon = canon
      @canon_profile = canon_profile
    end
Instance Method Details
#diff ⇒ PackageDiffResult
Perform structural diff and return a PackageDiffResult.
    # File 'lib/uniword/diff/package_differ.rb', line 70
    def diff
      old_zip = Zip::File.open(@old_path)
      new_zip = Zip::File.open(@new_path)
      begin
        part_diff = diff_parts(old_zip, new_zip)
        content_diff = diff_xml_content(old_zip, new_zip, part_diff)
        = (old_zip, new_zip)
        opc = validate_opc(old_zip, new_zip)
      ensure
        old_zip.close
        new_zip.close
      end
      PackageDiffResult.new(
        old_path: @old_path,
        new_path: @new_path,
        added_parts: part_diff[:added],
        removed_parts: part_diff[:removed],
        modified_parts: part_diff[:modified],
        unchanged_parts: part_diff[:unchanged],
        xml_changes: content_diff,
        zip_metadata_changes: ,
        opc_issues: opc,
      )
    end
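To show how a caller might consume the returned object, here is a minimal sketch built on a stand-in struct. `SketchResult` mirrors the keyword arguments passed to `PackageDiffResult.new` above, but it is a hypothetical substitute: the real class's readers and the types of its fields are not documented on this page, and the part lists are assumed to be arrays of part names.

```ruby
# Stand-in for PackageDiffResult (assumption: the real class exposes
# readers matching the keywords used in #diff).
SketchResult = Struct.new(
  :old_path, :new_path,
  :added_parts, :removed_parts, :modified_parts, :unchanged_parts,
  :xml_changes, :zip_metadata_changes, :opc_issues,
  keyword_init: true
)

# Hypothetical helper: true when any part was added, removed, or modified.
def parts_changed?(result)
  [result.added_parts, result.removed_parts, result.modified_parts].any? do |list|
    !list.empty?
  end
end

# Hypothetical helper: build a short text summary of the part-level diff.
def summarize(result)
  [
    "#{result.old_path} -> #{result.new_path}",
    "added: #{result.added_parts.size}",
    "removed: #{result.removed_parts.size}",
    "modified: #{result.modified_parts.size}",
    "unchanged: #{result.unchanged_parts.size}",
  ].join("\n")
end

result = SketchResult.new(
  old_path: "original.docx",
  new_path: "repaired.docx",
  added_parts: ["word/webSettings.xml"],
  removed_parts: [],
  modified_parts: ["word/document.xml"],
  unchanged_parts: ["word/styles.xml"],
  xml_changes: [],
  zip_metadata_changes: [],
  opc_issues: []
)

puts summarize(result)
puts "Package changed" if parts_changed?(result)
```

With the real class, the equivalent call would presumably be `Uniword::Diff::PackageDiffer.new(old_path, new_path).diff`, following the constructor and method signatures documented above.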