Module: Metaclean::Mat2

Defined in:
lib/metaclean/mat2.rb

Constant Summary

SUPPORTED_EXTS =

File extensions we know mat2 can handle. Keep this list conservative: if mat2 doesn’t actually support an extension, the call fails gracefully via UNSUPPORTED_RE below, but we’d rather not even try.

Deliberately absent:

  • Matroska (mkv/webm): mat2 has no parser for it; ffmpeg owns those (Strategy::FFMPEG_FORMATS).
  • QuickTime/MP4-audio (mov/m4a): mat2 can’t write them and ExifTool already cleans them, so listing them only caused a wasted mat2 spawn that always soft-skipped.

WMV (ASF) is here on purpose: mat2 can write it but ExifTool can’t, so mat2 is the only tool that cleans .wmv. Dropping it would make every .wmv permanently :failed.

%w[
  pdf png jpg jpeg tif tiff gif bmp svg webp
  mp3 flac ogg opus wav
  mp4 avi wmv
  docx xlsx pptx odt ods odp odg odf epub
  zip torrent
].freeze
UNSUPPORTED_RE =

Regex matching the messages mat2 prints when it can’t handle a file. We use this to distinguish a “soft skip” from a real error. The `i` flag makes the match case-insensitive.

/(not supported|isn't supported|cannot be cleaned|unsupported file)/i.freeze
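As a rough illustration of how this pattern separates soft skips from hard errors, the sketch below runs it against a few sample strings. Only the regex comes from the source; the sample messages and the `soft_skip?` helper are invented for this example and are not verbatim mat2 output.

```ruby
# Pattern copied from the constant above.
UNSUPPORTED_RE = /(not supported|isn't supported|cannot be cleaned|unsupported file)/i.freeze

# Hypothetical helper: true when a message looks like a soft skip.
def soft_skip?(message)
  message.match?(UNSUPPORTED_RE)
end

# Hypothetical messages, NOT verbatim mat2 output.
samples = [
  "example.xyz's format is not supported",
  'Unsupported file type',
  'Permission denied'
]

samples.each do |message|
  label = soft_skip?(message) ? 'soft skip' : 'hard error'
  puts "#{label}: #{message}"
end
```

The first two samples are classified as soft skips and the third as a hard error. The match is a substring search, so any failure output that happens to contain one of these phrases is treated as a soft skip.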

Class Method Details

.available? ⇒ Boolean

Memoized PATH check (same pattern as Exiftool.available?).

Returns:

  • (Boolean)


# File 'lib/metaclean/mat2.rb', line 43

def available?
  return @available if defined?(@available)

  out, _err, status = Open3.capture3('mat2', '--version')
  @available = status.success?
  # `mat2 --version` prints "mat2 0.14.0" — `.split.last` grabs the
  # version number regardless of whatever prefix appears. Captured here
  # so `version` reuses it instead of re-spawning the binary.
  @version = @available ? out.strip.split.last : nil
  @available
rescue Errno::ENOENT
  @version = nil
  @available = false
end
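The `defined?(@available)` guard matters because the cached value may be `false`. A minimal sketch of the difference follows, using a hypothetical `ProbeDemo` module whose `probe` method stands in for spawning the binary. It is an illustration of the memoization pattern, not the module's actual code.

```ruby
# Hypothetical stand-in for the real module; `probe` replaces the mat2 spawn.
module ProbeDemo
  class << self
    attr_reader :calls

    def available?
      # `defined?` is truthy once the ivar has been assigned, even to false,
      # so a failed probe is cached too. `@available ||= probe` would
      # re-run the probe on every call while the result stays false.
      return @available if defined?(@available)

      @available = probe
    end

    private

    def probe
      @calls = (@calls || 0) + 1
      false # pretend the binary is missing
    end
  end
end

3.times { ProbeDemo.available? }
puts ProbeDemo.calls # the probe ran only once

# Version parsing as described in the source comment:
puts 'mat2 0.14.0'.strip.split.last
```

The last line shows why `.split.last` is used: it keeps the final whitespace-separated token, so the version survives whatever prefix precedes it.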

.cleaned_path_for(path) ⇒ Object

Builds the path mat2 will write to: `name.cleaned.ext`. We use File.dirname/basename/join instead of string concatenation so this works on Windows (\ separator) too.



# File 'lib/metaclean/mat2.rb', line 120

def cleaned_path_for(path)
  dir  = File.dirname(path)
  ext  = File.extname(path)
  stem = File.basename(path, ext)
  File.join(dir, "#{stem}.cleaned#{ext}")
end
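A few sample inputs make the naming rule concrete. The method body below is copied from the listing above so the snippet runs on its own; the paths are made up, and the results show only what this helper computes, not a verified description of mat2's own output naming for every case.

```ruby
# Copied from the listing above so the example is self-contained.
def cleaned_path_for(path)
  dir  = File.dirname(path)
  ext  = File.extname(path)
  stem = File.basename(path, ext)
  File.join(dir, "#{stem}.cleaned#{ext}")
end

puts cleaned_path_for('/tmp/photo.jpg')      # /tmp/photo.cleaned.jpg
puts cleaned_path_for('report.pdf')          # ./report.cleaned.pdf
puts cleaned_path_for('/tmp/archive.tar.gz') # /tmp/archive.tar.cleaned.gz
puts cleaned_path_for('/tmp/README')         # /tmp/README.cleaned
```

Only the last extension is treated as the extension, and a bare filename gains a `./` prefix because `File.dirname` returns `.` for it.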

.strip!(path) ⇒ Object

Strips metadata from `path` in place. Returns:

true           — stripped successfully
:no_metadata   — mat2 ran but found nothing to strip
:unsupported   — mat2 cannot handle this file type

Raises Metaclean::Error on hard failure.

We return symbols (instead of always raising) so the runner can show a friendly “skipped” message and continue with the next tool.



# File 'lib/metaclean/mat2.rb', line 78

def strip!(path)
  raise Error, 'mat2 not available' unless available?

  cleaned = cleaned_path_for(path)
  safe    = Metaclean.safe_path(path)

  # Defensive: if a stale `<name>.cleaned.<ext>` exists from an earlier
  # crashed run, remove it so we don't accidentally use old data.
  File.delete(cleaned) if File.exist?(cleaned)

  out, err, status = Open3.capture3('mat2', safe)

  # Success path first. mat2 only creates `<name>.cleaned.<ext>` when it
  # actually stripped something; no file after exit 0 means there was
  # nothing to remove. We check exit status BEFORE the "unsupported"
  # message so a successful run that merely warns about one embedded
  # stream isn't misreported as a soft skip.
  if status.success?
    return :no_metadata unless File.exist?(cleaned)

    FileUtils.mv(cleaned, safe)
    return true
  end

  # Failure path. A "not supported" message means a soft skip we report
  # so the runner can continue with the next tool, not a hard error.
  combined = "#{out}\n#{err}"
  return :unsupported if combined.match?(UNSUPPORTED_RE)

  # `err.strip.empty? ? out.strip : err.strip` picks whichever stream
  # has actual content — some tools log to stdout, others to stderr.
  raise Error, "mat2 failed: #{err.strip.empty? ? out.strip : err.strip}"
ensure
  # Interrupt-safety: if we were killed (Ctrl-C) between mat2 writing
  # `<name>.cleaned.<ext>` and the rename, don't leave the orphan behind.
  # On the success path it's already moved, so this is a no-op.
  File.delete(cleaned) if cleaned && File.exist?(cleaned)
end
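Because two of the three return values are symbols, a caller has to compare against them explicitly: both `:no_metadata` and `:unsupported` are truthy, so a plain `if Mat2.strip!(path)` would count skips as successes. One way a caller might map the results to messages is sketched below. `describe_strip_result` and its message strings are hypothetical and are not part of Metaclean.

```ruby
# Hypothetical runner-side helper; not part of Metaclean.
def describe_strip_result(result)
  case result
  when true         then 'stripped'
  when :no_metadata then 'skipped: nothing to strip'
  when :unsupported then 'skipped: unsupported file type'
  else raise ArgumentError, "unexpected result: #{result.inspect}"
  end
end

[true, :no_metadata, :unsupported].each do |result|
  puts describe_strip_result(result)
end
```

A real caller would also rescue `Metaclean::Error` around the `strip!` call, since hard failures are raised rather than returned.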

.supports?(path) ⇒ Boolean

Quick check before we even try mat2 on a file. Used by Strategy to decide whether to add :mat2 to the pipeline.

Returns:

  • (Boolean)


# File 'lib/metaclean/mat2.rb', line 64

def supports?(path)
  return false unless available?

  SUPPORTED_EXTS.include?(File.extname(path).downcase.delete('.'))
end
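The extension normalization can be tried in isolation. In this sketch `normalized_ext` and `listed?` are hypothetical helper names, the list is copied from SUPPORTED_EXTS above, and the `available?` check is left out so the snippet runs without mat2 installed.

```ruby
# List copied from the constant above.
SUPPORTED_EXTS = %w[
  pdf png jpg jpeg tif tiff gif bmp svg webp
  mp3 flac ogg opus wav
  mp4 avi wmv
  docx xlsx pptx odt ods odp odg odf epub
  zip torrent
].freeze

# Hypothetical helper: the same expression supports? uses.
def normalized_ext(path)
  File.extname(path).downcase.delete('.')
end

# Hypothetical helper: the list lookup without the available? check.
def listed?(path)
  SUPPORTED_EXTS.include?(normalized_ext(path))
end

puts normalized_ext('Photo.JPG')      # jpg
puts normalized_ext('archive.tar.gz') # gz
puts listed?('movie.WMV')             # true
puts listed?('clip.mkv')              # false
puts listed?('README')                # false
```

Files without an extension normalize to an empty string, which is not in the list, so they are never offered to mat2.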

.version ⇒ Object



# File 'lib/metaclean/mat2.rb', line 58

def version
  available? ? @version : nil
end