Class: MsgExtractor::Cfbf::File

Inherits:
Object
Defined in:
lib/msg_extractor/cfbf/file.rb

Overview

The assembled compound file: the directory tree plus stream extraction. Streams smaller than the mini-stream cutoff (4096 bytes) live in the ministream and are chained through the miniFAT; streams at or above the cutoff are chained directly through the FAT.
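
The size rule above can be sketched as a small standalone function. This is an illustration only: `allocation_table_for` is a hypothetical helper, not part of MsgExtractor, and the default cutoff of 4096 is the value quoted in the overview (the class itself reads the cutoff from the header).

```ruby
# Hypothetical helper, not part of MsgExtractor: names the allocation table
# that a stream of the given size would be chained through.
def allocation_table_for(size, cutoff: 4096)
  return :none if size.zero?      # empty streams occupy no sectors
  size < cutoff ? :minifat : :fat # a stream of exactly `cutoff` bytes uses the FAT
end

allocation_table_for(0)    # => :none
allocation_table_for(4095) # => :minifat
allocation_table_for(4096) # => :fat
```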

Constructor Details

#initialize(data) ⇒ File

Returns a new instance of File.

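Raises:

(InvalidFormatError) if data is shorter than 512 bytes
(CorruptFileError) if the root ministream size exceeds the file size, or its sector chain is shorter than the declared size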

# File 'lib/msg_extractor/cfbf/file.rb', line 34

def initialize(data)
  @data = data.encoding == Encoding::BINARY ? data : data.b
  raise InvalidFormatError, "file too small to be an OLE2 file" if @data.bytesize < 512
  @header = Header.new(@data.byteslice(0, 512))
  @fat = Fat.new(@data, @header)
  @directory = Directory.new(@fat.read_chain(@header.first_dir_sector))
  @root = @directory.root
  @mini_stream =
    if @root.size.zero?
      "".b
    else
      if @root.size > @data.bytesize
        raise CorruptFileError, "root ministream size #{@root.size} exceeds file size"
      end
      chain_bytes = @fat.read_chain(@root.start_sector)
      if chain_bytes.bytesize < @root.size
        raise CorruptFileError, "root ministream chain shorter than declared size"
      end
      chain_bytes.byteslice(0, @root.size)
    end
  @minifat =
    if @header.num_minifat_sectors.positive? && @header.first_minifat_sector != ENDOFCHAIN
      @fat.read_chain(@header.first_minifat_sector).unpack("V*")
    else
      []
    end
end
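
The constructor decodes the miniFAT with `unpack("V*")`, which turns the raw sector bytes into an array of little-endian 32-bit integers where entry `i` holds the index of the sector that follows sector `i`. A minimal sketch of how such a table can be walked is shown here. `chain_sectors` is a hypothetical helper, the loop and bounds checks are illustrative additions, and `0xFFFFFFFE` is the end-of-chain marker from the compound file format (assumed to match the library's `ENDOFCHAIN`).

```ruby
# Hypothetical helper: lists the sector indices of the chain starting at `start`.
# `end_of_chain` defaults to 0xFFFFFFFE, the format's end-of-chain marker.
def chain_sectors(table, start, end_of_chain: 0xFFFFFFFE)
  sectors = []
  current = start
  until current == end_of_chain
    raise ArgumentError, "sector #{current} is outside the table" if current >= table.length
    raise ArgumentError, "chain loops back to sector #{current}" if sectors.include?(current)
    sectors << current
    current = table[current]
  end
  sectors
end

# Four entries: sector 0 -> 2 -> 3 -> end; sector 1 is a one-sector chain.
raw = [2, 0xFFFFFFFE, 3, 0xFFFFFFFE].pack("V*")
table = raw.unpack("V*")

chain_sectors(table, 0) # => [0, 2, 3]
chain_sectors(table, 1) # => [1]
```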

Instance Attribute Details

#root ⇒ Object (readonly)

Returns the value of attribute root.

# File 'lib/msg_extractor/cfbf/file.rb', line 8

def root
  @root
end

Class Method Details

.read(source) ⇒ Object

Accepts a filesystem path, a binary String of file content, or an IO. Strings beginning with the OLE signature are treated as content; all other Strings are treated as filesystem paths.

# File 'lib/msg_extractor/cfbf/file.rb', line 13

def self.read(source)
  data =
    if source.is_a?(::String)
      b = source.b
      if b.byteslice(0, 8) == Header::SIGNATURE
        b
      else
        begin
          ::File.binread(source)
        rescue Errno::ENAMETOOLONG, Errno::EINVAL, ArgumentError
          raise InvalidFormatError, "not an OLE2 file and not a readable path"
        end
      end
    elsif source.respond_to?(:read)
      source.read.b
    else
      raise ArgumentError, "cannot read MSG from #{source.class}"
    end
  new(data)
end
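
The dispatch order in `.read` can be illustrated without touching the filesystem. In this sketch `source_kind` and `ole_signature` are hypothetical helpers, and the signature bytes are the standard OLE2 magic number, which `Header::SIGNATURE` is assumed to hold.

```ruby
require "stringio"

# The 8-byte magic number that opens every OLE2 compound file
# (assumed to be the value of Header::SIGNATURE).
def ole_signature
  "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1".b
end

# Hypothetical helper mirroring the order of checks in .read.
def source_kind(source)
  if source.is_a?(String)
    source.b.byteslice(0, 8) == ole_signature ? :content : :path
  elsif source.respond_to?(:read)
    :io
  else
    :unsupported # .read raises ArgumentError in this case
  end
end

source_kind(ole_signature + "\0" * 504) # => :content
source_kind("mail/message.msg")         # => :path
source_kind(StringIO.new("x"))          # => :io
source_kind(42)                         # => :unsupported
```

One consequence of this order is that a String which is neither OLE content nor a usable path is still handed to `::File.binread`, so truncated or non-OLE content passed as a String surfaces as a path error rather than a format error unless it triggers one of the rescued exceptions.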

Instance Method Details

#entry(path) ⇒ Object

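Looks up a directory entry by path, starting at the root. Path components are separated by “/” and matched case-insensitively. Returns nil if any component is not found.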

# File 'lib/msg_extractor/cfbf/file.rb', line 63

def entry(path)
  current = @root
  path.split("/").each do |part|
    current = current.children[part.upcase]
    return nil unless current
  end
  current
end
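
The lookup can be tried in isolation with plain Hashes standing in for directory entries. `make_node` and `find_entry` are hypothetical helpers, and the sketch assumes, as the `part.upcase` lookup suggests, that children are keyed by upcased name. The entry names are illustrative.

```ruby
# Hypothetical stand-in for a directory entry: a Hash holding the entry name
# and its children keyed by upcased name (an assumption about the real class).
def make_node(name, *kids)
  { name: name, children: kids.to_h { |kid| [kid[:name].upcase, kid] } }
end

# Walks one path component at a time, following the same steps as #entry.
def find_entry(root, path)
  current = root
  path.split("/").each do |part|
    current = current[:children][part.upcase]
    return nil unless current
  end
  current
end

tree = make_node("Root Entry",
                 make_node("__attach_version1.0_#00000000",
                           make_node("__substg1.0_3707001F")))

find_entry(tree, "__ATTACH_VERSION1.0_#00000000/__substg1.0_3707001f") # returns the innermost node
find_entry(tree, "missing")  # => nil
find_entry(tree, "")         # returns tree itself, because "".split("/") is empty
find_entry(tree, "/missing") # => nil, the leading "/" yields an empty first component
```

Under these assumptions a leading “/” makes every lookup fail, since the empty first component matches no child, while an empty path resolves to the root.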

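#read_stream(entry) ⇒ Object

Returns the content of entry as a binary String. Empty entries yield an empty String. Entries smaller than the header's mini-stream cutoff are read through read_mini_stream; all others are read from the FAT chain and trimmed to the declared size. Raises CorruptFileError if the declared size exceeds the file size or the FAT chain is shorter than the declared size.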

# File 'lib/msg_extractor/cfbf/file.rb', line 77

def read_stream(entry)
  return "".b if entry.size.zero?
  if entry.size > @data.bytesize
    raise CorruptFileError, "stream entry size #{entry.size} exceeds file size"
  end
  if entry.size < @header.mini_stream_cutoff
    read_mini_stream(entry)
  else
    chain_bytes = @fat.read_chain(entry.start_sector)
    if chain_bytes.bytesize < entry.size
      raise CorruptFileError, "FAT stream chain shorter than declared size"
    end
    chain_bytes.byteslice(0, entry.size)
  end
end
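
A sector chain always spans whole sectors, so the bytes read from it are usually longer than the stream they hold, which is why the FAT branch trims the result with `byteslice`. The sketch below shows that step on its own. `assemble_stream` is a hypothetical helper, the 512-byte sectors are illustrative, and `ArgumentError` stands in for the library's `CorruptFileError`.

```ruby
# Hypothetical helper: joins the sectors of a chain and trims the result to
# the size declared in the directory entry.
def assemble_stream(sectors, declared_size)
  chain_bytes = sectors.join.b
  if chain_bytes.bytesize < declared_size
    # The library raises CorruptFileError here; ArgumentError is a stand-in.
    raise ArgumentError, "chain shorter than declared size"
  end
  chain_bytes.byteslice(0, declared_size) # drop the padding in the last sector
end

sectors = ["A" * 512, "B" * 512]

assemble_stream(sectors, 600).bytesize # => 600
assemble_stream(sectors, 1024) == sectors.join # => true
```

Asking for more bytes than the chain holds, for example `assemble_stream(sectors, 2000)`, raises instead of returning a short String.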

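#stream(path) ⇒ Object

Returns the content of the stream at path as a binary String, or nil if the path does not resolve to an entry or the entry is not a stream.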

# File 'lib/msg_extractor/cfbf/file.rb', line 72

def stream(path)
  e = entry(path)
  e&.stream? ? read_stream(e) : nil
end