Class: MimeMagic
- Inherits: Object
- Defined in: lib/mimemagic.rb, lib/mimemagic/tables.rb, lib/mimemagic/version.rb
Overview
Mime type detection
Constant Summary
- EXTENSIONS = {}
- TYPES = {}
- MAGIC = []
- VERSION = '0.5.7'.freeze (MimeMagic version string)
Instance Attribute Summary
-
#mediatype ⇒ Object
readonly
Returns the value of attribute mediatype.
-
#params ⇒ Object
readonly
Returns the value of attribute params.
-
#subtype ⇒ Object
readonly
Returns the value of attribute subtype.
-
#type ⇒ Object
readonly
Returns the value of attribute type.
Class Method Summary
-
.[](type) ⇒ MimeMagic?
Syntactic sugar alias for constructor.
-
.add(type, extensions: [], parents: [], magic: [], comment: nil, aliases: []) ⇒ Object
Add a custom MIME type to the internal dictionary.
-
.aliases(type) ⇒ Array<MimeMagic>
Return the type's aliases.
-
.all_by_magic(io, default: false) ⇒ Array<MimeMagic>
Return all matching MIME types by magic content analysis.
-
.binary?(thing) ⇒ true, ...
Determine if an input is binary.
-
.by_extension(ext, default: false) ⇒ nil, MimeMagic
Look up MIME type by file extension.
-
.by_magic(io, default: false) ⇒ nil, MimeMagic
Look up MIME type by magic content analysis.
-
.by_path(path, default: false) ⇒ nil, MimeMagic
Look up MIME type by file path.
-
.canonical(type) ⇒ MimeMagic?
Return the canonical type.
-
.child?(child, parent, recurse: true) ⇒ true, false
Returns true if type is child of parent type.
- .coerce_default(thing, default) ⇒ Object
-
.default_type(thing = nil) ⇒ MimeMagic
Return either application/octet-stream or text/plain depending on whether the thing is binary.
- .get_matches(parent) ⇒ Object
- .magic_match(io, method) ⇒ Object
- .magic_match_io(io, matches, buffer) ⇒ Object
- .open_mime_database ⇒ Object
- .parse_database ⇒ Object
-
.remove(type) ⇒ Object
Removes a MIME type from the dictionary.
- .str2int(s) ⇒ Object
Instance Method Summary
-
#alias? ⇒ false, true
Determine if the type is an alias.
-
#aliases ⇒ Array<MimeMagic>
Return the type's aliases.
-
#audio? ⇒ Boolean
Determine if the type is audio.
-
#binary? ⇒ true, ...
Determine if the type is binary, that is, not a descendant of text/plain.
-
#canonical ⇒ MimeMagic?
Return the canonical type.
-
#child_of?(parent, recurse: true) ⇒ true, false
Returns true if type is child of parent type.
-
#comment ⇒ nil, String
Get MIME comment.
-
#descendant_of?(ancestor) ⇒ true, false
Returns true if the ancestor type is anywhere in the subject type's lineage.
-
#eql?(other) ⇒ false, true
(also: #==)
Compare the equality of the type with another (or plain string).
-
#extensions ⇒ Array<String>
Get string list of file extensions.
-
#hash ⇒ Integer
Return the object's (the underlying type string) hash.
-
#image? ⇒ Boolean
Determine if the type is an image.
-
#initialize(type) ⇒ MimeMagic
constructor
Initialize a new MIME type by its string representation.
-
#inspect ⇒ String
Return a diagnostic representation of the object.
-
#lineage ⇒ Array<MimeMagic>
(also: #ancestor_types)
Fetches the entire inheritance hierarchy for the given MIME type.
-
#parents ⇒ Array<MimeMagic>
Fetches the immediate parent types.
-
#text? ⇒ Boolean
Returns true if type is a text format.
-
#to_s ⇒ String
Return the type as a string.
-
#video? ⇒ Boolean
Determine if the type is video.
Constructor Details
#initialize(type) ⇒ MimeMagic
Initialize a new MIME type by its string representation.
# File 'lib/mimemagic.rb', line 18
def initialize(type)
  @type, *params = type.to_s.strip.split(/(?:\s*;\s*)+/) # chop off params
  @type.downcase! # normalize the case
  # split parameter-value pairs if present
  @params = params.map { |x| x.split(/\s*=\s*/, 2) } unless params.empty?
  @mediatype, @subtype = @type.split ?/, 2 # split major and minor
end
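The parsing steps in the constructor can be tried in isolation. The following is a minimal standalone sketch, not part of the library: `parse_mime` is a hypothetical helper that returns a Hash instead of setting instance variables, and it returns an empty parameter list where the constructor would leave `@params` as nil.

```ruby
# Hypothetical helper (not part of MimeMagic) that mirrors the constructor's
# parsing steps and returns the pieces as a Hash.
def parse_mime(string)
  # Split the parameters off on runs of semicolons with optional whitespace.
  type, *params = string.to_s.strip.split(/(?:\s*;\s*)+/)
  type = type.downcase # normalize the case
  # Each parameter becomes a [name, value] pair.
  params = params.map { |pair| pair.split(/\s*=\s*/, 2) }
  # Split the media type (major) from the subtype (minor).
  mediatype, subtype = type.split('/', 2)
  { type: type, params: params, mediatype: mediatype, subtype: subtype }
end

parsed = parse_mime('Text/HTML; charset=UTF-8')
puts parsed.inspect
```

Only the type itself is lowercased; parameter values such as `UTF-8` keep their original case.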
Instance Attribute Details
#mediatype ⇒ Object (readonly)
Returns the value of attribute mediatype.
# File 'lib/mimemagic.rb', line 12
def mediatype
  @mediatype
end
#params ⇒ Object (readonly)
Returns the value of attribute params.
# File 'lib/mimemagic.rb', line 12
def params
  @params
end
#subtype ⇒ Object (readonly)
Returns the value of attribute subtype.
# File 'lib/mimemagic.rb', line 12
def subtype
  @subtype
end
#type ⇒ Object (readonly)
Returns the value of attribute type.
# File 'lib/mimemagic.rb', line 12
def type
  @type
end
Class Method Details
.[](type) ⇒ MimeMagic?
Syntactic sugar alias for constructor. No-op if type is already
a MimeMagic object. The argument is treated as a file extension
if it doesn't contain a /, and may return nil if it doesn't
resolve.
# File 'lib/mimemagic.rb', line 36
def self.[] type
  # try noop first
  return type if type.is_a? self

  # now we handle the string
  type = type.to_s.strip

  # empty string should be default
  return default_type if type.empty?

  # this may return null
  return by_extension type unless type.include? ?/

  # otherwise pass to constructor
  new type
end
.add(type, extensions: [], parents: [], magic: [], comment: nil, aliases: []) ⇒ Object
Add a custom MIME type to the internal dictionary.
# File 'lib/mimemagic.rb', line 61
def self.add type, extensions: [], parents: [], magic: [],
    comment: nil, aliases: []
  type = type.to_s.strip.downcase
  extensions = [extensions].flatten.compact
  aliases = [[aliases] || []].flatten.compact
  t = TYPES[type] = [extensions, [parents].flatten.compact,
    comment, type, aliases]
  aliases.each { |a| TYPES[a] = t }
  extensions.each {|ext| EXTENSIONS[ext] ||= type }

  MAGIC.unshift [type, magic] if magic

  true # output is ignored
end
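To see how the three tables interact when a type is added, here is a small self-contained sketch. `TinyRegistry` is a hypothetical stand-in for the TYPES, EXTENSIONS and MAGIC constants; it copies the bookkeeping shown above and is not the library's class.

```ruby
# Hypothetical miniature registry (not part of MimeMagic) that mimics the
# bookkeeping performed by MimeMagic.add.
class TinyRegistry
  attr_reader :types, :extensions, :magic

  def initialize
    @types = {}      # type or alias => shared record
    @extensions = {} # extension => type
    @magic = []      # [type, rules], tried in order
  end

  def add(type, extensions: [], parents: [], magic: [], comment: nil, aliases: [])
    type = type.to_s.strip.downcase
    record = [extensions, parents, comment, type, aliases]
    @types[type] = record
    aliases.each { |a| @types[a] = record }      # aliases share the canonical record
    extensions.each { |e| @extensions[e] ||= type } # an already claimed extension is kept
    @magic.unshift([type, magic])                # newest rules are tried first
    true
  end
end

registry = TinyRegistry.new
registry.add('application/x-example', extensions: %w[exm], aliases: %w[application/example])
registry.add('application/x-other', extensions: %w[exm])
puts registry.extensions['exm']
```

Because of the `||=`, a later registration does not take over an extension that is already mapped, while its magic rules are placed ahead of the earlier ones.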
.aliases(type) ⇒ Array<MimeMagic>
Return the type's aliases.
# File 'lib/mimemagic.rb', line 356
def self.aliases type
  self[type].aliases
end
.all_by_magic(io, default: false) ⇒ Array<MimeMagic>
Return all matching MIME types by magic content analysis. When
default is true or a value, the result will never be empty.
Note: This is a relatively slow operation.
# File 'lib/mimemagic.rb', line 322
def self.all_by_magic io, default: false
  default = coerce_default io, default
  out = magic_match(io, :select).map { |mime| new mime.first }
  out << default if out.empty? and default
  out
end
.binary?(thing) ⇒ true, ...
Determine if an input is binary. Not to be confused with the instance method #binary?, which concerns the type.
# File 'lib/mimemagic.rb', line 369
def self.binary? thing
  sample = ''

  # get some stuff out of the IO or get a substring
  if thing.is_a? MimeMagic
    return thing.binary?
  elsif %i[seek tell read].all? { |m| thing.respond_to? m }
    pos = thing.tell
    thing.seek 0, 0
    sample = thing.read(256).to_s # handle empty
    thing.seek pos
  elsif thing.respond_to? :to_s
    str = thing.to_s
    # if it contains a slash it could be either a path or mimetype
    test = if str.include? ?/
             canonical(str) || by_extension(str.split(?.).last)
           else
             by_extension str.split(?.).last
           end

    return test.binary? if test

    sample = str[0, 256]
  else
    # nil if we don't know what this thing is
    return
  end

  # consider this to be 'binary' if empty
  return true if sample.empty?
  # control codes minus ordinary whitespace
  /[\x0-\x8\xe-\x1f\x7f]/n.match? sample.b
end
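The last two lines of the method carry the actual heuristic: a sample counts as binary if it is empty or contains a control byte other than ordinary whitespace. A minimal sketch of just that step, using a hypothetical helper name `looks_binary?` that is not part of the library:

```ruby
# Hypothetical helper (not part of MimeMagic) reproducing only the final
# byte-level check of the class method shown above.
def looks_binary?(sample)
  # An empty sample is treated as binary.
  return true if sample.empty?
  # Control codes 0x00-0x08 and 0x0e-0x1f plus DEL (0x7f); tab, LF, VT, FF
  # and CR (0x09-0x0d) are deliberately left out of the character class.
  /[\x00-\x08\x0e-\x1f\x7f]/n.match?(sample.b)
end

puts looks_binary?("plain text\n")
puts looks_binary?("\x00\x01\x02 data")
```

Bytes above 0x7f do not trigger the check, so UTF-8 encoded text with non-ASCII characters is still classified as text by this step.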
.by_extension(ext, default: false) ⇒ nil, MimeMagic
Look up MIME type by file extension. When default is true or a
value, this method will always return a value.
# File 'lib/mimemagic.rb', line 277
def self.by_extension ext, default: false
  ext = ext.to_s.downcase.delete_prefix ?.
  default = coerce_default '', default
  mime = EXTENSIONS[ext]
  mime ? new(mime) : default
end
.by_magic(io, default: false) ⇒ nil, MimeMagic
Look up MIME type by magic content analysis. When default is true or a
value, this method will always return a value.
Note: This is a relatively slow operation.
# File 'lib/mimemagic.rb', line 306
def self.by_magic io, default: false
  default = coerce_default io, default
  mime = magic_match(io, :find) or return default
  new mime.first
end
.by_path(path, default: false) ⇒ nil, MimeMagic
Look up MIME type by file path. When default is true or a value,
this method will always return a value.
# File 'lib/mimemagic.rb', line 292
def self.by_path path, default: false
  by_extension(File.extname(path), default: default)
end
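Path lookup is extension lookup applied to `File.extname`, so only the final extension of a path is considered. The sketch below illustrates the normalization under stated assumptions: `EXT_TABLE` is a hypothetical two-entry stand-in for EXTENSIONS, and `lookup_by_path` returns plain strings rather than MimeMagic objects.

```ruby
# Hypothetical stand-in for the EXTENSIONS table (not the real data).
EXT_TABLE = { 'png' => 'image/png', 'txt' => 'text/plain' }.freeze

# Hypothetical helper combining the steps of by_path and by_extension.
def lookup_by_path(path)
  # File.extname keeps the leading dot and the original case, so both are
  # normalized before the table lookup.
  ext = File.extname(path).to_s.downcase.delete_prefix('.')
  EXT_TABLE[ext]
end

puts lookup_by_path('/tmp/Photo.PNG').inspect
puts lookup_by_path('archive.tar.gz').inspect
puts lookup_by_path('README').inspect
```

For a name like `archive.tar.gz`, `File.extname` yields `.gz`, so a compound extension such as `tar.gz` is never looked up as a whole.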
.canonical(type) ⇒ MimeMagic?
Return the canonical type.
# File 'lib/mimemagic.rb', line 346
def self.canonical type
  self[type].canonical
end
.child?(child, parent, recurse: true) ⇒ true, false
Returns true if type is child of parent type.
# File 'lib/mimemagic.rb', line 336
def self.child?(child, parent, recurse: true)
  self[child].child_of? parent, recurse: recurse
end
.coerce_default(thing, default) ⇒ Object
# File 'lib/mimemagic.rb', line 417
def self.coerce_default thing, default
  case default
  when nil, false then nil
  when true then default_type thing
  when MimeMagic then default
  when String, -> x { x.respond_to? :to_s } then new default
  else default_type thing
  end
end
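The `case` above mixes class tests with a lambda, which works because `case` calls `===` on each `when` value and `Proc#===` invokes the proc. A minimal sketch of the same dispatch, with a hypothetical `describe_default` that returns symbols instead of MimeMagic objects and omits the MimeMagic branch:

```ruby
# Hypothetical illustration (not part of MimeMagic) of the dispatch used to
# interpret the default: keyword.
def describe_default(default)
  case default
  when nil, false then :no_default       # the lookup may return nil
  when true then :sniffed_default        # fall back to a generic type
  when String, ->(x) { x.respond_to?(:to_s) } then :explicit_default
  else :sniffed_default
  end
end

puts describe_default(false)
puts describe_default(true)
puts describe_default('text/plain')
```

Since almost every Ruby object responds to `to_s`, the lambda branch catches nearly any other value, and the `else` branch is rarely reached.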
.default_type(thing = nil) ⇒ MimeMagic
Return either application/octet-stream or text/plain depending
on whether the thing is binary.
# File 'lib/mimemagic.rb', line 410
def self.default_type thing = nil
  return new 'application/octet-stream' unless thing
  new(binary?(thing) ? 'application/octet-stream' : 'text/plain')
end
.get_matches(parent) ⇒ Object
# File 'lib/mimemagic/tables.rb', line 18
def self.get_matches(parent)
  parent.elements.map {|match|
    if match['mask']
      nil
    else
      type = match['type']
      value = match['value']
      offset = match['offset'].split(':').map {|x| x.to_i }
      offset = offset.size == 2 ? offset[0]..offset[1] : offset[0]
      case type
      when 'string'
        # This *one* pattern match, in the entirety of fd.o's mime types blows up the parser
        # because of the escape character \c, so right here we have a hideous hack to
        # accommodate that.
        if value == '\chapter'
          '\chapter'
        else
          value.gsub!(/\\(x[\dA-Fa-f]{1,2}|0\d{1,3}|\d{1,3}|.)/) { eval("\"\\#{$1}\"") }
        end
      when 'big16'
        value = str2int(value)
        value = ((value >> 8).chr + (value & 0xFF).chr)
      when 'big32'
        value = str2int(value)
        value = (((value >> 24) & 0xFF).chr + ((value >> 16) & 0xFF).chr + ((value >> 8) & 0xFF).chr + (value & 0xFF).chr)
      when 'little16'
        value = str2int(value)
        value = ((value & 0xFF).chr + (value >> 8).chr)
      when 'little32'
        value = str2int(value)
        value = ((value & 0xFF).chr + ((value >> 8) & 0xFF).chr + ((value >> 16) & 0xFF).chr + ((value >> 24) & 0xFF).chr)
      when 'host16' # use little endian
        value = str2int(value)
        value = ((value & 0xFF).chr + (value >> 8).chr)
      when 'host32' # use little endian
        value = str2int(value)
        value = ((value & 0xFF).chr + ((value >> 8) & 0xFF).chr + ((value >> 16) & 0xFF).chr + ((value >> 24) & 0xFF).chr)
      when 'byte'
        value = str2int(value)
        value = value.chr
      end
      children = get_matches(match)
      children.empty? ? [offset, value] : [offset, value, children]
    end
  }.compact
end
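The numeric branches above turn an integer into the byte string that should appear in the file, in big-endian or little-endian order. The sketch below isolates two of them; `big32` and `little16` are hypothetical helper names, and the comparison with `Array#pack` is offered only as a cross-check, not as something the library does.

```ruby
# Hypothetical helpers (not part of MimeMagic) reproducing the shift-and-mask
# arithmetic of the 'big32' and 'little16' branches.
def big32(value)
  ((value >> 24) & 0xFF).chr + ((value >> 16) & 0xFF).chr +
    ((value >> 8) & 0xFF).chr + (value & 0xFF).chr
end

def little16(value)
  (value & 0xFF).chr + (value >> 8).chr
end

# Most significant byte first for the big-endian form.
puts big32(0x89504E47).bytes.inspect
# Least significant byte first for the little-endian form.
puts little16(0x4D42).bytes.inspect
# Array#pack with 'N' (32-bit big-endian) and 'v' (16-bit little-endian)
# yields the same byte sequences.
puts [0x89504E47].pack('N').bytes == big32(0x89504E47).bytes
puts [0x4D42].pack('v').bytes == little16(0x4D42).bytes
```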
.magic_match(io, method) ⇒ Object
# File 'lib/mimemagic.rb', line 427
def self.magic_match(io, method)
  return magic_match(StringIO.new(io.to_s), method) unless io.respond_to?(:read)

  io.binmode if io.respond_to?(:binmode)
  io.set_encoding(Encoding::BINARY) if io.respond_to?(:set_encoding)
  buffer = "".encode(Encoding::BINARY)

  MAGIC.send(method) { |type, matches| magic_match_io(io, matches, buffer) }
end
.magic_match_io(io, matches, buffer) ⇒ Object
# File 'lib/mimemagic.rb', line 437
def self.magic_match_io(io, matches, buffer)
  matches.any? do |offset, value, children|
    match =
      if Range === offset
        io.read(offset.begin, buffer)
        x = io.read(offset.end - offset.begin + value.bytesize, buffer)
        x && x.include?(value)
      else
        io.read(offset, buffer)
        io.read(value.bytesize, buffer) == value
      end
    io.rewind
    match && (!children || magic_match_io(io, children, buffer))
  end
end
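A single rule is matched either at a fixed offset or anywhere within a range of start offsets. The following simplified sketch shows both cases against a `StringIO`; `rule_matches?` is a hypothetical helper that handles one rule without child rules and without the shared read buffer.

```ruby
require 'stringio'

# Hypothetical helper (not part of MimeMagic): test one [offset, value] rule.
# offset is either an Integer or a Range of permitted start positions.
def rule_matches?(io, offset, value)
  io.rewind
  if offset.is_a?(Range)
    io.read(offset.begin) # skip to the first permitted start position
    # Read a window wide enough for the value starting at any offset in range.
    window = io.read(offset.end - offset.begin + value.bytesize)
    !window.nil? && window.include?(value)
  else
    io.read(offset) # skip to the fixed offset
    io.read(value.bytesize) == value
  end
end

png = StringIO.new("\x89PNG\r\n\x1a\n".b)
puts rule_matches?(png, 0, "\x89PNG".b)

pdf = StringIO.new('xx%PDF-1.7'.b)
puts rule_matches?(pdf, 0..4, '%PDF')
```

In the library method a rule with children only succeeds if one of the child rules also matches; this sketch leaves that recursion out.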
.open_mime_database ⇒ Object
# File 'lib/mimemagic/tables.rb', line 67
def self.open_mime_database
  path = MimeMagic::DATABASE_PATH
  File.open(path)
end
.parse_database ⇒ Object
# File 'lib/mimemagic/tables.rb', line 72
def self.parse_database
  file = open_mime_database

  doc = Nokogiri::XML(file)
  extensions = {}
  types = {}
  magics = []
  (doc/'mime-info/mime-type').each do |mime|
    comments = Hash[*(mime/'comment').map {|comment| [comment['xml:lang'], comment.inner_text] }.flatten]
    type = mime['type']
    subclass = (mime/'sub-class-of').map{|x| x['type']}
    exts = (mime/'glob').map do |x|
      x['pattern'] =~ /^\*\.([^\[\]]+)$/ ? $1.downcase : nil
    end.compact
    (mime/'magic').each do |magic|
      priority = magic['priority'].to_i
      matches = get_matches(magic)
      magics << [priority, type, matches]
    end

    aliases = (mime/'alias/@type').map { |a| a.value.downcase.strip.freeze }

    # XXX uhh do we only use the type if it has a file extension??
    unless exts.empty?
      exts.each { |x| extensions[x] ||= type }
      types[type] = [exts, subclass, comments[nil], type, aliases]
      # don't add the aliases yet; we do that below
    end
  end

  magics = magics.sort {|a,b| [-a[0],a[1]] <=> [-b[0],b[1]] }

  common_types = [
    "image/jpeg", # .jpg
    "image/png", # .png
    "image/gif", # .gif
    "image/tiff", # .tiff
    "image/bmp", # .bmp
    "image/vnd.adobe.photoshop", # .psd
    "image/webp", # .webp
    "image/svg+xml", # .svg
    "video/x-msvideo", # .avi
    "video/x-ms-wmv", # .wmv
    "video/mp4", # .mp4, .m4v
    "video/quicktime", # .mov
    "video/mpeg", # .mpeg
    "video/ogg", # .ogv
    "video/webm", # .webm
    "video/x-matroska", # .mkv
    "video/x-flv", # .flv
    "audio/mpeg", # .mp3
    "audio/x-wav", # .wav
    "audio/aac", # .aac
    "audio/flac", # .flac
    "audio/mp4", # .m4a
    "audio/ogg", # .ogg
    "application/pdf", # .pdf
    "application/msword", # .doc
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document", # .docx
    "application/vnd.ms-powerpoint", # .pps
    "application/vnd.openxmlformats-officedocument.presentationml.slideshow", # .ppsx
    "application/vnd.ms-excel", # .pps
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", # .ppsx
  ]

  common_magics = common_types.map do |common_type|
    magics.find { |_, type, _| type == common_type }
  end

  magics = (common_magics.compact + magics).uniq

  extensions.keys.sort.each do |key|
    EXTENSIONS[key] = extensions[key]
  end

  types.keys.sort.each do |key|
    exts, parents, comment, canon, aliases = *types[key]
    parents.sort!
    aliases.sort!
    # we are copying it i guess
    t = TYPES[key] = [exts, parents, comment, canon, aliases].freeze
    # now do the aliases oops they'll be out of order oh well
    aliases.each { |a| TYPES[a] = t }
  end

  magics.each do |priority, type, matches|
    MAGIC << [type, matches]
  end
end
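The order of the MAGIC table comes from two steps in the listing: a sort by descending priority with the type name as tie-breaker, then a promotion of the common types to the front. A minimal sketch with made-up data follows; `order_magics` and the sample entries are hypothetical, and the rule payloads are replaced by symbols.

```ruby
# Hypothetical helper (not part of MimeMagic) reproducing the two ordering
# steps applied to [priority, type, matches] triples.
def order_magics(magics, common_types)
  # Highest priority first; equal priorities are ordered by type name.
  sorted = magics.sort { |a, b| [-a[0], a[1]] <=> [-b[0], b[1]] }
  # Move the entries for common types to the front, in the listed order.
  promoted = common_types.map { |c| sorted.find { |_, type, _| type == c } }
  # uniq drops the second copy of each promoted entry.
  (promoted.compact + sorted).uniq
end

sample = [
  [50, 'text/x-example', :rules_a],
  [80, 'image/png', :rules_b],
  [80, 'application/pdf', :rules_c],
  [90, 'application/x-rare', :rules_d]
]
ordered = order_magics(sample, %w[image/png application/pdf])
puts ordered.map { |_, type, _| type }.inspect
```

Promotion means a common type is tested before a rarer type even when the rarer type has a higher priority in the database.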
.remove(type) ⇒ Object
Removes a MIME type from the dictionary. You might want to do this if you're seeing impossible conflicts (for instance, application/x-gmc-link).
Note: All associated extensions and magic are removed too.
# File 'lib/mimemagic.rb', line 83
def self.remove(type)
  EXTENSIONS.delete_if {|ext, t| t == type }
  MAGIC.delete_if {|t, m| t == type }
  TYPES.delete(type)

  true # output is also ignored
end
.str2int(s) ⇒ Object
# File 'lib/mimemagic/tables.rb', line 12
def self.str2int(s)
  return s.to_i(16) if s[0..1].downcase == '0x'
  return s.to_i(8) if s[0..0].downcase == '0'
  s.to_i(10)
end
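The prefix decides the base: `0x` selects hexadecimal, a leading `0` selects octal, and anything else is read as decimal. The sketch below is a standalone copy of that logic under a hypothetical name, `parse_magic_int`, so it can be run without loading the library.

```ruby
# Hypothetical standalone copy (not the library method itself) of the
# prefix-based integer parsing.
def parse_magic_int(s)
  return s.to_i(16) if s[0..1].downcase == '0x' # String#to_i(16) accepts the 0x prefix
  return s.to_i(8) if s[0..0] == '0'            # leading zero means octal
  s.to_i(10)
end

puts parse_magic_int('0xFF')
puts parse_magic_int('0755')
puts parse_magic_int('42')
```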
Instance Method Details
#alias? ⇒ false, true
Determine if the type is an alias.
# File 'lib/mimemagic.rb', line 148
def alias?
  type != canonical.type
end
#aliases ⇒ Array<MimeMagic>
Return the type's aliases.
# File 'lib/mimemagic.rb', line 138
def aliases
  TYPES.fetch(type.downcase, [nil, nil, nil, nil, []])[4].map do |t|
    self.class.new t
  end
end
#audio? ⇒ Boolean
Determine if the type is audio.
# File 'lib/mimemagic.rb', line 98
def audio?; mediatype == 'audio'; end
#binary? ⇒ true, ...
Determine if the type is binary, that is, not a descendant of
text/plain. Not to be confused with the class method binary?,
which concerns arbitrary input.
# File 'lib/mimemagic.rb', line 215
def binary?
  not lineage.include? 'text/plain'
end
#canonical ⇒ MimeMagic?
Return the canonical type. Returns nil if the type is unknown to
the registry.
# File 'lib/mimemagic.rb', line 124
def canonical
  # special case for application/octet-stream, which is apparently
  # not in there
  return self if type.downcase == 'application/octet-stream'
  t = TYPES[type.downcase] or return
  return self if type == t[3]
  self.class.new t[3]
end
#child_of?(parent, recurse: true) ⇒ true, false
Returns true if type is child of parent type. Behaves the same as
descendant_of? if recurse is true, which is the default.
# File 'lib/mimemagic.rb', line 178
def child_of?(parent, recurse: true)
  return descendant_of? parent if recurse
  return unless c = canonical
  c.parents.include? self.class[parent].canonical
end
#comment ⇒ nil, String
Get MIME comment.
# File 'lib/mimemagic.rb', line 115
def comment
  TYPES.fetch(type, [nil, nil, nil])[2].to_s.dup
end
#descendant_of?(ancestor) ⇒ true, false
Returns true if the ancestor type is anywhere in the subject
type's lineage. Always returns false if either self or
ancestor are unknown to the type registry.
# File 'lib/mimemagic.rb', line 160
def descendant_of? ancestor
  # always false if we don't know what this is
  c = canonical || self
  return false unless ancestor = self.class[ancestor]
  ancestor = ancestor.canonical || ancestor
  # ancestor canonical could be nil which will be false
  c.lineage.include? ancestor
end
#eql?(other) ⇒ false, true Also known as: ==
Compare the equality of the type with another (or plain string).
# File 'lib/mimemagic.rb', line 225
def eql?(other)
  return false unless other

  # coerce the rhs
  other = self.class[other] or return false

  # check for an exact match
  return true if type == other.type

  # now canonicalize both sides and check
  lhs = canonical
  rhs = other.canonical
  lhs && rhs && lhs.type == rhs.type
end
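Equality is decided in two stages: an exact string match first, then a comparison of the canonical forms, which lets an alias equal its canonical type. The sketch below works on plain strings; `CANONICAL_OF` is a hypothetical alias table with made-up entries and `same_type?` is a hypothetical helper.

```ruby
# Hypothetical alias table (illustrative entries only): every known type or
# alias maps to its canonical name.
CANONICAL_OF = {
  'application/xml' => 'application/xml',
  'text/xml' => 'application/xml'
}.freeze

# Hypothetical helper (not part of MimeMagic) following the two-stage check.
def same_type?(left, right)
  # Stage 1: identical strings are equal even if unknown to the table.
  return true if left == right
  # Stage 2: both sides must be known and share a canonical type.
  lhs = CANONICAL_OF[left]
  rhs = CANONICAL_OF[right]
  !lhs.nil? && !rhs.nil? && lhs == rhs
end

puts same_type?('text/xml', 'application/xml')
puts same_type?('x-foo/bar', 'x-foo/bar')
puts same_type?('x-foo/bar', 'x-foo/baz')
```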
#extensions ⇒ Array<String>
Get string list of file extensions.
# File 'lib/mimemagic.rb', line 107
def extensions
  TYPES.fetch(type, [[]]).first.map { |e| e.to_s.dup }
end
#hash ⇒ Integer
Return the object's (the underlying type string) hash.
# File 'lib/mimemagic.rb', line 246
def hash
  type.hash
end
#image? ⇒ Boolean
Determine if the type is an image.
# File 'lib/mimemagic.rb', line 95
def image?; mediatype == 'image'; end
#inspect ⇒ String
Return a diagnostic representation of the object.
# File 'lib/mimemagic.rb', line 262
def inspect
  out = @type
  out = [out, @params.map { |x| x.join ?= }].join ?; if
    @params and !@params.empty?
  %q[<%s "%s">] % [self.class, out]
end
#lineage ⇒ Array<MimeMagic> Also known as: ancestor_types
Fetches the entire inheritance hierarchy for the given MIME type.
# File 'lib/mimemagic.rb', line 203
def lineage
  ([canonical || self] + parents.map { |t| t.lineage }.flatten).uniq
end
#parents ⇒ Array<MimeMagic>
Fetches the immediate parent types.
# File 'lib/mimemagic.rb', line 188
def parents
  out = TYPES.fetch(type.to_s.downcase, [nil, []])[1].map do |x|
    self.class.new x
  end
  # add this unless we're it
  out << self.class.new('application/octet-stream') if
    out.empty? and type.downcase != 'application/octet-stream'

  out.uniq
end
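Taken together, #parents and #lineage walk the type hierarchy recursively, with application/octet-stream as the implicit root of every type that declares no parent. The sketch below models this on plain strings; `PARENT_TABLE` is a hypothetical table with illustrative entries, and `parents_of`, `lineage_of` and `binary_type?` are hypothetical helper names.

```ruby
# Hypothetical parent table (illustrative entries only).
PARENT_TABLE = {
  'image/svg+xml' => ['application/xml'],
  'application/xml' => ['text/plain'],
  'text/plain' => []
}.freeze
ROOT_TYPE = 'application/octet-stream'

# Immediate parents; a type with none falls back to the root, unless it is
# the root itself.
def parents_of(type)
  found = PARENT_TABLE.fetch(type, [])
  found = [ROOT_TYPE] if found.empty? && type != ROOT_TYPE
  found
end

# The type itself followed by all of its ancestors, without duplicates.
def lineage_of(type)
  ([type] + parents_of(type).flat_map { |parent| lineage_of(parent) }).uniq
end

# A type is binary when text/plain is nowhere in its lineage.
def binary_type?(type)
  !lineage_of(type).include?('text/plain')
end

puts lineage_of('image/svg+xml').inspect
puts binary_type?('image/svg+xml')
puts binary_type?('image/png')
```

This also shows why a type such as SVG is reported as text rather than binary: its lineage passes through text/plain before reaching the root.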
#text? ⇒ Boolean
Returns true if type is a text format.
# File 'lib/mimemagic.rb', line 92
def text?; mediatype == 'text' || descendant_of?('text/plain'); end
#to_s ⇒ String
Return the type as a string.
# File 'lib/mimemagic.rb', line 254
def to_s
  type
end
#video? ⇒ Boolean
Determine if the type is video.
# File 'lib/mimemagic.rb', line 101
def video?; mediatype == 'video'; end