Module: Philiprehberger::GzipKit
- Defined in:
  - lib/philiprehberger/gzip_kit.rb
  - lib/philiprehberger/gzip_kit/version.rb
Defined Under Namespace
Classes: Error
Constant Summary
- CHUNK_SIZE = 64 * 1024
- GZIP_MAGIC = [0x1f, 0x8b].freeze
- VERSION = '0.3.0'
Class Method Summary
- .compress(string, level: Zlib::DEFAULT_COMPRESSION, stats: false) ⇒ String, Hash
  Compress a string to gzip bytes.
- .compress_file(src, dest, level: Zlib::DEFAULT_COMPRESSION) {|bytes_processed, total_bytes| ... } ⇒ void
  Compress a file to a gzip file.
- .compress_stream(io_in, io_out, level: Zlib::DEFAULT_COMPRESSION) ⇒ void
  Streaming compression from one IO to another, reading in 64KB chunks.
- .compressed?(data) ⇒ Boolean
  Check if data is gzip-compressed by inspecting magic bytes.
- .concat(data_a, data_b) ⇒ String
  Concatenate two gzip-compressed strings.
- .decompress(data) ⇒ String
  Decompress gzip bytes to a string.
- .decompress_file(src, dest) {|bytes_processed, total_bytes| ... } ⇒ void
  Decompress a gzip file to a regular file.
- .decompress_stream(io_in, io_out) ⇒ void
  Streaming decompression from one IO to another, reading in 64KB chunks.
- .equivalent?(blob_a, blob_b) ⇒ Boolean
  Check whether two gzip-compressed blobs decompress to equal byte strings.
- .inspect_header(data) ⇒ Hash?
  Inspect the gzip header without decompressing.
Class Method Details
.compress(string, level: Zlib::DEFAULT_COMPRESSION, stats: false) ⇒ String, Hash
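Compress a string to gzip bytes. Returns the compressed String, or, when stats is true, a Hash with the keys :data, :ratio, :original_size, and :compressed_size.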
# File 'lib/philiprehberger/gzip_kit.rb', line 20
def self.compress(string, level: Zlib::DEFAULT_COMPRESSION, stats: false)
  io_out = StringIO.new
  io_out.binmode
  gz = Zlib::GzipWriter.new(io_out, level)
  gz.write(string)
  gz.close
  compressed = io_out.string

  if stats
    original_size = string.bytesize
    compressed_size = compressed.bytesize
    ratio = original_size.zero? ? 0.0 : 1.0 - (compressed_size.to_f / original_size)
    { data: compressed, ratio: ratio, original_size: original_size, compressed_size: compressed_size }
  else
    compressed
  end
end
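The statistics returned with stats: true can be reproduced with the standard library alone. The following is a minimal sketch, not the gem's own code: gzip_with_stats is a hypothetical helper name, and it assumes Zlib.gzip (Ruby 2.4 or later) in place of the GzipWriter used in the listing above.

```ruby
require 'zlib'

# Hypothetical helper: gzip a string and report the same statistics
# that the listing above computes when stats is true.
def gzip_with_stats(string, level = Zlib::DEFAULT_COMPRESSION)
  compressed = Zlib.gzip(string, level: level)
  original_size = string.bytesize
  compressed_size = compressed.bytesize
  # ratio is the fraction of bytes saved; empty input is defined as 0.0
  ratio = original_size.zero? ? 0.0 : 1.0 - (compressed_size.to_f / original_size)
  { data: compressed, ratio: ratio, original_size: original_size, compressed_size: compressed_size }
end

result = gzip_with_stats('a' * 10_000)
puts format('saved %.1f%% (%d -> %d bytes)',
            result[:ratio] * 100, result[:original_size], result[:compressed_size])
```

Because a gzip stream carries at least 18 bytes of header and trailer, the ratio can be negative for very short inputs.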
.compress_file(src, dest, level: Zlib::DEFAULT_COMPRESSION) {|bytes_processed, total_bytes| ... } ⇒ void
This method returns an undefined value.
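Compress a file to a gzip file. If a block is given, it is called after each chunk with the number of source bytes read so far and the size of the source file.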
# File 'lib/philiprehberger/gzip_kit.rb', line 88
def self.compress_file(src, dest, level: Zlib::DEFAULT_COMPRESSION, &block)
  File.open(src, 'rb') do |io_in|
    File.open(dest, 'wb') do |io_out|
      if block
        total_bytes = File.size(src)
        bytes_processed = 0
        gz = Zlib::GzipWriter.new(io_out, level)
        while (chunk = io_in.read(CHUNK_SIZE))
          gz.write(chunk)
          bytes_processed += chunk.bytesize
          block.call(bytes_processed, total_bytes)
        end
        gz.finish
      else
        compress_stream(io_in, io_out, level: level)
      end
    end
  end
end
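The progress callback is easiest to see in a small run. This sketch re-creates the block branch of the listing above with the standard library only; gzip_file_with_progress, the chunk_size parameter, and the temporary file names are assumptions made for illustration.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper that mirrors the block branch of the listing above:
# it yields (bytes_processed, total_bytes) after each chunk is written.
def gzip_file_with_progress(src, dest, chunk_size = 64 * 1024)
  total = File.size(src)
  done = 0
  File.open(src, 'rb') do |io_in|
    File.open(dest, 'wb') do |io_out|
      gz = Zlib::GzipWriter.new(io_out)
      while (chunk = io_in.read(chunk_size))
        gz.write(chunk)
        done += chunk.bytesize
        yield done, total
      end
      gz.finish # writes the gzip trailer; File.open closes io_out afterwards
    end
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'input.txt')
  File.binwrite(src, 'x' * 200_000)
  gzip_file_with_progress(src, "#{src}.gz") do |done, total|
    puts "#{(100.0 * done / total).round}% of #{total} bytes"
  end
end
```

For an empty source file the loop body never runs, so the block is not called at all.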
.compress_stream(io_in, io_out, level: Zlib::DEFAULT_COMPRESSION) ⇒ void
This method returns an undefined value.
Streaming compression from one IO to another, reading in 64KB chunks.
# File 'lib/philiprehberger/gzip_kit.rb', line 200
def self.compress_stream(io_in, io_out, level: Zlib::DEFAULT_COMPRESSION)
  gz = Zlib::GzipWriter.new(io_out, level)
  while (chunk = io_in.read(CHUNK_SIZE))
    gz.write(chunk)
  end
  gz.finish
end
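One detail of the listing above is that it ends with finish rather than close, so the output IO stays open for the caller. A minimal sketch of that behavior with in-memory IO objects follows; gzip_stream and its positional parameters are hypothetical names, not the gem's API.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper following the listing above: read fixed-size chunks
# from io_in and write them through a GzipWriter wrapped around io_out.
def gzip_stream(io_in, io_out, level = Zlib::DEFAULT_COMPRESSION, chunk_size = 64 * 1024)
  gz = Zlib::GzipWriter.new(io_out, level)
  while (chunk = io_in.read(chunk_size))
    gz.write(chunk)
  end
  gz.finish # flushes and writes the trailer without closing io_out
end

source = StringIO.new("log line\n" * 50_000)
sink = StringIO.new
sink.binmode
gzip_stream(source, sink)

puts sink.closed? # false
puts Zlib.gunzip(sink.string).bytesize == source.string.bytesize
```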
.compressed?(data) ⇒ Boolean
Check if data is gzip-compressed by inspecting magic bytes.
# File 'lib/philiprehberger/gzip_kit.rb', line 72
def self.compressed?(data)
  return false if data.nil? || data.bytesize < 2

  bytes = data.bytes
  bytes[0] == GZIP_MAGIC[0] && bytes[1] == GZIP_MAGIC[1]
end
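A magic-byte check only looks at the first two bytes, so it cannot tell a real gzip stream from arbitrary data that happens to start with 0x1f 0x8b. The sketch below illustrates this with a hypothetical helper, looks_like_gzip?, which is a variant of the listing above that uses byteslice so that only two bytes are converted to integers.

```ruby
require 'zlib'

# Hypothetical helper: true when the first two bytes match the gzip
# magic number (0x1f, 0x8b), the same values as GZIP_MAGIC.
def looks_like_gzip?(data)
  return false if data.nil? || data.bytesize < 2

  data.byteslice(0, 2).bytes == [0x1f, 0x8b]
end

p looks_like_gzip?(Zlib.gzip('payload'))     # true
p looks_like_gzip?('plain text')             # false
p looks_like_gzip?("\x1F\x8Bnot really".b)   # true: only the magic bytes are checked
```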
.concat(data_a, data_b) ⇒ String
Concatenate two gzip-compressed strings.
Per the gzip specification (RFC 1952), concatenated gzip streams form a valid gzip file. Raises Error if either argument does not start with the gzip magic bytes.
# File 'lib/philiprehberger/gzip_kit.rb', line 143
def self.concat(data_a, data_b)
  raise Error, 'first argument is not valid gzip data' unless compressed?(data_a)
  raise Error, 'second argument is not valid gzip data' unless compressed?(data_b)

  result = String.new(data_a, encoding: Encoding::BINARY)
  result << data_b.b
  result
end
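Concatenation here is a plain byte append: nothing is recompressed, and the second stream keeps its own header. A small sketch with the standard library shows the resulting layout; join_gzip is a hypothetical stand-in for the method above and omits its validation.

```ruby
require 'zlib'

# Hypothetical helper: append two gzip streams as raw bytes, as the
# listing above does after its magic-byte checks.
def join_gzip(first, second)
  joined = String.new(first, encoding: Encoding::BINARY)
  joined << second.b
  joined
end

first = Zlib.gzip('hello ')
second = Zlib.gzip('world')
joined = join_gzip(first, second)

# The second stream starts exactly where the first ends, with its own magic bytes.
p joined.bytesize == first.bytesize + second.bytesize          # true
p joined.byteslice(first.bytesize, 2).bytes == [0x1f, 0x8b]    # true
```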
.decompress(data) ⇒ String
Decompress gzip bytes to a string. Concatenated gzip streams are decoded in sequence, and the result is tagged as UTF-8.
# File 'lib/philiprehberger/gzip_kit.rb', line 48
def self.decompress(data)
  io_in = StringIO.new(data)
  io_in.binmode
  result = String.new(encoding: Encoding::BINARY)

  # Handle concatenated gzip streams per gzip spec
  until io_in.eof?
    gz = Zlib::GzipReader.new(io_in)
    result << gz.read
    # GzipReader leaves io_in positioned after the stream
    unused = gz.unused
    gz.finish
    if unused
      io_in.pos -= unused.bytesize
    end
  end

  result.force_encoding(Encoding::UTF_8)
end
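The loop in the listing above relies on GzipReader#unused, which returns the bytes the reader consumed past the end of the current stream (or nil if there are none). Rewinding the underlying IO by that amount positions it at the start of the next stream. The following standalone sketch shows the same technique; gunzip_all is a hypothetical name, and unlike the method above it returns a binary string rather than forcing UTF-8.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: decode every gzip stream in data, using the same
# unused-bytes rewind as the listing above.
def gunzip_all(data)
  io = StringIO.new(data.b)
  out = String.new(encoding: Encoding::BINARY)
  until io.eof?
    gz = Zlib::GzipReader.new(io)
    out << gz.read.b       # keep the accumulator binary
    unused = gz.unused     # bytes read past the end of this stream, or nil
    gz.finish              # release the reader without closing io
    io.pos -= unused.bytesize if unused
  end
  out
end

two_streams = Zlib.gzip('hello ') + Zlib.gzip('world')
puts gunzip_all(two_streams) # prints "hello world"
```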
.decompress_file(src, dest) {|bytes_processed, total_bytes| ... } ⇒ void
This method returns an undefined value.
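Decompress a gzip file to a regular file. If a block is given, it is called after each chunk with the number of decompressed bytes written so far; total_bytes is always passed as nil.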
# File 'lib/philiprehberger/gzip_kit.rb', line 116
def self.decompress_file(src, dest, &block)
  File.open(src, 'rb') do |io_in|
    File.open(dest, 'wb') do |io_out|
      if block
        gz = Zlib::GzipReader.new(io_in)
        bytes_processed = 0
        while (chunk = gz.read(CHUNK_SIZE))
          io_out.write(chunk)
          bytes_processed += chunk.bytesize
          block.call(bytes_processed, nil)
        end
        gz.close
      else
        decompress_stream(io_in, io_out)
      end
    end
  end
end
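Since the listing above passes nil as the second block argument, a progress callback shared between compression and decompression has to tolerate a missing total. One possible shape is sketched below; progress_line is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical formatter for a progress block: shows a percentage when
# the total is known and a plain byte count when it is nil.
def progress_line(done, total)
  if total && total.positive?
    format('%d/%d bytes (%.0f%%)', done, total, 100.0 * done / total)
  else
    format('%d bytes', done)
  end
end

puts progress_line(65_536, 200_000) # total known, as in compress_file
puts progress_line(65_536, nil)     # total unknown, as in decompress_file
```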
.decompress_stream(io_in, io_out) ⇒ void
This method returns an undefined value.
Streaming decompression from one IO to another, reading in 64KB chunks.
# File 'lib/philiprehberger/gzip_kit.rb', line 213
def self.decompress_stream(io_in, io_out)
  gz = Zlib::GzipReader.new(io_in)
  while (chunk = gz.read(CHUNK_SIZE))
    io_out.write(chunk)
  end
ensure
  gz&.close
end
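Reading through GzipReader#read with a length keeps memory use bounded by the chunk size rather than by the payload. The sketch below repeats the pattern from the listing above with in-memory IO objects; gunzip_stream and chunk_size are hypothetical names. Note that closing a GzipReader also closes the IO it wraps.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: copy decompressed data from io_in to io_out in
# fixed-size chunks so the whole payload is never held in memory at once.
def gunzip_stream(io_in, io_out, chunk_size = 64 * 1024)
  gz = Zlib::GzipReader.new(io_in)
  while (chunk = gz.read(chunk_size))
    io_out.write(chunk)
  end
ensure
  gz&.close # closing the reader also closes io_in
end

source = StringIO.new(Zlib.gzip('z' * 150_000))
sink = StringIO.new
sink.binmode
gunzip_stream(source, sink)

puts sink.string.bytesize # 150000
puts source.closed?       # true
```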
.equivalent?(blob_a, blob_b) ⇒ Boolean
Check whether two gzip-compressed blobs decompress to equal byte strings.
Useful for comparing gzip outputs produced at different compression levels or with different metadata; only the decompressed payloads are compared. Raises Error if either argument lacks the gzip magic bytes or cannot be decompressed.
# File 'lib/philiprehberger/gzip_kit.rb', line 161
def self.equivalent?(blob_a, blob_b)
  raise Error, 'first argument is not valid gzip data' unless compressed?(blob_a)
  raise Error, 'second argument is not valid gzip data' unless compressed?(blob_b)

  decompress(blob_a).b == decompress(blob_b).b
rescue Zlib::GzipFile::Error => e
  raise Error, "failed to decompress gzip data: #{e.message}"
end
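Two gzip blobs can hold the same payload and still differ byte for byte, for example when they were written at different compression levels. The sketch below compares payloads instead of raw bytes; same_payload? is a hypothetical helper that assumes single-stream input and uses Zlib.gunzip rather than the gem's decompress.

```ruby
require 'zlib'

# Hypothetical helper: two gzip blobs are treated as equivalent when
# their decompressed payloads match byte for byte.
def same_payload?(blob_a, blob_b)
  Zlib.gunzip(blob_a).b == Zlib.gunzip(blob_b).b
end

text = 'the quick brown fox ' * 500
fast = Zlib.gzip(text, level: Zlib::BEST_SPEED)
small = Zlib.gzip(text, level: Zlib::BEST_COMPRESSION)

puts "raw bytes equal: #{fast == small}"
puts "payloads equal:  #{same_payload?(fast, small)}"
```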
.inspect_header(data) ⇒ Hash?
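Inspect the gzip header without decompressing. Returns a Hash with the keys :method, :mtime, :os, :original_name, and :comment, or nil if the data is not gzip or the header cannot be parsed.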
# File 'lib/philiprehberger/gzip_kit.rb', line 174
def self.inspect_header(data)
  return nil unless compressed?(data)

  io = StringIO.new(data)
  io.binmode
  gz = Zlib::GzipReader.new(io)

  {
    method: :deflate,
    mtime: gz.mtime,
    os: gz.os_code,
    original_name: gz.orig_name && gz.orig_name.empty? ? nil : gz.orig_name,
    comment: gz.comment && gz.comment.empty? ? nil : gz.comment
  }
rescue Zlib::GzipFile::Error
  nil
ensure
  gz&.close
end
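The header fields come from Zlib::GzipReader, which parses the header when the reader is constructed, before any payload is read. To see non-empty values, the sketch below first writes a blob with explicit metadata and then reads it back. Both helper names and the sample values are assumptions made for illustration.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: build a gzip blob with explicit header metadata.
# The fields must be set before the first write.
def gzip_with_metadata(payload, name:, comment:, mtime:)
  io = StringIO.new
  io.binmode
  gz = Zlib::GzipWriter.new(io)
  gz.orig_name = name
  gz.comment = comment
  gz.mtime = mtime # seconds since the epoch, as an Integer
  gz.write(payload)
  gz.close
  io.string
end

# Hypothetical helper: read the header fields without reading the payload.
def read_gzip_header(data)
  gz = Zlib::GzipReader.new(StringIO.new(data))
  { mtime: gz.mtime, os: gz.os_code, original_name: gz.orig_name, comment: gz.comment }
ensure
  gz&.close
end

blob = gzip_with_metadata('body', name: 'notes.txt', comment: 'demo', mtime: 1_700_000_000)
p read_gzip_header(blob)
```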