Class: Archaeo::WarcWriter

Inherits:
Object
Defined in:
lib/archaeo/warc_support.rb

Overview

Writes snapshots to WARC format files (.warc, .warc.gz).

Produces valid WARC 1.0 files with response and metadata records.

Constant Summary

WARC_VERSION = "WARC/1.0"
RECORD_SEP = "\r\n\r\n"
CRLF = "\r\n"

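For orientation, these constants line up with the WARC 1.0 record layout: a version line, named header fields each ending in CRLF, a blank line, the content block, and two CRLFs closing the record. The following is a minimal sketch of that layout, not Archaeo's implementation. The `WarcRecordSketch` module and its `format_record` helper are hypothetical names, and the constants are redefined locally so the snippet runs on its own.

```ruby
require "securerandom"
require "time"

# Hypothetical illustration module; none of this is part of Archaeo.
module WarcRecordSketch
  WARC_VERSION = "WARC/1.0"
  RECORD_SEP = "\r\n\r\n"
  CRLF = "\r\n"

  # Builds one WARC record as a String.
  # type:          value for the WARC-Type field, e.g. "warcinfo" or "response"
  # body:          the record's content block
  # extra_headers: additional named fields, e.g. Content-Type
  def self.format_record(type, body, extra_headers = {})
    headers = {
      "WARC-Type" => type,
      "WARC-Date" => Time.now.utc.iso8601,
      "WARC-Record-ID" => "<urn:uuid:#{SecureRandom.uuid}>",
      "Content-Length" => body.bytesize.to_s
    }.merge(extra_headers)

    header_block = headers.map { |name, value| "#{name}: #{value}#{CRLF}" }.join

    # Version line, header fields, blank line, content block, record terminator.
    "#{WARC_VERSION}#{CRLF}#{header_block}#{CRLF}#{body}#{RECORD_SEP}"
  end
end

record = WarcRecordSketch.format_record(
  "warcinfo",
  "software: archaeo\r\n",
  "Content-Type" => "application/warc-fields"
)
puts record
```

Content-Length counts bytes rather than characters, which is why the sketch uses `bytesize`.
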
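Instance Method Summary

  • #initialize(software: "archaeo/#{VERSION}") ⇒ WarcWriter (constructor): Returns a new instance of WarcWriter.
  • #write(path, pages, compress: nil) ⇒ Object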

Constructor Details

#initialize(software: "archaeo/#{VERSION}") ⇒ WarcWriter

Returns a new instance of WarcWriter.



# File 'lib/archaeo/warc_support.rb', line 118

def initialize(software: "archaeo/#{VERSION}")
  @software = software
  @record_count = 0
end

Instance Method Details

#write(path, pages, compress: nil) ⇒ Object



# File 'lib/archaeo/warc_support.rb', line 123

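def write(path, pages, compress: nil)
  # When compress is not given, infer it from a ".gz" extension.
  compress = path.end_with?(".gz") if compress.nil?
  io = open_warc(path, compress)
  # Emit the leading warcinfo record, then one write_page call per page.
  write_warcinfo(io, path)
  pages.each { |page| write_page(io, page) }
ensure
  # Close the file even if a page fails; io is nil if open_warc raised.
  io&.close
end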
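
The `compress:` default and the `ensure` cleanup in `#write` can be tried in isolation. The sketch below assumes a hypothetical `open_io` helper standing in for the private `open_warc`, whose source is not shown on this page. It wraps the whole file in a single `Zlib::GzipWriter`, which may differ from how Archaeo frames compressed records. The `WarcOpenSketch` module and `write_chunks` are illustrative names only.

```ruby
require "zlib"
require "tmpdir"

# Hypothetical illustration module; none of this is part of Archaeo.
module WarcOpenSketch
  # Assumed stand-in for open_warc: a plain binary file, optionally gzip-wrapped.
  def self.open_io(path, compress)
    file = File.open(path, "wb")
    compress ? Zlib::GzipWriter.new(file) : file
  end

  def self.write_chunks(path, chunks, compress: nil)
    # Same inference as #write: guess from the extension only when the
    # caller passed neither true nor false.
    compress = path.end_with?(".gz") if compress.nil?
    io = open_io(path, compress)
    chunks.each { |chunk| io.write(chunk) }
  ensure
    # Runs on success and on error; GzipWriter#close also closes the file.
    io&.close
  end
end

Dir.mktmpdir do |dir|
  plain = File.join(dir, "demo.warc")
  gzipped = File.join(dir, "demo.warc.gz")
  forced = File.join(dir, "forced.warc")

  WarcOpenSketch.write_chunks(plain, ["WARC/1.0\r\n"])
  WarcOpenSketch.write_chunks(gzipped, ["WARC/1.0\r\n"])
  WarcOpenSketch.write_chunks(forced, ["WARC/1.0\r\n"], compress: true)

  [plain, gzipped, forced].each do |path|
    # A gzip stream starts with the magic bytes 0x1f 0x8b.
    gzip = File.binread(path, 2).bytes == [0x1f, 0x8b]
    puts "#{File.basename(path)}: gzip=#{gzip}"
  end
end
```

Passing `compress: true` or `compress: false` explicitly overrides the extension check, so a `.warc` path can still be written compressed and a `.gz` path uncompressed.
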