Class: Firecrawl::Models::ParseFile

Inherits:
Object
Defined in:
lib/firecrawl/models/parse_file.rb

Overview

Binary upload payload for the `/v2/parse` endpoint.

Supported file extensions: .html, .htm, .pdf, .docx, .doc, .odt, .rtf, .xlsx, .xls


Constructor Details

#initialize(filename:, content:, content_type: nil) ⇒ ParseFile

Build a ParseFile directly.

Parameters:

  • filename (String)

    filename for the upload (e.g., “document.pdf”)

  • content (String)

    raw bytes for the file

  • content_type (String, nil) (defaults to: nil)

    optional MIME type hint

Raises:

  • (ArgumentError)

    if filename is nil or blank, or if content is nil or empty


# File 'lib/firecrawl/models/parse_file.rb', line 16

def initialize(filename:, content:, content_type: nil)
  raise ArgumentError, "filename is required" if filename.nil? || filename.to_s.strip.empty?
  raise ArgumentError, "content is required" if content.nil? || content.bytesize.zero?

  @filename = filename.to_s.strip
  @content = content.to_s
  @content_type = content_type
end
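
The validation above can be exercised with a small sketch. `ParseFileSketch` is a hypothetical stand-in that copies the constructor from the listing so the example runs without the gem installed; with the gem loaded, the same calls would presumably go to `Firecrawl::Models::ParseFile`.

```ruby
# Hypothetical stand-in mirroring the constructor listed above.
# It is not the library class; it only reproduces the listed validation.
class ParseFileSketch
  attr_reader :filename, :content, :content_type

  def initialize(filename:, content:, content_type: nil)
    raise ArgumentError, "filename is required" if filename.nil? || filename.to_s.strip.empty?
    raise ArgumentError, "content is required" if content.nil? || content.bytesize.zero?

    @filename = filename.to_s.strip
    @content = content.to_s
    @content_type = content_type
  end
end

html = "<html><body><h1>Hello</h1></body></html>"
file = ParseFileSketch.new(filename: "  page.html ", content: html, content_type: "text/html")

puts file.filename        # surrounding whitespace has been stripped
puts file.content.bytesize

# Empty content is rejected before any upload is attempted.
begin
  ParseFileSketch.new(filename: "page.html", content: "")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

Note that the constructor never guesses a MIME type: if `content_type` is omitted here, it stays `nil`. Only `from_path` fills it in from the extension.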

Instance Attribute Details

#content ⇒ Object (readonly)

Returns the value of attribute content.



# File 'lib/firecrawl/models/parse_file.rb', line 9

def content
  @content
end

#content_type ⇒ Object (readonly)

Returns the value of attribute content_type.



# File 'lib/firecrawl/models/parse_file.rb', line 9

def content_type
  @content_type
end

#filename ⇒ Object (readonly)

Returns the value of attribute filename.



# File 'lib/firecrawl/models/parse_file.rb', line 9

def filename
  @filename
end

Class Method Details

.from_path(path, filename: nil, content_type: nil) ⇒ ParseFile

Build a ParseFile by reading a file from disk.

Parameters:

  • path (String)

    absolute or relative path to the file

  • filename (String, nil) (defaults to: nil)

    optional override for the upload filename

  • content_type (String, nil) (defaults to: nil)

    optional MIME type hint

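Returns:

  • (ParseFile)

    a ParseFile holding the file's bytes, the resolved filename, and the given or guessed content type

Raises:

  • (ArgumentError)

    if path is nil or blank, or does not point to an existing regular file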


# File 'lib/firecrawl/models/parse_file.rb', line 31

def self.from_path(path, filename: nil, content_type: nil)
  raise ArgumentError, "path is required" if path.nil? || path.to_s.strip.empty?
  unless File.file?(path)
    raise ArgumentError, "file path does not exist: #{path}"
  end

  content = File.binread(path)
  resolved_filename = filename || File.basename(path)
  resolved_content_type = content_type || guess_content_type(resolved_filename)
  new(filename: resolved_filename, content: content, content_type: resolved_content_type)
end
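
The resolution order in `from_path` can be sketched as follows. `read_upload` and `MIME_BY_EXT` are hypothetical names introduced for illustration: the helper follows the same steps as the listing but returns a plain Hash instead of a ParseFile, and the MIME map is trimmed to two entries, so the sketch runs without the gem.

```ruby
require "tmpdir"

# Trimmed, illustrative subset of the extension map (assumption: the real
# lookup lives in guess_content_type and covers more extensions).
MIME_BY_EXT = {
  ".pdf" => "application/pdf",
  ".html" => "text/html"
}.freeze

# Hypothetical helper that follows the same steps as from_path.
def read_upload(path, filename: nil, content_type: nil)
  raise ArgumentError, "path is required" if path.nil? || path.to_s.strip.empty?
  raise ArgumentError, "file path does not exist: #{path}" unless File.file?(path)

  resolved_filename = filename || File.basename(path)
  {
    filename: resolved_filename,
    content: File.binread(path), # raw bytes, no encoding conversion
    content_type: content_type || MIME_BY_EXT[File.extname(resolved_filename).downcase]
  }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "report.PDF")
  File.binwrite(path, "%PDF-1.4 sample bytes")

  default = read_upload(path)
  puts default[:filename]      # basename of the path
  puts default[:content_type]  # guessed from the extension, ignoring case

  # The guess uses the override filename, not the path on disk.
  renamed = read_upload(path, filename: "upload.bin")
  p renamed[:content_type]     # ".bin" has no entry, so nothing is guessed
end
```

Because the guess is based on the resolved filename, overriding `filename` with an unrecognized extension leaves the content type unset unless `content_type` is passed explicitly.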

.guess_content_type(filename) ⇒ Object

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/firecrawl/models/parse_file.rb', line 44

def self.guess_content_type(filename)
  ext = File.extname(filename).downcase
  {
    ".pdf" => "application/pdf",
    ".html" => "text/html",
    ".htm" => "text/html",
    ".xhtml" => "application/xhtml+xml",
    ".docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".doc" => "application/msword",
    ".odt" => "application/vnd.oasis.opendocument.text",
    ".rtf" => "application/rtf",
    ".xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".xls" => "application/vnd.ms-excel",
  }[ext]
end
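
A few edge cases of the extension lookup, as a minimal sketch. `CONTENT_TYPES` and `guess_type` are hypothetical stand-ins that copy the table from the listing above; since the real method is marked private API, this is meant to illustrate its behavior, not to suggest calling it directly.

```ruby
# Stand-in copy of the lookup table listed above.
CONTENT_TYPES = {
  ".pdf" => "application/pdf",
  ".html" => "text/html",
  ".htm" => "text/html",
  ".xhtml" => "application/xhtml+xml",
  ".docx" => "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  ".doc" => "application/msword",
  ".odt" => "application/vnd.oasis.opendocument.text",
  ".rtf" => "application/rtf",
  ".xlsx" => "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
  ".xls" => "application/vnd.ms-excel"
}.freeze

# Same lookup as the listing: last extension only, compared in lowercase.
def guess_type(filename)
  CONTENT_TYPES[File.extname(filename).downcase]
end

p guess_type("Report.DOCX")    # matched after downcasing the extension
p guess_type("README")         # no extension, so no match
p guess_type("page.html.bak")  # only the final extension (".bak") is considered
```

When the lookup finds nothing it returns `nil`, so a caller that needs a specific MIME type for an unusual filename should pass `content_type` explicitly.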