Class: Hyperion::Parser

Inherits:
Object
Defined in:
lib/hyperion/parser.rb

Overview

Pure-Ruby HTTP/1.1 parser. Phase 4 replaces this with a C extension wrapping llhttp; the interface (parse(buffer) -> [Request, end_offset] | raise ParseError | raise UnsupportedError) stays stable.
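The interface contract above has three outcomes, and a caller will usually want to branch on them. The following is a minimal sketch, not Hyperion's actual Connection code: the two error classes are local stand-ins (their real namespace and superclass are not shown on this page), `status_for` and `stub_parser` are hypothetical helpers, and the mapping to status codes 400 and 501 is an assumption based on common HTTP practice.

```ruby
# Local stand-ins so the sketch runs on its own; in Hyperion these classes
# are defined elsewhere and may live in a different namespace.
class ParseError < StandardError; end
class UnsupportedError < StandardError; end

# Hypothetical helper: maps the outcome of parser.parse(buffer) to an HTTP
# status code. `parser` is any object that honors the documented interface.
def status_for(parser, buffer)
  _request, _end_offset = parser.parse(buffer)
  200
rescue ParseError
  400 # malformed request
rescue UnsupportedError
  501 # e.g. a transfer coding the parser does not implement
end

# Hypothetical stub parser that either succeeds or raises the given error.
# It exists only to exercise status_for without the real parser.
def stub_parser(error = nil)
  parser = Object.new
  parser.define_singleton_method(:parse) do |buffer|
    raise error, 'stubbed failure' if error

    [:request, buffer.bytesize]
  end
  parser
end

puts status_for(stub_parser, "GET / HTTP/1.1\r\n\r\n")
puts status_for(stub_parser(ParseError), 'garbage')
puts status_for(stub_parser(UnsupportedError), 'unsupported coding')
```

Because the interface is meant to stay stable across the planned llhttp-based replacement, code written against these three outcomes should not need to change when the implementation does.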

Constant Summary

REQUEST_LINE_RE =
%r{\A([A-Z]+) ([^ ?]+)(?:\?([^ ]*))? (HTTP/\d\.\d)\r\n}
HEADER_RE =
/\G([!-9;-~]+):[ \t]*(.*?)[ \t]*\r\n/
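To see what the two patterns capture, the sketch below applies them to a hand-written sample request. The regular expressions are copied from the constants above; `split_request_line`, `next_header`, and the sample buffer are hypothetical names introduced only for illustration. The buffer is forced to binary encoding so that character offsets and byte offsets coincide.

```ruby
# Copied verbatim from the constants listed above.
REQUEST_LINE_RE = %r{\A([A-Z]+) ([^ ?]+)(?:\?([^ ]*))? (HTTP/\d\.\d)\r\n}
HEADER_RE = /\G([!-9;-~]+):[ \t]*(.*?)[ \t]*\r\n/

# Hypothetical helper: returns [[method, path, query, version], offset]
# or nil when the request line does not match.
def split_request_line(buffer)
  m = REQUEST_LINE_RE.match(buffer)
  m && [m.captures, m.end(0)]
end

# Hypothetical helper: returns [name, value, next_offset] for the header
# line starting exactly at `offset`, or nil when none matches there.
# \G anchors the match at the start position passed to #match.
def next_header(buffer, offset)
  h = HEADER_RE.match(buffer, offset)
  h && [h[1], h[2], h.end(0)]
end

buffer = "GET /search?q=ruby HTTP/1.1\r\nHost:  example.com \r\n\r\n".b

fields, offset = split_request_line(buffer)
p fields                        # method, path, query string, HTTP version
name, value, offset = next_header(buffer, offset)
p [name, value]                 # whitespace around the value is not captured
p buffer.byteslice(offset, 2)   # the blank line that ends the header block
```

Note that the header-name class `[!-9;-~]` covers printable ASCII except the colon, so a name containing a space fails to match.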

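Instance Method Summary

#parse(buffer) ⇒ Object
Returns [Request, end_offset] where end_offset is the byte index just AFTER the last byte consumed by parsing.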

Instance Method Details

#parse(buffer) ⇒ Object

Returns [Request, end_offset] where end_offset is the byte index just AFTER the last byte consumed by parsing. The caller (Connection) uses end_offset to compute carry-over for pipelining.
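The carry-over computation mentioned above can be sketched as follows. This is an assumption about how a caller might use end_offset, not the actual Connection code: `carry_over` and `pipelined_demo` are hypothetical helpers, and the sample requests are invented for the demonstration.

```ruby
# Hypothetical helper: returns the bytes that follow the request just parsed.
# These bytes are the start of the next pipelined request, if any.
def carry_over(buffer, end_offset)
  buffer.byteslice(end_offset, buffer.bytesize - end_offset) || ''.b
end

# Builds a buffer holding two pipelined requests and returns what is left
# after the first one has been consumed.
def pipelined_demo
  first  = "GET /a HTTP/1.1\r\nHost: x\r\n\r\n".b
  second = "GET /b HTTP/1.1\r\nHost: x\r\n\r\n".b
  buffer = first + second

  # Stand-in for the end_offset that parse(buffer) reports. The first request
  # has no body, so parsing stops right after its blank line.
  end_offset = first.bytesize

  carry_over(buffer, end_offset)
end

p pipelined_demo
```

Slicing by bytes rather than characters matters here because end_offset is documented as a byte index.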

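Raises:

  • (ParseError) if the request line or a header line is malformed, if both Content-Length and Transfer-Encoding are present, if a chunked body is truncated, or if the body length does not match Content-Length
  • (UnsupportedError) if the last Transfer-Encoding value is not chunked
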
# File 'lib/hyperion/parser.rb', line 15

def parse(buffer)
  m = REQUEST_LINE_RE.match(buffer)
  raise ParseError, 'invalid request line' unless m

  method, path, query, version = m.captures
  offset = m.end(0)

  headers = {}
  loop do
    if buffer.byteslice(offset, 2) == "\r\n"
      offset += 2
      break
    end
    h = HEADER_RE.match(buffer, offset)
    raise ParseError, 'invalid header line' unless h && h.begin(0) == offset

    headers[h[1].downcase] = h[2]
    offset = h.end(0)
  end

  headers_end = offset

  has_content_length     = headers.key?('content-length')
  has_transfer_encoding  = headers.key?('transfer-encoding')

  # RFC 9112 §6.1: a sender MUST NOT send a message containing both
  # Content-Length and Transfer-Encoding. Refuse rather than risk
  # request smuggling.
  if has_content_length && has_transfer_encoding
    raise ParseError, 'both Content-Length and Transfer-Encoding present'
  end

  if has_transfer_encoding
    encodings = headers['transfer-encoding'].split(',').map { |e| e.strip.downcase }
    unless encodings.last == 'chunked'
      raise UnsupportedError,
            "Transfer-Encoding #{headers['transfer-encoding'].inspect} not supported"
    end

    result = dechunk(buffer, headers_end)
    raise ParseError, 'truncated chunked body' if result.nil?

    body, end_offset = result
    request = Request.new(
      method: method,
      path: path,
      query_string: query || '',
      http_version: version,
      headers: headers,
      body: body
    )
    return [request, end_offset]
  end

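  # Accept only a plain non-negative decimal Content-Length. String#to_i alone
  # would silently turn "abc" into 0 and "5abc" into 5.
  raw_length = headers['content-length']
  raise ParseError, 'invalid Content-Length' if raw_length && !raw_length.match?(/\A\d+\z/)

  content_length = raw_length.to_i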
  body = buffer.byteslice(headers_end, content_length) || ''
  raise ParseError, "content-length mismatch (declared #{content_length}, got #{body.bytesize})" \
    if body.bytesize != content_length

  end_offset = headers_end + content_length
  request = Request.new(
    method: method,
    path: path,
    query_string: query || '',
    http_version: version,
    headers: headers,
    body: body
  )
  [request, end_offset]
end
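
The listing calls a dechunk helper whose source is not shown on this page. From the call site, it takes the buffer and the offset where the headers end, and returns [body, end_offset] or nil when the chunked body is incomplete. The sketch below is one way such a helper might look, written from that contract alone. `dechunk_sketch` is a hypothetical name, and its handling of malformed input (returning nil, the same as for truncation) is an assumption, not the library's behavior.

```ruby
# Hypothetical sketch of a chunked-body decoder. Returns [body, end_offset],
# or nil when the body is truncated or a chunk is malformed.
def dechunk_sketch(buffer, offset)
  buffer = buffer.b # binary copy, so String#index works in byte offsets
  body = String.new(encoding: Encoding::BINARY)

  loop do
    line_end = buffer.index("\r\n", offset)
    return nil unless line_end

    # A chunk-size line is hex digits, optionally followed by ";extensions".
    size_field = buffer.byteslice(offset, line_end - offset).split(';', 2).first.to_s.strip
    return nil unless size_field.match?(/\A\h+\z/)

    size = size_field.hex
    offset = line_end + 2

    if size.zero?
      # Last chunk: skip optional trailer lines up to the closing blank line.
      loop do
        trailer_end = buffer.index("\r\n", offset)
        return nil unless trailer_end

        blank = trailer_end == offset
        offset = trailer_end + 2
        return [body, offset] if blank
      end
    end

    chunk = buffer.byteslice(offset, size)
    return nil if chunk.nil? || chunk.bytesize < size
    return nil unless buffer.byteslice(offset + size, 2) == "\r\n"

    body << chunk
    offset += size + 2
  end
end

data = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
body, end_offset = dechunk_sketch(data, 0)
p body
p end_offset == data.bytesize
```

As with the Content-Length path, the returned end_offset points just past the final CRLF, so any bytes after it belong to the next pipelined request.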