Module: WebStruct::Http::ContentType

Defined in:
lib/webstruct/page/content_type.rb

Overview

Utilities for the HTTP Content-Type header (charset parameter and UTF-8 decoding).

Constant Summary

CHARSET_PARAM =

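Matches a charset= parameter in a Content-Type header value, case-insensitively. The value may be quoted or unquoted and is captured in the named group charset.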

/;\s*charset\s*=\s*(?:"(?<charset>[^"]+)"|(?<charset>[^;\s]+))/i
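
To illustrate the two branches of the pattern (quoted and unquoted values), here is a minimal sketch. It uses a local copy of the regular expression so it runs without the library; the sample header values are hypothetical.

```ruby
# Local copy of the documented pattern, so this snippet is self-contained.
CHARSET_PARAM = /;\s*charset\s*=\s*(?:"(?<charset>[^"]+)"|(?<charset>[^;\s]+))/i

# Hypothetical sample values, not taken from the library's own tests.
samples = [
  'text/html; charset=ISO-8859-1',          # unquoted value
  'text/html; Charset="utf-8"; boundary=x', # quoted value, mixed-case name
  'application/json'                        # no charset parameter
]

samples.each do |value|
  match = CHARSET_PARAM.match(value)
  # Both alternatives share the group name, so one lookup covers either branch.
  p(match && match[:charset])
end
# Prints "ISO-8859-1", then "utf-8", then nil.
```

Reusing the same group name in both alternatives means callers do not need to know which form the header used.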

Class Method Summary

Class Method Details

.charset(header) ⇒ String?

Returns the charset token from a raw Content-Type header, if present.

Parameters:

  • header (String, nil)

Returns:

  • (String, nil)

    the charset value as written in the header, or nil when header is nil or has no charset parameter


# File 'lib/webstruct/page/content_type.rb', line 16

def charset(header)
  return if header.nil?

  match = header.match(CHARSET_PARAM)
  match&.[](:charset)
end
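
A short usage sketch follows. Because the library itself may not be loaded, it mirrors the listed source inside a hypothetical stand-in module named ContentTypeSketch; in real use the call would go through the documented module instead.

```ruby
# Hypothetical stand-in that copies the documented source so the example runs alone.
module ContentTypeSketch
  CHARSET_PARAM = /;\s*charset\s*=\s*(?:"(?<charset>[^"]+)"|(?<charset>[^;\s]+))/i

  def self.charset(header)
    return if header.nil?

    match = header.match(CHARSET_PARAM)
    match&.[](:charset)
  end
end

p ContentTypeSketch.charset("text/html; charset=Shift_JIS") # => "Shift_JIS"
p ContentTypeSketch.charset("text/plain")                   # => nil
p ContentTypeSketch.charset(nil)                            # => nil
```

Note that the token is returned with its original casing; any case folding is left to the caller.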

.decode_to_utf8(raw, charset_label) ⇒ String

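Transcodes raw to UTF-8 when charset_label names a non-UTF-8 encoding. Returns raw.to_s unchanged when charset_label is nil or equals "utf-8" (compared case-insensitively).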

Parameters:

  • raw (String)
  • charset_label (String, nil)

Returns:

  • (String)

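Rescues:

  • (ArgumentError)

    when charset_label does not name a known encoding; the input string is returned unchanged

  • (Encoding::InvalidByteSequenceError)

    rescued as well, although encode is called with invalid: :replace, so invalid bytes are substituted with a replacement character rather than raising

  • (Encoding::UndefinedConversionError)

    rescued as well, although encode is called with undef: :replace, so characters with no UTF-8 mapping are substituted rather than raising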



# File 'lib/webstruct/page/content_type.rb', line 42

def decode_to_utf8(raw, charset_label)
  string = raw.to_s
  return string if charset_label.nil? || charset_label.to_s.casecmp("utf-8").zero?

  string.b.force_encoding(charset_label).encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
rescue ArgumentError, Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError
  string
end
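
The sketch below exercises the three paths visible in the listed source: a real transcode, the UTF-8 short circuit, and the fallback for an unknown label. ContentTypeSketch is a hypothetical stand-in that copies the source so the example runs without the library, and the charset label x-no-such-charset is made up for the demonstration.

```ruby
# Hypothetical stand-in that copies the documented source so the example runs alone.
module ContentTypeSketch
  def self.decode_to_utf8(raw, charset_label)
    string = raw.to_s
    return string if charset_label.nil? || charset_label.to_s.casecmp("utf-8").zero?

    string.b.force_encoding(charset_label).encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  rescue ArgumentError, Encoding::InvalidByteSequenceError, Encoding::UndefinedConversionError
    string
  end
end

latin1_bytes = "caf\xE9".b # "café" as ISO-8859-1 bytes

# Path 1: the label names a non-UTF-8 encoding, so the bytes are transcoded.
decoded = ContentTypeSketch.decode_to_utf8(latin1_bytes, "ISO-8859-1")
p decoded == "caf\u00E9"             # => true
p decoded.encoding == Encoding::UTF_8 # => true

# Path 2: a UTF-8 label (any casing) returns the input object untouched.
p ContentTypeSketch.decode_to_utf8(latin1_bytes, "UTF-8").equal?(latin1_bytes) # => true

# Path 3: an unknown label makes force_encoding raise ArgumentError,
# which is rescued, so the input comes back unchanged.
p ContentTypeSketch.decode_to_utf8(latin1_bytes, "x-no-such-charset").equal?(latin1_bytes) # => true
```

One consequence of the fallback is that the result is not guaranteed to be valid UTF-8; callers that need that guarantee may want to check valid_encoding? on the returned string.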

.normalize(content_type) ⇒ String

Returns the base media type and subtype in lowercase, without parameters (e.g. “text/html”). This is the canonical implementation; Mime.normalize delegates here.

Parameters:

  • content_type (String, nil)

    raw Content-Type header value

Returns:

  • (String)

    empty string when absent or blank



# File 'lib/webstruct/page/content_type.rb', line 28

def normalize(content_type)
  return "" if content_type.nil?

  content_type.split(";", 2).first.to_s.strip.downcase
end
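
As a quick illustration of the behavior described above, the following sketch runs the listed source against a few hypothetical inputs. ContentTypeSketch is a stand-in module used only so the example is self-contained.

```ruby
# Hypothetical stand-in that copies the documented source so the example runs alone.
module ContentTypeSketch
  def self.normalize(content_type)
    return "" if content_type.nil?

    content_type.split(";", 2).first.to_s.strip.downcase
  end
end

p ContentTypeSketch.normalize("Text/HTML; charset=UTF-8") # => "text/html"
p ContentTypeSketch.normalize("  application/JSON ")      # => "application/json"
p ContentTypeSketch.normalize("")                         # => ""
p ContentTypeSketch.normalize(nil)                        # => ""
```

The to_s call matters for the empty-string case: splitting "" yields an empty array, so first returns nil before it is converted back to "".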