Module: SemanticLogger::Utils

Defined in:
lib/semantic_logger/utils.rb

Overview

Internal-use only utility functions for Semantic Logger. Not intended for public use.

Constant Summary

ENCODE_UTF8_OPTIONS =

Options used when transcoding a string to UTF-8. Invalid byte sequences and characters that cannot be represented in UTF-8 are dropped rather than substituted, matching the preference in issue #180.

{invalid: :replace, undef: :replace, replace: "".freeze}.freeze

Class Method Details

.camelize(term) ⇒ Object

Borrowed from Rails, for use when not running under Rails.



# File 'lib/semantic_logger/utils.rb', line 29

def self.camelize(term)
  string = term.to_s
  string = string.sub(/^[a-z\d]*/, &:capitalize)
  string.gsub!(%r{(?:_|(/))([a-z\d]*)}i) { "#{Regexp.last_match(1)}#{Regexp.last_match(2).capitalize}" }
  string.gsub!("/".freeze, "::".freeze)
  string
end
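
To illustrate the conversion, the sketch below copies the method body above into a hypothetical `CamelizeSketch` module, so that it runs without the gem loaded, and applies it to a few sample terms. The module name and the sample terms are assumptions made for this example only.

```ruby
# Hypothetical stand-alone copy of the listing above, for illustration only.
module CamelizeSketch
  def self.camelize(term)
    string = term.to_s
    # Capitalize the leading lower-case word.
    string = string.sub(/^[a-z\d]*/, &:capitalize)
    # Capitalize each word that follows "_" or "/", dropping the underscore.
    string.gsub!(%r{(?:_|(/))([a-z\d]*)}i) { "#{Regexp.last_match(1)}#{Regexp.last_match(2).capitalize}" }
    # Turn path separators into namespace separators.
    string.gsub!("/", "::")
    string
  end
end

puts CamelizeSketch.camelize(:file)                # File
puts CamelizeSketch.camelize(:elasticsearch_http)  # ElasticsearchHttp
puts CamelizeSketch.camelize("appender/new_relic") # Appender::NewRelic
```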

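.constantize_symbol(symbol, namespace = "SemanticLogger::Appender") ⇒ Object

Converts a symbol into the class of the same (camelized) name within the supplied namespace. Raises ArgumentError when no matching constant exists or when the constant is not a class.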



# File 'lib/semantic_logger/utils.rb', line 7

def self.constantize_symbol(symbol, namespace = "SemanticLogger::Appender")
  klass = "#{namespace}::#{camelize(symbol.to_s)}"
  constant =
    begin
      Object.const_get(klass)
    rescue NameError
      raise(ArgumentError,
            "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. Looking for: #{klass}")
    end

  # The resolved constant is instantiated by the caller, so ensure it is
  # actually a class within the expected namespace rather than some other
  # constant that happens to share the name.
  unless constant.is_a?(Class)
    raise(ArgumentError,
          "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. #{klass} is not a class.")
  end

  constant
end
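
The lookup and its two failure modes can be sketched without the gem as follows. `DemoAppenders` and `ConstantizeSketch` are hypothetical names standing in for the real namespace and module, the error messages are shortened, and the name is assumed to be camelized already to keep the example short.

```ruby
# Hypothetical namespace standing in for SemanticLogger::Appender.
module DemoAppenders
  class SplunkHttp; end # a class: acceptable
  module Helpers; end   # a constant, but not a class
end

# Simplified stand-alone sketch of the lookup in the listing above.
module ConstantizeSketch
  def self.constantize(name, namespace = "DemoAppenders")
    klass = "#{namespace}::#{name}"
    constant =
      begin
        Object.const_get(klass)
      rescue NameError
        raise ArgumentError, "Could not find #{klass}"
      end
    raise ArgumentError, "#{klass} is not a class" unless constant.is_a?(Class)

    constant
  end
end

p ConstantizeSketch.constantize(:SplunkHttp) # DemoAppenders::SplunkHttp

%i[Helpers Missing].each do |name|
  begin
    ConstantizeSketch.constantize(name)
  rescue ArgumentError => e
    puts e.message
  end
end
```

Both failures surface as ArgumentError, so a caller only has to handle one exception type for an unusable symbol.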

.encode_utf8(value) ⇒ Object

Returns a copy of the supplied value with every String converted to valid UTF-8.

Recurses through Hash and Array structures, cleansing both keys and values. Strings that are already valid UTF-8 are returned unchanged (the common case), so the fast path allocates nothing. Any other value (Symbol, Numeric, Time, nil, ...) is returned as-is.

Used by .to_json on the rare failing path, and directly by formatters that serialize per value or emit to a non-JSON sink (where a single .to_json rescue boundary cannot catch an intermediate failure).



# File 'lib/semantic_logger/utils.rb', line 122

def self.encode_utf8(value)
  case value
  when String
    encode_utf8_string(value)
  when Hash
    value.each_with_object({}) do |(key, val), hash|
      hash[encode_utf8(key)] = encode_utf8(val)
    end
  when Array
    value.map { |element| encode_utf8(element) }
  else
    value
  end
end
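
A minimal sketch of the recursion, assuming a simplified string cleanser that only scrubs strings already tagged as UTF-8 (the real method delegates to .encode_utf8_string). `EncodeSketch` and the sample payload are hypothetical.

```ruby
module EncodeSketch
  # Simplified stand-in for encode_utf8: strings are scrubbed, containers are
  # rebuilt, everything else passes through untouched.
  def self.cleanse(value)
    case value
    when String
      value.valid_encoding? ? value : value.scrub("")
    when Hash
      value.each_with_object({}) { |(key, val), hash| hash[cleanse(key)] = cleanse(val) }
    when Array
      value.map { |element| cleanse(element) }
    else
      value
    end
  end
end

payload = {
  "na\xFFme" => "J\xFFoe",    # invalid byte in both key and value
  tags:  ["ok", "b\xFFad"],   # nested array
  count: 3                    # non-string values are returned as-is
}
p EncodeSketch.cleanse(payload)

# A valid string comes back as the very same object, not a copy.
valid = "already valid"
puts EncodeSketch.cleanse(valid).equal?(valid) # true
```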

.encode_utf8_string(string) ⇒ Object

Returns the string converted to valid UTF-8, dropping any invalid bytes.



# File 'lib/semantic_logger/utils.rb', line 143

def self.encode_utf8_string(string)
  return string if string.encoding == Encoding::UTF_8 && string.valid_encoding?

  if string.encoding == Encoding::UTF_8
    # Correctly tagged as UTF-8 but contains invalid byte sequences.
    string.scrub("")
  else
    # Different encoding (e.g. ASCII-8BIT / Latin-1): transcode into UTF-8.
    string.encode(Encoding::UTF_8, **ENCODE_UTF8_OPTIONS)
  end
rescue EncodingError
  # Last resort for encodings without a converter to UTF-8: reinterpret the
  # raw bytes as UTF-8 and drop anything invalid. Logging must never raise.
  string.dup.force_encoding(Encoding::UTF_8).scrub("")
end
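
The branches above behave differently depending on how the input is tagged. The sketch below exercises the underlying String calls directly; `Utf8BranchSketch` and the sample strings are hypothetical, and `OPTIONS` simply repeats the value of ENCODE_UTF8_OPTIONS.

```ruby
module Utf8BranchSketch
  OPTIONS = {invalid: :replace, undef: :replace, replace: ""}.freeze

  # Tagged as UTF-8 but containing an invalid byte: scrub drops the byte.
  SCRUBBED = "abc\xFFdef".scrub("")

  # Latin-1 input: 0xE9 maps to U+00E9, so the character survives transcoding.
  LATIN1 = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
                     .encode(Encoding::UTF_8, **OPTIONS)

  # Binary (ASCII-8BIT) input: bytes >= 0x80 have no UTF-8 mapping and are dropped.
  BINARY = "caf\xE9".b.encode(Encoding::UTF_8, **OPTIONS)
end

p Utf8BranchSketch::SCRUBBED               # "abcdef"
puts Utf8BranchSketch::LATIN1 == "caf\u00E9" # true
p Utf8BranchSketch::BINARY                 # "caf"
puts Utf8BranchSketch::BINARY.encoding     # UTF-8
```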

.extract_backtrace(stack = caller) ⇒ Object

Returns the backtrace with the leading Semantic Logger entries stripped off. All other system and gem path entries are left in place. The supplied stack is modified in place.



# File 'lib/semantic_logger/utils.rb', line 51

def self.extract_backtrace(stack = caller)
  while (first = stack.first) && extract_path?(first)
    stack.shift
  end
  stack
end
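
As an illustration, the sketch below applies the same loop to a hand-built stack. `ExtractSketch`, the file paths, and the version numbers are all hypothetical.

```ruby
module ExtractSketch
  PATHS = %w[lib/semantic_logger lib/rails_semantic_logger].freeze

  # Drops leading entries whose path contains one of PATHS.
  def self.extract_backtrace(stack)
    while (first = stack.first) && PATHS.any? { |part| first.include?(part) }
      stack.shift
    end
    stack
  end
end

stack = [
  "/gems/semantic_logger-0.0.0/lib/semantic_logger/base.rb:100:in `info'",
  "/gems/rails_semantic_logger-0.0.0/lib/rails_semantic_logger/engine.rb:20:in `call'",
  "/gems/rack-0.0.0/lib/rack/runtime.rb:24:in `call'",
  "/app/models/user.rb:12:in `save'"
]

result = ExtractSketch.extract_backtrace(stack)
puts result.first          # the rack entry: other gems are left in place
puts result.equal?(stack)  # true: the array passed in was modified
```

Because the array is shifted in place, a caller that needs the original backtrace afterwards would have to pass a copy.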

.extract_path?(path) ⇒ Boolean

Whether this path belongs to Semantic Logger itself and should therefore be excluded from a cleansed backtrace.

Returns:

  • (Boolean)


# File 'lib/semantic_logger/utils.rb', line 63

def self.extract_path?(path)
  extract_paths.any? { |exclude| path.include?(exclude) }
end

.extract_paths ⇒ Object

Path fragments that identify Semantic Logger's own entries in a backtrace.



# File 'lib/semantic_logger/utils.rb', line 58

def self.extract_paths
  @extract_paths ||= %w[lib/semantic_logger lib/rails_semantic_logger]
end

.method_visibility(mod, method_name) ⇒ Object

Returns the visibility of the named instance method on mod as a Symbol, or nil when mod does not define the method.



# File 'lib/semantic_logger/utils.rb', line 38

def self.method_visibility(mod, method_name)
  method_name = method_name.to_sym
  if mod.instance_methods.include?(method_name)
    :public
  elsif mod.private_instance_methods.include?(method_name)
    :private
  elsif mod.protected_instance_methods.include?(method_name)
    :protected
  end
end
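
The sketch below is a variant of this lookup, not the library's code, run against a hypothetical `Sample` class. Ruby's `Module#instance_methods` returns protected as well as public method names, so this variant tests `protected_instance_methods` first in order to tell the two apart.

```ruby
module VisibilitySketch
  # Hypothetical class with one method of each visibility.
  class Sample
    def shown; end

    protected

    def guarded; end

    private

    def hidden; end
  end

  # Variant ordering: protected and private are checked before public.
  def self.visibility(mod, method_name)
    method_name = method_name.to_sym
    if mod.protected_instance_methods.include?(method_name)
      :protected
    elsif mod.private_instance_methods.include?(method_name)
      :private
    elsif mod.instance_methods.include?(method_name)
      :public
    end
  end
end

sample = VisibilitySketch::Sample
p VisibilitySketch.visibility(sample, :shown)    # :public
p VisibilitySketch.visibility(sample, "guarded") # :protected
p VisibilitySketch.visibility(sample, :hidden)   # :private
p VisibilitySketch.visibility(sample, :absent)   # nil
```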

.strip_backtrace(stack = caller) ⇒ Object

Strips leading entries from the supplied backtrace until the first application entry is at the top. All leading gem paths, built-in Ruby library paths, and Semantic Logger entries are removed. Once the first application entry is found, the remainder of the stack is returned as-is. The supplied stack is modified in place.



# File 'lib/semantic_logger/utils.rb', line 70

def self.strip_backtrace(stack = caller)
  while (first = stack.first) && (strip_path?(first) || extract_path?(first))
    stack.shift
  end
  stack
end
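
A minimal sketch of the effect, assuming a hand-built stack whose leading entries sit under the local gem directory and Ruby library directory. `StripSketch`, the file names, and the version numbers are hypothetical; the roots are computed the same way as in .strip_paths.

```ruby
require "rbconfig"

module StripSketch
  # Gem directories plus the Ruby standard library directory.
  ROOTS = ((Gem.path | [Gem.default_dir]) << RbConfig::CONFIG["rubylibdir"]).freeze
  LOGGER_PATHS = %w[lib/semantic_logger lib/rails_semantic_logger].freeze

  def self.strip_backtrace(stack)
    while (first = stack.first) &&
          (ROOTS.any? { |root| first.start_with?(root) } ||
           LOGGER_PATHS.any? { |part| first.include?(part) })
      stack.shift
    end
    stack
  end
end

stack = [
  "#{Gem.default_dir}/gems/rack-0.0.0/lib/rack/runtime.rb:1:in `call'",
  "#{RbConfig::CONFIG['rubylibdir']}/net/http.rb:1:in `request'",
  "/hypothetical_app/app/models/user.rb:12:in `save'",
  "#{Gem.default_dir}/gems/puma-0.0.0/lib/puma/server.rb:1:in `run'"
]

result = StripSketch.strip_backtrace(stack)
puts result.first # the application entry is now on top
puts result.size  # 2: the gem entry below the application entry is kept
```

Only leading entries are removed; gem and library frames that appear after the first application frame stay in the result.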

.strip_path?(path) ⇒ Boolean

Whether this path starts with a gem or built-in Ruby library path and should therefore be excluded from a stripped backtrace.

Returns:

  • (Boolean)


# File 'lib/semantic_logger/utils.rb', line 89

def self.strip_path?(path)
  strip_paths.any? { |exclude| path.start_with?(exclude) }
end

.strip_paths ⇒ Object

Paths to exclude from the stripped backtrace. Includes gem paths and built-in Ruby code paths.



# File 'lib/semantic_logger/utils.rb', line 79

def self.strip_paths
  @strip_paths ||=
    begin
      paths = Gem.path | [Gem.default_dir]
      paths << RbConfig::CONFIG["rubylibdir"]
      paths
    end
end

.to_json(value) ⇒ Object

Serializes the value to JSON, repairing invalid UTF-8 only when necessary.

Non UTF-8 data appears in well under 1% of log events, so it is wasteful to walk and reallocate the entire structure (see .encode_utf8) on every call. Instead this attempts .to_json directly and only falls back to cleansing when serialization fails because of an encoding problem.

The exception raised for non UTF-8 data depends on the json gem version: older versions raise Encoding::UndefinedConversionError (an EncodingError), newer versions wrap it as JSON::GeneratorError, so both are rescued. The retry is attempted only once: if it still fails (for example a JSON::GeneratorError caused by something other than encoding, such as NaN), the error propagates unchanged rather than being swallowed.



# File 'lib/semantic_logger/utils.rb', line 106

def self.to_json(value)
  value.to_json
rescue JSON::GeneratorError, EncodingError
  encode_utf8(value).to_json
end
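
The rescue-and-retry pattern can be sketched stand-alone as follows. `JsonSketch`, `safe_to_json`, and the simplified `cleanse` helper are hypothetical names for this example; the helper only scrubs strings tagged as UTF-8, which is narrower than .encode_utf8.

```ruby
require "json"

module JsonSketch
  # Simplified cleanser: drops invalid bytes from strings, recursing through
  # Hash and Array values.
  def self.cleanse(value)
    case value
    when String then value.scrub("")
    when Hash   then value.each_with_object({}) { |(k, v), h| h[cleanse(k)] = cleanse(v) }
    when Array  then value.map { |element| cleanse(element) }
    else value
    end
  end

  # Try the cheap path first; cleanse and retry once only if it fails.
  def self.safe_to_json(value)
    value.to_json
  rescue JSON::GeneratorError, EncodingError
    cleanse(value).to_json
  end
end

puts JsonSketch.safe_to_json({"message" => "ok"})          # {"message":"ok"}
puts JsonSketch.safe_to_json({"message" => "bad\xFFbyte"}) # {"message":"badbyte"}

# A failure unrelated to encoding is raised again by the retry and propagates.
begin
  JsonSketch.safe_to_json({"value" => Float::NAN})
rescue JSON::GeneratorError => e
  puts "propagated: #{e.class}"
end
```

Since the retry runs inside the rescue clause, a second failure is not rescued again, which gives the single-retry behaviour described above.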