Module: SemanticLogger::Utils
- Defined in:
- lib/semantic_logger/utils.rb
Overview
Internal-use only utility functions for Semantic Logger. Not intended for public use.
Constant Summary
- ENCODE_UTF8_OPTIONS =
Options used when transcoding a string to UTF-8. Invalid byte sequences and characters that cannot be represented in UTF-8 are dropped rather than substituted, matching the preference in issue #180.
{invalid: :replace, undef: :replace, replace: "".freeze}.freeze
Class Method Summary
- .camelize(term) ⇒ Object
Borrowed from Rails, for use when not running Rails.
- .constantize_symbol(symbol, namespace = "SemanticLogger::Appender") ⇒ Object
- .encode_utf8(value) ⇒ Object
Returns a copy of the supplied value with every String converted to valid UTF-8.
- .encode_utf8_string(string) ⇒ Object
Returns the string converted to valid UTF-8, dropping any invalid bytes.
- .extract_backtrace(stack = caller) ⇒ Object
Extract the backtrace stripping off the leading semantic logger entries.
- .extract_path?(path) ⇒ Boolean
Whether this path should be excluded from any cleansed backtrace.
- .extract_paths ⇒ Object
- .method_visibility(mod, method_name) ⇒ Object
Returns the visibility for an instance method.
- .strip_backtrace(stack = caller) ⇒ Object
Try to strip everything off of the supplied backtrace, until the first application stack entry is at the top.
- .strip_path?(path) ⇒ Boolean
Whether this path should be excluded from any cleansed backtrace.
- .strip_paths ⇒ Object
Paths to exclude in the stripped backtrace. Includes Gems and built-in Ruby code paths.
- .to_json(value) ⇒ Object
Serializes the value to JSON, repairing invalid UTF-8 only when necessary.
Class Method Details
.camelize(term) ⇒ Object
Borrowed from Rails, for use when not running Rails.
# File 'lib/semantic_logger/utils.rb', line 29

def self.camelize(term)
  string = term.to_s
  string = string.sub(/^[a-z\d]*/, &:capitalize)
  string.gsub!(%r{(?:_|(/))([a-z\d]*)}i) { "#{Regexp.last_match(1)}#{Regexp.last_match(2).capitalize}" }
  string.gsub!("/".freeze, "::".freeze)
  string
end
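To make the conversion concrete, here is a minimal sketch that copies the method body into a stand-alone module and applies it to a few sample terms. The module name `CamelizeSketch` and the sample inputs are assumptions made for illustration only, so that the example runs without loading the gem.

```ruby
# Hypothetical stand-alone copy of the method above, so that the example
# runs without loading the semantic_logger gem.
module CamelizeSketch
  def self.camelize(term)
    string = term.to_s
    # Capitalize the leading lowercase/digit run.
    string = string.sub(/^[a-z\d]*/, &:capitalize)
    # Capitalize each segment that follows an underscore or a slash,
    # dropping the underscore and keeping the slash.
    string.gsub!(%r{(?:_|(/))([a-z\d]*)}i) { "#{Regexp.last_match(1)}#{Regexp.last_match(2).capitalize}" }
    # Slashes become namespace separators.
    string.gsub!("/", "::")
    string
  end
end

# Sample inputs are illustrative, not taken from the library.
simple     = CamelizeSketch.camelize(:file)                 # "File"
underscore = CamelizeSketch.camelize("elasticsearch_http")  # "ElasticsearchHttp"
nested     = CamelizeSketch.camelize("new_relic/logs")      # "NewRelic::Logs"
```

Symbols are accepted because the term is converted with `to_s` first, and a slash in the input maps to a `::` namespace separator.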
.constantize_symbol(symbol, namespace = "SemanticLogger::Appender") ⇒ Object
# File 'lib/semantic_logger/utils.rb', line 7

def self.constantize_symbol(symbol, namespace = "SemanticLogger::Appender")
  klass = "#{namespace}::#{camelize(symbol.to_s)}"
  constant =
    begin
      Object.const_get(klass)
    rescue NameError
      raise(ArgumentError,
            "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. Looking for: #{klass}")
    end

  # The resolved constant is instantiated by the caller, so ensure it is
  # actually a class within the expected namespace rather than some other
  # constant that happens to share the name.
  unless constant.is_a?(Class)
    raise(ArgumentError,
          "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. #{klass} is not a class.")
  end

  constant
end
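The lookup and its two failure modes can be sketched as follows. This is only an illustration: `ConstantizeSketch` is a hypothetical stand-alone copy of the two methods involved, and `DemoAppenders` with its members is an invented namespace standing in for `SemanticLogger::Appender`.

```ruby
# Hypothetical stand-alone copy of camelize and constantize_symbol, so that
# the example runs without loading the semantic_logger gem.
module ConstantizeSketch
  def self.camelize(term)
    string = term.to_s
    string = string.sub(/^[a-z\d]*/, &:capitalize)
    string.gsub!(%r{(?:_|(/))([a-z\d]*)}i) { "#{Regexp.last_match(1)}#{Regexp.last_match(2).capitalize}" }
    string.gsub!("/", "::")
    string
  end

  def self.constantize_symbol(symbol, namespace = "SemanticLogger::Appender")
    klass = "#{namespace}::#{camelize(symbol.to_s)}"
    constant =
      begin
        Object.const_get(klass)
      rescue NameError
        raise(ArgumentError, "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. Looking for: #{klass}")
      end
    unless constant.is_a?(Class)
      raise(ArgumentError, "Could not convert symbol: #{symbol.inspect} to a class in: #{namespace}. #{klass} is not a class.")
    end
    constant
  end
end

# Invented namespace used only for this example.
module DemoAppenders
  class RollingFile; end
  module Helpers; end
end

# A symbol that maps to a class resolves to that class.
resolved = ConstantizeSketch.constantize_symbol(:rolling_file, "DemoAppenders")

# Helper that captures the ArgumentError message, or nil if nothing was raised.
capture = lambda do |symbol|
  begin
    ConstantizeSketch.constantize_symbol(symbol, "DemoAppenders")
    nil
  rescue ArgumentError => e
    e.message
  end
end

missing_message   = capture.call(:does_not_exist) # no such constant
not_class_message = capture.call(:helpers)        # constant exists but is a Module
```

Both failures surface as `ArgumentError`, so a caller only has to handle one exception type when a configured appender symbol is wrong.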
.encode_utf8(value) ⇒ Object
Returns a copy of the supplied value with every String converted to valid UTF-8.
Recurses through Hash and Array structures, cleansing both keys and values. Strings that are already valid UTF-8 are returned unchanged (the common case), so the String fast path allocates nothing; Hash and Array containers are always rebuilt. Any other value (Symbol, Numeric, Time, nil, ...) is returned as-is.
Used by .to_json on the rare failing path, and directly by formatters that
serialize per value or emit to a non-JSON sink (where a single .to_json
rescue boundary cannot catch an intermediate failure).
# File 'lib/semantic_logger/utils.rb', line 122

def self.encode_utf8(value)
  case value
  when String
    encode_utf8_string(value)
  when Hash
    value.each_with_object({}) do |(key, val), hash|
      hash[encode_utf8(key)] = encode_utf8(val)
    end
  when Array
    value.map { |element| encode_utf8(element) }
  else
    value
  end
end
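The recursion can be illustrated with a small nested structure. The sketch below copies the two methods and the options constant into a hypothetical module named `EncodeSketch` so that it runs on its own; the sample event is invented for illustration.

```ruby
# Hypothetical stand-alone copy of the encoding helpers, so that the example
# runs without loading the semantic_logger gem.
module EncodeSketch
  ENCODE_UTF8_OPTIONS = {invalid: :replace, undef: :replace, replace: ""}.freeze

  def self.encode_utf8(value)
    case value
    when String
      encode_utf8_string(value)
    when Hash
      value.each_with_object({}) do |(key, val), hash|
        hash[encode_utf8(key)] = encode_utf8(val)
      end
    when Array
      value.map { |element| encode_utf8(element) }
    else
      value
    end
  end

  def self.encode_utf8_string(string)
    return string if string.encoding == Encoding::UTF_8 && string.valid_encoding?

    if string.encoding == Encoding::UTF_8
      string.scrub("")
    else
      string.encode(Encoding::UTF_8, **ENCODE_UTF8_OPTIONS)
    end
  rescue EncodingError
    string.dup.force_encoding(Encoding::UTF_8).scrub("")
  end
end

# Invented event with invalid bytes in a key, in an array element, and in a
# nested binary (ASCII-8BIT) string.
event = {
  "user\xFF" => ["caf\xC3", :status, 42, nil],
  payload: {"raw" => "abc\xFF".b}
}
cleaned = EncodeSketch.encode_utf8(event)

# A string that is already valid UTF-8 comes back as the same object.
clean  = "already valid"
same   = EncodeSketch.encode_utf8(clean).equal?(clean)
```

In this sketch the invalid bytes are removed from the key, the array element, and the nested value, while the Symbol, Integer, and nil pass through untouched.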
.encode_utf8_string(string) ⇒ Object
Returns the string converted to valid UTF-8, dropping any invalid bytes.
# File 'lib/semantic_logger/utils.rb', line 143

def self.encode_utf8_string(string)
  return string if string.encoding == Encoding::UTF_8 && string.valid_encoding?

  if string.encoding == Encoding::UTF_8
    # Correctly tagged as UTF-8 but contains invalid byte sequences.
    string.scrub("")
  else
    # Different encoding (e.g. ASCII-8BIT / Latin-1): transcode into UTF-8.
    string.encode(Encoding::UTF_8, **ENCODE_UTF8_OPTIONS)
  end
rescue EncodingError
  # Last resort for encodings without a converter to UTF-8: reinterpret the
  # raw bytes as UTF-8 and drop anything invalid. Logging must never raise.
  string.dup.force_encoding(Encoding::UTF_8).scrub("")
end
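The first three branches behave differently depending on how the input is tagged, which the following sketch tries to show. `StringSketch` is a hypothetical stand-alone copy of the method and constant, and the sample strings are assumptions chosen for illustration. The last-resort `rescue` branch is not exercised here.

```ruby
# Hypothetical stand-alone copy of the method and its options constant.
module StringSketch
  ENCODE_UTF8_OPTIONS = {invalid: :replace, undef: :replace, replace: ""}.freeze

  def self.encode_utf8_string(string)
    return string if string.encoding == Encoding::UTF_8 && string.valid_encoding?

    if string.encoding == Encoding::UTF_8
      string.scrub("")
    else
      string.encode(Encoding::UTF_8, **ENCODE_UTF8_OPTIONS)
    end
  rescue EncodingError
    string.dup.force_encoding(Encoding::UTF_8).scrub("")
  end
end

# 1. Valid UTF-8: returned as-is, the very same object.
valid        = "plain text"
valid_result = StringSketch.encode_utf8_string(valid)

# 2. Tagged UTF-8 but containing an invalid byte: the byte is scrubbed out.
scrubbed = StringSketch.encode_utf8_string("abc\xFFdef")

# 3a. Latin-1: every byte has a UTF-8 mapping, so it is transcoded, not dropped.
latin1     = "caf\xE9".dup.force_encoding(Encoding::ISO_8859_1)
transcoded = StringSketch.encode_utf8_string(latin1)

# 3b. Binary (ASCII-8BIT): high bytes have no UTF-8 mapping and are dropped.
binary = StringSketch.encode_utf8_string("abc\xFF".b)
```

Bytes are dropped only when they have no UTF-8 representation; characters from a convertible encoding such as Latin-1 survive the conversion.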
.extract_backtrace(stack = caller) ⇒ Object
Extract the backtrace stripping off the leading semantic logger entries. Leaves all other system and gem path entries in place.
# File 'lib/semantic_logger/utils.rb', line 51

def self.extract_backtrace(stack = caller)
  while (first = stack.first) && extract_path?(first)
    stack.shift
  end
  stack
end
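A short sketch of the effect on a backtrace, assuming a hand-written stack. `BacktraceSketch` is a hypothetical stand-alone copy of the three methods involved, and every path in the sample stack is invented.

```ruby
# Hypothetical stand-alone copy of the methods, so that the example runs
# without loading the semantic_logger gem.
module BacktraceSketch
  def self.extract_paths
    @extract_paths ||= %w[lib/semantic_logger lib/rails_semantic_logger]
  end

  def self.extract_path?(path)
    extract_paths.any? { |exclude| path.include?(exclude) }
  end

  def self.extract_backtrace(stack = caller)
    while (first = stack.first) && extract_path?(first)
      stack.shift
    end
    stack
  end
end

# Invented stack: two logger frames, then a gem frame, then application code.
stack = [
  "/gems/semantic_logger-0.0.0/lib/semantic_logger/base.rb:10:in `log'",
  "/gems/semantic_logger-0.0.0/lib/semantic_logger/logger.rb:20:in `info'",
  "/gems/rack-0.0.0/lib/rack/handler.rb:30:in `call'",
  "/app/models/user.rb:40:in `save'"
]

result = BacktraceSketch.extract_backtrace(stack)
# result now starts at the rack frame; the gem frame is left in place.
```

Because the method uses `shift`, the array that was passed in is modified and returned. A caller that needs the untouched backtrace could pass `stack.dup` instead.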
.extract_path?(path) ⇒ Boolean
Whether this path should be excluded from any cleansed backtrace
# File 'lib/semantic_logger/utils.rb', line 63

def self.extract_path?(path)
  extract_paths.any? { |exclude| path.include?(exclude) }
end
.extract_paths ⇒ Object
# File 'lib/semantic_logger/utils.rb', line 58

def self.extract_paths
  @extract_paths ||= %w[lib/semantic_logger lib/rails_semantic_logger]
end
.method_visibility(mod, method_name) ⇒ Object
Returns the visibility for an instance method
# File 'lib/semantic_logger/utils.rb', line 38

def self.method_visibility(mod, method_name)
  method_name = method_name.to_sym
  if mod.instance_methods.include?(method_name)
    :public
  elsif mod.private_instance_methods.include?(method_name)
    :private
  elsif mod.protected_instance_methods.include?(method_name)
    :protected
  end
end
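The return values can be sketched against a small class. `VisibilitySketch` is a hypothetical stand-alone copy of the method as listed, and `DemoService` with its methods is invented for the example.

```ruby
# Hypothetical stand-alone copy of the method exactly as listed above.
module VisibilitySketch
  def self.method_visibility(mod, method_name)
    method_name = method_name.to_sym
    if mod.instance_methods.include?(method_name)
      :public
    elsif mod.private_instance_methods.include?(method_name)
      :private
    elsif mod.protected_instance_methods.include?(method_name)
      :protected
    end
  end
end

# Invented class with one method of each visibility.
class DemoService
  def run; end

  protected

  def guarded; end

  private

  def secret; end
end

public_result  = VisibilitySketch.method_visibility(DemoService, :run)      # :public
private_result = VisibilitySketch.method_visibility(DemoService, "secret")  # :private (String names are accepted)
missing_result = VisibilitySketch.method_visibility(DemoService, :missing)  # nil, no branch matches

# Module#instance_methods lists protected methods as well as public ones, so
# with the code as listed a protected method matches the first branch.
protected_result = VisibilitySketch.method_visibility(DemoService, :guarded) # :public
```

A caller that needs to distinguish protected from public methods could check `protected_instance_methods` itself, since the first branch of the listed code matches both.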
.strip_backtrace(stack = caller) ⇒ Object
Strips leading entries from the supplied backtrace until the first application stack entry is at the top. For example, all leading gem paths, built-in Ruby code paths, and Semantic Logger entries are removed from the top. Once the first application entry is found, the remaining stack is returned.
# File 'lib/semantic_logger/utils.rb', line 70

def self.strip_backtrace(stack = caller)
  while (first = stack.first) && (strip_path?(first) || extract_path?(first))
    stack.shift
  end
  stack
end
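The difference from `.extract_backtrace` can be sketched with a stack that mixes logger, gem, standard library, and application frames. `StripSketch` is a hypothetical stand-alone copy of the methods involved. The sample frames are invented: the gem and standard library frames are built from `Gem.default_dir` and `RbConfig::CONFIG["rubylibdir"]` so that they match whatever Ruby installation runs the example, and the application path is assumed not to live under any gem directory.

```ruby
require "rubygems"
require "rbconfig"

# Hypothetical stand-alone copy of the methods, so that the example runs
# without loading the semantic_logger gem.
module StripSketch
  def self.extract_paths
    @extract_paths ||= %w[lib/semantic_logger lib/rails_semantic_logger]
  end

  def self.extract_path?(path)
    extract_paths.any? { |exclude| path.include?(exclude) }
  end

  def self.strip_paths
    @strip_paths ||=
      begin
        paths = Gem.path | [Gem.default_dir]
        paths << RbConfig::CONFIG["rubylibdir"]
        paths
      end
  end

  def self.strip_path?(path)
    strip_paths.any? { |exclude| path.start_with?(exclude) }
  end

  def self.strip_backtrace(stack = caller)
    while (first = stack.first) && (strip_path?(first) || extract_path?(first))
      stack.shift
    end
    stack
  end
end

# Invented frames; only the directory prefixes come from the running Ruby.
logger_entry = "/hypothetical_gems/semantic_logger-0.0.0/lib/semantic_logger/logger.rb:20:in `info'"
gem_entry    = File.join(Gem.default_dir, "gems/rack-0.0.0/lib/rack/handler.rb") + ":30:in `call'"
ruby_entry   = File.join(RbConfig::CONFIG["rubylibdir"], "net/http.rb") + ":50:in `request'"
app_entry    = "/hypothetical_app/app/models/user.rb:40:in `save'"

stack  = [logger_entry, gem_entry, ruby_entry, app_entry, gem_entry]
result = StripSketch.strip_backtrace(stack)
# result is [app_entry, gem_entry]: only the leading non-application frames go.
```

Gem and Ruby frames that appear after the first application frame are kept, since stripping stops at the first entry that matches neither list.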
.strip_path?(path) ⇒ Boolean
Whether this path should be excluded from any cleansed backtrace
# File 'lib/semantic_logger/utils.rb', line 89

def self.strip_path?(path)
  strip_paths.any? { |exclude| path.start_with?(exclude) }
end
.strip_paths ⇒ Object
Paths to exclude in the stripped backtrace. Includes Gems and built-in Ruby code paths.
# File 'lib/semantic_logger/utils.rb', line 79

def self.strip_paths
  @strip_paths ||=
    begin
      paths = Gem.path | [Gem.default_dir]
      paths << RbConfig::CONFIG["rubylibdir"]
      paths
    end
end
.to_json(value) ⇒ Object
Serializes the value to JSON, repairing invalid UTF-8 only when necessary.
Non UTF-8 data appears in well under 1% of log events, so it is wasteful to
walk and reallocate the entire structure (see .encode_utf8) on every call.
Instead this attempts .to_json directly and only falls back to cleansing
when serialization fails because of an encoding problem.
The exception raised for non UTF-8 data depends on the json gem version: older versions raise Encoding::UndefinedConversionError (an EncodingError), while newer versions wrap it as JSON::GeneratorError, so both are rescued. The retry is attempted only once: if it still fails (for example a JSON::GeneratorError caused by something other than encoding, such as NaN), the error propagates unchanged rather than being swallowed.
# File 'lib/semantic_logger/utils.rb', line 106

def self.to_json(value)
  value.to_json
rescue JSON::GeneratorError, EncodingError
  encode_utf8(value).to_json
end
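The try-first, cleanse-on-failure flow can be sketched as below. `JsonSketch` is a hypothetical stand-alone copy of the three methods and the options constant, and the sample payloads are invented. Which exception class triggers the fallback depends on the installed json gem, so the sketch only checks the final result.

```ruby
require "json"

# Hypothetical stand-alone copy of the helpers, so that the example runs
# without loading the semantic_logger gem.
module JsonSketch
  ENCODE_UTF8_OPTIONS = {invalid: :replace, undef: :replace, replace: ""}.freeze

  def self.encode_utf8_string(string)
    return string if string.encoding == Encoding::UTF_8 && string.valid_encoding?

    if string.encoding == Encoding::UTF_8
      string.scrub("")
    else
      string.encode(Encoding::UTF_8, **ENCODE_UTF8_OPTIONS)
    end
  rescue EncodingError
    string.dup.force_encoding(Encoding::UTF_8).scrub("")
  end

  def self.encode_utf8(value)
    case value
    when String then encode_utf8_string(value)
    when Hash   then value.each_with_object({}) { |(k, v), h| h[encode_utf8(k)] = encode_utf8(v) }
    when Array  then value.map { |element| encode_utf8(element) }
    else value
    end
  end

  def self.to_json(value)
    value.to_json
  rescue JSON::GeneratorError, EncodingError
    # Reached only when the direct attempt fails; a second failure propagates.
    encode_utf8(value).to_json
  end
end

# Common case: valid data is serialized directly, with no cleansing pass.
fast = JsonSketch.to_json({"message" => "ok", "count" => 1})

# Rare case: an invalid byte makes the first attempt fail, so the value is
# cleansed and serialized again.
repaired = JsonSketch.to_json({"message" => "abc\xFF"})

# Non-encoding failure: NaN is rejected on both attempts, so the error
# reaches the caller.
nan_error =
  begin
    JsonSketch.to_json({"value" => Float::NAN})
    nil
  rescue JSON::GeneratorError => e
    e.class
  end
```

The design choice this illustrates is that the cost of walking and rebuilding the structure is paid only by the events that actually contain bad bytes.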