Class: Pubid::Utils::StringNormalizer
- Inherits: Object
- Defined in: lib/pubid/utils/string_normalizer.rb
Overview
String normalization utilities for PubID parsing and rendering.
This class provides centralized string manipulation methods to reduce duplication and improve consistency.
Usage
Pubid::Utils::StringNormalizer.normalize_dashes(str)
Pubid::Utils::StringNormalizer.normalize_whitespace(str)
Pubid::Utils::StringNormalizer.split_compound_number(str)
Constant Summary
- DASH_CHARS =
Unicode dash characters that should be normalized to ASCII hyphen
["-", "‑", "‐", "–", "—"].freeze
- WHITESPACE_CHARS =
Whitespace characters to normalize to single space
[" ", "\t", "\n", "\r", "\u00A0"].freeze
Class Method Summary
- .blank?(str) ⇒ Boolean
Check if string is blank (nil, empty, or only whitespace).
- .clean_abbr(str) ⇒ String
Clean and uppercase abbreviation.
- .extract_number_suffix(str) ⇒ Array<String, nil>
Extract numeric suffix from string.
- .join_parts(parts, separator: "") ⇒ String
Join parts with proper separator, skipping nils.
- .normalize_dashes(str) ⇒ String
Normalize all dash characters to ASCII hyphen.
- .normalize_whitespace(str) ⇒ String
Normalize whitespace to single space and strip.
- .split_compound_number(str, separators: ["-", "/"]) ⇒ Array<String>
Split compound number (e.g., "800-53-1" -> ["800", "53", "1"]).
- .title_case(str) ⇒ String
Convert to title case (first letter of each word uppercase).
- .to_s(str) ⇒ String
Safe to_s method that handles nil.
- .truncate(str, max_length:, ellipsis: "...") ⇒ String
Truncate string to max length with ellipsis.
Class Method Details
.blank?(str) ⇒ Boolean
Check if string is blank (nil, empty, or only whitespace)
# File 'lib/pubid/utils/string_normalizer.rb', line 177
def blank?(str)
  str.nil? || str.to_s.strip.empty?
end
.clean_abbr(str) ⇒ String
Clean and uppercase abbreviation
# File 'lib/pubid/utils/string_normalizer.rb', line 63
def clean_abbr(str)
  return "" if str.nil?

  normalize_whitespace(str).upcase
end
.extract_number_suffix(str) ⇒ Array<String, nil>
Extract the number and its optional alphabetic suffix from a string, returned as [number, suffix]. Returns [nil, nil] when the string is nil or does not match.
# File 'lib/pubid/utils/string_normalizer.rb', line 97
def extract_number_suffix(str)
  return [nil, nil] unless str

  match = str.match(/^(\D+)?(\d+)([a-zA-Z]+)?$/)
  return [nil, nil] unless match

  number = match[2]
  suffix = match[3]
  [number, suffix]
end
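The regular expression in this method has three groups: an optional non-digit prefix, the digits, and an optional run of ASCII letters. Only the last two are returned. The sketch below copies the method body into a hypothetical module named `SuffixSketch` (not part of the gem) so that it runs without PubID installed.

```ruby
# Hypothetical stand-in module; the method body mirrors the documented source.
module SuffixSketch
  def self.extract_number_suffix(str)
    return [nil, nil] unless str

    match = str.match(/^(\D+)?(\d+)([a-zA-Z]+)?$/)
    return [nil, nil] unless match

    [match[2], match[3]]
  end
end

p SuffixSketch.extract_number_suffix("53A")    # => ["53", "A"]
p SuffixSketch.extract_number_suffix("SP800")  # => ["800", nil]  (prefix "SP" is matched, then dropped)
p SuffixSketch.extract_number_suffix("800-53") # => [nil, nil]    ("-" after the digits prevents a match)
p SuffixSketch.extract_number_suffix(nil)      # => [nil, nil]
```

Compound numbers such as "800-53" do not match, so they presumably need to be split before each part is passed to this method.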
.join_parts(parts, separator: "") ⇒ String
Join parts with proper separator, skipping nils
# File 'lib/pubid/utils/string_normalizer.rb', line 120
def join_parts(parts, separator: "")
  parts.compact.join(separator)
end
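Because the method relies on `Array#compact`, only `nil` entries are skipped. Empty strings are kept and still receive a separator. A minimal sketch, using a hypothetical module `JoinSketch` that copies the method body so it runs standalone:

```ruby
# Hypothetical stand-in module; the method body mirrors the documented source.
module JoinSketch
  def self.join_parts(parts, separator: "")
    parts.compact.join(separator)
  end
end

puts JoinSketch.join_parts(["NIST", nil, "SP", "800-53"], separator: " ") # => NIST SP 800-53
puts JoinSketch.join_parts(["A", "", "B"], separator: "-")                # => A--B
```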
.normalize_dashes(str) ⇒ String
Normalize all dash characters to ASCII hyphen
# File 'lib/pubid/utils/string_normalizer.rb', line 33
def normalize_dashes(str)
  return str if str.nil?

  str.tr(DASH_CHARS.join, "-")
end
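`String#tr` treats a hyphen at the start of its first argument as a literal character, so joining `DASH_CHARS` with the ASCII hyphen first does not create a range. The sketch below uses a hypothetical module `DashSketch`. The code points in its constant (U+2011, U+2010, U+2013, U+2014) are an assumption about which dash characters the documented `DASH_CHARS` contains.

```ruby
# Hypothetical stand-in module; the dash code points are assumed, not confirmed.
module DashSketch
  DASH_CHARS = ["-", "\u2011", "\u2010", "\u2013", "\u2014"].freeze

  def self.normalize_dashes(str)
    return str if str.nil?

    # Every character listed in DASH_CHARS is mapped to the ASCII hyphen.
    str.tr(DASH_CHARS.join, "-")
  end
end

puts DashSketch.normalize_dashes("SP 800\u201353")  # en dash   => SP 800-53
puts DashSketch.normalize_dashes("SP 800\u201453")  # em dash   => SP 800-53
p DashSketch.normalize_dashes(nil)                  # nil is passed through => nil
```

Unlike most other methods in this class, `normalize_dashes` returns `nil` rather than an empty string for a `nil` argument.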
.normalize_whitespace(str) ⇒ String
Normalize whitespace to single space and strip
# File 'lib/pubid/utils/string_normalizer.rb', line 48
def normalize_whitespace(str)
  return "" if str.nil?

  str.gsub(/[#{WHITESPACE_CHARS.join}]+/, " ").strip
end
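The method builds a character class from `WHITESPACE_CHARS`, collapses each run of those characters into one space, and then strips the ends. A minimal sketch with a hypothetical module `WhitespaceSketch` that copies the constant and the method body so it runs standalone:

```ruby
# Hypothetical stand-in module; constant and method body mirror the documented source.
module WhitespaceSketch
  WHITESPACE_CHARS = [" ", "\t", "\n", "\r", "\u00A0"].freeze

  def self.normalize_whitespace(str)
    return "" if str.nil?

    str.gsub(/[#{WHITESPACE_CHARS.join}]+/, " ").strip
  end
end

# Mixed spaces, a tab, a newline, and a non-breaking space (U+00A0).
p WhitespaceSketch.normalize_whitespace("  NIST\u00A0SP \t800-53\n") # => "NIST SP 800-53"
p WhitespaceSketch.normalize_whitespace(nil)                          # => ""
```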
.split_compound_number(str, separators: ["-", "/"]) ⇒ Array<String>
Split compound number (e.g., "800-53-1" -> ["800", "53", "1"])
# File 'lib/pubid/utils/string_normalizer.rb', line 81
def split_compound_number(str, separators: ["-", "/"])
  return [] unless str

  normalized = normalize_dashes(str)
  normalized.split(/[#{separators.join}]/).reject(&:empty?)
end
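Dashes are normalized before splitting, so Unicode dashes act as separators too. The sketch below uses a hypothetical module `SplitSketch`. Its `normalize_dashes` helper is a simplified copy with assumed dash code points.

```ruby
# Hypothetical stand-in module; dash code points in the helper are assumed.
module SplitSketch
  def self.normalize_dashes(str)
    str.tr("-\u2011\u2010\u2013\u2014", "-")
  end

  def self.split_compound_number(str, separators: ["-", "/"])
    return [] unless str

    # Separators are interpolated into a character class, as in the documented source.
    normalize_dashes(str).split(/[#{separators.join}]/).reject(&:empty?)
  end
end

p SplitSketch.split_compound_number("800-53-1")                  # => ["800", "53", "1"]
p SplitSketch.split_compound_number("800\u201353/1")             # => ["800", "53", "1"]
p SplitSketch.split_compound_number("800.53", separators: ["."]) # => ["800", "53"]
p SplitSketch.split_compound_number(nil)                         # => []
```

The separators are not escaped before interpolation, so a custom list with "-" between two other characters would be read as a range, and characters such as "]" or "^" could change the meaning of the class. Callers who pass arbitrary separators may prefer a pattern built with `Regexp.union(separators)`, which escapes each entry.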
.title_case(str) ⇒ String
Convert to title case (first letter of each word uppercase)
# File 'lib/pubid/utils/string_normalizer.rb', line 154
def title_case(str)
  return "" if str.nil?

  str.split.map do |word|
    if word.upcase == word # Preserve acronyms
      word
    else
      word.capitalize
    end
  end.join(" ")
end
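A word is left untouched when upcasing does not change it, which covers acronyms as well as purely numeric tokens. Every other word goes through `String#capitalize`, which also lowercases the remaining letters. A minimal sketch with a hypothetical module `TitleSketch` that condenses the same logic:

```ruby
# Hypothetical stand-in module; the logic mirrors the documented source.
module TitleSketch
  def self.title_case(str)
    return "" if str.nil?

    str.split.map { |word| word.upcase == word ? word : word.capitalize }.join(" ")
  end
end

p TitleSketch.title_case("NIST special publication 800-53") # => "NIST Special Publication 800-53"
p TitleSketch.title_case("guide to  cyberSecurity")         # => "Guide To Cybersecurity"
p TitleSketch.title_case(nil)                               # => ""
```

Since `String#split` without arguments splits on any run of whitespace, the result also has its internal whitespace collapsed to single spaces.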
.to_s(str) ⇒ String
Safe to_s method that handles nil
# File 'lib/pubid/utils/string_normalizer.rb', line 190
def to_s(str)
  str.nil? ? "" : str.to_s
end
.truncate(str, max_length:, ellipsis: "...") ⇒ String
Truncate string to max length with ellipsis
# File 'lib/pubid/utils/string_normalizer.rb', line 137
def truncate(str, max_length:, ellipsis: "...")
  return str if str.nil? || str.length <= max_length

  str[0...(max_length - ellipsis.length)] + ellipsis
end
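The ellipsis counts toward `max_length`: the string is cut to `max_length - ellipsis.length` characters before the ellipsis is appended. A minimal sketch with a hypothetical module `TruncateSketch` that copies the method body so it runs standalone:

```ruby
# Hypothetical stand-in module; the method body mirrors the documented source.
module TruncateSketch
  def self.truncate(str, max_length:, ellipsis: "...")
    return str if str.nil? || str.length <= max_length

    str[0...(max_length - ellipsis.length)] + ellipsis
  end
end

title = "Security and Privacy Controls"
p TruncateSketch.truncate(title, max_length: 12)    # => "Security ..."
p TruncateSketch.truncate("short", max_length: 12)  # => "short"
p TruncateSketch.truncate(nil, max_length: 12)      # => nil
```

When `max_length` is smaller than the ellipsis length, the range end becomes negative and the result can exceed `max_length`, so callers would likely want to keep `max_length` at least as large as the ellipsis.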