Module: Rich::Cells
Defined in: lib/rich/cells.rb
Overview
Cell width calculation for Unicode characters. Handles East Asian Width and emoji character widths for proper terminal alignment.
Constant Summary
- ZERO_WIDTH_RANGES =
Zero-width character categories (based on Unicode East Asian Width). These characters do not take up any visual space.
[ 0x0000..0x001F, # C0 control codes 0x007F..0x009F, # C1 control codes 0x00AD..0x00AD, # Soft hyphen 0x0300..0x036F, # Combining diacritical marks 0x0483..0x0489, # Combining Cyrillic marks 0x0591..0x05BD, # Hebrew combining marks 0x05BF..0x05BF, 0x05C1..0x05C2, 0x05C4..0x05C5, 0x05C7..0x05C7, 0x0600..0x0605, # Arabic marks 0x0610..0x061A, 0x061C..0x061C, 0x064B..0x065F, 0x0670..0x0670, 0x06D6..0x06DC, 0x06DF..0x06E4, 0x06E7..0x06E8, 0x06EA..0x06ED, 0x070F..0x070F, 0x0711..0x0711, 0x0730..0x074A, 0x07A6..0x07B0, 0x07EB..0x07F3, 0x0816..0x0819, 0x081B..0x0823, 0x0825..0x0827, 0x0829..0x082D, 0x0859..0x085B, 0x08D4..0x08E1, 0x08E3..0x0902, 0x093A..0x093A, 0x093C..0x093C, 0x0941..0x0948, 0x094D..0x094D, 0x0951..0x0957, 0x0962..0x0963, 0x0981..0x0981, 0x09BC..0x09BC, 0x09C1..0x09C4, 0x09CD..0x09CD, 0x09E2..0x09E3, 0x0A01..0x0A02, 0x0A3C..0x0A3C, 0x0A41..0x0A42, 0x0A47..0x0A48, 0x0A4B..0x0A4D, 0x0A51..0x0A51, 0x0A70..0x0A71, 0x0A75..0x0A75, 0x0A81..0x0A82, 0x0ABC..0x0ABC, 0x0AC1..0x0AC5, 0x0AC7..0x0AC8, 0x0ACD..0x0ACD, 0x0AE2..0x0AE3, 0x0B01..0x0B01, 0x0B3C..0x0B3C, 0x0B3F..0x0B3F, 0x0B41..0x0B44, 0x0B4D..0x0B4D, 0x0B56..0x0B56, 0x0B62..0x0B63, 0x0B82..0x0B82, 0x0BC0..0x0BC0, 0x0BCD..0x0BCD, 0x0C00..0x0C00, 0x0C3E..0x0C40, 0x0C46..0x0C48, 0x0C4A..0x0C4D, 0x0C55..0x0C56, 0x0C62..0x0C63, 0x0C81..0x0C81, 0x0CBC..0x0CBC, 0x0CBF..0x0CBF, 0x0CC6..0x0CC6, 0x0CCC..0x0CCD, 0x0CE2..0x0CE3, 0x0D01..0x0D01, 0x0D41..0x0D44, 0x0D4D..0x0D4D, 0x0D62..0x0D63, 0x0DCA..0x0DCA, 0x0DD2..0x0DD4, 0x0DD6..0x0DD6, 0x0E31..0x0E31, 0x0E34..0x0E3A, 0x0E47..0x0E4E, 0x0EB1..0x0EB1, 0x0EB4..0x0EB9, 0x0EBB..0x0EBC, 0x0EC8..0x0ECD, 0x0F18..0x0F19, 0x0F35..0x0F35, 0x0F37..0x0F37, 0x0F39..0x0F39, 0x0F71..0x0F7E, 0x0F80..0x0F84, 0x0F86..0x0F87, 0x0F8D..0x0F97, 0x0F99..0x0FBC, 0x0FC6..0x0FC6, 0x102D..0x1030, 0x1032..0x1037, 0x1039..0x103A, 0x103D..0x103E, 0x1058..0x1059, 0x105E..0x1060, 0x1071..0x1074, 0x1082..0x1082, 0x1085..0x1086, 0x108D..0x108D, 0x109D..0x109D, 0x1160..0x11FF, # Hangul 
Jungseong/Jongseong 0x135D..0x135F, 0x1712..0x1714, 0x1732..0x1734, 0x1752..0x1753, 0x1772..0x1773, 0x17B4..0x17B5, 0x17B7..0x17BD, 0x17C6..0x17C6, 0x17C9..0x17D3, 0x17DD..0x17DD, 0x180B..0x180D, 0x1885..0x1886, 0x18A9..0x18A9, 0x1920..0x1922, 0x1927..0x1928, 0x1932..0x1932, 0x1939..0x193B, 0x1A17..0x1A18, 0x1A1B..0x1A1B, 0x1A56..0x1A56, 0x1A58..0x1A5E, 0x1A60..0x1A60, 0x1A62..0x1A62, 0x1A65..0x1A6C, 0x1A73..0x1A7C, 0x1A7F..0x1A7F, 0x1AB0..0x1ABE, 0x1B00..0x1B03, 0x1B34..0x1B34, 0x1B36..0x1B3A, 0x1B3C..0x1B3C, 0x1B42..0x1B42, 0x1B6B..0x1B73, 0x1B80..0x1B81, 0x1BA2..0x1BA5, 0x1BA8..0x1BA9, 0x1BAB..0x1BAD, 0x1BE6..0x1BE6, 0x1BE8..0x1BE9, 0x1BED..0x1BED, 0x1BEF..0x1BF1, 0x1C2C..0x1C33, 0x1C36..0x1C37, 0x1CD0..0x1CD2, 0x1CD4..0x1CE0, 0x1CE2..0x1CE8, 0x1CED..0x1CED, 0x1CF4..0x1CF4, 0x1CF8..0x1CF9, 0x1DC0..0x1DF5, 0x1DFC..0x1DFF, 0x200B..0x200F, # Zero-width spaces and direction marks 0x202A..0x202E, 0x2060..0x2064, 0x2066..0x206F, 0x20D0..0x20F0, # Combining marks for symbols 0x2CEF..0x2CF1, 0x2D7F..0x2D7F, 0x2DE0..0x2DFF, 0x302A..0x302D, 0x3099..0x309A, 0xA66F..0xA672, 0xA674..0xA67D, 0xA69E..0xA69F, 0xA6F0..0xA6F1, 0xA802..0xA802, 0xA806..0xA806, 0xA80B..0xA80B, 0xA825..0xA826, 0xA8C4..0xA8C4, 0xA8E0..0xA8F1, 0xA926..0xA92D, 0xA947..0xA951, 0xA980..0xA982, 0xA9B3..0xA9B3, 0xA9B6..0xA9B9, 0xA9BC..0xA9BC, 0xA9E5..0xA9E5, 0xAA29..0xAA2E, 0xAA31..0xAA32, 0xAA35..0xAA36, 0xAA43..0xAA43, 0xAA4C..0xAA4C, 0xAA7C..0xAA7C, 0xAAB0..0xAAB0, 0xAAB2..0xAAB4, 0xAAB7..0xAAB8, 0xAABE..0xAABF, 0xAAC1..0xAAC1, 0xAAEC..0xAAED, 0xAAF6..0xAAF6, 0xABE5..0xABE5, 0xABE8..0xABE8, 0xABED..0xABED, 0xFB1E..0xFB1E, 0xFE00..0xFE0F, # Variation selectors 0xFE20..0xFE2F, 0xFEFF..0xFEFF, # BOM/ZWNBSP 0xFFF9..0xFFFB, 0x101FD..0x101FD, 0x102E0..0x102E0, 0x10376..0x1037A, 0x10A01..0x10A03, 0x10A05..0x10A06, 0x10A0C..0x10A0F, 0x10A38..0x10A3A, 0x10A3F..0x10A3F, 0x10AE5..0x10AE6, 0x11001..0x11001, 0x11038..0x11046, 0x1107F..0x11081, 0x110B3..0x110B6, 0x110B9..0x110BA, 0x11100..0x11102, 0x11127..0x1112B, 
0x1112D..0x11134, 0x11173..0x11173, 0x11180..0x11181, 0x111B6..0x111BE, 0x111CA..0x111CC, 0x1122F..0x11231, 0x11234..0x11234, 0x11236..0x11237, 0x1123E..0x1123E, 0x112DF..0x112DF, 0x112E3..0x112EA, 0x11300..0x11301, 0x1133C..0x1133C, 0x11340..0x11340, 0x11366..0x1136C, 0x11370..0x11374, 0x11438..0x1143F, 0x11442..0x11444, 0x11446..0x11446, 0x114B3..0x114B8, 0x114BA..0x114BA, 0x114BF..0x114C0, 0x114C2..0x114C3, 0x115B2..0x115B5, 0x115BC..0x115BD, 0x115BF..0x115C0, 0x115DC..0x115DD, 0x11633..0x1163A, 0x1163D..0x1163D, 0x1163F..0x11640, 0x116AB..0x116AB, 0x116AD..0x116AD, 0x116B0..0x116B5, 0x116B7..0x116B7, 0x1171D..0x1171F, 0x11722..0x11725, 0x11727..0x1172B, 0x11C30..0x11C36, 0x11C38..0x11C3D, 0x11C3F..0x11C3F, 0x11C92..0x11CA7, 0x11CAA..0x11CB0, 0x11CB2..0x11CB3, 0x11CB5..0x11CB6, 0x16AF0..0x16AF4, 0x16B30..0x16B36, 0x16F8F..0x16F92, 0x1BC9D..0x1BC9E, 0x1D167..0x1D169, 0x1D173..0x1D182, 0x1D185..0x1D18B, 0x1D1AA..0x1D1AD, 0x1D242..0x1D244, 0x1DA00..0x1DA36, 0x1DA3B..0x1DA6C, 0x1DA75..0x1DA75, 0x1DA84..0x1DA84, 0x1DA9B..0x1DA9F, 0x1DAA1..0x1DAAF, 0x1E000..0x1E006, 0x1E008..0x1E018, 0x1E01B..0x1E021, 0x1E023..0x1E024, 0x1E026..0x1E02A, 0x1E8D0..0x1E8D6, 0x1E944..0x1E94A, 0xE0001..0xE0001, 0xE0020..0xE007F, 0xE0100..0xE01EF ].freeze
- WIDE_RANGES =
Wide (double-width) character ranges (CJK, etc.)
[ 0x1100..0x115F, # Hangul Jamo 0x231A..0x231B, # Watch, Hourglass 0x2329..0x232A, # Angle brackets 0x23E9..0x23F3, # Media controls 0x23F8..0x23FA, 0x25FD..0x25FE, # Squares 0x2614..0x2615, # Umbrella, Hot beverage 0x2648..0x2653, # Zodiac 0x267F..0x267F, # Wheelchair 0x2693..0x2693, # Anchor 0x26A1..0x26A1, # Lightning 0x26AA..0x26AB, # Circles 0x26BD..0x26BE, # Sports 0x26C4..0x26C5, # Weather 0x26CE..0x26CE, # Ophiuchus 0x26D4..0x26D4, # No entry 0x26EA..0x26EA, # Church 0x26F2..0x26F3, # Fountain, Golf 0x26F5..0x26F5, # Sailboat 0x26FA..0x26FA, # Tent 0x26FD..0x26FD, # Fuel pump 0x2702..0x2702, # Scissors 0x2705..0x2705, # Check mark 0x2708..0x270D, # Various 0x270F..0x270F, # Pencil 0x2712..0x2712, # Black nib 0x2714..0x2714, # Check mark 0x2716..0x2716, # X mark 0x271D..0x271D, # Cross 0x2721..0x2721, # Star of David 0x2728..0x2728, # Sparkles 0x2733..0x2734, # Asterisks 0x2744..0x2744, # Snowflake 0x2747..0x2747, # Sparkle 0x274C..0x274C, # Cross mark 0x274E..0x274E, 0x2753..0x2755, # Question marks 0x2757..0x2757, # Exclamation 0x2763..0x2764, # Heart 0x2795..0x2797, # Math symbols 0x27A1..0x27A1, # Arrow 0x27B0..0x27B0, # Loop 0x27BF..0x27BF, # Loop 0x2934..0x2935, # Arrows 0x2B05..0x2B07, # Arrows 0x2B1B..0x2B1C, # Squares 0x2B50..0x2B50, # Star 0x2B55..0x2B55, # Circle 0x2E80..0x2E99, # CJK radicals 0x2E9B..0x2EF3, 0x2F00..0x2FD5, # Kangxi radicals 0x2FF0..0x2FFB, # Ideographic description 0x3000..0x303E, # CJK punctuation 0x3041..0x3096, # Hiragana 0x3099..0x30FF, # Katakana 0x3105..0x312D, # Bopomofo 0x3131..0x318E, # Hangul compatibility 0x3190..0x31BA, # Kanbun 0x31C0..0x31E3, # CJK strokes 0x31F0..0x321E, # Katakana extensions 0x3220..0x3247, # Enclosed CJK 0x3250..0x32FE, 0x3300..0x4DBF, # CJK unified 0x4E00..0x9FFF, # CJK unified ideographs 0xA000..0xA48C, # Yi syllables 0xA490..0xA4C6, # Yi radicals 0xA960..0xA97C, # Hangul Jamo extended 0xAC00..0xD7A3, # Hangul syllables 0xF900..0xFAFF, # CJK compatibility ideographs 0xFE10..0xFE19, # Vertical 
forms 0xFE30..0xFE52, # CJK compatibility forms 0xFE54..0xFE66, 0xFE68..0xFE6B, 0xFF01..0xFF60, # Fullwidth forms 0xFFE0..0xFFE6, 0x16FE0..0x16FE1, # Various 0x17000..0x187EC, # Tangut 0x18800..0x18AF2, 0x1B000..0x1B11E, # Kana supplement 0x1B170..0x1B2FB, 0x1F004..0x1F004, # Mahjong 0x1F0CF..0x1F0CF, # Playing card 0x1F18E..0x1F18E, # Squared AB 0x1F191..0x1F19A, # Squared 0x1F200..0x1F202, 0x1F210..0x1F23B, 0x1F240..0x1F248, 0x1F250..0x1F251, 0x1F260..0x1F265, 0x1F300..0x1F64F, # Emoji 0x1F680..0x1F6C5, 0x1F6CC..0x1F6CC, 0x1F6D0..0x1F6D2, 0x1F6EB..0x1F6EC, 0x1F6F4..0x1F6F8, 0x1F910..0x1F93E, 0x1F940..0x1F94C, 0x1F950..0x1F96B, 0x1F980..0x1F997, 0x1F9C0..0x1F9C0, 0x1F9D0..0x1F9E6, 0x20000..0x2FFFD, # CJK extension B 0x30000..0x3FFFD # CJK extension G ].freeze
Class Method Summary
- .cached_cell_len(text) ⇒ Integer
  Get the display width of a string, with cache.
- .cell_len(text) ⇒ Integer
  Get the display width of a string.
- .char_width(char) ⇒ Integer
  Get the display width of a single character.
- .clear_cache ⇒ Object
  Clear the cell width cache.
- .set_cell_size(char, size) ⇒ Object
  Set the cell size for a specific character (override).
- .wide?(char) ⇒ Boolean
  Check if a character is a wide (double-width) character.
- .zero_width?(char) ⇒ Boolean
  Check if a character is a zero-width character.
Class Method Details
.cached_cell_len(text) ⇒ Integer
Get the display width of a string, with cache.

# File 'lib/rich/cells.rb', line 476

def cached_cell_len(text)
  return 0 if text.nil? || text.empty?

  # Lock-free read of the common cache-hit path (Hash reads are safe under
  # MRI's GVL); only take the lock to populate. A 0 width is truthy in
  # Ruby, so a cached zero-width string still short-circuits correctly.
  hit = @cache[text]
  return hit if hit

  result = cell_len(text)
  @cache_mutex.synchronize do
    # Bulk-clear once over the cap rather than shifting one entry per
    # insert while permanently parked at the boundary.
    @cache.clear if @cache.size > 10_000
    @cache[text] = result
  end
  result
end
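The listing above leans on two details: 0 is truthy in Ruby, so a cached zero width still counts as a hit, and the cache is emptied in one step once it grows past its cap. The standalone sketch below reproduces that pattern in miniature. ToyCache, its cap of 3, and the use of String#length as the measured value are assumptions made for illustration; none of them are part of Rich::Cells.

```ruby
# Hypothetical stand-in for the module-level cache; not part of Rich::Cells.
class ToyCache
  CAP = 3 # deliberately tiny so the bulk-clear is easy to observe

  attr_reader :misses

  def initialize
    @cache = {}
    @mutex = Mutex.new
    @misses = 0
  end

  def size
    @cache.size
  end

  def fetch(text)
    hit = @cache[text]
    return hit if hit # 0 is truthy in Ruby, so a stored 0 is still a hit

    @misses += 1
    result = text.length # stand-in for the real width computation
    @mutex.synchronize do
      @cache.clear if @cache.size > CAP # bulk-clear once over the cap
      @cache[text] = result
    end
    result
  end
end

cache = ToyCache.new
cache.fetch("") # computes and stores 0
cache.fetch("") # served from the cache even though the value is 0
puts cache.misses
```

Because the size check runs before each insert, the hash can briefly hold CAP + 1 entries; the next miss after that empties it and stores only the new entry.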
.cell_len(text) ⇒ Integer
Get the display width of a string.

# File 'lib/rich/cells.rb', line 464

def cell_len(text)
  return 0 if text.nil? || text.empty?

  # Display width is the sum of per-character cell widths. This is correct
  # on every platform: legacy Windows consoles still render a CJK glyph as
  # 2 cells and ASCII as 1; UTF-8 byte count is NOT a display width.
  text.each_char.sum { |c| char_width(c) }
end
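The difference between byte count, character count, and display width can be seen in a small standalone sketch. It does not call Rich::Cells: toy_width and describe_lengths are hypothetical helpers, and toy_width treats only the CJK unified ideographs block (0x4E00..0x9FFF, one of the WIDE_RANGES entries) as double-width.

```ruby
# Hypothetical helper for illustration only; not part of Rich::Cells.
# Treats CJK unified ideographs as width 2 and everything else as width 1.
def toy_width(char)
  (0x4E00..0x9FFF).cover?(char.ord) ? 2 : 1
end

# Returns [UTF-8 byte count, character count, terminal cell count].
def describe_lengths(text)
  cells = text.each_char.sum { |c| toy_width(c) }
  [text.bytesize, text.length, cells]
end

bytes, chars, cells = describe_lengths("日本語 ok")
puts "bytes=#{bytes} chars=#{chars} cells=#{cells}"
# prints "bytes=12 chars=6 cells=9"
```

Each of the three ideographs occupies 3 bytes, 1 character, and 2 cells, so none of the three counts agree.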
.char_width(char) ⇒ Integer
Get the display width of a single character.

# File 'lib/rich/cells.rb', line 429

def char_width(char)
  return 0 if char.nil? || char.empty?

  # Honor explicit per-character overrides registered via set_cell_size.
  if defined?(@overrides) && @overrides && (override = @overrides[char])
    return override
  end

  codepoint = char.ord

  # Tab, newline, and carriage return should have width 1 for measurement
  # (even though they may have different visual behavior)
  return 1 if codepoint == 0x09 || codepoint == 0x0A || codepoint == 0x0D

  # Fast path: printable ASCII is always width 1. This is the overwhelmingly
  # common case and avoids scanning the ~340 zero-width/wide ranges below.
  return 1 if codepoint >= 0x20 && codepoint <= 0x7E

  # Check zero-width first (most common for combining marks)
  ZERO_WIDTH_RANGES.each do |range|
    return 0 if range.cover?(codepoint)
  end

  # Check wide characters
  WIDE_RANGES.each do |range|
    return 2 if range.cover?(codepoint)
  end

  # Default to single width
  1
end
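The method above scans each range table linearly. If a table is sorted by starting codepoint and its ranges do not overlap, a binary search is one possible alternative for non-ASCII input. The sketch below works under that assumption and does not verify it for the real constants; SAMPLE_WIDE is an abbreviated, hypothetical table and in_ranges? is not part of Rich::Cells.

```ruby
# Hypothetical, abbreviated table for illustration; the real WIDE_RANGES is longer.
# Binary search is only valid if the table is sorted and non-overlapping.
SAMPLE_WIDE = [
  0x1100..0x115F, # Hangul Jamo
  0x3041..0x3096, # Hiragana
  0x4E00..0x9FFF, # CJK unified ideographs
  0xAC00..0xD7A3  # Hangul syllables
].freeze

# Returns true if codepoint falls inside any range of the sorted table.
# In find-any mode, Array#bsearch expects the block to return 0 on a match,
# a positive number when the match lies further right, and a negative
# number when it lies further left.
def in_ranges?(ranges, codepoint)
  found = ranges.bsearch do |range|
    if codepoint < range.begin
      -1
    elsif codepoint > range.end
      1
    else
      0
    end
  end
  !found.nil?
end

puts in_ranges?(SAMPLE_WIDE, "あ".ord) # Hiragana letter: true
puts in_ranges?(SAMPLE_WIDE, "A".ord)  # ASCII letter: false
```

Whether this is worthwhile depends on the input: the ASCII fast path already skips the tables for the most common characters.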
.clear_cache ⇒ Object
Clear the cell width cache.

# File 'lib/rich/cells.rb', line 505

def clear_cache
  @cache_mutex.synchronize { @cache.clear }
end
.set_cell_size(char, size) ⇒ Object
Set the cell size for a specific character (override).

# File 'lib/rich/cells.rb', line 498

def set_cell_size(char, size)
  @overrides[char] = size
  # Existing cached widths may include this character; invalidate them.
  clear_cache
end
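The reason for clearing the cache after an override can be shown with a miniature model. ToyCells below is a hypothetical stand-in, not Rich::Cells: every character has width 1 unless overridden, and the invalidate: keyword exists only in this sketch so that the stale result can be observed.

```ruby
# Hypothetical miniature of the override-plus-cache interaction; not Rich::Cells.
module ToyCells
  @overrides = {}
  @cache = {}

  class << self
    def char_width(char)
      @overrides.fetch(char, 1) # width 1 unless an override is registered
    end

    def cached_len(text)
      @cache[text] ||= text.each_char.sum { |c| char_width(c) }
    end

    # invalidate: is an assumption of this sketch, used to show the stale case.
    def set_cell_size(char, size, invalidate: true)
      @overrides[char] = size
      @cache.clear if invalidate
    end
  end
end

before = ToyCells.cached_len("a★b")               # 3, now cached
ToyCells.set_cell_size("★", 2, invalidate: false)
stale = ToyCells.cached_len("a★b")                # cached 3 is returned
ToyCells.set_cell_size("★", 2)                    # clears the cache
fresh = ToyCells.cached_len("a★b")                # recomputed as 4

puts [before, stale, fresh].inspect
# prints "[3, 3, 4]"
```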
.wide?(char) ⇒ Boolean
Check if a character is a wide (double-width) character.

# File 'lib/rich/cells.rb', line 519

def wide?(char)
  char_width(char) == 2
end
.zero_width?(char) ⇒ Boolean
Check if a character is a zero-width character.

# File 'lib/rich/cells.rb', line 512

def zero_width?(char)
  char_width(char) == 0
end
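A combining mark is a typical zero-width character: it adds a character to the string without adding a cell. The standalone sketch below illustrates this with a decomposed "é". COMBINING_DIACRITICS copies a single entry from ZERO_WIDTH_RANGES, and toy_zero_width? and toy_cell_len are hypothetical helpers rather than the library's methods.

```ruby
# One entry copied from ZERO_WIDTH_RANGES; the helpers below are hypothetical.
COMBINING_DIACRITICS = 0x0300..0x036F

def toy_zero_width?(char)
  COMBINING_DIACRITICS.cover?(char.ord)
end

# Counts 0 cells for combining diacritics and 1 cell for everything else.
def toy_cell_len(text)
  text.each_char.sum { |c| toy_zero_width?(c) ? 0 : 1 }
end

decomposed = "e\u0301" # "e" followed by U+0301 COMBINING ACUTE ACCENT
puts "chars=#{decomposed.length} cells=#{toy_cell_len(decomposed)}"
# prints "chars=2 cells=1"
```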