Module: RedQuilt::Inline::Flanking

Defined in:
lib/red_quilt/inline/flanking.rb

Overview

Helpers for the flanking delimiter run rules in section 6.2 of the CommonMark spec.

Determines whether a delimiter run can open and/or close emphasis. All input positions are byte offsets into the document source.

Constant Summary

UNICODE_WHITESPACE_RE =
/\A[\s   -    ]\z/
UNICODE_PUNCT_RE =

CommonMark 0.31.2 expanded the definition of “punctuation” for flanking purposes to also include Unicode S (symbol) category, so currency / math / other symbols form delimiter-run boundaries.

/\A[\p{P}\p{S}]\z/
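
As a rough illustration of what including the S category changes, the sketch below redefines the same pattern under a local name and probes a few characters. PUNCT_OR_SYMBOL_RE and punct_or_symbol? are illustrative names chosen for this example and are not part of RedQuilt.

```ruby
# Same pattern as the documented UNICODE_PUNCT_RE, redefined locally so the
# snippet runs without loading RedQuilt.
PUNCT_OR_SYMBOL_RE = /\A[\p{P}\p{S}]\z/

# Hypothetical helper: true when the single character is Unicode
# punctuation (P) or a Unicode symbol (S).
def punct_or_symbol?(char)
  PUNCT_OR_SYMBOL_RE.match?(char)
end

samples = ["«", "€", "→", "é", "中"]
samples.each do |char|
  puts "#{char.inspect} => #{punct_or_symbol?(char)}"
end
```

The currency sign and the arrow match only because of \p{S}; letters such as "é" and "中" match neither class.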
ASCII_WHITESPACE =

Fast-path lookup table for ASCII whitespace. Flanking inputs are mostly single-byte ASCII; the table lets us skip regex matches entirely on the hot path. (ASCII punctuation uses the shared Inline::ASCII_PUNCT table.)

Array.new(128, false)
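
Only the first line of the table's definition is shown above. A minimal sketch of how such a table could be populated and queried follows; the set of bytes marked true is an assumption based on CommonMark's whitespace definition (tab, line feed, form feed, carriage return, space), and ASCII_WS_TABLE and ascii_whitespace? are hypothetical names.

```ruby
# Hypothetical reconstruction: the real table's entries are not shown on
# this page, so the bytes below are an assumption.
ASCII_WS_TABLE = Array.new(128, false)
[0x09, 0x0A, 0x0C, 0x0D, 0x20].each { |byte| ASCII_WS_TABLE[byte] = true }
ASCII_WS_TABLE.freeze

# Looks up a single-byte ASCII character by its byte value, with no regex
# match involved.
def ascii_whitespace?(char)
  ASCII_WS_TABLE[char.getbyte(0)]
end

puts ascii_whitespace?(" ")
puts ascii_whitespace?("\t")
puts ascii_whitespace?("a")
```

An array indexed by byte value turns the check into a single indexed read, which is the point of the fast path described above.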

Class Method Summary

Class Method Details

.can_open_close(char, prev_char, next_char) ⇒ Object

Returns [can_open, can_close] for a delimiter run.

char must be “*”, “_”, or “~”. For “_”, word-character adjacency rules apply on top of the flanking rules; “*” and “~” use plain flanking only.



# File 'lib/red_quilt/inline/flanking.rb', line 106

def can_open_close(char, prev_char, next_char)
  left = left_flanking?(prev_char, next_char)
  right = right_flanking?(prev_char, next_char)
  if char == "_"
    can_open = left && (!right || punctuation?(prev_char))
    can_close = right && (!left || punctuation?(next_char))
  else
    can_open = left
    can_close = right
  end
  [can_open, can_close]
end
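
The extra rule for "_" is easiest to see with the flanking results passed in as plain booleans. The helper below is a hypothetical stand-in that mirrors only the "_" branch above, so that it runs without RedQuilt; its name and parameters are assumptions made for this example.

```ruby
# Hypothetical helper mirroring the "_" branch of can_open_close. The
# flanking results and punctuation checks are passed in as booleans.
def underscore_open_close(left, right, prev_is_punct, next_is_punct)
  can_open = left && (!right || prev_is_punct)
  can_close = right && (!left || next_is_punct)
  [can_open, can_close]
end

# Intraword run such as the "_" in "foo_bar": both-flanking, letters on
# both sides, so it can neither open nor close.
p underscore_open_close(true, true, false, false)

# Run at the start of a word: left-flanking only, so it can open.
p underscore_open_close(true, false, false, false)

# Both-flanking run preceded by punctuation: it may open, but not close.
p underscore_open_close(true, true, true, false)
```

For "*" and "~" the same three cases would simply return the flanking booleans unchanged, which is why intraword emphasis works with "*" but not with "_".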

.char_at(source, byte_pos, range_end) ⇒ Object

Returns the character that starts at byte_pos, or nil if byte_pos is at or past range_end. A multi-byte character that would extend past range_end is truncated to the bytes inside the range.


# File 'lib/red_quilt/inline/flanking.rb', line 49

def char_at(source, byte_pos, range_end)
  return nil if byte_pos >= range_end

  b = source.getbyte(byte_pos)
  return BYTE_CHR[b] if b < 0x80

  len = if b < 0xC0
          1
        elsif b < 0xE0
          2
        elsif b < 0xF0
          3
        else
          4
        end
  source.byteslice(byte_pos, [len, range_end - byte_pos].min)
end
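
The length computation relies on the UTF-8 lead byte encoding the sequence length. The following self-contained sketch separates that step out; utf8_char_length and char_at_sketch are illustrative names, and the sketch omits the BYTE_CHR fast path used above.

```ruby
# Sequence length implied by a UTF-8 lead byte. Bytes below 0xC0 are ASCII
# or stray continuation bytes and are treated as one-byte units.
def utf8_char_length(lead_byte)
  if lead_byte < 0xC0
    1
  elsif lead_byte < 0xE0
    2
  elsif lead_byte < 0xF0
    3
  else
    4
  end
end

# Hypothetical stand-in for char_at without the shared-string fast path.
def char_at_sketch(source, byte_pos, range_end)
  return nil if byte_pos >= range_end

  len = utf8_char_length(source.getbyte(byte_pos))
  # Clamp so the slice never reads past the end of the inline range.
  source.byteslice(byte_pos, [len, range_end - byte_pos].min)
end

text = "a€*" # "€" occupies bytes 1..3, so "*" sits at byte 4
p char_at_sketch(text, 0, text.bytesize)
p char_at_sketch(text, 1, text.bytesize)
p char_at_sketch(text, 4, text.bytesize)
p char_at_sketch(text, 5, text.bytesize)
```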

.char_before(source, byte_pos, range_start) ⇒ Object

Returns the character immediately before byte_pos, or nil if byte_pos is at or before range_start (the start of the source or of the inline range).



# File 'lib/red_quilt/inline/flanking.rb', line 28

def char_before(source, byte_pos, range_start)
  return nil if byte_pos <= range_start

  prev = byte_pos - 1
  b = source.getbyte(prev)
  # ASCII fast path: shared 1-byte string (avoids byteslice + alloc).
  return BYTE_CHR[b] if b < 0x80

  # Walk back at most 4 bytes (the longest UTF-8 sequence) to find the
  # code point start; the lead byte of a 4-byte character is at byte_pos - 4.
  i = prev
  while i >= range_start && i >= byte_pos - 4
    b = source.getbyte(i)
    if b < 0x80 || b >= 0xC0
      return source.byteslice(i, byte_pos - i)
    end

    i -= 1
  end
  nil
end
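
The backward scan works because UTF-8 continuation bytes always fall in 0x80..0xBF, so the first byte outside that range starts the character. Below is a self-contained sketch of the scan that checks up to four bytes, the maximum UTF-8 sequence length; char_before_sketch is a hypothetical name and the ASCII fast path is left out.

```ruby
# Hypothetical stand-in for char_before without the shared-string fast path.
def char_before_sketch(source, byte_pos, range_start)
  return nil if byte_pos <= range_start

  i = byte_pos - 1
  # A UTF-8 sequence is at most 4 bytes long, so the lead byte is no
  # further back than byte_pos - 4.
  while i >= range_start && i >= byte_pos - 4
    b = source.getbyte(i)
    # Any byte that is not a continuation byte (0x80..0xBF) starts a character.
    return source.byteslice(i, byte_pos - i) if b < 0x80 || b >= 0xC0

    i -= 1
  end
  nil
end

# Byte layout: "é" at 0..1, "*" at 2, U+1F600 at 3..6, "_" at 7.
text = "é*\u{1F600}_"
p char_before_sketch(text, 2, 0)
p char_before_sketch(text, 3, 0)
p char_before_sketch(text, 7, 0)
p char_before_sketch(text, 0, 0)
```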

.left_flanking?(prev_char, next_char) ⇒ Boolean

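Returns true if a delimiter run between prev_char and next_char is left-flanking per the CommonMark spec: it is not followed by whitespace, and it is either not followed by punctuation or preceded by whitespace or punctuation.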

Returns:

  • (Boolean)


# File 'lib/red_quilt/inline/flanking.rb', line 86

def left_flanking?(prev_char, next_char)
  return false if whitespace?(next_char)
  return true unless punctuation?(next_char)

  whitespace?(prev_char) || punctuation?(prev_char)
end

.punctuation?(char) ⇒ Boolean

Returns true if char is a punctuation or symbol character. nil is never punctuation. Single-byte characters are looked up in the shared ASCII_PUNCT table; all others are matched against UNICODE_PUNCT_RE.

Returns:

  • (Boolean)


# File 'lib/red_quilt/inline/flanking.rb', line 76

def punctuation?(char)
  return false if char.nil?
  if char.bytesize == 1
    return ASCII_PUNCT[char.getbyte(0)]
  end

  UNICODE_PUNCT_RE.match?(char)
end

.right_flanking?(prev_char, next_char) ⇒ Boolean

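Returns true if a delimiter run between prev_char and next_char is right-flanking per the CommonMark spec: it is not preceded by whitespace, and it is either not preceded by punctuation or followed by whitespace or punctuation.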

Returns:

  • (Boolean)


# File 'lib/red_quilt/inline/flanking.rb', line 94

def right_flanking?(prev_char, next_char)
  return false if whitespace?(prev_char)
  return true unless punctuation?(prev_char)

  whitespace?(next_char) || punctuation?(next_char)
end
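
The two flanking predicates are mirror images of each other, which a few sample neighbours make concrete. The sketch below is self-contained: ws? and punct? are simplified regex-based stand-ins for the module's whitespace? and punctuation?, and all four method names are assumptions made for this example.

```ruby
# Simplified stand-ins for the module's predicates (not RedQuilt's code).
# nil models the start or end of the inline range and counts as whitespace.
def ws?(char)
  char.nil? || char.match?(/\A[[:space:]]\z/)
end

def punct?(char)
  !char.nil? && char.match?(/\A[\p{P}\p{S}]\z/)
end

def left_flanking_sketch?(prev_char, next_char)
  return false if ws?(next_char)
  return true unless punct?(next_char)

  ws?(prev_char) || punct?(prev_char)
end

def right_flanking_sketch?(prev_char, next_char)
  return false if ws?(prev_char)
  return true unless punct?(prev_char)

  ws?(next_char) || punct?(next_char)
end

# [prev_char, next_char] pairs around a delimiter run.
cases = [[nil, "a"], ["c", nil], ["c", "d"], [" ", " "], ["a", "\""]]
cases.each do |prev_char, next_char|
  flags = [left_flanking_sketch?(prev_char, next_char),
           right_flanking_sketch?(prev_char, next_char)]
  puts "#{prev_char.inspect} run #{next_char.inspect} => #{flags.inspect}"
end
```

The last pair corresponds to the run in a*"foo"*: a letter before and punctuation after make the run right-flanking only, so it cannot open emphasis.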

.whitespace?(char) ⇒ Boolean

Returns true if char is whitespace for flanking purposes. nil (the start or end of the inline range) counts as whitespace. Single-byte characters are looked up in ASCII_WHITESPACE; all others are matched against UNICODE_WHITESPACE_RE.

Returns:

  • (Boolean)


# File 'lib/red_quilt/inline/flanking.rb', line 67

def whitespace?(char)
  return true if char.nil?
  if char.bytesize == 1
    return ASCII_WHITESPACE[char.getbyte(0)]
  end

  UNICODE_WHITESPACE_RE.match?(char)
end