Class: Coradoc::Docx::Transform::StyleResolver

Inherits:
Object
Defined in:
lib/coradoc/docx/transform/style_resolver.rb

Overview

Resolves paragraph and run styles to semantic roles.

OOXML paragraphs don’t have explicit element types. Instead, their meaning is determined by style references (e.g., “Heading1” → section) or by formatting properties (e.g., numPr → list item).

StyleResolver centralizes this detection so HeadingRule, ListItemRule, and ParagraphRule don’t duplicate the logic.

The style map is built from the Uniword StylesConfiguration by walking all style definitions and their basedOn chains.
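A basedOn walk of this kind could look roughly like the sketch below. The real build_style_map is not shown on this page and may differ; STYLE_TABLE, based_on_chain, and inherited_heading_name are hypothetical names, and the hash layout is an assumption rather than the Uniword data model.

```ruby
# Hypothetical style table keyed by style id; not the Uniword data model.
STYLE_TABLE = {
  "Normal"      => { name: "Normal",       based_on: nil },
  "Heading1"    => { name: "heading 1",    based_on: "Normal" },
  "ChapterHead" => { name: "Chapter Head", based_on: "Heading1" }
}.freeze

# Collects style ids from `id` up to the root, stopping on unknown ids or cycles.
def based_on_chain(styles, id)
  chain = []
  while id && styles.key?(id) && !chain.include?(id)
    chain << id
    id = styles[id][:based_on]
  end
  chain
end

# Returns the first style name in the chain that looks like a heading, or nil.
def inherited_heading_name(styles, id)
  based_on_chain(styles, id)
    .map { |style_id| styles[style_id][:name] }
    .find { |name| name.match?(/^(heading|h)\s*(\d+)$/i) }
end

p based_on_chain(STYLE_TABLE, "ChapterHead")
p inherited_heading_name(STYLE_TABLE, "ChapterHead")
```

In this sketch a custom style such as "Chapter Head" is recognized as a heading only because one of its ancestors carries a heading-like name.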

Constant Summary

HEADING_PATTERN =
/^(heading|h)\s*(\d+)$/i
QUOTE_PATTERN =
/\bquote\b/i
CODE_PATTERN =
/\b(code|source|listing)\b/i
LITERAL_PATTERN =
/\bliteral\b/i
EXAMPLE_PATTERN =
/\bexample\b/i
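The effect of the anchors and word boundaries in these patterns is easiest to see on sample style names. The snippet below uses local copies that are equivalent to three of the constants above; the PatternDemo module and the sample names are illustrative only.

```ruby
# Local, equivalent copies of three patterns so the snippet runs standalone.
module PatternDemo
  HEADING = /^(heading|h)\s*(\d+)$/i
  QUOTE   = /\bquote\b/i
  CODE    = /\b(code|source|listing)\b/i
end

samples = ["Heading1", "heading 2", "H3", "Subheading 1",
           "Block Quote", "IntenseQuote", "Source Code", "SourceCode"]

samples.each do |name|
  hits = []
  hits << "heading(#{PatternDemo::HEADING.match(name)[2]})" if PatternDemo::HEADING.match?(name)
  hits << "quote" if PatternDemo::QUOTE.match?(name)
  hits << "code"  if PatternDemo::CODE.match?(name)
  puts format("%-14s %s", name, hits.empty? ? "(no match)" : hits.join(", "))
end
```

Because of the \b boundaries, run-together names such as "IntenseQuote" or "SourceCode" do not match, while the spaced forms "Block Quote" and "Source Code" do. Whether the resolver sees the spaced display name or the compact style id depends on resolve_style_name, which is not shown here.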

Constructor Details

#initialize(styles_configuration) ⇒ StyleResolver

Returns a new instance of StyleResolver.

Parameters:

  • styles_configuration (Object, nil)

    Uniword styles configuration



# File 'lib/coradoc/docx/transform/style_resolver.rb', line 25

def initialize(styles_configuration)
  @config = styles_configuration
  @style_map = build_style_map(styles_configuration)
end

Instance Method Details

#heading?(paragraph) ⇒ Boolean

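Checks whether the paragraph is a heading. A paragraph counts as a heading if its style name matches HEADING_PATTERN, or if a positive outline level is set on the paragraph properties or on the paragraph's style.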

Parameters:

  • paragraph (Uniword::Wordprocessingml::Paragraph)

Returns:

  • (Boolean)


# File 'lib/coradoc/docx/transform/style_resolver.rb', line 48

def heading?(paragraph)
  return false unless paragraph.properties

  # 1. A style name such as "Heading 1" or "h2" marks a heading.
  style_name = resolve_style_name(paragraph)
  return true if style_name && HEADING_PATTERN.match?(style_name)

  # 2. Otherwise, look for a positive outline level set directly on the paragraph.
  ol = paragraph.properties.outline_level
  if ol
    ol_level = ol.is_a?(Uniword::Wordprocessingml::OutlineLevel) ? ol.value.to_i : ol.to_i
    return true if ol_level.positive?
  end

  # 3. Finally, look for a positive outline level defined on the paragraph's style.
  style = find_style_for_paragraph(paragraph)
  if style&.outline_level
    ol_val = style.outline_level
    ol_val = ol_val.is_a?(Uniword::Wordprocessingml::OutlineLevel) ? ol_val.value.to_i : ol_val.to_i
    return true if ol_val.positive?
  end

  false
end

#heading_level(paragraph) ⇒ Integer?

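Returns the heading level, or nil if none can be determined. The level is taken from the digits in the style name (e.g., “Heading 2” → 2) or, failing that, from a positive outline level on the paragraph properties. Standard heading styles yield 1-6, but the value is not clamped. Unlike heading?, this method does not consult the outline level defined on the style, so it can return nil for a paragraph that heading? accepts.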

Parameters:

  • paragraph (Uniword::Wordprocessingml::Paragraph)

Returns:

  • (Integer, nil)


# File 'lib/coradoc/docx/transform/style_resolver.rb', line 73

def heading_level(paragraph)
  style_name = resolve_style_name(paragraph)
  if style_name
    match = HEADING_PATTERN.match(style_name)
    return match[2].to_i if match
  end

  # Check outline_level on paragraph properties
  ol = paragraph.properties&.outline_level
  if ol
    level = ol.is_a?(Uniword::Wordprocessingml::OutlineLevel) ? ol.value.to_i : ol.to_i
    return level if level.positive?
  end

  nil
end
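The lookup order can be tried without Uniword using stand-ins. In the sketch below, OutlineLevelStub and level_for are hypothetical; the stub only mimics an object that responds to value, and the style name is passed in directly instead of being resolved from a paragraph.

```ruby
# Stand-in for Uniword::Wordprocessingml::OutlineLevel (assumption: it wraps a value).
OutlineLevelStub = Struct.new(:value)

# Mirrors the order used by heading_level: style name first, then outline level.
def level_for(style_name, outline_level)
  if style_name && (match = /^(heading|h)\s*(\d+)$/i.match(style_name))
    return match[2].to_i
  end
  return nil if outline_level.nil?

  # Accept either a wrapper object or a plain number/string.
  level = outline_level.is_a?(OutlineLevelStub) ? outline_level.value.to_i : outline_level.to_i
  level.positive? ? level : nil
end

p level_for("Heading 3", nil)
p level_for("Body Text", OutlineLevelStub.new("2"))
p level_for("Body Text", 0)
```

A style-name match wins even when an outline level is also present, and an outline level of 0 is treated as "no level".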

#list_item?(paragraph) ⇒ Boolean

Checks whether the paragraph is a list item, i.e. whether its properties carry a positive num_id.

Parameters:

  • paragraph (Uniword::Wordprocessingml::Paragraph)

Returns:

  • (Boolean)


# File 'lib/coradoc/docx/transform/style_resolver.rb', line 93

def list_item?(paragraph)
  return false unless paragraph.properties

  num_id = paragraph.properties.num_id
  num_id.to_i.positive?
end
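The to_i call is what makes the check tolerant of missing or string-valued ids. A minimal illustration, assuming num_id may arrive as nil, an Integer, or a String (what Uniword actually returns is not stated on this page); numbered? is a hypothetical helper:

```ruby
# Same test as list_item?, applied to a bare value instead of a paragraph.
def numbered?(num_id)
  num_id.to_i.positive?
end

[nil, 0, "0", 1, "7"].each do |value|
  puts "#{value.inspect} => #{numbered?(value)}"
end
```

nil, 0, and "0" all collapse to 0 and are rejected, so a paragraph with no numbering reference is not treated as a list item.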

#role_from_style(paragraph) ⇒ Symbol?

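Returns the role implied by the paragraph's style name (:quote, :source, :literal, or :example), or nil if the paragraph has no style name or no pattern matches.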

Parameters:

  • paragraph (Uniword::Wordprocessingml::Paragraph)

Returns:

  • (Symbol, nil)


# File 'lib/coradoc/docx/transform/style_resolver.rb', line 103

def role_from_style(paragraph)
  style_name = resolve_style_name(paragraph)
  return nil unless style_name

  case style_name
  when QUOTE_PATTERN then :quote
  when CODE_PATTERN then :source
  when LITERAL_PATTERN then :literal
  when EXAMPLE_PATTERN then :example
  end
end
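Because case/when stops at the first matching branch, a style name containing several keywords takes the role listed first. The sketch below inlines the patterns so it runs standalone; role_for is a hypothetical helper that takes the style name directly.

```ruby
# Same branch order as role_from_style, with the patterns inlined.
def role_for(style_name)
  case style_name
  when /\bquote\b/i then :quote
  when /\b(code|source|listing)\b/i then :source
  when /\bliteral\b/i then :literal
  when /\bexample\b/i then :example
  end
end

["Block Quote", "Code Example", "Literal Block", "Body Text"].each do |name|
  puts "#{name} => #{role_for(name).inspect}"
end
```

"Code Example" resolves to :source rather than :example, since the code branch is tested earlier.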

#run_semantic_role(run) ⇒ Symbol?

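Detects the semantic role of a run (:monospace, :bold, or :italic) based on its rStyle. Returns nil if the run has no style or the style name matches none of the patterns.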

Parameters:

  • run (Uniword::Wordprocessingml::Run)

Returns:

  • (Symbol, nil)


# File 'lib/coradoc/docx/transform/style_resolver.rb', line 118

def run_semantic_role(run)
  return nil unless run.properties
  return nil unless run.properties.style

  style_name = resolve_run_style_name(run)
  return nil unless style_name

  case style_name
  when /\b(code|verbatim|teletype|keyboard)\b/i then :monospace
  when /\bstrong\b/i then :bold
  when /\b(emphasis|em)\b/i then :italic
  when /\bcitation\b/i then :italic
  end
end
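The run patterns rely on whole-word matches, which has some consequences worth checking by hand. In this sketch, run_role_for is a hypothetical helper taking the style name directly, and the sample names are illustrative rather than taken from any particular template.

```ruby
# Same branches as run_semantic_role, applied to a bare style name.
def run_role_for(style_name)
  case style_name
  when /\b(code|verbatim|teletype|keyboard)\b/i then :monospace
  when /\bstrong\b/i then :bold
  when /\b(emphasis|em)\b/i then :italic
  when /\bcitation\b/i then :italic
  end
end

["Verbatim Char", "Strong Emphasis", "Intense Emphasis", "Emphasized"].each do |name|
  puts "#{name} => #{run_role_for(name).inspect}"
end
```

"Strong Emphasis" maps to :bold because the strong branch comes first, and "Emphasized" maps to nil because neither "emphasis" nor "em" appears as a whole word.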

#semantic_role(paragraph) ⇒ Symbol

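Determines the semantic role of a paragraph. Heading detection takes precedence over list-item detection, which takes precedence over style-name roles; :paragraph is the fallback.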

Parameters:

  • paragraph (Uniword::Wordprocessingml::Paragraph)

Returns:

  • (Symbol)

    :heading, :list_item, :quote, :source, :literal, :example, or :paragraph



# File 'lib/coradoc/docx/transform/style_resolver.rb', line 35

def semantic_role(paragraph)
  return :heading if heading?(paragraph)
  return :list_item if list_item?(paragraph)

  style_role = role_from_style(paragraph)
  return style_role if style_role

  :paragraph
end
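The precedence is easiest to see when a paragraph qualifies for more than one role. The sketch below is a simplification under stated assumptions: ParaStub is a hypothetical flattened paragraph holding only a style name and a num_id, outline levels are ignored, and only two of the style roles are modelled.

```ruby
# Hypothetical flattened paragraph: just the facts this sketch inspects.
ParaStub = Struct.new(:style_name, :num_id, keyword_init: true)

# Same precedence as semantic_role: heading, then list item, then style role.
def sketch_semantic_role(para)
  name = para.style_name.to_s
  return :heading   if name.match?(/^(heading|h)\s*(\d+)$/i)
  return :list_item if para.num_id.to_i.positive?
  return :quote     if name.match?(/\bquote\b/i)
  return :source    if name.match?(/\b(code|source|listing)\b/i)

  :paragraph
end

p sketch_semantic_role(ParaStub.new(style_name: "Heading 2", num_id: 4))
p sketch_semantic_role(ParaStub.new(style_name: "Quote", num_id: 4))
p sketch_semantic_role(ParaStub.new(style_name: "Quote"))
p sketch_semantic_role(ParaStub.new(style_name: "Body Text"))
```

A numbered heading is reported as :heading, and a quote-styled paragraph inside a list is reported as :list_item, so callers that need both facts have to query role_from_style separately.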