Class: Coradoc::AsciiDoc::Transformer
Inherits: Parslet::Transform
- Object
- Parslet::Transform
- Coradoc::AsciiDoc::Transformer
Defined in:
- lib/coradoc/asciidoc/transformer.rb
- lib/coradoc/asciidoc/transformer/list_rules.rb
- lib/coradoc/asciidoc/transformer/misc_rules.rb
- lib/coradoc/asciidoc/transformer/text_rules.rb
- lib/coradoc/asciidoc/transformer/block_rules.rb
- lib/coradoc/asciidoc/transformer/header_rules.rb
- lib/coradoc/asciidoc/transformer/inline_rules.rb
- lib/coradoc/asciidoc/transformer/structural_rules.rb
Overview
Parslet::Transform subclass that converts a parsed AST into AsciiDoc model objects.
The transformer uses a modular rule system: each group of rules is defined in its own file for maintainability.
Rule modules (each autoloaded):
- HeaderRules: Document header, author, revision
- InlineRules: Inline formatting (bold, italic, etc.)
- TextRules: Text elements and paragraphs
- BlockRules: Block elements (example, admonition, etc.)
- ListRules: List items and list types
- StructuralRules: Sections, tables, documents
- MiscRules: Comments, attributes, media elements
Defined Under Namespace
Modules: BlockRules, HeaderRules, InlineRules, ListRules, MiscRules, StructuralRules, TextRules
Class Method Summary
- .build_table_cell(format, content) ⇒ Model::TableCell
  Helper method for building table cells with format specification.
- .extract_inline_content(data) ⇒ Object
  Helper method for extracting inline content (used by InlineRules).
- .extract_simple_inline_content(data) ⇒ Object
  Helper method for extracting simple inline content.
- .group_cells_into_rows(cells, explicit_col_count = nil) ⇒ Array<Model::TableRow>
  Group cells into rows based on column count.
- .infer_column_count(cells) ⇒ Object
  Infer column count from cells by looking for patterns where rows have consistent cell counts.
- .legacy_transform(syntax_tree) ⇒ Object
  Deprecated. Use Transformer.transform instead.
- .parse_block_content(text) ⇒ Array
  Parse block-level AsciiDoc content (for 'a' style cells).
- .parse_cols_attribute(attrs) ⇒ Integer?
  Parse the cols attribute to determine column count.
- .parse_inline_content(text, style = nil) ⇒ Array<TextElement>
  Helper method for parsing inline content from raw text. Used for table cells, where content is captured as raw text.
- .regroup_table_rows(rows, attrs = nil) ⇒ Array<Model::TableRow>
  Regroup parser-level rows into proper AsciiDoc rows.
- .transform(syntax_tree) ⇒ Object
  Transform a syntax tree using this transformer's rules.
Class Method Details
.build_table_cell(format, content) ⇒ Model::TableCell
Helper method for building table cells with format specification.

# File 'lib/coradoc/asciidoc/transformer.rb', line 166

def self.build_table_cell(format, content)
  cell_opts = {}

  # Extract style first for content parsing
  style = nil

  # Parse format specification if present
  if format.is_a?(Hash)
    # Colspan
    cell_opts[:colspan] = format[:colspan].to_i if format[:colspan]

    # Rowspan (remove leading dot)
    if format[:rowspan]
      rowspan_str = format[:rowspan].to_s
      rowspan_str = rowspan_str.sub(/^\./, '')
      cell_opts[:rowspan] = rowspan_str.to_i if rowspan_str.match?(/^\d+$/)
    end

    # Horizontal alignment
    cell_opts[:halign] = format[:halign].to_s if format[:halign]

    # Vertical alignment (remove leading dot)
    if format[:valign]
      valign_str = format[:valign].to_s
      valign_str = valign_str.sub(/^\./, '')
      cell_opts[:valign] = valign_str if %w[< ^ >].include?(valign_str)
    end

    # Style
    style = format[:style].to_s if format[:style]
    cell_opts[:style] = style

    # Repeat marker
    cell_opts[:repeat] = true if format[:repeat]
  elsif format.is_a?(String)
    # Parse format string like ".2+^.^" or "4+^" or ".3+a"
    # Format: [colspan][.rowspan][halign][valign][style][*]
    format_str = format.to_s

    # Parse colspan (digits before +)
    cell_opts[:colspan] = Regexp.last_match(1).to_i if format_str =~ /^(\d+)\+/

    # Parse rowspan (.digits)
    cell_opts[:rowspan] = Regexp.last_match(1).to_i if format_str =~ /\.(\d+)/

    # Parse horizontal alignment (^ < >)
    # Note: In AsciiDoc, ^ is center, < is left, > is right
    cell_opts[:halign] = Regexp.last_match(0) if format_str =~ /[<>^]/

    # Parse vertical alignment (.<. ^. >.)
    cell_opts[:valign] = Regexp.last_match(0)[1] if format_str =~ /\.[.^<>]/

    # Parse style (d=decimal, s=strong, e=emphasis, m=monospace, a=asciidoc, l=literal, h=header)
    style = Regexp.last_match(0) if format_str =~ /[dsemalhv]/
    cell_opts[:style] = style

    # Parse repeat marker
    cell_opts[:repeat] = true if format_str.include?('*')
  end

  # Parse content based on style
  parsed_content = parse_inline_content(content, style)
  cell_opts[:content] = parsed_content

  Model::TableCell.new(**cell_opts)
end
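To see what the String branch extracts from a cell format specifier, the following is a minimal standalone sketch that reuses the same regular expressions. The helper name `parse_cell_format` is hypothetical and not part of Coradoc. Unlike the listed method, it omits the `:style` key when no style letter is found and does not parse cell content.

```ruby
# Hypothetical helper (not part of Coradoc): mirrors the regexes used by the
# String branch of build_table_cell, without any model objects.
def parse_cell_format(format_str)
  opts = {}

  # Colspan: digits before "+" at the start of the string
  if (m = format_str.match(/^(\d+)\+/))
    opts[:colspan] = m[1].to_i
  end

  # Rowspan: digits after a dot
  if (m = format_str.match(/\.(\d+)/))
    opts[:rowspan] = m[1].to_i
  end

  # Horizontal alignment: first alignment character found
  if (m = format_str.match(/[<>^]/))
    opts[:halign] = m[0]
  end

  # Vertical alignment: the character that follows a dot
  if (m = format_str.match(/\.[.^<>]/))
    opts[:valign] = m[0][1]
  end

  # Style letter
  if (m = format_str.match(/[dsemalhv]/))
    opts[:style] = m[0]
  end

  opts[:repeat] = true if format_str.include?('*')
  opts
end

p parse_cell_format('4+^')    # colspan 4, centered horizontally
p parse_cell_format('.2+^.^') # rowspan 2, centered on both axes
p parse_cell_format('.3+a')   # rowspan 3, AsciiDoc style
```

The three sample strings are the ones named in the source comment.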
.extract_inline_content(data) ⇒ Object
Helper method for extracting inline content (used by InlineRules).

# File 'lib/coradoc/asciidoc/transformer.rb', line 43

def self.extract_inline_content(data)
  if data.is_a?(Hash) && data.key?(:content)
    data[:content]
  elsif data.is_a?(Array)
    data.map do |item|
      if item.is_a?(Hash) && item.key?(:text)
        text = item[:text]
        if text.is_a?(Model::Base) && text.class.attributes.key?(:content)
          text.content
        elsif text.is_a?(Model::Base)
          text
        else
          text.to_s
        end
      else
        item
      end
    end
  else
    data
  end
end
.extract_simple_inline_content(data) ⇒ Object
Helper method for extracting simple inline content.

# File 'lib/coradoc/asciidoc/transformer.rb', line 67

def self.extract_simple_inline_content(data)
  if data.is_a?(Hash) && data.key?(:content)
    data[:content]
  elsif data.is_a?(Array)
    data.map do |item|
      item.is_a?(Hash) && item.key?(:text) ? item[:text].to_s : item
    end.join
  else
    data
  end
end
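The three input shapes this helper accepts can be illustrated with a standalone copy of its logic. The function name `simple_inline_content` and the sample inputs below are assumptions made for illustration only; they are not taken from Coradoc's parser output.

```ruby
# Standalone copy of the logic for illustration; the name is hypothetical.
def simple_inline_content(data)
  if data.is_a?(Hash) && data.key?(:content)
    data[:content] # hash with :content is unwrapped
  elsif data.is_a?(Array)
    # :text hashes are stringified, other items are kept, then all are joined
    data.map { |item| item.is_a?(Hash) && item.key?(:text) ? item[:text].to_s : item }.join
  else
    data # anything else passes through unchanged
  end
end

p simple_inline_content({ content: 'wrapped' })
p simple_inline_content([{ text: 'Hello' }, ', ', { text: :world }])
p simple_inline_content('plain')
```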
.group_cells_into_rows(cells, explicit_col_count = nil) ⇒ Array<Model::TableRow>
Group cells into rows based on column count.
AsciiDoc table row semantics:
- Column count is determined by the cols attribute or the first row
- A new row starts when the previous row has `column_count` cells
- Cells with colspan > 1 take multiple column slots

# File 'lib/coradoc/asciidoc/transformer.rb', line 278

def self.group_cells_into_rows(cells, explicit_col_count = nil)
  return [] if cells.nil? || cells.empty?

  # Normalize cells to ensure they're TableCell objects
  normalized_cells = cells.map do |cell|
    case cell
    when Model::TableCell
      cell
    when Hash
      content = cell[:text] || cell[:content] || ''
      Model::TableCell.new(content: parse_inline_content(content))
    else
      Model::TableCell.new(content: parse_inline_content(cell))
    end
  end

  # Determine column count
  # If explicit_col_count is provided, use it
  # Otherwise, count cells until we find a row boundary
  col_count = explicit_col_count

  if col_count.nil? || col_count.zero?
    # Infer from first row - count cells until we have a complete row
    # A complete row is when the total column slots equals a consistent number
    col_count = infer_column_count(normalized_cells)
  end

  # If still no column count, assume all cells are one row
  col_count = normalized_cells.size if col_count.nil? || col_count.zero?

  # Group cells into rows
  rows = []
  current_row_cells = []
  current_col_slots = 0

  normalized_cells.each do |cell|
    # Get colspan (default 1)
    colspan = cell.is_a?(Model::TableCell) && cell.colspan ? cell.colspan : 1

    current_row_cells << cell
    current_col_slots += colspan

    # Check if row is complete
    next unless current_col_slots >= col_count

    rows << Model::TableRow.new(columns: current_row_cells)
    current_row_cells = []
    current_col_slots = 0
  end

  # Handle remaining cells (incomplete last row)
  rows << Model::TableRow.new(columns: current_row_cells) if current_row_cells.any?

  rows
end
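The slot-counting loop can be sketched without the model classes. In this minimal example, cells are plain hashes with hypothetical `:text` and `:colspan` keys, and `group_by_slots` is an illustrative name, not a Coradoc method. It assumes the column count is already known.

```ruby
# Illustrative sketch: groups cells into rows by accumulated column slots.
# Cells are plain hashes here, standing in for Model::TableCell.
def group_by_slots(cells, col_count)
  rows = []
  current = []
  slots = 0

  cells.each do |cell|
    current << cell[:text]
    slots += cell[:colspan] || 1 # colspan defaults to 1

    # A row closes once its slots reach (or overshoot) the column count
    next unless slots >= col_count

    rows << current
    current = []
    slots = 0
  end

  rows << current if current.any? # incomplete last row is kept
  rows
end

cells = [
  { text: 'A', colspan: 2 }, { text: 'B' },
  { text: 'C' }, { text: 'D' }, { text: 'E' },
  { text: 'F' }
]
p group_by_slots(cells, 3)
```

Because the comparison is `>=`, a cell whose colspan overshoots the remaining slots still closes the row rather than raising an error.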
.infer_column_count(cells) ⇒ Object
Infer column count from cells. Look for patterns where rows have consistent cell counts.

# File 'lib/coradoc/asciidoc/transformer.rb', line 336

def self.infer_column_count(cells)
  return nil if cells.nil? || cells.empty?

  col_slots = cells.map do |cell|
    cell.is_a?(Model::TableCell) && cell.colspan ? cell.colspan : 1
  end

  total_cells = col_slots.sum

  # Find all valid column counts
  possible_cols = (1..[total_cells, 12].min).select do |candidate|
    next false if candidate > total_cells
    next false if total_cells % candidate != 0

    slots_used = 0
    valid = true

    col_slots.each do |slots|
      slots_used += slots
      if slots_used == candidate
        slots_used = 0
      elsif slots_used > candidate
        valid = false
        break
      end
    end

    valid && slots_used.zero?
  end

  possible_cols.max || col_slots.first || 1
end
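The heuristic is easier to follow on bare colspan arrays. The sketch below reimplements the listed logic over an array of integers. The function name `infer_from_slots` is hypothetical, and the inputs are made up for illustration.

```ruby
# Illustrative reimplementation over an array of colspans (integers).
def infer_from_slots(col_slots)
  return nil if col_slots.empty?

  total = col_slots.sum

  # A candidate is valid if it divides the total and every row fills exactly
  candidates = (1..[total, 12].min).select do |candidate|
    next false unless (total % candidate).zero?

    used = 0
    fits = col_slots.all? do |slots|
      used += slots
      used = 0 if used == candidate # row boundary reached exactly
      used <= candidate             # false when a cell overshoots the row
    end
    fits && used.zero?
  end

  # The largest valid candidate wins
  candidates.max || col_slots.first || 1
end

p infer_from_slots([1] * 6)      # → 6
p infer_from_slots([1] * 24)     # → 12
p infer_from_slots([2, 1, 1, 2]) # → 6
p infer_from_slots([1] * 13)     # → 1
```

Since the largest valid candidate is chosen, a flat run of up to 12 single-slot cells is inferred as one row, and 13 single-slot cells fall back to one column. This suggests that passing an explicit column count (for example from the cols attribute) gives more predictable grouping than relying on inference.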
.legacy_transform(syntax_tree) ⇒ Object
Deprecated. Use transform instead.

# File 'lib/coradoc/asciidoc/transformer.rb', line 401

def self.legacy_transform(syntax_tree)
  new.apply(syntax_tree)
end
.parse_block_content(text) ⇒ Array
Parse block-level AsciiDoc content (for 'a' style cells).

# File 'lib/coradoc/asciidoc/transformer.rb', line 112

def self.parse_block_content(text)
  return [Coradoc::AsciiDoc::Model::TextElement.new(content: '')] if text.nil? || text.to_s.strip.empty?

  parser = Coradoc::AsciiDoc::Parser::Base.new
  text_str = text.to_s

  # Try parsing as a list if content contains list markers
  # List markers can appear after other content (e.g., "Title:\n\n* item")
  if /^(\*+|-+|\d+\.)/m.match?(text_str)
    # Extract just the list portion
    list_match = text_str.match(/\n(\*+|-+|\d+\.)(.*)$/m)
    if list_match
      list_text = list_match[1] + list_match[2]
      begin
        ast = parser.list.parse(list_text)
        transformed = new.apply(ast)

        # Parse the text before the list as inline content
        before_list = text_str[0, list_match.begin(1) - 1].strip
        before_elements = []
        unless before_list.empty?
          begin
            before_ast = parser.text_any.parse(before_list)
            before_transformed = new.apply(before_ast)
            before_array = before_transformed.is_a?(Array) ? before_transformed : [before_transformed]
            before_elements = [Coradoc::AsciiDoc::Model::TextElement.new(content: before_array)]
          rescue Parslet::ParseFailed
            before_elements = [Coradoc::AsciiDoc::Model::TextElement.new(content: before_list)]
          end
        end

        return before_elements + [transformed]
      rescue Parslet::ParseFailed
        # Fall through to inline parsing
      end
    end
  end

  # Try parsing as inline content
  begin
    ast = parser.text_any.parse(text_str)
    transformed = new.apply(ast)
    content_array = transformed.is_a?(Array) ? transformed : [transformed]
    [Coradoc::AsciiDoc::Model::TextElement.new(content: content_array)]
  rescue Parslet::ParseFailed
    # If parsing fails, return the text as a simple TextElement
    [Coradoc::AsciiDoc::Model::TextElement.new(content: text_str)]
  end
end
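The list-detection step relies on two regular expressions to separate leading text from the list portion. The sketch below isolates only that split, with no parsing or transformation. `split_text_and_list` is a hypothetical helper written for this illustration.

```ruby
# Hypothetical helper: isolates the regex-based split used before list parsing.
# Returns [text_before_list, list_text]; list_text is nil when nothing is split.
def split_text_and_list(text)
  # Is there any line that starts with a list marker (*, -, or "1.")?
  return [text, nil] unless /^(\*+|-+|\d+\.)/m.match?(text)

  # Find the first marker that directly follows a newline
  match = text.match(/\n(\*+|-+|\d+\.)(.*)$/m)
  return [text, nil] unless match

  before = text[0, match.begin(1) - 1].strip # everything ahead of that newline
  [before, match[1] + match[2]]
end

p split_text_and_list("Title:\n\n* item one\n* item two")
p split_text_and_list('just text')
```

Note that the second pattern requires a newline before the marker, so the split point is always the first marker that is not at the very start of the text.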
.parse_cols_attribute(attrs) ⇒ Integer?
Parse the cols attribute to determine column count.

# File 'lib/coradoc/asciidoc/transformer.rb', line 236

def self.parse_cols_attribute(attrs)
  return nil if attrs.nil?

  # Get the cols value from named attributes
  cols_value = if attrs.is_a?(Model::AttributeList)
                 attrs.named.find { |n| n.name.to_s == 'cols' }&.value
               elsif attrs.is_a?(Hash)
                 attrs['cols'] || attrs[:cols]
               end

  return nil if cols_value.nil?

  # cols can be:
  # - A single number: "3" -> 3 columns
  # - A list: "1,2,1" -> 3 columns
  # - With multipliers: "3*" -> 3 columns
  # - Quoted: "\"3\"" -> 3 columns
  cols_str = cols_value.is_a?(Array) ? cols_value.first.to_s : cols_value.to_s

  # Remove surrounding quotes if present
  cols_str = cols_str.gsub(/^["']|["']$/, '')

  # Handle multiplier syntax: "3*" means 3 columns
  return Regexp.last_match(1).to_i if cols_str =~ /^(\d+)\*$/

  # Handle comma-separated list: count the parts
  return cols_str.split(',').size if cols_str.include?(',')

  # Single number
  cols_str.to_i if /^\d+$/.match?(cols_str)
end
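The four cols forms listed in the source comments can be exercised with a standalone copy of the string-handling part. `column_count_from_cols` is a hypothetical name. The sketch takes the raw cols value directly and skips the attribute lookup.

```ruby
# Standalone copy of the string handling; takes the raw cols value directly.
def column_count_from_cols(cols_value)
  return nil if cols_value.nil?

  cols_str = cols_value.is_a?(Array) ? cols_value.first.to_s : cols_value.to_s
  cols_str = cols_str.gsub(/^["']|["']$/, '') # strip surrounding quotes

  return Regexp.last_match(1).to_i if cols_str =~ /^(\d+)\*$/ # "3*"
  return cols_str.split(',').size if cols_str.include?(',')   # "1,2,1"

  cols_str.to_i if /^\d+$/.match?(cols_str)                   # "3"
end

p column_count_from_cols('3')     # → 3
p column_count_from_cols('1,2,1') # → 3
p column_count_from_cols('3*')    # → 3
p column_count_from_cols('"3"')   # → 3
p column_count_from_cols('wide')  # → nil
```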
.parse_inline_content(text, style = nil) ⇒ Array<TextElement>
Helper method for parsing inline content from raw text. This is used for table cells, where content is captured as raw text.

# File 'lib/coradoc/asciidoc/transformer.rb', line 84

def self.parse_inline_content(text, style = nil)
  return [Coradoc::AsciiDoc::Model::TextElement.new(content: '')] if text.nil? || text.to_s.strip.empty?

  # For AsciiDoc style cells, parse as block content
  return parse_block_content(text) if style == 'a'

  # For literal style cells, preserve text as-is
  return [Coradoc::AsciiDoc::Model::TextElement.new(content: text.to_s)] if style == 'l'

  # For default cells, parse inline content
  parser = Coradoc::AsciiDoc::Parser::Base.new
  begin
    ast = parser.text_any.parse(text.to_s)
    # Transform the AST to model objects
    transformed = new.apply(ast)
    # Wrap in TextElement
    content_array = transformed.is_a?(Array) ? transformed : [transformed]
    [Coradoc::AsciiDoc::Model::TextElement.new(content: content_array)]
  rescue Parslet::ParseFailed
    # If parsing fails, return the text as a simple TextElement
    [Coradoc::AsciiDoc::Model::TextElement.new(content: text.to_s)]
  end
end
.regroup_table_rows(rows, attrs = nil) ⇒ Array<Model::TableRow>
Regroup parser-level rows into proper AsciiDoc rows. The parser produces one “row” per line; this flattens all cells and regroups by the cols attribute, then marks the first row as header.

# File 'lib/coradoc/asciidoc/transformer.rb', line 376

def self.regroup_table_rows(rows, attrs = nil)
  return rows if rows.nil? || rows.empty?

  col_count = parse_cols_attribute(attrs)
  all_cells = rows.flat_map do |r|
    r.is_a?(Model::TableRow) ? r.columns : []
  end

  return rows if all_cells.empty?

  grouped = group_cells_into_rows(all_cells, col_count)
  grouped.first.header = true unless grouped.empty?
  grouped
end
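The flatten-then-regroup flow can be sketched with lightweight stand-ins for the model classes. In this example `Cell` and `Row` are hypothetical Structs, not Coradoc's Model::TableCell and Model::TableRow. The column count is passed in directly instead of being read from the cols attribute.

```ruby
# Hypothetical stand-ins for the model classes, for illustration only.
Cell = Struct.new(:text, :colspan)
Row = Struct.new(:columns, :header)

# Flatten per-line rows, regroup by column count, mark the first row as header.
def regroup(rows, col_count)
  cells = rows.flat_map(&:columns)
  return rows if cells.empty?

  grouped = []
  current = []
  slots = 0
  cells.each do |cell|
    current << cell
    slots += cell.colspan || 1
    next unless slots >= col_count

    grouped << Row.new(current, false)
    current = []
    slots = 0
  end
  grouped << Row.new(current, false) if current.any?

  grouped.first.header = true
  grouped
end

# One parser-level "row" per source line, as described above
line_rows = %w[Name Role Ada Engineer].map { |t| Row.new([Cell.new(t)], false) }
result = regroup(line_rows, 2)
result.each { |row| p [row.columns.map(&:text), row.header] }
```

As in the listed method, the first regrouped row is always marked as the header, whether or not the source table declares one.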
.transform(syntax_tree) ⇒ Object
Transform a syntax tree using this transformer's rules.

# File 'lib/coradoc/asciidoc/transformer.rb', line 395

def self.transform(syntax_tree)
  new.apply(syntax_tree)
end