Module: Sourcerer::SourceSkim

Defined in:
lib/sourcerer/source_skim.rb,
lib/sourcerer/source_skim/config.rb,
lib/sourcerer/source_skim/skimmer.rb,
lib/sourcerer/source_skim/markdown_skimmer.rb,
lib/asciidoctor/extensions/source-skim-tree-processor/extension.rb

Overview

SourceSkim produces machine-oriented skims of markup source documents.

A skim is a structured, JSON-ready representation of selected source elements intended to help automated tooling inspect documentation source and identify likely areas of interest when related product code changes.

AsciiDoc files are fully parsed by Asciidoctor and yield rich semantic output (sections, attributes, code blocks, tables, etc.). Markdown files yield frontmatter and section headings only, since Markdown has no standardised semantic block model.
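
To make "frontmatter and section headings only" concrete, here is a minimal sketch of that kind of extraction. The helper name `sketch_markdown_skim` and the keys of the returned hash are assumptions made for illustration; this is not MarkdownSkimmer's implementation, and the real skim's shape may differ.

```ruby
require 'yaml'

# Hypothetical helper, NOT MarkdownSkimmer itself. The output keys are assumed.
# Splits optional YAML frontmatter from the body, then collects ATX headings.
def sketch_markdown_skim(source)
  frontmatter = {}
  body = source
  # Frontmatter: a leading block delimited by '---' lines.
  if (m = source.match(/\A---\s*\n(.*?)\n---\s*\n/m))
    frontmatter = YAML.safe_load(m[1]) || {}
    body = m.post_match
  end

  in_fence = false
  sections = []
  body.each_line do |line|
    # Track fenced code so comment lines such as '# note' are not read as headings.
    if line.match?(/\A\s*(`{3,}|~{3,})/)
      in_fence = !in_fence
      next
    end
    next if in_fence

    # ATX heading: one to six '#' characters, whitespace, then the title.
    if (h = line.match(/\A([#]{1,6})[ \t]+(.+?)\s*\z/))
      sections << { 'level' => h[1].length, 'title' => h[2] }
    end
  end

  { 'frontmatter' => frontmatter, 'sections' => sections }
end

sample = "---\ntitle: Guide\n---\n# Intro\n\n## Setup\n"
p sketch_markdown_skim(sample)
```

The sketch ignores Setext headings and anything below heading level, which matches the limited Markdown output described above only in spirit.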

The format is auto-detected from the file extension when using skim_file. Pass format: :markdown or format: :asciidoc to skim_string to disambiguate when there is no path to inspect.

Examples:

Skim an AsciiDoc file (auto-detected)

skim = Sourcerer::SourceSkim.skim_file('docs/install.adoc')

Skim a Markdown file (auto-detected)

skim = Sourcerer::SourceSkim.skim_file('docs/guide.md')

Skim with both tree and flat section shapes

skim = Sourcerer::SourceSkim.skim_file('docs/install.adoc', forms: [:tree, :flat])

Skim a Markdown string explicitly

skim = Sourcerer::SourceSkim.skim_string(content, format: :markdown)

Skim with caller-supplied Asciidoctor attribute overrides

skim = Sourcerer::SourceSkim.skim_file('docs/ref.adoc', attributes: { 'env' => 'prod' })

Defined Under Namespace

Classes: Config, MarkdownSkimmer, Skimmer, TreeProcessorExtension

Constant Summary

NULL_LOGGER =
  Logger.new(IO::NULL)
LOAD_OPTS =
  { safe: :safe, sourcemap: true, logger: NULL_LOGGER,
    attributes: { 'skip-front-matter' => '' } }.freeze
ALL_CATEGORIES =

All recognized element categories. Sections are shape-controlled via forms rather than listed here.

%i[
  attributes_custom
  attributes_builtin
  definition_lists
  code_blocks
  literal_blocks
  examples
  sidebars
  tables
  admonitions
  quotes
  images
].freeze
DEFAULT_CATEGORIES =

Categories included when a caller passes categories: nil (the default). attributes_builtin, admonitions, and quotes are opt-in only.

(ALL_CATEGORIES - %i[attributes_builtin admonitions quotes]).freeze
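
The relationship between the two constants is plain array subtraction, so an opt-in category can be added back the same way. The sketch below uses local copies of the documented values so that it runs without the library; the variable names and the unknown-name check are illustrative assumptions, not part of SourceSkim.

```ruby
# Local copies of the documented constants (lowercase locals, so no library is needed).
all_categories = %i[
  attributes_custom attributes_builtin definition_lists code_blocks
  literal_blocks examples sidebars tables admonitions quotes images
].freeze
opt_in_only = %i[attributes_builtin admonitions quotes]
default_categories = (all_categories - opt_in_only).freeze

# Opting in: start from the defaults and append the extra categories you want.
requested = default_categories + %i[admonitions]

# Hypothetical guard (an assumption, not library behavior): spot misspelled names.
unknown = requested - all_categories

puts "defaults:  #{default_categories.length} of #{all_categories.length} categories"
puts "requested: #{requested.inspect}"
puts "unknown:   #{unknown.inspect}"
```

With the library loaded, the equivalent list would presumably be passed as the categories: argument, for example categories: Sourcerer::SourceSkim::DEFAULT_CATEGORIES + %i[admonitions].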

Class Method Details

.skim_doc(doc, forms: [:tree], categories: nil) ⇒ Hash

Skim an already-parsed Asciidoctor document.

This entry point is useful when the document has been loaded through other means, such as from an Asciidoctor extension callback.

Parameters:

  • doc (Asciidoctor::Document)

    parsed document object

  • forms (Array<Symbol>) (defaults to: [:tree])

    section shape(s) to emit: :tree, :flat, or both

  • categories (Array<Symbol>, nil) (defaults to: nil)

    element categories to include; nil uses DEFAULT_CATEGORIES

Returns:

  • (Hash)

    JSON-ready skim



# File 'lib/sourcerer/source_skim.rb', line 106

def self.skim_doc doc, forms: [:tree], categories: nil
  config = Config.new(forms: forms, categories: categories)
  Skimmer.new.process(doc, config: config)
end

.skim_file(file_path, forms: nil, format: nil, categories: nil, attributes: {}) ⇒ Hash

Skim the markup file at file_path.

Format is auto-detected from the file extension (.adoc → AsciiDoc; .md / .markdown → Markdown). Override with format: :asciidoc or format: :markdown.
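
The extension mapping can be sketched as below. The private detect_format helper is not shown on this page, so `sketch_detect_format` is a hypothetical stand-in; in particular, the case-insensitive match and the fallback to :asciidoc for unrecognized extensions are assumptions.

```ruby
# Hypothetical stand-in for the private detect_format helper (not shown on this page).
# Assumptions: extensions match case-insensitively; unknown extensions mean AsciiDoc.
def sketch_detect_format(path)
  case File.extname(path).downcase
  when '.md', '.markdown' then :markdown
  else :asciidoc
  end
end

%w[docs/install.adoc docs/guide.md notes/CHANGES.markdown].each do |path|
  puts "#{path} -> #{sketch_detect_format(path)}"
end
```

An explicit format: argument bypasses detection entirely, because the listing for skim_file evaluates `format || detect_format(file_path)`.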

Parameters:

  • file_path (String)

    path to the source file

  • forms (Array<Symbol>, nil) (defaults to: nil)

    section shape(s) to emit: :tree, :flat, or both. Defaults to [:tree] for AsciiDoc and [:flat] for Markdown.

  • format (Symbol, nil) (defaults to: nil)

    :asciidoc or :markdown; nil auto-detects

  • categories (Array<Symbol>, nil) (defaults to: nil)

    AsciiDoc only. Element categories to include; nil uses DEFAULT_CATEGORIES. Silently ignored for Markdown.

  • attributes (Hash{String => String}) (defaults to: {})

    AsciiDoc only. Asciidoctor attribute overrides. Silently ignored for Markdown.

Returns:

  • (Hash)

    JSON-ready skim



# File 'lib/sourcerer/source_skim.rb', line 60

def self.skim_file file_path, forms: nil, format: nil, categories: nil, attributes: {}
  fmt = format || detect_format(file_path)
  if fmt == :markdown
    config = Config.new(forms: forms || [:flat])
    MarkdownSkimmer.new.process(File.read(file_path), config: config)
  else
    attrs = LOAD_OPTS[:attributes].merge(attributes)
    opts  = LOAD_OPTS.merge(attributes: attrs)
    doc   = Asciidoctor.load_file(file_path, **opts)
    skim_doc(doc, forms: forms || [:tree], categories: categories)
  end
end
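
The two merge calls in the AsciiDoc branch are worth a closer look: caller attributes are layered over the built-in ones, and the frozen LOAD_OPTS hash is never mutated. The sketch below reproduces just that step with a local copy of the options (the logger entry is left out so the snippet has no dependencies).

```ruby
# Local copy of the documented LOAD_OPTS, minus the logger, for a standalone run.
load_opts = { safe: :safe, sourcemap: true,
              attributes: { 'skip-front-matter' => '' } }.freeze

caller_attrs = { 'env' => 'prod' }

# Hash#merge returns a new hash; on a key collision the caller's value wins.
attrs = load_opts[:attributes].merge(caller_attrs)
opts  = load_opts.merge(attributes: attrs)

puts opts[:attributes].keys.sort.inspect
puts load_opts[:attributes].keys.inspect # the original hash is unchanged
```

One consequence of the merge order is that a caller who passes 'skip-front-matter' in attributes replaces the built-in value for that key.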

.skim_string(content, format: :asciidoc, forms: nil, categories: nil, attributes: {}) ⇒ Hash

Skim markup source from a content string.

Pass format: :markdown when the content is Markdown, since there is no file extension to inspect. The format defaults to :asciidoc for backward compatibility, so Markdown content passed without it is parsed as AsciiDoc.

Parameters:

  • content (String)

    raw markup text

  • format (Symbol) (defaults to: :asciidoc)

    :asciidoc (default) or :markdown

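  • forms (Array<Symbol>, nil) (defaults to: nil)

    section shape(s) to emit: :tree, :flat, or both. Defaults to [:tree] for AsciiDoc and [:flat] for Markdown.

  • categories (Array<Symbol>, nil) (defaults to: nil)

    AsciiDoc only. Element categories to include; nil uses DEFAULT_CATEGORIES. Silently ignored for Markdown.

  • attributes (Hash{String => String}) (defaults to: {})

    AsciiDoc only. Asciidoctor attribute overrides. Silently ignored for Markdown.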

Returns:

  • (Hash)

    JSON-ready skim



# File 'lib/sourcerer/source_skim.rb', line 85

def self.skim_string content, format: :asciidoc, forms: nil, categories: nil, attributes: {}
  if format == :markdown
    config = Config.new(forms: forms || [:flat])
    MarkdownSkimmer.new.process(content, config: config)
  else
    attrs = LOAD_OPTS[:attributes].merge(attributes)
    opts  = LOAD_OPTS.merge(attributes: attrs)
    doc   = Asciidoctor.load(content, **opts)
    skim_doc(doc, forms: forms || [:tree], categories: categories)
  end
end
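
Both skim_file and skim_string pick the section shape with the same `forms || [...]` fallback. The sketch below isolates that rule in a hypothetical helper, `sketch_resolve_forms`, which is not part of the library.

```ruby
# Hypothetical helper mirroring the `forms || [...]` fallback in the listings above.
def sketch_resolve_forms(format, forms = nil)
  default = format == :markdown ? [:flat] : [:tree]
  # Only nil (or false) triggers the default; any array, even an empty one, is kept.
  forms || default
end

puts sketch_resolve_forms(:markdown).inspect
puts sketch_resolve_forms(:asciidoc).inspect
puts sketch_resolve_forms(:markdown, %i[tree flat]).inspect
```

Because the fallback tests for nil rather than emptiness, an explicit empty array is passed through to Config unchanged. How Config treats an empty list is not documented on this page.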