Purpose

DocBook is a Ruby gem for parsing, validating, and converting DocBook 5 XML documents into interactive HTML readers and structured JSON. It provides both a command-line interface and a Ruby API for document processing.

Key features:

  • Interactive reader — Self-contained HTML with search, themes, TOC navigation

  • DocBook 5 parsing — Full element support via lutaml-model

  • XInclude resolution — Transparent handling of multi-file documents

  • DocbookMirror JSON — ProseMirror-compatible structured representation

  • Output formats — Inline, DOM (SEO), distribution directory, split-pages, and chunked (on-demand loading) modes

  • Library support — Multi-book collections with covers and manifests

Installation

Add the gem to your Gemfile:

gem 'docbook'

Or install it directly:

$ gem install docbook

Quick start

CLI

Build an interactive reader from any DocBook XML:

$ docbook build guide.xml

This produces guide.html in the same directory. Open it in a browser — it works offline with no server required.

Specify a custom output path:

$ docbook build guide.xml -o output.html

Try the bundled demo:

$ docbook build --demo

Build an SEO-friendly reader with pre-rendered content:

$ docbook build guide.xml --format dom -o reader.html

Library

Build a multi-book library:

$ docbook library my-library/ --format dist -o /var/www/library/

Ruby API

require 'docbook'

# Build an interactive reader (any format)
Docbook::Output::Builder.new(
  xml_path: 'guide.xml',
  output_path: 'guide.html',
  format: :inline  # or :dom, :dist, :paged, :chunked
).build

# Build a multi-book library
Docbook::Output::LibraryBuilder.new(
  input_path: 'library.yml',
  output_path: 'library.html',
  format: :inline
).build

# Export as DocbookMirror JSON
parsed = Docbook::Document.from_xml(File.read('guide.xml'))
json = Docbook::Output::DocbookMirror.new(parsed).to_pretty_json
File.write('guide.json', json)

CLI reference

Command   Description

build     Build an interactive HTML reader from DocBook XML. Pass --demo for
          the bundled sample, or --demo=xslTNG / --demo=model-flow. Supports
          --format to choose output format.
library   Build a multi-book library from a directory or manifest
export    Export as DocbookMirror JSON
validate  Validate DocBook XML against RELAX NG schema
format    Format/prettify DocBook XML

build

Build an interactive HTML reader. With the default inline format, the output is a single self-contained file that works offline (CSS, JS, and images are all inlined). The dist, paged, and chunked formats write a directory instead.

$ docbook build [INPUT] [options]
$ docbook build --demo [options]

Options:

  • -o, --output FILE — Output HTML file path (default: <input>.html or demo.html)

  • --demo — Use the bundled DocBook sample as input (optionally with a name: --demo=xslTNG or --demo=model-flow)

  • --format FORMAT — Output format: inline (default), dom, dist, paged, or chunked

  • --image-search-dir DIR — Directories to search for images (default: XML file directory)

  • --image-strategy STRATEGY — Image handling: data_url (default), file_url, or relative

  • --title TITLE — Page title (default: derived from document)

  • --sort-glossary — Sort glossary entries alphabetically

  • --xinclude / --no-xinclude — Resolve XIncludes (default: true)

Examples:

# Basic (output defaults to book.html)
$ docbook build book.xml

# Custom output path
$ docbook build book.xml -o output.html

# SEO-friendly with pre-rendered content
$ docbook build book.xml --format dom -o output.html

# Distribution directory (separate assets)
$ docbook build book.xml --format dist -o /var/www/book/

# Split into individual pages
$ docbook build book.xml --format paged -o /var/www/book/

# On-demand section loading (best for large books)
$ docbook build book.xml --format chunked -o /var/www/book/

# With custom image paths
$ docbook build book.xml --image-search-dir media/ images/

# For web hosting (external image files)
$ docbook build book.xml --image-strategy relative

# Bundled demo
$ docbook build --demo

# Named demo
$ docbook build --demo=model-flow
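
When builds are driven from a script (a Rakefile, a CI job), it can help to assemble the command line from a Ruby hash instead of concatenating strings. The following is only a sketch: docbook_build_argv is a hypothetical helper, not part of the gem, and it uses nothing beyond the flags documented above.

```ruby
require 'shellwords'

# Hypothetical helper (not part of the gem): turns keyword options into an
# argument list for `docbook build`, using only the flags documented above.
def docbook_build_argv(input, output: nil, format: nil, image_strategy: nil,
                       image_search_dirs: [], title: nil,
                       sort_glossary: false, xinclude: true)
  argv = ['docbook', 'build', input]
  argv += ['-o', output] if output
  argv += ['--format', format.to_s] if format
  argv += ['--image-strategy', image_strategy.to_s] if image_strategy
  argv += ['--image-search-dir', *image_search_dirs] unless image_search_dirs.empty?
  argv += ['--title', title] if title
  argv << '--sort-glossary' if sort_glossary
  argv << '--no-xinclude' unless xinclude
  argv
end

argv = docbook_build_argv('book.xml', output: '/var/www/book/',
                          format: :chunked, title: 'My Book')

# Print a shell-safe version of the command for logging.
puts Shellwords.join(argv)

# To actually run it (requires the gem to be installed):
#   system(*argv) or raise 'docbook build failed'
```

Passing the arguments to system as a list, rather than as one string, avoids shell quoting problems with titles or paths that contain spaces.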

export

Export DocBook XML as DocbookMirror JSON (ProseMirror-compatible format):

$ docbook export INPUT -o output.json

# Or to stdout
$ docbook export INPUT

validate

Validate DocBook XML:

$ docbook validate input.xml
input.xml: valid

# Well-formedness only (no schema)
$ docbook validate --wellformed input.xml
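
The difference between the two modes is worth spelling out: a well-formed document merely parses as XML, while a valid one also satisfies the DocBook RELAX NG schema. As an illustration of the well-formedness half only, here is a small sketch using REXML from Ruby's bundled libraries. check_well_formed is a hypothetical helper and says nothing about how the gem implements --wellformed.

```ruby
require 'rexml/document'

# Hypothetical helper: reports whether a string parses as XML at all.
# Returns [true, nil] on success or [false, first line of the parser error].
def check_well_formed(xml)
  REXML::Document.new(xml)
  [true, nil]
rescue REXML::ParseException => e
  [false, e.message.lines.first.to_s.strip]
end

# Parses fine, although <bogus/> is not a DocBook element; only schema
# validation would reject this document.
ok, = check_well_formed('<article><bogus/></article>')
puts "well-formed: #{ok}"

# Mismatched tags: not XML at all, so the parser raises.
ok, error = check_well_formed('<article><title>Oops</article>')
puts "well-formed: #{ok} (#{error})"
```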

format

Format/prettify DocBook XML:

$ docbook format input.xml -o output.xml

library

Build a multi-book library from a directory or manifest:

# From a directory (auto-discovers XML files)
$ docbook library my-library/ -o library.html

# From a YAML manifest
$ docbook library library.yml --format dist -o /var/www/library/

# From a JSON manifest
$ docbook library library.json --format paged -o /var/www/library/

Ruby API

Parsing

require 'docbook'

# Parse a DocBook document (auto-detects article, book, chapter, etc.)
doc = Docbook::Document.from_xml(File.read('guide.xml'))

# With XInclude resolution
xml_string = File.read('main.xml')
resolved = Docbook::XincludeResolver.resolve_string(xml_string, base_path: 'main.xml')
doc = Docbook::Document.from_xml(resolved.to_xml)
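
For readers unfamiliar with XInclude: a main file references other files through xi:include elements, and resolution replaces each of those elements with the referenced content before the document is parsed. The sketch below illustrates that idea with REXML. It does not use or mirror Docbook::XincludeResolver; resolve_xincludes and the sample files are made up for the example, and xpointer, parse="text", and xi:fallback are deliberately ignored.

```ruby
require 'rexml/document'
require 'tmpdir'

XI_NS = 'http://www.w3.org/2001/XInclude'

# Illustration only: recursively replaces every <xi:include href="..."/>
# with the root element of the referenced file. hrefs are resolved
# relative to the including file.
def resolve_xincludes(path)
  doc = REXML::Document.new(File.read(path))
  includes = REXML::XPath.match(doc, '//xi:include', 'xi' => XI_NS)
  includes.each do |inc|
    target = File.expand_path(inc.attributes['href'], File.dirname(path))
    included_root = resolve_xincludes(target).root
    inc.parent.replace_child(inc, included_root)
  end
  doc
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'ch1.xml'),
             '<chapter xmlns="http://docbook.org/ns/docbook"><title>One</title></chapter>')
  File.write(File.join(dir, 'main.xml'), <<~XML)
    <book xmlns="http://docbook.org/ns/docbook"
          xmlns:xi="http://www.w3.org/2001/XInclude">
      <title>Demo</title>
      <xi:include href="ch1.xml"/>
    </book>
  XML

  resolved = resolve_xincludes(File.join(dir, 'main.xml'))
  # List the child elements of <book> after resolution.
  puts resolved.root.elements.map(&:name).inspect
end
```

Resolving paths relative to the including file is also why the real resolver takes a base_path argument.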

Building the reader

# Full pipeline: parse + TOC + numbering + index + images + HTML
builder = Docbook::Output::Builder.new(
  xml_path: 'guide.xml',
  output_path: 'output/guide.html',
  format: :inline,             # :inline, :dom, :dist, :paged, or :chunked
  image_search_dirs: ['media/'],
  image_strategy: :data_url,   # :data_url, :file_url, or :relative
  title: "My Document"
)
builder.build

# Build a multi-book library
Docbook::Output::LibraryBuilder.new(
  input_path: 'library.yml',
  output_path: 'library.html',
  format: :inline
).build
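
The image_strategy option decides how images reach the browser. With :data_url the image bytes are embedded in the HTML itself, which is what lets the inline reader work offline as a single file. As a rough sketch of what such embedding involves (to_data_url and MIME_TYPES are hypothetical names, not the gem's API):

```ruby
# Hypothetical lookup table for the sketch; extend as needed.
MIME_TYPES = {
  '.png' => 'image/png',
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.gif' => 'image/gif',
  '.svg' => 'image/svg+xml'
}.freeze

# Builds a data: URL from a file name (used only to guess the MIME type)
# and the raw file contents.
def to_data_url(filename, bytes)
  mime = MIME_TYPES.fetch(File.extname(filename).downcase,
                          'application/octet-stream')
  # pack('m0') is Base64 without line breaks, with no extra require.
  "data:#{mime};base64,#{[bytes].pack('m0')}"
end

puts to_data_url('dot.svg', '<svg xmlns="http://www.w3.org/2000/svg"/>')
```

The trade-off is size: Base64 inflates image data by about a third, which is one reason to prefer :relative when the output is served from a web host.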

Exporting JSON

doc = Docbook::Document.from_xml(File.read('guide.xml'))
mirror = Docbook::Output::DocbookMirror.new(doc)

# Pretty JSON
File.write('guide.json', mirror.to_pretty_json)

# Compact JSON (separate file, so it does not overwrite the pretty version)
File.write('guide.min.json', mirror.to_json)
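
Because the export is ProseMirror-compatible, it can be processed with a simple recursive walk: nodes carry a type and optional content, and text nodes carry text plus optional marks. The sample document below is hand-written for illustration. Its node and mark names (doc, heading, paragraph, text, code) appear elsewhere in this README, but the exact attributes the exporter emits are an assumption, and text_runs is a hypothetical helper.

```ruby
require 'json'

# Hand-written sample in the general ProseMirror JSON shape.
SAMPLE = <<~DATA
  {
    "type": "doc",
    "content": [
      { "type": "heading", "content": [{ "type": "text", "text": "Intro" }] },
      { "type": "paragraph", "content": [
        { "type": "text", "text": "Run " },
        { "type": "text", "text": "docbook build", "marks": [{ "type": "code" }] }
      ] }
    ]
  }
DATA

# Depth-first walk: returns [text, mark_types] pairs in document order.
def text_runs(node)
  own = []
  if node['type'] == 'text'
    marks = Array(node['marks']).map { |mark| mark['type'] }
    own << [node['text'], marks]
  end
  own + Array(node['content']).flat_map { |child| text_runs(child) }
end

text_runs(JSON.parse(SAMPLE)).each do |text, marks|
  puts "#{text.inspect} marks=#{marks.inspect}"
end
```

The same walk is a reasonable starting point for word counts, search indexing, or plain-text export from guide.json.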

Reader features

The interactive reader built by docbook build includes:

Themes

Day, Sepia, Night, OLED — all driven by CSS custom properties

Search

Full-text search across headings and content (FlexSearch, lazy-loaded)

TOC

Collapsible hierarchical table of contents

Breadcrumb

Ancestor chain breadcrumb bar with collapsible chips

Fonts

Sans-serif (Inter) and serif (Merriweather) options

Navigation

Keyboard shortcuts (/ for search, Esc to close)

Print

Clean print stylesheet

Progressive load

Chunked format with on-demand sections, IndexedDB caching, skeleton screens, and prefetch

Mobile gestures

Touch swipe from left edge opens sidebar

Bookmarks

Press b to bookmark any section

Accessibility

Reduced-motion support, focus-visible indicators, 44px touch targets

Building the frontend

The reader’s frontend is a Vue 3 application pre-compiled into frontend/dist/.

cd frontend
npm install
npm run build

After building, docbook build automatically includes the compiled assets.

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                        DocBook XML (input)                              │
│                    book.xml, article.xml, etc.                          │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────────┐
│                      Lutaml::Model (gem)                               │
│                   Serialization Framework                              │
│                                                                        │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │  Lutaml::Model::Serializable                                    │   │
│  │  • XML ↔ Ruby object mapping (from_xml / to_xml)                │   │
│  │  • Attribute definitions with types                             │   │
│  │  • mixed_content, map_element, map_attribute                    │   │
│  │  • Lutaml::Xml::W3c::XmlIdType for xml:id                       │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                  ▲ base class for all element classes                  │
└──────────────────┼─────────────────────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                Docbook Gem (metanorma/docbook)                          │
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────┐   │
│  │  Docbook::Elements::*   (~100+ classes)                          │   │
│  │                                                                  │   │
│  │  Structural:   Book, Article, Chapter, Part, Section, Appendix   │   │
│  │  Block:        Para, FormalPara, BlockQuote, Example             │   │
│  │  List:         OrderedList, ItemizedList, VariableList           │   │
│  │  Inline:       Emphasis, Code, Link, Xref, Filename, Command     │   │
│  │  Media:        MediaObject, ImageObject, ImageData, Figure       │   │
│  │  Table:        Table, InformalTable, TGroup, Row, Entry          │   │
│  │  Admonition:   Note, Warning, Caution, Important, Tip            │   │
│  │  Reference:    RefEntry, RefSection, RefMeta, FieldSynopsis      │   │
│  │  ...and more                                                     │   │
│  └──────────────────────────────────────────────────────────────────┘   │
│                                                                         │
│  Docbook::Document                                                      │
│  • Root element dispatcher (from_xml → correct element class)           │
│                                                                         │
│  ┌────────────────┐  ┌────────────────────┐  ┌───────────────────┐      │
│  │   XInclude     │  │   XrefResolver     │  │  Services::*      │      │
│  │   Resolver     │  │   (link resolution)│  │  (helpers)        │      │
│  └────────────────┘  └────────────────────┘  └───────────────────┘      │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                        Output Layer                               │  │
│  │                                                                   │  │
│  │  ┌──────────────────┐  ┌─────────────────┐  ┌──────────────────┐  │  │
│  │  │  Output::Builder │  │  Output::       │  │  Output::        │  │  │
│  │  │  Pipeline →      │  │  HtmlRenderer   │  │  Formats::*      │  │  │
│  │  │  Format          │  │  (DOM/Paged)    │  │  Inline/Dom/     │  │  │
│  │  │                  │  │                 │  │  Dist/Paged/     │  │  │
│  │  │                  │  │                 │  │  Chunked         │  │  │
│  │  └────────┬─────────┘  └─────────────────┘  └──────────────────┘  │  │
│  │           │                                                       │  │
│  └───────────┼───────────────────────────────────────────────────────┘  │
│              │                                                          │
│  ┌───────────┼───────────────────────────────────────────────────────┐  │
│  │           ▼                                                       │  │
│  │  ┌─────────────────────────────────────────────────────────────┐  │  │
│  │  │             DocbookMirror (Mirror::Transformer)             │  │  │
│  │  │                                                             │  │  │
│  │  │  Converts Docbook Element objects → ProseMirror-compatible  │  │  │
│  │  │  JSON document format (doc > nodes > text, with marks)      │  │  │
│  │  │                                                             │  │  │
│  │  │  Mirror::Node::Document ──┐                                 │  │  │
│  │  │  Mirror::Node::Block  ────┤  tree structure                 │  │  │
│  │  │  Mirror::Node::Text   ────┤                                 │  │  │
│  │  │  Mirror::Mark::*      ────┘  inline annotations             │  │  │
│  │  │   (emphasis, code, link, xref, citation, tag)               │  │  │
│  │  └─────────────────────────────────────────────────────────────┘  │  │
│  │                     │                          │                  │  │
│  │           export cmd (JSON)            embedded in Builder        │  │
│  │                     │                          │                  │  │
│  └─────────────────────┼──────────────────────────┼──────────────────┘  │
│                        │                          │                     │
│                        ▼                          ▼                     │
└────────────────────────┼──────────────────────────┼─────────────────────┘
                         │                          │
                         ▼                          ▼
┌────────────────────────────────────────────────────────────────────────┐
│                  Vue 3 SPA (frontend/dist/)                            │
│                                                                        │
│  Built by Vite → app.iife.js + app.css                                 │
│  Inlined into single-page HTML for file:// protocol support            │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                      Pinia Stores                                │  │
│  │  • documentStore  (document data, TOC, numbering)                │  │
│  │  • uiStore        (sidebar, search, active section state)        │  │
│  │  • ebookStore     (themes, settings)                             │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │  MirrorRenderer.vue + TextRenderer.vue                           │  │
│  │  Renders DocbookMirror JSON:                                     │  │
│  │  ├─ doc, section, chapter, appendix, part, reference, refentry   │  │
│  │  ├─ paragraph, heading, code_block, list, table, blockquote      │  │
│  │  ├─ image, admonition, footnote                                  │  │
│  │  └─ text nodes with marks (emphasis, code, link, xref, citation) │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │  UI Shell                                                        │  │
│  │  EbookContainer, AppSidebar, TocTreeItem, EbookTopBar,           │  │
│  │  BreadcrumbBar, SearchModal, SettingsPanel, LibraryApp           │  │
│  └──────────────────────────────────────────────────────────────────┘  │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌────────────────────────────────────────────────────────────────────────┐
│                    Interactive HTML Reader                             │
│    Inline (single .html), DOM (pre-rendered), Dist (directory),        │
│    Paged (split pages), Chunked (on-demand sections)                   │
│    -- all driven by Format strategy classes                            │
└────────────────────────────────────────────────────────────────────────┘

CLI Commands:

  docbook build    [INPUT] [-o out.html]  →  Builder → Format (inline|dom|dist|paged|chunked)
  docbook build    --demo                 →  Demo from bundled sample
  docbook library  DIRECTORY/manifest     →  Multi-book library
  docbook export   INPUT [-o out.json]    →  DocbookMirror JSON
  docbook validate INPUT                  →  RELAX NG schema validation
  docbook format   INPUT [-o out.xml]     →  Formatted DocBook XML

Key relationships:

Lutaml::Model

The foundation — every Docbook::Elements::* class inherits from Lutaml::Model::Serializable, providing XML↔Ruby object serialization.

DocbookMirror

A ProseMirror-compatible intermediate JSON format. The Ruby Mirror::Transformer walks Docbook element objects and produces a {content: [nodes]} tree with typed nodes and marks.

Vue SPA

Renders DocbookMirror JSON via MirrorRenderer and TextRenderer components. The data is embedded as window.DOCBOOK_DATA.

Builder/Formats

The output layer. Pipeline processes XML into a guide hash, then a Format strategy class (Inline, Dom, Dist, Paged, Chunked) packages it into the chosen output structure.
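
The Builder/Formats split described above is a classic strategy pattern: one pipeline result, several interchangeable packagers. The sketch below shows the shape of that design under stated assumptions. SketchFormats, its classes, the package method, and the keys of the guide hash are all hypothetical stand-ins; the gem's real classes live under Docbook::Output::Formats and have their own interfaces.

```ruby
# Hypothetical stand-ins for illustration; not the gem's classes.
module SketchFormats
  class Inline
    # Everything in one HTML file, with the data embedded in a script tag.
    def package(guide)
      html = "<html><script>window.DOCBOOK_DATA=#{guide[:json]}</script></html>"
      { 'index.html' => html }
    end
  end

  class Chunked
    # One shell page plus one JSON file per section, fetched on demand.
    def package(guide)
      files = { 'index.html' => '<html><!-- loads sections on demand --></html>' }
      guide[:sections].each do |section|
        files["sections/#{section[:id]}.json"] = section[:json]
      end
      files
    end
  end

  REGISTRY = { inline: Inline, chunked: Chunked }.freeze

  # Looks up the strategy for a format name and instantiates it.
  def self.strategy_for(name)
    REGISTRY.fetch(name) { raise ArgumentError, "unknown format: #{name}" }.new
  end
end

# A made-up pipeline result: the whole document plus per-section JSON.
guide = {
  json: '{"type":"doc"}',
  sections: [{ id: 'intro', json: '{"type":"section"}' }]
}

%i[inline chunked].each do |format|
  files = SketchFormats.strategy_for(format).package(guide)
  puts "#{format}: #{files.keys.join(', ')}"
end
```

The benefit of this layout is that adding an output format means adding one class and one registry entry, without touching the pipeline that produces the guide hash.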

Contributing

Bug reports and pull requests are welcome at https://github.com/metanorma/docbook.

License

Copyright Ribose. MIT License.