Purpose
DocBook is a Ruby gem for parsing, validating, and converting DocBook 5 XML documents into interactive HTML readers and structured JSON. It provides a complete command-line interface and Ruby API for document processing.
Key features:
- Interactive reader: self-contained HTML with search, themes, and TOC navigation
- DocBook 5 parsing: full element support via lutaml-model
- XInclude resolution: transparent handling of multi-file documents
- DocbookMirror JSON: ProseMirror-compatible structured representation
- Output formats: inline, DOM (SEO), distribution directory, split-pages, and chunked (on-demand loading) modes
- Library support: multi-book collections with covers and manifests
Installation
Add to your Gemfile:
gem 'docbook'
Or:
$ gem install docbook
Quick start
CLI
Build an interactive reader from any DocBook XML:
$ docbook build guide.xml
This produces guide.html in the same directory. Open it in a browser — it works offline with no server required.
Specify a custom output path:
$ docbook build guide.xml -o output.html
Try the bundled demo:
$ docbook build --demo
Build an SEO-friendly reader with pre-rendered content:
$ docbook build guide.xml --format dom -o reader.html
Library
Build a multi-book library:
$ docbook library my-library/ --format dist -o /var/www/library/
Ruby API
require 'docbook'
# Build an interactive reader (any format)
Docbook::Output::Builder.new(
xml_path: 'guide.xml',
output_path: 'guide.html',
format: :inline # or :dom, :dist, :paged, :chunked
).build
# Build a multi-book library
Docbook::Output::LibraryBuilder.new(
input_path: 'library.yml',
output_path: 'library.html',
format: :inline
).build
# Export as DocbookMirror JSON
parsed = Docbook::Document.from_xml(File.read('guide.xml'))
json = Docbook::Output::DocbookMirror.new(parsed).to_pretty_json
File.write('guide.json', json)
CLI reference
| Command | Description |
|---|---|
| docbook build | Build an interactive HTML reader from DocBook XML |
| docbook library | Build a multi-book library from a directory or manifest |
| docbook export | Export as DocbookMirror JSON |
| docbook validate | Validate DocBook XML against RELAX NG schema |
| docbook format | Format/prettify DocBook XML |
build
Build an interactive HTML reader. With the default inline format, the output is a single self-contained file that works offline (CSS, JS, and images are all inlined).
$ docbook build [INPUT] [options]
$ docbook build --demo [options]
Options:
- -o, --output FILE: Output HTML file path (default: <input>.html or demo.html)
- --demo: Use the bundled DocBook sample as input (accepts a name: xslTNG or model-flow)
- --format FORMAT: Output format: inline (default), dom, dist, paged, or chunked
- --image-search-dir DIR: Directories to search for images (default: XML file directory)
- --image-strategy STRATEGY: Image handling: data_url (default), file_url, or relative
- --title TITLE: Page title (default: derived from document)
- --sort-glossary: Sort glossary entries alphabetically
- --xinclude / --no-xinclude: Resolve XIncludes (default: true)
Examples:
# Basic (output defaults to book.html)
$ docbook build book.xml
# Custom output path
$ docbook build book.xml -o output.html
# SEO-friendly with pre-rendered content
$ docbook build book.xml --format dom -o output.html
# Distribution directory (separate assets)
$ docbook build book.xml --format dist -o /var/www/book/
# Split into individual pages
$ docbook build book.xml --format paged -o /var/www/book/
# On-demand section loading (best for large books)
$ docbook build book.xml --format chunked -o /var/www/book/
# With custom image paths
$ docbook build book.xml --image-search-dir media/ images/
# For web hosting (external image files)
$ docbook build book.xml --image-strategy relative
# Bundled demo
$ docbook build --demo
# Named demo
$ docbook build --demo=model-flow
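The three --image-strategy values can be pictured with a small standalone sketch. It is an illustration only, not the gem's implementation: the image_src helper, the MIME_TYPES table, and the exact meaning given to file_url and relative are assumptions inferred from the option names.

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical MIME lookup; the gem may detect types differently.
MIME_TYPES = {
  '.png' => 'image/png',
  '.jpg' => 'image/jpeg',
  '.jpeg' => 'image/jpeg',
  '.gif' => 'image/gif',
  '.svg' => 'image/svg+xml'
}.freeze

# Hypothetical helper: returns the value an <img src="..."> could carry
# under each strategy named by --image-strategy.
def image_src(path, strategy:, base_dir: Dir.pwd)
  case strategy
  when :data_url
    # Embed the bytes directly, so the HTML needs no external files.
    mime = MIME_TYPES.fetch(File.extname(path).downcase, 'application/octet-stream')
    "data:#{mime};base64,#{[File.binread(path)].pack('m0')}"
  when :file_url
    # Point at the absolute location on the local disk.
    "file://#{File.expand_path(path)}"
  when :relative
    # Keep a path relative to the output directory, for web hosting.
    base = Pathname.new(File.expand_path(base_dir))
    Pathname.new(File.expand_path(path)).relative_path_from(base).to_s
  else
    raise ArgumentError, "unknown image strategy: #{strategy}"
  end
end

Dir.mktmpdir do |dir|
  Dir.mkdir(File.join(dir, 'media'))
  image = File.join(dir, 'media', 'logo.png')
  File.binwrite(image, 'abc') # placeholder bytes, not a real PNG
  puts image_src(image, strategy: :data_url)
  puts image_src(image, strategy: :relative, base_dir: dir)
end
```

Under these assumptions, data_url trades a larger HTML file for offline portability, while relative keeps the HTML small but requires the image files to be deployed next to it.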
export
Export DocBook XML as DocbookMirror JSON (ProseMirror-compatible format):
$ docbook export INPUT -o output.json
# Or to stdout
$ docbook export INPUT
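To give a feel for what "ProseMirror-compatible" means here, the sketch below builds a tiny document tree by hand and prints it as JSON. The node and mark names (doc, paragraph, code) appear in this README's architecture overview; the field layout (type, content, text, marks) follows ProseMirror's general JSON conventions and is an assumption about the exported file, and the helper methods are hypothetical.

```ruby
require 'json'

# Hypothetical helper: a leaf text node, optionally carrying marks.
def text_node(text, marks = [])
  node = { 'type' => 'text', 'text' => text }
  node['marks'] = marks.map { |name| { 'type' => name } } unless marks.empty?
  node
end

# Hypothetical helper: a block node that wraps child nodes.
def block_node(type, *children)
  { 'type' => type, 'content' => children }
end

# A one-paragraph document: doc > paragraph > text (with a code mark).
def sample_mirror_doc
  block_node(
    'doc',
    block_node(
      'paragraph',
      text_node('Run '),
      text_node('docbook build', ['code']),
      text_node(' to create the reader.')
    )
  )
end

puts JSON.pretty_generate(sample_mirror_doc)
```

The real export is likely to carry more detail (attributes, identifiers, numbering), so treat this only as a picture of the tree shape.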
validate
Validate DocBook XML:
$ docbook validate input.xml
input.xml: valid
# Well-formedness only (no schema)
$ docbook validate --wellformed input.xml
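The difference between the two modes is that a well-formedness check only asks whether the file parses as XML, while schema validation also checks the element structure against the DocBook RELAX NG grammar. A minimal sketch of the first kind of check, using Ruby's bundled REXML parser, could look like this. The well_formed? helper is hypothetical and is not how the gem implements --wellformed.

```ruby
require 'rexml/document'

# Hypothetical helper: true when the string parses as XML and has a root
# element. It says nothing about whether the document is valid DocBook.
def well_formed?(xml)
  !REXML::Document.new(xml).root.nil?
rescue REXML::ParseException
  false
end

good = '<article><title>Guide</title><para>Hello</para></article>'
bad  = '<article><title>Guide</article>' # <title> is never closed

puts "good: #{well_formed?(good)}"
puts "bad:  #{well_formed?(bad)}"
```

Note that a document such as `<article><bogus/></article>` passes this check even though a schema validator would reject it.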
format
Format/prettify DocBook XML:
$ docbook format input.xml -o output.xml
library
Build a multi-book library from a directory or manifest:
# From a directory (auto-discovers XML files)
$ docbook library my-library/ -o library.html
# From a YAML manifest
$ docbook library library.yml --format dist -o /var/www/library/
# From a JSON manifest
$ docbook library library.json --format paged -o /var/www/library/
Ruby API
Parsing
require 'docbook'
# Parse a DocBook document (auto-detects article, book, chapter, etc.)
doc = Docbook::Document.from_xml(File.read('guide.xml'))
# With XInclude resolution
xml_string = File.read('main.xml')
resolved = Docbook::XincludeResolver.resolve_string(xml_string, base_path: 'main.xml')
doc = Docbook::Document.from_xml(resolved.to_xml)
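For readers unfamiliar with XInclude, the following standalone sketch shows the basic idea behind resolution: each xi:include element is replaced by the root element of the file its href points to, with paths taken relative to the including file. It uses REXML rather than the gem's Docbook::XincludeResolver, the resolve_xincludes method is hypothetical, and it ignores xpointer, parse="text", and fallback handling.

```ruby
require 'rexml/document'
require 'tmpdir'

XI_NS = 'http://www.w3.org/2001/XInclude'

# Hypothetical helper: parse the file at +path+ and recursively replace
# every xi:include element with the root element of the referenced file.
def resolve_xincludes(path)
  doc = REXML::Document.new(File.read(path))

  # Collect first, then replace, so the tree is not modified mid-traversal.
  includes = []
  doc.root.each_recursive do |el|
    includes << el if el.name == 'include' && el.namespace == XI_NS
  end

  includes.each do |inc|
    target = File.expand_path(inc.attributes['href'], File.dirname(path))
    inc.replace_with(resolve_xincludes(target).root)
  end
  doc
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'intro.xml'), <<~XML)
    <chapter xmlns="http://docbook.org/ns/docbook"><title>Intro</title></chapter>
  XML
  File.write(File.join(dir, 'main.xml'), <<~XML)
    <book xmlns="http://docbook.org/ns/docbook" xmlns:xi="http://www.w3.org/2001/XInclude">
      <title>Guide</title>
      <xi:include href="intro.xml"/>
    </book>
  XML
  puts resolve_xincludes(File.join(dir, 'main.xml'))
end
```

Resolving relative to the including file is why the gem's resolve_string call above takes a base_path argument alongside the XML string.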
Building the reader
# Full pipeline: parse + TOC + numbering + index + images + HTML
builder = Docbook::Output::Builder.new(
xml_path: 'guide.xml',
output_path: 'output/guide.html',
format: :inline, # :inline, :dom, :dist, :paged, or :chunked
image_search_dirs: ['media/'],
image_strategy: :data_url, # :data_url, :file_url, or :relative
title: "My Document"
)
builder.build
# Build a multi-book library
Docbook::Output::LibraryBuilder.new(
input_path: 'library.yml',
output_path: 'library.html',
format: :inline
).build
Reader features
The interactive reader built by docbook build includes:
- Themes: Day, Sepia, Night, and OLED, all driven by CSS custom properties
- Search: Full-text search across headings and content (FlexSearch, lazy-loaded)
- TOC: Collapsible hierarchical table of contents
- Breadcrumb: Ancestor chain breadcrumb bar with collapsible chips
- Fonts: Sans-serif (Inter) and serif (Merriweather) options
- Navigation: Keyboard shortcuts (/ for search, Esc to close)
- Clean print stylesheet
- Progressive load: Chunked format with on-demand sections, IndexedDB caching, skeleton screens, and prefetch
- Mobile gestures: Touch swipe from left edge opens sidebar
- Bookmarks: Press b to bookmark any section
- Accessibility: Reduced-motion support, focus-visible indicators, 44px touch targets
Building the frontend
The reader’s frontend is a Vue 3 application pre-compiled into frontend/dist/.
cd frontend
npm install
npm run build
After building, docbook build automatically includes the compiled assets.
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│ DocBook XML (input) │
│ book.xml, article.xml, etc. │
└────────────────────────────────┬────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────────┐
│ Lutaml::Model (gem) │
│ Serialization Framework │
│ │
│ ┌─────────────────────────────────────────────────────────────────┐ │
│ │ Lutaml::Model::Serializable │ │
│ │ • XML ↔ Ruby object mapping (from_xml / to_xml) │ │
│ │ • Attribute definitions with types │ │
│ │ • mixed_content, map_element, map_attribute │ │
│ │ • Lutaml::Xml::W3c::XmlIdType for xml:id │ │
│ └─────────────────────────────────────────────────────────────────┘ │
│ ▲ base class for all element classes │
└──────────────────┼─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ Docbook Gem (metanorma/docbook) │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Docbook::Elements::* (~100+ classes) │ │
│ │ │ │
│ │ Structural: Book, Article, Chapter, Part, Section, Appendix │ │
│ │ Block: Para, FormalPara, BlockQuote, Example │ │
│ │ List: OrderedList, ItemizedList, VariableList │ │
│ │ Inline: Emphasis, Code, Link, Xref, Filename, Command │ │
│ │ Media: MediaObject, ImageObject, ImageData, Figure │ │
│ │ Table: Table, InformalTable, TGroup, Row, Entry │ │
│ │ Admonition: Note, Warning, Caution, Important, Tip │ │
│ │ Reference: RefEntry, RefSection, RefMeta, FieldSynopsis │ │
│ │ ...and more │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ Docbook::Document │
│ • Root element dispatcher (from_xml → correct element class) │
│ │
│ ┌────────────────┐ ┌────────────────────┐ ┌───────────────────┐ │
│ │ XInclude │ │ XrefResolver │ │ Services::* │ │
│ │ Resolver │ │ (link resolution)│ │ (helpers) │ │
│ └────────────────┘ └────────────────────┘ └───────────────────┘ │
│ │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ Output Layer │ │
│ │ │ │
│ │ ┌──────────────────┐ ┌────────────────┐ ┌──────────────────┐ │ │
│ │ │ Output::Builder │ │ Output:: │ │ Output:: │ │ │
│ │ │ Pipeline → │ │ HtmlRenderer │ │ Formats::* │ │ │
│ │ │ Format │ │ (DOM/Paged) │ │ Inline/Dom/ │ │ │
│ │ │ │ │ │ │ Dist/Paged/ │ │ │
│ │ │ │ │ │ │ Chunked │ │ │
│ │ └────────┬─────────┘ └─────────────────┘ └──────────────────┘ │ │
│ │ │ │ │
│ └───────────┼───────────────────────────────────────────────────────┘ │
│ │ │
│ ┌───────────┼───────────────────────────────────────────────────────┐ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ DocbookMirror (Mirror::Transformer) │ │ │
│ │ │ │ │ │
│ │ │ Converts Docbook Element objects → ProseMirror-compatible │ │ │
│ │ │ JSON document format (doc > nodes > text, with marks) │ │ │
│ │ │ │ │ │
│ │ │ Mirror::Node::Document ──┐ │ │ │
│ │ │ Mirror::Node::Block ────┤ tree structure │ │ │
│ │ │ Mirror::Node::Text ────┤ │ │ │
│ │ │ Mirror::Mark::* ────┘ inline annotations │ │ │
│ │ │ (emphasis, code, link, xref, citation, tag) │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │ │
│ │ export cmd (JSON) embedded in Builder │ │
│ │ │ │ │ │
│ └─────────────────────┼──────────────────────────┼──────────────────┘ │
│ │ │ │
│ ▼ ▼ │
└────────────────────────┼──────────────────────────┼─────────────────────┘
│ │
▼ ▼
┌────────────────────────────────────────────────────────────────────────┐
│ Vue 3 SPA (frontend/dist/) │
│ │
│ Built by Vite → app.iife.js + app.css │
│ Inlined into single-page HTML for file:// protocol support │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ Pinia Stores │ │
│ │ • documentStore (document data, TOC, numbering) │ │
│ │ • uiStore (sidebar, search, active section state) │ │
│ │ • ebookStore (themes, settings) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ MirrorRenderer.vue + TextRenderer.vue │ │
│ │ Renders DocbookMirror JSON: │ │
│ │ ├─ doc, section, chapter, appendix, part, reference, refentry │ │
│ │ ├─ paragraph, heading, code_block, list, table, blockquote │ │
│ │ ├─ image, admonition, footnote │ │
│ │ └─ text nodes with marks (emphasis, code, link, xref, citation) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ UI Shell │ │
│ │ EbookContainer, AppSidebar, TocTreeItem, EbookTopBar, │ │
│ │ BreadcrumbBar, SearchModal, SettingsPanel, LibraryApp │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────────┐
│ Interactive HTML Reader │
│ Inline (single .html), DOM (pre-rendered), Dist (directory), │
│ Paged (split pages), Chunked (on-demand sections) │
│ -- all driven by Format strategy classes │
└────────────────────────────────────────────────────────────────────────┘
CLI Commands:
docbook build [INPUT] [-o out.html] → Builder → Format (inline|dom|dist|paged|chunked)
docbook build --demo → Demo from bundled sample
docbook library DIRECTORY/manifest → Multi-book library
docbook export INPUT [-o out.json] → DocbookMirror JSON
docbook validate INPUT → RELAX NG schema validation
docbook format INPUT [-o out.xml] → Formatted DocBook XML
Key relationships:
- Lutaml::Model: The foundation. Every Docbook::Elements::* class inherits from Lutaml::Model::Serializable, which provides XML↔Ruby object serialization.
- DocbookMirror: A ProseMirror-compatible intermediate JSON format. The Ruby Mirror::Transformer walks Docbook element objects and produces a {content: [nodes]} tree with typed nodes and marks.
- Vue SPA: Renders DocbookMirror JSON via the MirrorRenderer and TextRenderer components. The data is embedded as window.DOCBOOK_DATA.
- Builder/Formats: The output layer. Pipeline processes XML into a guide hash, then a Format strategy class (Inline, Dom, Dist, Paged, Chunked) packages it into the chosen output structure.
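The Builder/Formats relationship is a strategy pattern: one guide hash goes in, and the selected format class decides how it is packaged. The sketch below illustrates that idea with two toy strategies. Every name in it (InlineFormat, DistFormat, FORMATS, build_output, the file names, and the shape of the guide hash) is hypothetical and is not the gem's internal API; only the window.DOCBOOK_DATA global comes from this README.

```ruby
require 'json'
require 'erb'

# Hypothetical strategy: everything in one HTML file, data embedded as a global.
class InlineFormat
  def package(guide)
    # Escape "</" so text inside the JSON cannot close the <script> tag early.
    json = JSON.generate(guide).gsub('</') { '<\/' }
    html = <<~HTML
      <!DOCTYPE html>
      <html>
      <head><title>#{ERB::Util.html_escape(guide['title'])}</title></head>
      <body>
      <div id="app"></div>
      <script>window.DOCBOOK_DATA = #{json};</script>
      </body>
      </html>
    HTML
    { 'index.html' => html }
  end
end

# Hypothetical strategy: a directory with the data kept in a separate file.
class DistFormat
  def package(guide)
    html = <<~HTML
      <!DOCTYPE html>
      <html>
      <head><title>#{ERB::Util.html_escape(guide['title'])}</title></head>
      <body><div id="app" data-source="data.json"></div></body>
      </html>
    HTML
    { 'index.html' => html, 'data.json' => JSON.generate(guide) }
  end
end

FORMATS = { inline: InlineFormat, dist: DistFormat }.freeze

# Pick a strategy by name and return a { relative path => contents } hash.
def build_output(guide, format:)
  strategy = FORMATS.fetch(format) { raise ArgumentError, "unknown format: #{format}" }
  strategy.new.package(guide)
end

guide = { 'title' => 'Guide', 'content' => [{ 'type' => 'text', 'text' => 'Hello' }] }
build_output(guide, format: :inline).each_key { |name| puts "inline: #{name}" }
build_output(guide, format: :dist).each_key { |name| puts "dist: #{name}" }
```

Keeping the packaging step behind a common package method is what would let a new output mode be added without touching the parsing pipeline.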
Contributing
Bug reports and pull requests are welcome at https://github.com/metanorma/docbook.
License
Copyright Ribose. MIT License.