DMS

dms-rb

Ruby parser for DMS, a data syntax with strong typing, ordered maps, multi-line heredocs, and front-matter metadata.

Two gems live in this repo. Both expose the same Ruby API and the same value shape, except for how datetimes are wrapped (see Usage):

| gem | implementation | when to use |
| --- | --- | --- |
| dms-parser | pure Ruby (require "dms") | portable; no C toolchain required |
| dms-c | C extension wrapping the dms-c decoder | hot paths; ~2× faster than pure Ruby |

Gem naming. The plain dms name on RubyGems is taken by an unrelated project, so the pure-Ruby gem ships as dms-parser. The require path is still "dms", so the install command and the require line don't match; this is a one-time gotcha when adding the dependency.
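In a Bundler-managed project, the mismatch can be declared once in the Gemfile so that `Bundler.require` loads the right path. A minimal sketch, assuming a standard Gemfile; the `require:` option is ordinary Bundler syntax, and nothing else about the project layout is implied:

```ruby
# Gemfile
source "https://rubygems.org"

# Install the gem published as dms-parser, but load it via `require "dms"`.
gem "dms-parser", require: "dms"
```

Without the `require:` option, Bundler would try to load a path derived from the gem name rather than "dms".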

What DMS looks like

A medium-size tier-0 document, exercising every feature you'd touch in a real config — front matter, comments (line + trailing), nested tables, list-of-tables with the + marker, flow forms, distinct types, and a heredoc with a trim modifier:

+++
title:    "DMS feature tour"
version:  "1.0.0"
updated:  2026-04-24T09:30:00-04:00
+++

# Hash and // line comments both work.
// Bare keys allow full Unicode; quoted keys take any string.

database:
  host:    "db.internal"
  port:    5432            # bumped after the LB change
  pool:    { size: 10, idle_timeout_s: 30 }   # flow table

servers:
  + name: "web1"
    disks:
      + mount: "/"
        size_gb: 100
      + mount: "/var"
        size_gb: 500
  + name: "web2"

regions: ["us-east-1", "eu-west-1", "ap-south-1"]

sql: """SQL _trim("\n", ">")
    SELECT id, email
      FROM users
     WHERE active = true
    SQL

Tier 1 layers structured decorators on top of the value tree. Sigils bind to families published by a dialect; here is dms+html carrying an HTML fragment as a DMS document:

+++
_dms_tier: 1
_dms_imports:
  + dialect: "html"
    version: "1.0.0"
+++

+ |html(lang: "en")
  + |head
    + |title "DMS feature tour"
    + |meta(charset: "UTF-8")
  + |body(class: "main")
    + |h1 "Welcome to DMS"
    + |p(class: "lede")
      + "Click "
      + |a(href: "/spec.html") "here"
      + " to read the spec."

The full feature tour, format comparison, and dialect index are on the DMS website.

Install

gem install dms-parser   # pure Ruby
gem install dms-c        # native (C) extension, same API (not yet published)

Usage

require "dms"            # or:  require "dms_c"

src = File.read("config.dms")

# Body-only (drops front matter and comments after decode).
body = Dms.decode(src)   # or:  DmsC.decode(src)

# Full document (preserves comments + literal forms for encode round-trip).
doc = Dms.decode_document(src)
doc.meta              # Hash | nil  — nil when there is no `+++` block
doc.body              # decoded root value
doc.comments          # Array of Dms::AttachedComment
doc.original_forms    # Array of [path, Dms::OriginalLiteral]

# Re-emit DMS source.
output = Dms.encode(doc)

Migrating from parse/to_dms? SPEC v0.14 renamed the canonical entry points. The old names (Dms.parse, Dms.parse_document, Dms.parse_lite, Dms.to_dms, Dms.to_dms_lite, and the matching DmsC.parse*) still work as deprecated aliases — each emits a one-shot warning on first call, then forwards to the canonical name. They will be removed in the next release.
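The "warn once, then forward" behavior described above is a common Ruby pattern. Here is a minimal sketch of how such an alias can be built, using a hypothetical `DemoCodec` module as a stand-in. This is not the gem's source, and the warning text is invented for illustration:

```ruby
# Hypothetical stand-in module; not the dms gem's actual implementation.
module DemoCodec
  @warned = {}

  # Placeholder "canonical" entry point for the demo.
  def self.decode(src)
    src.strip
  end

  # Define `old_name` as a deprecated alias that warns on its first call
  # only, then forwards all arguments to `new_name`.
  def self.deprecated_alias(old_name, new_name)
    define_singleton_method(old_name) do |*args, &block|
      unless @warned[old_name]
        @warned[old_name] = true
        warn "#{name}.#{old_name} is deprecated; use #{name}.#{new_name}"
      end
      public_send(new_name, *args, &block)
    end
  end

  deprecated_alias :parse, :decode
end

DemoCodec.parse(" a: 1 ")   # warns on stderr, then forwards to decode
DemoCodec.parse(" b: 2 ")   # silent; still forwards to decode
```

Since the aliases are scheduled for removal, switching call sites to the canonical names now avoids both the warning and a later breakage.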

Tables are insertion-ordered Hashes (Ruby Hashes preserve insertion order since 1.9). Lists are Arrays. Datetimes are wrapped types: the pure module returns Dms::LocalDate / Dms::LocalTime / Dms::LocalDateTime / Dms::OffsetDateTime class instances; the C extension returns plain { __dms_type:, value: } hashes with the same data. Encoders that detect via __dms_type + value work unchanged across both gems.
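Because tables decode to plain Hashes and lists to Arrays, a decoded body can be walked with ordinary Ruby. The sketch below uses a hypothetical helper, `each_leaf`, that is not part of either gem. It yields each leaf value with its path in document order. It assumes that list elements are addressed by integer index and that the C extension's wrapped datetimes use a Symbol `:__dms_type` key, as the hash shape above suggests:

```ruby
# Walk a decoded body depth-first, yielding (path, leaf) in document order.
# Hashes stand in for DMS tables, Arrays for DMS lists.
# Assumption: list elements are addressed by integer index in the path.
def each_leaf(node, path = [], &block)
  case node
  when Hash
    if node.key?(:__dms_type)
      # Wrapped datetime hash (C extension shape): treat as one leaf.
      block.call(path, node)
    else
      node.each { |key, value| each_leaf(value, path + [key], &block) }
    end
  when Array
    node.each_with_index { |value, i| each_leaf(value, path + [i], &block) }
  else
    block.call(path, node)
  end
end

# Plain-Ruby stand-in for a decoded body.
body = {
  "database" => { "host" => "db.internal", "port" => 5432 },
  "regions"  => ["us-east-1", "eu-west-1"],
}

each_leaf(body) { |path, value| puts "#{path.inspect} => #{value.inspect}" }
```

Instances of the pure gem's datetime classes are neither Hashes nor Arrays, so they fall through to the final branch and are also yielded as leaves.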

Working with comments and heredocs

DMS preserves comments through decode → mutate → re-emit (SPEC §Comments). Attach a comment to a value after decoding and have it round-trip through Dms.encode:

require "dms"

doc = Dms.decode_document("db:\n  port: 8080\n")

# Mutate a value in place.
doc.body["db"]["port"] = 5432

# Attach a leading line comment to db.port.
doc.comments << Dms::AttachedComment.new(
  Dms::Comment.new("# bumped after LB change", :line),
  :leading,
  ["db", "port"],
)

puts Dms.encode(doc)

Forcing a heredoc on emit

Strings parse and re-emit in their source form. To switch a basic-quoted string to a heredoc (or to construct one from scratch), append an OriginalLiteral.string record to doc.original_forms keyed by the value's path:

doc.body["db"]["greeting"] = "Hello, friend.\nWelcome aboard.\n"

doc.original_forms << [
  ["db", "greeting"],
  Dms::OriginalLiteral.string(
    Dms::StringForm.heredoc(
      :basic_triple,    # or :literal_triple for '''
      nil,              # nil = unlabeled (terminator is """ / ''')
      [],               # _trim(...), _fold_paragraphs(), …
    ),
  ),
]

Round-trip rules (SPEC §Round-trip semantics): comments stick to still-present nodes; deleting a node drops its comments; newly inserted nodes start with no comments. The first original_forms entry per path wins, so override a parser-recorded form by replacing rather than appending if the key is already present.
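The "replace rather than append" rule can be wrapped in a small helper. This is a sketch only: `set_original_form` is a hypothetical name, not part of the gem, and it relies solely on `original_forms` being an Array of `[path, literal]` pairs as described above:

```ruby
# Replace the recorded form for `path` if one exists; otherwise append.
# `forms` is an Array of [path, literal] pairs (the doc.original_forms shape).
def set_original_form(forms, path, literal)
  index = forms.index { |entry_path, _literal| entry_path == path }
  if index
    forms[index] = [path, literal]   # first entry per path wins, so overwrite it
  else
    forms << [path, literal]
  end
  forms
end

# Symbols stand in for Dms::OriginalLiteral values in this demo.
forms = [[["db", "greeting"], :parser_recorded]]
set_original_form(forms, ["db", "greeting"], :heredoc_override)
set_original_form(forms, ["db", "motd"], :heredoc_new)
```

With a real document, the first argument would be `doc.original_forms` and the third the `Dms::OriginalLiteral.string(...)` value built as shown earlier.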

Performance

50,000-key flat document (~700 KB), best-of-5, startup-subtracted, Ruby 3.3 on Windows 11:

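| tier | DMS gem | time | JSON peer | time | YAML peer | time | DMS / JSON | DMS / YAML |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pure Ruby | dms | 115.8 ms | n/a | | n/a | | n/a | n/a |
| native (C) | dms-c | 56.5 ms | json | 21.4 ms | psych | 260.4 ms | 2.63× | 0.22× (DMS ~4.6× faster) |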

Ruby's stdlib json and psych are both C-backed, and there is no widely used pure-Ruby alternative for either, so JSON and YAML peers appear only in the native (C) tier (the same situation as Node). The pure-Ruby DMS port is reported on its own because it has no fair pure-vs-pure peer to compare against.

The C extension is ~2× faster than pure Ruby. Against C-backed peers, DMS costs ~2.6× as much as JSON (the price of carrying comments, ordered keys, and source-form metadata) and is ~4.6× faster than psych (libyaml).

Reproduce with:

ruby bench/run_formats.rb
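To spot-check a single number without the full harness, a best-of-N timing loop is enough. The sketch below is not `bench/run_formats.rb`; `best_of_ms` is a hypothetical helper and the workload is a stand-in. It times inside the process, so interpreter startup is excluded, which is one way to get a startup-subtracted figure (the real script may do this differently):

```ruby
# Run the block `runs` times and return the fastest wall-clock time in ms.
def best_of_ms(runs = 5)
  Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  end.min
end

# Stand-in workload: a 50,000-line flat "key: value" text, split per line.
# Swap the block body for `Dms.decode(src)` once the gem is installed.
src = Array.new(50_000) { |i| "key_#{i}: #{i}" }.join("\n")
elapsed = best_of_ms(5) { src.lines.map { |line| line.split(": ", 2) } }
puts format("best of 5: %.1f ms", elapsed)
```

Taking the minimum rather than the mean filters out runs disturbed by GC pauses or other processes.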

Build & test

# pure gem:
bundle install
bundle exec rake test

# native (C) gem:
cd dms-c/ext/dms_c && ruby extconf.rb && make

The C-extension build needs Ruby's MSYS toolchain on Windows or a standard cc + make on Unix; mkmf handles the platform detection.

Conformance

The fixture corpus lives in dms-tests (4500+ pairs). Clone it once as a sibling:

cd ..
git clone https://gitlab.com/flo-labs/pub/dms-tests.git

The dms-encoder binary reads DMS from stdin and writes tagged JSON to stdout, matching the format the conformance runner consumes.

License

Dual-licensed: MIT or Apache-2.0, your choice.