Natural Sort


Natural-sort ordering for Ruby — sort strings the way people read them, so "a2" comes before "a10" instead of after it.

%w[a1 a10 a2].sort               # => ["a1", "a10", "a2"]   # lexical: a10 before a2
NaturalSort.sort(%w[a1 a10 a2])  # => ["a1", "a2", "a10"]   # natural

Installation

Add it to your Gemfile:

gem "natural_sort"

…then bundle install. Or grab it directly:

$ gem install natural_sort

Requires Ruby 3.3 or newer.

Usage

NaturalSort is a comparator that plugs into Ruby's own sort methods — it doesn't replace them.

list = %w[a10 a a20 a1b a1a a2 a0 a1]

NaturalSort.sort(list)   # => ["a", "a0", "a1", "a1a", "a1b", "a2", "a10", "a20"]
NaturalSort.sort!(list)  # same, but sorts `list` in place and returns it
list.sort(&NaturalSort)  # NaturalSort works directly as the comparison block

sort(&NaturalSort) works because the module is a comparator. To sort by a derived value you want a key instead — NaturalSort.key(x), or the NaturalSort() helper — for sort_by, min_by, and friends:

require "natural_sort/kernel"

UbuntuRelease = Struct.new(:number, :name)

releases = [
  UbuntuRelease.new("9.04",    "Jaunty Jackalope"),
  UbuntuRelease.new("10.10",   "Maverick Meerkat"),
  UbuntuRelease.new("8.10",    "Intrepid Ibex"),
  UbuntuRelease.new("10.04.4", "Lucid Lynx"),
  UbuntuRelease.new("9.10",    "Karmic Koala"),
]

releases.sort_by { |release| NaturalSort(release.number) }
# => 8.10, 9.04, 9.10, 10.04.4, 10.10

NaturalSort() is a global helper — a Kernel method in the spirit of Integer() or Array(). It lives in a separate file so that requiring the gem (or its refinements) never adds a method to every object unless you explicitly ask for it with require "natural_sort/kernel". If you'd rather not add a global method, NaturalSort.key(value) does the same thing:

releases.sort_by { |release| NaturalSort.key(release.number) }

Performance. NaturalSort.sort, and sort_by with NaturalSort.key, build one key per element. &NaturalSort re-splits both strings on every comparison, so it does roughly n log n key builds instead of n. For large arrays, prefer the key-based forms.
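
To make the difference concrete, here is a small stdlib-only sketch that counts key builds under both strategies. KeyBuildDemo and toy_key are hypothetical stand-ins for illustration (toy_key is a simplified chunk splitter, not the gem's key), so the numbers only show the shape of the cost, not the gem's actual timings.

```ruby
module KeyBuildDemo
  # Hypothetical stand-in for a natural-sort key: splits a string into
  # text chunks and integer chunks so that arrays of chunks compare cleanly.
  def self.toy_key(string)
    string.scan(/\d+|\D+/).map do |part|
      part.match?(/\A\d/) ? [0, part.to_i, ""] : [1, 0, part]
    end
  end

  # Returns how many keys each strategy builds for the same input.
  def self.count_builds(words)
    builds = 0
    counted = lambda do |string|
      builds += 1
      toy_key(string)
    end

    # Key-based: sort_by calls the block once per element.
    words.sort_by { |word| counted.call(word) }
    key_based = builds

    # Comparator-based: two keys are rebuilt for every comparison.
    builds = 0
    words.sort { |a, b| counted.call(a) <=> counted.call(b) }

    { key_based: key_based, comparator: builds }
  end
end

# 1,000 distinct strings in a scrambled order.
words = Array.new(1_000) { |i| "item#{(i * 7919) % 1_000}" }
counts = KeyBuildDemo.count_builds(words)
puts "sort_by built #{counts[:key_based]} keys, the comparison block built #{counts[:comparator]}"
```

The key-based count always equals the array size; the comparator count depends on how many comparisons the sort performs, which grows faster than the input.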

Keys are immutable and safe to share across threads, so you can build one once and reuse it — e.g. cache keys when sorting the same data repeatedly.

How it sorts

NaturalSort is a faithful port of Martin Pool's natural-order string comparison — the same algorithm PHP's strnatcmp uses. When an ordering looks ambiguous, that implementation is the source of truth.

Each string is split into runs of digits and runs of non-digits, then compared segment by segment:

  • Numbers compare numerically: "a2" sorts before "a10", and arbitrarily large integers compare exactly (no float rounding or overflow).
  • Everything else compares by byte value (case-sensitive ASCII), so every uppercase letter sorts before every lowercase one.
  • A digit run with a leading zero is treated as text, so fraction- and version-like strings order the way you'd expect:
  NaturalSort.sort(%w[1.1 1.02 1.002])  # => ["1.002", "1.02", "1.1"]
  • Whitespace is skipped — it never affects ordering on its own, though it still separates adjacent digit runs.

Comparison is byte-based and not locale-aware: non-ASCII bytes sort by byte value (for valid UTF-8, that's the same as codepoint order), and malformed or non-ASCII-compatible input — UTF-16, stray bytes — is ordered by byte rather than raising.
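
The rules above can be sketched as a rough Ruby transcription of the strnatcmp approach. NaturalOrderSketch and its method names are hypothetical, and this is an illustration of the rules rather than the gem's own code; it works on raw bytes and treats the end of a string as byte 0, as the C original does.

```ruby
# Hypothetical illustration only; not the gem's implementation.
module NaturalOrderSketch
  module_function

  def digit?(byte)
    byte.between?(48, 57) # ASCII "0".."9"
  end

  def space?(byte)
    byte == 32 || byte.between?(9, 13) # space, tab, newline, and friends
  end

  # Digit runs with no leading zero: the longer run is the larger number;
  # for equal lengths the first differing digit decides.
  def compare_right(a, ai, b, bi)
    bias = 0
    loop do
      ca = a[ai] || 0
      cb = b[bi] || 0
      return bias if !digit?(ca) && !digit?(cb)
      return -1 unless digit?(ca)
      return 1 unless digit?(cb)
      bias = ca <=> cb if bias.zero?
      ai += 1
      bi += 1
    end
  end

  # Digit runs where either side starts with zero: compare digit by digit
  # from the left, like text.
  def compare_left(a, ai, b, bi)
    loop do
      ca = a[ai] || 0
      cb = b[bi] || 0
      return 0 if !digit?(ca) && !digit?(cb)
      return -1 unless digit?(ca)
      return 1 unless digit?(cb)
      return ca <=> cb if ca != cb
      ai += 1
      bi += 1
    end
  end

  # Returns -1, 0, or 1.
  def compare(x, y)
    a = x.bytes
    b = y.bytes
    ai = bi = 0
    loop do
      # Whitespace is skipped on both sides.
      ai += 1 while space?(a[ai] || 0)
      bi += 1 while space?(b[bi] || 0)
      ca = a[ai] || 0
      cb = b[bi] || 0

      if digit?(ca) && digit?(cb)
        leading_zero = ca == 48 || cb == 48
        result = leading_zero ? compare_left(a, ai, b, bi) : compare_right(a, ai, b, bi)
        return result unless result.zero?
      end

      return 0 if ca.zero? && cb.zero? # both strings exhausted
      return ca <=> cb if ca != cb     # plain byte comparison
      ai += 1
      bi += 1
    end
  end
end

NaturalOrderSketch.compare("a2", "a10")  # => -1
NaturalOrderSketch.compare("1 0", "10")  # => -1
%w[1.1 1.02 1.002].sort { |x, y| NaturalOrderSketch.compare(x, y) }
# => ["1.002", "1.02", "1.1"]
```

Because the non-zero digit runs are compared by length first and never converted to a number, integers of any size compare exactly in this sketch.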

Surprising cases

Because this matches strnatcmp exactly, it inherits a few results that catch people off guard — all consequences of the rules above:

# A leading zero makes a number sort like a fraction, so "08" and "09" land
# BEFORE "1" — not where you'd put the eighth and ninth items.
NaturalSort.sort(%w[10 08 1 09 2])   # => ["08", "09", "1", "2", "10"]
NaturalSort.sort(%w[1.5 1.50 1.05])  # => ["1.05", "1.5", "1.50"]

# Among themselves, leading-zero numbers compare as text, so "01333" sorts
# BEFORE "0400" and "0401" — '1' beats '4' even though 1333 > 400.
NaturalSort.sort(%w[0400 01333 0401])  # => ["01333", "0400", "0401"]

# Whitespace is insignificant, so these compare equal...
NaturalSort.compare("a b", "ab")     # => 0
# ...but it still splits a number in two, so "1 0" is [1, 0], not 10:
NaturalSort.compare("1 0", "10")     # => -1

# Case-sensitive byte order: every uppercase letter sorts before every
# lowercase one (so "Z" sorts before "a").
NaturalSort.sort(%w[banana Apple apple Banana])
# => ["Apple", "Banana", "apple", "banana"]

Want case-insensitive ordering? Normalize your keys:

%w[img10 IMG2 img1].sort_by { |s| NaturalSort.key(s.downcase) }
# => ["img1", "IMG2", "img10"]

Refinements

Prefer calling methods directly? Opt into natural_sort and natural_sort_by on Array, Hash, and Set:

require "natural_sort/refinements"

using NaturalSort

%w[a1 a10 a2].natural_sort           # => ["a1", "a2", "a10"]
releases.natural_sort_by(&:number)   # => sorted by version number
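
Refinements are lexically scoped, which is why this stays opt-in. The sketch below shows that scoping with a hypothetical ToyNaturalSort module and toy_natural_sort method; the ordering inside it is a simplified chunk comparison, not the gem's algorithm.

```ruby
# ToyNaturalSort and toy_natural_sort are hypothetical names for illustration.
module ToyNaturalSort
  refine Array do
    # Simplified ordering: text chunks compare as strings, digit chunks as integers.
    def toy_natural_sort
      sort_by do |string|
        string.scan(/\d+|\D+/).map do |part|
          part.match?(/\A\d/) ? [0, part.to_i, ""] : [1, 0, part]
        end
      end
    end
  end
end

module RefinementScopeDemo
  using ToyNaturalSort # active from here to the end of this module body

  SORTED = %w[a1 a10 a2].toy_natural_sort # => ["a1", "a2", "a10"]
end

# Outside that scope Array is untouched, so the call raises NoMethodError.
LEAKED = begin
  %w[a1 a10 a2].toy_natural_sort
  true
rescue NoMethodError
  false
end
```

Code that never calls using sees a plain Array, so activating a refinement in one file does not change behavior elsewhere.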

Versioning

This project follows Semantic Versioning. The sort order itself is part of the public API: any change to how strings are ordered is a breaking change and ships only in a major release.

Contributing

Bug reports and pull requests are welcome at https://github.com/rwz/natural_sort.

License

Available as open source under the terms of the MIT License.