Natural Sort
Natural-sort ordering for Ruby — sort strings the way people read them, so
"a2" comes before "a10" instead of after it.
%w[a1 a10 a2].sort # => ["a1", "a10", "a2"] # lexical: a10 before a2
NaturalSort.sort(%w[a1 a10 a2]) # => ["a1", "a2", "a10"] # natural
Installation
Add it to your Gemfile:
gem "natural_sort"
…then bundle install. Or grab it directly:
$ gem install natural_sort
Requires Ruby 3.3 or newer.
Usage
NaturalSort is a comparator that plugs into Ruby's own sort methods — it
doesn't replace them.
list = %w[a10 a a20 a1b a1a a2 a0 a1]
NaturalSort.sort(list) # => ["a", "a0", "a1", "a1a", "a1b", "a2", "a10", "a20"]
NaturalSort.sort!(list) # same, but sorts `list` in place and returns it
list.sort(&NaturalSort) # NaturalSort works directly as the comparison block
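Passing a module with `&` relies on Ruby calling `to_proc` on it. The sketch below shows that general mechanism with a hypothetical `ToyComparator` module and a deliberately simplified ordering. It is an assumption about the technique, not the gem's actual source.

```ruby
# Hypothetical stand-in for a comparator module; not part of the gem.
module ToyComparator
  # Simplified ordering: non-digit prefix first, then the first number.
  def self.compare(a, b)
    simple_key(a) <=> simple_key(b)
  end

  def self.simple_key(str)
    [str[/\A\D*/], str[/\d+/].to_i]
  end

  # `&ToyComparator` calls this to obtain the two-argument block.
  def self.to_proc
    method(:compare).to_proc
  end
end

%w[a1 a10 a2].sort(&ToyComparator) # => ["a1", "a2", "a10"]
```

Any object that responds to `to_proc` can be handed to `sort` this way, which is why the module can stand in for a comparison block.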
sort(&NaturalSort) works because the module is a comparator. To sort by a
derived value you want a key instead — NaturalSort.key(x), or the
NaturalSort() helper — for sort_by, min_by, and friends:
require "natural_sort/kernel"
UbuntuRelease = Struct.new(:number, :name)
releases = [
UbuntuRelease.new("9.04", "Jaunty Jackalope"),
UbuntuRelease.new("10.10", "Maverick Meerkat"),
UbuntuRelease.new("8.10", "Intrepid Ibex"),
UbuntuRelease.new("10.04.4", "Lucid Lynx"),
UbuntuRelease.new("9.10", "Karmic Koala"),
]
releases.sort_by { |release| NaturalSort(release.number) }
# => releases in the order 8.10, 9.04, 9.10, 10.04.4, 10.10
NaturalSort() is a global helper — a Kernel method in the spirit of
Integer() or Array(). It lives in a separate file so that requiring the gem
(or its refinements) never adds a method to every object unless you explicitly
ask for it with require "natural_sort/kernel". If you'd rather not add a
global method, NaturalSort.key(value) does the same thing:
releases.sort_by { |release| NaturalSort.key(release.number) }
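For readers unfamiliar with conversion-style helpers, here is a rough sketch of how a capitalised Kernel method can delegate to a module method of the same name. `ToySort` and its simplified key are hypothetical and only illustrate the pattern; the gem's own definition may differ.

```ruby
# Hypothetical module; the key is a toy [prefix, number] pair.
module ToySort
  def self.key(value)
    str = value.to_s
    [str[/\A\D*/], str[/\d+/].to_i].freeze
  end
end

# Defined separately so that loading ToySort alone adds nothing global.
module Kernel
  private

  # Conversion-style helper in the spirit of Integer() or Array().
  def ToySort(value)
    ToySort.key(value)
  end
end

versions = %w[v10 v2 v1]
versions.sort_by { |v| ToySort(v) }     # => ["v1", "v2", "v10"]
versions.sort_by { |v| ToySort.key(v) } # same result, no global method needed
```

With parentheses Ruby parses `ToySort(v)` as a method call, and without them `ToySort` is the constant, so the two can share a name.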
Performance. NaturalSort.sort and sort_by with a NaturalSort.key build
one key per element; &NaturalSort re-splits both strings on every comparison
(two splits per comparison, so on the order of n log n in total instead of n).
For large arrays, prefer the key-based forms.
Keys are immutable and safe to share across threads, so you can build one once and reuse it — e.g. cache keys when sorting the same data repeatedly.
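The cost difference can be made visible by counting key builds. The sketch below uses a hypothetical `ToyKey` module as a stand-in for a natural-sort key (a simplified key that places digit runs before text, not the gem's) and compares the key-based, block-based, and cached approaches.

```ruby
# Hypothetical key builder that counts how often it runs; not part of the gem.
module ToyKey
  class << self
    attr_accessor :builds # number of keys built so far
  end
  self.builds = 0

  def self.build(str)
    self.builds += 1
    # Same-shaped triples keep Array#<=> well defined across segment kinds.
    str.scan(/\d+|\D+/).map do |seg|
      seg.match?(/\A\d/) ? [0, seg.to_i, ""] : [1, 0, seg]
    end.freeze
  end
end

list = %w[a10 a a20 a1b a1a a2 a0 a1]

ToyKey.builds = 0
list.sort_by { |s| ToyKey.build(s) }
key_based_builds = ToyKey.builds   # one build per element

ToyKey.builds = 0
list.sort { |a, b| ToyKey.build(a) <=> ToyKey.build(b) }
block_based_builds = ToyKey.builds # two builds per comparison

ToyKey.builds = 0
cache = list.to_h { |s| [s, ToyKey.build(s)] } # build once...
list.sort_by { |s| cache[s] }
list.reverse.sort_by { |s| cache[s] }          # ...reuse for later sorts
cached_builds = ToyKey.builds      # still one build per element
```

The exact block-based count depends on Ruby's sort implementation, but it is always at least two builds for each of the n - 1 comparisons any sort must make.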
How it sorts
NaturalSort is a faithful port of Martin Pool's natural-order string
comparison — the same algorithm PHP's strnatcmp uses. When an
ordering looks ambiguous, that implementation is the source of truth.
Each string is split into runs of digits and runs of non-digits, then compared segment by segment:
- Numbers compare numerically: "a2" sorts before "a10", and arbitrarily large integers compare exactly (no float rounding or overflow).
- Everything else compares by byte value (case-sensitive ASCII), so every uppercase letter sorts before every lowercase one.
- A digit run with a leading zero is treated as text, so fraction- and version-like strings order the way you'd expect:
NaturalSort.sort(%w[1.1 1.02 1.002]) # => ["1.002", "1.02", "1.1"]
- Whitespace is skipped — it never affects ordering on its own, though it still separates adjacent digit runs.
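The first three rules can be approximated in a few lines of plain Ruby. This is a simplified sketch with hypothetical `toy_*` helpers: it ignores the whitespace rule entirely and is not the gem's implementation or a faithful strnatcmp port, so edge cases may differ.

```ruby
# Split a string into alternating runs of digits and non-digits.
def toy_segments(str)
  str.scan(/\d+|\D+/)
end

# Compare two segments under the simplified rules.
def toy_segment_compare(x, y)
  both_digits = x.match?(/\A\d/) && y.match?(/\A\d/)
  if both_digits && !x.start_with?("0") && !y.start_with?("0")
    x.to_i <=> y.to_i # numeric; Ruby integers never overflow
  else
    x.b <=> y.b       # byte order; also applies to leading-zero runs
  end
end

def toy_compare(a, b)
  left = toy_segments(a)
  right = toy_segments(b)
  left.zip(right).each do |x, y|
    return 1 if y.nil? # b ran out of segments first, so a sorts after it
    result = toy_segment_compare(x, y)
    return result unless result.zero?
  end
  left.size <=> right.size # equal so far: the shorter string sorts first
end

%w[a10 a2 a1].sort { |a, b| toy_compare(a, b) }
# => ["a1", "a2", "a10"]
%w[1.1 1.02 1.002].sort { |a, b| toy_compare(a, b) }
# => ["1.002", "1.02", "1.1"]
```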
Comparison is byte-based and not locale-aware: non-ASCII bytes sort by byte value (for valid UTF-8, that's the same as codepoint order), and malformed or non-ASCII-compatible input — UTF-16, stray bytes — is ordered by byte rather than raising.
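The byte-versus-codepoint claim can be checked with plain Ruby, independent of the gem. The `byte_order` and `codepoint_order` helpers below are hypothetical names used only for this illustration.

```ruby
# Order strings by their raw bytes.
def byte_order(strings)
  strings.sort_by(&:bytes)
end

# Order strings by their Unicode codepoints.
def codepoint_order(strings)
  strings.sort_by(&:codepoints)
end

words = ["z", "\u00e9", "a", "Z", "\u65e5"] # "é" is U+00E9, "日" is U+65E5

byte_order(words)      # => ["Z", "a", "z", "é", "日"]
codepoint_order(words) # same order for valid UTF-8
```

Note that "é" lands after "z": byte order knows nothing about alphabets or locale collation.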
Surprising cases
Because this matches strnatcmp exactly, it inherits a few results that catch
people off guard — all consequences of the rules above:
# A leading zero makes a number sort like a fraction, so "08" and "09" land
# BEFORE "1" — not where you'd put the eighth and ninth items.
NaturalSort.sort(%w[10 08 1 09 2]) # => ["08", "09", "1", "2", "10"]
NaturalSort.sort(%w[1.5 1.50 1.05]) # => ["1.05", "1.5", "1.50"]
# Among themselves, leading-zero numbers compare as text, so "01333" sorts
# BEFORE "0400" and "0401": '1' sorts before '4' even though 1333 > 400.
NaturalSort.sort(%w[0400 01333 0401]) # => ["01333", "0400", "0401"]
# Whitespace is insignificant, so these compare equal...
NaturalSort.compare("a b", "ab") # => 0
# ...but it still splits a number in two, so "1 0" is [1, 0], not 10:
NaturalSort.compare("1 0", "10") # => -1
# Case-sensitive byte order: every uppercase letter sorts before every
# lowercase one (so "Z" sorts before "a").
NaturalSort.sort(%w[banana Apple apple Banana])
# => ["Apple", "Banana", "apple", "banana"]
Want case-insensitive ordering? Normalize your keys:
%w[img10 IMG2 img1].sort_by { |s| NaturalSort.key(s.downcase) }
# => ["img1", "IMG2", "img10"]
Refinements
Prefer calling methods directly? Opt into natural_sort and natural_sort_by
on Array, Hash, and Set:
require "natural_sort/refinements"
using NaturalSort
%w[a1 a10 a2].natural_sort # => ["a1", "a2", "a10"]
releases.natural_sort_by(&:number) # => sorted by version number
Versioning
This project follows Semantic Versioning. The sort order itself is part of the public API: any change to how strings are ordered is a breaking change and ships only in a major release.
Contributing
Bug reports and pull requests are welcome at https://github.com/rwz/natural_sort.
License
Available as open source under the terms of the MIT License.