Pure Ruby Apache Arrow reader/writer
((<Apache Arrow|URL:arrow.apache.org/>)) is the de facto standard data format in modern data processing systems. We can use the official ((<Red Arrow|URL:rubygems.org/gems/red-arrow>)) gem to process Apache Arrow data. It is well suited to fast processing of large data, but it is overkill when all you need is low-cost data exchange. Red Arrow is also larger and harder to install than pure Ruby gems because it is implemented as bindings.
I’m implementing the official pure Ruby Apache Arrow reader/writer to cover those low-cost data exchange needs. I expect that, with a pure Ruby reader/writer available, more Ruby libraries and applications will add support for Apache Arrow input and output, and that Ruby will in turn be used more for data processing.
This talk describes how to implement a fast pure Ruby binary data reader/writer and discusses the future of data processing in Ruby.
This is a 2025 Ruby Association Grant project: ((<URL:www.ruby.or.jp/en/news/20251030>))
License
Slide
CC BY-SA 4.0
Use the following for author attribution:
* Sutou Kouhei
ClearCode Inc. logo
CC BY-SA 4.0
Author: ClearCode Inc.
It is used in the page header and on some pages of the slides.
For author
Show
rake
Publish
rake publish
For viewers
Install
gem install rabbit--kou-rubykaigi-2026
Show
rabbit rabbit--kou-rubykaigi-2026.gem