Bemi

A suite of tools for reliably tracking data changes without extra Rails model callbacks.

Bemi stands for "beginner mindset" and is pronounced as [ˈbɛmɪ].

Overview

  • Automatically storing database changes and any additional context in a structured form
  • High performance without affecting your code execution with callbacks
  • 100% reliability by using a design pattern called Change Data Capture
  • Easy to use, no data engineering knowledge or complex infrastructure is required
  • Web UI and code tools for inspecting and auditing data changes and user activity
  • Works with the most popular databases like MySQL, PostgreSQL and MongoDB (soon)

Code example

Here is an example of storing all data changes made when processing an HTTP request:

class ApplicationController < ActionController::Base
  before_action :set_bemi_context

  private

  # Attach any information you want to any subsequent data changes
  def set_bemi_context
    Bemi.set_context(
      user_id: current_user&.id,
      ip: request.remote_ip,
      user_agent: request.user_agent,
      controller: "#{self.class.name}##{action_name}",
    )
  end
end

class InvoicesController < ApplicationController
  # Automatically store *any* database changes
  def update
    invoice = Invoice.find(params[:id])
    invoice.update_column(:due_date, params[:due_date])
    invoice.client.recurring_schedule.delete
  end
end
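
The `set_bemi_context` callback above attaches data to the current request. One common way to carry such per-request data in Ruby is fiber-local storage, and the following is a minimal sketch under that assumption. `ContextSketch` and its methods are hypothetical names used for illustration only; they are not Bemi's API and do not describe how Bemi is implemented.

```ruby
# Hypothetical stand-in for a per-request context store (not part of Bemi).
module ContextSketch
  KEY = :context_sketch

  # Remember the context for the current fiber. Thread.current[] storage is
  # fiber-local in Ruby, so concurrent requests do not see each other's data.
  def self.set_context(context)
    # Drop nil values, e.g. user_id when nobody is signed in.
    Thread.current[KEY] = context.compact
  end

  # Return the stored context, or an empty hash if none was set.
  def self.context
    Thread.current[KEY] || {}
  end

  # Reset the context, e.g. at the end of a request.
  def self.clear
    Thread.current[KEY] = nil
  end
end

ContextSketch.set_context(user_id: 3195, ip: "127.0.0.1", user_agent: nil)
p ContextSketch.context
ContextSketch.clear
```

Clearing the context at the end of each request matters in this sketch because application servers typically reuse threads across requests.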

You can then query the recorded data changes:

Bemi.activity(ip: '127.0.0.1').map(&:pretty_print)

# Bemi::Changeset
#   - id: 2040
#   - table: "invoices"
#   - external_id: 43
#   - action: "update"
#   - committed_at: Sat, 03 Jun 2023 21:16:22 UTC +00:00
#   - change:
#     - updated_at: ["2023-06-03 20:41:35", "2023-06-03 21:16:22"]
#     - due_date: ["2023-06-03", "2023-06-30"]
#   - context:
#     - ip: "127.0.0.1"
#     - user_id: 3195
#     - controller: "InvoicesController#update"
#     - user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
#
# Bemi::Changeset
#   - id: 2041
#   - table: "recurring_schedules"
#   - external_id: 5
#   - action: "delete"
#   - committed_at: Sat, 03 Jun 2023 21:16:22 UTC +00:00
#   - change:
#     - id: 5
#     - frequency: 1
#     - occurrences: 0
#     - invoice_id: 43
#     - created_at: "2023-04-28 20:34:09"
#     - updated_at: "2023-04-28 20:34:09"
#   - context:
#     - ip: "127.0.0.1"
#     - user_id: 3195
#     - controller: "InvoicesController#update"
#     - user_agent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"

Architecture

Bemi is designed to be lightweight, composable, and simple to use by default.

         /‾‾‾\
         \___/
       __/   \__
      /   User  \
           │
           │                       Application code
 - - - - - │ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
╵          │                                                                          ╵
╵          │ Update invoice                                                           ╵
╵          ∨                                                                          ╵
╵    ______________                                                 ______________    ╵
╵   ┆              ┆                                               ┆              ┆   ╵
╵   ┆    Rails     ┆                          Structured changes   ┆     Bemi     ┆   ╵
╵   ┆    server    ┆                       ╷–––––––––––––––––––––– ┆    process   ┆   ╵
╵   ┆              ┆                       │                       ┆              ┆   ╵
╵    ‾‾‾‾‾‾‾‾‾‾‾‾‾‾                        │                        ‾‾‾‾‾‾‾‾‾‾‾‾‾‾    ╵
╵          │                               │                               ⌃          ╵
╵          │ Database query                │               Replication log │          ╵
╵          │                               │                               │          ╵
 - - - - - │ - - - - - - - - - - - - - - - │ - - - - - - - - - - - - - - - │ - - - - -╵
           │                               ∨                               │
           │                         [‾‾‾‾‾‾‾‾‾‾‾‾]                        │
           │                         [------------]                        │
           ╵–––––––––––––––––––––––> [  Database  ] –––––––––––––––––––––––╵
                                     [------------]
                                     [____________]

Bemi reuses the same connection configuration and runs a simple process that reads the replication log that databases usually use to communicate within the same cluster:

  • Binary Log for MySQL
  • Write-Ahead Log for PostgreSQL
  • Oplog for MongoDB

By default, it stores the structured data changes in the same database.
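
To illustrate how a replication log entry could be turned into a structured change like the ones shown in the code example, here is a minimal sketch in plain Ruby. The `build_changeset` helper and the event shape (`before` and `after` row hashes) are assumptions made for illustration. Real replication log formats differ per database, and this is not Bemi's actual implementation.

```ruby
# Hypothetical helper: builds a structured changeset from the row state
# before and after a change. Not part of Bemi.
def build_changeset(table:, action:, before:, after:)
  change =
    case action
    when "update"
      # Keep only the columns whose value changed, as [old, new] pairs.
      after.each_with_object({}) do |(column, new_value), diff|
        old_value = before[column]
        diff[column] = [old_value, new_value] unless old_value == new_value
      end
    when "delete"
      # A deleted row has no "after" state, so record the full previous row.
      before
    else
      # A created row has no "before" state, so record the full new row.
      after
    end

  row = before || after
  { table: table, external_id: row[:id], action: action, change: change }
end

changeset = build_changeset(
  table: "invoices",
  action: "update",
  before: { id: 43, due_date: "2023-06-03", updated_at: "2023-06-03 20:41:35" },
  after: { id: 43, due_date: "2023-06-30", updated_at: "2023-06-03 21:16:22" },
)
p changeset
```

Because the diff is computed from the logged row states rather than from model callbacks, changes made with methods that skip callbacks, such as `update_column`, would still be captured in this approach.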

Usage

Installation

Add gem 'bemi' to your application's Gemfile and execute:

$ bundle install

Database migration

Create a new database migration to store changesets and their context in a structured form:

$ bundle exec rails g migration create_bemi_tables

Then paste the following into the created migration file:

# db/migrate/20230603190131_create_bemi_tables.rb
CreateBemiTables = Class.new(Bemi.generate_migration)

And run:

$ bundle exec rails db:migrate

Bemi process

Alternatives

Background jobs with persistent state

Tools like Sidekiq, Que, and GoodJob are similar since they execute jobs in the background, persist the execution state, retry, etc. These tools, however, focus on executing a single job as a unit of work. Bemi can use these tools to perform single actions when managing chains of actions defined in workflows, without the need for complex callbacks.

Bemi orchestrates workflows instead of trying to choreograph them. This makes the code easier to implement and maintain, reduces coordination overhead by having a central coordinator, improves observability, and simplifies troubleshooting.

Orchestration ![Orchestration](images/orchestration.jpg)
Choreography ![Choreography](images/choreography.jpg)

Workflow orchestration tools and services

Tools like Temporal, AWS Step Functions, Argo Workflows, and Airflow allow orchestrating workflows, although they use quite different approaches.

Temporal grew out of challenges faced by big-tech and enterprise companies. As a result, it has a complex architecture with deployed clusters, support for databases like Cassandra and optional Elasticsearch, and multiple services for frontend, matching, history, etc. Its main differentiator is writing workflows imperatively instead of describing them declaratively (think of state machines). This makes code a lot more complex and forces you to mix business logic with implementation and execution details. Some would argue that Temporal's development and user experience are quite rough. Plus, at the time of this writing, it doesn't have an official stable SDK for our favorite programming language (Ruby).

AWS Step Functions relies on AWS Lambda to execute each action in a workflow. For various reasons, not everyone can use AWS and its serverless solution. Additionally, workflows must be defined in JSON using the Amazon States Language instead of a regular programming language.

Argo Workflows relies on Kubernetes. It is closer to infrastructure-level workflows since it runs a container for each workflow action and doesn't provide code-level features and primitives. Additionally, it requires defining workflows in YAML.

Airflow is a popular tool for data engineering pipelines. Unfortunately, it works only with Python.

Ruby frameworks for writing better code

There are many libraries that also implement useful patterns and help organize code better, for example, Interactor, ActiveInteraction, Mutations, Dry-Rb, and Trailblazer. They, however, don't help with asynchronous and distributed execution with the stronger reliability guarantees that many of us rely on to execute code "out-of-band" and keep long-running workflows out of the request/response lifecycle, for example, when sending emails, sending requests to other services, or running multiple actions in parallel.

License

The gem is available as open source under the terms of the MIT License.

Code of Conduct

Everyone interacting in the Bemi project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.