Class: Pcrd::Schema::Packer
- Inherits: Object
  - Object
  - Pcrd::Schema::Packer
- Defined in: lib/pcrd/schema/packer.rb
Overview
Analyzes column alignment and computes optimal ordering to minimize padding waste.
PostgreSQL stores columns in definition order, and each column must start at an offset aligned to its type’s alignment boundary. When a column with a small alignment (e.g. bool, 1 byte) precedes a column with a larger alignment (e.g. timestamp, 8 bytes), PostgreSQL inserts padding bytes before the larger column to satisfy its alignment requirement. Ordering fixed-size columns by descending alignment removes the padding between them, provided each column's size is a multiple of its alignment.
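The alignment rule above comes down to rounding an offset up to the next multiple of the alignment. The documented code calls a `padding_needed(offset, align)` helper whose body is not shown on this page, so the version below is an assumed implementation, given only to make the bool-then-timestamp example concrete.

```ruby
# Assumed implementation of the helper used by #layout (its real body is
# not shown in this documentation): bytes of padding required so that
# `offset` lands on the next multiple of `align`.
def padding_needed(offset, align)
  (align - offset % align) % align
end

# A bool occupies byte 0, so the next free offset is 1.
offset_after_bool = 0 + 1

# A timestamp must start on an 8-byte boundary.
padding = padding_needed(offset_after_bool, 8)

puts "padding before timestamp: #{padding}"                       # 7
puts "timestamp starts at offset: #{offset_after_bool + padding}" # 8
```

Seven of the first eight bytes are padding in this ordering. With the timestamp first, the bool would sit at offset 8 and need none.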
Variable-length columns (text, varchar, numeric, etc.) are modeled as a 4-byte-aligned varlena header. Their content length is not predictable, so padding estimates count only the 4-byte header. They are placed last so that they never force padding in front of a fixed-size column; the first variable-length column may still need up to 3 bytes of padding to reach a 4-byte boundary.
Defined Under Namespace
Classes: LayoutEntry
Instance Method Summary
- #estimated_row_size(columns) ⇒ Object
  Estimated bytes consumed by fixed-length columns plus alignment padding.
- #layout(columns) ⇒ Object
  Computes the per-column layout (offset and padding before each column).
- #optimize(columns) ⇒ Object
  Returns columns in optimal order: 8-byte → 4-byte → 2-byte → 1-byte → variable.
- #report(columns) ⇒ Object
  Returns a report hash comparing current vs optimal layout.
- #total_padding(columns) ⇒ Object
  Total wasted padding bytes across all columns.
Instance Method Details
#estimated_row_size(columns) ⇒ Object
Estimated bytes consumed by fixed-length columns plus alignment padding. Variable-length columns contribute 4 bytes each (header only).
# File 'lib/pcrd/schema/packer.rb', line 50
def estimated_row_size(columns)
  layout(columns).sum do |e|
    e.padding_before + (e.column.fixed? ? e.column.fixed_size : 4)
  end
end
#layout(columns) ⇒ Object
Computes the per-column layout (offset and padding before each column). Returns Array<LayoutEntry>.
# File 'lib/pcrd/schema/packer.rb', line 34
def layout(columns)
  offset = 0
  entries = []

  columns.each do |col|
    align = col.fixed? ? col.alignment : 4  # varlena header is 4-byte aligned
    padding = padding_needed(offset, align)
    entries << LayoutEntry.new(column: col, offset: offset + padding, padding_before: padding)
    offset += padding + (col.fixed? ? col.fixed_size : 4)  # count header only for varlena
  end

  entries
end
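To see what the layout walk produces, here is a standalone sketch that mirrors it on a small sample table. `Column`, `LayoutEntry`, `padding_needed`, `sketch_layout` and the sample columns are hypothetical stand-ins written for this example. The real column class only needs to respond to `fixed?`, `fixed_size` and `alignment`, as the method above shows.

```ruby
# Hypothetical stand-ins for the real column and LayoutEntry classes.
Column = Struct.new(:name, :fixed_size, :alignment, keyword_init: true) do
  def fixed?
    !fixed_size.nil?
  end
end
LayoutEntry = Struct.new(:column, :offset, :padding_before, keyword_init: true)

# Assumed helper: bytes needed to round `offset` up to a multiple of `align`.
def padding_needed(offset, align)
  (align - offset % align) % align
end

# Same walk as the documented #layout, written as a free function.
def sketch_layout(columns)
  offset = 0
  columns.map do |col|
    align = col.fixed? ? col.alignment : 4 # varlena header is 4-byte aligned
    size = col.fixed? ? col.fixed_size : 4 # header only for varlena
    padding = padding_needed(offset, align)
    entry = LayoutEntry.new(column: col, offset: offset + padding, padding_before: padding)
    offset += padding + size
    entry
  end
end

# Sample table in definition order (sizes and alignments chosen for illustration).
COLUMNS = [
  Column.new(name: "is_active", fixed_size: 1, alignment: 1),  # bool
  Column.new(name: "created_at", fixed_size: 8, alignment: 8), # timestamp
  Column.new(name: "qty", fixed_size: 2, alignment: 2),        # smallint
  Column.new(name: "user_id", fixed_size: 4, alignment: 4),    # integer
  Column.new(name: "name")                                     # text (variable)
].freeze

sketch_layout(COLUMNS).each do |e|
  puts format("%-10s offset=%2d padding_before=%d", e.column.name, e.offset, e.padding_before)
end
```

For this ordering the sketch reports 7 bytes of padding before `created_at` and 2 before `user_id`, so 9 of the estimated 28 bytes are padding.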
#optimize(columns) ⇒ Object
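Returns columns in optimal order: 8-byte → 4-byte → 2-byte → 1-byte → variable. Within each alignment tier, larger fixed-size columns come first, and columns of equal size keep their original relative order. Variable-length columns keep their original order at the end.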
# File 'lib/pcrd/schema/packer.rb', line 25
def optimize(columns)
  fixed = columns.select(&:fixed?)
  variable = columns.select(&:variable?)
  sorted_fixed = fixed.sort_by.with_index { |c, i| [-c.alignment, -c.fixed_size, i] }
  sorted_fixed + variable
end
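The sort key `[-alignment, -fixed_size, index]` is easier to follow on a concrete list. The sketch below reuses the same sort on a hypothetical `Column` struct. The column names, sizes and alignments are assumptions chosen so that both tie-breakers are exercised.

```ruby
# Hypothetical stand-in for the real column class.
Column = Struct.new(:name, :fixed_size, :alignment, keyword_init: true) do
  def fixed?
    !fixed_size.nil?
  end

  def variable?
    !fixed?
  end
end

# Same ordering as the documented #optimize, written as a free function.
def sketch_optimize(columns)
  fixed = columns.select(&:fixed?)
  variable = columns.select(&:variable?)
  # Larger alignment first, then larger size, then original position.
  sorted = fixed.sort_by.with_index { |c, i| [-c.alignment, -c.fixed_size, i] }
  sorted + variable
end

COLUMNS = [
  Column.new(name: "flag", fixed_size: 1, alignment: 1),   # bool
  Column.new(name: "big_id", fixed_size: 8, alignment: 8), # bigint
  Column.new(name: "note"),                                # text (variable)
  Column.new(name: "span", fixed_size: 16, alignment: 8),  # interval
  Column.new(name: "count", fixed_size: 4, alignment: 4),  # integer
  Column.new(name: "ts", fixed_size: 8, alignment: 8)      # timestamp
].freeze

puts sketch_optimize(COLUMNS).map(&:name).join(", ")
# span, big_id, ts, count, flag, note
```

`span` moves ahead of `big_id` because it is larger within the 8-byte tier, while `big_id` and `ts` tie on size and keep their original relative order.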
#report(columns) ⇒ Object
Returns a report hash comparing current vs optimal layout.
# File 'lib/pcrd/schema/packer.rb', line 62
def report(columns)
  optimal_order = optimize(columns)
  current_size = estimated_row_size(columns)
  optimal_size = estimated_row_size(optimal_order)
  saved_bytes = current_size - optimal_size
  pct = current_size > 0 ? (saved_bytes.to_f / current_size * 100).round(1) : 0.0

  {
    current_columns: columns,
    optimal_columns: optimal_order,
    current_layout: layout(columns),
    optimal_layout: layout(optimal_order),
    current_size: current_size,
    optimal_size: optimal_size,
    saved_bytes: saved_bytes,
    savings_pct: pct,
    already_optimal: saved_bytes.zero?
  }
end
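A caller might consume the report hash as in the sketch below. The `summarize` helper and the sample sizes (28 and 20 bytes) are hypothetical. Only the hash keys and the savings arithmetic are taken from the method above, and the hash is built by hand so the example runs without the rest of the library.

```ruby
# Hypothetical consumer of the report hash: turns it into one line of text.
def summarize(report)
  if report[:already_optimal]
    "Column order is already optimal (#{report[:current_size]} bytes per row)."
  else
    "Reordering saves #{report[:saved_bytes]} bytes per row " \
      "(#{report[:current_size]} -> #{report[:optimal_size]}, #{report[:savings_pct]}%)."
  end
end

# Made-up sizes for illustration; #report derives these from the layouts.
current_size = 28
optimal_size = 20
saved_bytes = current_size - optimal_size

# Hand-built hash using a subset of the keys that #report returns.
SAMPLE_REPORT = {
  current_size: current_size,
  optimal_size: optimal_size,
  saved_bytes: saved_bytes,
  savings_pct: (saved_bytes.to_f / current_size * 100).round(1),
  already_optimal: saved_bytes.zero?
}.freeze

puts summarize(SAMPLE_REPORT)
# Reordering saves 8 bytes per row (28 -> 20, 28.6%).
```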
#total_padding(columns) ⇒ Object
Total wasted padding bytes across all columns.
# File 'lib/pcrd/schema/packer.rb', line 57
def total_padding(columns)
  layout(columns).sum(&:padding_before)
end