Class: DuckDB::Connection

Inherits:

Object

Object
DuckDB::Connection

show all

Includes:: TableNameParser

Defined in:: lib/duckdb/connection.rb,
ext/duckdb/connection.c

Overview

The DuckDB::Connection encapsulates connection with DuckDB database.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query(sql)

Instance Method Summary collapse

#appender(table, schema: nil, catalog: nil) ⇒ Object

:call-seq: connection.appender(table, schema: nil, catalog: nil) -> DuckDB::Appender connection.appender(table, schema: nil, catalog: nil) { |appender| … } -> self.
#appender_from_query(query, types, table_name = nil, column_names = nil) ⇒ Object

:call-seq: connection.appender_from_query(query, types, table_name = nil, column_names = nil) -> DuckDB::Appender.
#async_query(sql, *args, **kwargs) ⇒ Object (also: #async_execute)

executes sql with args asynchronously.
#connect(db) ⇒ Object (also: #open)

connects DuckDB database The first argument is DuckDB::Database object.
#disconnect ⇒ Object (also: #close)
#expose_as_table(object, name, columns: nil) ⇒ void

Exposes a Ruby object as a queryable DuckDB table function via a registered adapter.
#extract(sql) ⇒ Array<DuckDB::PreparedStatement>

Extracts multiple SQL statements and returns them as prepared statements without executing them.
#interrupt ⇒ nil

Interrupts the currently running query.
#prepared_statement(str) ⇒ Object (also: #prepare)

returns PreparedStatement object.
#query(sql, *args, **kwargs) ⇒ Object (also: #execute)

executes sql with args.
#query_multi_sql(sql) ⇒ Object
#query_progress ⇒ Object

Returns the progress of the currently running query.
#register_aggregate_function(aggregate_function) ⇒ self

Registers an aggregate function with the connection.
#register_aggregate_function_set(aggregate_function_set) ⇒ self

Registers an aggregate function set with the connection.
#register_logical_type(logical_type) ⇒ self

Registers a custom logical type with the connection.
#register_scalar_function(scalar_function = nil, **kwargs) ⇒ void

Registers a scalar function with the connection.
#register_scalar_function_set(scalar_function_set) ⇒ self

Registers a scalar function set with the connection.
#register_table_function(table_function) ⇒ Object

Registers a table function with the database connection.

Instance Method Details

#appender(table, schema: nil, catalog: nil) ⇒ `Object`

:call-seq:

connection.appender(table, schema: nil, catalog: nil) -> DuckDB::Appender
connection.appender(table, schema: nil, catalog: nil) { |appender| ... } -> self

Returns a DuckDB::Appender for bulk-inserting rows into table. If a block is given, the appender is flushed and closed automatically after the block.

schema: and catalog: optionally qualify the table.

Raises DuckDB::Error if the table (or schema/catalog) does not exist.

Table name parsing (quoting, dot-notation) is handled by DuckDB::Appender.new. See DuckDB::Appender.new for details on quoting and dot-notation.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE users (id INTEGER, name VARCHAR)')

# block form (recommended) — flushes and closes automatically
con.appender('users') do |a|
  a.append_row(1, 'Alice')
  a.append_row(2, 'Bob')
end

# with schema
con.appender('users', schema: 'main') do |a|
  a.append_row(3, 'Carol')
end

# manual form
appender = con.appender('users')
appender.append_row(4, 'Dave')
appender.close

# File 'lib/duckdb/connection.rb', line 166

def appender(table, schema: nil, catalog: nil, &)
  table, schema, catalog = parse_connection_appender_table(table, schema, catalog)
  run_appender_block(Appender.new(self, table, schema: schema, catalog: catalog), &)
end

#appender_from_query(query, types, table_name = nil, column_names = nil) ⇒ `Object`

:call-seq:

connection.appender_from_query(query, types, table_name = nil, column_names = nil) -> DuckDB::Appender

Creates an appender object that executes the given query with any data appended to it. The ‘table_name` parameter is used to refer to the appended data in the query. If omitted, it defaults to “appended_data”. The `column_names` parameter provides names for the columns of the appended data. If omitted, it defaults to “col1”, “col2”, etc.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE t (i INT PRIMARY KEY, value VARCHAR)')
query = 'INSERT OR REPLACE INTO t SELECT i, val FROM my_appended_data'
types = [DuckDB::LogicalType::INTEGER, DuckDB::LogicalType::VARCHAR]
appender = con.appender_from_query(query, types, 'my_appended_data', %w[i val])
appender.append_row(1, 'hello world')
appender.close

# File 'lib/duckdb/connection.rb', line 190

def appender_from_query(query, types, table_name = nil, column_names = nil, &)
  appender = Appender.create_query(self, query, types, table_name, column_names)
  run_appender_block(appender, &)
end

#async_query(sql, *args, **kwargs) ⇒ `Object` Also known as: async_execute

executes sql with args asynchronously. The first argument sql must be SQL string. The rest arguments are parameters of SQL string. This method returns DuckDB::PendingResult object.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
pending_result = con.async_query(sql, name: 'Dave', email: 'dave@example.com')
pending_result.execute_task while pending_result.state == :not_ready
result = pending_result.execute_pending
result.each.first

# File 'lib/duckdb/connection.rb', line 87

def async_query(sql, *args, **kwargs)
  prepare(sql) do |stmt|
    stmt.bind_args(*args, **kwargs)
    stmt.pending_prepared
  end
end

#connect(db) ⇒ `Object` Also known as: open

connects DuckDB database The first argument is DuckDB::Database object

# File 'lib/duckdb/connection.rb', line 96

def connect(db)
  conn = _connect(db)
  return conn unless block_given?

  begin
    yield conn
  ensure
    conn.disconnect
  end
end

#disconnect ⇒ `Object` Also known as: close

# File 'ext/duckdb/connection.c', line 75

static VALUE connection_disconnect(VALUE self) {
    rubyDuckDBConnection *ctx;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    duckdb_disconnect(&(ctx->con));

    /* Clear registered functions to release memory */
    rb_ary_clear(ctx->registered_functions);

    return self;
}

#expose_as_table(object, name, columns: nil) ⇒ `void`

This method returns an undefined value.

Exposes a Ruby object as a queryable DuckDB table function via a registered adapter.

Looks up a table adapter registered for the object’s class via DuckDB::TableFunction.add_table_adapter, then uses it to create and register a table function under the given name.

Examples:

Expose a CSV as a table

require 'csv'
DuckDB::TableFunction.add_table_adapter(CSV, CSVTableAdapter.new)
csv = CSV.new(File.read('data.csv'), headers: true)
con.expose_as_table(csv, 'csv_table')
con.query('SELECT * FROM csv_table()').to_a

With explicit column types

con.expose_as_table(csv, 'csv_table', columns: {
  'id'   => DuckDB::LogicalType::BIGINT,
  'name' => DuckDB::LogicalType::VARCHAR
})

Parameters:

object (Object) —

the Ruby object to expose as a table (e.g. a CSV instance)
name (String) —

the SQL name of the table function
columns (Hash{String => DuckDB::LogicalType}, nil) (defaults to: nil) —

optional column schema override; if omitted, the adapter determines the columns (e.g. from headers or inference)

Raises:

(ArgumentError) —

if no adapter is registered for the object’s class

# File 'lib/duckdb/connection.rb', line 378

def expose_as_table(object, name, columns: nil)
  adapter = TableFunction.table_adapter_for(object.class)
  raise ArgumentError, "No table adapter registered for #{object.class}" if adapter.nil?

  tf = adapter.call(object, name, columns:)
  register_table_function(tf)
end

#extract(sql) ⇒ `Array<DuckDB::PreparedStatement>`

Extracts multiple SQL statements and returns them as prepared statements without executing them. The caller controls when to execute and destroy each statement.

Examples:

Execute each statement and destroy

stmts = con.extract('SELECT 1; SELECT 2; SELECT 3;')
stmts.each do |stmt|
  result = stmt.execute
  # process result...
  stmt.destroy
end

Parameters:

sql (String) —

a string containing one or more SQL statements separated by semicolons

Returns:

(Array<DuckDB::PreparedStatement>) —

the extracted prepared statements

Raises:

(DuckDB::Error) —

if the SQL is invalid

# File 'lib/duckdb/connection.rb', line 66

def extract(sql)
  stmts = ExtractedStatements.new(self, sql)
  stmts.to_a
ensure
  stmts&.destroy
end

#interrupt ⇒ `nil`

Interrupts the currently running query.

db = DuckDB::Database.open
conn = db.connect
con.query('SET ENABLE_PROGRESS_BAR=true')
con.query('SET ENABLE_PROGRESS_BAR_PRINT=false')
pending_result = con.async_query('slow query')

pending_result.execute_task
con.interrupt # => nil

Returns:

(nil)

# File 'ext/duckdb/connection.c', line 102

static VALUE connection_interrupt(VALUE self) {
    rubyDuckDBConnection *ctx;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    duckdb_interrupt(ctx->con);

    return Qnil;
}

#prepared_statement(str) ⇒ `Object` Also known as: prepare

returns PreparedStatement object. The first argument is SQL string. If block is given, the block is executed with PreparedStatement object and the object is cleaned up immediately.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
stmt = con.prepared_statement(sql)
stmt.bind_args(name: 'Dave', email: 'dave@example.com')
result = stmt.execute

# or
result = con.prepared_statement(sql) do |stmt|
           stmt.bind_args(name: 'Dave', email: 'dave@example.com')
           stmt.execute
         end

# File 'lib/duckdb/connection.rb', line 126

def prepared_statement(str, &)
  return PreparedStatement.new(self, str) unless block_given?

  PreparedStatement.prepare(self, str, &)
end

#query(sql, *args, **kwargs) ⇒ `Object` Also known as: execute

executes sql with args. The first argument sql must be SQL string. The rest arguments are parameters of SQL string.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect
users = con.query('SELECT * FROM users')
sql = 'SELECT * FROM users WHERE name = ? AND email = ?'
dave = con.query(sql, 'Dave', 'dave@example.com')

# or You can use named parameter.

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
dave = con.query(sql, name: 'Dave', email: 'dave@example.com')

# File 'lib/duckdb/connection.rb', line 28

def query(sql, *args, **kwargs)
  return query_multi_sql(sql) if args.empty? && kwargs.empty?

  prepare(sql) do |stmt|
    stmt.bind_args(*args, **kwargs)
    stmt.execute
  end
end

#query_multi_sql(sql) ⇒ `Object`

# File 'lib/duckdb/connection.rb', line 37

def query_multi_sql(sql)
  stmts = ExtractedStatements.new(self, sql)
  return _query_sql(sql) if stmts.size == 1

  result = nil
  stmts.each do |stmt|
    result = stmt.execute
    stmt.destroy
  end
  result
ensure
  stmts&.destroy
end

#query_progress ⇒ `Object`

Returns the progress of the currently running query.

require 'duckdb'

db = DuckDB::Database.open
conn = db.connect
con.query('SET ENABLE_PROGRESS_BAR=true')
con.query('SET ENABLE_PROGRESS_BAR_PRINT=false')
con.query_progress # => -1.0
pending_result = con.async_query('slow query')
con.query_progress # => 0.0
pending_result.execute_task
con.query_progress # => Float

# File 'ext/duckdb/connection.c', line 126

static VALUE connection_query_progress(VALUE self) {
    rubyDuckDBConnection *ctx;
    duckdb_query_progress_type progress;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    progress = duckdb_query_progress(ctx->con);

    return rb_funcall(mDuckDBConverter, rb_intern("_to_query_progress"), 3, DBL2NUM(progress.percentage), ULL2NUM(progress.rows_processed), ULL2NUM(progress.total_rows_to_process));
}

#register_aggregate_function(aggregate_function) ⇒ `self`

Registers an aggregate function with the connection.

Parameters:

aggregate_function (DuckDB::AggregateFunction) —

the aggregate function to register

Returns:

(self)

Raises:

(TypeError) —

if argument is not a DuckDB::AggregateFunction

# File 'lib/duckdb/connection.rb', line 329

def register_aggregate_function(aggregate_function)
  unless aggregate_function.is_a?(AggregateFunction)
    raise TypeError, "#{aggregate_function.class} is not a DuckDB::AggregateFunction"
  end

  _register_aggregate_function(aggregate_function)
end

#register_aggregate_function_set(aggregate_function_set) ⇒ `self`

Registers an aggregate function set with the connection. An aggregate function set groups multiple overloads of an aggregate function under one name, allowing DuckDB to dispatch to the correct implementation based on argument types.

Examples:

Register multiple overloads under one name

af_bigint = DuckDB::AggregateFunction.new
af_bigint.name = 'my_sum'
af_bigint.return_type = DuckDB::LogicalType::BIGINT
af_bigint.add_parameter(DuckDB::LogicalType::BIGINT)
af_bigint.set_init   { 0 }
af_bigint.set_update  { |state, val| state + val }
af_bigint.set_combine { |s1, s2| s1 + s2 }

af_double = DuckDB::AggregateFunction.new
af_double.name = 'my_sum'
af_double.return_type = DuckDB::LogicalType::DOUBLE
af_double.add_parameter(DuckDB::LogicalType::DOUBLE)
af_double.set_init   { 0.0 }
af_double.set_update  { |state, val| state + val }
af_double.set_combine { |s1, s2| s1 + s2 }

set = DuckDB::AggregateFunctionSet.new(:my_sum)
set.add(af_bigint).add(af_double)
con.register_aggregate_function_set(set)

Parameters:

aggregate_function_set (DuckDB::AggregateFunctionSet) —

the function set to register

Returns:

(self)

Raises:

(TypeError) —

if argument is not a DuckDB::AggregateFunctionSet

# File 'lib/duckdb/connection.rb', line 316

def register_aggregate_function_set(aggregate_function_set)
  unless aggregate_function_set.is_a?(AggregateFunctionSet)
    raise TypeError, "#{aggregate_function_set.class} is not a DuckDB::AggregateFunctionSet"
  end

  _register_aggregate_function_set(aggregate_function_set)
end

#register_logical_type(logical_type) ⇒ `self`

Registers a custom logical type with the connection. The logical type must have an alias set via LogicalType#alias= before registration. The alias becomes the SQL type name.

Examples:

Register an enum type

mood = DuckDB::LogicalType.create_enum('happy', 'sad', 'neutral')
mood.alias = 'mood'
con.register_logical_type(mood)
con.query('CREATE TABLE t (m mood)')

Parameters:

logical_type (DuckDB::LogicalType) —

the logical type to register

Returns:

(self)

Raises:

(TypeError) —

if argument is not a DuckDB::LogicalType
(DuckDB::Error) —

if the type has no alias set or registration fails

# File 'lib/duckdb/connection.rb', line 210

def register_logical_type(logical_type)
  raise TypeError, "#{logical_type.class} is not a DuckDB::LogicalType" unless logical_type.is_a?(LogicalType)

  _register_logical_type(logical_type)
end

#register_scalar_function(scalar_function) ⇒ `void` #register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ `void`

This method returns an undefined value.

Registers a scalar function with the connection.

Examples:

Register pre-created function

sf = DuckDB::ScalarFunction.create(
  name: :triple,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_type: DuckDB::LogicalType::INTEGER
) { |v| v * 3 }
con.register_scalar_function(sf)

Register inline (single parameter)

con.register_scalar_function(
  name: :triple,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_type: DuckDB::LogicalType::INTEGER
) { |v| v * 3 }

Register inline (multiple parameters)

con.register_scalar_function(
  name: :add,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_types: [DuckDB::LogicalType::INTEGER, DuckDB::LogicalType::INTEGER]
) { |a, b| a + b }

Overloads:

#register_scalar_function(scalar_function) ⇒ void

Register a pre-created ScalarFunction object.
Parameters:
- scalar_function (DuckDB::ScalarFunction) —
  
  the scalar function to register
#register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ void

Create and register a scalar function inline.
Parameters:
- name (String, Symbol) —
  
  the function name
- return_type (DuckDB::LogicalType) —
  
  the return type
- parameter_type (DuckDB::LogicalType, nil) —
  
  single parameter type
- parameter_types (Array<DuckDB::LogicalType>, nil) —
  
  multiple parameter types
Yields:
- (*args) —
  
  the function implementation

Raises:

(ArgumentError) —

if both object and keywords/block are provided

# File 'lib/duckdb/connection.rb', line 254

def register_scalar_function(scalar_function = nil, **kwargs, &)
  # Validate: can't pass both object and inline arguments
  if scalar_function.is_a?(ScalarFunction)
    raise ArgumentError, 'Cannot pass both ScalarFunction object and keyword arguments' if kwargs.any?

    raise ArgumentError, 'Cannot pass both ScalarFunction object and block' if block_given?
  end

  sf = scalar_function || ScalarFunction.create(**kwargs, &)
  _register_scalar_function(sf)
end

#register_scalar_function_set(scalar_function_set) ⇒ `self`

Registers a scalar function set with the connection. A scalar function set groups multiple overloads of a function under one name, allowing DuckDB to dispatch to the correct implementation based on argument types.

Examples:

Register multiple overloads under one name

add_int = DuckDB::ScalarFunction.create(return_type: :integer, parameter_types: %i[integer integer]) { |a, b| a + b }
add_dbl = DuckDB::ScalarFunction.create(return_type: :double,  parameter_types: %i[double  double])  { |a, b| a + b }
set = DuckDB::ScalarFunctionSet.new(:add)
set.add(add_int).add(add_dbl)
con.register_scalar_function_set(set)

Parameters:

scalar_function_set (DuckDB::ScalarFunctionSet) —

the function set to register

Returns:

(self)

Raises:

(TypeError) —

if argument is not a DuckDB::ScalarFunctionSet

# File 'lib/duckdb/connection.rb', line 280

def register_scalar_function_set(scalar_function_set)
  unless scalar_function_set.is_a?(ScalarFunctionSet)
    raise TypeError, "#{scalar_function_set.class} is not a DuckDB::ScalarFunctionSet"
  end

  _register_scalar_function_set(scalar_function_set)
end

#register_table_function(table_function) ⇒ `Object`

Registers a table function with the database connection.

table_function = DuckDB::TableFunction.new
table_function.name = 'my_function'
table_function.bind { |bind_info| ... }
table_function.execute { |func_info, output| ... }
connection.register_table_function(table_function)

Raises:

(ArgumentError)

# File 'lib/duckdb/connection.rb', line 346

def register_table_function(table_function)
  raise ArgumentError, 'table_function must be a TableFunction' unless table_function.is_a?(TableFunction)

  _register_table_function(table_function)
end

Class: DuckDB::Connection

Overview

Instance Method Summary collapse

Instance Method Details

#appender(table, schema: nil, catalog: nil) ⇒ Object

#appender_from_query(query, types, table_name = nil, column_names = nil) ⇒ Object

#async_query(sql, *args, **kwargs) ⇒ Object Also known as: async_execute

#connect(db) ⇒ Object Also known as: open

#disconnect ⇒ Object Also known as: close

#expose_as_table(object, name, columns: nil) ⇒ void

Examples:

Expose a CSV as a table

With explicit column types

#extract(sql) ⇒ Array<DuckDB::PreparedStatement>

Examples:

Execute each statement and destroy

#interrupt ⇒ nil

#prepared_statement(str) ⇒ Object Also known as: prepare

#query(sql, *args, **kwargs) ⇒ Object Also known as: execute

#query_multi_sql(sql) ⇒ Object

#query_progress ⇒ Object

#register_aggregate_function(aggregate_function) ⇒ self

#register_aggregate_function_set(aggregate_function_set) ⇒ self

Examples:

Register multiple overloads under one name

#register_logical_type(logical_type) ⇒ self

Examples:

Register an enum type

#register_scalar_function(scalar_function) ⇒ void #register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ void

Examples:

Register pre-created function

Register inline (single parameter)

Register inline (multiple parameters)

#register_scalar_function_set(scalar_function_set) ⇒ self

Examples:

Register multiple overloads under one name

#register_table_function(table_function) ⇒ Object

#appender(table, schema: nil, catalog: nil) ⇒ `Object`

#appender_from_query(query, types, table_name = nil, column_names = nil) ⇒ `Object`

#async_query(sql, *args, **kwargs) ⇒ `Object` Also known as: async_execute

#connect(db) ⇒ `Object` Also known as: open

#disconnect ⇒ `Object` Also known as: close

#expose_as_table(object, name, columns: nil) ⇒ `void`

#extract(sql) ⇒ `Array<DuckDB::PreparedStatement>`

#interrupt ⇒ `nil`

#prepared_statement(str) ⇒ `Object` Also known as: prepare

#query(sql, *args, **kwargs) ⇒ `Object` Also known as: execute

#query_multi_sql(sql) ⇒ `Object`

#query_progress ⇒ `Object`

#register_aggregate_function(aggregate_function) ⇒ `self`

#register_aggregate_function_set(aggregate_function_set) ⇒ `self`

#register_logical_type(logical_type) ⇒ `self`

#register_scalar_function(scalar_function) ⇒ `void` #register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ `void`

#register_scalar_function_set(scalar_function_set) ⇒ `self`

#register_table_function(table_function) ⇒ `Object`