Class: DuckDB::Connection

Inherits:
Object
  • Object
show all
Includes:
TableNameParser
Defined in:
lib/duckdb/connection.rb,
ext/duckdb/connection.c

Overview

The DuckDB::Connection encapsulates connection with DuckDB database.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query(sql)

Instance Method Summary collapse

Instance Method Details

#appender(table, schema: nil, catalog: nil) ⇒ Object

:call-seq:

connection.appender(table, schema: nil, catalog: nil) -> DuckDB::Appender
connection.appender(table, schema: nil, catalog: nil) { |appender| ... } -> self

Returns a DuckDB::Appender for bulk-inserting rows into table. If a block is given, the appender is flushed and closed automatically after the block.

schema: and catalog: optionally qualify the table.

Raises DuckDB::Error if the table (or schema/catalog) does not exist.

Table name parsing (quoting, dot-notation) is handled by DuckDB::Appender.new. See DuckDB::Appender.new for details on quoting and dot-notation.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE users (id INTEGER, name VARCHAR)')

# block form (recommended) — flushes and closes automatically
con.appender('users') do |a|
  a.append_row(1, 'Alice')
  a.append_row(2, 'Bob')
end

# with schema
con.appender('users', schema: 'main') do |a|
  a.append_row(3, 'Carol')
end

# manual form
appender = con.appender('users')
appender.append_row(4, 'Dave')
appender.close


166
167
168
169
# File 'lib/duckdb/connection.rb', line 166

def appender(table, schema: nil, catalog: nil, &)
  table, schema, catalog = parse_connection_appender_table(table, schema, catalog)
  run_appender_block(Appender.new(self, table, schema: schema, catalog: catalog), &)
end

#appender_from_query(query, types, table_name = nil, column_names = nil) ⇒ Object

:call-seq:

connection.appender_from_query(query, types, table_name = nil, column_names = nil) -> DuckDB::Appender

Creates an appender object that executes the given query with any data appended to it. The ‘table_name` parameter is used to refer to the appended data in the query. If omitted, it defaults to “appended_data”. The `column_names` parameter provides names for the columns of the appended data. If omitted, it defaults to “col1”, “col2”, etc.

require 'duckdb'
db = DuckDB::Database.open
con = db.connect
con.query('CREATE TABLE t (i INT PRIMARY KEY, value VARCHAR)')
query = 'INSERT OR REPLACE INTO t SELECT i, val FROM my_appended_data'
types = [DuckDB::LogicalType::INTEGER, DuckDB::LogicalType::VARCHAR]
appender = con.appender_from_query(query, types, 'my_appended_data', %w[i val])
appender.append_row(1, 'hello world')
appender.close


190
191
192
193
# File 'lib/duckdb/connection.rb', line 190

def appender_from_query(query, types, table_name = nil, column_names = nil, &)
  appender = Appender.create_query(self, query, types, table_name, column_names)
  run_appender_block(appender, &)
end

#async_query(sql, *args, **kwargs) ⇒ Object Also known as: async_execute

executes sql with args asynchronously. The first argument sql must be SQL string. The rest arguments are parameters of SQL string. This method returns DuckDB::PendingResult object.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
pending_result = con.async_query(sql, name: 'Dave', email: 'dave@example.com')
pending_result.execute_task while pending_result.state == :not_ready
result = pending_result.execute_pending
result.each.first


87
88
89
90
91
92
# File 'lib/duckdb/connection.rb', line 87

def async_query(sql, *args, **kwargs)
  prepare(sql) do |stmt|
    stmt.bind_args(*args, **kwargs)
    stmt.pending_prepared
  end
end

#connect(db) ⇒ Object Also known as: open

connects DuckDB database The first argument is DuckDB::Database object



96
97
98
99
100
101
102
103
104
105
# File 'lib/duckdb/connection.rb', line 96

def connect(db)
  conn = _connect(db)
  return conn unless block_given?

  begin
    yield conn
  ensure
    conn.disconnect
  end
end

#disconnectObject Also known as: close



75
76
77
78
79
80
81
82
83
84
85
# File 'ext/duckdb/connection.c', line 75

static VALUE connection_disconnect(VALUE self) {
    rubyDuckDBConnection *ctx;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    duckdb_disconnect(&(ctx->con));

    /* Clear registered functions to release memory */
    rb_ary_clear(ctx->registered_functions);

    return self;
}

#expose_as_table(object, name, columns: nil) ⇒ void

This method returns an undefined value.

Exposes a Ruby object as a queryable DuckDB table function via a registered adapter.

Looks up a table adapter registered for the object’s class via DuckDB::TableFunction.add_table_adapter, then uses it to create and register a table function under the given name.

Examples:

Expose a CSV as a table

require 'csv'
DuckDB::TableFunction.add_table_adapter(CSV, CSVTableAdapter.new)
csv = CSV.new(File.read('data.csv'), headers: true)
con.expose_as_table(csv, 'csv_table')
con.query('SELECT * FROM csv_table()').to_a

With explicit column types

con.expose_as_table(csv, 'csv_table', columns: {
  'id'   => DuckDB::LogicalType::BIGINT,
  'name' => DuckDB::LogicalType::VARCHAR
})

Parameters:

  • object (Object)

    the Ruby object to expose as a table (e.g. a CSV instance)

  • name (String)

    the SQL name of the table function

  • columns (Hash{String => DuckDB::LogicalType}, nil) (defaults to: nil)

    optional column schema override; if omitted, the adapter determines the columns (e.g. from headers or inference)

Raises:

  • (ArgumentError)

    if no adapter is registered for the object’s class



378
379
380
381
382
383
384
# File 'lib/duckdb/connection.rb', line 378

def expose_as_table(object, name, columns: nil)
  adapter = TableFunction.table_adapter_for(object.class)
  raise ArgumentError, "No table adapter registered for #{object.class}" if adapter.nil?

  tf = adapter.call(object, name, columns:)
  register_table_function(tf)
end

#extract(sql) ⇒ Array<DuckDB::PreparedStatement>

Extracts multiple SQL statements and returns them as prepared statements without executing them. The caller controls when to execute and destroy each statement.

Examples:

Execute each statement and destroy

stmts = con.extract('SELECT 1; SELECT 2; SELECT 3;')
stmts.each do |stmt|
  result = stmt.execute
  # process result...
  stmt.destroy
end

Parameters:

  • sql (String)

    a string containing one or more SQL statements separated by semicolons

Returns:

Raises:



66
67
68
69
70
71
# File 'lib/duckdb/connection.rb', line 66

def extract(sql)
  stmts = ExtractedStatements.new(self, sql)
  stmts.to_a
ensure
  stmts&.destroy
end

#interruptnil

Interrupts the currently running query.

db = DuckDB::Database.open
conn = db.connect
con.query('SET ENABLE_PROGRESS_BAR=true')
con.query('SET ENABLE_PROGRESS_BAR_PRINT=false')
pending_result = con.async_query('slow query')

pending_result.execute_task
con.interrupt # => nil

Returns:

  • (nil)


102
103
104
105
106
107
108
109
# File 'ext/duckdb/connection.c', line 102

static VALUE connection_interrupt(VALUE self) {
    rubyDuckDBConnection *ctx;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    duckdb_interrupt(ctx->con);

    return Qnil;
}

#prepared_statement(str) ⇒ Object Also known as: prepare

returns PreparedStatement object. The first argument is SQL string. If block is given, the block is executed with PreparedStatement object and the object is cleaned up immediately.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
stmt = con.prepared_statement(sql)
stmt.bind_args(name: 'Dave', email: 'dave@example.com')
result = stmt.execute

# or
result = con.prepared_statement(sql) do |stmt|
           stmt.bind_args(name: 'Dave', email: 'dave@example.com')
           stmt.execute
         end


126
127
128
129
130
# File 'lib/duckdb/connection.rb', line 126

def prepared_statement(str, &)
  return PreparedStatement.new(self, str) unless block_given?

  PreparedStatement.prepare(self, str, &)
end

#query(sql, *args, **kwargs) ⇒ Object Also known as: execute

executes sql with args. The first argument sql must be SQL string. The rest arguments are parameters of SQL string.

require 'duckdb'
db = DuckDB::Database.open('duckdb_file')
con = db.connect
users = con.query('SELECT * FROM users')
sql = 'SELECT * FROM users WHERE name = ? AND email = ?'
dave = con.query(sql, 'Dave', 'dave@example.com')

# or You can use named parameter.

sql = 'SELECT * FROM users WHERE name = $name AND email = $email'
dave = con.query(sql, name: 'Dave', email: 'dave@example.com')


28
29
30
31
32
33
34
35
# File 'lib/duckdb/connection.rb', line 28

def query(sql, *args, **kwargs)
  return query_multi_sql(sql) if args.empty? && kwargs.empty?

  prepare(sql) do |stmt|
    stmt.bind_args(*args, **kwargs)
    stmt.execute
  end
end

#query_multi_sql(sql) ⇒ Object



37
38
39
40
41
42
43
44
45
46
47
48
49
# File 'lib/duckdb/connection.rb', line 37

def query_multi_sql(sql)
  stmts = ExtractedStatements.new(self, sql)
  return _query_sql(sql) if stmts.size == 1

  result = nil
  stmts.each do |stmt|
    result = stmt.execute
    stmt.destroy
  end
  result
ensure
  stmts&.destroy
end

#query_progressObject

Returns the progress of the currently running query.

require 'duckdb'

db = DuckDB::Database.open
conn = db.connect
con.query('SET ENABLE_PROGRESS_BAR=true')
con.query('SET ENABLE_PROGRESS_BAR_PRINT=false')
con.query_progress # => -1.0
pending_result = con.async_query('slow query')
con.query_progress # => 0.0
pending_result.execute_task
con.query_progress # => Float


126
127
128
129
130
131
132
133
134
# File 'ext/duckdb/connection.c', line 126

static VALUE connection_query_progress(VALUE self) {
    rubyDuckDBConnection *ctx;
    duckdb_query_progress_type progress;

    TypedData_Get_Struct(self, rubyDuckDBConnection, &connection_data_type, ctx);
    progress = duckdb_query_progress(ctx->con);

    return rb_funcall(mDuckDBConverter, rb_intern("_to_query_progress"), 3, DBL2NUM(progress.percentage), ULL2NUM(progress.rows_processed), ULL2NUM(progress.total_rows_to_process));
}

#register_aggregate_function(aggregate_function) ⇒ self

Registers an aggregate function with the connection.

Parameters:

Returns:

  • (self)

Raises:

  • (TypeError)

    if argument is not a DuckDB::AggregateFunction



329
330
331
332
333
334
335
# File 'lib/duckdb/connection.rb', line 329

def register_aggregate_function(aggregate_function)
  unless aggregate_function.is_a?(AggregateFunction)
    raise TypeError, "#{aggregate_function.class} is not a DuckDB::AggregateFunction"
  end

  _register_aggregate_function(aggregate_function)
end

#register_aggregate_function_set(aggregate_function_set) ⇒ self

Registers an aggregate function set with the connection. An aggregate function set groups multiple overloads of an aggregate function under one name, allowing DuckDB to dispatch to the correct implementation based on argument types.

Examples:

Register multiple overloads under one name

af_bigint = DuckDB::AggregateFunction.new
af_bigint.name = 'my_sum'
af_bigint.return_type = DuckDB::LogicalType::BIGINT
af_bigint.add_parameter(DuckDB::LogicalType::BIGINT)
af_bigint.set_init   { 0 }
af_bigint.set_update  { |state, val| state + val }
af_bigint.set_combine { |s1, s2| s1 + s2 }

af_double = DuckDB::AggregateFunction.new
af_double.name = 'my_sum'
af_double.return_type = DuckDB::LogicalType::DOUBLE
af_double.add_parameter(DuckDB::LogicalType::DOUBLE)
af_double.set_init   { 0.0 }
af_double.set_update  { |state, val| state + val }
af_double.set_combine { |s1, s2| s1 + s2 }

set = DuckDB::AggregateFunctionSet.new(:my_sum)
set.add(af_bigint).add(af_double)
con.register_aggregate_function_set(set)

Parameters:

Returns:

  • (self)

Raises:

  • (TypeError)

    if argument is not a DuckDB::AggregateFunctionSet



316
317
318
319
320
321
322
# File 'lib/duckdb/connection.rb', line 316

def register_aggregate_function_set(aggregate_function_set)
  unless aggregate_function_set.is_a?(AggregateFunctionSet)
    raise TypeError, "#{aggregate_function_set.class} is not a DuckDB::AggregateFunctionSet"
  end

  _register_aggregate_function_set(aggregate_function_set)
end

#register_logical_type(logical_type) ⇒ self

Registers a custom logical type with the connection. The logical type must have an alias set via LogicalType#alias= before registration. The alias becomes the SQL type name.

Examples:

Register an enum type

mood = DuckDB::LogicalType.create_enum('happy', 'sad', 'neutral')
mood.alias = 'mood'
con.register_logical_type(mood)
con.query('CREATE TABLE t (m mood)')

Parameters:

Returns:

  • (self)

Raises:

  • (TypeError)

    if argument is not a DuckDB::LogicalType

  • (DuckDB::Error)

    if the type has no alias set or registration fails



210
211
212
213
214
# File 'lib/duckdb/connection.rb', line 210

def register_logical_type(logical_type)
  raise TypeError, "#{logical_type.class} is not a DuckDB::LogicalType" unless logical_type.is_a?(LogicalType)

  _register_logical_type(logical_type)
end

#register_scalar_function(scalar_function) ⇒ void #register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ void

This method returns an undefined value.

Registers a scalar function with the connection.

Examples:

Register pre-created function

sf = DuckDB::ScalarFunction.create(
  name: :triple,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_type: DuckDB::LogicalType::INTEGER
) { |v| v * 3 }
con.register_scalar_function(sf)

Register inline (single parameter)

con.register_scalar_function(
  name: :triple,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_type: DuckDB::LogicalType::INTEGER
) { |v| v * 3 }

Register inline (multiple parameters)

con.register_scalar_function(
  name: :add,
  return_type: DuckDB::LogicalType::INTEGER,
  parameter_types: [DuckDB::LogicalType::INTEGER, DuckDB::LogicalType::INTEGER]
) { |a, b| a + b }

Overloads:

  • #register_scalar_function(scalar_function) ⇒ void

    Register a pre-created ScalarFunction object.

    Parameters:

  • #register_scalar_function(name:, return_type:, **kwargs) {|*args| ... } ⇒ void

    Create and register a scalar function inline.

    Parameters:

    Yields:

    • (*args)

      the function implementation

Raises:

  • (ArgumentError)

    if both object and keywords/block are provided



254
255
256
257
258
259
260
261
262
263
264
# File 'lib/duckdb/connection.rb', line 254

def register_scalar_function(scalar_function = nil, **kwargs, &)
  # Validate: can't pass both object and inline arguments
  if scalar_function.is_a?(ScalarFunction)
    raise ArgumentError, 'Cannot pass both ScalarFunction object and keyword arguments' if kwargs.any?

    raise ArgumentError, 'Cannot pass both ScalarFunction object and block' if block_given?
  end

  sf = scalar_function || ScalarFunction.create(**kwargs, &)
  _register_scalar_function(sf)
end

#register_scalar_function_set(scalar_function_set) ⇒ self

Registers a scalar function set with the connection. A scalar function set groups multiple overloads of a function under one name, allowing DuckDB to dispatch to the correct implementation based on argument types.

Examples:

Register multiple overloads under one name

add_int = DuckDB::ScalarFunction.create(return_type: :integer, parameter_types: %i[integer integer]) { |a, b| a + b }
add_dbl = DuckDB::ScalarFunction.create(return_type: :double,  parameter_types: %i[double  double])  { |a, b| a + b }
set = DuckDB::ScalarFunctionSet.new(:add)
set.add(add_int).add(add_dbl)
con.register_scalar_function_set(set)

Parameters:

Returns:

  • (self)

Raises:

  • (TypeError)

    if argument is not a DuckDB::ScalarFunctionSet



280
281
282
283
284
285
286
# File 'lib/duckdb/connection.rb', line 280

def register_scalar_function_set(scalar_function_set)
  unless scalar_function_set.is_a?(ScalarFunctionSet)
    raise TypeError, "#{scalar_function_set.class} is not a DuckDB::ScalarFunctionSet"
  end

  _register_scalar_function_set(scalar_function_set)
end

#register_table_function(table_function) ⇒ Object

Registers a table function with the database connection.

table_function = DuckDB::TableFunction.new
table_function.name = 'my_function'
table_function.bind { |bind_info| ... }
table_function.execute { |func_info, output| ... }
connection.register_table_function(table_function)

Raises:

  • (ArgumentError)


346
347
348
349
350
# File 'lib/duckdb/connection.rb', line 346

def register_table_function(table_function)
  raise ArgumentError, 'table_function must be a TableFunction' unless table_function.is_a?(TableFunction)

  _register_table_function(table_function)
end