Module: SparkConnect::Functions

Extended by: Functions

Included in: Functions

Defined in: lib/spark_connect/functions.rb

Overview

The standard Spark SQL function library, mirroring PySpark’s `pyspark.sql.functions`. Functions return a Column unless noted otherwise (`broadcast`, for example, returns a DataFrame).

Available both as `SparkConnect::Functions` and the shorthand `SparkConnect::F`. All methods are module functions.

Following PySpark’s convention, a String argument denotes a **column name** for most functions (e.g. `F.sum("salary")` aggregates the `salary` column), while functions whose parameters are genuinely literal (regex patterns, date formats, JSON paths, …) treat their String arguments as literal values.

Examples:

F = SparkConnect::F
F.col("a") + F.lit(1)
F.when(F.col("x") > 0, "pos").otherwise("non-pos")
F.sum("amount").alias("total")

Constant Summary

Proto =

SparkConnect::Proto

UNIFORM = Functions whose arguments are all ColumnOrName (a String denotes a column name). They are generated programmatically from this list, as are the `NO_ARG` functions, to keep the surface complete and compact. `@!method` directives document them so they appear in the API reference; each returns a Column.

%w[
  sum avg mean max min first last stddev stddev_samp stddev_pop variance var_samp var_pop
  skewness kurtosis collect_list collect_set first_value last_value max_by min_by corr
  covar_pop covar_samp median mode any_value every some bit_and bit_or bit_xor bool_and bool_or
  product count_if grouping
  abs acos acosh asin asinh atan atanh atan2 bin cbrt ceil ceiling cos cosh cot csc degrees
  exp expm1 factorial floor hypot ln log log2 log10 log1p negative negate positive pow power
  radians rint sec signum sin sinh sqrt tan tanh hex unhex pmod isnan isnull positive
  upper lower ltrim rtrim trim length char_length character_length octet_length bit_length
  reverse ascii base64 unbase64 initcap soundex crc32 md5 sha1 sha ucase lcase
  size cardinality array_distinct array_max array_min array_compact flatten explode explode_outer
  posexplode posexplode_outer inline inline_outer map_keys map_values map_entries map_from_entries
  array_sort shuffle arrays_zip map_concat concat greatest least hash xxhash64
  array_union array_intersect array_except arrays_overlap
  year quarter month dayofmonth day dayofweek dayofyear hour minute second weekofyear last_day
  weekday unix_date unix_micros unix_millis unix_seconds timestamp_seconds timestamp_millis
  timestamp_micros date_from_unix_date
  bitwise_not bit_count typeof
].uniq.freeze

NO_ARG = No-argument functions.

%w[
  current_date current_timestamp now current_timezone current_user current_catalog
  current_database current_schema monotonically_increasing_id spark_partition_id
  input_file_name input_file_block_start input_file_block_length version uuid
  row_number rank dense_rank percent_rank cume_dist
].freeze
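
As a rough illustration of how name lists such as `UNIFORM` and `NO_ARG` can be turned into module functions, here is a minimal, self-contained sketch. It does not reproduce the library's actual generation code, which is not shown on this page: `GenFunctions`, `GenColumn`, and the shortened lists are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for Column: records the function name and its arguments.
GenColumn = Struct.new(:fn, :args)

module GenFunctions
  # Shortened, illustrative lists (not the library's full constants).
  UNIFORM = %w[sum avg upper].freeze
  NO_ARG  = %w[current_date row_number].freeze

  # One method per name; String/Symbol arguments are read as column names.
  UNIFORM.each do |name|
    define_method(name) do |*cols|
      refs = cols.map { |c| c.is_a?(GenColumn) ? c : GenColumn.new("col", [c.to_s]) }
      GenColumn.new(name, refs)
    end
  end

  # No-argument functions accept no parameters at all.
  NO_ARG.each do |name|
    define_method(name) { GenColumn.new(name, []) }
  end

  # Expose every generated method on the module itself (GenFunctions.sum(...)).
  extend self
end

total = GenFunctions.sum("amount")
puts total.fn                      # sum
puts total.args.first.args.first   # amount
puts GenFunctions.row_number.fn    # row_number
```

Generating the methods from a list keeps each definition identical in shape, which is why every function in these lists shares the same `(*cols)` or no-argument signature in the summary below.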

Class Attribute Summary

.lambda_counter ⇒ Object private

Instance Method Summary

#_col(value) ⇒ Object private

ColumnOrName coercion: String/Symbol -> column reference, Column -> itself, everything else -> literal.
#_lambda(block) ⇒ Object private

Build a Column wrapping a LambdaFunction from a Ruby block.
#_lit_or_col(value) ⇒ Object private
#abs(*cols) ⇒ Column

The Spark SQL `abs` function.
#acos(*cols) ⇒ Column

The Spark SQL `acos` function.
#acosh(*cols) ⇒ Column

The Spark SQL `acosh` function.
#add_months(col, months) ⇒ Object
#aggregate(col, initial, merge, finish = nil) ⇒ Column

Aggregate (fold) an array.
#any_value(*cols) ⇒ Column

The Spark SQL `any_value` function.
#approx_count_distinct(col, rsd = nil) ⇒ Column

Approximate distinct count (optionally with relative SD).
#array(*cols) ⇒ Column

An array from the given columns.
#array_append(col, value) ⇒ Object
#array_compact(*cols) ⇒ Column

The Spark SQL `array_compact` function.
#array_contains(col, value) ⇒ Object

#array_distinct(*cols) ⇒ Column

The Spark SQL `array_distinct` function.
#array_except(*cols) ⇒ Column

The Spark SQL `array_except` function.
#array_insert(col, pos, value) ⇒ Object
#array_intersect(*cols) ⇒ Column

The Spark SQL `array_intersect` function.
#array_join(col, delimiter, null_replacement = nil) ⇒ Object
#array_max(*cols) ⇒ Column

The Spark SQL `array_max` function.
#array_min(*cols) ⇒ Column

The Spark SQL `array_min` function.
#array_position(col, value) ⇒ Object
#array_prepend(col, value) ⇒ Object
#array_remove(col, element) ⇒ Object
#array_repeat(col, count) ⇒ Object
#array_sort(*cols) ⇒ Column

The Spark SQL `array_sort` function.
#array_union(*cols) ⇒ Column

The Spark SQL `array_union` function.
#arrays_overlap(*cols) ⇒ Column

The Spark SQL `arrays_overlap` function.
#arrays_zip(*cols) ⇒ Column

The Spark SQL `arrays_zip` function.
#asc(col) ⇒ Column

An ascending sort order for the named/given column.
#asc_nulls_first(col) ⇒ Object
#asc_nulls_last(col) ⇒ Object
#ascii(*cols) ⇒ Column

The Spark SQL `ascii` function.
#asin(*cols) ⇒ Column

The Spark SQL `asin` function.
#asinh(*cols) ⇒ Column

The Spark SQL `asinh` function.
#atan(*cols) ⇒ Column

The Spark SQL `atan` function.
#atan2(*cols) ⇒ Column

The Spark SQL `atan2` function.
#atanh(*cols) ⇒ Column

The Spark SQL `atanh` function.
#avg(*cols) ⇒ Column

The Spark SQL `avg` function.
#base64(*cols) ⇒ Column

The Spark SQL `base64` function.
#bin(*cols) ⇒ Column

The Spark SQL `bin` function.
#bit_and(*cols) ⇒ Column

The Spark SQL `bit_and` function.
#bit_count(*cols) ⇒ Column

The Spark SQL `bit_count` function.
#bit_length(*cols) ⇒ Column

The Spark SQL `bit_length` function.
#bit_or(*cols) ⇒ Column

The Spark SQL `bit_or` function.
#bit_xor(*cols) ⇒ Column

The Spark SQL `bit_xor` function.
#bitwise_not(*cols) ⇒ Column

The Spark SQL `bitwise_not` function.
#bool_and(*cols) ⇒ Column

The Spark SQL `bool_and` function.
#bool_or(*cols) ⇒ Column

The Spark SQL `bool_or` function.
#broadcast(df) ⇒ DataFrame

Mark a DataFrame for broadcast (map-side) join.
#bround(col, scale = 0) ⇒ Column

HALF_EVEN (“banker’s”) rounding to `scale` places.
#cardinality(*cols) ⇒ Column

The Spark SQL `cardinality` function.
#cbrt(*cols) ⇒ Column

The Spark SQL `cbrt` function.
#ceil(*cols) ⇒ Column

The Spark SQL `ceil` function.
#ceiling(*cols) ⇒ Column

The Spark SQL `ceiling` function.
#char_length(*cols) ⇒ Column

The Spark SQL `char_length` function.
#character_length(*cols) ⇒ Column

The Spark SQL `character_length` function.
#coalesce(*cols) ⇒ Column

First non-null among the given columns.
#col(name) ⇒ Column (also: #column)

A column reference by name.
#collect_list(*cols) ⇒ Column

The Spark SQL `collect_list` function.
#collect_set(*cols) ⇒ Column

The Spark SQL `collect_set` function.
#concat(*cols) ⇒ Column

The Spark SQL `concat` function.
#concat_ws(sep, *cols) ⇒ Column

Concatenation of columns separated by literal `sep`.
#conv(col, from_base, to_base) ⇒ Column

Convert a number string from `from_base` to `to_base`.
#corr(*cols) ⇒ Column

The Spark SQL `corr` function.
#cos(*cols) ⇒ Column

The Spark SQL `cos` function.
#cosh(*cols) ⇒ Column

The Spark SQL `cosh` function.
#cot(*cols) ⇒ Column

The Spark SQL `cot` function.
#count(col) ⇒ Column

Count of rows (or non-null values of a column).
#count_distinct(*cols) ⇒ Column (also: #countDistinct)

Count of distinct combinations of the given columns.
#count_if(*cols) ⇒ Column

The Spark SQL `count_if` function.
#covar_pop(*cols) ⇒ Column

The Spark SQL `covar_pop` function.
#covar_samp(*cols) ⇒ Column

The Spark SQL `covar_samp` function.
#crc32(*cols) ⇒ Column

The Spark SQL `crc32` function.
#create_map(*cols) ⇒ Column

A map from alternating key/value columns.
#csc(*cols) ⇒ Column

The Spark SQL `csc` function.
#cume_dist ⇒ Column

The Spark SQL `cume_dist` function (takes no arguments).
#current_catalog ⇒ Column

The Spark SQL `current_catalog` function (takes no arguments).
#current_database ⇒ Column

The Spark SQL `current_database` function (takes no arguments).
#current_date ⇒ Column

The Spark SQL `current_date` function (takes no arguments).
#current_schema ⇒ Column

The Spark SQL `current_schema` function (takes no arguments).
#current_timestamp ⇒ Column

The Spark SQL `current_timestamp` function (takes no arguments).
#current_timezone ⇒ Column

The Spark SQL `current_timezone` function (takes no arguments).
#current_user ⇒ Column

The Spark SQL `current_user` function (takes no arguments).
#date_add(col, days) ⇒ Object
#date_format(col, fmt) ⇒ Object

#date_from_unix_date(*cols) ⇒ Column

The Spark SQL `date_from_unix_date` function.
#date_sub(col, days) ⇒ Object
#date_trunc(fmt, col) ⇒ Object
#datediff(end_col, start_col) ⇒ Object
#day(*cols) ⇒ Column

The Spark SQL `day` function.
#dayofmonth(*cols) ⇒ Column

The Spark SQL `dayofmonth` function.
#dayofweek(*cols) ⇒ Column

The Spark SQL `dayofweek` function.
#dayofyear(*cols) ⇒ Column

The Spark SQL `dayofyear` function.
#degrees(*cols) ⇒ Column

The Spark SQL `degrees` function.
#dense_rank ⇒ Column

The Spark SQL `dense_rank` function (takes no arguments).
#desc(col) ⇒ Object
#desc_nulls_first(col) ⇒ Object
#desc_nulls_last(col) ⇒ Object
#element_at(col, extraction) ⇒ Object
#every(*cols) ⇒ Column

The Spark SQL `every` function.
#exists(col, &block) ⇒ Object
#exp(*cols) ⇒ Column

The Spark SQL `exp` function.
#explode(*cols) ⇒ Column

The Spark SQL `explode` function.
#explode_outer(*cols) ⇒ Column

The Spark SQL `explode_outer` function.
#expm1(*cols) ⇒ Column

The Spark SQL `expm1` function.
#expr(sql) ⇒ Column

Parse a SQL expression string into a Column.
#factorial(*cols) ⇒ Column

The Spark SQL `factorial` function.
#filter(col, &block) ⇒ Object
#first(*cols) ⇒ Column

The Spark SQL `first` function.
#first_value(*cols) ⇒ Column

The Spark SQL `first_value` function.
#flatten(*cols) ⇒ Column

The Spark SQL `flatten` function.
#floor(*cols) ⇒ Column

The Spark SQL `floor` function.
#forall(col, &block) ⇒ Object
#format_number(col, d) ⇒ Column

Number formatted to `d` decimal places.
#format_string(fmt, *cols) ⇒ Column

Printf-style formatting using literal `fmt`.
#from_json(col, schema, options = {}) ⇒ Object
#from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object
#from_utc_timestamp(col, tz) ⇒ Object
#get_json_object(col, path) ⇒ Object

#greatest(*cols) ⇒ Column

The Spark SQL `greatest` function.
#grouping(*cols) ⇒ Column

The Spark SQL `grouping` function.
#hash(*cols) ⇒ Column

The Spark SQL `hash` function.
#hex(*cols) ⇒ Column

The Spark SQL `hex` function.
#hour(*cols) ⇒ Column

The Spark SQL `hour` function.
#hypot(*cols) ⇒ Column

The Spark SQL `hypot` function.
#initcap(*cols) ⇒ Column

The Spark SQL `initcap` function.
#inline(*cols) ⇒ Column

The Spark SQL `inline` function.
#inline_outer(*cols) ⇒ Column

The Spark SQL `inline_outer` function.
#input_file_block_length ⇒ Column

The Spark SQL `input_file_block_length` function (takes no arguments).
#input_file_block_start ⇒ Column

The Spark SQL `input_file_block_start` function (takes no arguments).
#input_file_name ⇒ Column

The Spark SQL `input_file_name` function (takes no arguments).
#instr(col, substr) ⇒ Column

1-based position of literal `substr` within `col` (0 if absent).
#isnan(*cols) ⇒ Column

The Spark SQL `isnan` function.
#isnull(*cols) ⇒ Column

The Spark SQL `isnull` function.
#json_tuple(col, *fields) ⇒ Object
#kurtosis(*cols) ⇒ Column

The Spark SQL `kurtosis` function.
#lag(col, offset = 1, default = nil) ⇒ Object

#last(*cols) ⇒ Column

The Spark SQL `last` function.
#last_day(*cols) ⇒ Column

The Spark SQL `last_day` function.
#last_value(*cols) ⇒ Column

The Spark SQL `last_value` function.
#lcase(*cols) ⇒ Column

The Spark SQL `lcase` function.
#lead(col, offset = 1, default = nil) ⇒ Object
#least(*cols) ⇒ Column

The Spark SQL `least` function.
#length(*cols) ⇒ Column

The Spark SQL `length` function.
#lit(value) ⇒ Column

A literal value column.
#ln(*cols) ⇒ Column

The Spark SQL `ln` function.
#locate(substr, col, pos = 1) ⇒ Column

1-based position of `substr` in `col` at/after `pos`.
#log(*cols) ⇒ Column

The Spark SQL `log` function.
#log10(*cols) ⇒ Column

The Spark SQL `log10` function.
#log1p(*cols) ⇒ Column

The Spark SQL `log1p` function.
#log2(*cols) ⇒ Column

The Spark SQL `log2` function.
#lower(*cols) ⇒ Column

The Spark SQL `lower` function.
#lpad(col, len, pad) ⇒ Column

Left-padded string.
#ltrim(*cols) ⇒ Column

The Spark SQL `ltrim` function.
#make_date(year, month, day) ⇒ Object
#map_concat(*cols) ⇒ Column

The Spark SQL `map_concat` function.
#map_contains_key(col, key) ⇒ Object
#map_entries(*cols) ⇒ Column

The Spark SQL `map_entries` function.
#map_filter(col, &block) ⇒ Object
#map_from_arrays(keys, values) ⇒ Column

A map from two array columns (keys, values).
#map_from_entries(*cols) ⇒ Column

The Spark SQL `map_from_entries` function.
#map_keys(*cols) ⇒ Column

The Spark SQL `map_keys` function.
#map_values(*cols) ⇒ Column

The Spark SQL `map_values` function.
#map_zip_with(c1, c2, &block) ⇒ Object
#max(*cols) ⇒ Column

The Spark SQL `max` function.
#max_by(*cols) ⇒ Column

The Spark SQL `max_by` function.
#md5(*cols) ⇒ Column

The Spark SQL `md5` function.
#mean(*cols) ⇒ Column

The Spark SQL `mean` function.
#median(*cols) ⇒ Column

The Spark SQL `median` function.
#min(*cols) ⇒ Column

The Spark SQL `min` function.
#min_by(*cols) ⇒ Column

The Spark SQL `min_by` function.
#minute(*cols) ⇒ Column

The Spark SQL `minute` function.
#mode(*cols) ⇒ Column

The Spark SQL `mode` function.
#monotonically_increasing_id ⇒ Column

The Spark SQL `monotonically_increasing_id` function (takes no arguments).
#month(*cols) ⇒ Column

The Spark SQL `month` function.
#months_between(d1, d2, round_off = true) ⇒ Object
#named_struct(*cols) ⇒ Column

A named struct from alternating name/value arguments.
#nanvl(col1, col2) ⇒ Column

`col1` if it is not NaN, otherwise `col2`.
#negate(*cols) ⇒ Column

The Spark SQL `negate` function.
#negative(*cols) ⇒ Column

The Spark SQL `negative` function.
#next_day(col, day_of_week) ⇒ Object
#now ⇒ Column

The Spark SQL `now` function (takes no arguments).
#nth_value(col, offset, ignore_nulls = false) ⇒ Object
#ntile(n) ⇒ Object
#octet_length(*cols) ⇒ Column

The Spark SQL `octet_length` function.
#overlay(col, replace, pos, len = -1) ⇒ Column

Overlay `replace` into `col` at `pos` for `len` chars.
#percent_rank ⇒ Column

The Spark SQL `percent_rank` function (takes no arguments).
#pmod(*cols) ⇒ Column

The Spark SQL `pmod` function.
#posexplode(*cols) ⇒ Column

The Spark SQL `posexplode` function.
#posexplode_outer(*cols) ⇒ Column

The Spark SQL `posexplode_outer` function.
#positive(*cols) ⇒ Column

The Spark SQL `positive` function.
#pow(*cols) ⇒ Column

The Spark SQL `pow` function.
#power(*cols) ⇒ Column

The Spark SQL `power` function.
#product(*cols) ⇒ Column

The Spark SQL `product` function.
#quarter(*cols) ⇒ Column

The Spark SQL `quarter` function.
#radians(*cols) ⇒ Column

The Spark SQL `radians` function.
#rand(seed = nil) ⇒ Object

#randn(seed = nil) ⇒ Object
#rank ⇒ Column

The Spark SQL `rank` function (takes no arguments).
#regexp_count(col, pattern) ⇒ Object
#regexp_extract(col, pattern, idx = 0) ⇒ Column

The `idx`-th group of `pattern` matched in `col`.
#regexp_extract_all(col, pattern, idx = 1) ⇒ Column

All matches of group `idx` of `pattern`.
#regexp_like(col, pattern) ⇒ Column

Whether `col` matches `pattern`.
#regexp_replace(col, pattern, replacement) ⇒ Column

`col` with `pattern` replaced by `replacement`.
#regexp_substr(col, pattern) ⇒ Object
#repeat(col, n) ⇒ Column

The string repeated `n` times.
#reverse(*cols) ⇒ Column

The Spark SQL `reverse` function.
#rint(*cols) ⇒ Column

The Spark SQL `rint` function.
#round(col, scale = 0) ⇒ Column

HALF_UP rounding to `scale` decimal places.
#row_number ⇒ Column

The Spark SQL `row_number` function (takes no arguments).
#rpad(col, len, pad) ⇒ Column

Right-padded string.
#rtrim(*cols) ⇒ Column

The Spark SQL `rtrim` function.
#schema_of_json(json, options = {}) ⇒ Object
#sec(*cols) ⇒ Column

The Spark SQL `sec` function.
#second(*cols) ⇒ Column

The Spark SQL `second` function.
#sequence(start, stop, step = nil) ⇒ Object
#sha(*cols) ⇒ Column

The Spark SQL `sha` function.
#sha1(*cols) ⇒ Column

The Spark SQL `sha1` function.
#sha2(col, num_bits) ⇒ Column

SHA-2 hash with the given bit length (224/256/384/512).
#shiftleft(col, num_bits) ⇒ Column

Left shift by a literal bit count.
#shiftright(col, num_bits) ⇒ Object
#shiftrightunsigned(col, num_bits) ⇒ Object
#shuffle(*cols) ⇒ Column

The Spark SQL `shuffle` function.
#signum(*cols) ⇒ Column

The Spark SQL `signum` function.
#sin(*cols) ⇒ Column

The Spark SQL `sin` function.
#sinh(*cols) ⇒ Column

The Spark SQL `sinh` function.
#size(*cols) ⇒ Column

The Spark SQL `size` function.
#skewness(*cols) ⇒ Column

The Spark SQL `skewness` function.
#slice(col, start, length) ⇒ Object
#some(*cols) ⇒ Column

The Spark SQL `some` function.
#sort_array(col, asc = true) ⇒ Object

#soundex(*cols) ⇒ Column

The Spark SQL `soundex` function.
#spark_partition_id ⇒ Column

The Spark SQL `spark_partition_id` function (takes no arguments).
#split(col, pattern, limit = -1) ⇒ Column

Split `col` by the literal regex `pattern`.
#sqrt(*cols) ⇒ Column

The Spark SQL `sqrt` function.
#stddev(*cols) ⇒ Column

The Spark SQL `stddev` function.
#stddev_pop(*cols) ⇒ Column

The Spark SQL `stddev_pop` function.
#stddev_samp(*cols) ⇒ Column

The Spark SQL `stddev_samp` function.
#struct(*cols) ⇒ Column

A struct from the given columns.
#substring(col, pos, len) ⇒ Column

Substring of length `len` from 1-based `pos`.
#substring_index(col, delim, count) ⇒ Column

Substring before the `count`-th occurrence of `delim`.
#sum(*cols) ⇒ Column

The Spark SQL `sum` function.
#sum_distinct(col) ⇒ Column

Sum of distinct values.
#tan(*cols) ⇒ Column

The Spark SQL `tan` function.
#tanh(*cols) ⇒ Column

The Spark SQL `tanh` function.
#timestamp_micros(*cols) ⇒ Column

The Spark SQL `timestamp_micros` function.
#timestamp_millis(*cols) ⇒ Column

The Spark SQL `timestamp_millis` function.
#timestamp_seconds(*cols) ⇒ Column

The Spark SQL `timestamp_seconds` function.
#to_date(col, fmt = nil) ⇒ Object
#to_json(col, options = {}) ⇒ Object
#to_timestamp(col, fmt = nil) ⇒ Object
#to_utc_timestamp(col, tz) ⇒ Object
#transform(col) {|element| ... } ⇒ Column

Transform each element of an array.
#transform_keys(col, &block) ⇒ Object
#transform_values(col, &block) ⇒ Object
#translate(col, matching, replace) ⇒ Column

Characters of `col` matching `matching` replaced per `replace`.
#trim(*cols) ⇒ Column

The Spark SQL `trim` function.
#trunc(col, fmt) ⇒ Object
#typeof(*cols) ⇒ Column

The Spark SQL `typeof` function.
#ucase(*cols) ⇒ Column

The Spark SQL `ucase` function.
#udf ⇒ Object

UDFs require a server-side execution environment (Python/Scala) and are not supported by the pure-Ruby client.
#unbase64(*cols) ⇒ Column

The Spark SQL `unbase64` function.
#unhex(*cols) ⇒ Column

The Spark SQL `unhex` function.
#unix_date(*cols) ⇒ Column

The Spark SQL `unix_date` function.
#unix_micros(*cols) ⇒ Column

The Spark SQL `unix_micros` function.
#unix_millis(*cols) ⇒ Column

The Spark SQL `unix_millis` function.
#unix_seconds(*cols) ⇒ Column

The Spark SQL `unix_seconds` function.
#unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ Object
#upper(*cols) ⇒ Column

The Spark SQL `upper` function.
#uuid ⇒ Column

The Spark SQL `uuid` function (takes no arguments).
#var_pop(*cols) ⇒ Column

The Spark SQL `var_pop` function.
#var_samp(*cols) ⇒ Column

The Spark SQL `var_samp` function.
#variance(*cols) ⇒ Column

The Spark SQL `variance` function.
#version ⇒ Column

The Spark SQL `version` function (takes no arguments).
#weekday(*cols) ⇒ Column

The Spark SQL `weekday` function.
#weekofyear(*cols) ⇒ Column

The Spark SQL `weekofyear` function.
#when(condition, value) ⇒ Column

Start a CASE WHEN expression.
#xxhash64(*cols) ⇒ Column

The Spark SQL `xxhash64` function.
#year(*cols) ⇒ Column

The Spark SQL `year` function.
#zip_with(left, right, &block) ⇒ Object

Class Attribute Details

.lambda_counter ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/spark_connect/functions.rb', line 880

def lambda_counter
  @lambda_counter
end

Instance Method Details

#_col(value) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

ColumnOrName coercion: String/Symbol -> column reference, Column -> itself, everything else -> literal.

# File 'lib/spark_connect/functions.rb', line 863

def _col(value)
  case value
  when Column then value
  when String, Symbol then col(value.to_s)
  else lit(value)
  end
end

#_lambda(block) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.

Build a Column wrapping a LambdaFunction from a Ruby block. The block is called with one or more lambda-variable columns and must return a Column.

# File 'lib/spark_connect/functions.rb', line 886

def _lambda(block)
  arity = block.arity.negative? ? 1 : [block.arity, 1].max
  Functions.lambda_counter += 1
  names = (0...arity).map { |i| "x_#{Functions.lambda_counter}_#{i}" }
  vars = names.map do |n|
    Proto::Expression::UnresolvedNamedLambdaVariable.new(name_parts: [n])
  end
  cols = vars.map { |v| Column.new(Proto::Expression.new(unresolved_named_lambda_variable: v)) }
  body = block.call(*cols)
  Column.new(Proto::Expression.new(
               lambda_function: Proto::Expression::LambdaFunction.new(function: body.to_expr, arguments: vars)
             ))
end
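
The arity rule in `_lambda` decides how many lambda variables a block receives. The sketch below isolates only that rule and the name generation, in plain Ruby. `lambda_var_names` and its `counter` parameter are hypothetical helpers introduced for illustration, and no protobuf objects are built.

```ruby
# Mirrors the arity rule above: a negative arity (e.g. a splat block) and a
# zero arity both fall back to a single lambda variable.
def lambda_var_names(block, counter)
  arity = block.arity.negative? ? 1 : [block.arity, 1].max
  (0...arity).map { |i| "x_#{counter}_#{i}" }
end

p lambda_var_names(proc { |x| x }, 1)          # ["x_1_0"]
p lambda_var_names(proc { |acc, x| acc }, 2)   # ["x_2_0", "x_2_1"]
p lambda_var_names(proc { |*xs| xs }, 3)       # ["x_3_0"]
p lambda_var_names(proc { 42 }, 4)             # ["x_4_0"]
```

Because the counter is part of every generated name, variables from nested or successive lambdas do not collide.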

#_lit_or_col(value) ⇒ `Object`

This method is part of a private API. You should avoid using this method if possible, as it may be removed or be changed in the future.



# File 'lib/spark_connect/functions.rb', line 872

def _lit_or_col(value)
  value.is_a?(Column) ? value : lit(value)
end
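
The two private helpers `_col` and `_lit_or_col` differ only in how they treat a String or Symbol. The stand-alone sketch below reproduces both coercion rules with a hypothetical `DemoColumn` struct in place of the real `Column`; the names `DemoColumn`, `demo_col`, and `demo_lit_or_col` are illustrative only.

```ruby
# Hypothetical stand-in for Column: kind is :ref (column reference) or :lit (literal).
DemoColumn = Struct.new(:kind, :value)

# Same rule as _col: Column -> itself, String/Symbol -> reference, else -> literal.
def demo_col(value)
  case value
  when DemoColumn then value
  when String, Symbol then DemoColumn.new(:ref, value.to_s)
  else DemoColumn.new(:lit, value)
  end
end

# Same rule as _lit_or_col: Column -> itself, everything else (Strings too) -> literal.
def demo_lit_or_col(value)
  value.is_a?(DemoColumn) ? value : DemoColumn.new(:lit, value)
end

p demo_col("salary").kind          # :ref
p demo_col(:salary).value          # "salary"
p demo_col(3).kind                 # :lit
p demo_lit_or_col("salary").kind   # :lit
```

This is the mechanism behind the convention stated in the overview: functions that coerce with `_col` read a String as a column name, while value-style arguments are wrapped as literals.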

#abs(*cols) ⇒ `Column`

The Spark SQL `abs` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

UNIFORM = %w[
  sum avg mean max min first last stddev stddev_samp stddev_pop variance var_samp var_pop
  skewness kurtosis collect_list collect_set first_value last_value max_by min_by corr
  covar_pop covar_samp median mode any_value every some bit_and bit_or bit_xor bool_and bool_or
  product count_if grouping
  abs acos acosh asin asinh atan atanh atan2 bin cbrt ceil ceiling cos cosh cot csc degrees
  exp expm1 factorial floor hypot ln log log2 log10 log1p negative negate positive pow power
  radians rint sec signum sin sinh sqrt tan tanh hex unhex pmod isnan isnull positive
  upper lower ltrim rtrim trim length char_length character_length octet_length bit_length
  reverse ascii base64 unbase64 initcap soundex crc32 md5 sha1 sha ucase lcase
  size cardinality array_distinct array_max array_min array_compact flatten explode explode_outer
  posexplode posexplode_outer inline inline_outer map_keys map_values map_entries map_from_entries
  array_sort shuffle arrays_zip map_concat concat greatest least hash xxhash64
  array_union array_intersect array_except arrays_overlap
  year quarter month dayofmonth day dayofweek dayofyear hour minute second weekofyear last_day
  weekday unix_date unix_micros unix_millis unix_seconds timestamp_seconds timestamp_millis
  timestamp_micros date_from_unix_date
  bitwise_not bit_count typeof
].uniq.freeze

#acos(*cols) ⇒ `Column`

The Spark SQL `acos` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#acosh(*cols) ⇒ `Column`

The Spark SQL `acosh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#add_months(col, months) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 159

def add_months(col, months) = Column.invoke("add_months", _col(col), lit(months))

#aggregate(col, initial, merge, finish = nil) ⇒ `Column`

Aggregate (fold) an array. `merge` combines accumulator and element; optional `finish` post-processes the result.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 258

def aggregate(col, initial, merge, finish = nil)
  args = [_col(col), _col(initial), _lambda(merge)]
  args << _lambda(finish) if finish
  Column.invoke("aggregate", *args)
end
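
Semantically, `aggregate` is a left fold followed by an optional finishing step. The plain-Ruby sketch below shows the same shape on an ordinary Array; it runs locally and does not call Spark. `fold_array` is a hypothetical helper, not part of the library.

```ruby
# initial: starting accumulator
# merge:   (accumulator, element) -> new accumulator
# finish:  optional accumulator -> final result
def fold_array(values, initial, merge, finish = nil)
  acc = values.reduce(initial) { |memo, x| merge.call(memo, x) }
  finish ? finish.call(acc) : acc
end

add = ->(acc, x) { acc + x }

sum  = fold_array([1, 2, 3, 4], 0, add)
mean = fold_array([1, 2, 3, 4], 0, add, ->(acc) { acc / 4.0 })

p sum   # 10
p mean  # 2.5
```

With the real method, `merge` and `finish` are passed to `_lambda`, which calls `arity` and `call` on them, so they need to be procs or lambdas that take and return Column values.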

#any_value(*cols) ⇒ `Column`

The Spark SQL `any_value` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#approx_count_distinct(col, rsd = nil) ⇒ `Column`

Returns the approximate distinct count of `col`, optionally bounded by `rsd`, the maximum relative standard deviation.

Returns:

(Column): the approximate distinct count.

# File 'lib/spark_connect/functions.rb', line 70

def approx_count_distinct(col, rsd = nil)
  rsd.nil? ? Column.invoke("approx_count_distinct", _col(col)) : Column.invoke("approx_count_distinct", _col(col), lit(rsd))
end

#array(*cols) ⇒ `Column`

Returns an array from the given columns.

Returns:

(Column): an array from the given columns.

# File 'lib/spark_connect/functions.rb', line 96

def array(*cols) = Column.invoke("array", *cols.map { |c| _col(c) })

#array_append(col, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 201

def array_append(col, value) = Column.invoke("array_append", _col(col), lit(value))

#array_compact(*cols) ⇒ `Column`

The Spark SQL `array_compact` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#array_contains(col, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 197

def array_contains(col, value) = Column.invoke("array_contains", _col(col), lit(value))

#array_distinct(*cols) ⇒ `Column`

The Spark SQL `array_distinct` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_except(*cols) ⇒ `Column`

The Spark SQL `array_except` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_insert(col, pos, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 203

def array_insert(col, pos, value) = Column.invoke("array_insert", _col(col), lit(pos), lit(value))

#array_intersect(*cols) ⇒ `Column`

The Spark SQL `array_intersect` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_join(col, delimiter, null_replacement = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 205

def array_join(col, delimiter, null_replacement = nil)
  if null_replacement.nil?
    Column.invoke("array_join", _col(col), lit(delimiter))
  else
    Column.invoke("array_join", _col(col), lit(delimiter), lit(null_replacement))
  end
end
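The branch above sends two arguments when `null_replacement` is omitted and three when it is given. In Spark SQL, `array_join` without a replacement skips null elements rather than rendering them. The sketch below mirrors only the argument handling; `JoinCall` and `array_join_sketch` are hypothetical names, and the real method wraps its arguments in Column expressions instead of passing raw values.

```ruby
# Hypothetical record of a function call: name plus positional arguments.
JoinCall = Struct.new(:fn, :args)

# Mirrors the arity logic: the third argument is only sent when provided.
def array_join_sketch(col, delimiter, null_replacement = nil)
  args = [col, delimiter]
  args << null_replacement unless null_replacement.nil?
  JoinCall.new("array_join", args)
end

two_arg   = array_join_sketch("tags", ", ")
three_arg = array_join_sketch("tags", ", ", "n/a")

puts two_arg.args.length   # 2
puts three_arg.args.length # 3
```

Because `nil` is the sentinel for "not provided", this shape cannot express a replacement that is itself null; that matches the signature shown above.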

#array_max(*cols) ⇒ `Column`

The Spark SQL `array_max` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_min(*cols) ⇒ `Column`

The Spark SQL `array_min` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_position(col, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 198

def array_position(col, value) = Column.invoke("array_position", _col(col), lit(value))

#array_prepend(col, value) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 202

def array_prepend(col, value) = Column.invoke("array_prepend", _col(col), lit(value))

#array_remove(col, element) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 199

def array_remove(col, element) = Column.invoke("array_remove", _col(col), lit(element))

#array_repeat(col, count) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 200

def array_repeat(col, count) = Column.invoke("array_repeat", _col(col), lit(count))

#array_sort(*cols) ⇒ `Column`

The Spark SQL `array_sort` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#array_union(*cols) ⇒ `Column`

The Spark SQL `array_union` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_overlap(*cols) ⇒ `Column`

The Spark SQL `arrays_overlap` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#arrays_zip(*cols) ⇒ `Column`

The Spark SQL `arrays_zip` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asc(col) ⇒ `Column`

Returns an ascending sort order for the named/given column.

Returns:

(Column) —

an ascending sort order for the named/given column.

# File 'lib/spark_connect/functions.rb', line 42

def asc(col) = _col(col).asc

#asc_nulls_first(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 44

def asc_nulls_first(col) = _col(col).asc_nulls_first

#asc_nulls_last(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 45

def asc_nulls_last(col) = _col(col).asc_nulls_last
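The two `nulls` variants differ only in where null values land in an ascending sort. The plain-Ruby illustration below shows the two orderings on a small array; it does not use the gem's API, and `nil` stands in for a SQL null. Spark's plain ascending order places nulls first by default.

```ruby
values = [3, nil, 1, 2]

# Separate nulls from sortable values, then place the nulls at either end.
non_null = values.compact.sort
nulls    = values.select(&:nil?)

asc_nulls_first = nulls + non_null
asc_nulls_last  = non_null + nulls

p asc_nulls_first # [nil, 1, 2, 3]
p asc_nulls_last  # [1, 2, 3, nil]
```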

#ascii(*cols) ⇒ `Column`

The Spark SQL `ascii` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asin(*cols) ⇒ `Column`

The Spark SQL `asin` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#asinh(*cols) ⇒ `Column`

The Spark SQL `asinh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atan(*cols) ⇒ `Column`

The Spark SQL `atan` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atan2(*cols) ⇒ `Column`

The Spark SQL `atan2` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#atanh(*cols) ⇒ `Column`

The Spark SQL `atanh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#avg(*cols) ⇒ `Column`

The Spark SQL `avg` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#base64(*cols) ⇒ `Column`

The Spark SQL `base64` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bin(*cols) ⇒ `Column`

The Spark SQL `bin` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_and(*cols) ⇒ `Column`

The Spark SQL `bit_and` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_count(*cols) ⇒ `Column`

The Spark SQL `bit_count` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_length(*cols) ⇒ `Column`

The Spark SQL `bit_length` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_or(*cols) ⇒ `Column`

The Spark SQL `bit_or` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bit_xor(*cols) ⇒ `Column`

The Spark SQL `bit_xor` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bitwise_not(*cols) ⇒ `Column`

The Spark SQL `bitwise_not` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bool_and(*cols) ⇒ `Column`

The Spark SQL `bool_and` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#bool_or(*cols) ⇒ `Column`

The Spark SQL `bool_or` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#broadcast(df) ⇒ `DataFrame`

Marks a DataFrame for a broadcast (map-side) join.

Parameters:

df (DataFrame)

Returns:

(DataFrame)

# File 'lib/spark_connect/functions.rb', line 269

def broadcast(df) = df.hint("broadcast")

#bround(col, scale = 0) ⇒ `Column`

Rounds `col` to `scale` decimal places using HALF_EVEN (“banker’s”) rounding.

Returns:

(Column) —

HALF_EVEN (“banker’s”) rounding of `col` to `scale` decimal places.

# File 'lib/spark_connect/functions.rb', line 82

def bround(col, scale = 0) = Column.invoke("bround", _col(col), lit(scale))
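HALF_EVEN rounding sends an exact tie to the neighbour whose last digit is even, which is what distinguishes it from ordinary half-up rounding. The rule can be illustrated in plain Ruby with `Float#round(half: :even)`. This sketch does not call Spark, and `banker_round` is a name made up for the example.

```ruby
# Plain-Ruby illustration of the HALF_EVEN tie rule; no Spark involved.
# `banker_round` is a hypothetical helper used only in this sketch.
def banker_round(value, scale = 0)
  value.round(scale, half: :even)
end

banker_round(0.5) # tie between 0 and 1 resolves to the even neighbour, 0
banker_round(1.5) # tie between 1 and 2 resolves to the even neighbour, 2
banker_round(2.5) # tie between 2 and 3 resolves to 2 (half-up would give 3)
```

Only exact ties are affected; values such as 2.6 round the same way under either rule.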

#cardinality(*cols) ⇒ `Column`

The Spark SQL `cardinality` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#cbrt(*cols) ⇒ `Column`

The Spark SQL `cbrt` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#ceil(*cols) ⇒ `Column`

The Spark SQL `ceil` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#ceiling(*cols) ⇒ `Column`

The Spark SQL `ceiling` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#char_length(*cols) ⇒ `Column`

The Spark SQL `char_length` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#character_length(*cols) ⇒ `Column`

The Spark SQL `character_length` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#coalesce(*cols) ⇒ `Column`

Returns the first non-null value among the given columns.

Returns:

(Column) —

the first non-null value among the given columns.

# File 'lib/spark_connect/functions.rb', line 87

def coalesce(*cols) = Column.invoke("coalesce", *cols.map { |c| _col(c) })
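`coalesce` is evaluated per row: for each row it yields the value of the first argument that is not null. A plain-Ruby analogue, with `nil` standing in for SQL NULL, may make that concrete. `first_non_null` is a hypothetical name for this sketch and is not part of the library.

```ruby
# Row-wise analogue of coalesce in plain Ruby; nil plays the role of NULL.
# `first_non_null` is a hypothetical helper, not a SparkConnect method.
def first_non_null(*values)
  values.find { |v| !v.nil? }
end

rows = [
  { nickname: nil,   name: "Ada" },
  { nickname: "Bob", name: "Robert" },
  { nickname: nil,   name: nil }
]

# Prefer the nickname, fall back to the name, stay nil when both are missing.
display_names = rows.map { |row| first_non_null(row[:nickname], row[:name]) }
```

Only `nil` is skipped here; `false` and empty strings count as real values, mirroring the fact that SQL NULL is distinct from both.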

#col(name) ⇒ `Column` Also known as: column

A column reference by name. `"*"` selects all columns.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 28

def col(name) = Column.from_name(name.to_s)

#collect_list(*cols) ⇒ `Column`

The Spark SQL `collect_list` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#collect_set(*cols) ⇒ `Column`

The Spark SQL `collect_set` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#concat(*cols) ⇒ `Column`

The Spark SQL `concat` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#concat_ws(sep, *cols) ⇒ `Column`

Returns the concatenation of the given columns, separated by the literal `sep`.

Returns:

(Column) —

the concatenation of the given columns, separated by the literal `sep`.

# File 'lib/spark_connect/functions.rb', line 107

def concat_ws(sep, *cols) = Column.invoke("concat_ws", lit(sep), *cols.map { |c| _col(c) })
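Note the asymmetry in the arguments: `sep` is wrapped in `lit` and so is a literal string, while the remaining String arguments are column names. Spark's `concat_ws` also skips NULL inputs rather than propagating them. A plain-Ruby analogue of the per-row result, with `nil` standing in for NULL (`join_with_separator` is a hypothetical name for this sketch):

```ruby
# Per-row analogue of concat_ws in plain Ruby; nil plays the role of NULL.
# `join_with_separator` is a hypothetical helper, not a SparkConnect method.
def join_with_separator(sep, *values)
  values.compact.join(sep) # nil inputs are dropped before joining
end

join_with_separator("-", "2024", "01", "15")  # every part present
join_with_separator(" ", "Ada", nil, "Byron") # the missing middle part is skipped
```

Because missing parts are dropped, no doubled separator appears where a value was NULL.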

#conv(col, from_base, to_base) ⇒ `Column`

Converts a number string from base `from_base` to base `to_base`.

Returns:

(Column) —

the number string converted from base `from_base` to base `to_base`.

# File 'lib/spark_connect/functions.rb', line 145

def conv(col, from_base, to_base) = Column.invoke("conv", _col(col), lit(from_base), lit(to_base))
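The value being converted is a string of digits, and both bases are literals. A plain-Ruby analogue of the conversion for a single value is sketched below. `convert_base` is a hypothetical name, the result is upper-cased on the assumption that Spark emits upper-case digits for bases above 10, and invalid input is not modelled.

```ruby
# Single-value analogue of conv in plain Ruby; no Spark involved.
# `convert_base` is a hypothetical helper. Upper-casing the result is an
# assumption about Spark's output, and invalid digits are not handled
# the way Spark would handle them.
def convert_base(digits, from_base, to_base)
  digits.to_i(from_base).to_s(to_base).upcase
end

convert_base("100", 2, 10) # binary 100 read as an integer, written in decimal
convert_base("255", 10, 16) # decimal 255 written in hexadecimal
```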

#corr(*cols) ⇒ `Column`

The Spark SQL `corr` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#cos(*cols) ⇒ `Column`

The Spark SQL `cos` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#cosh(*cols) ⇒ `Column`

The Spark SQL `cosh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#cot(*cols) ⇒ `Column`

The Spark SQL `cot` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#count(col) ⇒ `Column`

Returns the count of rows (or of the non-null values in a column). `"*"` counts all rows.

Returns:

(Column) —

the count of rows (or of the non-null values in a column). `"*"` counts all rows.

# File 'lib/spark_connect/functions.rb', line 59

def count(col)
  col.to_s == "*" ? Column.invoke("count", lit(1)) : Column.invoke("count", _col(col))
end

#count_distinct(*cols) ⇒ `Column` Also known as: countDistinct

Returns the count of distinct combinations of the given columns.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 64

def count_distinct(*cols)
  Column.invoke("count", *cols.map { |c| _col(c) }, is_distinct: true)
end
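
Both count helpers build the same SQL `count` call and differ only in their arguments. The sketch below replays the two method bodies shown above against a hypothetical stand-in (`SketchColumn`, with assumed `lit` and `_col` helpers) so the resulting call shapes can be inspected without a Spark session; it is not the library's own `Column`.

```ruby
# Stand-in for SparkConnect::Column: records the function name, arguments,
# and the distinct flag.
SketchColumn = Struct.new(:name, :args, :is_distinct) do
  def self.invoke(name, *args, is_distinct: false)
    new(name, args, is_distinct)
  end
end

module SketchFunctions
  # Assumed helper: wraps a Ruby value as a literal expression.
  def self.lit(value)
    SketchColumn.new("lit", [value], false)
  end

  # Assumed helper: a String names a column; expressions pass through.
  def self._col(value)
    value.is_a?(SketchColumn) ? value : SketchColumn.new("col", [value], false)
  end

  # Same logic as the listing above: "*" becomes count(1), i.e. all rows.
  def self.count(col)
    col.to_s == "*" ? SketchColumn.invoke("count", lit(1)) : SketchColumn.invoke("count", _col(col))
  end

  # Same logic as the listing above: count with the distinct flag set.
  def self.count_distinct(*cols)
    SketchColumn.invoke("count", *cols.map { |c| _col(c) }, is_distinct: true)
  end
end

all_rows = SketchFunctions.count("*")                      # count(lit(1))
non_null = SketchFunctions.count("salary")                 # count(col salary)
distinct = SketchFunctions.count_distinct("dept", "city")  # count(DISTINCT dept, city)
```

Because the check is `col.to_s == "*"`, the Symbol `:*` would take the same all-rows branch as the String `"*"`.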

#count_if(*cols) ⇒ `Column`

The Spark SQL `count_if` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#covar_pop(*cols) ⇒ `Column`

The Spark SQL `covar_pop` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#covar_samp(*cols) ⇒ `Column`

The Spark SQL `covar_samp` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#crc32(*cols) ⇒ `Column`

The Spark SQL `crc32` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#create_map(*cols) ⇒ `Column`

Returns a map built from alternating key/value columns.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 98

def create_map(*cols) = Column.invoke("map", *cols.map { |c| _col(c) })
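
Since every argument passes through `_col`, a bare String is read as a column name, so a literal map key has to be wrapped with `lit`. The following sketch shows the alternating key/value layout; `SketchColumn`, `SketchFunctions`, and the helper bodies are hypothetical stand-ins for illustration, and the column names are made up.

```ruby
# Stand-in for SparkConnect::Column: records the function name and arguments.
SketchColumn = Struct.new(:name, :args) do
  def self.invoke(name, *args)
    new(name, args)
  end
end

module SketchFunctions
  # Assumed helper: wraps a Ruby value as a literal expression.
  def self.lit(value)
    SketchColumn.new("lit", [value])
  end

  # Assumed helper: a String names a column; expressions pass through.
  def self._col(value)
    value.is_a?(SketchColumn) ? value : SketchColumn.new("col", [value])
  end

  # Same logic as the listing above: forwards to the SQL `map` function.
  def self.create_map(*cols)
    SketchColumn.invoke("map", *cols.map { |c| _col(c) })
  end
end

# Keys are literals, values come from the (hypothetical) columns user_id and user_name.
m = SketchFunctions.create_map(
  SketchFunctions.lit("id"),   "user_id",
  SketchFunctions.lit("name"), "user_name"
)

# Arguments pair up in order: [key, value], [key, value], ...
pairs = m.args.each_slice(2).to_a
```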

#csc(*cols) ⇒ `Column`

The Spark SQL `csc` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#cume_dist ⇒ `Column`

The Spark SQL `cume_dist` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_catalog ⇒ `Column`

The Spark SQL `current_catalog` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_database ⇒ `Column`

The Spark SQL `current_database` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_date ⇒ `Column`

The Spark SQL `current_date` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_schema ⇒ `Column`

The Spark SQL `current_schema` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_timestamp ⇒ `Column`

The Spark SQL `current_timestamp` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_timezone ⇒ `Column`

The Spark SQL `current_timezone` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#current_user ⇒ `Column`

The Spark SQL `current_user` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#date_add(col, days) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 156

def date_add(col, days) = Column.invoke("date_add", _col(col), lit(days))

#date_format(col, fmt) ⇒ `Object`

Date / time functions with literal arguments.

# File 'lib/spark_connect/functions.rb', line 153

def date_format(col, fmt) = Column.invoke("date_format", _col(col), lit(fmt))

#date_from_unix_date(*cols) ⇒ `Column`

The Spark SQL `date_from_unix_date` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#date_sub(col, days) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 157

def date_sub(col, days) = Column.invoke("date_sub", _col(col), lit(days))

#date_trunc(fmt, col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 163

def date_trunc(fmt, col) = Column.invoke("date_trunc", lit(fmt), _col(col))

#datediff(end_col, start_col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 158

def datediff(end_col, start_col) = Column.invoke("datediff", _col(end_col), _col(start_col))
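
The date helpers mix column arguments with literal ones, and the order is easy to get wrong: `date_trunc` takes its format first, whereas `date_add` and `date_sub` take the column first. The sketch below replays three of the method bodies against a hypothetical stand-in (`SketchColumn`, with assumed `lit` and `_col` helpers) to show which argument ends up as a literal; the column names are invented for the example.

```ruby
# Stand-in for SparkConnect::Column: records the function name and arguments.
SketchColumn = Struct.new(:name, :args) do
  def self.invoke(name, *args)
    new(name, args)
  end
end

module SketchFunctions
  # Assumed helper: wraps a Ruby value as a literal expression.
  def self.lit(value)
    SketchColumn.new("lit", [value])
  end

  # Assumed helper: a String names a column; expressions pass through.
  def self._col(value)
    value.is_a?(SketchColumn) ? value : SketchColumn.new("col", [value])
  end

  # Column first, then a literal day count.
  def self.date_add(col, days)
    SketchColumn.invoke("date_add", _col(col), lit(days))
  end

  # Literal format first, then the column.
  def self.date_trunc(fmt, col)
    SketchColumn.invoke("date_trunc", lit(fmt), _col(col))
  end

  # Both arguments are columns: end date first, start date second.
  def self.datediff(end_col, start_col)
    SketchColumn.invoke("datediff", _col(end_col), _col(start_col))
  end
end

plus_week = SketchFunctions.date_add("order_date", 7)
by_month  = SketchFunctions.date_trunc("month", "order_ts")
elapsed   = SketchFunctions.datediff("shipped_on", "ordered_on")

# Kinds of argument each call received, in order.
kinds = [plus_week, by_month, elapsed].map { |c| c.args.map(&:name) }
```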

#day(*cols) ⇒ `Column`

The Spark SQL `day` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofmonth(*cols) ⇒ `Column`

The Spark SQL `dayofmonth` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofweek(*cols) ⇒ `Column`

The Spark SQL `dayofweek` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dayofyear(*cols) ⇒ `Column`

The Spark SQL `dayofyear` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#degrees(*cols) ⇒ `Column`

The Spark SQL `degrees` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#dense_rank ⇒ `Column`

The Spark SQL `dense_rank` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#desc(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 43

def desc(col) = _col(col).desc

#desc_nulls_first(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 46

def desc_nulls_first(col) = _col(col).desc_nulls_first

#desc_nulls_last(col) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 47

def desc_nulls_last(col) = _col(col).desc_nulls_last

#element_at(col, extraction) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 214

def element_at(col, extraction) = Column.invoke("element_at", _col(col), lit(extraction))

#every(*cols) ⇒ `Column`

The Spark SQL `every` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#exists(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 246

def exists(col, &block) = Column.invoke("exists", _col(col), _lambda(block))

#exp(*cols) ⇒ `Column`

The Spark SQL `exp` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#explode(*cols) ⇒ `Column`

The Spark SQL `explode` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#explode_outer(*cols) ⇒ `Column`

The Spark SQL `explode_outer` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#expm1(*cols) ⇒ `Column`

The Spark SQL `expm1` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#expr(sql) ⇒ `Column`

Parse a SQL expression string into a Column.

Returns:

(Column)




# File 'lib/spark_connect/functions.rb', line 37

def expr(sql)
  Column.from_expr(Proto::Expression.new(expression_string: Proto::Expression::ExpressionString.new(expression: sql)))
end

#factorial(*cols) ⇒ `Column`

The Spark SQL `factorial` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#filter(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 248

def filter(col, &block) = Column.invoke("filter", _col(col), _lambda(block))

#first(*cols) ⇒ `Column`

The Spark SQL `first` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#first_value(*cols) ⇒ `Column`

The Spark SQL `first_value` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#flatten(*cols) ⇒ `Column`

The Spark SQL `flatten` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#floor(*cols) ⇒ `Column`

The Spark SQL `floor` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#forall(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 247

def forall(col, &block) = Column.invoke("forall", _col(col), _lambda(block))
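
The three block-taking functions `exists`, `filter`, and `forall` apply a predicate to each element of an array column. Their Spark SQL semantics can be pictured with plain Ruby arrays, as in the sketch below. It does not call the library at all: `scores` and `is_large` are made-up names, and Spark's handling of null elements (three-valued logic) is not modelled.

```ruby
# A plain Ruby array standing in for the value of one array column in one row.
scores = [3, 8, 12]

# The predicate that would be passed as the block.
is_large = ->(x) { x > 10 }

exists_result = scores.any?(&is_large)    # like exists: at least one element matches
forall_result = scores.all?(&is_large)    # like forall: every element matches
filter_result = scores.select(&is_large)  # like filter: keep only matching elements
```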

#format_number(col, d) ⇒ `Column`

Returns number formatted to `d` decimal places.

Returns:

(Column) —

number formatted to `d` decimal places.




# File 'lib/spark_connect/functions.rb', line 111

def format_number(col, d) = Column.invoke("format_number", _col(col), lit(d))

#format_string(fmt, *cols) ⇒ `Column`

Returns printf-style formatting using literal `fmt`.

Returns:

(Column) —

printf-style formatting using literal `fmt`.




# File 'lib/spark_connect/functions.rb', line 109

def format_string(fmt, *cols) = Column.invoke("format_string", lit(fmt), *cols.map { |c| _col(c) })

#from_json(col, schema, options = {}) ⇒ `Object`

Parameters:

schema (Types::DataType, String)

# File 'lib/spark_connect/functions.rb', line 180

def from_json(col, schema, options = {})
  schema_col = schema.is_a?(Types::DataType) ? lit(schema.json) : lit(schema.to_s)
  args = [_col(col), schema_col] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("from_json", *args)
end
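
The `options` hash is flattened into alternating key and value arguments, each converted to a String before being wrapped as a literal. The sketch below reproduces only that flattening step with plain Ruby values and leaves out the `lit` wrapping; the option names are illustrative.

```ruby
# Keys may be Symbols or Strings, and values may be non-String; all are stringified.
options = { mode: "PERMISSIVE", "allowComments" => true }

# Same shape as the flat_map in from_json, minus the lit(...) wrapping.
flat_args = options.flat_map { |k, v| [k.to_s, v.to_s] }
```

`flat_args` is `["mode", "PERMISSIVE", "allowComments", "true"]`, so a boolean `true` reaches the server as the string `"true"`.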

#from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 164

def from_unixtime(col, fmt = "yyyy-MM-dd HH:mm:ss") = Column.invoke("from_unixtime", _col(col), lit(fmt))
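
To picture what the default pattern `yyyy-MM-dd HH:mm:ss` produces, the sketch below renders epoch seconds with the equivalent Ruby `strftime` directives. It is a local approximation, not a call into the library: `render_unixtime` is a hypothetical helper, and it assumes UTC, whereas Spark renders in the session time zone.

```ruby
# strftime equivalent of the Spark pattern "yyyy-MM-dd HH:mm:ss".
DEFAULT_STRFTIME = "%Y-%m-%d %H:%M:%S"

# Hypothetical helper: format epoch seconds as a UTC timestamp string.
def render_unixtime(seconds, fmt = DEFAULT_STRFTIME)
  Time.at(seconds).utc.strftime(fmt)
end

one_day = render_unixtime(86_400)
```

Under the UTC assumption, `one_day` is `"1970-01-02 00:00:00"`.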

#from_utc_timestamp(col, tz) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 170

def from_utc_timestamp(col, tz) = Column.invoke("from_utc_timestamp", _col(col), lit(tz))

#get_json_object(col, path) ⇒ `Object`


# File 'lib/spark_connect/functions.rb', line 176

def get_json_object(col, path) = Column.invoke("get_json_object", _col(col), lit(path))

#greatest(*cols) ⇒ `Column`

The Spark SQL `greatest` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#grouping(*cols) ⇒ `Column`

The Spark SQL `grouping` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#hash(*cols) ⇒ `Column`

The Spark SQL `hash` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#hex(*cols) ⇒ `Column`

The Spark SQL `hex` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#hour(*cols) ⇒ `Column`

The Spark SQL `hour` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#hypot(*cols) ⇒ `Column`

The Spark SQL `hypot` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#initcap(*cols) ⇒ `Column`

The Spark SQL `initcap` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#inline(*cols) ⇒ `Column`

The Spark SQL `inline` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#inline_outer(*cols) ⇒ `Column`

The Spark SQL `inline_outer` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_block_length ⇒ `Column`

The Spark SQL `input_file_block_length` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_block_start ⇒ `Column`

The Spark SQL `input_file_block_start` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#input_file_name ⇒ `Column`

The Spark SQL `input_file_name` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#instr(col, substr) ⇒ `Column`

Returns the 1-based position of the literal `substr` within `col`, or 0 if `substr` is absent.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 117

def instr(col, substr) = Column.invoke("instr", _col(col), lit(substr))
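
The 1-based, 0-if-absent contract can be illustrated without a Spark session. This is a plain-Ruby emulation of the documented behaviour on a single string, not the Spark implementation; `instr_like` is a hypothetical helper name:

```ruby
# Plain-Ruby emulation of the documented contract; it does not call Spark.
def instr_like(text, substr)
  idx = text.index(substr) # 0-based index, or nil when absent
  idx ? idx + 1 : 0
end

puts instr_like("spark connect", "connect") # 7
puts instr_like("spark connect", "flink")   # 0
```

A result of 0 can therefore never be confused with a match at the start of the string, which is reported as 1.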

#isnan(*cols) ⇒ `Column`

The Spark SQL `isnan` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#isnull(*cols) ⇒ `Column`

The Spark SQL `isnull` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#json_tuple(col, *fields) ⇒ `Column`

Extracts the named `fields` from the JSON string in `col`. Each field name is passed as a literal.

# File 'lib/spark_connect/functions.rb', line 177

def json_tuple(col, *fields) = Column.invoke("json_tuple", _col(col), *fields.map { |f| lit(f) })
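
The signature takes the field names as a splat and wraps each one as a literal. A minimal sketch of that argument handling, using hypothetical stand-ins (`lit_sketch`, `invoke_sketch`, `json_tuple_sketch`) in place of the real `lit` and `Column.invoke`:

```ruby
# Hypothetical stand-ins: lit_sketch wraps a value, invoke_sketch records a call.
def lit_sketch(value)
  { literal: value }
end

def invoke_sketch(name, *args)
  { fn: name, args: args }
end

def json_tuple_sketch(col, *fields)
  # The splat keeps the argument list flat: one literal per field name.
  invoke_sketch("json_tuple", col, *fields.map { |f| lit_sketch(f) })
end

call = json_tuple_sketch("payload", "id", "name")
puts call[:args].length # 3
```

Without the splat on the mapped array, the call would receive a single nested Array as its second argument instead of one literal per field.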

#kurtosis(*cols) ⇒ `Column`

The Spark SQL `kurtosis` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#lag(col, offset = 1, default = nil) ⇒ `Column`

Window / analytic function. Returns the value of `col` from `offset` rows before the current row in the window, or `default` when no such row exists.

# File 'lib/spark_connect/functions.rb', line 225

def lag(col, offset = 1, default = nil) = Column.invoke("lag", _col(col), lit(offset), lit(default))
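
`lag` reads a value from an earlier row of an ordered window. The effect of `offset` and `default` can be sketched on a plain Ruby array standing in for one ordered partition; `lag_like` is a hypothetical helper, and nothing here calls Spark:

```ruby
# Plain-Ruby emulation over an already ordered list; it does not call Spark.
def lag_like(values, offset = 1, default = nil)
  values.each_index.map do |i|
    j = i - offset
    # Rows with no earlier neighbour at that distance get the default.
    j.between?(0, values.length - 1) ? values[j] : default
  end
end

p lag_like([10, 20, 30])       # [nil, 10, 20]
p lag_like([10, 20, 30], 2, 0) # [0, 0, 10]
```

`lead` takes the same parameters and looks forward instead of backward.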

#last(*cols) ⇒ `Column`

The Spark SQL `last` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#last_day(*cols) ⇒ `Column`

The Spark SQL `last_day` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#last_value(*cols) ⇒ `Column`

The Spark SQL `last_value` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#lcase(*cols) ⇒ `Column`

The Spark SQL `lcase` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


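#lead(col, offset = 1, default = nil) ⇒ `Column`

Window / analytic function. Returns the value of `col` from `offset` rows after the current row in the window, or `default` when no such row exists.

# File 'lib/spark_connect/functions.rb', line 226

def lead(col, offset = 1, default = nil) = Column.invoke("lead", _col(col), lit(offset), lit(default))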

#least(*cols) ⇒ `Column`

The Spark SQL `least` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#length(*cols) ⇒ `Column`

The Spark SQL `length` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#lit(value) ⇒ `Column`

A literal value column. See Column.lit for supported Ruby types.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 33

def lit(value) = Column.lit(value)

#ln(*cols) ⇒ `Column`

The Spark SQL `ln` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#locate(substr, col, pos = 1) ⇒ `Column`

Returns the 1-based position of the literal `substr` in `col`, searching from position `pos` onward.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 119

def locate(substr, col, pos = 1) = Column.invoke("locate", lit(substr), _col(col), lit(pos))
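
Note the argument order: `locate` takes the substring first, whereas `instr` takes the column first. The effect of `pos` can be sketched in plain Ruby on a single string. `locate_like` is a hypothetical helper that does not call Spark; returning 0 for no match is assumed here to mirror `instr`, and the sketch only handles `pos >= 1`:

```ruby
# Plain-Ruby emulation of a 1-based search that starts at position pos.
def locate_like(substr, text, pos = 1)
  return 0 if pos < 1 # assumption of this sketch: positions start at 1

  idx = text.index(substr, pos - 1) # convert to Ruby's 0-based offset
  idx ? idx + 1 : 0
end

puts locate_like("a", "banana")    # 2
puts locate_like("a", "banana", 3) # 4
puts locate_like("z", "banana")    # 0
```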

#log(*cols) ⇒ `Column`

The Spark SQL `log` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#log10(*cols) ⇒ `Column`

The Spark SQL `log10` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#log1p(*cols) ⇒ `Column`

The Spark SQL `log1p` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#log2(*cols) ⇒ `Column`

The Spark SQL `log2` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#lower(*cols) ⇒ `Column`

The Spark SQL `lower` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#lpad(col, len, pad) ⇒ `Column`

Returns `col` left-padded with `pad` to length `len`.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 121

def lpad(col, len, pad) = Column.invoke("lpad", _col(col), lit(len), lit(pad))
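
Left padding can be sketched in plain Ruby on a single string. `lpad_like` is a hypothetical helper that does not call Spark; it assumes Spark SQL's behaviour of shortening input that is already longer than `len`, and it does not handle an empty `pad`:

```ruby
# Plain-Ruby emulation of left padding to a fixed length.
def lpad_like(text, len, pad)
  # Assumption: over-long input is cut down to len characters.
  return text[0, len] if text.length >= len

  text.rjust(len, pad) # repeats pad as needed on the left
end

puts lpad_like("42", 5, "0")    # 00042
puts lpad_like("spark", 3, "0") # spa
puts lpad_like("hi", 5, "ab")   # abahi
```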

#ltrim(*cols) ⇒ `Column`

The Spark SQL `ltrim` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#make_date(year, month, day) ⇒ `Column`

Builds a date from `year`, `month`, and `day`. String arguments are treated as column names.

# File 'lib/spark_connect/functions.rb', line 172

def make_date(year, month, day) = Column.invoke("make_date", _col(year), _col(month), _col(day))

#map_concat(*cols) ⇒ `Column`

The Spark SQL `map_concat` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_contains_key(col, key) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 221

def map_contains_key(col, key) = Column.invoke("map_contains_key", _col(col), lit(key))

#map_entries(*cols) ⇒ `Column`

The Spark SQL `map_entries` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_filter(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 252

def map_filter(col, &block) = Column.invoke("map_filter", _col(col), _lambda(block))

#map_from_arrays(keys, values) ⇒ `Column`

Returns a map from two array columns (keys, values).

Returns:

(Column) —

a map from two array columns (keys, values).




# File 'lib/spark_connect/functions.rb', line 100

def map_from_arrays(keys, values) = Column.invoke("map_from_arrays", _col(keys), _col(values))
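
As a plain-Ruby analogy for what `map_from_arrays` computes per row, the sketch below pairs each key with the value at the same index. `map_from_arrays_model` is a hypothetical helper operating on ordinary Ruby arrays; it does not call the library or Spark. The equal-length check reflects Spark's documented expectation that both arrays have the same length.

```ruby
# Plain-Ruby model only: pair each key with the value at the same position.
def map_from_arrays_model(keys, values)
  unless keys.length == values.length
    raise ArgumentError, "keys and values must have the same length"
  end

  keys.zip(values).to_h
end

pairs = map_from_arrays_model(%w[a b], [1, 2])
```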

#map_from_entries(*cols) ⇒ `Column`

The Spark SQL `map_from_entries` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_keys(*cols) ⇒ `Column`

The Spark SQL `map_keys` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_values(*cols) ⇒ `Column`

The Spark SQL `map_values` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#map_zip_with(c1, c2, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 253

def map_zip_with(c1, c2, &block) = Column.invoke("map_zip_with", _col(c1), _col(c2), _lambda(block))
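
To make the merge semantics of Spark's `map_zip_with` concrete, here is a plain-Ruby model over ordinary Hashes. `map_zip_with_model` is a hypothetical helper, not part of the library; it assumes Spark's documented behavior that the function receives the key and both values, with a missing value passed as null (`nil` here).

```ruby
# Plain-Ruby model only: merge two Hashes key by key.
# The block receives (key, value_from_m1, value_from_m2); a key missing
# from one side yields nil for that side.
def map_zip_with_model(m1, m2)
  (m1.keys | m2.keys).to_h { |k| [k, yield(k, m1[k], m2[k])] }
end

merged = map_zip_with_model({ "a" => 1, "b" => 2 }, { "a" => 10, "c" => 30 }) do |_key, v1, v2|
  (v1 || 0) + (v2 || 0)
end
```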

#max(*cols) ⇒ `Column`

The Spark SQL `max` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#max_by(*cols) ⇒ `Column`

The Spark SQL `max_by` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#md5(*cols) ⇒ `Column`

The Spark SQL `md5` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#mean(*cols) ⇒ `Column`

The Spark SQL `mean` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#median(*cols) ⇒ `Column`

The Spark SQL `median` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#min(*cols) ⇒ `Column`

The Spark SQL `min` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#min_by(*cols) ⇒ `Column`

The Spark SQL `min_by` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#minute(*cols) ⇒ `Column`

The Spark SQL `minute` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#mode(*cols) ⇒ `Column`

The Spark SQL `mode` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#monotonically_increasing_id ⇒ `Column`

The Spark SQL `monotonically_increasing_id` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#month(*cols) ⇒ `Column`

The Spark SQL `month` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#months_between(d1, d2, round_off = true) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 160

def months_between(d1, d2, round_off = true) = Column.invoke("months_between", _col(d1), _col(d2), lit(round_off))

#named_struct(*cols) ⇒ `Column`

Returns a named struct from alternating name/value arguments.

Returns:

(Column) —

a named struct from alternating name/value arguments.

# File 'lib/spark_connect/functions.rb', line 102

def named_struct(*cols) = Column.invoke("named_struct", *cols.map { |c| _col(c) })

#nanvl(col1, col2) ⇒ `Column`

Returns `col2` if `col1` is NaN, else `col1`.

Returns:

(Column) —

`col2` if `col1` is NaN, else `col1`.

# File 'lib/spark_connect/functions.rb', line 89

def nanvl(col1, col2) = Column.invoke("nanvl", _col(col1), _col(col2))
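
The per-row rule behind Spark's `nanvl` (keep the first value unless it is NaN, in which case use the second) can be modelled on plain Ruby floats. `nanvl_model` is a hypothetical helper for illustration and does not touch the library.

```ruby
# Plain-Ruby model only: return `fallback` when `value` is NaN, else `value`.
def nanvl_model(value, fallback)
  (value.respond_to?(:nan?) && value.nan?) ? fallback : value
end

kept     = nanvl_model(1.5, 0.0)
replaced = nanvl_model(Float::NAN, 0.0)
```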

#negate(*cols) ⇒ `Column`

The Spark SQL `negate` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#negative(*cols) ⇒ `Column`

The Spark SQL `negative` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#next_day(col, day_of_week) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 161

def next_day(col, day_of_week) = Column.invoke("next_day", _col(col), lit(day_of_week))
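
Spark's `next_day` returns the first date strictly after the input that falls on the given day of the week. The sketch below models that rule with Ruby's standard `Date` class. `next_day_model` and `DAY_INDEX` are hypothetical names, and the helper only looks at the first two letters of the day name, which is a simplification of what Spark accepts.

```ruby
require "date"

# Two-letter day abbreviations mapped to Date#wday indices (Sunday = 0).
DAY_INDEX = { "SU" => 0, "MO" => 1, "TU" => 2, "WE" => 3, "TH" => 4, "FR" => 5, "SA" => 6 }.freeze

# Plain-Ruby model only: first date strictly later than `date`
# that falls on `day_of_week`.
def next_day_model(date, day_of_week)
  target = DAY_INDEX.fetch(day_of_week[0, 2].upcase)
  date + ((target - date.wday - 1) % 7) + 1
end

following_tuesday = next_day_model(Date.new(2015, 1, 14), "TU")
```

For the Wednesday 2015-01-14, the model yields 2015-01-20; asking for the same weekday as the input yields the date one week later.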

#now ⇒ `Column`

The Spark SQL `now` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#nth_value(col, offset, ignore_nulls = false) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 228

def nth_value(col, offset, ignore_nulls = false) = Column.invoke("nth_value", _col(col), lit(offset), lit(ignore_nulls))

#ntile(n) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 227

def ntile(n) = Column.invoke("ntile", lit(n))
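
`ntile(n)` is a window function that assigns each row of an ordered partition to one of `n` buckets. The sketch below models the bucket numbers for a partition of a given size, assuming the usual SQL rule that bucket sizes differ by at most one and the larger buckets come first. `ntile_model` is a hypothetical helper, not part of the library.

```ruby
# Plain-Ruby model only: bucket number for each of `row_count` ordered rows.
# The first (row_count % n) buckets receive one extra row.
def ntile_model(row_count, n)
  base, extra = row_count.divmod(n)
  (1..n).flat_map { |bucket| [bucket] * (base + (bucket <= extra ? 1 : 0)) }
end

buckets = ntile_model(5, 2)
```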

#octet_length(*cols) ⇒ `Column`

The Spark SQL `octet_length` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#overlay(col, replace, pos, len = -1) ⇒ `Column`

Returns `col` with `replace` overlaid at position `pos`, replacing `len` characters.

Returns:

(Column) —

`col` with `replace` overlaid at position `pos`, replacing `len` characters.

# File 'lib/spark_connect/functions.rb', line 141

def overlay(col, replace, pos, len = -1) = Column.invoke("overlay", _col(col), _col(replace), lit(pos), lit(len))
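
As a rough illustration of what the server-side `overlay` function computes, the plain-Ruby sketch below models the behavior described in the Spark SQL documentation: `pos` is 1-based, and a negative `len` (the default) means "use the length of `replace`". `overlay_sketch` is a hypothetical helper, not part of SparkConnect, and it does not handle out-of-range positions.

```ruby
# Hypothetical local model of Spark SQL's overlay; not part of SparkConnect.
# Assumes a 1-based pos and that a negative len means "replace.length".
def overlay_sketch(input, replace, pos, len = -1)
  len = replace.length if len.negative?
  head = input[0, pos - 1]               # characters before pos
  tail = input[(pos - 1 + len)..] || ""  # characters after the replaced span
  head + replace + tail
end

puts overlay_sketch("Spark SQL", "_", 6)             # Spark_SQL
puts overlay_sketch("Spark SQL", "tructured", 2, 4)  # Structured SQL
```

In the wrapper itself, `replace` goes through `_col`, so a bare String there would presumably be read as a column name; a literal replacement would need to be wrapped, for example with `F.lit`.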

#percent_rank ⇒ `Column`

The Spark SQL `percent_rank` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#pmod(*cols) ⇒ `Column`

The Spark SQL `pmod` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#posexplode(*cols) ⇒ `Column`

The Spark SQL `posexplode` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#posexplode_outer(*cols) ⇒ `Column`

The Spark SQL `posexplode_outer` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#positive(*cols) ⇒ `Column`

The Spark SQL `positive` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#pow(*cols) ⇒ `Column`

The Spark SQL `pow` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#power(*cols) ⇒ `Column`

The Spark SQL `power` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#product(*cols) ⇒ `Column`

The Spark SQL `product` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#quarter(*cols) ⇒ `Column`

The Spark SQL `quarter` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#radians(*cols) ⇒ `Column`

The Spark SQL `radians` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#rand(seed = nil) ⇒ `Object`

Random-number function. When `seed` is given it is passed to Spark as a literal; otherwise `rand` is invoked with no arguments.

# File 'lib/spark_connect/functions.rb', line 236

def rand(seed = nil) = seed.nil? ? Column.invoke("rand") : Column.invoke("rand", lit(seed))

#randn(seed = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 237

def randn(seed = nil) = seed.nil? ? Column.invoke("randn") : Column.invoke("randn", lit(seed))
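
The seed handling shared by `rand` and `randn` can be pictured with a small stand-in. `RandSketch`, `Call`, `Literal`, and `rand_call` below are hypothetical stubs that only record what would be sent; they are not the SparkConnect classes. The sketch shows that omitting `seed` yields a zero-argument call, while passing one adds a single literal argument.

```ruby
# Hypothetical stubs that record the call instead of building a real Column.
module RandSketch
  Call = Struct.new(:name, :args)
  Literal = Struct.new(:value)

  module_function

  def lit(value)
    Literal.new(value)
  end

  # Mirrors the branch in the rand wrapper: the seed is only sent when given.
  def rand_call(seed = nil)
    seed.nil? ? Call.new("rand", []) : Call.new("rand", [lit(seed)])
  end
end

p RandSketch.rand_call.args.length      # 0
p RandSketch.rand_call(42).args.length  # 1
```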

#rank ⇒ `Column`

The Spark SQL `rank` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#regexp_count(col, pattern) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 138

def regexp_count(col, pattern) = Column.invoke("regexp_count", _col(col), lit(pattern))

#regexp_extract(col, pattern, idx = 0) ⇒ `Column`

Returns the `idx`-th group of `pattern` matched in `col`.

Returns:

(Column) —

the `idx`-th group of `pattern` matched in `col`.

# File 'lib/spark_connect/functions.rb', line 131

def regexp_extract(col, pattern, idx = 0) = Column.invoke("regexp_extract", _col(col), lit(pattern), lit(idx))

#regexp_extract_all(col, pattern, idx = 1) ⇒ `Column`

Returns all matches of group `idx` of `pattern`.

Returns:

(Column) —

all matches of group `idx` of `pattern`.

# File 'lib/spark_connect/functions.rb', line 133

def regexp_extract_all(col, pattern, idx = 1) = Column.invoke("regexp_extract_all", _col(col), lit(pattern), lit(idx))

#regexp_like(col, pattern) ⇒ `Column`

Returns whether `col` matches `pattern`.

Returns:

(Column) —

whether `col` matches `pattern`.

# File 'lib/spark_connect/functions.rb', line 137

def regexp_like(col, pattern) = Column.invoke("regexp_like", _col(col), lit(pattern))

#regexp_replace(col, pattern, replacement) ⇒ `Column`

Returns `col` with `pattern` replaced by `replacement`.

Returns:

(Column) —

`col` with `pattern` replaced by `replacement`.

# File 'lib/spark_connect/functions.rb', line 135

def regexp_replace(col, pattern, replacement) = Column.invoke("regexp_replace", _col(col), lit(pattern), lit(replacement))

#regexp_substr(col, pattern) ⇒ `Object`



# File 'lib/spark_connect/functions.rb', line 139

def regexp_substr(col, pattern) = Column.invoke("regexp_substr", _col(col), lit(pattern))
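
The `regexp_*` wrappers above all follow the same argument split: the first argument goes through `_col` (so a String names a column), while the pattern, group index, and replacement go through `lit` (so they are literal values). The sketch below models that split with hypothetical stand-ins. `ArgSketch`, `ColumnRef`, `Literal`, `Call`, and `col_arg` are illustrative names, and the assumption that a non-String argument is passed through unchanged is not confirmed by this section.

```ruby
# Hypothetical model of how the regexp wrappers classify their arguments.
module ArgSketch
  ColumnRef = Struct.new(:name)
  Literal = Struct.new(:value)
  Call = Struct.new(:function, :args)

  module_function

  # Stand-in for the private _col helper: a String is read as a column name.
  # Assumption: anything else is passed through unchanged.
  def col_arg(value)
    value.is_a?(String) ? ColumnRef.new(value) : value
  end

  def lit(value)
    Literal.new(value)
  end

  # Same argument shape as the regexp_extract wrapper.
  def regexp_extract(col, pattern, idx = 0)
    Call.new("regexp_extract", [col_arg(col), lit(pattern), lit(idx)])
  end
end

call = ArgSketch.regexp_extract("email", "@(.+)$", 1)
p call.args.map(&:class)
```

Here `"email"` ends up as a column reference, while `"@(.+)$"` and `1` end up as literals, which is why a regex never needs to be wrapped in `F.lit` for these functions.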

#repeat(col, n) ⇒ `Column`

Returns the string repeated `n` times.

Returns:

(Column) —

the string repeated `n` times.

# File 'lib/spark_connect/functions.rb', line 125

def repeat(col, n) = Column.invoke("repeat", _col(col), lit(n))

#reverse(*cols) ⇒ `Column`

The Spark SQL `reverse` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#rint(*cols) ⇒ `Column`

The Spark SQL `rint` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#round(col, scale = 0) ⇒ `Column`

Returns `col` rounded to `scale` decimal places using HALF_UP rounding.

Returns:

(Column) —

`col` rounded to `scale` decimal places using HALF_UP rounding.

# File 'lib/spark_connect/functions.rb', line 80

def round(col, scale = 0) = Column.invoke("round", _col(col), lit(scale))
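
For readers unfamiliar with the rounding-mode names, HALF_UP resolves ties away from zero, whereas HALF_EVEN resolves them toward the even neighbor. Ruby's own `Float#round` accepts a `half:` option that makes the difference easy to see locally. This only illustrates the tie-breaking rule on plain Ruby numbers; the actual rounding of a Column happens in Spark.

```ruby
# Values that sit exactly halfway between two integers.
ties = [0.5, 1.5, 2.5, 3.5]

half_up   = ties.map { |x| x.round(half: :up) }    # ties move away from zero
half_even = ties.map { |x| x.round(half: :even) }  # ties move to the even neighbor

p half_up    # [1, 2, 3, 4]
p half_even  # [0, 2, 2, 4]
```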

#row_number ⇒ `Column`

The Spark SQL `row_number` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#rpad(col, len, pad) ⇒ `Column`

Returns the string right-padded with `pad` to length `len`.

Returns:

(Column) —

the string right-padded with `pad` to length `len`.

# File 'lib/spark_connect/functions.rb', line 123

def rpad(col, len, pad) = Column.invoke("rpad", _col(col), lit(len), lit(pad))

#rtrim(*cols) ⇒ `Column`

The Spark SQL `rtrim` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#schema_of_json(json, options = {}) ⇒ `Object`




# File 'lib/spark_connect/functions.rb', line 191

def schema_of_json(json, options = {})
  Column.invoke("schema_of_json", _lit_or_col(json), *options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] })
end
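
The `flat_map` in `schema_of_json` turns the options Hash into a flat run of alternating key and value arguments, each converted to a String first. The plain-Ruby sketch below shows that shape without the `lit(...)` wrapping; the option names are only illustrative and are not taken from this reference.

```ruby
# Hypothetical options hash; the keys are only illustrative.
options = { allowComments: true, "mode" => "PERMISSIVE" }

# Same flat_map shape as in schema_of_json, minus the lit(...) wrapping.
flattened = options.flat_map { |key, value| [key.to_s, value.to_s] }

p flattened  # ["allowComments", "true", "mode", "PERMISSIVE"]
```

Because of the `to_s` calls, Symbol keys and non-String values such as `true` are sent as their String forms.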

#sec(*cols) ⇒ `Column`

The Spark SQL `sec` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#second(*cols) ⇒ `Column`

The Spark SQL `second` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sequence(start, stop, step = nil) ⇒ `Object`




# File 'lib/spark_connect/functions.rb', line 217

def sequence(start, stop, step = nil)
  step.nil? ? Column.invoke("sequence", _col(start), _col(stop)) : Column.invoke("sequence", _col(start), _col(stop), _col(step))
end
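
To give a feel for the array that Spark's `sequence` is expected to produce for integer inputs, here is a local model in plain Ruby. It assumes the behavior described in the Spark SQL documentation: both bounds are inclusive, and an omitted step defaults to 1 when `start <= stop` and to -1 otherwise. `sequence_sketch` is a hypothetical helper, not part of SparkConnect.

```ruby
# Hypothetical local model of Spark SQL's sequence for integers.
# Assumes inclusive bounds and a default step of 1 or -1 by direction.
def sequence_sketch(start, stop, step = nil)
  step ||= start <= stop ? 1 : -1
  start.step(stop, step).to_a
end

p sequence_sketch(1, 5)      # [1, 2, 3, 4, 5]
p sequence_sketch(5, 1)      # [5, 4, 3, 2, 1]
p sequence_sketch(1, 10, 3)  # [1, 4, 7, 10]
```

In the wrapper above, all three arguments go through `_col`, so a String passed as `start`, `stop`, or `step` names a column rather than a literal value.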

#sha(*cols) ⇒ `Column`

The Spark SQL `sha` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#sha1(*cols) ⇒ `Column`

The Spark SQL `sha1` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sha2(col, num_bits) ⇒ `Column`

Returns the SHA-2 hash of `col` for the literal bit length `num_bits` (224, 256, 384, or 512).

Returns:

(Column): the SHA-2 hash with the given bit length.

# File 'lib/spark_connect/functions.rb', line 143

def sha2(col, num_bits) = Column.invoke("sha2", _col(col), lit(num_bits))
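
To give a feel for the values involved, the sketch below computes the same family of digests in plain Ruby with the standard `openssl` library. `model_sha2` is a hypothetical helper, not part of SparkConnect, and it simply rejects bit lengths other than the four listed above.

```ruby
require "openssl"

# Plain-Ruby model of what `sha2` computes (hypothetical helper, not part of
# SparkConnect): the hex-encoded SHA-2 digest for a given bit length.
def model_sha2(text, num_bits)
  unless [224, 256, 384, 512].include?(num_bits)
    raise ArgumentError, "unsupported bit length: #{num_bits}"
  end
  OpenSSL::Digest.new("SHA#{num_bits}").hexdigest(text)
end

digest = model_sha2("abc", 256)
# A hex digest has num_bits / 4 characters, so this prints 64.
puts digest.length
```

In the library, the equivalent call would presumably look like `F.sha2("payload", 256)`, where `payload` is a hypothetical column name and 256 is sent as a literal.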

#shiftleft(col, num_bits) ⇒ `Column`

Returns `col` shifted left by the literal bit count `num_bits`.

Returns:

(Column): `col` shifted left by `num_bits` bits.

# File 'lib/spark_connect/functions.rb', line 147

def shiftleft(col, num_bits) = Column.invoke("shiftleft", _col(col), lit(num_bits))

#shiftright(col, num_bits) ⇒ `Column`

Returns `col` shifted right by the literal bit count `num_bits`, preserving the sign.

# File 'lib/spark_connect/functions.rb', line 148

def shiftright(col, num_bits) = Column.invoke("shiftright", _col(col), lit(num_bits))

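#shiftrightunsigned(col, num_bits) ⇒ `Column`

Returns `col` shifted right by the literal bit count `num_bits`, filling the vacated high bits with zeros.

# File 'lib/spark_connect/functions.rb', line 149

def shiftrightunsigned(col, num_bits) = Column.invoke("shiftrightunsigned", _col(col), lit(num_bits))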

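The difference between the three shift functions is easiest to see on a negative number. The sketch below models them in plain Ruby for 32-bit integer values; the `model_*` helpers and `to_int32` are hypothetical names for this illustration, not part of SparkConnect, and the model assumes `0 <= num_bits <= 31`.

```ruby
# Reinterpret the low 32 bits of `value` as a signed 32-bit integer.
def to_int32(value)
  value &= 0xFFFF_FFFF
  value >= 0x8000_0000 ? value - 0x1_0000_0000 : value
end

# Left shift, wrapping around at 32 bits.
def model_shiftleft(value, num_bits)
  to_int32(value << num_bits)
end

# Signed right shift: the sign bit is copied into the vacated bits.
def model_shiftright(value, num_bits)
  to_int32(value) >> num_bits
end

# Unsigned right shift: the vacated bits are filled with zeros.
def model_shiftrightunsigned(value, num_bits)
  (value & 0xFFFF_FFFF) >> num_bits
end

model_shiftleft(3, 2)           # => 12
model_shiftright(-8, 1)         # => -4
model_shiftrightunsigned(-8, 1) # => 2147483644
```

Ruby integers have arbitrary precision, which is why the model masks to 32 bits explicitly.
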
#shuffle(*cols) ⇒ `Column`

The Spark SQL `shuffle` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#signum(*cols) ⇒ `Column`

The Spark SQL `signum` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sin(*cols) ⇒ `Column`

The Spark SQL `sin` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sinh(*cols) ⇒ `Column`

The Spark SQL `sinh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#size(*cols) ⇒ `Column`

The Spark SQL `size` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#skewness(*cols) ⇒ `Column`

The Spark SQL `skewness` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

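#slice(col, start, length) ⇒ `Column`

Returns `length` elements of the array in `col`, beginning at the 1-based index `start`. `start` and `length` are passed through `_lit_or_col`, so each may be given as a literal value or as a Column.

# File 'lib/spark_connect/functions.rb', line 215

def slice(col, start, length) = Column.invoke("slice", _col(col), _lit_or_col(start), _lit_or_col(length))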

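Because Spark array indices start at 1 rather than 0, a small plain-Ruby model may help when translating from Ruby's own `Array#[]`. `model_slice` is a hypothetical helper for this sketch, not part of SparkConnect; it rejects a `start` of 0 and a negative `length`, and returns an empty array when `start` falls outside the array.

```ruby
# Plain-Ruby model of `slice` (hypothetical helper, not part of SparkConnect).
# `start` is 1-based; a negative `start` counts back from the end of the array.
def model_slice(array, start, length)
  raise ArgumentError, "start must not be 0" if start.zero?
  raise ArgumentError, "length must not be negative" if length.negative?

  index = start.positive? ? start - 1 : start
  # Array#[] returns nil for an out-of-range index; this sketch maps it to [].
  array[index, length] || []
end

model_slice([1, 2, 3, 4], 2, 2)  # => [2, 3]
model_slice([1, 2, 3, 4], -2, 2) # => [3, 4]
```
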
#some(*cols) ⇒ `Column`

The Spark SQL `some` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sort_array(col, asc = true) ⇒ `Column`

Returns the array in `col` sorted in ascending order, or in descending order when `asc` is false. The `asc` flag is sent as a literal.

# File 'lib/spark_connect/functions.rb', line 232

def sort_array(col, asc = true) = Column.invoke("sort_array", _col(col), lit(asc))

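One detail worth knowing about Spark's `sort_array` is where null elements end up: first when ascending, last when descending. The plain-Ruby sketch below models that ordering; `model_sort_array` is a hypothetical helper, not part of SparkConnect, with `nil` standing in for null.

```ruby
# Plain-Ruby model of `sort_array` (hypothetical helper, not part of
# SparkConnect). nil plays the role of a null array element.
def model_sort_array(array, asc = true)
  nils, values = array.partition(&:nil?)
  # Nulls lead an ascending result and trail a descending one.
  asc ? nils + values.sort : values.sort.reverse + nils
end

model_sort_array([3, nil, 1, 2])        # => [nil, 1, 2, 3]
model_sort_array([3, nil, 1, 2], false) # => [3, 2, 1, nil]
```
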
#soundex(*cols) ⇒ `Column`

The Spark SQL `soundex` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#spark_partition_id ⇒ `Column`

The Spark SQL `spark_partition_id` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#split(col, pattern, limit = -1) ⇒ `Column`

Returns `col` split around matches of the literal regex `pattern`. `limit` (default -1) is also sent as a literal.

Returns:

(Column): `col` split by the literal regex `pattern`.

# File 'lib/spark_connect/functions.rb', line 127

def split(col, pattern, limit = -1) = Column.invoke("split", _col(col), lit(pattern), lit(limit))
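
Since `pattern` is a regular expression rather than a plain delimiter, characters such as `.` need escaping. The plain-Ruby sketch below models this along with the effect of `limit`. `model_split` is a hypothetical helper, not part of SparkConnect; it assumes any `limit <= 0` means "no limit, keep trailing empty strings", and Ruby's regex dialect differs from the Java dialect Spark uses, so it only illustrates simple patterns.

```ruby
# Plain-Ruby model of `split` (hypothetical helper, not part of SparkConnect).
def model_split(text, pattern, limit = -1)
  # A negative limit tells String#split to keep trailing empty strings.
  text.split(Regexp.new(pattern), limit.positive? ? limit : -1)
end

model_split("a.b.c", "\\.")       # => ["a", "b", "c"]
model_split("a1b2c3", "[0-9]", 2) # => ["a", "b2c3"]
model_split("a,b,,", ",")         # => ["a", "b", "", ""]

# An unescaped dot matches every character, so only empty strings remain.
model_split("a.b.c", ".").all?(&:empty?) # => true
```

In the library, the escaped form would presumably be written `F.split("host", "\\.")`, where `host` is a hypothetical column name.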

#sqrt(*cols) ⇒ `Column`

The Spark SQL `sqrt` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#stddev(*cols) ⇒ `Column`

The Spark SQL `stddev` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#stddev_pop(*cols) ⇒ `Column`

The Spark SQL `stddev_pop` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#stddev_samp(*cols) ⇒ `Column`

The Spark SQL `stddev_samp` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


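In Spark SQL, `stddev` is an alias for `stddev_samp`, while `stddev_pop` divides by the number of values instead of one less. The plain-Ruby sketch below shows the two formulas side by side; the `model_*` helpers are hypothetical names for this illustration, not part of SparkConnect, and they assume at least two non-null numeric values.

```ruby
# Population standard deviation: squared deviations divided by n.
def model_stddev_pop(values)
  mean = values.sum.to_f / values.size
  Math.sqrt(values.sum { |v| (v - mean)**2 } / values.size)
end

# Sample standard deviation: squared deviations divided by n - 1.
def model_stddev_samp(values)
  mean = values.sum.to_f / values.size
  Math.sqrt(values.sum { |v| (v - mean)**2 } / (values.size - 1))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
model_stddev_pop(data)  # => 2.0
model_stddev_samp(data) # slightly larger, about 2.138
```
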
#struct(*cols) ⇒ `Column`

Returns a struct built from the given columns.

Returns:

(Column): a struct built from the given columns.

# File 'lib/spark_connect/functions.rb', line 94

def struct(*cols) = Column.invoke("struct", *cols.map { |c| _col(c) })

#substring(col, pos, len) ⇒ `Column`

Returns the substring of `col` that starts at the 1-based position `pos` and is `len` characters long. Both `pos` and `len` are sent as literals.

Returns:

(Column): the substring of length `len` starting at 1-based `pos`.

# File 'lib/spark_connect/functions.rb', line 113

def substring(col, pos, len) = Column.invoke("substring", _col(col), lit(pos), lit(len))
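
The 1-based `pos` is the main thing to watch when coming from Ruby's 0-based string indexing. A minimal plain-Ruby model follows; `model_substring` is a hypothetical helper, not part of SparkConnect, and this sketch covers only `pos >= 1`.

```ruby
# Plain-Ruby model of `substring` (hypothetical helper, not part of
# SparkConnect). Converts the 1-based position to Ruby's 0-based index.
def model_substring(text, pos, len)
  raise ArgumentError, "this sketch only covers pos >= 1" if pos < 1

  # String#[] returns nil past the end of the string; map that to "".
  text[pos - 1, len] || ""
end

model_substring("Spark SQL", 1, 5) # => "Spark"
model_substring("Spark SQL", 7, 3) # => "SQL"
```

In the library, this would presumably be written `F.substring("title", 1, 5)`, where `title` is a hypothetical column name.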

#substring_index(col, delim, count) ⇒ `Column`

Returns the substring of `col` before the `count`-th occurrence of the literal delimiter `delim`.

Returns:

(Column): the substring before the `count`-th occurrence of `delim`.

# File 'lib/spark_connect/functions.rb', line 115

def substring_index(col, delim, count) = Column.invoke("substring_index", _col(col), lit(delim), lit(count))
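
In Spark SQL, a positive `count` keeps everything to the left of the `count`-th delimiter counted from the left, and a negative `count` keeps everything to the right of the delimiter counted from the right. The plain-Ruby sketch below models both directions; `model_substring_index` is a hypothetical helper, not part of SparkConnect.

```ruby
# Plain-Ruby model of `substring_index` (hypothetical helper, not part of
# SparkConnect). The delimiter is matched literally, not as a regex.
def model_substring_index(text, delim, count)
  return "" if count.zero? || delim.empty?

  parts = text.split(Regexp.new(Regexp.escape(delim)), -1)
  if count.positive?
    parts.first(count).join(delim)  # left of the count-th delimiter
  else
    parts.last(-count).join(delim)  # right of the count-th delimiter from the end
  end
end

model_substring_index("www.apache.org", ".", 2)  # => "www.apache"
model_substring_index("www.apache.org", ".", -1) # => "org"
```

When the delimiter occurs fewer than `count` times, the model returns the whole string.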

#sum(*cols) ⇒ `Column`

The Spark SQL `sum` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#sum_distinct(col) ⇒ `Column`

Returns the sum of the distinct values in `col`.

Returns:

(Column): the sum of distinct values.

# File 'lib/spark_connect/functions.rb', line 75

def sum_distinct(col) = Column.invoke("sum", _col(col), is_distinct: true)

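As the source shows, `sum_distinct` is the ordinary `sum` invoked with `is_distinct: true`, so each distinct value is counted once. A minimal plain-Ruby model of that idea follows; `model_sum_distinct` is a hypothetical helper, not part of SparkConnect, with `nil` standing in for null. Unlike this sketch, which returns 0 for an empty input, a SQL aggregate over no rows would normally yield null.

```ruby
# Plain-Ruby model of `sum_distinct` (hypothetical helper, not part of
# SparkConnect): drop nils, keep one copy of each value, then add them up.
def model_sum_distinct(values)
  values.compact.uniq.sum
end

model_sum_distinct([1, 1, 2, nil, 3, 3]) # => 6
```
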
#tan(*cols) ⇒ `Column`

The Spark SQL `tan` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#tanh(*cols) ⇒ `Column`

The Spark SQL `tanh` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#timestamp_micros(*cols) ⇒ `Column`

The Spark SQL `timestamp_micros` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822

#timestamp_millis(*cols) ⇒ `Column`

The Spark SQL `timestamp_millis` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#timestamp_seconds(*cols) ⇒ `Column`

The Spark SQL `timestamp_seconds` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#to_date(col, fmt = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 154

def to_date(col, fmt = nil) = fmt ? Column.invoke("to_date", _col(col), lit(fmt)) : Column.invoke("to_date", _col(col))

#to_json(col, options = {}) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 186

def to_json(col, options = {})
  args = [_col(col)] + options.flat_map { |k, v| [lit(k.to_s), lit(v.to_s)] }
  Column.invoke("to_json", *args)
end
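
To make the argument list concrete, here is a small sketch of what the `flat_map` above produces for a given options hash. The option names are illustrative only, and the `lit` wrapping is left out so the snippet runs in plain Ruby.

```ruby
# Illustrative options; keys may be symbols or strings, values any object.
options = { timestampFormat: "yyyy-MM-dd", ignoreNullFields: false }

# Each pair becomes two consecutive string entries: key, then value.
flat = options.flat_map { |k, v| [k.to_s, v.to_s] }

p flat
```

Every key and value is stringified, so `false` is sent as the string `"false"`.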

#to_timestamp(col, fmt = nil) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 155

def to_timestamp(col, fmt = nil) = fmt ? Column.invoke("to_timestamp", _col(col), lit(fmt)) : Column.invoke("to_timestamp", _col(col))

#to_utc_timestamp(col, tz) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 171

def to_utc_timestamp(col, tz) = Column.invoke("to_utc_timestamp", _col(col), lit(tz))

#transform(col) {|element| ... } ⇒ `Column`

Transform each element of an array. The block receives a Column (and optionally the index) and returns a Column.

Yield Parameters:

element (Column)

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 245

def transform(col, &block) = Column.invoke("transform", _col(col), _lambda(block))

#transform_keys(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 250

def transform_keys(col, &block) = Column.invoke("transform_keys", _col(col), _lambda(block))

#transform_values(col, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 251

def transform_values(col, &block) = Column.invoke("transform_values", _col(col), _lambda(block))

#translate(col, matching, replace) ⇒ `Column`

Replaces each character of `col` that appears in `matching` with the character at the same position in `replace`.

Returns:

(Column) —

`col` with each character found in `matching` replaced by the corresponding character of `replace`.




# File 'lib/spark_connect/functions.rb', line 129

def translate(col, matching, replace) = Column.invoke("translate", _col(col), lit(matching), lit(replace))
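
As a local analogy only, Ruby's own `String#tr` performs the same kind of position-by-position character mapping when `matching` and `replace` have equal length. The sketch below runs in plain Ruby and does not touch Spark. The two may behave differently when the lengths differ, and `tr` gives `^` and `-` special meaning.

```ruby
input = "AaBbCc"

# "a" -> "1", "b" -> "2", "c" -> "3"; other characters are left alone.
translated = input.tr("abc", "123")

puts translated # prints "A1B2C3"
```

The corresponding client call would be `F.translate("some_column", "abc", "123")`, where `some_column` is a hypothetical column name.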

#trim(*cols) ⇒ `Column`

The Spark SQL `trim` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#trunc(col, fmt) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 162

def trunc(col, fmt) = Column.invoke("trunc", _col(col), lit(fmt))

#typeof(*cols) ⇒ `Column`

The Spark SQL `typeof` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#ucase(*cols) ⇒ `Column`

The Spark SQL `ucase` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#udf ⇒ `Object`

UDFs require a server-side execution environment (Python/Scala) and are not supported by the pure-Ruby client.

Raises:

(NotImplementedError)




# File 'lib/spark_connect/functions.rb', line 273

def udf(*)
  raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
end
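
One practical consequence: Ruby's `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it. The sketch below copies the method above into a hypothetical `DemoFunctions` module so it runs without the gem.

```ruby
# Hypothetical stand-in that mirrors the udf method shown above.
module DemoFunctions
  def self.udf(*)
    raise NotImplementedError, "User-defined functions are not supported by the Ruby Spark Connect client"
  end
end

def try_udf
  DemoFunctions.udf { |x| x }
rescue NotImplementedError => e
  # The class must be named explicitly; `rescue => e` only catches StandardError.
  "unsupported: #{e.message}"
end

puts try_udf
```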

#unbase64(*cols) ⇒ `Column`

The Spark SQL `unbase64` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unhex(*cols) ⇒ `Column`

The Spark SQL `unhex` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_date(*cols) ⇒ `Column`

The Spark SQL `unix_date` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_micros(*cols) ⇒ `Column`

The Spark SQL `unix_micros` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_millis(*cols) ⇒ `Column`

The Spark SQL `unix_millis` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_seconds(*cols) ⇒ `Column`

The Spark SQL `unix_seconds` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss") ⇒ `Object`




# File 'lib/spark_connect/functions.rb', line 166

def unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
  col.nil? ? Column.invoke("unix_timestamp") : Column.invoke("unix_timestamp", _col(col), lit(fmt))
end
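
The branching above means the format is only sent when a column is given. The sketch below reproduces that branching against a hypothetical recording stub (`DemoColumn`, `demo_unix_timestamp`), leaving out the `_col`/`lit` wrapping so it runs without the gem.

```ruby
# Stand-in that records the call instead of building a Spark expression.
module DemoColumn
  def self.invoke(name, *args)
    { function: name, args: args }
  end
end

# Same branching as the method above, minus the _col/lit wrapping.
def demo_unix_timestamp(col = nil, fmt = "yyyy-MM-dd HH:mm:ss")
  col.nil? ? DemoColumn.invoke("unix_timestamp") : DemoColumn.invoke("unix_timestamp", col, fmt)
end

no_args     = demo_unix_timestamp
default_fmt = demo_unix_timestamp("created_at")
custom_fmt  = demo_unix_timestamp("created_at", "dd/MM/yyyy")
dropped_fmt = demo_unix_timestamp(nil, "dd/MM/yyyy")

p no_args
p default_fmt
p custom_fmt
p dropped_fmt
```

Note the last case: passing `nil` for the column discards the format, and the call goes out with no arguments.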

#upper(*cols) ⇒ `Column`

The Spark SQL `upper` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#uuid ⇒ `Column`

The Spark SQL `uuid` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#var_pop(*cols) ⇒ `Column`

The Spark SQL `var_pop` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#var_samp(*cols) ⇒ `Column`

The Spark SQL `var_samp` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#variance(*cols) ⇒ `Column`

The Spark SQL `variance` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#version ⇒ `Column`

The Spark SQL `version` function (takes no arguments).

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#weekday(*cols) ⇒ `Column`

The Spark SQL `weekday` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#weekofyear(*cols) ⇒ `Column`

The Spark SQL `weekofyear` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#when(condition, value) ⇒ `Column`

Starts a CASE WHEN expression. Chain Column#when to add further branches and Column#otherwise to set the default value.

Returns:

(Column)




# File 'lib/spark_connect/functions.rb', line 51

def when(condition, value)
  Column.invoke("when", condition, value)
end
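
To make the shape of the chain concrete, the sketch below mimics `when` / `otherwise` with a small stand-in class that renders the resulting CASE expression as SQL text. `DemoCase` and `to_sql` are invented for this example and are not part of SparkConnect; conditions are plain SQL strings here, whereas the real functions take Column expressions.

```ruby
# Hypothetical stand-in that mimics the when/otherwise chain.
class DemoCase
  def initialize
    @branches = []
    @default = nil
  end

  # Adds one WHEN ... THEN ... branch and returns self so calls can chain.
  def when(condition, value)
    @branches << [condition, value]
    self
  end

  # Sets the ELSE value. In Spark, rows matching no branch yield NULL
  # when no default is given.
  def otherwise(value)
    @default = value
    self
  end

  # Renders the collected branches in the order they were added.
  def to_sql
    parts = @branches.map { |cond, val| "WHEN #{cond} THEN '#{val}'" }
    parts << "ELSE '#{@default}'" unless @default.nil?
    "CASE #{parts.join(' ')} END"
  end
end

expr = DemoCase.new.when("x > 0", "pos").when("x = 0", "zero").otherwise("neg")
puts expr.to_sql
# prints CASE WHEN x > 0 THEN 'pos' WHEN x = 0 THEN 'zero' ELSE 'neg' END
```

Branches are evaluated in the order they are chained, so the most specific condition should come first.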

#xxhash64(*cols) ⇒ `Column`

The Spark SQL `xxhash64` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#year(*cols) ⇒ `Column`

The Spark SQL `year` function. String arguments are treated as column names.

Returns:

(Column)

# File 'lib/spark_connect/functions.rb', line 822


#zip_with(left, right, &block) ⇒ `Object`

# File 'lib/spark_connect/functions.rb', line 249

def zip_with(left, right, &block) = Column.invoke("zip_with", _col(left), _col(right), _lambda(block))
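
The semantics of Spark SQL's `zip_with` can be modelled in plain Ruby: the two arrays are combined position by position, and the shorter one is padded with nulls before the function is applied. The sketch below is only that model. `demo_zip_with` is a name invented here, it works on Ruby arrays rather than columns, and in the real API the block is presumably handed Column expressions (an assumption based on the `_lambda(block)` call above).

```ruby
# Plain-Ruby model of zip_with: combine two arrays element by element,
# padding the shorter one with nil (Spark pads with NULL).
def demo_zip_with(left, right)
  length = [left.length, right.length].max
  # Indexing past the end of a Ruby array returns nil, which gives the padding.
  Array.new(length) { |i| yield(left[i], right[i]) }
end

p demo_zip_with([1, 2, 3], [10, 20, 30]) { |a, b| a + b } # prints [11, 22, 33]
p demo_zip_with([1, 2], [10, 20, 30]) { |a, b| [a, b] }   # prints [[1, 10], [2, 20], [nil, 30]]
```

Because of the padding, a merge function applied to arrays of unequal length has to tolerate null inputs.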

.lambda_counter ⇒ `Object`

#_col(value) ⇒ `Object`

#_lambda(block) ⇒ `Object`

#_lit_or_col(value) ⇒ `Object`

#abs(*cols) ⇒ `Column`

#acos(*cols) ⇒ `Column`

#acosh(*cols) ⇒ `Column`

#add_months(col, months) ⇒ `Object`

#aggregate(col, initial, merge, finish = nil) ⇒ `Column`

#any_value(*cols) ⇒ `Column`

#approx_count_distinct(col, rsd = nil) ⇒ `Column`

#array(*cols) ⇒ `Column`

#array_append(col, value) ⇒ `Object`

#array_compact(*cols) ⇒ `Column`

#array_contains(col, value) ⇒ `Object`

#array_distinct(*cols) ⇒ `Column`

#array_except(*cols) ⇒ `Column`

#array_insert(col, pos, value) ⇒ `Object`

#array_intersect(*cols) ⇒ `Column`

#array_join(col, delimiter, null_replacement = nil) ⇒ `Object`

#array_max(*cols) ⇒ `Column`

#array_min(*cols) ⇒ `Column`

#array_position(col, value) ⇒ `Object`

#array_prepend(col, value) ⇒ `Object`

#array_remove(col, element) ⇒ `Object`

#array_repeat(col, count) ⇒ `Object`

#array_sort(*cols) ⇒ `Column`

#array_union(*cols) ⇒ `Column`

#arrays_overlap(*cols) ⇒ `Column`

#arrays_zip(*cols) ⇒ `Column`

#asc(col) ⇒ `Column`

#asc_nulls_first(col) ⇒ `Object`

#asc_nulls_last(col) ⇒ `Object`

#ascii(*cols) ⇒ `Column`

#asin(*cols) ⇒ `Column`

#asinh(*cols) ⇒ `Column`

#atan(*cols) ⇒ `Column`

#atan2(*cols) ⇒ `Column`

#atanh(*cols) ⇒ `Column`

#avg(*cols) ⇒ `Column`

#base64(*cols) ⇒ `Column`

#bin(*cols) ⇒ `Column`

#bit_and(*cols) ⇒ `Column`

#bit_count(*cols) ⇒ `Column`

#bit_length(*cols) ⇒ `Column`

#bit_or(*cols) ⇒ `Column`

#bit_xor(*cols) ⇒ `Column`

#bitwise_not(*cols) ⇒ `Column`

#bool_and(*cols) ⇒ `Column`

#bool_or(*cols) ⇒ `Column`

#broadcast(df) ⇒ `DataFrame`

#bround(col, scale = 0) ⇒ `Column`

#cardinality(*cols) ⇒ `Column`

#cbrt(*cols) ⇒ `Column`

#ceil(*cols) ⇒ `Column`

#ceiling(*cols) ⇒ `Column`

#char_length(*cols) ⇒ `Column`

#character_length(*cols) ⇒ `Column`

#coalesce(*cols) ⇒ `Column`

#col(name) ⇒ `Column` Also known as: column

#collect_list(*cols) ⇒ `Column`

#collect_set(*cols) ⇒ `Column`

#concat(*cols) ⇒ `Column`

#concat_ws(sep, *cols) ⇒ `Column`

#conv(col, from_base, to_base) ⇒ `Column`

#corr(*cols) ⇒ `Column`

#cos(*cols) ⇒ `Column`

#cosh(*cols) ⇒ `Column`

#cot(*cols) ⇒ `Column`

#count(col) ⇒ `Column`

#count_distinct(*cols) ⇒ `Column` Also known as: countDistinct

#count_if(*cols) ⇒ `Column`

#covar_pop(*cols) ⇒ `Column`

#covar_samp(*cols) ⇒ `Column`

#crc32(*cols) ⇒ `Column`

#create_map(*cols) ⇒ `Column`

#csc(*cols) ⇒ `Column`

#cume_dist ⇒ `Column`

#current_catalog ⇒ `Column`

#current_database ⇒ `Column`