Class: Rdkafka::Producer

Inherits:
Object
Includes:
Helpers::OAuth, Helpers::Time
Defined in:
lib/rdkafka/producer.rb,
lib/rdkafka/producer/delivery_handle.rb,
lib/rdkafka/producer/delivery_report.rb,
lib/rdkafka/producer/partitions_count_cache.rb

Overview

A producer for Kafka messages. To create a producer, set up a Config and call producer on it.

Defined Under Namespace

Classes: DeliveryHandle, DeliveryReport, PartitionsCountCache, TopicHandleCreationError

Constant Summary

@@partitions_count_cache =
PartitionsCountCache.new

Methods included from Helpers::OAuth

#oauthbearer_set_token, #oauthbearer_set_token_failure

Methods included from Helpers::Time

#monotonic_now, #monotonic_now_ms

Constructor Details

#initialize(native_kafka, partitioner) ⇒ Producer

Returns a new instance of Producer.

Parameters:

  • native_kafka (NativeKafka)
  • partitioner (String, nil)

    name of the partitioner we want to use or nil to use the “consistent_random” default



# File 'lib/rdkafka/producer.rb', line 55

def initialize(native_kafka, partitioner)
  @topics_refs_map = {}
  @topics_configs = {}
  @native_kafka = native_kafka
  @partitioner = partitioner || "consistent_random"

  # Makes sure, that native kafka gets closed before it gets GCed by Ruby
  ObjectSpace.define_finalizer(self, native_kafka.finalizer)
end

Instance Attribute Details

#delivery_callback ⇒ Proc?

Returns the current delivery callback, by default this is nil.

Returns:

  • (Proc, nil)


# File 'lib/rdkafka/producer.rb', line 43

def delivery_callback
  @delivery_callback
end

#delivery_callback_arity ⇒ Integer? (readonly)

Returns the number of arguments accepted by the callback, by default this is nil.

Returns:

  • (Integer, nil)


# File 'lib/rdkafka/producer.rb', line 49

def delivery_callback_arity
  @delivery_callback_arity
end

Class Method Details

.partitions_count_cache ⇒ Rdkafka::Producer::PartitionsCountCache

Note:

Not all users have statistics callbacks enabled, so we must not assume that this cache is always updated from the stats.

Global (process-wide) partitions cache. We use it to store the number of partitions per topic, taken either from the librdkafka statistics (if enabled) or from direct inline calls made every now and then. Since the partition count can only grow and should be the same for all consumers and producers, we can use a global cache as long as we ensure that updates only move up.



# File 'lib/rdkafka/producer.rb', line 20

def self.partitions_count_cache
  @@partitions_count_cache
end
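The "updates only move up" rule can be sketched as follows. This is a minimal illustration and not the actual PartitionsCountCache implementation; the class name GrowOnlyCountCache and its set/get methods are hypothetical.

```ruby
# Hypothetical stand-in for a process-wide partition count cache.
# It is NOT Rdkafka::Producer::PartitionsCountCache; it only illustrates
# the rule that a stored count may grow but never shrink.
class GrowOnlyCountCache
  def initialize
    @counts = {}
    @mutex = Mutex.new
  end

  # Stores the count only when it is higher than the one already known.
  def set(topic, count)
    @mutex.synchronize do
      current = @counts[topic]
      @counts[topic] = count if current.nil? || count > current
      @counts[topic]
    end
  end

  # Returns the cached count, or nil when the topic is unknown.
  def get(topic)
    @mutex.synchronize { @counts[topic] }
  end
end

cache = GrowOnlyCountCache.new
cache.set("events", 3)
cache.set("events", 6) # growth is accepted
cache.set("events", 4) # a lower (stale) value is ignored
puts cache.get("events")
```

Guarding every write with a mutex is one simple way to keep such a cache safe when several producers and consumers share it.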

.partitions_count_cache=(partitions_count_cache) ⇒ Object

Parameters:

  • partitions_count_cache (PartitionsCountCache)


# File 'lib/rdkafka/producer.rb', line 25

def self.partitions_count_cache=(partitions_count_cache)
  @@partitions_count_cache = partitions_count_cache
end

Instance Method Details

#arity(callback) ⇒ Integer

Figures out the arity of a given block/method

Parameters:

  • callback (#call, Proc)

Returns:

  • (Integer)

    arity of the provided block/method



# File 'lib/rdkafka/producer.rb', line 505

def arity(callback)
  return callback.arity if callback.respond_to?(:arity)

  callback.method(:call).arity
end
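As a quick illustration of what this helper returns for different kinds of callbacks, here is a standalone sketch. The method name callback_arity and the ReportLogger class are hypothetical and exist only for this example.

```ruby
# Same logic as the #arity helper, written as a standalone method.
def callback_arity(callback)
  return callback.arity if callback.respond_to?(:arity)

  callback.method(:call).arity
end

# A plain object that only responds to #call (it has no #arity of its own).
class ReportLogger
  def call(delivery_report, delivery_handle); end
end

no_args   = callback_arity(-> {})                     # lambda without arguments
one_arg   = callback_arity(->(report) {})             # lambda with one argument
two_args  = callback_arity(proc { |report, handle| }) # proc with two arguments
callable  = callback_arity(ReportLogger.new)          # arity taken from #call
splat     = callback_arity(proc { |*args| })          # splat gives a negative arity

p [no_args, one_arg, two_args, callable, splat]
```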

#call_delivery_callback(delivery_report, delivery_handle) ⇒ Object

Calls (if registered) the delivery callback

Parameters:

  • delivery_report (DeliveryReport)
  • delivery_handle (DeliveryHandle)


# File 'lib/rdkafka/producer.rb', line 488

def call_delivery_callback(delivery_report, delivery_handle)
  return unless @delivery_callback

  case @delivery_callback_arity
  when 0
    @delivery_callback.call
  when 1
    @delivery_callback.call(delivery_report)
  else
    @delivery_callback.call(delivery_report, delivery_handle)
  end
end
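The arity-based dispatch can be tried out in isolation with the sketch below. The dispatch method is a hypothetical standalone version, and plain symbols stand in for the real delivery report and delivery handle objects.

```ruby
# Standalone version of the arity-based dispatch. The report and handle
# are plain symbols here, not real DeliveryReport / DeliveryHandle objects.
def dispatch(callback, report, handle)
  arity = callback.respond_to?(:arity) ? callback.arity : callback.method(:call).arity

  case arity
  when 0 then callback.call
  when 1 then callback.call(report)
  else callback.call(report, handle)
  end
end

received = []
dispatch(-> { received << [] }, :report, :handle)
dispatch(->(report) { received << [report] }, :report, :handle)
dispatch(->(report, handle) { received << [report, handle] }, :report, :handle)
# A splat callback has arity -1, so it takes the `else` branch as well.
dispatch(proc { |*args| received << args }, :report, :handle)

p received
```

Because every arity other than 0 and 1 falls into the `else` branch, callbacks with optional or splat arguments receive both the report and the handle.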

#close ⇒ Object

Close this producer and wait for the internal poll queue to empty.



# File 'lib/rdkafka/producer.rb', line 157

def close
  return if closed?
  ObjectSpace.undefine_finalizer(self)

  @native_kafka.close do
    # We need to remove the topics references objects before we destroy the producer,
    # otherwise they would leak out
    @topics_refs_map.each_value do |refs|
      refs.each_value do |ref|
        Rdkafka::Bindings.rd_kafka_topic_destroy(ref)
      end
    end
  end

  @topics_refs_map.clear
end
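The combination of a finalizer registered in the constructor and removed again in #close can be sketched with a plain Ruby object. NativeResource and its methods are hypothetical; the counter stands in for releasing native handles.

```ruby
# Hypothetical resource wrapper illustrating the finalizer pattern:
# register a finalizer on creation, remove it on an explicit close,
# and make close idempotent.
class NativeResource
  attr_reader :release_count

  # Built in a class method so the proc does not capture the instance,
  # which would prevent it from ever being garbage collected.
  def self.finalizer_for(id)
    proc { warn "resource #{id} was collected without an explicit close" }
  end

  def initialize
    @closed = false
    @release_count = 0
    ObjectSpace.define_finalizer(self, self.class.finalizer_for(object_id))
  end

  def closed?
    @closed
  end

  def close
    return if closed?

    ObjectSpace.undefine_finalizer(self)
    @release_count += 1 # stands in for releasing native handles
    @closed = true
  end
end

resource = NativeResource.new
resource.close
resource.close # second call is a no-op
puts resource.release_count
```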

#closed? ⇒ Boolean

Whether this producer has closed

Returns:

  • (Boolean)


# File 'lib/rdkafka/producer.rb', line 175

def closed?
  @native_kafka.closed?
end

#enable_background_queue_io_events(fd, payload = "\x01") ⇒ nil

Enable IO event notifications for background events

Parameters:

  • fd (Integer)

    file descriptor to signal (from IO.pipe or eventfd)

  • payload (String) (defaults to: "\x01")

    data to write to fd (default: "\x01")

Returns:

  • (nil)

Raises:



# File 'lib/rdkafka/producer.rb', line 141

def enable_background_queue_io_events(fd, payload = "\x01")
  @native_kafka.enable_background_queue_io_events(fd, payload)
end

#enable_queue_io_events(fd, payload = "\x01") ⇒ nil

Enable IO event notifications for fiber scheduler integration. When delivery confirmations arrive, librdkafka will write to your FD.

Parameters:

  • fd (Integer)

    file descriptor to signal (from IO.pipe or eventfd)

  • payload (String) (defaults to: "\x01")

    data to write to fd (default: "\x01")

Returns:

  • (nil)

Raises:



# File 'lib/rdkafka/producer.rb', line 132

def enable_queue_io_events(fd, payload = "\x01")
  @native_kafka.enable_main_queue_io_events(fd, payload)
end
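The reading side of such a notification can be sketched with IO.pipe alone. In this example the write is simulated by the script itself so that it runs without a Kafka client; in real use the integer in fd would be passed to #enable_queue_io_events and librdkafka would perform the write.

```ruby
# Reader side of an IO-event notification, using only the standard library.
reader, writer = IO.pipe

fd = writer.fileno # an Integer, the kind of value the `fd` parameter expects
payload = "\x01"

# Simulated notification: stands in for librdkafka writing the payload.
writer.write(payload)
writer.flush

# Wait up to one second for the descriptor to become readable.
ready, = IO.select([reader], nil, nil, 1)
signal = ready ? reader.read_nonblock(16) : nil

puts "signalled" if signal

reader.close
writer.close
```

Once the descriptor becomes readable, the application would typically drain the pending events, for example with #events_poll_nb_each.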

#events_poll_nb_each {|count| ... } ⇒ nil

Note:

This method holds the inner lock until the queue is empty or `:stop` is returned. Other producer operations (produce, close, etc.) will wait until this method returns.

Note:

This method is thread-safe as it uses @native_kafka.with_inner synchronization

Polls for events in a non-blocking loop, yielding the count after each iteration.

This method processes delivery callbacks in a single GVL/mutex session, which is more efficient than repeated individual polls. It uses non-blocking polls internally (no GVL release between polls).

Yields the count of events processed after each poll iteration, allowing the caller to implement timeout or other termination logic by returning `:stop`.

Examples:

Drain all pending callbacks

producer.events_poll_nb_each { |_count| }

With timeout control

deadline = monotonic_now_ms + timeout_ms
producer.events_poll_nb_each do |_count|
  :stop if monotonic_now_ms >= deadline
end

Yields:

  • (count)

    Called after each poll iteration

Yield Parameters:

  • count (Integer)

    Number of events processed in this iteration

Yield Returns:

  • (Symbol, Object)

    Return `:stop` to break the loop, any other value continues

Returns:

  • (nil)

Raises:



# File 'lib/rdkafka/producer.rb', line 288

def events_poll_nb_each
  closed_producer_check(__method__)

  @native_kafka.with_inner do |inner|
    loop do
      count = Rdkafka::Bindings.rd_kafka_poll_nb(inner, 0)
      break if count.zero?
      break if yield(count) == :stop
    end
  end
end
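The two exit conditions of the loop (a poll that reports zero events, or a block that returns `:stop`) can be exercised without a Kafka client. In this sketch poll_each is a hypothetical stand-in, and the array plays the role of successive non-blocking poll results.

```ruby
# Stand-in for the polling loop: `pending` plays the role of successive
# non-blocking poll results, so no Kafka client is required.
def poll_each(pending)
  loop do
    count = pending.shift || 0
    break if count.zero?
    break if yield(count) == :stop
  end

  nil
end

# Drain everything: the loop ends when a poll reports zero events.
drained = []
poll_each([3, 2, 0]) { |count| drained << count }

# Stop early: returning :stop ends the loop after the first iteration.
stopped = []
poll_each([3, 2, 1, 0]) do |count|
  stopped << count
  :stop
end

p drained
p stopped
```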

#flush(timeout_ms = Defaults::PRODUCER_FLUSH_TIMEOUT_MS) ⇒ Boolean

Note:

We raise an exception for other errors because based on the librdkafka docs, there should be no other errors.

Note:

For `timed_out` we do not raise an error, to stay backwards compatible.

Wait until all outstanding producer requests are completed, with the given timeout in milliseconds. Call this before closing a producer to ensure delivery of all messages.

Parameters:

  • timeout_ms (Integer) (defaults to: Defaults::PRODUCER_FLUSH_TIMEOUT_MS)

    how long should we wait for flush of all messages

Returns:

  • (Boolean)

    true if no more data and all was flushed, false in case there are still outgoing messages after the timeout



# File 'lib/rdkafka/producer.rb', line 190

def flush(timeout_ms = Defaults::PRODUCER_FLUSH_TIMEOUT_MS)
  closed_producer_check(__method__)

  code = nil

  @native_kafka.with_inner do |inner|
    code = Rdkafka::Bindings.rd_kafka_flush(inner, timeout_ms)
  end

  # Early skip not to build the error message
  return true if code.zero?

  error = Rdkafka::RdkafkaError.new(code)

  return false if error.code == :timed_out

  raise(error)
end
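Since a timeout is reported as `false` rather than as an exception, callers can retry in a simple loop. The helper flush_with_retries and the FakeProducer class below are hypothetical; the fake exists only so the sketch runs without a broker.

```ruby
# Hypothetical helper: calls #flush up to `attempts` times and reports
# whether the queue was fully flushed. Works with any object whose #flush
# returns true or false the way Producer#flush does.
def flush_with_retries(producer, attempts: 3, timeout_ms: 1_000)
  attempts.times do
    return true if producer.flush(timeout_ms)
  end

  false
end

# Fake producer used only to make the sketch runnable: it reports a
# timeout (false) twice and then succeeds.
class FakeProducer
  attr_reader :calls

  def initialize(results)
    @results = results
    @calls = 0
  end

  def flush(_timeout_ms)
    @calls += 1
    @results.shift
  end
end

fake = FakeProducer.new([false, false, true])
flushed = flush_with_retries(fake)
puts flushed
puts fake.calls
```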

#name ⇒ String

Returns producer name.

Returns:

  • (String)

    producer name



# File 'lib/rdkafka/producer.rb', line 119

def name
  @name ||= @native_kafka.with_inner do |inner|
    ::Rdkafka::Bindings.rd_kafka_name(inner)
  end
end

#partition_count(topic) ⇒ Integer

Note:

If `allow.auto.create.topics` is set to true in the broker, the topic will be auto-created after returning `-1`.

Note:

We cache the partition count for a given topic for a given time. If statistics are enabled for any producer or consumer, they take precedence over per-instance fetching.

This prevents us from querying for the count with each message when someone uses `partition_key`. Instead we query at most once every 30 seconds if we have a valid partition count, or every 5 seconds if we were not able to obtain the number of partitions.

Partition count for a given topic.

Parameters:

  • topic (String)

    The topic name.

Returns:

  • (Integer)

    partition count for a given topic or `-1` if it could not be obtained.



# File 'lib/rdkafka/producer.rb', line 314

def partition_count(topic)
  closed_producer_check(__method__)

  self.class.partitions_count_cache.get(topic) do
    topic_metadata = nil

    @native_kafka.with_inner do |inner|
      topic_metadata = ::Rdkafka::Metadata.new(inner, topic).topics&.first
    end

    topic_metadata ? topic_metadata[:partition_count] : Rdkafka::Bindings::RD_KAFKA_PARTITION_UA
  end
rescue Rdkafka::RdkafkaError => e
  # If the topic does not exist, it will be created or if not allowed another error will be
  # raised. We here return RD_KAFKA_PARTITION_UA so this can happen without early error
  # happening on metadata discovery.
  return Rdkafka::Bindings::RD_KAFKA_PARTITION_UA if e.code == :unknown_topic_or_part

  raise(e)
end

#produce(topic:, payload: nil, key: nil, partition: nil, partition_key: nil, timestamp: nil, headers: nil, label: nil, topic_config: EMPTY_HASH, partitioner: @partitioner) ⇒ DeliveryHandle

Produces a message to a Kafka topic. The message is added to rdkafka’s queue; call wait on the returned delivery handle to make sure it is delivered.

When no partition is specified the underlying Kafka library picks a partition based on the key. If no key is specified, a random partition will be used. When a timestamp is provided this is used instead of the auto-generated timestamp.

Parameters:

  • topic (String)

    The topic to produce to

  • payload (String, nil) (defaults to: nil)
  • key (String, nil) (defaults to: nil)
  • partition (Integer, nil) (defaults to: nil)

    Optional partition to produce to

  • partition_key (String, nil) (defaults to: nil)

    Optional partition key based on which partition assignment can happen

  • timestamp (Time, Integer, nil) (defaults to: nil)

    Optional timestamp of this message. Integer timestamp is in milliseconds since Jan 1 1970.

  • headers (Hash{String => String, Array<String>}) (defaults to: nil)

    Optional message headers. Values can be either a single string or an array of strings to support duplicate headers per KIP-82

  • label (Object, nil) (defaults to: nil)

    a label that can be assigned when producing a message that will be part of the delivery handle and the delivery report

  • topic_config (Hash) (defaults to: EMPTY_HASH)

    topic config for the given message dispatch. Allows sending messages to topics with different configuration

  • partitioner (String) (defaults to: @partitioner)

    name of the partitioner to use

Returns:

  • (DeliveryHandle)

    Delivery handle that can be used to wait for the result of producing this message

Raises:

  • (RdkafkaError)

    When adding the message to rdkafka’s queue failed
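The rule for the timestamp parameter (nil, an Integer in milliseconds, or a Time) can be sketched as a small standalone conversion. The method name to_kafka_timestamp is hypothetical.

```ruby
# Converts the accepted timestamp inputs to milliseconds since the epoch:
# nil becomes 0 (let Kafka assign one), an Integer is used as given,
# and a Time is converted with millisecond precision.
def to_kafka_timestamp(timestamp)
  case timestamp
  when nil then 0
  when Integer then timestamp
  when Time then (timestamp.to_i * 1000) + (timestamp.usec / 1000)
  else raise TypeError, "Timestamp has to be nil, an Integer or a Time"
  end
end

from_nil     = to_kafka_timestamp(nil)
from_integer = to_kafka_timestamp(1_700_000_000_123)
from_time    = to_kafka_timestamp(Time.at(1_700_000_000, 123_456, :usec))

p [from_nil, from_integer, from_time]
```

Microseconds beyond millisecond precision are truncated by the integer division, not rounded.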



# File 'lib/rdkafka/producer.rb', line 354

def produce(
  topic:,
  payload: nil,
  key: nil,
  partition: nil,
  partition_key: nil,
  timestamp: nil,
  headers: nil,
  label: nil,
  topic_config: EMPTY_HASH,
  partitioner: @partitioner
)
  closed_producer_check(__method__)

  # Start by checking and converting the input

  # Get payload length
  payload_size = if payload.nil?
    0
  else
    payload.bytesize
  end

  # Get key length
  key_size = if key.nil?
    0
  else
    key.bytesize
  end

  topic_config_hash = topic_config.hash

  # Checks if we have the rdkafka topic reference object ready. It saves us on object
  # allocation and allows to use custom config on demand.
  set_topic_config(topic, topic_config, topic_config_hash) unless @topics_refs_map.dig(topic, topic_config_hash)
  topic_ref = @topics_refs_map.dig(topic, topic_config_hash)

  if partition_key
    partition_count = partition_count(topic)

    # Check if there are no overrides for the partitioner and use the default one only when
    # no per-topic is present.
    selected_partitioner = @topics_configs.dig(topic, topic_config_hash, :partitioner) || partitioner

    # If the topic is not present, set to -1
    if partition_count.positive?
      partition = Rdkafka::Bindings.partitioner(
        topic_ref,
        partition_key,
        partition_count,
        selected_partitioner
      )
    end
  end

  # If partition is nil, use RD_KAFKA_PARTITION_UA to let librdkafka set the partition randomly or
  # based on the key when present.
  partition ||= Rdkafka::Bindings::RD_KAFKA_PARTITION_UA

  # If timestamp is nil use 0 and let Kafka set one. If an integer or time
  # use it.
  raw_timestamp = if timestamp.nil?
    0
  elsif timestamp.is_a?(Integer)
    timestamp
  elsif timestamp.is_a?(Time)
    (timestamp.to_i * 1000) + (timestamp.usec / 1000)
  else
    raise TypeError.new("Timestamp has to be nil, an Integer or a Time")
  end

  delivery_handle = DeliveryHandle.new
  delivery_handle.label = label
  delivery_handle.topic = topic
  delivery_handle[:pending] = true
  delivery_handle[:response] = Rdkafka::Bindings::RD_KAFKA_PARTITION_UA
  delivery_handle[:partition] = Rdkafka::Bindings::RD_KAFKA_PARTITION_UA
  delivery_handle[:offset] = Rdkafka::Bindings::RD_KAFKA_PARTITION_UA
  DeliveryHandle.register(delivery_handle)

  args = [
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_RKT, :pointer, topic_ref,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_MSGFLAGS, :int, Rdkafka::Bindings::RD_KAFKA_MSG_F_COPY,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_VALUE, :buffer_in, payload, :size_t, payload_size,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_KEY, :buffer_in, key, :size_t, key_size,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_PARTITION, :int32, partition,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_TIMESTAMP, :int64, raw_timestamp,
    :int, Rdkafka::Bindings::RD_KAFKA_VTYPE_OPAQUE, :pointer, delivery_handle
  ]

  headers&.each do |key0, value0|
    key = key0.to_s
    if value0.is_a?(Array)
      # Handle array of values per KIP-82
      value0.each do |value|
        value = value.to_s
        args << :int << Rdkafka::Bindings::RD_KAFKA_VTYPE_HEADER
        args << :string << key
        args << :pointer << value
        args << :size_t << value.bytesize
      end
    else
      # Handle single value
      value = value0.to_s
      args << :int << Rdkafka::Bindings::RD_KAFKA_VTYPE_HEADER
      args << :string << key
      args << :pointer << value
      args << :size_t << value.bytesize
    end
  end

  args << :int << Rdkafka::Bindings::RD_KAFKA_VTYPE_END

  # Produce the message
  response = @native_kafka.with_inner do |inner|
    Rdkafka::Bindings.rd_kafka_producev(
      inner,
      *args
    )
  end

  # Raise error if the produce call was not successful
  if response != Rdkafka::Bindings::RD_KAFKA_RESP_ERR_NO_ERROR
    DeliveryHandle.remove(delivery_handle.to_ptr.address)
    raise RdkafkaError.new(response)
  end

  delivery_handle
end
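The header handling (one entry per value, with arrays expanded into repeated keys per KIP-82) can be sketched as a pure function. The name flatten_headers is hypothetical, and the result is a list of plain string pairs, not the FFI argument list built by #produce.

```ruby
# Expands a headers hash into [key, value] string pairs, one pair per
# header entry. Array values produce several pairs with the same key.
def flatten_headers(headers)
  return [] if headers.nil?

  headers.flat_map do |key, value|
    values = value.is_a?(Array) ? value : [value]
    values.map { |single| [key.to_s, single.to_s] }
  end
end

pairs = flatten_headers(
  {
    "trace-id" => "abc123",
    :retries => 2,
    "tag" => %w[blue green]
  }
)

p pairs
```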

#purge ⇒ Object

Purges the outgoing queue and releases all resources.

Useful when closing the producer with outgoing messages to unstable clusters, or when waiting cannot go on any longer for any other reason. This purges both the queue and all in-flight requests, and updates the delivery handles' statuses so they can be materialized into `purge_queue` errors.



# File 'lib/rdkafka/producer.rb', line 215

def purge
  closed_producer_check(__method__)

  code = nil

  @native_kafka.with_inner do |inner|
    code = Bindings.rd_kafka_purge(
      inner,
      Bindings::RD_KAFKA_PURGE_F_QUEUE | Bindings::RD_KAFKA_PURGE_F_INFLIGHT
    )
  end

  code.zero? || raise(Rdkafka::RdkafkaError.new(code))

  # Wait for the purge to affect everything
  sleep(Defaults::PRODUCER_PURGE_SLEEP_INTERVAL_MS / 1000.0) until flush(Defaults::PRODUCER_PURGE_FLUSH_TIMEOUT_MS)

  true
end

#queue_size ⇒ Integer Also known as: queue_length

Note:

This method is thread-safe as it uses the @native_kafka.with_inner synchronization

Returns the number of messages and requests waiting to be sent to the broker as well as delivery reports queued for the application.

This provides visibility into the producer’s internal queue depth, useful for:

  • Monitoring producer backpressure

  • Implementing custom flow control

  • Debugging message delivery issues

  • Graceful shutdown logic (wait until queue is empty)

Examples:

producer.queue_size #=> 42

Returns:

  • (Integer)

    the number of messages in the queue

Raises:



# File 'lib/rdkafka/producer.rb', line 251

def queue_size
  closed_producer_check(__method__)

  @native_kafka.with_inner do |inner|
    Rdkafka::Bindings.rd_kafka_outq_len(inner)
  end
end
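For the graceful-shutdown use case, one possible approach is to poll the queue size until it reaches zero or a deadline passes. The helper wait_until_drained and the DrainingProducer class are hypothetical; the fake only imitates a queue that shrinks over time.

```ruby
# Hypothetical helper: polls #queue_size until it reaches zero or the
# deadline passes. Returns true when the queue drained in time.
def wait_until_drained(producer, timeout_s: 5.0, interval_s: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout_s

  until producer.queue_size.zero?
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep(interval_s)
  end

  true
end

# Fake producer whose queue shrinks by one on every check.
class DrainingProducer
  def initialize(size)
    @size = size
  end

  def queue_size
    current = @size
    @size -= 1 if @size.positive?
    current
  end
end

drained = wait_until_drained(DrainingProducer.new(3))
puts drained
```

A monotonic clock is used for the deadline so that wall-clock adjustments do not shorten or extend the wait.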

#set_topic_config(topic, config, config_hash) ⇒ Object

Note:

It is not allowed to re-set the same topic config twice because of the underlying librdkafka caching

Sets alternative set of configuration details that can be set per topic

Parameters:

  • topic (String)

    The topic name

  • config (Hash)

    config we want to use on a per-topic basis

  • config_hash (Integer)

    hash of the config. We expect it here instead of computing it, because it is already computed during the retrieval attempt in the `#produce` flow.



# File 'lib/rdkafka/producer.rb', line 73

def set_topic_config(topic, config, config_hash)
  # Ensure lock on topic reference just in case
  @native_kafka.with_inner do |inner|
    @topics_refs_map[topic] ||= {}
    @topics_configs[topic] ||= {}

    return if @topics_configs[topic].key?(config_hash)

    # If config is empty, we create an empty reference that will be used with defaults
    rd_topic_config = if config.empty?
      nil
    else
      Rdkafka::Bindings.rd_kafka_topic_conf_new.tap do |topic_config|
        config.each do |key, value|
          error_buffer = FFI::MemoryPointer.new(:char, 256)
          result = Rdkafka::Bindings.rd_kafka_topic_conf_set(
            topic_config,
            key.to_s,
            value.to_s,
            error_buffer,
            256
          )

          unless result == :config_ok
            raise Config::ConfigError.new(error_buffer.read_string)
          end
        end
      end
    end

    topic_handle = Bindings.rd_kafka_topic_new(inner, topic, rd_topic_config)

    raise TopicHandleCreationError.new("Error creating topic handle for topic #{topic}") if topic_handle.null?

    @topics_configs[topic][config_hash] = config
    @topics_refs_map[topic][config_hash] = topic_handle
  end
end
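The two-level lookup (topic name first, then the hash of the per-topic config) can be sketched with plain hashes. TopicHandleCache is hypothetical, and strings stand in for the native topic handles.

```ruby
# Hypothetical cache keyed first by topic name and then by the hash of the
# per-topic config. Strings stand in for native topic handles.
class TopicHandleCache
  attr_reader :created

  def initialize
    @refs = {}
    @created = 0
  end

  # Returns the existing handle for this topic/config pair or creates one.
  def fetch(topic, config = {})
    config_hash = config.hash
    per_topic = (@refs[topic] ||= {})

    per_topic[config_hash] ||= begin
      @created += 1
      "handle-#{@created} for #{topic}"
    end
  end
end

handles = TopicHandleCache.new
default_a = handles.fetch("events")
default_b = handles.fetch("events")                      # same config: reused
custom_a  = handles.fetch("events", { "acks" => "all" }) # new config: new handle
custom_b  = handles.fetch("events", { "acks" => "all" }) # equal config: reused

puts handles.created
```

Hash#hash depends on content, so two separately built but equal config hashes map to the same cached handle.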

#start ⇒ Object

Note:

Only needs to be called when the producer was created with automatic start disabled.

Starts the native Kafka polling thread and kicks off the init polling



# File 'lib/rdkafka/producer.rb', line 114

def start
  @native_kafka.start
end