Class: HTM::TimeframeExtractor

Inherits:
Object
Defined in:
lib/htm/timeframe_extractor.rb

Overview

Timeframe Extractor - Extracts temporal expressions from queries

This service parses natural language time expressions from recall queries and returns both the timeframe and the cleaned query text.

Supports:

  • Standard time expressions via Chronic gem (“yesterday”, “last week”, etc.)

  • “few” keyword mapped to FEW constant (e.g., “few days ago” → “3 days ago”)

  • “recent/recently” without units defaults to FEW days

Examples:

Basic usage

result = TimeframeExtractor.extract("what did we discuss last week about PostgreSQL")
result[:query]     # => "what did we discuss about PostgreSQL"
result[:timeframe] # => #<Range: 2025-11-21..2025-11-28>

With “few” keyword

result = TimeframeExtractor.extract("show me notes from a few days ago")
result[:timeframe] # => Time object for 3 days ago

With “recently”

result = TimeframeExtractor.extract("what did we recently discuss")
result[:timeframe] # => Range from 3 days ago to now

Defined Under Namespace

Classes: Result

Constant Summary

FEW =

The numeric value substituted for “few”, and for “recent”/“recently” when no unit is given

3
DEFAULT_RECENT_UNIT =

Default unit for “recently” when no time unit is specified

:days
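
As a rough illustration of how these two constants could combine, the sketch below rewrites “few” into a number and phrases a bare “recently” as an explicit window. Both helper names (normalize_few, recent_window_phrase) are hypothetical and are not part of the documented API; the actual substitution inside the class may differ.

```ruby
FEW = 3
DEFAULT_RECENT_UNIT = :days

# Hypothetical helper: replace "few" / "a few" with the FEW constant.
def normalize_few(expression)
  expression.gsub(/\b(?:a\s+)?few\b/i, FEW.to_s)
end

# Hypothetical helper: phrase a bare "recent"/"recently" as an explicit window.
def recent_window_phrase
  "last #{FEW} #{DEFAULT_RECENT_UNIT}"
end

puts normalize_few("a few days ago")        # 3 days ago
puts normalize_few("in the last few weeks") # in the last 3 weeks
puts recent_window_phrase                   # last 3 days
```

The normalized string is the kind of input a date parser such as Chronic can usually handle, which is presumably why the mapping exists.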
TIME_UNITS =

Time unit patterns for matching

%w[
  seconds? minutes? hours? days? weeks? months? years?
].join('|').freeze
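
The joined string is a regex alternation fragment meant for interpolation. A minimal sketch of that usage follows; the pattern mirrors the “X units ago” entry in TEMPORAL_PATTERNS but adds capture groups for illustration, and the sample strings are made up.

```ruby
# Rebuild the constant exactly as documented above.
TIME_UNITS = %w[
  seconds? minutes? hours? days? weeks? months? years?
].join('|').freeze

# Interpolate inside a group so the alternation stays scoped to the unit slot.
units_ago = /\b(\d+)\s+(#{TIME_UNITS})\s+ago\b/i

singular = units_ago.match("notes from 1 day ago")
plural   = units_ago.match("notes from 12 Weeks ago")

puts TIME_UNITS   # seconds?|minutes?|hours?|days?|weeks?|months?|years?
puts singular[2]  # day
puts plural[1]    # 12
puts plural[2]    # Weeks
```

Each entry carries an optional trailing “s”, so one alternation covers singular and plural forms.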
WORD_NUMBERS =

Word-to-number mapping for written numbers

{
  'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, 'five' => 5,
  'six' => 6, 'seven' => 7, 'eight' => 8, 'nine' => 9, 'ten' => 10
}.freeze
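
One plausible use of this mapping is resolving the count in phrases such as “two weekends ago”. The helper below (count_for) is a hypothetical sketch, not the class's own method; it handles digits, written numbers, and “few”, and returns nil for anything else.

```ruby
FEW = 3
WORD_NUMBERS = {
  'one' => 1, 'two' => 2, 'three' => 3, 'four' => 4, 'five' => 5,
  'six' => 6, 'seven' => 7, 'eight' => 8, 'nine' => 9, 'ten' => 10
}.freeze

# Hypothetical helper: turn a count token into an Integer,
# or nil when the token is not recognised.
def count_for(token)
  word = token.to_s.strip.downcase
  return word.to_i if word.match?(/\A\d+\z/)
  return FEW if word.match?(/\A(?:a\s+)?few\z/)

  WORD_NUMBERS[word]
end

puts count_for("2")      # 2
puts count_for("Two")    # 2
puts count_for("a few")  # 3
p count_for("eleven")    # nil
```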
UNIT_SECONDS =

Seconds per singular time unit (used by parse_last_x and parse_recent). A month is fixed at 30 days and a year at 365 days.

{
  'second' => 1,
  'minute' => 60,
  'hour'   => 3_600,
  'day'    => 86_400,
  'week'   => 604_800,
  'month'  => 2_592_000,
  'year'   => 31_536_000
}.freeze
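
To show how a table like this can turn “the last N units” into a range ending now, here is a minimal sketch. The helper name (range_for_last) and its injectable `now` argument are assumptions made for illustration; the real parse_last_x and parse_recent are not shown on this page and may work differently.

```ruby
UNIT_SECONDS = {
  'second' => 1,
  'minute' => 60,
  'hour'   => 3_600,
  'day'    => 86_400,
  'week'   => 604_800,
  'month'  => 2_592_000,
  'year'   => 31_536_000
}.freeze

# Hypothetical helper: build a Time range covering the last `count` units.
# `unit` may be singular or plural; `now` is injectable to keep it testable.
def range_for_last(count, unit, now: Time.now)
  seconds = UNIT_SECONDS[unit.to_s.downcase.sub(/s\z/, '')]
  return nil unless seconds

  (now - count * seconds)..now
end

now   = Time.utc(2025, 11, 28, 12, 0, 0)
range = range_for_last(3, 'days', now: now)
puts range.begin  # 2025-11-25 12:00:00 UTC
puts range.end    # 2025-11-28 12:00:00 UTC
```

Because months and years use fixed lengths, a range built this way drifts from calendar boundaries; that is acceptable for fuzzy recall windows but not for exact date arithmetic.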
TEMPORAL_PATTERNS =

Patterns for temporal expressions (order matters: more specific patterns come first). Each pattern should match the ORIGINAL text (including “few”, “a few”).

[
  # "between X and Y" - date ranges
  /\bbetween\s+(.+?)\s+and\s+(.+?)(?=\s+(?:about|regarding|for|on|with)|$)/i,

  # "from X to Y" - date ranges
  /\bfrom\s+(.+?)\s+to\s+(.+?)(?=\s+(?:about|regarding|for|on|with)|$)/i,

  # "since X" - from date to now
  /\bsince\s+(.+?)(?=\s+(?:about|regarding|for|on|with)|$)/i,

  # "before/after X"
  /\b(before|after)\s+(.+?)(?=\s+(?:about|regarding|for|on|with)|$)/i,

  # "in the last/past X units" (including "few", "a few", "several")
  /\bin\s+the\s+(?:last|past)\s+(?:\d+|few|a\s+few|several)\s+(?:#{TIME_UNITS})/i,

  # "weekend before last" / "the weekend before last"
  /\b(?:the\s+)?weekend\s+before\s+last\b/i,

  # "N weekends ago" (numeric or written)
  /\b(?:\d+|one|two|three|four|five|six|seven|eight|nine|ten|few|a\s+few|several)\s+weekends?\s+ago\b/i,

  # "a few X ago" or "few X ago"
  /\b(?:a\s+)?few\s+(?:#{TIME_UNITS})\s+ago\b/i,

  # "X units ago"
  /\b\d+\s+(?:#{TIME_UNITS})\s+ago\b/i,

  # "last/this/next weekend"
  /\b(?:last|this|next)\s+weekend\b/i,

  # "last/this/next X" (week, month, year, monday, etc.)
  /\b(?:last|this|next)\s+(?:week|month|year|monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b/i,

  # "recently" or "recent" as standalone or with context
  /\b(?:recently|recent)\b/i,

  # Standard time words
  /\b(?:yesterday|today|tonight|this\s+morning|this\s+afternoon|this\s+evening|last\s+night)\b/i
].freeze
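
Two behaviours of this list are easy to miss: the lookahead that stops a “since X” match before a topic preposition, and the fact that the first matching pattern wins. The sketch below copies the “since” pattern from the list and pairs it with a shortened time-word pattern; the queries are invented for illustration.

```ruby
# Copied from the list above: lazy capture, bounded by a lookahead.
since_pattern = /\bsince\s+(.+?)(?=\s+(?:about|regarding|for|on|with)|$)/i
# Shortened version of the "standard time words" entry.
word_pattern  = /\b(?:yesterday|today)\b/i

m = since_pattern.match("what changed since last Tuesday about the schema")
puts m[0]  # since last Tuesday
puts m[1]  # last Tuesday

# Both patterns match this query; the earlier one in the list is chosen.
patterns = [since_pattern, word_pattern]
query    = "bugs filed since yesterday about login"
first    = patterns.find { |pattern| query.match?(pattern) }
puts query[first]  # since yesterday
```

Note that the lookahead alternatives have no trailing word boundary, so a word that merely starts with “on” or “for” after whitespace would also end the capture.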

Class Method Details

.extract(query) ⇒ Result

Extract timeframe from a query string

Parameters:

  • query (String)

    The query to parse

Returns:

  • (Result)

    Struct with :query (cleaned), :timeframe, :original_expression



# File 'lib/htm/timeframe_extractor.rb', line 110

def extract(query)
  return Result.new(query: query, timeframe: nil, original_expression: nil) if query.nil? || query.strip.empty?

  # Try each pattern against the ORIGINAL query
  TEMPORAL_PATTERNS.each do |pattern|
    match = query.match(pattern)
    next unless match

    original_expression = match[0].strip
    timeframe = parse_expression(original_expression)
    next unless timeframe

    # Remove the matched expression from query
    cleaned_query = clean_query(query, original_expression)

    return Result.new(
      query: cleaned_query,
      timeframe: timeframe,
      original_expression: original_expression
    )
  end

  # No temporal expression found
  Result.new(query: query, timeframe: nil, original_expression: nil)
end
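
The control flow above has one subtle property: a pattern that matches but cannot be parsed is skipped, and later patterns still get a chance. The toy version below reproduces that loop with two patterns and stand-in helpers. Everything prefixed with toy_ is hypothetical, the Result struct is an assumed shape based on the keyword arguments used above, and the real parse_expression and clean_query are not shown on this page.

```ruby
Result = Struct.new(:query, :timeframe, :original_expression, keyword_init: true)

# Two toy patterns; the real list is TEMPORAL_PATTERNS.
PATTERNS = [/\bsince\s+\w+/i, /\byesterday\b/i].freeze

# Stand-in for parse_expression: only understands "yesterday".
def toy_parse(expression, now)
  return nil unless expression.casecmp?('yesterday')

  (now - 86_400)..now
end

# Stand-in for clean_query: drop the expression and tidy whitespace.
def toy_clean(query, expression)
  query.sub(expression, ' ').squeeze(' ').strip
end

def toy_extract(query, now: Time.now)
  PATTERNS.each do |pattern|
    match = query.match(pattern)
    next unless match

    expression = match[0].strip
    timeframe  = toy_parse(expression, now)
    next unless timeframe # matched but unparseable: try the next pattern

    return Result.new(query: toy_clean(query, expression),
                      timeframe: timeframe,
                      original_expression: expression)
  end
  Result.new(query: query, timeframe: nil, original_expression: nil)
end

now  = Time.utc(2025, 11, 28)
hit  = toy_extract("since launch we shipped yesterday", now: now)
miss = toy_extract("errors in staging", now: now)
puts hit.original_expression  # yesterday
puts hit.query                # since launch we shipped
p miss.timeframe              # nil
```

Callers should therefore treat a nil timeframe as “no usable time filter” and fall back to the unmodified query.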

.temporal?(query) ⇒ Boolean

Check if query contains a temporal expression

Parameters:

  • query (String)

    The query to check

Returns:

  • (Boolean)


# File 'lib/htm/timeframe_extractor.rb', line 141

def temporal?(query)
  return false if query.nil? || query.strip.empty?

  TEMPORAL_PATTERNS.any? { |pattern| query.match?(pattern) }
end
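
Unlike .extract, this check only tests whether a pattern matches; it does not attempt to parse the expression, so a true result does not guarantee that .extract will return a non-nil timeframe. A minimal sketch of the same predicate, using two patterns copied from TEMPORAL_PATTERNS and a local lambda (a hypothetical stand-in for the class method):

```ruby
patterns = [
  /\b(?:last|this|next)\s+(?:week|month|year|monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b/i,
  /\b(?:recently|recent)\b/i
]

# Same guard and match logic as the method above, over the shortened list.
temporal = lambda do |query|
  return false if query.nil? || query.strip.empty?

  patterns.any? { |pattern| query.match?(pattern) }
end

puts temporal.call("what did we decide last week")  # true
puts temporal.call("most recent deployment notes")  # true
puts temporal.call("PostgreSQL index tuning")       # false
puts temporal.call("   ")                           # false
puts temporal.call(nil)                             # false
```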