Class: LlmDocsBuilder::Parser

Inherits:
Object
Defined in:
lib/llm_docs_builder/parser.rb

Overview

Parses llms.txt files into structured data

Reads and parses llms.txt files according to the llms.txt specification, extracting the title, description, and structured sections (Documentation, Examples, Optional) with their links.

Examples:

Parse an llms.txt file

parser = LlmDocsBuilder::Parser.new('llms.txt')
parsed = parser.parse
parsed.title              # => "My Project"
parsed.description        # => "Project description"
parsed.documentation_links # => [{title: "README", url: "...", description: "..."}]
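
The example above assumes an llms.txt file already exists on disk. As a rough sketch of what such a file might contain, the snippet below holds a small sample in a heredoc and labels each line by its prefix. The sample text, the `README.md` link, and the `line_kind` helper are illustrative assumptions, not part of the library. The real parser applies extra conditions, for example a `> ` line only counts as the description once a title has been seen.

```ruby
# Hypothetical sample input in the llms.txt layout: an H1 title,
# a blockquote description, and an H2 section holding a link list.
SAMPLE = <<~LLMS
  # My Project

  > Project description

  ## Documentation

  - [README](README.md): Project overview
LLMS

# Illustrative helper (not part of LlmDocsBuilder): labels a line by prefix.
def line_kind(line)
  if line.start_with?('# ') then :title
  elsif line.start_with?('> ') then :description
  elsif line.start_with?('## ') then :section_heading
  elsif line.strip.empty? then :blank
  else :section_content
  end
end

kinds = SAMPLE.lines.map { |line| line_kind(line) }
# kinds => [:title, :blank, :description, :blank,
#           :section_heading, :blank, :section_content]
```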

Constructor Details

#initialize(file_path) ⇒ Parser

Initializes a new parser and reads the file at file_path into memory.

Parameters:

  • file_path (String)

    path to the llms.txt file to parse



# File 'lib/llm_docs_builder/parser.rb', line 28

def initialize(file_path)
  @file_path = file_path
  @content = File.read(file_path)
end
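
Because the constructor calls `File.read` immediately, a missing file raises `Errno::ENOENT` at construction time rather than when `#parse` is called. The sketch below shows one way a caller might guard against that. `EagerReader` is a hypothetical stand-in that mirrors the constructor shown above so the snippet runs without the gem, and `safe_read` is likewise an illustrative helper, not a library method.

```ruby
require 'tempfile'

# Hypothetical stand-in mirroring the constructor above; not the library class.
class EagerReader
  attr_reader :file_path, :content

  def initialize(file_path)
    @file_path = file_path
    @content = File.read(file_path) # raises Errno::ENOENT if the file is missing
  end
end

# Illustrative helper: returns nil instead of raising when the file is absent.
def safe_read(path)
  EagerReader.new(path)
rescue Errno::ENOENT
  nil
end

missing = safe_read('/nonexistent/dir/llms.txt')

# Write a temporary file, then construct the reader while the file still exists.
reader = Tempfile.create('llms') do |f|
  f.write("# My Project\n")
  f.flush
  safe_read(f.path)
end
```

Since the content is captured in the constructor, `reader.content` remains available even after the temporary file has been removed.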

Instance Attribute Details

#content ⇒ String (readonly)

Returns raw content of the llms.txt file.

Returns:

  • (String)

    raw content of the llms.txt file



# File 'lib/llm_docs_builder/parser.rb', line 23

def content
  @content
end

#file_path ⇒ String (readonly)

Returns path to the llms.txt file.

Returns:

  • (String)

    path to the llms.txt file



# File 'lib/llm_docs_builder/parser.rb', line 20

def file_path
  @file_path
end

Instance Method Details

#parse ⇒ ParsedContent

Parse the llms.txt file

Parses the file content and returns a LlmDocsBuilder::ParsedContent object containing the extracted title, description, and structured sections with links.

Returns:

  • (ParsedContent)

    parsed content with title, description, and sections



# File 'lib/llm_docs_builder/parser.rb', line 39

def parse
  sections = {}
  current_section = nil
  current_content = []

  lines = content.lines

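  lines.each_with_index do |line, index|
    if line.start_with?('# ')
      # H1 heading: flush any open section, then record the title
      # (only while nothing has been stored in sections yet)
      save_section(sections, current_section, current_content) if current_section

      sections[:title] = line[2..].strip if sections.empty?
      # This guard cannot match in this branch: the line starts with '# ',
      # so it cannot also start with '> '
      current_section = :description if index == 1 && line.start_with?('> ')
      current_content = []
    elsif line.start_with?('> ') && sections[:title] && !sections[:description]
      # First blockquote after the title becomes the description
      sections[:description] = line[2..].strip
    elsif line.start_with?('## ')
      # H2 heading: flush the open section and start a new one,
      # keyed by the heading text normalized to a snake_case symbol
      save_section(sections, current_section, current_content) if current_section

      current_section = line[3..].strip.downcase.gsub(/\s+/, '_').to_sym
      current_content = []
    elsif !line.strip.empty?
      # Any other non-blank line is collected for the current section
      current_content << line
    end
  end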

  save_section(sections, current_section, current_content) if current_section

  ParsedContent.new(sections)
end
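
The section key for each `## ` heading comes from the normalization expression in `#parse`. As a small sketch, the same expression is wrapped below in a hypothetical `section_key` helper (the helper name and the sample headings are assumptions for illustration) to show how heading text maps to the symbol used as the hash key.

```ruby
# Hypothetical helper wrapping the expression used in #parse:
# drop the leading '## ', trim, lowercase, and join words with underscores.
def section_key(heading_line)
  heading_line[3..].strip.downcase.gsub(/\s+/, '_').to_sym
end

documentation_key = section_key("## Documentation\n")      # => :documentation
multi_word_key    = section_key("## Optional Resources\n") # => :optional_resources
```

One consequence of this approach is that any H2 heading produces a key, not just Documentation, Examples, and Optional, so headings that differ only in case or spacing collapse to the same symbol.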