Class: LlmDocsBuilder::UrlFetcher
- Inherits: Object
  - Object
  - LlmDocsBuilder::UrlFetcher
- Defined in: lib/llm_docs_builder/url_fetcher.rb
Overview
Lightweight HTTP client for fetching remote documentation pages.
Provides functionality shared by multiple commands (transform, compare), including strict scheme validation, redirect handling (up to MAX_REDIRECTS hops), and fixed timeouts (10 seconds to open a connection, 30 seconds to read a response).
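The scheme validation mentioned above happens in a private helper, validate_and_parse_url, whose body is not shown on this page. The following is only a minimal sketch of what such a check might look like, assuming that just http and https are accepted and that a host is required; the error class and messages are illustrative, not taken from the library.

```ruby
require 'uri'

# Hypothetical stand-in for the private validate_and_parse_url helper.
# Assumption: only http and https URLs with a host are accepted.
def validate_and_parse_url(url_string)
  allowed_schemes = %w[http https]
  uri = URI.parse(url_string)

  unless allowed_schemes.include?(uri.scheme&.downcase)
    raise ArgumentError, "Unsupported URL scheme: #{uri.scheme.inspect}"
  end
  raise ArgumentError, "Missing host in URL: #{url_string}" if uri.host.nil? || uri.host.empty?

  uri
rescue URI::InvalidURIError => e
  # Normalize parser failures into the same error type as the other checks.
  raise ArgumentError, "Invalid URL #{url_string}: #{e.message}"
end

# Returns true when the URL passes validation, false otherwise.
def acceptable_url?(url_string)
  validate_and_parse_url(url_string)
  true
rescue ArgumentError
  false
end
```

Rejecting everything except http and https up front keeps schemes such as file or ftp from ever reaching Net::HTTP.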
Constant Summary
- DEFAULT_USER_AGENT = 'llm-docs-builder/1.0 (+https://github.com/mensfeld/llm-docs-builder)'
  Default user agent string for HTTP requests
- MAX_REDIRECTS = 10
  Maximum number of redirects to follow
Instance Method Summary
- #fetch(url_string, redirect_count = 0) ⇒ String
  Fetch remote URL content while following redirects.
- #initialize(user_agent: DEFAULT_USER_AGENT, verbose: false, output: $stdout) ⇒ UrlFetcher
  Constructor. Returns a new instance of UrlFetcher.
Constructor Details
#initialize(user_agent: DEFAULT_USER_AGENT, verbose: false, output: $stdout) ⇒ UrlFetcher
Returns a new instance of UrlFetcher.
    # File 'lib/llm_docs_builder/url_fetcher.rb', line 21
    def initialize(user_agent: DEFAULT_USER_AGENT, verbose: false, output: $stdout)
      @user_agent = user_agent
      @verbose = verbose
      @output = output
    end
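Because the constructor takes the output stream as a keyword argument, log output can be captured in tests instead of going to $stdout. The sketch below uses a stand-in class, SketchFetcher, that mirrors the documented constructor signature. The body of log_redirect is an assumption (the real private helper is not shown on this page), as is the message format.

```ruby
require 'stringio'

# Illustrative stand-in that mirrors the documented constructor signature.
# It is not the library class; log_redirect below is an assumed implementation.
class SketchFetcher
  DEFAULT_USER_AGENT = 'llm-docs-builder/1.0 (+https://github.com/mensfeld/llm-docs-builder)'

  def initialize(user_agent: DEFAULT_USER_AGENT, verbose: false, output: $stdout)
    @user_agent = user_agent
    @verbose = verbose
    @output = output
  end

  # Writes a notice only when verbose mode is on (assumed behavior).
  def log_redirect(url)
    @output.puts("Redirecting to #{url}") if @verbose
  end
end

# With the default verbose: false, nothing is written.
quiet_log = StringIO.new
SketchFetcher.new(output: quiet_log).log_redirect('https://example.com/a')

# With verbose: true, the notice goes to the injected stream.
verbose_log = StringIO.new
SketchFetcher.new(verbose: true, output: verbose_log).log_redirect('https://example.com/a')
```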
Instance Method Details
#fetch(url_string, redirect_count = 0) ⇒ String
Fetch remote URL content while following redirects.
    # File 'lib/llm_docs_builder/url_fetcher.rb', line 33
    def fetch(url_string, redirect_count = 0)
      if redirect_count >= MAX_REDIRECTS
        raise(
          Errors::GenerationError,
          "Too many redirects (#{MAX_REDIRECTS}) when fetching #{url_string}"
        )
      end

      uri = validate_and_parse_url(url_string)
      http = Net::HTTP.new(uri.host, uri.port)
      http.use_ssl = uri.scheme == 'https'
      http.open_timeout = 10
      http.read_timeout = 30

      request = Net::HTTP::Get.new(uri.request_uri)
      request['User-Agent'] = @user_agent

      response = http.request(request)

      case response
      when Net::HTTPSuccess
        response.body
      when Net::HTTPRedirection
        redirect_url = absolute_redirect_url(uri, response['location'])
        log_redirect(redirect_url)
        fetch(redirect_url, redirect_count + 1)
      else
        raise(
          Errors::GenerationError,
          "Failed to fetch #{url_string}: #{response.code} #{response.message}"
        )
      end
    rescue Errors::GenerationError
      raise
    rescue StandardError => e
      raise(
        Errors::GenerationError,
        "Error fetching #{url_string}: #{e.message}"
      )
    end
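When a redirect is followed, the Location header may be relative, so it has to be resolved against the URL that was just requested. The private absolute_redirect_url helper is not shown on this page; the sketch below is one plausible way to do that resolution with the standard library's URI.join, and the missing-header check is an assumption.

```ruby
require 'uri'

# Hypothetical stand-in for the private absolute_redirect_url helper.
# Resolves a Location header value against the URI that was just requested.
def absolute_redirect_url(base_uri, location)
  # Assumption: a redirect without a Location header is treated as an error.
  raise ArgumentError, 'Redirect response has no Location header' if location.nil? || location.empty?

  URI.join(base_uri.to_s, location).to_s
end

base = URI.parse('https://example.com/docs/guide.html')

root_relative = absolute_redirect_url(base, '/v2/guide.html')
path_relative = absolute_redirect_url(base, 'intro.html')
already_absolute = absolute_redirect_url(base, 'https://cdn.example.org/x')
```

Returning a string rather than a URI object matches how fetch passes redirect_url back into itself as url_string, so the next hop goes through the same validation as the first.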