Class: RedditPostToMarkdown::UrlValidator

Inherits:
Object
  • Object
show all
Defined in:
lib/reddit_post_to_markdown/url_validator.rb

Overview

Validates and normalises Reddit post URLs.

Constant Summary collapse

PATTERNS =
[
  %r{\Ahttps://(?:www\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
  %r{\Ahttps://(?:www\.)?reddit\.com/[^/]+/comments/[a-z0-9]+/},
  %r{\Ahttps://(?:old\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
  %r{\Ahttps://redd\.it/[a-z0-9]+}
].freeze

Class Method Summary collapse

Class Method Details

.clean_url(url) ⇒ String

Strips common tracking parameters and the trailing slash from a Reddit URL.

Removes query strings beginning with ?utm_source, ?ref=, or ?context=, then strips any trailing slash. Leading and trailing whitespace is also removed.

Parameters:

  • url (String)

    the URL to clean

Returns:

  • (String)

    the cleaned URL



39
40
41
42
43
44
45
# File 'lib/reddit_post_to_markdown/url_validator.rb', line 39

def self.clean_url(url)
  url = url.to_s.strip
  url = url.split("?utm_source").first
  url = url.split("?ref=").first
  url = url.split("?context=").first
  url.chomp("/")
end

.valid_post_url?(url) ⇒ Boolean

Returns true if url looks like a direct Reddit post URL.

A valid post URL must use HTTPS and match one of the following forms:

Subreddit listings, user profiles, search pages, and similar URLs return false.

Parameters:

  • url (String, nil)

    the URL to check

Returns:

  • (Boolean)


24
25
26
27
28
29
# File 'lib/reddit_post_to_markdown/url_validator.rb', line 24

def self.valid_post_url?(url)
  return false if url.nil? || url.empty?
  return false unless url.start_with?("https://")

  PATTERNS.any? { |pattern| url.match?(pattern) }
end