Class: RedditPostToMarkdown::UrlValidator
- Inherits: Object
  - Object
  - RedditPostToMarkdown::UrlValidator
- Defined in: lib/reddit_post_to_markdown/url_validator.rb
Overview
Validates and normalises Reddit post URLs.
Constant Summary
- PATTERNS =
    [
      %r{\Ahttps://(?:www\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://(?:www\.)?reddit\.com/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://(?:old\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://redd\.it/[a-z0-9]+}
    ].freeze
Class Method Summary
- .clean_url(url) ⇒ String
  Strips common tracking parameters and the trailing slash from a Reddit URL.
- .valid_post_url?(url) ⇒ Boolean
  Returns true if url looks like a direct Reddit post URL.
Class Method Details
.clean_url(url) ⇒ String
Strips common tracking parameters and the trailing slash from a Reddit URL.
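Leading and trailing whitespace is removed first. The method then drops everything from the first occurrence of ?utm_source, ?ref=, or ?context= onward and strips a single trailing slash. Only the ?-prefixed forms are matched, so these parameters are removed only when they open the query string; other query parameters are left in place.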
    # File 'lib/reddit_post_to_markdown/url_validator.rb', line 39

    def self.clean_url(url)
      url = url.to_s.strip
      url = url.split("?utm_source").first
      url = url.split("?ref=").first
      url = url.split("?context=").first
      url.chomp("/")
    end
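As a rough illustration of the behaviour described above, the sketch below copies clean_url from the listing into a standalone class so that it runs without the gem installed. The URLs are hypothetical and chosen only to exercise each case; nothing here is taken from the project's own tests.

```ruby
# Standalone copy of clean_url from the listing above, reproduced
# so this example runs without the gem being installed.
module RedditPostToMarkdown
  class UrlValidator
    def self.clean_url(url)
      url = url.to_s.strip
      url = url.split("?utm_source").first
      url = url.split("?ref=").first
      url = url.split("?context=").first
      url.chomp("/")
    end
  end
end

validator = RedditPostToMarkdown::UrlValidator

# Hypothetical URLs, for illustration only.
base = "https://www.reddit.com/r/ruby/comments/abc123/some_title/"

# Surrounding whitespace, a tracking parameter, and the trailing slash all go.
with_tracking = validator.clean_url("  #{base}?utm_source=share  ")

# ?context= is treated the same way.
with_context = validator.clean_url("#{base}?context=3")

# Parameters outside the three recognised prefixes are left in place.
other_param = validator.clean_url("#{base}?sort=new")

puts with_tracking # https://www.reddit.com/r/ruby/comments/abc123/some_title
puts with_context  # https://www.reddit.com/r/ruby/comments/abc123/some_title
puts other_param   # https://www.reddit.com/r/ruby/comments/abc123/some_title/?sort=new
```

One edge case worth knowing about: with the code as listed, an empty string or nil appears to raise NoMethodError, because "".split(...) returns an empty array and .first is then nil. Callers may want to guard against blank input before calling clean_url.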
.valid_post_url?(url) ⇒ Boolean
Returns true if url looks like a direct Reddit post URL.
A valid post URL must use HTTPS and match one of the following forms:
- www.reddit.com/r/<sub>/comments/<id>/
- reddit.com/r/<sub>/comments/<id>/
- old.reddit.com/r/<sub>/comments/<id>/
- redd.it/<id>
Subreddit listings, user profiles, search pages, and similar URLs return false.
    # File 'lib/reddit_post_to_markdown/url_validator.rb', line 24

    def self.valid_post_url?(url)
      return false if url.nil? || url.empty?
      return false unless url.start_with?("https://")

      PATTERNS.any? { |pattern| url.match?(pattern) }
    end
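The sketch below shows how the accepted and rejected forms play out in practice. It copies PATTERNS and valid_post_url? from this page into a standalone class so that it runs without the gem; the URLs themselves are hypothetical examples, not cases from the project.

```ruby
# Standalone copy of PATTERNS and valid_post_url? from the listings
# above, reproduced so this example runs without the gem installed.
module RedditPostToMarkdown
  class UrlValidator
    PATTERNS = [
      %r{\Ahttps://(?:www\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://(?:www\.)?reddit\.com/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://(?:old\.)?reddit\.com/r/[^/]+/comments/[a-z0-9]+/},
      %r{\Ahttps://redd\.it/[a-z0-9]+}
    ].freeze

    def self.valid_post_url?(url)
      return false if url.nil? || url.empty?
      return false unless url.start_with?("https://")

      PATTERNS.any? { |pattern| url.match?(pattern) }
    end
  end
end

validator = RedditPostToMarkdown::UrlValidator

# Hypothetical URLs, for illustration only.
results = {
  "https://www.reddit.com/r/ruby/comments/abc123/some_title/" => nil, # post URL
  "https://old.reddit.com/r/ruby/comments/abc123/some_title/" => nil, # old.reddit.com form
  "https://redd.it/abc123"                                    => nil, # short link
  "http://www.reddit.com/r/ruby/comments/abc123/some_title/"  => nil, # not HTTPS
  "https://www.reddit.com/r/ruby/"                            => nil, # subreddit listing
  "https://www.reddit.com/r/ruby/comments/abc123"             => nil  # no slash after the id
}

results.each_key { |url| results[url] = validator.valid_post_url?(url) }
results.each { |url, ok| puts "#{ok}\t#{url}" }
```

In this copy the first three URLs return true and the last three return false. The final case is worth noting: the reddit.com patterns require a slash after the post id, so a URL that ends at the id is rejected. Because clean_url removes the trailing slash, a URL with no title segment would fail validation if it were cleaned first; validating before cleaning avoids that.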