Class: Html2rss::AutoSource::Scraper::LinkHeuristics::DestinationFacts

Inherits:
Data
  • Object
Defined in:
lib/html2rss/auto_source/scraper/link_heuristics.rb

Overview

Normalized URL plus reusable route-classification facts for one link.

Instance Attribute Details

#content_path ⇒ Object (readonly)

Returns the value of attribute content_path

Returns:

  • (Object)

    the current value of content_path



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def content_path
  @content_path
end

#destination ⇒ Object (readonly)

Returns the value of attribute destination

Returns:

  • (Object)

    the current value of destination



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def destination
  @destination
end

#high_confidence_junk_path ⇒ Object (readonly)

Returns the value of attribute high_confidence_junk_path

Returns:

  • (Object)

    the current value of high_confidence_junk_path



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def high_confidence_junk_path
  @high_confidence_junk_path
end

#high_confidence_utility_destination ⇒ Object (readonly)

Returns the value of attribute high_confidence_utility_destination

Returns:

  • (Object)

    the current value of high_confidence_utility_destination



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def high_confidence_utility_destination
  @high_confidence_utility_destination
end

#segments ⇒ Object (readonly)

Returns the value of attribute segments

Returns:

  • (Object)

    the current value of segments



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def segments
  @segments
end

#shallow ⇒ Object (readonly)

Returns the value of attribute shallow

Returns:

  • (Object)

    the current value of shallow



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def shallow
  @shallow
end

#strong_post_suffix ⇒ Object (readonly)

Returns the value of attribute strong_post_suffix

Returns:

  • (Object)

    the current value of strong_post_suffix



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def strong_post_suffix
  @strong_post_suffix
end

#taxonomy_path ⇒ Object (readonly)

Returns the value of attribute taxonomy_path

Returns:

  • (Object)

    the current value of taxonomy_path



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def taxonomy_path
  @taxonomy_path
end

#url ⇒ Object (readonly)

Returns the value of attribute url

Returns:

  • (Object)

    the current value of url



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def url
  @url
end

#utility_path ⇒ Object (readonly)

Returns the value of attribute utility_path

Returns:

  • (Object)

    the current value of utility_path



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def utility_path
  @utility_path
end

#vanity_path ⇒ Object (readonly)

Returns the value of attribute vanity_path

Returns:

  • (Object)

    the current value of vanity_path



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 12

def vanity_path
  @vanity_path
end
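
Because the class inherits from Data, every attribute above is a reader with no matching writer. The following is a minimal sketch of that behavior, assuming Ruby 3.2 or newer. `ReadonlyFactsSketch` is a hypothetical stand-in with only three of the documented members, and the values passed to it are made up for illustration; the real class may store different types.

```ruby
# Hypothetical stand-in with three of the documented members (not the real class).
ReadonlyFactsSketch = Data.define(:url, :destination, :shallow)

facts = ReadonlyFactsSketch.new(
  url: "https://example.com/blog/post", # assumption: the real class may hold a URL object
  destination: "https://example.com/blog/post",
  shallow: false
)

facts.destination            # reads the stored value
facts.respond_to?(:shallow=) # false, because Data defines no setters
facts.frozen?                # true, because Data instances are immutable

# "Changing" a fact means deriving a new instance; the original is untouched.
shallower = facts.with(shallow: true)
```

Since the instances are immutable, facts computed once for a link can be passed around and reused without defensive copies.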

Class Method Details

.build(url) ⇒ DestinationFacts

Returns route facts for downstream link scoring.

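Parameters:

  • url

    the normalized link URL; it must respond to path_segments and to_s

Returns:

  • (DestinationFacts)

    route facts for downstream link scoring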



# File 'lib/html2rss/auto_source/scraper/link_heuristics.rb', line 27

def self.build(url)
  classifier = PathClassifier.new(url.path_segments)

  new(
    url:,
    destination: url.to_s,
    **classifier.destination_attributes
  )
end
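
The listing above combines two explicit keywords with a double-splatted hash from the classifier. The following sketch reproduces that pattern in isolation, assuming Ruby 3.2 or newer. `UrlSketch`, `PathClassifierSketch`, and `BuiltFactsSketch` are hypothetical stand-ins, and the "shallow" rule is invented for illustration; the real classifier logic is not shown in this page.

```ruby
# Hypothetical URL wrapper exposing the two methods that build relies on.
UrlSketch = Struct.new(:string) do
  def path_segments
    string.sub(%r{\Ahttps?://[^/]+}, "").split("/").reject(&:empty?)
  end

  def to_s
    string
  end
end

# Hypothetical classifier; the rule below is made up for this sketch.
class PathClassifierSketch
  def initialize(segments)
    @segments = segments
  end

  # Returns a hash whose keys match the remaining Data members.
  def destination_attributes
    { segments: @segments, shallow: @segments.size <= 2 }
  end
end

# Reduced stand-in for DestinationFacts with four members instead of eleven.
BuiltFactsSketch = Data.define(:url, :destination, :segments, :shallow) do
  def self.build(url)
    classifier = PathClassifierSketch.new(url.path_segments)
    new(url:, destination: url.to_s, **classifier.destination_attributes)
  end
end

facts = BuiltFactsSketch.build(UrlSketch.new("https://example.com/blog/2024/hello"))
facts.segments # the path split into its non-empty parts
facts.shallow  # false here, since the path has three segments
```

In this sketch, `new` raises ArgumentError if the classifier hash omits a member or supplies an unknown key, so the hash keys and the Data members have to stay in sync.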