Class: Arachni::OptionGroups::Scope

Inherits:
Arachni::OptionGroup
Defined in:
lib/arachni/option_groups/scope.rb

Overview

Scan scope options. Maintains the rules used to decide which resources should be considered for crawling, auditing, etc. during the scan.

Author:

  • Tasos “Zapotek” Laskos <tasos.laskos@arachni-scanner.com>

Constant Summary

EXCLUDE_MIME_TYPES =
{
  # Media
  image:       %w(
        gif bmp tif tiff jpg jpeg jpe pjpeg png ico psd xcf 3dm max svg eps
        drw ai
    ),
  video:       %w(asf rm mpg mpeg mpe 3gp 3g2  avi flv mov mp4 swf vob wmv),
  audio:       %w(aif mp3 mpa ra wav wma mid m4a ogg flac),

  # Generic data
  archive:     %w(zip zipx tar gz 7z rar bz2),
  disk:        %w(bin cue dmg iso mdf vcd raw),

  # Executables -- or thereabouts.
  application: %w(exe apk app jar pkg deb rpm msi),

  # Assets
  #
  # The browsers will not check the scope for asset files, so these shouldn't
  # mess with it, they should only narrow down the audit.
  font:        %w(ttf otf woff woff2 fon fnt),
  stylesheet:  %w(css),
  script:      %w(js),

  # Documents
  #
  # Allow rtf, ps, xls, doc, ppt, ppts since they can contain greppable text.
  document:    %w(pdf docx xlsx pptx odt odp),
}
EXCLUDE_FILE_EXTENSIONS =
Set.new( EXCLUDE_MIME_TYPES.values.flatten.uniq )
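
As a rough illustration of how a flattened extension set like this can be used, the sketch below builds a Set the same way from a trimmed copy of the hash and checks a path's extension against it. The `excluded_extension?` helper and the shortened sample hash are hypothetical and not part of Arachni.

```ruby
require 'set'

# Trimmed, hypothetical copy of the hash above; the real constant has more groups.
sample_mime_types = {
  image:      %w(gif png jpg),
  stylesheet: %w(css),
  script:     %w(js)
}

# Same construction as EXCLUDE_FILE_EXTENSIONS: flatten every group into one Set.
sample_extensions = Set.new( sample_mime_types.values.flatten.uniq )

# Hypothetical helper: true when the path ends in one of the given extensions.
def excluded_extension?( path, extensions )
  ext = File.extname( path ).delete_prefix( '.' ).downcase
  !ext.empty? && extensions.include?( ext )
end

puts excluded_extension?( '/assets/logo.PNG', sample_extensions ) # prints "true"
puts excluded_extension?( '/index.php', sample_extensions )       # prints "false"
```

A Set is a reasonable choice here because membership checks stay cheap no matter how many extensions are listed.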

Methods inherited from Arachni::OptionGroup

#==, attr_accessor, attributes, #attributes, defaults, #defaults, #hash, inherited, #initialize, #merge, set_defaults, #to_h, #to_hash, #update, #validate

Constructor Details

This class inherits a constructor from Arachni::OptionGroup

Instance Attribute Details

#auto_redundant_paths ⇒ Bool

Returns a limit on how many paths with identical query parameter names are processed. Helps avoid processing redundant/identical resources like entries in calendars and catalogs.

Returns:

  • (Bool)

    Sets a limit to how many paths with identical query parameter names to process. Helps avoid processing redundant/identical resources like entries in calendars and catalogs.

# File 'lib/arachni/option_groups/scope.rb', line 123

def auto_redundant_paths
  @auto_redundant_paths
end
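
The idea of limiting paths by their query parameter names can be sketched as follows. This is not Arachni's implementation: the `redundant_by_params?` helper, the counter key format and the "greater than the limit" comparison are all assumptions made for illustration.

```ruby
require 'uri'

# Hypothetical helper: counts how often a path has been seen with the same set
# of query parameter names and returns true once `limit` has been exceeded.
def redundant_by_params?( url, limit, counter )
  uri   = URI.parse( url )
  names = URI.decode_www_form( uri.query.to_s ).map( &:first ).sort
  return false if names.empty?

  key = "#{uri.path}?#{names.join( '&' )}"
  counter[key] += 1
  counter[key] > limit
end

counter = Hash.new( 0 )
urls = %w(
  http://example.com/calendar?day=1&month=2
  http://example.com/calendar?day=2&month=2
  http://example.com/calendar?day=3&month=2
)

# With a limit of 2, only the third URL is reported as redundant.
urls.each { |url| puts redundant_by_params?( url, 2, counter ) }
```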

#directory_depth_limit ⇒ Integer

Note:

`nil` is infinite – default is `nil`.

Returns How deep to go into the site's directory tree.

Returns:

  • (Integer)

    How deep to go into the site's directory tree.

# File 'lib/arachni/option_groups/scope.rb', line 55

def directory_depth_limit
  @directory_depth_limit
end
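
One way to picture a directory depth limit is to count the directory segments of a URL's path and compare that count to the limit. The helpers below are hypothetical; in particular, whether Arachni treats a depth equal to the limit as in or out of scope is not stated here, so the strict comparison is an assumption.

```ruby
require 'uri'

# Hypothetical helper: number of directories in the URL's path.
def directory_depth( url )
  path = URI.parse( url ).path.to_s
  path = '/' if path.empty?
  # Drop the final segment (the resource itself) unless the path ends in '/'.
  path = File.dirname( path ) unless path.end_with?( '/' )
  path.split( '/' ).reject( &:empty? ).size
end

# Hypothetical check: a `nil` limit means no limit, as the note above states.
def too_deep?( url, limit )
  !limit.nil? && directory_depth( url ) > limit
end

puts directory_depth( 'http://example.com/a/b/c.php' ) # prints "2"
puts too_deep?( 'http://example.com/a/b/c.php', 1 )    # prints "true"
puts too_deep?( 'http://example.com/a/b/c.php', nil )  # prints "false"
```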

#dom_depth_limit ⇒ Integer

Note:

`nil` is infinite – default is `10`.

Returns How deep to go into each page's DOM tree.

Returns:

  • (Integer)

    How deep to go into each page's DOM tree.

# File 'lib/arachni/option_groups/scope.rb', line 63

def dom_depth_limit
  @dom_depth_limit
end

#dom_event_inheritance_limit ⇒ Integer

Note:

`nil` is infinite – default is `nil`.

Returns How many elements should inherit the DOM events of their parents.

Returns:

  • (Integer)

    How many elements should inherit the DOM events of their parents.

# File 'lib/arachni/option_groups/scope.rb', line 79

def dom_event_inheritance_limit
  @dom_event_inheritance_limit
end

#dom_event_limit ⇒ Integer

Note:

`nil` is infinite – default is `nil`.

Returns How many DOM events to trigger for each snapshot.

Returns:

  • (Integer)

    How many DOM events to trigger for each snapshot.

# File 'lib/arachni/option_groups/scope.rb', line 71

def dom_event_limit
  @dom_event_limit
end

#exclude_binaries ⇒ Bool (also known as: exclude_binaries?)

Note:

Default is `false`.

Returns Exclude pages with binary content from the audit. Mainly used to avoid having grep checks confused by random binary content.

Returns:

  • (Bool)

    Exclude pages with binary content from the audit. Mainly used to avoid having grep checks confused by random binary content.

See Also:

  • HTTP::Response::Scope#exclude_as_binary?

# File 'lib/arachni/option_groups/scope.rb', line 159

def exclude_binaries
  @exclude_binaries
end

#exclude_content_patterns ⇒ Array<Regexp>

Returns Page/HTTP::Response bodies matching any of these patterns will be ignored.

Returns:

  • (Array<Regexp>)

    Page/HTTP::Response bodies matching any of these patterns will be ignored.

See Also:

  • HTTP::Response::Scope#exclude_content?

# File 'lib/arachni/option_groups/scope.rb', line 150

def exclude_content_patterns
  @exclude_content_patterns
end

#exclude_file_extensions ⇒ Array<String>

Returns Extension exclusion patterns, resources whose extension is in the list will not be considered.

Returns:

  • (Array<String>)

    Extension exclusion patterns, resources whose extension is in the list will not be considered.

# File 'lib/arachni/option_groups/scope.rb', line 137

def exclude_file_extensions
  @exclude_file_extensions
end

#exclude_path_patterns ⇒ Array<Regexp>

Returns Path exclusion patterns, resources that match any of the specified patterns will not be considered.

Returns:

  • (Array<Regexp>)

    Path exclusion patterns, resources that match any of the specified patterns will not be considered.

# File 'lib/arachni/option_groups/scope.rb', line 144

def exclude_path_patterns
  @exclude_path_patterns
end

#extend_paths ⇒ Array<String>

Returns Paths to use in addition to crawling.

Returns:

  • (Array<String>)

    Paths to use in addition to crawling.

See Also:

  • Framework#push_to_page_queue
  • Framework#push_to_url_queue

# File 'lib/arachni/option_groups/scope.rb', line 103

def extend_paths
  @extend_paths
end

#https_only ⇒ Bool (also known as: https_only?)

Returns If an HTTPS Arachni::Options#url has been provided, **do not** downgrade to an insecure link.

Returns:

  • (Bool)

    If an HTTPS Arachni::Options#url has been provided, **do not** downgrade to an insecure link.

# File 'lib/arachni/option_groups/scope.rb', line 176

def https_only
  @https_only
end
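
The "do not downgrade" rule can be pictured as a scheme comparison between the seed URL and each discovered link. The `allowed_scheme?` helper below is a hypothetical sketch of that idea, not the check Arachni performs.

```ruby
require 'uri'

# Hypothetical check: with `https_only` set and an HTTPS seed URL, reject
# links that use any other scheme.
def allowed_scheme?( seed_url, link, https_only )
  return true unless https_only
  return true unless URI.parse( seed_url ).scheme == 'https'

  URI.parse( link ).scheme == 'https'
end

seed = 'https://example.com/'
puts allowed_scheme?( seed, 'http://example.com/login', true )  # prints "false"
puts allowed_scheme?( seed, 'https://example.com/login', true ) # prints "true"
puts allowed_scheme?( seed, 'http://example.com/login', false ) # prints "true"
```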

#include_path_patterns ⇒ Array<Regexp>

Returns Path inclusion patterns, only resources that match any of the specified patterns will be considered.

Returns:

  • (Array<Regexp>)

    Path inclusion patterns, only resources that match any of the specified patterns will be considered.

# File 'lib/arachni/option_groups/scope.rb', line 130

def include_path_patterns
  @include_path_patterns
end
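
How inclusion and exclusion patterns might interact can be sketched with a small predicate. The `path_in_scope?` helper is hypothetical, and two of its choices are assumptions rather than documented behavior: exclusions are checked first, and an empty inclusion list is treated as "include everything".

```ruby
# Hypothetical helper combining inclusion and exclusion patterns.
def path_in_scope?( path, include_patterns, exclude_patterns )
  # Any matching exclusion pattern rules the path out.
  return false if exclude_patterns.any? { |pattern| path.match?( pattern ) }

  # Assumption: no inclusion patterns means nothing is filtered out here.
  include_patterns.empty? || include_patterns.any? { |pattern| path.match?( pattern ) }
end

includes = [/\A\/blog\//]
excludes = [/logout/]

puts path_in_scope?( '/blog/post-1', includes, excludes ) # prints "true"
puts path_in_scope?( '/blog/logout', includes, excludes ) # prints "false"
puts path_in_scope?( '/shop/item-1', includes, excludes ) # prints "false"
```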

#include_subdomains ⇒ Bool

Note:

Default is `false`.

Returns Take into consideration URLs pointing to different subdomains from the seed URL.

Returns:

  • (Bool)

    Take into consideration URLs pointing to different subdomains from the seed URL.

# File 'lib/arachni/option_groups/scope.rb', line 169

def include_subdomains
  @include_subdomains
end

#page_limit ⇒ Integer

Note:

`nil` is infinite – default is `nil`.

Returns How many pages to consider (crawl/audit)?

Returns:

  • (Integer)

    How many pages to consider (crawl/audit)?

See Also:

  • Framework#push_to_page_queue
  • Framework#push_to_url_queue
  • Framework#audit_page
  • Trainer#push

# File 'lib/arachni/option_groups/scope.rb', line 90

def page_limit
  @page_limit
end

#redundant_path_patterns ⇒ Hash{Regexp => Integer}

Returns Filters for redundant paths in the form of `{ pattern => counter }`. Once the `pattern` has matched `counter` paths, any further matching resources are ignored.

Useful when scanning sites that dynamically generate a large number of pages, like galleries and calendars.

Returns:

  • (Hash{Regexp => Integer})

    Filters for redundant paths in the form of `{ pattern => counter }`. Once the `pattern` has matched `counter` paths, any further matching resources are ignored.

    Useful when scanning sites that dynamically generate a large number of pages, like galleries and calendars.

# File 'lib/arachni/option_groups/scope.rb', line 114

def redundant_path_patterns
  @redundant_path_patterns
end
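
The `{ pattern => counter }` behavior described above can be sketched as a countdown per pattern. The `redundant?` helper is hypothetical and only illustrates the counting idea; it mutates the hash it is given.

```ruby
# Hypothetical helper: `counters` maps each pattern to the number of matches
# still allowed, and is updated in place.
def redundant?( path, counters )
  counters.each do |pattern, remaining|
    next unless path.match?( pattern )
    return true if remaining <= 0

    counters[pattern] = remaining - 1
  end
  false
end

counters = { /gallery\.php/ => 2 }
paths    = %w(/gallery.php?id=1 /gallery.php?id=2 /gallery.php?id=3 /about.php)

# The first two gallery pages pass, the third is ignored, /about.php never matches.
paths.each { |path| puts "#{path} redundant? #{redundant?( path, counters )}" }
```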

#restrict_paths ⇒ Array<String>

Returns Paths to use instead of crawling.

Returns:

  • (Array<String>)

    Paths to use instead of crawling.

See Also:

  • Framework#push_to_url_queue

# File 'lib/arachni/option_groups/scope.rb', line 96

def restrict_paths
  @restrict_paths
end

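#url_rewrites ⇒ Hash<Regexp => String>

Returns Regular expression and substitution pairs, used to rewrite Element::Capabilities::Submittable#action.

Returns:

  • (Hash<Regexp => String>)

    Regular expression and substitution pairs, used to rewrite Element::Capabilities::Submittable#action.
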
# File 'lib/arachni/option_groups/scope.rb', line 185

def url_rewrites
  @url_rewrites
end
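
A regular expression and substitution pair can be applied with `String#sub`, as in the sketch below. The `rewrite_url` helper is hypothetical, and applying only the first matching pattern is an assumption; the actual rewriting logic lives elsewhere in Arachni and is not shown on this page.

```ruby
# Hypothetical helper: applies the substitution of the first matching pattern.
def rewrite_url( url, rewrites )
  rewrites.each do |pattern, substitution|
    return url.sub( pattern, substitution ) if url.match?( pattern )
  end
  url
end

# Maps a "pretty" URL back to its query-string form so the parameter is visible.
rewrites = { %r{/articles/(\d+)} => '/articles.php?id=\1' }

puts rewrite_url( 'http://example.com/articles/42', rewrites )
# prints "http://example.com/articles.php?id=42"
```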

Instance Method Details

#auto_redundant? ⇒ Boolean

Returns:

  • (Boolean)

# File 'lib/arachni/option_groups/scope.rb', line 245

def auto_redundant?
    @auto_redundant_paths.to_i > 0
end

#auto_redundant_counter ⇒ Object

# File 'lib/arachni/option_groups/scope.rb', line 249

def auto_redundant_counter
    @auto_redundant_counter ||= Hash.new( 0 )
end

#crawl ⇒ Object

# File 'lib/arachni/option_groups/scope.rb', line 257

def crawl
    self.page_limit = nil
end

#crawl? ⇒ Boolean

Returns:

  • (Boolean)

# File 'lib/arachni/option_groups/scope.rb', line 261

def crawl?
    !page_limit || page_limit != 0
end

#do_not_crawl ⇒ Object

# File 'lib/arachni/option_groups/scope.rb', line 253

def do_not_crawl
    self.page_limit = 0
end
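
The `crawl`, `crawl?` and `do_not_crawl` methods all work through `page_limit`: a limit of `0` disables crawling and `nil` removes the limit. The stand-in class below copies those three method bodies so the interaction can be tried in isolation; `ScopeSketch` itself is hypothetical and omits everything else the real option group provides.

```ruby
# Hypothetical stand-in for the option group, holding only `page_limit`.
class ScopeSketch
  attr_accessor :page_limit

  def crawl
    self.page_limit = nil
  end

  def do_not_crawl
    self.page_limit = 0
  end

  def crawl?
    !page_limit || page_limit != 0
  end
end

scope = ScopeSketch.new
puts scope.crawl?   # prints "true": no limit set, crawling is enabled

scope.do_not_crawl
puts scope.crawl?   # prints "false": a limit of 0 disables crawling

scope.crawl
puts scope.crawl?   # prints "true": clearing the limit enables it again
```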

#dom_event_limit_reached?(count) ⇒ Boolean

Returns:

  • (Boolean)

# File 'lib/arachni/option_groups/scope.rb', line 269

def dom_event_limit_reached?( count )
    dom_event_limit && count >= dom_event_limit
end

#page_limit_reached?(count) ⇒ Boolean

Returns:

  • (Boolean)

# File 'lib/arachni/option_groups/scope.rb', line 265

def page_limit_reached?( count )
    page_limit && page_limit.to_i > 0 && count >= page_limit
end
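
The two `*_limit_reached?` predicates treat a limit of `0` differently, which is easy to miss when reading them side by side. The sketch below restates both as standalone methods that take the limit as an argument (a hypothetical signature, chosen so they can run without the option group) and prints the edge cases.

```ruby
# Standalone restatements of the predicates above, taking the limit explicitly.
def dom_event_limit_reached?( limit, count )
  limit && count >= limit
end

def page_limit_reached?( limit, count )
  limit && limit.to_i > 0 && count >= limit
end

p page_limit_reached?( nil, 5 )      # prints "nil": no limit, never reached
p page_limit_reached?( 0, 5 )        # prints "false": 0 is handled by #crawl? instead
p page_limit_reached?( 5, 5 )        # prints "true"
p dom_event_limit_reached?( 0, 0 )   # prints "true": a limit of 0 is reached at once
p dom_event_limit_reached?( nil, 9 ) # prints "nil"
```

Both methods return `nil` rather than `false` when no limit is set, which is falsy in conditionals but worth knowing if the result is compared with `==`.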

#to_rpc_data ⇒ Object

# File 'lib/arachni/option_groups/scope.rb', line 299

def to_rpc_data
    d = super

    d['exclude_file_extensions'] = d['exclude_file_extensions'].to_a

    %w(redundant_path_patterns url_rewrites).each do |k|
        d[k] = d[k].inject({}) { |h, (k2, v)| h.merge k2.source => v }
    end

    %w(exclude_path_patterns exclude_content_patterns include_path_patterns).each do |k|
        d[k] = d[k].map(&:source)
    end

    d
end
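
The conversions in `to_rpc_data` replace `Regexp` objects with their source strings so the options can be serialized. The sketch below applies the same idea to a plain Hash; `regexps_to_sources` is a hypothetical helper that assumes every Array holds only `Regexp` objects and every Hash is keyed by `Regexp`.

```ruby
# Hypothetical helper applying the same kind of conversion to a plain Hash.
def regexps_to_sources( data )
  data.each_with_object( {} ) do |(key, value), out|
    out[key] =
      case value
      when Hash  then value.map { |pattern, v| [pattern.source, v] }.to_h
      when Array then value.map( &:source )
      else value
      end
  end
end

options = {
  'exclude_path_patterns'   => [/logout/, /admin/i],
  'redundant_path_patterns' => { /calendar/ => 5 }
}

p regexps_to_sources( options )
```

Note that `Regexp#source` returns only the pattern text, so flags such as `i` in `/admin/i` are not carried over by this conversion.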