Class: YaraTools::YaraRule
- Inherits: Object
- Inheritance: Object → YaraTools::YaraRule
- Defined in: lib/yara-normalize/yara-normalize.rb
Instance Attribute Summary
- #condition ⇒ Object (readonly): Returns the value of attribute condition.
- #meta ⇒ Object (readonly): Returns the value of attribute meta.
- #name ⇒ Object (readonly): Returns the value of attribute name.
- #normalized_strings ⇒ Object (readonly): Returns the value of attribute normalized_strings.
- #original ⇒ Object (readonly): Returns the value of attribute original.
- #strings ⇒ Object (readonly): Returns the value of attribute strings.
- #tags ⇒ Object (readonly): Returns the value of attribute tags.
Instance Method Summary
- #_normalize_condition(condition) ⇒ Object: Replace named variable references in a condition line with positional tokens so that renaming $mshtmlExec_1 → $a does not change the hash.
- #hash ⇒ Object: Return a stable identifier for this rule in the form: yn<VERSION>:<strings_fingerprint>:<condition_fingerprint>.
- #initialize(ruletext) ⇒ YaraRule (constructor): A new instance of YaraRule.
- #normalize ⇒ Object: Return a canonical, human-readable rendering of the rule with consistent indentation and ordering.
Constructor Details
#initialize(ruletext) ⇒ YaraRule
Returns a new instance of YaraRule.
# File 'lib/yara-normalize/yara-normalize.rb', line 12
def initialize(ruletext)
  # Normalize line endings and strip single-line (//) comments before
  # any further parsing so they never appear in meta/strings/condition.
  ruletext = ruletext.gsub(/[\r\n]+/, "\n").gsub(/^\s*\/\/.*$/, '')
  @original = ruletext

  # Lookup table used by _normalize_condition to replace variable names
  # ($foo, #foo) with stable positional tokens ($0, $1, …) so that
  # cosmetic renames do not affect the normalized condition hash.
  @lookup_table = {}
  @next_replacement = 0

  # Single-pass regex parse. The rule grammar is:
  #   rule <name> [: <tags>] { [meta: …] strings: … condition: … }
  # The .*? quantifiers are non-greedy so they stop at the first matching
  # delimiter keyword rather than consuming the whole file.
  rule_re = /rule\s+([\w\-]+)(\s*:\s*(\w[\w\s]+\w))?\s*\{\s*(meta:\s*(.*?))?strings:\s*(.*?)\s*condition:\s*(.*?)\s*\}/m

  if ruletext =~ rule_re
    # Captures 2 and 4 are the optional wrapper groups around tags and meta.
    name, _, tags, _, meta, strings, condition = $~.captures
    @name = name

    # Tags are optional; split on whitespace/commas when present.
    @tags = tags.strip.split(/[,\s]+/) if tags

    # Parse the meta section into a key/value Hash. Each line has the
    # form: key = value (value may contain spaces and quotes).
    @meta = {}
    if meta
      meta.split(/\n/).each do |m|
        k, v = m.strip.split(/\s*=\s*/, 2)
        @meta[k] = v if v
      end
    end

    # Parse the strings section, normalizing whitespace around '=' and
    # canonicalizing any hex byte strings (e.g. { 4D 5A } → { 4d 5a }).
    @normalized_strings = []
    @strings = strings.split(/\n/).map do |s|
      s = s.strip
      # Collapse any amount of whitespace around '=' to a single ' = '.
      s[/\s*=\s*/, 0] = " = " if s[/\s*=\s*/, 0]
      # Hex byte strings: normalise spacing and case so that
      # { 4D5A } and { 4d 5a } produce the same output.
      if s =~ /= \{([0-9a-fA-F\s]+)\}/
        hexstr = $1.gsub(/\s+/, '').downcase.scan(/../).join(" ")
        s = s.gsub(/= \{([0-9a-fA-F\s]+)\}/, "= { #{hexstr} }")
      end
      # Collect only the value portion (right of ' = ') for hashing,
      # so that variable renames ($a → $b) do not change the hash.
      _, val = s.split(/ = /, 2)
      @normalized_strings << (val || s)
      s
    end
    @normalized_strings.sort!

    @condition = condition.split(/\n/).map(&:strip)
    @normalized_condition = @condition.map { |x| _normalize_condition(x) }
  end
end
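To make the parsing steps concrete, the following is a minimal standalone sketch, not part of the library. It applies the same rule pattern to a made-up sample rule and runs one strings line through a hypothetical helper, `normalize_string_line`, that mirrors the whitespace and hex-byte normalization described in the constructor comments. The constant names, the helper, and the sample rule are all assumptions introduced for illustration.

```ruby
# Same pattern as rule_re in the constructor listing.
RULE_RE = /rule\s+([\w\-]+)(\s*:\s*(\w[\w\s]+\w))?\s*\{\s*(meta:\s*(.*?))?strings:\s*(.*?)\s*condition:\s*(.*?)\s*\}/m

# Hypothetical sample rule, invented for this illustration.
SAMPLE_RULE = <<~'YARA'
  rule demo_rule : trojan dropper {
    meta:
      author = "example"
    strings:
      $mz = {4D5A 90}
    condition:
      $mz
  }
YARA

# Hypothetical helper (not in the library): put a single ' = ' around the
# first '=' and rewrite a plain hex byte string as lowercase,
# space-separated byte pairs.
def normalize_string_line(line)
  s = line.strip.sub(/\s*=\s*/, " = ")
  s.sub(/= \{([0-9a-fA-F\s]+)\}/) do
    bytes = Regexp.last_match(1).gsub(/\s+/, "").downcase.scan(/../)
    "= { #{bytes.join(' ')} }"
  end
end

# Seven captures: name, tag wrapper, tags, meta wrapper, meta, strings, condition.
CAPTURES = SAMPLE_RULE.match(RULE_RE).captures

puts CAPTURES[0]                         # demo_rule
puts CAPTURES[2]                         # trojan dropper
puts normalize_string_line(CAPTURES[5])  # $mz = { 4d 5a 90 }
```

Captures at index 1 and 3 are the optional wrapper groups, which is why the constructor discards them with `_` when destructuring.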
Instance Attribute Details
#condition ⇒ Object (readonly)
Returns the value of attribute condition.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def condition
  @condition
end
#meta ⇒ Object (readonly)
Returns the value of attribute meta.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def meta
  @meta
end
#name ⇒ Object (readonly)
Returns the value of attribute name.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def name
  @name
end
#normalized_strings ⇒ Object (readonly)
Returns the value of attribute normalized_strings.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def normalized_strings
  @normalized_strings
end
#original ⇒ Object (readonly)
Returns the value of attribute original.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def original
  @original
end
#strings ⇒ Object (readonly)
Returns the value of attribute strings.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def strings
  @strings
end
#tags ⇒ Object (readonly)
Returns the value of attribute tags.
# File 'lib/yara-normalize/yara-normalize.rb', line 10
def tags
  @tags
end
Instance Method Details
#_normalize_condition(condition) ⇒ Object
Replace named variable references in a condition line with positional tokens so that renaming $mshtmlExec_1 → $a does not change the hash. The sigil, count (#) or match ($), is preserved; the token is keyed on the name without its sigil, so $foo and #foo map to the same number. NOTE: The leading underscore marks this method as an internal implementation detail; do not call it from outside this class.
# File 'lib/yara-normalize/yara-normalize.rb', line 81
def _normalize_condition(condition)
  condition.gsub(/[\$\#]\w+/) do |x|
    # Key on the variable name without its leading sigil.
    key = x[1, 1000]
    # Assign the next positional number the first time a name is seen.
    @lookup_table[key] ||= begin
      val = @next_replacement.to_s
      @next_replacement += 1
      val
    end
    # Re-attach the original sigil ($ or #) to the positional token.
    x[0].chr + @lookup_table[key]
  end
end
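The effect of the positional replacement can be illustrated with a small standalone sketch. The function `positional_condition` below is a hypothetical stand-in written for this example; it keeps its lookup table in a local Hash rather than in instance state, but follows the same idea of numbering names in order of first appearance.

```ruby
# Hypothetical standalone version of the renaming idea (not the library API).
# Names are numbered in order of first appearance across all lines.
def positional_condition(lines)
  table = {}
  lines.map do |line|
    line.gsub(/[\$\#]\w+/) do |ref|
      sigil = ref[0]
      name = ref[1..-1]
      # First sighting of a name takes the next free number.
      table[name] ||= table.size.to_s
      "#{sigil}#{table[name]}"
    end
  end
end

long_names  = positional_condition(['$mshtmlExec_1 and #mshtmlExec_1 > 2', '$other'])
short_names = positional_condition(['$a and #a > 2', '$b'])

p long_names                  # ["$0 and #0 > 2", "$1"]
p long_names == short_names   # true
```

Because numbering depends only on the order in which names first appear, two conditions that differ only in variable naming normalize to the same lines.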
#hash ⇒ Object
Return a stable identifier for this rule in the form:
yn<VERSION>:<strings_fingerprint>:<condition_fingerprint>
The strings fingerprint is the last 16 hex chars of the SHA-256 digest of the sorted, normalised string values joined by ‘%’. The condition fingerprint is the last 10 hex chars of the SHA-256 digest of the normalised condition lines joined by ‘%’.
SHA-256 replaces the previously used MD5 and avoids MD5’s practical collision attacks. Note that both fingerprints are truncated digests (64 bits for the strings, 40 bits for the condition), so they serve as compact identifiers and do not carry the collision resistance of a full SHA-256 digest.
SECURITY NOTE: This method is named `hash` to match the public API, but it overrides Ruby’s built-in Object#hash, which is expected to return an Integer for use as a Hash table key. Do NOT use YaraRule objects as Hash keys; call #hash only to obtain the YARA rule fingerprint.
# File 'lib/yara-normalize/yara-normalize.rb', line 134
def hash
  normalized_strings = @normalized_strings.join("%")
  normalized_condition = @normalized_condition.join("%")
  strings_digest = Digest::SHA256.hexdigest(normalized_strings)
  condition_digest = Digest::SHA256.hexdigest(normalized_condition)
  "yn#{VERSION}:#{strings_digest[-16, 16]}:#{condition_digest[-10, 10]}"
end
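As a rough illustration of how the identifier is assembled, the sketch below recomputes a fingerprint from plain arrays. The function name `rule_fingerprint` and the `version:` placeholder value are assumptions made for this example; the real method reads the library's VERSION constant and the instance's already-normalized data.

```ruby
require 'digest'

# Hypothetical standalone fingerprint builder (not the library API).
# `version` is a placeholder; the library interpolates its own VERSION.
def rule_fingerprint(string_values, condition_lines, version: "X")
  # Sorting makes the strings fingerprint independent of declaration order.
  strings_digest   = Digest::SHA256.hexdigest(string_values.sort.join("%"))
  condition_digest = Digest::SHA256.hexdigest(condition_lines.join("%"))
  # Keep only the tail of each hex digest: 16 and 10 characters.
  "yn#{version}:#{strings_digest[-16, 16]}:#{condition_digest[-10, 10]}"
end

a = rule_fingerprint(['{ 4d 5a }', '"evil"'], ['$0 and $1'])
b = rule_fingerprint(['"evil"', '{ 4d 5a }'], ['$0 and $1'])

puts a
puts a == b   # true: string order does not affect the fingerprint
```

The condition lines are deliberately not sorted, since their order is part of the rule's logic.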
#normalize ⇒ Object
Return a canonical, human-readable rendering of the rule with consistent indentation and ordering. Sections are emitted in a fixed order (tags, meta, strings, condition); entries within each section keep their original order, and string or condition lines that contain no word characters are dropped.
# File 'lib/yara-normalize/yara-normalize.rb', line 96
def normalize
  text = "rule #{@name} "
  text += ": #{@tags.join(' ')} " if @tags && !@tags.empty?
  text += "{\n"
  if @meta && !@meta.empty?
    text += " meta:\n"
    @meta.each { |k, v| text += " #{k} = #{v}\n" }
  end
  if @strings && !@strings.empty?
    text += " strings:\n"
    @strings.each { |s| text += " #{s}\n" if s =~ /\w/ }
  end
  if @condition && !@condition.empty?
    text += " condition:\n"
    @condition.each { |c| text += " #{c}\n" if c =~ /\w/ }
  end
  text + "}"
end