yara-normalize
Normalizes YARA signatures into a repeatable, stable hash even when non-semantic changes are made (whitespace, comments, tag ordering, variable renaming, etc.).
To enable consistent comparisons between YARA rules, a uniform fingerprinting standard is applied:
- *Strings section*: each string value (the part after the ‘=’) is extracted, the values are sorted alphabetically, and the sorted list is hashed with SHA-256. Variable names ($a, $mshtmlExec_1, …) are excluded from the hash so that renaming does not change the fingerprint.
- *Condition section*: variable references ($name, #name) are replaced with positional tokens ($0, $1, …) in order of first appearance, so cosmetic renames do not affect the hash. The resulting text is hashed with SHA-256.
The rule fingerprint is:
yn<VERSION>:<last-16-hex-chars-of-strings-SHA256>:<last-10-hex-chars-of-condition-SHA256>
Prior to version 0.4.0 the fingerprint used MD5 and carried the prefix yn01. Since 0.4.0 the fingerprint uses SHA-256 and carries the prefix yn02. The two identifier series are not interchangeable.
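The scheme above can be sketched in a few lines of plain Ruby. This is only an illustration, not the gem's internal code: the module name FingerprintSketch and its methods are hypothetical, and the join separator, the whitespace handling, and the choice to keep the $ or # sigil on positional tokens are assumptions. The fingerprints it prints will therefore not match those produced by the gem.

```ruby
require 'digest'

# Hypothetical helper, written for illustration only.
module FingerprintSketch
  VERSION_TAG = 'yn02'

  # strings: array of lines such as '$a = "wtoi" nocase'.
  # Keeps only the part after the first '=', sorts, then hashes.
  # Joining with a newline is an assumption of this sketch.
  def self.strings_digest(strings)
    values = strings.map { |s| s.split('=', 2).last.strip }
    Digest::SHA256.hexdigest(values.sort.join("\n"))
  end

  # Replaces $name / #name with positional tokens in order of first appearance.
  def self.normalize_condition(condition)
    seen = {}
    condition.gsub(/([$#])(\w+)/) do
      sigil = Regexp.last_match(1)
      name  = Regexp.last_match(2)
      seen[name] ||= seen.size
      "#{sigil}#{seen[name]}"
    end
  end

  # Assembles <prefix>:<last 16 hex chars>:<last 10 hex chars>.
  def self.fingerprint(strings, condition)
    s = strings_digest(strings)
    c = Digest::SHA256.hexdigest(normalize_condition(condition))
    "#{VERSION_TAG}:#{s[-16, 16]}:#{c[-10, 10]}"
  end
end

# Same rule body with variables renamed and strings reordered.
a = FingerprintSketch.fingerprint(['$a = "wtoi" nocase', '$b = "wtol" nocase'], '$a and #b > 2')
b = FingerprintSketch.fingerprint(['$y = "wtol" nocase', '$x = "wtoi" nocase'], '$x and #y > 2')
puts a == b # => true
```

Because the variable names never reach either digest, the two calls produce the same fingerprint even though every identifier differs.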
Usage
require 'yara-normalize'
sig = <<~EOS
rule DataConversion__wide : IntegerParsing DataConversion {
meta:
weight = 1
strings:
$ = "wtoi" nocase
$ = "wtol" nocase
$ = "wtof" nocase
$ = "wtodb" nocase
condition:
any of them
}
EOS
yn = YaraTools::YaraRule.new(sig)
puts yn.hash
# => yn02:6783b7082bed88dc:6821e3f6a3
puts yn.name # => DataConversion__wide
pp yn.tags # => ["IntegerParsing", "DataConversion"]
pp yn.meta # => {"weight"=>"1"}
pp yn.strings # => ["$ = \"wtoi\" nocase", ...]
puts yn.normalize
# => rule DataConversion__wide : IntegerParsing DataConversion {
# meta:
# weight = 1
# strings:
# $ = "wtoi" nocase
# $ = "wtol" nocase
# $ = "wtof" nocase
# $ = "wtodb" nocase
# condition:
# any of them
# }
Splitting a multi-rule file:
rules = YaraTools::Splitter.split(File.read("ruleset.yar"))
rules.each { |r| puts "#{r.name}: #{r.hash}" }
Security notes
- Fingerprints use SHA-256 (as of yn02). MD5-based yn01 hashes should be considered legacy and recomputed.
- YaraRule#hash overrides Ruby’s Object#hash. Do not use YaraRule objects as Hash keys; the method returns a String fingerprint, not the Integer that Ruby’s Hash tables require.
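One way to work within the Hash-key caveat is to key lookups by the fingerprint String itself rather than by the rule object. The sketch below uses a hypothetical StandInRule class in place of YaraTools::YaraRule so that it runs without the gem; the fingerprint values are made-up placeholders, not real hashes.

```ruby
# Stand-in for YaraTools::YaraRule (hypothetical, for illustration only).
# Only #name and a String-returning #hash are modeled.
class StandInRule
  attr_reader :name

  def initialize(name, fingerprint)
    @name = name
    @fingerprint = fingerprint
  end

  # Mirrors the documented behavior: returns a String, not an Integer.
  def hash
    @fingerprint
  end
end

rules = [
  StandInRule.new('RuleA',      'yn02:0000000000000001:000000000a'),
  StandInRule.new('RuleA_copy', 'yn02:0000000000000001:000000000a'),
  StandInRule.new('RuleB',      'yn02:0000000000000002:000000000b')
]

# The keys are fingerprint Strings; the rule objects are only stored as values.
by_fingerprint = rules.group_by(&:hash)

# Report fingerprints shared by more than one rule.
duplicates = by_fingerprint.select { |_fp, group| group.size > 1 }
duplicates.each { |fp, group| puts "#{fp}: #{group.map(&:name).join(', ')}" }
# => yn02:0000000000000001:000000000a: RuleA, RuleA_copy
```

With real rules, the same grouping should apply to the objects returned by YaraTools::Splitter.split, which makes it a simple way to spot duplicate signatures in a ruleset.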
Contributing to yara-normalize
- Check out the latest master to make sure the feature hasn’t been implemented or the bug hasn’t been fixed yet.
- Check out the issue tracker to make sure someone hasn’t already requested it and/or contributed it.
- Fork the project.
- Start a feature/bugfix branch.
- Commit and push until you are happy with your contribution.
- Make sure to add tests for it. This is important so I don’t break it in a future version unintentionally.
- Please try not to mess with the Rakefile, version, or history.
Copyright
Copyright © 2012 chrislee35. See LICENSE.txt for further details.