Thursday, May 31, 2018

Signature Patterns Parsing Efficiency

I convert binary files into hex strings and search them for specific patterns that the user has provided, the same way anti-virus tools handle their signature databases. If a pattern is found, the search returns true.

The two difficulties I'm facing are wildcard support and slow scanning speed. The user has thousands of patterns, each up to 200 characters long or even more.

For example, this pattern is used to check whether a file was compiled with a C++ compiler. Each "?" character is a wildcard that matches any one hex character, so "??" matches any byte:

55 8B EC 53 8B 5D 08 56 8B 75 0C 85 F6 57 8B 7D 10 ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 01
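To make the format concrete, here is a minimal sketch of how one pattern line could be split into tokens and parsed once up front. `PatternParser` and `Parse` are hypothetical names used only for illustration, and the sketch assumes that any token containing "?" is a whole-byte wildcard:

```csharp
using System;
using System.Globalization;

static class PatternParser
{
    // Hypothetical helper: converts "55 8B ?? 01" into { 0x55, 0x8B, null, 0x01 },
    // where null marks a wildcard byte.
    public static byte?[] Parse(string pattern)
    {
        string[] tokens = pattern.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new byte?[tokens.Length];
        for (int i = 0; i < tokens.Length; i++)
        {
            // Assumption: any token containing '?' is treated as a full-byte wildcard.
            result[i] = tokens[i].Contains("?")
                ? (byte?)null
                : byte.Parse(tokens[i], NumberStyles.HexNumber);
        }
        return result;
    }
}
```

With this representation, the hex text of each token is parsed a single time per pattern rather than on every comparison.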

Patterns like that one, of varying lengths, are all stored together in a single file, so I guess you get the general idea.

This is the code that I'm using. It works fine, but it is extremely slow compared with other tools, like ExeInfoPE or Die, that do pattern scanning in mere seconds:

    // Returns true if every non-wildcard byte of the mask matches the buffer at the given position.
    static bool compare(string[] mask, byte[] buffer, int position)
    {
        // the pattern cannot match if it would run past the end of the buffer
        if (position < 0 || position + mask.Length > buffer.Length)
            return false;

        for (var i = 0; i < mask.Length; i++) // loop through the mask
        {
            if (mask[i].Contains("?")) // wildcard: skip the comparison
                continue;

            // parse the hex token and compare it with the current byte
            if (byte.Parse(mask[i], System.Globalization.NumberStyles.HexNumber) != buffer[position + i])
                return false; // mismatch, pattern not found at this position
        }
        return true; // pattern was found
    }
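For context, a comparison function like this is called once per candidate offset in the file. The following is a minimal sketch of such a driver loop, not my actual code: `Scanner`, `MatchesAt` and `Contains` are hypothetical names, and it assumes the mask has already been parsed into a `byte?[]` in which null marks a wildcard:

```csharp
using System;

static class Scanner
{
    // True if the pre-parsed pattern matches the buffer at the given offset.
    static bool MatchesAt(byte?[] pattern, byte[] buffer, int position)
    {
        for (int i = 0; i < pattern.Length; i++)
        {
            // null is a wildcard and matches any byte
            if (pattern[i].HasValue && pattern[i].Value != buffer[position + i])
                return false;
        }
        return true;
    }

    // Naive scan: tries every offset where the whole pattern still fits.
    public static bool Contains(byte[] buffer, byte?[] pattern)
    {
        for (int pos = 0; pos + pattern.Length <= buffer.Length; pos++)
        {
            if (MatchesAt(pattern, buffer, pos))
                return true;
        }
        return false;
    }
}
```

This is still the brute-force approach: every pattern is tried at every offset, so the total work grows with file size times the number of patterns.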

Any ideas on how to tremendously increase the speed, while maintaining support for wildcards so that my tool can be used in the real world?
