Important notice: this blog is for educational purposes only. Do not use these techniques against targets without explicit permission of the system owner. This blog is part of the Blue Hat Hackers group insights.
This blog is about an investigation into antivirus detection methods and how well known techniques to evade them perform nowadays (6 years after they were first investigated). Don’t worry: read on and it will all be explained. Let’s dive into it!
Imagine you’re a hacker. You want to deploy your malware on a system. You are well prepared, so when your target downloads fun_game_i_made.exe from your website and clicks on it, it loads a fake game and voila: the target computer connects to your computer and you get a shell. You now have complete access to it. At least, that was the plan. In reality the user just got a notification from an antivirus product saying: This is malware, I will safely remove it from your computer.
To create malware and make it land successfully without being detected, you need to know how antivirus software works and how you can get around its clever detection mechanisms. This blog takes a look at what antivirus products do exactly, and how evasion of these products works.
First of all – why would you want to make undetectable malware? We are White (Blue!) Hat hackers after all, and do not intend to harm anyone. Two reasons: we want to simulate cybercriminals when we train companies how to defend against real cybercriminals (in Red Teaming exercises, for example) and when breaking an antivirus system we can identify its weaknesses to improve it (as with all offensive cybersecurity).
Let’s look at a bit of history to understand where malware came from. The first virus, called the ‘Creeper’ virus, was already around in the early seventies. It was a program which spread over ARPANET (the old internet), printing messages on teletype machines wherever it went (see image). This is also when the first ‘antivirus’ software was created: the ‘Reaper’, which spread like a virus itself in order to remove the ‘Creeper’ from machines.
From there it got worse and worse. Programmers got to know how viruses work and what you can do with them.
At first, some ‘innocent’ viruses spread more than expected. And gradually, as more viruses appeared – and also the first real malware – the need for antivirus software grew as well, and between 1990 and 2000 it became a complete industry.
Inner workings of antivirus software
So how do they work? At first, the method was quite simple. When a new file is detected, a signature of it is compared to a large database of signatures of known malicious files. If it appears in the database, it is considered malicious and the file is removed or quarantined. This worked quite well for widespread malware and for viruses that spread by copying themselves. When a new malicious program is found, it is analyzed by the antivirus company and its signature is added to the database.
It is, however, easy to see how this can fail. When customized malware is created for a specific target, it can easily go unnoticed since its signature is not (yet) in the databases of antivirus products. It gets even more complicated when viruses do not copy themselves exactly, but modify themselves so that they do the same thing with slightly different code.
In response, antivirus products have evolved, and most of them now combine a dozen or so methods to detect malicious software. An important breakthrough in this respect is the use of heuristics, which also enable the detection of malicious files that are not exact matches of a known sample. Think about searching with wildcards, or even with regular expressions. When a virus replicates and, for example, adds random garbage to the end of its code, it can still be caught by a pattern like `[important virus feature] [random data]`. This can also work for virus variations that do not exist yet. A beautiful example is YARA, with which you can create descriptions of files. You can check it out at http://virustotal.github.io/yara/ and below is an image taken from their website to give you an idea how it works.
Yara example – source: http://virustotal.github.io/yara/
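To make the difference between exact signatures and heuristics a bit more concrete, here is a minimal Python sketch from the defender's side. It is a toy and not how any real antivirus product or YARA works: the sample bytes, the `MARKER-` pattern and all function names are made up for illustration.

```python
import hashlib
import re

# Hypothetical "known bad" sample; a real database holds millions of entries.
KNOWN_BAD_SAMPLE = b"MARKER-1234 payload"

# Signature database: SHA-256 digests of files already known to be malicious.
SIGNATURE_DB = {hashlib.sha256(KNOWN_BAD_SAMPLE).hexdigest()}

# Heuristic rule (assumption): a characteristic marker, followed by anything.
HEURISTIC_RULES = [re.compile(rb"MARKER-\d{4}")]


def signature_match(data: bytes) -> bool:
    """Exact match: the digest of the whole file must be in the database."""
    return hashlib.sha256(data).hexdigest() in SIGNATURE_DB


def heuristic_match(data: bytes) -> bool:
    """Pattern match: any rule that occurs somewhere in the file is a hit."""
    return any(rule.search(data) for rule in HEURISTIC_RULES)


def scan(data: bytes) -> str:
    """Return a verdict string for the given file contents."""
    if signature_match(data):
        return "known signature"
    if heuristic_match(data):
        return "heuristic match"
    return "clean"
```

Scanning `KNOWN_BAD_SAMPLE` hits the signature database. Appending random garbage to it changes the hash, so the signature check misses it, but the heuristic rule still fires on the marker.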
It is, however, still possible to see how this can be evaded. Even when using heuristics, customized malware can still be very difficult to detect. There are even viruses which can manipulate themselves in such a way that they look completely different by using encryption and changing the decryption part. This is called polymorphic code – a very interesting topic (but unfortunately out of scope for this blog). This means that the antivirus products need more than just static analysis, since there are only limited possibilities when it comes to just looking at the code. Machine learning could help, or maybe more advanced methods to define heuristics. This is where dynamic analysis comes in – and where it gets more interesting.
So far, the detection techniques discussed all rely on static analysis of the file. With dynamic analysis you run the file and check what it does. You can then look for suspicious patterns such as encrypting a lot of files, connecting to an unusual host, etcetera. Of course, this dynamic analysis happens in heavily restricted sandbox environments to minimize the risk of the file causing harm to the system. Let’s look at an example of what such an analysis could look like.
First let’s try to create a very well-known piece of malware: a reverse meterpreter shell. We can just generate one with the msfvenom command. Of course this gets detected by static analysis immediately. Next step: use some encryption. We encrypt the reverse meterpreter shell and write a small piece of code to decrypt it at runtime and then run it, all in one single executable file. Let’s assume our decryption code is unique and the encrypted meterpreter shell uses non-deterministic encryption (different every time). Static analysis should now not (easily) be able to detect the meterpreter shell. But dynamic analysis can run this file in a sandbox, where it decrypts itself in memory and then executes that memory. At this point, the easily detectable meterpreter shell sits unencrypted in system memory, so both the behavior and the memory can be analyzed, which makes it probable that it will be detected.
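From the sandbox's point of view, the behavioral part of such an analysis could be sketched as scoring a trace of observed events. The Python snippet below is only an illustration under my own assumptions: the event names, weights and threshold are hypothetical and not taken from any real product.

```python
# Hypothetical weights for events a sandbox might observe while running a file.
SUSPICIOUS_EVENTS = {
    "write_executable_memory": 3,   # data is written to memory marked executable
    "execute_written_memory": 4,    # freshly written memory is then executed
    "mass_file_encryption": 5,      # many files are encrypted in a short time
    "connect_unknown_host": 2,      # outbound connection to an unknown host
}

# Assumed cut-off above which the file is flagged.
THRESHOLD = 6


def score_trace(events: list[str]) -> int:
    """Sum the weights of all suspicious events; unknown events count as 0."""
    return sum(SUSPICIOUS_EVENTS.get(event, 0) for event in events)


def verdict(events: list[str]) -> str:
    """Flag the trace when its total score reaches the threshold."""
    return "suspicious" if score_trace(events) >= THRESHOLD else "no finding"
```

A trace such as `["open_file", "write_executable_memory", "execute_written_memory"]`, which is roughly what the decrypt-and-run executable above would produce, ends up over the threshold, while a single outbound connection does not.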
To the attack!
There are different known methods to circumvent dynamic analysis. The main idea of those methods is to (ab)use the differences between a normal system and the sandbox environment of the antivirus software. I used the ideas described by Sevagas in the following blog/paper: https://blog.sevagas.com/?Bypass-Antivirus-Dynamic-Analysis. Spoiler alert: the other blog posts there are also interesting to read! Below I created an overview of some methods. These methods abuse the fact that the sandbox environment can look different from a normal system (variables, internet connection, etc.) and the fact that the use of expensive operations must be limited (otherwise a scan of a file would take ages or use too many resources).
|Method|Idea|
|---|---|
|Allocate and fill 100M memory|Just allocate a lot of memory. The idea is that a sandbox does not want to execute the memory allocation because it is expensive.|
|Hundred million increments|Do a lot of operations. The idea is that a sandbox stops executing after a certain number of operations.|
|Attempt to open a system process|A sandbox limits the capabilities of the file, so it is probably not allowed to open system processes.|
|Attempt to open a non-existing URL|Sandboxes probably do not allow internet access. To simulate it, the sandbox could return a fake webpage, possibly for non-existent URLs as well.|
|Action which depends on local username|The local username is probably not present in the sandbox. If the username is known to the attacker, this can be used.|
|What the **** is NUMA?|Use a very specific function related to NUMA (non-uniform memory access) that is probably not supported by the sandbox environment.|
|What the **** are FLS?|Use a very specific function related to FLS (fiber local storage) that is probably not supported by the sandbox environment.|
|Check process memory|Apparently, retrieving information about the process memory can return different results when in a sandbox.|
|Time distortion|Time-intensive processes are generally avoided by sandboxes (otherwise a scan of a file could take ages), for example by simply skipping a sleep command. The skipped time can be measured to detect a sandbox.|
|What is my name?|A sandbox probably does not contain the same environment variables as a real system. As such, a process is probably not spawned under its own name, which we can detect.|
I was wondering how well these methods have held up over time. In the paper (https://blog.sevagas.com/?Bypass-Antivirus-Dynamic-Analysis) they were investigated over 6 years ago with very good results. I ran the experiments again to see how they perform now. To do this, I created a basic meterpreter reverse shell payload with msfvenom, which was easily detected by antivirus products. I then used this tool (https://github.com/yashmundra/Shellcode-Encryption) to encrypt the payload, and put it into a C++ file, including a small piece of code to decrypt and run it at runtime. Note that this is a bit different from the setup in the Sevagas paper. At this point, I have the base case which I can compile and test. Then, for every method described above, I added the code to the C++ file, compiled it and tested it as well. To test the binaries, I used Virustotal (https://www.virustotal.com/gui/), which tests the file against quite a number of antivirus products. A small side note here is that they do not guarantee that the settings of the antivirus products are the same as on all systems, but it should give us an idea of how well each method works.
|Method|Detections on Virustotal|
|---|---|
|Base case without dynamic analysis protection|21/69|
|Allocate and fill 100M memory|17/69|
|Hundred million increments|18/69|
|Attempt to open a system process|20/69|
|Attempt to open a non-existing URL|22/69|
|Action which depends on local username|16/69|
|What the **** is NUMA?|18/69|
|What the **** are FLS?|19/69|
|Check process memory|7/69|
|What is my name?|19/69|
Check process memory score on Virustotal – source: https://www.virustotal.com/
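To compare the results more easily, the scores can be turned into percentages and into a difference with the base case. The small Python sketch below does just that. The numbers are copied from the table; reading the `69` as the number of antivirus engines that scanned each file is my assumption, and the function name is made up.

```python
# Assumption: 69 is the number of engines that scanned each file.
TOTAL_ENGINES = 69
BASE_CASE = "Base case without dynamic analysis protection"

# Detections per method, copied from the results table.
DETECTIONS = {
    BASE_CASE: 21,
    "Allocate and fill 100M memory": 17,
    "Hundred million increments": 18,
    "Attempt to open a system process": 20,
    "Attempt to open a non-existing URL": 22,
    "Action which depends on local username": 16,
    "What the **** is NUMA?": 18,
    "What the **** are FLS?": 19,
    "Check process memory": 7,
    "What is my name?": 19,
}


def summarize() -> list[tuple[str, int, float, int]]:
    """Return (method, detections, percentage, delta vs. base case), lowest first."""
    base = DETECTIONS[BASE_CASE]
    rows = [
        (name, hits, round(100 * hits / TOTAL_ENGINES, 1), hits - base)
        for name, hits in DETECTIONS.items()
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    for name, hits, percent, delta in summarize():
        print(f"{name:50s} {hits:2d}/{TOTAL_ENGINES} {percent:5.1f}% {delta:+d}")
```

Sorting by the number of detections puts the method that evades the most products at the top and makes it easy to spot methods that score worse than the base case.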
First of all, quite a number of antivirus solutions still detect the payload despite these methods. The methods may have scored very well more than 6 years ago, but nowadays it is harder to remain undetected. I think this is mainly due to smarter sandboxes and checks on the behavior of the file. For example, it is rather suspicious when an executable decrypts some code and then executes it. On the other hand, I am surprised that the ‘Check process memory’ method has such a low detection rate (7/69, against 21/69 for the base case). So the method clearly has an effect, but not enough to stay under everyone’s radar. Still, it is fun to play with and I am sure more advanced methods will work even better. Obviously, you could even combine different methods. And then, of course, there is still the challenge that modern antivirus software also checks the file when the user is actually executing it, making it even harder to look legitimate. So, in conclusion, antivirus products are not the holy grail, but they help keep a lot of malicious software away from your system.
For any further queries, please contact Jasper Boot.