The malware delivery method pioneered by the threat actors behind the REvil ransomware and the Gootkit banking Trojan has been enjoying a renaissance of late, as telemetry indicates that criminals are using the method to deploy an array of malware payloads in South Korea, Germany, France, and across North America.
The Gootkit malware family has been around more than half a decade – a mature Trojan with functionality centered around banking credential theft. In recent years, almost as much effort has gone into improvement of its delivery method as has gone into the NodeJS-based malware itself.
In the past, Sophos and other security experts have bundled the discussion of the malware itself with analysis of the delivery mechanism, but as this method has been adopted to deliver a wider range of malicious code, we assert that this mechanism deserves scrutiny (and its own name), distinct from its payload, which is why we’ve decided to call it Gootloader.
In addition to the REvil and Gootkit payloads, Gootloader has been used most recently to deliver the Kronos Trojan and Cobalt Strike.
In its latest attempts to evade detection by endpoint security tools, Gootloader has moved as much of its infection infrastructure to a “fileless” methodology as possible. While it isn’t completely fileless, these techniques are effective at evading detection over a network – right up to the point where the malicious activity trips over behavioral detection rules.
Search engine deoptimization as root cause
Gootloader uses malicious search engine optimization (SEO) techniques to squirm into Google search results. The way it accomplishes this task deserves some discussion, because it centers as much around technology as human psychology.
To accomplish this phase of the attack, the operators of Gootloader must maintain a network of servers hosting hacked, legitimate websites (we estimate roughly 400 such servers are in operation at any given time). The example shown above belongs to a legitimate business, a neonatal medical practice based in Canada. None of the site’s legitimate content has anything to do with real estate transactions – its doctors deliver babies – and yet it is the first result to appear in a query about a very narrowly defined type of real estate agreement. Google itself indicates the result is not an ad, and that it has known about the site for nearly seven years. To the end user, the entire thing looks on the up-and-up.
When the visitor clicks through the link in this search result, they’re presented with another, very specific page that seems to deliver the answer to their exact question, using precisely the same wording as the search query (which sometimes comes across quite awkwardly).
And if that same site visitor clicks the “direct download link” provided on this page, they receive a .zip archive file with a filename that exactly matches the search query terms used in the initial search, which itself contains another file named in precisely the same way. This .js file is the initial infector, and the only stage of the infection at which a file is written to the filesystem. Everything that happens after the target double-clicks this script runs entirely in memory, out of the reach of traditional endpoint protection tools.
In our experience, many of these hacked sites serving the fake message board are running a well-known content management system, to which the threat actors make modifications that subtly rewrite how the contents of the website are presented to certain visitors, based on characteristics of the individual visitors (including how they arrive on the hacked site).
It isn’t clear how the threat actors gain access to the backend of these sites, but historically, these kinds of website compromises have resulted from any of a number of methods: the attackers may simply obtain the sites’ passwords from the Gootkit malware itself or from any of a number of criminal markets that trade in stolen credentials, or they may leverage any of a number of security exploits in the plugins or add-ons of the CMS software. The operators of the websites seem not to know their sites are being abused in this way.
Regardless of how the attackers access the websites, what they do next is to insert a few additional lines of code into the body of the web page. The elements where the attackers inject the code could be within one of the following div tags.
The modified code is a simple script tag that looks like this:
The server checks to see whether the conditions in which the page gets loaded meet the criteria Gootloader has been looking for. Notably, the script appears to inspect the User-Agent string in the GET request header information to determine whether the visitor’s computer is running an operating system with the specific language/localization preferences that the attackers have been targeting. It may also be using IP geolocation to determine whether the person browsing the site is doing so from within the territory the attackers are targeting.
Server side, the attacker also checks whether the Referer: header in the request indicates the page was loaded after the victim clicked a Google search result. (Our tests indicated that other search engines were not targeted, or were not targeted as frequently – or successfully – as Google’s.) These kinds of checks make it more difficult for a website owner to identify the problem with their own site.
In cases where the criteria are not met, the browser simply displays a normal-looking (but forged) web page, such as this blog post that starts out well, but spins into mostly-unintelligible word salad near the end:
If the right conditions are met (and there have been no previous visits to the website from the visitor’s IP address), the malicious code running server-side redraws the page to give the visitor the appearance that they have stumbled into a message board or blog comments area in which people are discussing precisely the same topic, using exactly the same terms the victim used in their search.
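For defenders trying to understand why a direct visit to a hacked page does not reproduce the fake forum, the checks described above can be modeled as a simple decision function. This is a minimal Python sketch, not the attackers’ server-side code: the function name, its parameters, and the substring-based language check are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def would_show_fake_forum(referer, user_agent, client_ip, seen_ips, target_langs):
    """Model the three gating conditions described in the text (illustrative only)."""
    # 1. The visitor must arrive by clicking a Google search result.
    host = (urlparse(referer or "").hostname or "").lower()
    from_google = host.startswith("www.google.") or host.startswith("google.")

    # 2. The User-Agent must hint at a targeted language/locale.
    #    Assumption: a plain substring match stands in for the real check.
    ua = (user_agent or "").lower()
    lang_match = any(tag.lower() in ua for tag in target_langs)

    # 3. There must be no previous visit from this IP address.
    first_visit = client_ip not in seen_ips

    return from_google and lang_match and first_visit
```

A site owner who types the URL directly (no search referrer), or who revisits from the same IP address, fails at least one condition and sees only the normal-looking page, which is part of what makes the compromise hard to spot.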
These fake forum posts include what appears to be an authoritative post from a site administrator offering a download of a document that purportedly gives the answer to the question raised by the search terms.
Interestingly, these fake comments/message boards all share an identical appearance.
The parameter of the request also contains the search terms that led to the fake forum page, which the download site uses to construct a payload, on the fly, with a file name that matches the original search terms. A quick survey of the filenames of samples we’ve collected gives an indication of what the targets might have been searching for when they stumbled into Gootloader’s malicious SEO trap.
In addition to the English-language payloads (targeting users in North America), malware repositories contain a lot of Gootloader samples with filenames in German, French, and Korean, which appear to correspond to well-publicized campaigns targeting those countries. For instance, here’s a variant of the fake forum targeting German-language speakers:
And another example, in French, in which the search term exemple de dédicace à une amie (“example of dedication to a friend”) has been leveraged in both the title of the post and the link to the Gootloader payload. Note that this “French” website uses English words as labels for menu items and other elements. The fake page header typically displays the phrase “Questions And Answers.”
And still another, in Korean. The Hangul translation reads “here is the download link” with the URL pointing to the same domain hosting Gootloader payloads that also target French and German speakers.
The similarity between the pages is unmistakable: every language version features a “forum post” by a new user with a five-petal flower as their user icon, and a reply from an account called Admin that uses an hourglass icon. The text of the Google search query is repeated at the top of the page and within the fake “message board” posts.
Needless to say, it would be best if you avoid downloading files from pages that look identical to these.
First stage payloads: twice obfuscated
Gootloader’s initial payload is a .zip archive containing a file with a .js extension. Files with the .js extension normally invoke the Windows Script Host (wscript.exe) when run.
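Because the archive and the script inside it share the same search-term filename, that pairing can serve as a rough triage signal for downloaded files. The following Python sketch flags .js members whose base name matches the archive’s base name. The function name and the matching rule are our own assumptions for illustration, and the heuristic will miss variants that name their files differently.

```python
import io
import zipfile
from pathlib import PurePosixPath


def suspicious_js_in_zip(zip_bytes, archive_name):
    """Return .js members whose base name matches the archive's base name."""
    archive_stem = PurePosixPath(archive_name).stem.lower()
    hits = []
    # Read the archive from memory; nothing is extracted or executed.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for member in zf.namelist():
            path = PurePosixPath(member)
            if path.suffix.lower() == ".js" and path.stem.lower() == archive_stem:
                hits.append(member)
    return hits
```

The check only reads the archive’s file listing, so it is safe to run against an untrusted download.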
This “first stage” script is the only component of the attack written to the filesystem. Because it’s the only one exposed to conventional AV scanning methods, the author has obfuscated the script and added two layers of encryption to strings and data blobs related to the next stage of the attack.
Gootloader randomly generates variable names, and splits its decryption code into several small component functions. The first two lines of the code shown above, for example, perform two very minor tasks: one is a simple addition, the other is a string split function. Splitting them in this unexpected and unnecessary way complicates static analysis of the script file.
This stage runs a block of data through the first decryption method, which outputs a second form of the data block that itself is obfuscated and encrypted, and contains embedded functions to decrypt itself. Only after it runs through this second decryption routine does the script reveal its final instructions.
The obfuscation techniques have evolved over time. In the example shown above, the variables are formed of random alphanumeric strings. Newer versions name the variables from randomly selected dictionary words, and may even include word-salad code “comments.”
The first stage script only exists to fetch the second stage code, cycling through three different hardcoded web domains if necessary.
Gootloader even adds complications to the URL that retrieves the second stage: It appends a unique parameter of random-looking characters (highlighted in yellow, above) and a random long number to the URL query string. The script shown above designates a “sleep” period of more than 22 seconds between some steps to slow down the process. And some Gootloader scripts attempt to resolve the domain name(s) hosting the payloads from DNS before attempting to contact their C2, possibly as an anti-sandboxing measure.
Second-stage payload: Registry stuffing
If the first stage successfully contacts a C2, it receives a long string of numbers as a reply. These numbers are the decimal (numeric) values that represent ASCII text characters, which the first stage loads directly into memory, leaving no trace on the filesystem.
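An analyst who captures such a reply can recover the text for inspection with a few lines of code. This Python sketch assumes the numbers are separated by some non-digit delimiter (a comma in the usage example); the exact format of the reply is not specified here, and the function name is hypothetical.

```python
import re


def decode_decimal_reply(reply):
    """Convert a string of decimal character codes into readable text."""
    # Assumption: any run of digits is one character code, whatever the delimiter.
    codes = [int(token) for token in re.findall(r"\d+", reply)]
    return "".join(chr(code) for code in codes)
```

For example, decode_decimal_reply("72,101,108,108,111") returns "Hello". The decoded text should only ever be read, never evaluated.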
This stage contains a large blob of data that it first decodes from its numeric value into text, then writes directly into a series of keys in the Windows Registry, under the HKCU\Software hive. The key name varies from sample to sample.
Next, this stage creates an autorun entry for a PowerShell script. This script, when run (at every subsequent boot), decodes the contents of the Registry keys it wrote out in the previous step. (It also names this autorun entry after the same string of random-looking text it used as a Registry key name.)
Because this next stage doesn’t completely execute until the next time the computer reboots, the target may not actually discover the infection until some hours or even days later – whenever they fully reboot Windows.
After a reboot: the final dominoes fall
Once the computer reboots, it triggers the PowerShell script to run, which starts a sequence of events culminating in Gootloader attempting to download its final payload. But Gootloader is not finished with its complications.
The current generation of Gootloader samples actually stores not one, but a pair of payloads in the Registry: a small C# executable, and a second executable that the first one decodes from the weird way it has been stored in the Registry.
Here’s the first payload, the C# executable, identifiable by the Windows “MZ” header (hexadecimal 4d5a) in its first two bytes.
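When reviewing data exported from the Registry, the same “MZ” marker offers a quick way to spot a hex-encoded executable. A minimal Python sketch, assuming the blob has been exported as a plain hexadecimal string (the function names are hypothetical):

```python
def looks_like_pe(data: bytes) -> bool:
    """True if the data starts with the two-byte 'MZ' header."""
    return data[:2] == b"MZ"


def hex_blob_looks_like_pe(hex_text: str) -> bool:
    """Check a hex-encoded blob for a leading 4d5a ('MZ')."""
    try:
        head = bytes.fromhex(hex_text[:4])
    except ValueError:
        # Not valid hexadecimal, so it cannot be a plain hex-encoded PE file.
        return False
    return looks_like_pe(head)
```

Note that this only recognizes plain hexadecimal; a blob whose digits have been replaced with other letters will not match until it is decoded.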
Here’s the second, and final, payload – counterintuitively, from its appearance, also an executable. In this case, the creator has encoded the numbers that make up the hexadecimal ASCII values as sequences of letters.
The secret decoder ring to parse this blob of data looks like this. The script runs the data in the Registry keys through this substitution script, ends up with a hexadecimal representation of the second executable, then executes it (also directly in memory). Not all characters are substituted, so the first four characters shown above, ydua, represent the 4d5a of the MZ header.
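The substitution step can be reproduced for analysis with a simple lookup table. In the Python sketch below, only the y→4 and u→5 pairs follow from the ydua example; every other entry in the table is a made-up placeholder, and a real sample’s table must be read from its own decoder script.

```python
# Pairs implied by the text: "ydua" decodes to "4d5a".
KNOWN_FROM_TEXT = {"y": "4", "u": "5"}

# HYPOTHETICAL placeholders for the remaining digits (not from any real sample).
HYPOTHETICAL = {"q": "0", "w": "1", "r": "2", "t": "3",
                "i": "6", "o": "7", "p": "8", "s": "9"}


def decode_registry_blob(encoded, table=None):
    """Map substituted letters back to hex digits, then hex to raw bytes."""
    if table is None:
        table = {**HYPOTHETICAL, **KNOWN_FROM_TEXT}
    # Characters missing from the table (such as a-f) pass through unchanged.
    hex_text = "".join(table.get(ch, ch) for ch in encoded)
    return bytes.fromhex(hex_text)
```

With this table, decode_registry_blob("ydua") returns b"MZ", matching the header described above. The function only produces bytes for inspection; it does not load or run them.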
The script then executes the payload and, to give itself persistence after reboot, creates a Registry run key that will execute the payload on the next startup (with the help of a PowerShell command):
This is the command registered by the registry loader. It serves as a failsafe mechanism for the Gootloader infection process to survive a reboot.
dotNET injector with a twist
The final stage of the elaborate infection plan involves a dotNET injector. Executed either by the registry loader or the failsafe PowerShell script, the result is the same: a simple .NET loader that contains the next stage, a Delphi-based loader malware, in the form of a data blob. Over time, this part of the infection process has evolved.
At first, the dotNET component simply decrypted the Delphi executable, which dropped and executed the eventual payload. Eventually, the attackers switched up the attack and added an intermediate step: The dotNET component would launch a benign application called ImagingDevices.exe, an innocent system component installed by default on Windows operating systems, then inject the Delphi executable into it using a process hollowing technique.
The most recent versions of the attack now involve the dotNET component writing out a different, benign executable that belongs to a commercial software package called the Embarcadero External Translation Manager to the file system (using as its filename the username of the currently logged-in user). It then performs a process hollowing on that executable to load the Delphi component.
It performs this function by holding a copy of both the benign and the malicious payload inside of itself.
The first one (stored in the variable text2) is the benign application, digitally signed by its publisher. If the user of an infected computer suspects foul play, and investigates a program that’s causing suspicious network traffic and/or high CPU load in the system, they would see what Windows considers a trusted application.
It drops and executes this clean application, then replaces the code in memory using process hollowing techniques with the contents of the second PE file (stored in the variable text3).
The Delphi loader contains the final payload – Kronos, REvil, Gootkit, or Cobalt Strike – in encrypted form. In each case, the loader decrypts the payload, then uses its own PE loader to execute the payload in memory.
After the initial script, none of the malicious code is written to disk, maintaining the fileless execution scheme right up to the end.
Cause and effect
What does all this obfuscation, leaping from one scripting platform to another, and the most absurd, Vizzini-grade complications of almost any malware distribution platform achieve?
If you’re an analyst, it might cost you a few hours of work to fully unpack and understand each stage of the attack. We haven’t even covered in this blog post all the possible variations we’ve observed Gootloader using as final payload delivery methods, since it also might deliver .NET or Delphi-based code-injector executables, additional PowerShell scripts, or Cobalt Strike modules.
But a criminal, ultimately, is just trying to buy a few minutes or hours of remaining undetected, so that the attack can proceed without interference from endpoint protection software. Instead of actively attacking the endpoint tools, as some malware distributors do, the creators of Gootloader have traded the more aggressive approach for a technique that’s closer to a massive setup of dominoes that conceal the end result.
At several points, it’s possible for end users to avoid the infection, if they recognize the signs. The problem is that even trained people can easily be fooled by the chain of social engineering tricks Gootloader’s creators use. Script blockers like NoScript for Firefox could help a cautious web surfer remain safe by preventing the initial replacement of the hacked web page from happening, but not everyone uses those tools (or finds them convenient or even intuitive). Even attentive users who are aware of the trick involving the fake forum page might not recognize it until it’s too late.
In the end, it’s up to the search engines, whose algorithm the malware games to get a high search result, to address the initial attack vector. Users can be trained to do things like enable visible file suffixes in Windows, so they can see they’re clicking a file with a .js extension, but they can’t choose which search results appear near the top of the list or how those sites get manipulated by threat actors.
Protection and indicators-of-compromise
SophosLabs acknowledges the research contributions of Fraser Howard, Mark Loman, Peter Mackenzie, Vikas Singh, and Feliz Weyne to this analysis and to the detection of Gootloader.