Identifying the country of origin for a malware PE executable


Update 11/29/10: Added a short discussion about non-malware executables also.

Have you ever wondered how people writing reports about malware can say where the malware was likely developed?

Sometimes you get totally lucky and log files created by the malware will help answer the question. Given the following line from a log:

11/16/2009 6:41:48 PM –>  Hook instalate lsass.exe

 

We can use Google Translate’s “language detect” feature to help us determine the language used:

Of course, it’s not often we get THAT lucky!

A more interesting method is the examination of certain structures known as the Resource Directory within the executable file itself. For the purpose of this post, I will not be describing the Resource Directory structure. It’s a complicated beast, making it a topic I will save for later posts that actually warrant and/or require a low-level understanding of it. Suffice it to say, the Resource Directory is where embedded resources like bitmaps (used in GUI graphics), file icons, etc. are stored. The structure is frequently compared to the layout of files on a file system, although I think it’s insulting to file systems to say such a thing. For those more graphically inclined, I took the following image from http://www.devsource.com/images/stories/PEFigure2.jpg.

 

For the sake of example, here are some images showing just a few of the resources embedded inside notepad.exe (using CFF Explorer from http://www.ntcore.com/exsuite.php):

Now it’s important to note that an executable may have only a few or even zero resources, especially in the case of malware. Consider the following example showing a recent piece of malware with only a single resource called “BINARY.”

Moving on, let’s look at another piece of malware… Below, we see this piece of malware has five resource directories.

We could pick any of the five for this analysis, but I’ll pick RCData – mostly because it’s typically an interesting directory to examine when reverse engineering malware. (This is because RCData defines a raw data resource for an application. Raw data resources permit the inclusion of any binary data directly in the executable file.) Under RCData, we see three separate entries:

The first one to catch my eye is the one called IE_PLUGIN. I’ll show a screenshot of it below, but am saving the subject of executables embedded within executables for a MUCH more technical post in the near future (when it’s not 1:30 am and I actually feel like writing more!). ;-)

Going back to the entry structure itself, the IE_PLUGIN entry will have at least one Directory Entry underneath it to describe the size(s) and offset(s) to the data contained within that resource. I have expanded it as shown next:

And that’s where things get interesting, at least as it relates to answering the question at the start of this post. Notice the ID: 1055. That’s our money shot for helping to determine what country this binary was compiled in or, more specifically, the default locale (language ID) of the computer used to compile this binary. Those IDs have very legitimate uses. For example, you can have the same dialog in localized English, French, and German forms, and the system will choose which dialog to load based on the thread’s locale. However, when resources are added to the binary without explicitly setting them to different locale IDs, those resources will be assigned the default locale ID of the compiler’s computer.

So in the example above, what does 1055 mean?

It means this piece of malware was likely developed in (or at least compiled in) Turkey: 1055 is the locale ID for Turkish.
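As a minimal sketch of how such an ID can be read, the snippet below splits a Windows language ID into its primary-language and sublanguage parts (the low 10 bits and the high 6 bits) and looks up a few IDs copied from the table at the end of this post. The function names and the small `KNOWN_LOCALES` dictionary are assumptions made for illustration, not part of any tool shown here.

```python
# Hypothetical helper names; the entries below are copied from the lookup
# table at the end of this post (only a handful, for illustration).
KNOWN_LOCALES = {
    1029: "Czech",
    1031: "German - Germany",
    1033: "English - United States",
    1055: "Turkish",
    2057: "English - Great Britain",
}


def split_language_id(language_id):
    """Split a 16-bit Windows language ID into (primary language, sublanguage)."""
    primary = language_id & 0x3FF  # low 10 bits: primary language
    sub = language_id >> 10        # high 6 bits: sublanguage (regional variant)
    return primary, sub


def describe(language_id):
    """Return a readable name for a resource language ID."""
    if language_id == 0:
        return "default locale (no specific ID set)"
    return KNOWN_LOCALES.get(language_id, "unknown ID %d" % language_id)


if __name__ == "__main__":
    for lid in (1055, 1033, 2057, 0):
        print(lid, describe(lid), split_language_id(lid))
```

Note how 1033 and 2057 share the same primary language value and differ only in the sublanguage, which is why the table lists several regional variants of one language.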

How do we know that one resource wasn’t added with a custom ID? Because we see the same ID when looking at almost all the other resources in the file (anything with an ID of zero just means “use the default locale”):
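That consistency check can be sketched as a simple majority vote over the language IDs of all resources, ignoring the zero entries. This is only an illustration: the function name is hypothetical, and the ID list is made up rather than taken from the sample discussed here.

```python
from collections import Counter


def dominant_language_id(resource_language_ids):
    """Return (language_id, share) for the most common non-zero ID, or None.

    An ID of zero means "use the default locale", so it carries no
    information about the build machine and is skipped.
    """
    specific = [lid for lid in resource_language_ids if lid != 0]
    if not specific:
        return None
    language_id, count = Counter(specific).most_common(1)[0]
    return language_id, count / len(specific)


if __name__ == "__main__":
    # Made-up IDs for illustration only.
    ids = [1055, 1055, 0, 1055, 1033, 1055, 0]
    print(dominant_language_id(ids))
```

A share close to 1.0 suggests the ID came from the build machine's default; a low share suggests resources were localized on purpose or copied from elsewhere.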

In this case, we are also lucky enough to have other strings in the binary (once unpacked) to help solidify the assertion this binary is from Turkey. One such string is “Aktif Pencere,” which Google’s Translation detection engine shows as:

However, as you can see, this technique is very useful even when no strings are present – in logs or the binary itself.

So is this how locale identification normally works for non-malware executable files too?

Not exactly. The above techniques are generally used with malware (if the malware even has exposed resources), but not generally with normal/legitimate binaries. Consider the following legitimate binary. What is the source locale for the following example?

As you see in the green box, we have some cursor resources with the ID for the United States. (I’m including a lookup table at the bottom of this post.) In the orange box, there are additional cursor resources with the ID for Germany. In the red box is RCData, like we examined before, but all of these resources have the ID specifying the default language of the computer executing the application.

As it turns out, the normal value to examine is the ID for the Version Information Table resource (in the blue box). In the case above, it’s the Czech Republic. The Version Information Table contains the “metadata” you normally see depicted in locations like this:

In the above screenshot, Windows is identifying the source/target locale as English, and specifically, United States English (as opposed to UK English, Australian English, etc.). That information is not stored within the Version Information Table, but rather is determined by the ID of the Version Information Table.

However, in malware, the Version Information table is almost always stripped or mangled, as is the case with our original example from earlier:

Because of that, the earlier techniques are more applicable to malware.

Below, I’m including a table to help you translate Resource Entry IDs to locales (sorted by decimal ID number).

-Gary Golomb

Locale   Language code   LCID string   LCID (decimal)   Codepage
         
Arabic – Saudi Arabia ar ar-sa 1025 1256
Bulgarian bg bg 1026 1251
Catalan ca ca 1027 1252
Chinese – Taiwan zh zh-tw 1028  
Czech cs cs 1029 1250
Danish da da 1030 1252
German – Germany de de-de 1031 1252
Greek el el 1032 1253
English – United States en en-us 1033 1252
Spanish – Spain (Traditional) es es-es 1034 1252
Finnish fi fi 1035 1252
French – France fr fr-fr 1036 1252
Hebrew he he 1037 1255
Hungarian hu hu 1038 1250
Icelandic is is 1039 1252
Italian – Italy it it-it 1040 1252
Japanese ja ja 1041  
Korean ko ko 1042  
Dutch – Netherlands nl nl-nl 1043 1252
Norwegian – Bokmål nb no-no 1044 1252
Polish pl pl 1045 1250
Portuguese – Brazil pt pt-br 1046 1252
Raeto-Romance rm rm 1047  
Romanian – Romania ro ro 1048 1250
Russian ru ru 1049 1251
Croatian hr hr 1050 1250
Slovak sk sk 1051 1250
Albanian sq sq 1052 1250
Swedish – Sweden sv sv-se 1053 1252
Thai th th 1054  
Turkish tr tr 1055 1254
Urdu ur ur 1056 1256
Indonesian id id 1057 1252
Ukrainian uk uk 1058 1251
Belarusian be be 1059 1251
Slovenian sl sl 1060 1250
Estonian et et 1061 1257
Latvian lv lv 1062 1257
Lithuanian lt lt 1063 1257
Tajik tg tg 1064  
Farsi – Persian fa fa 1065 1256
Vietnamese vi vi 1066 1258
Armenian hy hy 1067  
Azeri – Latin az az-az 1068 1254
Basque eu eu 1069 1252
Sorbian sb sb 1070  
FYRO Macedonia mk mk 1071 1251
Sesotho (Sutu)     1072  
Tsonga ts ts 1073  
Setswana tn tn 1074  
Venda     1075  
Xhosa xh xh 1076  
Zulu zu zu 1077  
Afrikaans af af 1078 1252
Georgian ka   1079  
Faroese fo fo 1080 1252
Hindi hi hi 1081  
Maltese mt mt 1082  
Sami Lappish     1083  
Gaelic – Scotland gd gd 1084  
Yiddish yi yi 1085  
Malay – Malaysia ms ms-my 1086 1252
Kazakh kk kk 1087 1251
Kyrgyz – Cyrillic     1088 1251
Swahili sw sw 1089 1252
Turkmen tk tk 1090  
Uzbek – Latin uz uz-uz 1091 1254
Tatar tt tt 1092 1251
Bengali – India bn bn 1093  
Punjabi pa pa 1094  
Gujarati gu gu 1095  
Oriya or or 1096  
Tamil ta ta 1097  
Telugu te te 1098  
Kannada kn kn 1099  
Malayalam ml ml 1100  
Assamese as as 1101  
Marathi mr mr 1102  
Sanskrit sa sa 1103  
Mongolian mn mn 1104 1251
Tibetan bo bo 1105  
Welsh cy cy 1106  
Khmer km km 1107  
Lao lo lo 1108  
Burmese my my 1109  
Galician gl   1110 1252
Konkani     1111  
Manipuri     1112  
Sindhi sd sd 1113  
Syriac     1114  
Sinhala; Sinhalese si si 1115  
Amharic am am 1118  
Kashmiri ks ks 1120  
Nepali ne ne 1121  
Frisian – Netherlands     1122  
Filipino     1124  
Divehi; Dhivehi; Maldivian dv dv 1125  
Edo     1126  
Igbo – Nigeria     1136  
Guarani – Paraguay gn gn 1140  
Latin la la 1142  
Somali so so 1143  
Maori mi mi 1153  
HID (Human Interface Device)     1279  
Arabic – Iraq ar ar-iq 2049 1256
Chinese – China zh zh-cn 2052  
German – Switzerland de de-ch 2055 1252
English – Great Britain en en-gb 2057 1252
Spanish – Mexico es es-mx 2058 1252
French – Belgium fr fr-be 2060 1252
Italian – Switzerland it it-ch 2064 1252
Dutch – Belgium nl nl-be 2067 1252
Norwegian – Nynorsk nn no-no 2068 1252
Portuguese – Portugal pt pt-pt 2070 1252
Romanian – Moldova ro ro-mo 2072  
Russian – Moldova ru ru-mo 2073  
Serbian – Latin sr sr-sp 2074 1250
Swedish – Finland sv sv-fi 2077 1252
Azeri – Cyrillic az az-az 2092 1251
Gaelic – Ireland gd gd-ie 2108  
Malay – Brunei ms ms-bn 2110 1252
Uzbek – Cyrillic uz uz-uz 2115 1251
Bengali – Bangladesh bn bn 2117  
Mongolian mn mn 2128  
Arabic – Egypt ar ar-eg 3073 1256
Chinese – Hong Kong SAR zh zh-hk 3076  
German – Austria de de-at 3079 1252
English – Australia en en-au 3081 1252
French – Canada fr fr-ca 3084 1252
Serbian – Cyrillic sr sr-sp 3098 1251
Arabic – Libya ar ar-ly 4097 1256
Chinese – Singapore zh zh-sg 4100  
German – Luxembourg de de-lu 4103 1252
English – Canada en en-ca 4105 1252
Spanish – Guatemala es es-gt 4106 1252
French – Switzerland fr fr-ch 4108 1252
Arabic – Algeria ar ar-dz 5121 1256
Chinese – Macau SAR zh zh-mo 5124  
German – Liechtenstein de de-li 5127 1252
English – New Zealand en en-nz 5129 1252
Spanish – Costa Rica es es-cr 5130 1252
French – Luxembourg fr fr-lu 5132 1252
Bosnian bs bs 5146  
Arabic – Morocco ar ar-ma 6145 1256
English – Ireland en en-ie 6153 1252
Spanish – Panama es es-pa 6154 1252
French – Monaco fr   6156 1252
Arabic – Tunisia ar ar-tn 7169 1256
English – South Africa en en-za 7177 1252
Spanish – Dominican Republic es es-do 7178 1252
French – West Indies fr   7180  
Arabic – Oman ar ar-om 8193 1256
English – Jamaica en en-jm 8201 1252
Spanish – Venezuela es es-ve 8202 1252
Arabic – Yemen ar ar-ye 9217 1256
English – Caribbean en en-cb 9225 1252
Spanish – Colombia es es-co 9226 1252
French – Congo fr   9228  
Arabic – Syria ar ar-sy 10241 1256
English – Belize en en-bz 10249 1252
Spanish – Peru es es-pe 10250 1252
French – Senegal fr   10252  
Arabic – Jordan ar ar-jo 11265 1256
English – Trinidad en en-tt 11273 1252
Spanish – Argentina es es-ar 11274 1252
French – Cameroon fr   11276  
Arabic – Lebanon ar ar-lb 12289 1256
English – Zimbabwe en   12297 1252
Spanish – Ecuador es es-ec 12298 1252
French – Cote d’Ivoire fr   12300  
Arabic – Kuwait ar ar-kw 13313 1256
English – Philippines en en-ph 13321 1252
Spanish – Chile es es-cl 13322 1252
French – Mali fr   13324  
Arabic – United Arab Emirates ar ar-ae 14337 1256
Spanish – Uruguay es es-uy 14346 1252
French – Morocco fr   14348  
Arabic – Bahrain ar ar-bh 15361 1256
Spanish – Paraguay es es-py 15370 1252
Arabic – Qatar ar ar-qa 16385 1256
English – India en en-in 16393  
Spanish – Bolivia es es-bo 16394 1252
Spanish – El Salvador es es-sv 17418 1252
Spanish – Honduras es es-hn 18442 1252
Spanish – Nicaragua es es-ni 19466 1252
Spanish – Puerto Rico es es-pr 20490 1252

 


Network Forensics and Reversing Part 1 – gzip web content, java malware, and a little JavaScript


Something I’ve found unsettling for some time now is the drastically increased usage of gzip as a Content-Encoding transfer type from web servers. By default now, Yahoo, Google, Facebook, Twitter, Wikipedia, and many other organizations compress the content they send to your users. From that list alone, you can infer that most of the HTTP traffic on any given network is not transferred in plaintext, but rather as compressed bytes.

That means web content you’d expect to look like this on the wire (making it easily searchable for policy violations and security threats):

In reality, looks like this:

As it turns out, the two screenshots above are for the exact same network session. The latter is from Wireshark and shows that the data sent by the webserver really is compressed and not discernible.
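To make the point concrete, here is a small sketch using Python's standard `gzip` module. The HTML body and the search marker are invented for illustration; the idea is only that a naive byte search that succeeds on the plaintext finds nothing in the compressed bytes that actually cross the wire.

```python
import gzip

# Invented example content: a page with a repeated suspicious-looking script.
MARKER = b"eval(unescape("
BODY = (
    b"<html><body>"
    + b"<script>eval(unescape('%75%6e'))</script>" * 20
    + b"</body></html>"
)

# What a server using "Content-Encoding: gzip" would put on the wire.
WIRE_BYTES = gzip.compress(BODY)


def naive_match(payload, needle=MARKER):
    """A plain substring search, as a simple content filter might do."""
    return needle in payload


if __name__ == "__main__":
    print("plaintext match:   ", naive_match(BODY))
    print("on-the-wire match: ", naive_match(WIRE_BYTES))
    print("after decompress:  ", naive_match(gzip.decompress(WIRE_BYTES)))
```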

By extension, you can likely say that most real-time network forensics/monitoring tools are [realistically] “blind” to [plausibly] a majority of the web traffic flowing into your organization.

Combined with the fact that a vast majority of compromises are delivered to clients via HTTP (at this time, typically through the use of javascript), my use of the word “unsettling” should be an understatement. This includes everything from “APT” types of threats (or whatever soapbox you stand on to describe the same thing), down to drive-by’s and mass exploitations.

The good news: Current trends in exploitation have given us very powerful methods for generic detection (eg: without needing “signatures,” or more precisely – preexisting knowledge about the details of particular vulnerabilities or exploits) by examining traits of javascript, iframes, html, pdf’s, etc.

The bad news: By serving compressed (and therefore effectively obfuscated) content, webservers are reducing the chance that network technologies will detect those conditions.

I find no fault with organizations choosing to use gzip as their transfer encoding. HTML is a horribly repetitive and redundant language (read: bloated). Every opening <tag> has a matching closing </tag>. XML is even worse. For massive sites with massive traffic, the redundancy and bloat of formats like HTML and XML translate directly to lost revenue via extremely large amounts of wasted bandwidth.

Nonetheless, as forensic engineers, our challenge is to discover and compensate for all the things proactive security technologies like AV, firewalls, IPS, etc. continually fail to identify and stop. Recently, I added the following rule on a customer’s network in NetWitness:

If you’re not familiar with the NetWitness rule syntax, the rule above does the following:

If the server application/version (as extracted by the protocol parsing engine) contains the string: “nginx,”

AND

If the Content-Encoding used by the server is gzip

THEN

Create a tag labeled “http_gzip_from_nginx” in a key called “monitors.”
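The same logic can be sketched in a few lines of Python. This is not NetWitness rule syntax; the metadata field names (`server`, `content_encoding`) and the case-insensitive comparison are assumptions made for the example, and the sample sessions are invented.

```python
def tag_session(meta):
    """Return the monitor tag for a session's extracted metadata, or None.

    `meta` is a plain dict standing in for whatever a protocol parser
    extracted from the HTTP response headers (field names are hypothetical).
    """
    server = meta.get("server", "").lower()
    encoding = meta.get("content_encoding", "").lower()
    if "nginx" in server and encoding == "gzip":
        return "http_gzip_from_nginx"
    return None


if __name__ == "__main__":
    # Invented sessions for illustration.
    sessions = [
        {"server": "nginx/0.7.65", "content_encoding": "gzip"},
        {"server": "Apache", "content_encoding": "gzip"},
        {"server": "nginx", "content_encoding": ""},
    ]
    for s in sessions:
        print(s, "->", tag_session(s))
```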

In the Investigator GUI, you would see something like this in the “monitors” key:

Why nginx? As it turns out, a lot of hackers tend to use nginx webservers, so this seemed like a good place to start experimenting. The question I was trying to answer is:

If the content body of a web response is gzip’ed (so we can’t examine traits of “suspiciousness” inside the body), then what can we see outside the body to indicate this gzip’ed traffic is worth examining further?

We’ll revisit this question in later blog posts, but for now, nginx as a webserver is an amazingly powerful place to start! We’ll examine one such example in this post, with an additional post using the gzip + nginx combination. As the small screenshot above shows, there were 33 sessions meeting the criteria of gzip + nginx (out of about 50,000 sessions). With only 33 sessions, it’s possible to drill into the packets of all 33 and examine them one by one (i.e., brute-force forensic examination), but that would be poor forensic technique and defeat the entire point of a technical and educational network forensics blog! The examples in this series of blog posts will employ good forensic practices using “correlative techniques,” which give us a good idea of what is inside the packet contents before we ever drill that deeply into the network data. Having that idea in advance is itself an indication you are using good network forensics practices.

The first pivot point we’ll examine is country. Keep in mind, this is after we used the rule above to include only network sessions where the server returned gzip-compressed content, and where the webserver was some type of nginx. We could have manually done the same by first pivoting on the content type of gzip:

Doing the first pivot reduces the number of sessions we’re examining from about 50,000 down to 2,878. Then we can apply a custom filter to include only servers with the string “nginx” within those 2,878 sessions. Doing so gives us the same 33 sessions mentioned above.

In those 33 sessions, the countries communicated with are:

Not only do we tend to see a higher degree of malicious traffic from countries like Latvia, it immediately looks suspicious simply because it’s an outlier in the list. (Don’t worry Latvia, we’ll pick on our own country in the next post!) Additionally, there’s only a single session to examine here, meaning drilling into the packet-level detail is an ok decision at this point.
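Spotting an outlier like this can be sketched as a simple frequency count. The country list below is invented for illustration (the real list is in the screenshot), and the function name and the "one session or fewer" cutoff are assumptions.

```python
from collections import Counter


def rare_countries(session_countries, max_sessions=1):
    """Return countries that appear in at most `max_sessions` sessions."""
    counts = Counter(session_countries)
    return sorted(c for c, n in counts.items() if n <= max_sessions)


if __name__ == "__main__":
    # Made-up destination countries for a small set of filtered sessions.
    sample = ["United States"] * 5 + ["Germany"] * 3 + ["Latvia"]
    print(rare_countries(sample))
```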

In the request, we see the client requested the file “/th/inyrktgsxtfwylf.php” from the host “ertyi.net,” as shown next:

As expected, based on the meta information NetWitness already extracted, we see the gzip’ed reply from an nginx server:

Fortunately, Investigator makes it easy for us to examine gzip’ed content by right-clicking in the session display and selecting decode as compressed data:

Doing so shows us a MUCH different story!
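Outside of Investigator, the same decoding step can be approximated with Python's standard `zlib` module. This is a sketch under the assumption that you have already carved the raw response body out of the capture; `decode_gzip_body` is a hypothetical helper name. A streaming decompressor is used because bodies carved from packet captures are sometimes cut short, and a strict one-shot decompression may reject an incomplete stream.

```python
import gzip
import zlib


def decode_gzip_body(raw):
    """Decompress a gzip'ed HTTP body, returning whatever can be recovered.

    Adding 16 to the window bits tells zlib to expect gzip framing
    (header and trailer) instead of a bare zlib stream.
    """
    decoder = zlib.decompressobj(zlib.MAX_WBITS | 16)
    return decoder.decompress(raw)


if __name__ == "__main__":
    # Invented body standing in for a carved HTTP response.
    body = b"<textarea>function decode(){}</textarea>" * 10
    wire = gzip.compress(body)
    print(decode_gzip_body(wire) == body)
    # Same stream with the 8-byte gzip trailer missing (e.g. truncated capture).
    print(decode_gzip_body(wire[:-8]) == body)
```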

The traffic appears to be obfuscated javascript. We can extract it from NetWitness (a few different ways) to clean it up and examine. I’ll skip those steps and just show the cleaned-up and nicely formatted content the webserver returned.

There are a few things to notice in here. At the very bottom of the image above, we clearly see encoded javascript, a trait extremely common to client-side exploit delivery and malicious webpages. We’ll save full javascript reverse engineering for another blog post.

But the worst (or most interesting) part is the decoding and evaluation for this encoded data, while implemented in javascript, is stored inside a TextArea HTML object! This technique makes the real logic invisible and indiscernible to most automated javascript reverse engineering tools.

Indeed, if we upload this webpage to one of my favorite js reversing sites (jsunpack, located at: http://jsunpack.jeek.org/dec/go), we see the following results when the site attempts to automatically reverse engineer the javascript:

Without going further into the process of reverse engineering the javascript (for now – we have an endless supply of blog posts coming!), we can be quite sure we’re looking at something suspicious. At the very least, we know for a fact we’re looking at something that does not make it easy to discern what it’s doing!

The telltale signs of “badness” don’t stop there. At the top of the decoded body data we saw an embedded java applet, as follows:

While we don’t know (yet) what the applet does, there’s a pretty strong indication it’s a downloader or C&C (command and control) application of some type. How can we make such a guess without knowing anything about it?

Look closely at the embedded parameter passed into the applet:

We can make a guess that the string contained in the “value” parameter is data encoded with a simple substitution cipher, where “S” in the parameter stands for “t” in the actual string and “T” in the parameter stands for “/”. If that guess is right, then it’s possible the decoded parameter value actually starts with the string “http://”.
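This kind of guess can be tested mechanically: line up the start of the encoded value against the suspected plaintext and check that no encoded character would have to stand for two different letters. The sketch below does that; the function name is hypothetical, and the parameter string is the one quoted later in this post.

```python
def mapping_from_crib(ciphertext, crib):
    """Pair ciphertext characters with a guessed plaintext prefix.

    Returns the implied substitution mapping, or None if the guess would
    force one ciphertext character to decode to two different characters.
    """
    mapping = {}
    for c, p in zip(ciphertext, crib):
        if mapping.setdefault(c, p) != p:
            return None
    return mapping


if __name__ == "__main__":
    param = "RSS=,TT!;LBIB@STSRTYG$I=R="
    print(mapping_from_crib(param, "http://"))   # consistent guess
    print(mapping_from_crib(param, "ftp://x"))   # inconsistent guess -> None
```

The repeated “SS” and “TT” in the encoded value lining up with “tt” and “//” is what makes “http://” a plausible guess in the first place.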

Of course, because we have the download of the jar file within our full packet capture and storage database, we’ll just extract it from NetWitness to validate our hunch and possibly learn more. In the below screenshot, I already performed the following steps:

  1. Switched to the session with the jar file download. (Simply clicked on the next session between that same client and server.)
  2. Extracted the jar file by saving the raw data from the server using the “Extract Payload Side 2” option in NetWitness.
  3. Opened the jar file using the following java decompiler:

The first line of code in the java applet takes the parameter passed to it (the encoded value we identified above), and hands it to a function called “b.” The result of that function is stored in a string variable called str1.

Following the decompiled java code to function “b,” we see the following:

It turns out the applet actually is using a simple substitution cipher, replacing one given character with another. When the parameter “RSS=,TT!;LBIB@STSRTYG$I=R=” is decoded, we end up with the string “http://uijn.net/th/fs7.php?i=1”.
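A decoder for this kind of cipher is only a table lookup. In the sketch below, the table was reconstructed by aligning the encoded parameter with the decoded URL as both are quoted in this post; it is not copied from the applet's code, and the names are hypothetical. The parameter as quoted is 26 characters long, so with this table it yields the URL up to `fs7.php`; the trailing `?i=1` is not covered by the quoted text.

```python
# Substitution table reconstructed from the two strings quoted in the post
# (an assumption for illustration, not the applet's actual table).
TABLE = {
    "R": "h", "S": "t", "=": "p", ",": ":", "T": "/",
    "!": "u", ";": "i", "L": "j", "B": "n", "I": ".",
    "@": "e", "Y": "f", "G": "s", "$": "7",
}


def decode(value, table=TABLE):
    """Replace each character via the table; unknown characters become '?'."""
    return "".join(table.get(ch, "?") for ch in value)


if __name__ == "__main__":
    print(decode("RSS=,TT!;LBIB@STSRTYG$I=R="))
```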

The java malware then continues with additional string functions as shown next:

First, we see the declaration of str2 through str5, with values assigned to each.

Then, str6 through str8 are simply the reversals of str2 through str4, resulting in the following strings:

Str6 = .exe

Str7 = java.io.tmpdir

Str8 = os.name
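String reversal like this is a cheap way to keep telltale strings out of a casual strings dump. As an illustration, the stored forms below are derived by reversing the three results listed above (the applet's actual literals are only visible in the screenshot), and the helper name is hypothetical.

```python
def unreverse(stored):
    """Undo the reversal applied to a stored string."""
    return stored[::-1]


if __name__ == "__main__":
    # Derived by reversing the three strings listed above.
    for stored in ("exe.", "ridpmt.oi.avaj", "eman.so"):
        print(stored, "->", unreverse(stored))
```

When triaging a sample, it can be worth searching for reversed forms of common property names and file extensions as well as the forward ones.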

Combining that with the last three lines of code shown above, we see the following:

Str10 is a filename ending in “.exe” where the actual filename is a randomly generated number.

Str11 is the path to temporary files for the current user.

Str12 is the name of the Operating System the java malware is currently running on.

The last part of this java malware (that we’ll examine here anyways) is shown next:

First, it tests to see if the string “Windows” is contained anywhere in the name of the Operating System. If so, then it goes through the process of opening a connection to the URL (the one we decoded above), downloads the file, saves it to the temporary directory, then executes the file.

The applet appears to be a first-stage downloader for other executables that are likely far more malicious.

Pre-Summary

Even though a large amount of web traffic is coming into your organization gzip compressed, making most inline/real-time security products totally “blind” to what’s inside, we can use standard forensic principles to identify which of those sessions are worth examination. In this case, we combined the following traits to reduce 50,000 network sessions to a single one:

  1. Gzip’ed web content
  2. Suspicious country
  3. Uncommon webserver application

Once we drilled into that single session, we saw how trivial it was to use NetWitness to automatically decompress the content, extract it, then validate it as “bad.”

Epilogue

Does the process stop there? Of course not! If you had to repeat this process every time, not only would it make your job boring as heck, but would call into question the value you and your tools are really providing the organization in the first place! There are many ways to maximize the intelligence gained from the process just shown. I’ll highlight one method here, while saving others for later blog posts.

There are several interesting “indicators” gathered from this traffic so far. The ones I’ll focus on here are host names. In the request made by the client, we saw the following tag in the HTTP Request header:

Host: ertyi.net

In the java malware we decompiled, after decoding the encoded parameter value, we saw the executable to be downloaded was from the host “uijn.net.”

At this point, network rules should be added to firewalls, proxies, NetWitness intelligence feeds, and any other technology you have that can alert to other hosts going to either of those servers – preferably blocking all traffic to those servers.

But, can we extend our security perimeter in relation to the hackers using those servers?

Interestingly, we find both those domains are hosted on the same IP block: 194.8.250.60 and 194.8.250.61.

That leads to the question, “What other domains are hosted on those servers?”

Normally I use http://www.robtex.com to answer questions like that, but in this case, robtex does not provide a lot of information. It’s possible the hackers are bringing up and tearing down DNS records as needed for the domain names they manage.

Another source of helpful information can be found by querying the “Passive DNS replication” database hosted at http://www.bfk.de/bfk_dnslogger.html. Here, we can find an audit trail of all historically observed DNS replies pointing to the IPs you submit queries about. In this case, we do indeed find valuable information, including about 40 unique host names that have been hosted on those two IPs. A shortened list is included below showing some of the names that have been hosted there.

aeriklin.com

aijkl.net

asdfiz.net

asuyr.net

campag.net

iifgn.net

jhgi.net

jugv.net

kobqq.com

krclear.com

lilif.net

nadwq.com

oiuhx.net

pokiz.net

uijn.net

As we can see, none of them look immediately legitimate, so we can infer this is a hacking group using a set of servers for domains they have registered simply to be “thrown away” if any of those domain names are discovered and end up on a blacklist somewhere.

The Real Summary

By combining a few pivot points and looking inside compressed web traffic most products ignore, from a single network session we proactively increased the security posture of your organization by creating an intelligence feed of nearly 40 host names and 2 IPs. You could now audit DNS queries made by all hosts in your organization to see if other clients are compromised and doing look-ups when trying to communicate with those hosts.
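A minimal sketch of such an audit follows. The indicator set is the host names listed in this post; the log format (pairs of client IP and queried name), the helper names, and the sample log entries are all assumptions made for illustration.

```python
# Host names taken from this post (request Host header, decoded URL,
# and the shortened passive DNS list).
INDICATOR_DOMAINS = {
    "ertyi.net", "uijn.net", "aeriklin.com", "aijkl.net", "asdfiz.net",
    "asuyr.net", "campag.net", "iifgn.net", "jhgi.net", "jugv.net",
    "kobqq.com", "krclear.com", "lilif.net", "nadwq.com", "oiuhx.net",
    "pokiz.net",
}


def matches_indicator(query_name, indicators=INDICATOR_DOMAINS):
    """True if the queried name is an indicator domain or a subdomain of one."""
    labels = query_name.rstrip(".").lower().split(".")
    return any(".".join(labels[i:]) in indicators for i in range(len(labels)))


def clients_to_investigate(dns_log):
    """dns_log: iterable of (client_ip, queried_name) pairs (assumed format)."""
    return sorted({client for client, name in dns_log if matches_indicator(name)})


if __name__ == "__main__":
    # Invented log entries for illustration.
    log = [
        ("10.0.0.5", "www.example.com"),
        ("10.0.0.7", "uijn.net."),
        ("10.0.0.9", "cdn.ERTYI.net"),
        ("10.0.0.7", "jhgi.net"),
    ]
    print(clients_to_investigate(log))
```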

For the truly paranoid (or safe, depending on how you look at it), you could also blackhole all traffic to those apparently malicious networks:

route: 194.8.250.0/23

origin: AS29557
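Before blackholing a whole route, it helps to confirm what it covers. The sketch below uses Python's standard `ipaddress` module to check that both server IPs mentioned above fall inside the /23; the helper name is hypothetical.

```python
import ipaddress

# The route quoted above.
BLOCK = ipaddress.ip_network("194.8.250.0/23")


def in_block(address, network=BLOCK):
    """True if the dotted-quad address falls inside the network."""
    return ipaddress.ip_address(address) in network


if __name__ == "__main__":
    print(BLOCK.num_addresses, "addresses in", BLOCK)
    for ip in ("194.8.250.60", "194.8.250.61"):
        print(ip, in_block(ip))
```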

Considering the Google Safe Browsing report for that AS, it’s probably not a bad idea!

Gary Golomb

Bredolab Takedown – Just the tip of the Iceberg


Recent reports from various sources in the security industry show that a large takedown of servers associated with the “Bredolab” trojan occurred within the past few weeks. While most of the reports have focused around the idea that this infrastructure was solely related to the command and control of Bredolab, our research shows that these servers were used as an all-purpose hosting infrastructure for criminal activity.

This criminal system came to our attention in July 2010, when NetWitness analysts were asked to investigate a hacked WordPress blog.

We found that the following obfuscated script had been injected into all .html and .php pages on the site:

When decoded, this script created a redirect to the following location:

hxxp://bakedonlion.ru:8080/google.com/pcpop.com/torrentdownloads.net.php

Further investigation revealed an injection of the script into victim webpages via FTP:

These IPs all connected to the victim website within a 20-minute period on May 8th, and when plotted on a map, it becomes obvious that this is likely a botnet.


Sometimes the answer really is that simple…


Early this year, our CEO Amit Yoran challenged us to take this perpetual battle against malware to the next level.  One of the most common use cases of NetWitness NextGen today is combating various forms of commercial and custom malicious code, so helping our customers optimize their efforts in this regard seemed like a natural progression.  It seemed like every day customers or NetWitness analysts were finding yet another zero day, or yet another piece of custom code, or yet another group of professional thieves.  We are very fortunate to have so many experts on staff, as well as customers, who regularly use our solution to sift through terabytes of network data to identify threats from malware.

The first step was to interview these experts, and ask them how we could ease this effort.  We also asked them to quantify and explain their magic sauce.  Surprisingly, explaining how they go about their day-to-day efforts was very easy for most of them. However, when we asked what our system should do to help – nearly every one of them found it hard to come up with any specific requirements.  Inevitably, they would defer and simply say, “Well – do some of that stuff I just told you.  That’s a start.”

Tell us the same thing enough times, and we will eventually listen.  What if we automated all their steps during investigations? What if we could ask all the questions they ask?  What if all the information needed to highlight what was bad was analyzed for them?  It slowly dawned on us that their “secret sauce” was the answer.  We did not need to invent a new paradigm.  We needed to make their paradigms work and scale. They were telling us what they wanted done for them by describing all the laborious steps they took to get there manually.  They were telling us what tools, services, and intelligence they liked to use.  They were telling us the combinations of indicators that really piqued their interest.  And we had a very distinct advantage.  They were telling us all this by showing us in NetWitness what they look for.  We had already collected the majority of the information we needed.  We just needed to ask the right questions.

Today at the 2nd annual NetWitness user conference, we introduced NetWitness Spectrum.  We are in the process of taking requests for early access from any NetWitness customer. Spectrum is an expert automated analytics engine that provides extraction and prioritization of executable content within an enterprise.  Spectrum is your virtual malware expert, sifting through thousands of executables and doing the laborious legwork to prioritize malicious content, all on a continual, real-time, port- and protocol-independent basis.

Over the next few weeks, we will be discussing more and more features and capabilities of Spectrum. We have a history here at NetWitness of thinking a little differently than the industry tells us to think. We prefer to innovate rather than copy, lead rather than follow.  This time however, our innovation is purely you, our user community.  We are following your lead.

For more information regarding Spectrum and the early access program, visit www.netwitness.com.

Tim Belcher, CTO

It’s Malware!


Zeus is evolving. Regarding a new release, one anti-virus vendor recently noted:

“[the new exe] uses techniques designed to avoid automatic heuristics-based detection.”

The discussion then proceeds to examine how the exe is different from previous versions of the malware.

Should we be alarmed that Zeus is getting so sophisticated that it evades heuristics-based detection mechanisms?

I suppose if it actually evaded heuristics-based detection mechanisms, that would be alarming. I’m sure the version of Zeus in question evades the mechanisms of certain AV vendors. However, when looking at the exact sample in question (verified by MD5) using the techniques we use for malware identification here, we see the sample stands out like a sore thumb.

Using our own internally-developed heuristic malware identification methods (also used by components of NextGen), we see the exe has traits such as the following (not a complete list!):

  1. The binary contains packed sections, indicative of packed, obfuscated, and/or encrypted malware.
  2. The size of the binary is abnormally small considering the conditions and context in which it was found.
  3. The PE checksum fails to validate, something malware packers are notoriously bad about.
  4. The binary does not have any information normally found within the version info table in the resource section of the PE.
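As a rough illustration of the first trait, one widely used way to spot packed or encrypted sections is to measure byte entropy. This is a minimal sketch of that general technique, not the method used in our products; the function names and the 7.0 threshold are assumptions chosen for the example.

```python
import math
from collections import Counter

# Hypothetical cutoff: an assumption for illustration, not a NetWitness value.
PACKED_ENTROPY_THRESHOLD = 7.0


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(section_bytes: bytes) -> bool:
    """Flag a section whose bytes are close to uniformly distributed."""
    return shannon_entropy(section_bytes) >= PACKED_ENTROPY_THRESHOLD
```

Compressed or encrypted data tends toward 8 bits per byte, while ordinary code and padding score much lower, so a high value is a hint rather than proof.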

But… Why get overly wrapped around the minutiae related to the abnormal facets of this particular sample of Zeus? There’s a more important note to be made here. That is, Zeus is malware, so it does the things that malware does! You can’t get more “heuristically obvious” than that!

From the same vendor as above:

“…common ZeuS 2.0 variants contain relatively few imported external APIs… By contrast, [this version] imports many external APIs. To a heuristic scanner, this changes the appearance of the file and lowers the possibility of detection.”

Finding a binary that has very few external imports is generally a sign that something is suspicious. Specifically, it’s generally a sign the file is packed, obfuscated, and/or encrypted and the real imports are likely hidden inside. Such is the case when finding binaries that only import between two and five specific APIs from kernel32.dll (in the more obvious cases).

However, finding a binary with a lot of imports is even better, since you get to see the full range of imports needed by the binary/malware! Without even running the sample or doing deep low-level reverse engineering, you can start to make assumptions about the functionality of the binary based on the APIs it uses. Further, it’s a simple matter to separate malware from legitimate binaries by comparing the APIs it uses to the ones it doesn’t need/use.

As is the case with this sample of Zeus, we see it (like the thousands of different types of malware not related to Zeus) imports APIs related to hooking the Windows API, creating mutexes, and managing services, without importing the other functions that legitimate binaries using those same APIs normally import as well.
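To make the import-based reasoning concrete, here is a minimal sketch that takes a list of imported function names (however you extracted them) and reports which traits it shows. The API names are real Win32 functions, but the groupings, the cutoff of five, and the function name are assumptions made for this example, not a vendor rule set.

```python
# Hypothetical groupings of well-known Win32 API names; illustrative only.
SUSPICIOUS_GROUPS = {
    "hooking": {"SetWindowsHookExA", "SetWindowsHookExW"},
    "mutexes": {"CreateMutexA", "CreateMutexW"},
    "services": {
        "OpenSCManagerA", "OpenSCManagerW",
        "CreateServiceA", "CreateServiceW",
    },
}

# Assumed cutoff, loosely based on the "two to five imports" case above.
LOW_IMPORT_CUTOFF = 5


def import_traits(imports):
    """Return the set of trait labels suggested by a list of import names."""
    names = set(imports)
    traits = {
        group for group, apis in SUSPICIOUS_GROUPS.items() if names & apis
    }
    if len(names) <= LOW_IMPORT_CUTOFF:
        traits.add("few_imports")
    return traits
```

A binary that imports only `LoadLibraryA` and `GetProcAddress` would come back as `few_imports`, while one with a long import list that includes hooking and service-management calls would be labeled with those groups instead.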

So, should we be alarmed some people say Zeus is getting so sophisticated that it evades heuristics-based detection mechanisms?

If your security vendor is looking for Zeus, then yes, you should be alarmed. However, if your security vendor is looking for general signs of malware, infection, and so on, then no… Fortunately Zeus is still malware, just like all the rest of it…

Gary Golomb

I need to watch for 74,000 unique domains!

Uncategorized No Comments

In the “malware of the minute” news,  information surrounding the “Murofet” trojan has hit some malware research blogs.

Details around this trojan, which shares code similarities with ZeuS, can be found here:

http://blog.threatexpert.com/2010/10/domain-name-generator-for-murofet.html

What’s interesting about Murofet is that it borrows a page from the Conficker playbook and uses an algorithm to generate command and control domain names on the fly based on the date and time on the infected host. This makes it very difficult to take down from a defender standpoint because coordinated effort is required to control all of the possible domain names as they are detected.

In this case, reverse engineering has revealed a way to generate the domain names used by the malware in advance, which allows us to build a list of all possible domains that will be used by the malware in its current state.

But that brings us to our challenge. Murofet can generate 1,020 usable domain names a day, which, if we push that out a few months in advance, quickly reaches into the tens of thousands of possible domain names. If I’m an incident responder at a large enterprise, I may need to parse through multiple gigabytes a day of proxy logs to attempt to locate these tens of thousands of possibly malicious domains. As you can imagine, this can quickly become a very tedious and unwieldy problem.

One of the many strengths of the NextGen framework is that it is built around addressing this sort of “needle in a haystack” problem. The NetWitness Live system is built around the concept of using external intelligence and applying it to *your* network in real-time, with alerting. In some cases we have feeds with *millions* of entries.

In this case, and given a big list of Murofet domains, it is a trivial exercise to create a custom feed that identifies when they are seen on the network. Add an Informer Alert, and you have real-time notification if any one of these 74,000 domains is accessed by any of your monitored hosts. This strategy was also successfully used to track Conficker infections at some of our clients.
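For anyone stuck doing this by hand against proxy logs, the core idea is a set lookup rather than tens of thousands of string searches. The sketch below is a minimal illustration and not how a NetWitness feed works internally; the whitespace-separated log layout (timestamp, client IP, host, path) and the function names are assumptions.

```python
def load_domains(lines):
    """Build a lowercase set of domains from an iterable of text lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def find_hits(log_lines, domains, host_field=2):
    """Return log lines whose host field is in the domain set.

    Assumes a hypothetical whitespace-separated log format where the
    requested host is at index host_field.
    """
    hits = []
    for line in log_lines:
        fields = line.split()
        if len(fields) > host_field and fields[host_field].lower() in domains:
            hits.append(line)
    return hits
```

Because membership tests on a set take roughly constant time, the cost of a pass over the logs stays about the same whether the list holds 74 domains or 74,000.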

If you’d like more information on creating your own custom feeds, please see this link in the community:

https://www.netwitness.com/community/showthread.php?t=320

Hello Hilary…..I see you’ve met ZeuS.

kneber, zeus 1 Comment

The press has been abuzz over the past few weeks with news of law enforcement busts of some prominent ZeuS miscreants.

This has renewed interest at NetWitness around the data and publication of our “Kneber” paper, which documented the data stolen by a large ZeuS botnet.

Today I took a second look at the domains reported to malwaredomainlist.com (since the release of our research in February) that were registered by our nemesis, hilarykneber@yahoo.com:

http://www.malwaredomainlist.com/mdl.php?search=kneber&colsearch=Registrant&quantity=all

Here’s what I found:

Since February, seventy-one new “kneber” domains have been identified as malicious, and whois records indicate the vast majority of them were created after the publication of our research:

  • Created On:09-Feb-2010 20:20:43 UTC
  • Created On:13-Apr-2010 14:58:46 UTC
  • Created On:20-Jan-2010 13:02:23 UTC
  • Created On:20-Jan-2010 13:02:23 UTC
  • Created: 2009-12-22
  • Created: 2010-01-14
  • Created: 2010-02-09
  • Created: 2010-02-11
  • Created: 2010-02-12
  • Created: 2010-02-17
  • Created: 2010-02-18
  • Created: 2010-02-23
  • Created: 2010-02-23
  • Created: 2010-03-11
  • Created: 2010-03-11
  • Created: 2010-03-11
  • Created: 2010-03-15
  • Created: 2010-03-15
  • Created: 2010-03-15
  • Created: 2010-03-16
  • Created: 2010-04-13
  • Created: 2010-04-27
  • Created: 2010-05-06
  • Created: 2010-05-26
  • Created: 2010-06-10
  • Created: 2010-06-14
  • Created: 2010-06-14
  • Created: 2010-06-25
  • Created: 2010-06-29
  • Created: 2010-07-05
  • Created: 2010-07-08
  • Created: 2010-07-16
  • Created: 2010-07-26
  • Created: 2010-07-29
  • Created: 2010-08-01
  • Created: 2010-08-02
  • Created: 2010-08-06
  • Created: 2010-08-06
  • Created: 2010-08-06
  • Created: 2010-08-13
  • Created: 2010-08-14
  • Created: 2010-08-14
  • Created: 2010-08-17
  • Created: 2010-08-26
  • Created: 2010-08-28
  • Created: 2010-08-28
  • Created: 2010-08-28
  • Created: 2010-08-28
  • Created: 2010-09-05
  • Created: 2010-09-09
  • Created: 2010-09-21
  • Created: 2010-09-21
  • Created: 2010-10-05
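The whois creation lines above come in two different layouts, which makes them awkward to sort or count by eye. Here is a minimal sketch that normalizes both layouts and tallies registrations per month; the function names are made up for this example, and it only handles the two formats seen in this list.

```python
from collections import Counter
from datetime import datetime

# The two layouts seen above: "Created: 2010-02-17" and
# "Created On:09-Feb-2010 20:20:43 UTC".
DATE_FORMATS = ("%Y-%m-%d", "%d-%b-%Y %H:%M:%S UTC")


def parse_created(line):
    """Return the creation date from one whois 'Created' line."""
    value = line.split(":", 1)[1].strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized creation date: {line!r}")


def count_by_month(lines):
    """Tally creation dates per YYYY-MM."""
    return Counter(parse_created(line).strftime("%Y-%m") for line in lines)
```

Feeding the list above through `count_by_month` gives a quick view of how registrations are spread across the year.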

Of these domains, 56 had registrar information in the whois records, and 53 of those listed a single registrar:

Registrar: BIZCN.COM, INC.

These domains are being reported for a number of different malicious elements, but there are 100 instances of ZeuS components from this group of domains, including:

  • zeus v1 config file
  • zeus v1 drop zone
  • zeus v1 trojan
  • zeus v2 config file
  • zeus v2 drop zone
  • zeus v2 trojan

So what does this tell us about the state of the internet?

  • Domain registration and monitoring of activities is still a weak point in the security of the internet.
  • Top-level .com and .net DNS providers are in a key place to act against this sort of activity but don’t.
  • Despite massive press coverage and industry acknowledgement of hilarykneber@yahoo.com and associated maliciousness,  registrars (and BIZCN in particular) are still allowing ongoing registration by this email address, and not suspending existing “kneber” domains.
  • Not surprisingly, ZeuS is still very active.

NetWitness customers that subscribe to NetWitness Live automatically detect these domains due to our partnership with malwaredomainlist.com.

Happy Hunting!

Alex Cox, Principal Research Analyst

Tracking the “Here You Have” Worm

Malware Analysis, Network Forensics, Network Visbility, Situational Awareness, Uncategorized No Comments

If you’ve kept an eye on security news in the past 24 hours, you may have noticed some press around a new email worm spreading on corporate networks. Dubbed the “Here You Have” worm, it is a good case study on how to manage emerging threats with your NetWitness technology. You can find additional info on the worm here:

http://isc.sans.edu/diary.html?storyid=9529

As a general overview, the worm works in a similar manner as other recent malware observed in the wild.

  • It tempts the user to click on an attachment or link with a social engineering hook.
  • When clicked, the malware establishes itself on the targeted machine to run automatically and propagates itself.
  • The malware downloads additional executables intended to steal saved credentials and establishes a beacon mechanism to receive updates or transmit stolen data.

Like most emerging threats, research teams at NetWitness analyzed this variant as soon as we found out about it, and I’ll use a few basic incident response questions to demonstrate detection mechanisms using our technology.   One thing to note is that none of this worm’s activity requires any content generation other than simple application rules since the metadata extraction process in our engine  extracts all of the relevant meta by default.

1)  Who in my environment was targeted?

Targeted email addresses related to this worm’s activity can be detected by simply using a custom-drill in Investigator:

subject contains 'here you have','just for you' && email = 'iraq_resistance@yahoo.com'

This drill will focus the collection on the email sessions related to this activity, and relevant email addresses, ip addresses, hostnames, etc. can be extracted for additional analysis.

2) Who in my environment actually clicked on the link or attachment?

In this case, there are a few ways to detect this activity.   Once executed, the malware downloads a number of files with the extension “iq”.   Since this is an unusual extension, an initial quick pivot to locate infected hosts is:

extension = 'iq'

Or, you could specifically target some of the filenames themselves:

filename = 'ie.iq','pspv.iq','op.iq','im.iq','m.iq','w.iq','gc.iq','ff.iq','rd.iq','tryme.iq'

Or, you could look for hits to the alias.host where the files reside:

alias.host = 'members.multimania.co.uk' && directory contains 'yahoophoto'

Or, if your sniffing equipment is monitoring a backbone, you could look for the malware being copied to mapped network drives:

filename = 'pdf_document21_025542010_pdf.scr'

3) Who in my organization is infected and beaconing?

In this case, one of the downloaded files in Step 2 attempts to contact “tarekbinziad.no-ip.biz”, so you can use an alias.host pivot to locate machines that may have transmitted credentials to a third-party:

alias.host = 'tarekbinziad.no-ip.biz'

One thing to keep in mind is that both “tarekbinziad.no-ip.biz” and “members.multimania.co.uk/yahoophoto/” have been taken down by the security industry at this point, so with this variant, you are looking at a cleanup effort.   Also keep in mind that infected machines will continue to spam messages until they are cleaned.
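If you need to apply the same three questions to exported session metadata outside of Investigator, the logic of the drills above can be sketched in a few lines. This is only an illustration: the dictionary keys mirror the meta names used in the queries, but the data layout, the labels, and the case-insensitive subject match are assumptions, not a NetWitness API.

```python
SENDER = "iraq_resistance@yahoo.com"
SUBJECTS = ("here you have", "just for you")
DOWNLOAD_HOST = "members.multimania.co.uk"
BEACON_HOST = "tarekbinziad.no-ip.biz"


def classify_session(meta):
    """Label one session's metadata (a plain dict) against the worm's indicators."""
    labels = set()
    # Assumption: subject matching is done case-insensitively here.
    subject = meta.get("subject", "").lower()
    if meta.get("email") == SENDER and any(s in subject for s in SUBJECTS):
        labels.add("targeted")
    host = meta.get("alias.host", "")
    fetched_payload = meta.get("extension") == "iq" or (
        host == DOWNLOAD_HOST and "yahoophoto" in meta.get("directory", "")
    )
    if fetched_payload:
        labels.add("clicked")
    if host == BEACON_HOST:
        labels.add("beaconing")
    return labels
```

A host that shows up with both `clicked` and `beaconing` labels belongs at the top of the cleanup list.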

Happy Hunting!

Mini Decoder in Action @TaoSecurity

Network Forensics No Comments

Thanks, Richard at TaoSecurity, for the post on our mini device.

Leveraging Custom Actions in NetWitness Investigator

Malware Analysis, Network Forensics, Network Visbility, pentesting, Situational Awareness 1 Comment

One of the lesser-known features recently introduced in NetWitness Investigator is Custom Actions. Have you ever been analyzing a pcap in Investigator and thought, “I wish there was an easy way to push this information into another system…”? Custom Actions is a flexible extension system that will allow you to do just that. Here are just a few examples of what can be accomplished:

  • Automatically search your favorite search engine for a meta element.
  • Push meta into other systems for additional analysis or automation action.
  • Point and click searching of your favorite threat intelligence source.

Using a simple masking system and right-click actions, we allow you to quickly expand your investigation.  Below are three examples of common tasks involved in an incident response scenario to provide a working example.

Scenario: You are a senior incident responder at your organization, and one of your company’s help desk analysts reports strange behavior from an executive’s workstation. Luckily, this analyst is on his game and has provided you with a network capture from the workstation, which you import into Investigator for analysis. What are some tasks that you might want to do with this investigation?

Here are three possible tasks:
  1. Perform an nmap scan of the executive’s workstation to ascertain what services are listening on the network.
  2. Perform a Google search of observed filenames.
  3. Perform a threat intelligence search of the target domain using the URLvoid service.

To access custom actions, do the following:
  • Click on the Edit drop-down menu, and select Custom-Actions

Once you are here, you’ll see a GUI overview of the custom action system with a few examples. You may have a few in the list that were installed by default.

For Task 1, we’ll add a new NMAP Version Scan Custom Action. First, determine the desired command-line string for your nmap scan.  In this case I’m going to use:

cmd.exe /K c:\PROGRA~1\nmap\nmap.exe -PN -sV --top-ports 1000 192.168.2.245


To prep this as a custom action, I’d simply swap out the IP address with the ${VALUE} mask as follows:

cmd.exe /K c:\PROGRA~1\nmap\nmap.exe -PN -sV --top-ports 1000 ${VALUE}


In layman’s terms, this is going to open a command prompt window, launch nmap with a few options and scan the IP address that I specify in Investigator.

Once it has been added to the custom actions list, it’s a point and click affair going forward.
  • Right click on the target IP and Select “NMAP Version Scan” from the list of Custom Actions.

  • Watch it go!

For Task 2…I want to build a custom action that does a Google search on a target keyword.  In this example, the easiest way to start is to build a Google search query as follows:

http://www.google.com/search?q=dog

And just like above, replace the search term “dog” with the ${VALUE} mask, which will look like this:

http://www.google.com/search?q=${VALUE}

And again, add this to custom actions, and then it’s a right-click away inside Investigator.

For Task 3, we are going to use the same basic concept to search URLvoid.com for a domain’s presence on blacklists.  Just like Google, we’ll start with a search on URLvoid:

http://www.urlvoid.com/scan/cnn.com

Same steps as before, just replace the domain with the ${VALUE} mask:

http://www.urlvoid.com/scan/${VALUE}
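If it helps to see the masking idea outside of the GUI, the ${VALUE} substitution in the three actions above behaves like a simple template expansion. The sketch below uses Python's standard string.Template to mimic it; this only illustrates the concept and makes no claim about how Investigator implements it. The expand_action helper and the optional URL-encoding step are assumptions added for the example.

```python
from string import Template
from urllib.parse import quote

# The three action strings from this post, with the ${VALUE} mask in place.
NMAP_ACTION = r"cmd.exe /K c:\PROGRA~1\nmap\nmap.exe -PN -sV --top-ports 1000 ${VALUE}"
GOOGLE_ACTION = "http://www.google.com/search?q=${VALUE}"
URLVOID_ACTION = "http://www.urlvoid.com/scan/${VALUE}"


def expand_action(action, value, url_encode=False):
    """Replace the ${VALUE} mask in an action string with the clicked value.

    url_encode is a hypothetical convenience for URL actions whose value
    may contain spaces or other reserved characters.
    """
    if url_encode:
        value = quote(value, safe="")
    return Template(action).substitute(VALUE=value)
```

For example, `expand_action(URLVOID_ACTION, "cnn.com")` yields the same URL we started from in Task 3.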

Once these have been added, we can then use them to quickly investigate our scenario:

Using the nmap scan custom action, we find that our executive’s workstation at 192.168.2.245 is listening only on ports that it should be per our company’s security policy. We see that our Google custom action reveals that the involved filenames are related to a known ZeuS trojan infection, and we know by our URLvoid results that this command and control server now appears to be down. This PC is compromised, so we then declare an incident and proceed with our incident response plans.

As you can see, custom actions are very easy to implement.  These are simple examples, but with some effort, tools, a favorite scripting language, etc., the sky is the limit.  You could do things like:

  • As a Penetration Tester,  launch a metasploit attack against a target IP with a right click.
  • As a Firewall Administrator, add a firewall rule with a custom action feeding a script.
  • As an Incident Responder, quickly add a domain name to a threat database.

Do you have a good idea or example for a custom action? Post about it in the NetWitness Community; we’d love to hear about it!

Happy Hunting!
