
Nymaim

The untold story

Jarosław Jedynak
Maciej Kotowicz

Intro

$ whois msm

Jarosław Jedynak

Software/Security Engineer @ CERT.pl
P4 CTF
Reverse engineering
Cryptography / Algorithmics
@msmcode

$ whois mak

Maciej Kotowicz

Principal Malware Researcher @ CERT.pl
DragonSector CTF
RE/Exploit dev
Automatization / Formal methods
@maciekkotowicz

Nymaim, long story short

Discovered in 2013, dropping TorrentLocker
Rediscovered in Feb/Apr 2016
With banking capability, which got us interested.

First discovered back in 2013, dropping TorrentLocker. It went silent until February/April 2016, when it incorporated leaked ISFB code into its MO and was dubbed GozNym. That is when it started being interesting for us ;]

Wild Goose Malware Chase.

There are two GozNym hashes provided in the latest IBM/Trusteer blog post https://securityintelligence.com/goznym-launches-redirection-in-the-united-states/ // but not in vt...

mak: I've been looking for a real sample for a while now; I only have some nymaims and no banking part

After a while many people came through, but there is no source like Shadowserver!

Distribution

EK/malvertising
H1N1/Rockguv/Hancitor malspam
standalone malspam

This is a bit of a retrospective, since all the data presented here arrived after we concluded our initial research.

Nymaim's EK history

kudos to Kafeine

eks

As we can see, these actors are not picky: they will use everything that works.

H1N1 malspam

h1n1 (hash:b2ed2df6dc227919ec139cba434093cb3cb0c1552a413c1d6b1a83286ef41696)

also: inst1.exe, c1.exe, s2.bin

Observed mostly 2-3 months ago; now only Vawtrak is being dropped via this vector?

Recent malspam

    To: Jakob Lang <jakob.lang@freenet.de>
    Message-ID: <8a2bdf4dee2842b311df802b3b33f1dd@guardian-vlg.ru>
    From: noreply@unverified.beget.ru
    Reply-To: Stellvertretender Sachbearbeiter Pay Online24 AG <admin@amazon.com> 
    ...
    Subject: *** GMX Spamverdacht *** Offene Rechnung: Buchungsnummer 39863821
    X-GMX-Antispam: 4 (nemesis spam server blocker); Detail=V3;
    X-GMX-Antivirus: 0 (no virus found)
    ...
    --b1_8a2bdf4dee2842b311df802b3b33f1dd
    Content-Type: application/octet-stream;name="Jakob Lang 28.09.2016.zip"
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename="Jakob Lang 28.09.2016.zip"

Part of an email sample found on VT; as we can see, it's highly personalized.

Prior research

getting started

Obfuscation

There have been at least a few articles describing nymaim's obfuscation, so we won't dig into the details here; this is just a quick reminder.

Obfuscation

All arguments are pushed onto the stack via a function that pushes a given register based on its id
Addresses of all functions are computed at runtime by applying one of a few transformations
Most constants are obtained at runtime by XORing with some magic value

Obfuscation

External functions

Real return address obfuscated, most tools confused.
Very frustrating when debugging & single-stepping.

External functions

def nymaim_get_api_consts():
    kpatt = '8F 45 E8 89 4D E4 E8 ? ? ? ? 89 C3 E8 ? ? ? ? 89 C2 89 45 FC 8B 4D E4 B8'
    kpatt+=' ? ? ? ? 29 C1 89 4D E0  C1 E9 02 83 F9 00 74 05 01 D3'
    kpatt2 = '8B 45 D8 3D ? ? ? ? 0F 84 ? ? ? ? 3D ? ? ? ?  0F 84 ? ? ? ? 3D ? ? ? ? 0F 84'
    faddr = FindBinary(idaapi.get_imagebase(),SEARCH_DOWN | SEARCH_REGEX, kpatt)
    key =  GetOperandValue(GetOperandValue(faddr + 6,0),1)
    xstep=  GetOperandValue(GetOperandValue(faddr + 13,0),1)
    off  =  GetOperandValue(faddr+26,1)
    kernl_h = 0x4b1ffe8e;ntdll_h = 0xab30a50a ## rol32(x,25) ^ c 
    faddr = FindBinary(idaapi.get_imagebase(),SEARCH_DOWN | SEARCH_REGEX, kpatt2)    
    x1 = Dword(faddr+4) ^ ntdll_h
    x2 = Dword(faddr+15) ^ kernl_h
    hash_xor = x1
    if x1 != x2:
        print '[!] whoops, please find your key manually'
        hash_xor=0
    return off,key,xstep,hash_xor

External functions

def nymaim_get_api(api_off,off,key,xstep,hash_xor):
    api_va = idaapi.get_imagebase() + api_off
    xsize = api_va - off
    for _ in range(xsize/4):
        key = (key + xstep) & 0xffffffff

    r = 0
    for i in range(4):
        r |= (Byte(api_va+i) ^ (_ror(key,(xsize&3)*8)&0xff) ) << i*8
        xsize +=1
        if xsize % 4 == 0:
            key = (key + xstep) & 0xffffffff

    return (r ^ hash_xor)
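
The snippets on these slides call `_ror` (and, in the DGA code, `ror` and `bswap`) without showing them. A minimal sketch of such helpers follows, assuming they are plain 32-bit rotations and a 32-bit byte swap; the names `ror32`, `rol32` and `bswap32` are ours, not taken from the original tooling.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    value &= MASK32
    count %= 32
    return ((value >> count) | (value << (32 - count))) & MASK32


def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    value &= MASK32
    count %= 32
    return ((value << count) | (value >> (32 - count))) & MASK32


def bswap32(value):
    """Reverse the byte order of a 32-bit value (like x86 BSWAP)."""
    return int.from_bytes((value & MASK32).to_bytes(4, "little"), "big")
```

Masking the input first matters in Python, where integers are unbounded: an expression such as `state[0] * 2` can exceed 32 bits before it is rotated.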

Other Obfuscations

Almost every constant used in program is stored encrypted, decrypted only just before being used, and encrypted again afterwards.
"Encrypted memcpy" function - works like memcpy, but checks if data needs to be encrypted/decrypted before copying.
A few functions are encrypted/decrypted on the fly.

Deobfuscation

Our deobfuscator is able to revert (more or less) all mentioned obfuscation techniques:

Untangle control flow, remove detours and junk code.
Decrypt all constants stored in program.
Recover API calls.

We're going to publish our toolset, eventually.

Static config

We'd like to extract static config from binaries, especially things like:

C&C addresses
DGA hashes
Encryption keys
Malware version
Other stuff needed for communication

So, we have a working deobfuscator, and we can finally get to real work. We'd like to extract (...), basically everything the binary needs to communicate with CnC. Because we'd like to analyze traffic, decrypt traffic, maybe send some requests on our own, download more samples/injects, or maybe, just maybe, become part of the botnet ourselves.

Static config

    def nymaim_extract_blob(self, mem, ndx):
        """decrypt final config (read keys and length and decrypt raw data)"""
        key0 = mem.dword(ndx)
        key1 = mem.dword(ndx+4)
        len = mem.dword(ndx+8)
        raw = mem.read(ndx + 12, len)

        prev_chr = 0
        result = ''
        for i, c in enumerate(raw):
            bl = ((key0 & 0x000000FF) + prev_chr) & 0xFF
            key0 = (key0 & 0xFFFFFF00) + bl
            prev_chr = ord(c) ^ bl
            result += chr(prev_chr)
            key0 = (key0 + key1) & 0xFFFFFFFF
            key0 = ((key0 & 0x00FFFFFF) << 8) + ((key0 & 0xFF000000) >> 24)
        return result

So, how can we do this? It turns out that all those things are stored in a static config block, encrypted with a custom algorithm. As you can see, it's not very complex, and we can just run it on an unpacked nymaim to get the static config automatically.
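
For experimenting outside the original tooling, the same routine can be sketched in Python 3 on plain `bytes`. This is a port under assumptions: `mem.dword` is taken to read little-endian 32-bit values, and `build_blob` is a hypothetical inverse that we added only so the decryptor can be checked with a round trip; it is not something recovered from the malware.

```python
import struct

MASK32 = 0xFFFFFFFF


def _crypt(key0, key1, data, decrypt):
    """Shared keystream loop; the feedback byte is always the plaintext byte."""
    out = bytearray()
    prev_plain = 0
    for c in data:
        bl = ((key0 & 0xFF) + prev_plain) & 0xFF
        key0 = (key0 & 0xFFFFFF00) | bl
        o = c ^ bl
        prev_plain = o if decrypt else c
        out.append(o)
        key0 = (key0 + key1) & MASK32
        key0 = ((key0 << 8) | (key0 >> 24)) & MASK32  # rotate left by 8
    return bytes(out)


def extract_blob(mem, ndx):
    """Read key0, key1, length at `ndx` and decrypt the data that follows."""
    key0, key1, length = struct.unpack_from("<III", mem, ndx)
    raw = mem[ndx + 12:ndx + 12 + length]
    return _crypt(key0, key1, raw, decrypt=True)


def build_blob(key0, key1, plain):
    """Hypothetical inverse of extract_blob, for testing only."""
    header = struct.pack("<III", key0, key1, len(plain))
    return header + _crypt(key0, key1, plain, decrypt=False)
```

Usage: `extract_blob(unpacked_image, offset)` returns the decrypted config as `bytes`, where `offset` is wherever the config header was located in the unpacked sample.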

Static config

struct chunk {
    uint32_t type;
    uint32_t length;
    char data[length];
};

The format of the config is really simple - it's just a sequence of chunks. Each chunk consists of a type, a length and raw data. At the bottom you can see the static config after decryption and parsing. That's not a very clean representation, so we ignore most of the chunks and extract only the most interesting ones.
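
The chunk layout described above can be walked with a few lines of Python 3. This is a sketch, assuming both header fields are little-endian (as on x86) and that `length` counts only the data bytes; `parse_chunks` is our name for it.

```python
import struct


def parse_chunks(blob):
    """Yield (type, data) for every chunk in a decrypted config blob."""
    offset = 0
    while offset + 8 <= len(blob):
        chunk_type, length = struct.unpack_from("<II", blob, offset)
        data = blob[offset + 8:offset + 8 + length]
        if len(data) != length:
            raise ValueError("truncated chunk at offset %d" % offset)
        yield chunk_type, data
        offset += 8 + length
```

Raising on a truncated chunk, rather than silently returning short data, makes a wrong decryption key or a wrong offset show up immediately.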

Static config

And this is final interpretation of that static config - as you can see, we have everything we originally wanted from that executable, and we can now start tampering with network communication.

Infection timeline

What? When? Why?

First things first

The dropper performs a few sanity checks, for example:

Makes sure that it's not virtualized or incubated
Compares current date to "expiration time" from static config
Checks that DNS works as it should (by trying to resolve microsoft.com and google.com)

If something isn't right, dropper shuts down and infection doesn't happen.

The first nymaim executed on the system (we call it the dropper) performs some checks and then downloads another nymaim (we call it the payload). The dropper makes sure that it's not virtualized/incubated, which is kind of standard now. More interestingly, the dropper checks the current date and asserts that it's neither too early nor too late - droppers usually work for only three days, so infecting yourself with a month-old nymaim is impossible. Finally, it checks that the internet connection works by making some DNS requests.

DNS magic

Static config contains (among others) two interesting pieces of information:

DNS server (virtually always it's 8.8.8.8 and 8.8.4.4).
C&C domain name (for example ejdqzkd.com or sjzmvclevg.com).

Nymaim is asking DNS server for A records for that domain... But returned IPs are not real C&C ip addresses.

https://github.com/vrtadmin/goznym/blob/master/DGA_release.py

The static config contains both the DNS server and the C&C domain name. Nymaim resolves that domain, but the returned A records are not the real C&C addresses - they are fed into a quite complex algorithm that produces the real IP address. We don't have time to say more about this, but if anyone is interested, Talos published an article about it a week ago.

Beyond dropper (stage 1)

When dropper gets to know C&C address, it starts real communication. It downloads two important binaries, and a lot more:

payload - banker module (responsible for webinjects - passive member of botnet)
optional bot module (it is trying to open ports on router, and become active part of botnet. When it fails to do so, it removes itself from system).
few additional malicious binaries (VNC, password stealers, etc - not very interesting for us).

Just read the slide.

Beyond dropper (stage 1)

Reiterate previous slide.

Beyond dropper (stage 2)

Payload is very different from dropper when it comes to network communication:

No hardcoded domain
But has DGA
And P2P

Just read the slide.

Beyond dropper (stage 2) - DGA

def dga_single(self, state):
    name = ''
    len = self.getbyte(state, 8) + 5
    for i in range(len):
        r = self.getbyte(state, 0xFFFFFFFF)
        c = self.getbyte(state, 26) + 0x61
        name += chr(c)
    n = 0
    while n == 0:
        n = self.getbyte(state, 5)
    name += '.' + [0, 'net', 'com', 'in', 'pw'][n]
    return name

And this is the Domain Generation Algorithm. I won't talk about it for too long, but I thought it was worth publishing. As you can see, it just concatenates random characters.

Beyond dropper (stage 2) - DGA

XorShift variation

def getbyte(self, state, param):
    temp0 = ((state[0] << 11) ^ state[0]) & 0xFFFFFFFF
    temp2 = state[2]
    state[0] = (state[0] + state[1]) & 0xFFFFFFFF
    state[1] = (state[1] + state[2]) & 0xFFFFFFFF
    state[2] = (state[2] + state[3]) & 0xFFFFFFFF
    state[3] = ((state[3] >> 19) ^ state[3] ^ temp0 ^ (temp0 >> 8)) & 0xFFFFFFFF
    return (((state[3] + temp2) & 0xFFFFFFFF) % (param * 100)) / 100

Random characters are generated by variation of XorShift.

Beyond dropper (stage 2) - DGA

    def __init__(self, seed, date):
        arg8 = seed + date.day + (date.year << 9) + (date.month << 5)
        state = [0] * 4
        state[0] = (arg8 + seed) & 0xFFFFFFFF
        state[1] = ror(state[0] * 2, 4)
        state[2] = ror(bswap(state[1]), 0xE) + seed
        state[3] = ror(state[2] + state[1], 0x12)
        for i in range(16):
            next_byte = self.getbyte(state, 0xFFFFFFFF)
            dword_ndx, byte_ndx = i / 4, i % 4
            byte_mask = 0xFF << (byte_ndx * 8)
            state[dword_ndx] = ((state[dword_ndx] & ~byte_mask) |
                ((next_byte & 0xFF) << (byte_ndx * 8)))
        self.state = state

The initial state of the PRNG depends only on the seed stored in the static config and the current date.
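
The three DGA fragments can be combined into one self-contained Python 3 class. This is a port of the slide code under assumptions, not a verified reimplementation: `ror` and `bswap` are taken to be a 32-bit rotate right and byte swap, all arithmetic is masked to 32 bits as it would be in the x86 binary, and calling `dga_single` repeatedly on the same state is assumed to produce the successive domains for a day.

```python
import datetime

MASK32 = 0xFFFFFFFF
TLDS = [None, "net", "com", "in", "pw"]  # index 0 is never used


def ror32(v, n):
    v &= MASK32
    return ((v >> n) | (v << (32 - n))) & MASK32


def bswap32(v):
    return int.from_bytes((v & MASK32).to_bytes(4, "little"), "big")


class NymaimDga:
    def __init__(self, seed, date):
        arg8 = seed + date.day + (date.year << 9) + (date.month << 5)
        state = [0] * 4
        state[0] = (arg8 + seed) & MASK32
        state[1] = ror32(state[0] * 2, 4)
        state[2] = (ror32(bswap32(state[1]), 0xE) + seed) & MASK32
        state[3] = ror32(state[2] + state[1], 0x12)
        # Overwrite the 16 state bytes one by one with PRNG output.
        for i in range(16):
            next_byte = self.getbyte(state, 0xFFFFFFFF)
            dword_ndx, byte_ndx = divmod(i, 4)
            byte_mask = 0xFF << (byte_ndx * 8)
            state[dword_ndx] = ((state[dword_ndx] & ~byte_mask & MASK32)
                                | ((next_byte & 0xFF) << (byte_ndx * 8)))
        self.state = state

    @staticmethod
    def getbyte(state, param):
        """XorShift variation; returns a value in range(param)."""
        temp0 = ((state[0] << 11) ^ state[0]) & MASK32
        temp2 = state[2]
        state[0] = (state[0] + state[1]) & MASK32
        state[1] = (state[1] + state[2]) & MASK32
        state[2] = (state[2] + state[3]) & MASK32
        state[3] = ((state[3] >> 19) ^ state[3] ^ temp0 ^ (temp0 >> 8)) & MASK32
        return (((state[3] + temp2) & MASK32) % (param * 100)) // 100

    def dga_single(self):
        state = self.state
        name = ""
        length = self.getbyte(state, 8) + 5
        for _ in range(length):
            self.getbyte(state, 0xFFFFFFFF)  # discarded, as in the original
            name += chr(self.getbyte(state, 26) + 0x61)
        n = 0
        while n == 0:
            n = self.getbyte(state, 5)
        return name + "." + TLDS[n]
```

Usage: `NymaimDga(seed, datetime.date.today()).dga_single()`, where `seed` is the DGA value extracted from the static config. The seed used for testing here is arbitrary, so the generated names are not real nymaim domains.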

Beyond dropper (stage 2) - P2P

DGA is not the only option
Peers are exchanging their IP addresses between themselves.
We managed to intercept over 15000 IP addresses over last few weeks (most of them unresponsive/dead now).
More about it later.

But DGA is not the only option, and that's the most important thing we wanted to share. Because (...).

Network communication

or where are our injects

Typical request

And finally something technical. This is an example of typical nymaim request
(P2P and C2 communication looks the same in that regard).
* Host header taken from static config
* POST variable name and path randomized
* POST variable value = encrypted request (base64 encoded)
* Everything else hardcoded

Typical response

And this is typical response.
* Not really nginx
* All headers hardcoded
* Body = encrypted response from peer.

Message format

And this is format of encrypted message.
* Nibble of first byte = length of salt
* Nibble of second byte = length of padding
* Everything between salt and padding - encrypted message

Message format

Code for message decryption

    def nymaim_decrypt(key, raw_bytes):
        nibble0 = raw_bytes[0] & 0xF
        nibble1 = raw_bytes[1] & 0xF
        salt = raw_bytes[2:2+nibble0]
        password = key + salt

        data = raw_bytes[2+nibble0:len(raw_bytes)-nibble1]
        decrypted = rc4_decrypt(password, data)
        decrypted_len = struct.unpack('<I', decrypted[:4])[0]
        assert decrypted_len == len(decrypted) - 4

        return decrypted

After reverse engineering the algorithm, it's easy to decrypt message. We just (read the code).
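
The slide code relies on an `rc4_decrypt` helper that is not shown. A self-contained Python 3 sketch of the same envelope follows, with a textbook RC4 written out in pure Python. The `nymaim_encrypt` function is hypothetical: it exists only to build test messages, and it assumes the high nibbles of the first two bytes can be left at zero, which the slides do not confirm.

```python
import struct


def rc4(key, data):
    """Textbook RC4; encryption and decryption are the same operation."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def nymaim_decrypt(key, raw):
    salt_len = raw[0] & 0xF          # low nibble of first byte
    pad_len = raw[1] & 0xF           # low nibble of second byte
    salt = raw[2:2 + salt_len]
    body = raw[2 + salt_len:len(raw) - pad_len]
    decrypted = rc4(key + salt, body)
    (declared_len,) = struct.unpack_from("<I", decrypted, 0)
    if declared_len != len(decrypted) - 4:
        raise ValueError("length prefix does not match message size")
    return decrypted                 # still includes the 4-byte length prefix


def nymaim_encrypt(key, payload, salt=b"\x01\x02\x03", padding=b""):
    """Hypothetical inverse, used only to produce test messages."""
    body = rc4(key + salt, struct.pack("<I", len(payload)) + payload)
    return bytes([len(salt), len(padding)]) + salt + body + padding
```

A wrong key almost always fails the length check, which makes it a cheap way to confirm that the key extracted from the static config is the right one.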

Message format

Message = sequence of chunks.

Chunk has type, length, and type-specific data

After decrypting the message, we get a sequence of chunks, just like with the static config. Each chunk has its type, length, and data, in that order.

Message format

Code used for message processing

def parse_message(blob):
    i = 0
    while i < len(blob):
        chunk_type = blob[i:i+4]
        chunk_len = from_uint32(blob[i+4:i+8])
        chunk_content = blob[i+8:i+8+chunk_len]
        process_chunk(chunk_type, chunk_content)
        i += 8 + chunk_len

This is the basic code used for parsing a message. Each chunk type needs to be processed a bit differently.

Message format

Important chunks have another layer of encryption & compression

So we can't push our binaries or injects to whole botnet (without private key, at least)

Just read the slide.

Message format

A lot of data is compressed with aplib32 before encryption, to save some transfer.

Just read the slide.

Message format

    def inner_decrypt(raw, rsa_key):
        encrypted_header, encrypted_data = raw[-0x40:], raw[:-0x40]
        decrypted_data = rsa_decrypt(encrypted_header, rsa_key)

        md5 = decrypted_data[0:16]
        blob = decrypted_data[16:32]
        length = from_uint32(decrypted_data[32:36])

        serpent_decrypted = crypto.s_decrypt(encrypted_data, blob)[:length]
        assert md5 == hashlib.md5(serpent_decrypted).digest()

        return serpent_decrypted

Message format - request

Interesting things sent to server

Current state - versions of downloaded files, injects, etc (for example chunks with types 4a6fbfd2, 1c225a3e and 8fc11cf3)
Various fingerprints (for example chunks with types 6ee5d5ff, e02b4e01 and f90670f7)
Fragment of current peer list (chunk with type 14c58ebe)
Pressed keys (only when infected user visits 'interesting' website, for example bank)

Just read the slide.

Message format - request

This is debug view from our tool used to impersonate nymaim and communicate with C2. For example you can see a lot of fingerprints sent to remote server.

Message format - response

Interesting chunks received from server

More peers (chunk with type 14c58ebe) again
More executable modules (bf2f5c87, 2861bc3b, ae61bc39, 6cc51d26, 6cc51fa0, and more)
Web filters (types 35e7f241, 48a9c01e, 5ea9c018, 2e7c713d)
Injects (types 40185e1f, 0c2f0f92, 0c2f0f93)!
Other - public ip of peer, number of seconds for client to sleep (usually between 80 and 300), etc.
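
When triaging a decrypted response, a small lookup table built from the type IDs listed above is often enough to see what the server sent. The sketch below assumes the IDs are compared as the hex strings shown on the slide; how those strings map to the four raw type bytes on the wire (byte order) is not stated here, so that conversion is left to the caller. `label_chunk` and `summarize` are our names.

```python
from collections import Counter

# Response chunk types as listed on the slide.
CHUNK_LABELS = {
    "14c58ebe": "peer list",
    "bf2f5c87": "executable module",
    "2861bc3b": "executable module",
    "ae61bc39": "executable module",
    "6cc51d26": "executable module",
    "6cc51fa0": "executable module",
    "35e7f241": "web filter",
    "48a9c01e": "web filter",
    "5ea9c018": "web filter",
    "2e7c713d": "web filter",
    "40185e1f": "injects",
    "0c2f0f92": "injects",
    "0c2f0f93": "injects",
}


def label_chunk(type_hex):
    """Map a chunk type (hex string as on the slide) to a coarse label."""
    return CHUNK_LABELS.get(type_hex.lower(), "unknown")


def summarize(type_hexes):
    """Count how many chunks of each label a response contained."""
    return Counter(label_chunk(t) for t in type_hexes)
```

For example, `summarize(["14c58ebe", "40185e1f", "0c2f0f92"])` counts one peer list and two inject chunks.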

Just read the slide.

Message format - response

Another screen from our tool, showing a typical response from the server. The most interesting thing here is the list of peers. If the C&C dies, or the DGA fails, the P2P network will still be working.

Message format - response

Last screen: no peers this time, but the server has just sent us fresh injects.

P2P Network

the untold story

P2P? Yes, P2P.

Adding an exception to Windows firewall? Check.

╰─$ strings decrypted_nymaim | grep -E "#!#|Firewall"
#!#*|Action=Allow|#*
#!#*|Action=Block|#*
#!#*|Active=TRUE|#*
#!#*|Active=FALSE|#*
#!#*|Dir=In|#*
#!#*|Dir=Out|#*
#!#*|Profile=Private|#*
#!#*|Profile=Public|#*
#!#*|LPort=#*
#!#*|RPort=#*
\Registry\Machine\SYSTEM\ControlSet001\Services\SharedAccess\Parameters
    \FirewallPolicy\StandardProfile\AuthorizedApplications\List
\Registry\Machine\SYSTEM\ControlSet001\Services\SharedAccess\Parameters
    \FirewallPolicy\FirewallRules

We couldn't find any previous research about nymaim's P2P botnet, so it's the most interesting thing we wanted to share in our presentation. But if someone still doubts that nymaim really is P2P, I'll try to convince you with some strings from the P2P bot module. First, P2P malware has to get past the firewall - as you can see, nymaim adds itself to the firewall exceptions.

P2P? Yes, P2P.

Opening ports on router? Check.

╰─$ strings decrypted_nymaim | grep -E "PortMap|upnp"
DeletePortMapping
urn:schemas-upnp-org:service:WANPPPConnection:1
urn:schemas-upnp-org:device:InternetGatewayDevice:1
GetSpecificPortMappingEntry
upnp:rootdevice
AddPortMapping
AddAnyPortMapping
urn:schemas-upnp-org:service:WANIPConnection:1
NewPortMappingDescription

Here we have strings characteristic of opening ports on a router with UPnP - something that P2P malware often does to get through the NAT on consumer-grade networks.

P2P? Yes, P2P.

Masquerading as nginx server? Check.

╰─$ strings decrypted_nymaim | grep -E "nginx" -B 4
HTTP/1.1 200 OK
Connection: close
Content-Length: %u
Content-Type: application/octet-stream
Server: nginx/1.9.4

And here, we have the same fake nginx response that C&C is returning. So, if we tried hard enough, this binary we are looking at could be used to become a peer in nymaim's botnet.

P2P monitoring

We didn't become a peer after all, but we crawled through the whole thing, and here's what we found. According to our crawler, most of the supernodes are in Poland, some in Germany, and about 15% in the US. But the botnet is strongly geolocated, and we were focusing on Poland (we're working for a Polish company, after all), so YMMV.

P2P monitoring

49.9% (~7.5k) of found supernodes in Poland, 30% (~7.5k) in Germany, 15.7% (~2.2k) in US
Botnet is geolocated, and we were focusing on Poland, so YMMV
All of them used to be alive, though most seem to be down now.

(repeat)