Extracting Shellcode from .xlsx document - CVE-2017-11882


This blog post walks through how to unpack shellcode and retrieve IOCs related to the Pony malware, an info stealer. The sample used in this writeup can be obtained from the public malware sandbox app.any[.]run. The .xlsx document didn’t have any VBA or XLM macros, nor any hidden, very hidden, or protected sheets, which was a bit odd. So let’s see what it takes to analyze a document such as this.

Tools used:

pestudio (static analysis)

oledump.py

HxD (static analysis)

XORSearch

scdbg.exe

Disclaimer

You are dealing with a real malware sample. Run and analyze it in a controlled environment (sandbox) with no connection to the internal network or the internet. I am not responsible for any consequences or damages.

Hash:

MD5 - 25c200472d75090d7565f49f56157703
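Before starting, it is worth confirming that the file you downloaded is the same sample. The following is a minimal sketch using Python's standard hashlib module; the function names and the file path you pass in are my own placeholders, not part of any tool used in this post.

```python
import hashlib

# MD5 published in this post for the .xlsx sample.
EXPECTED_MD5 = "25c200472d75090d7565f49f56157703"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected value (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_expected("sample.xlsx") (a hypothetical file name) should return True only for the sample analyzed here.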

Analysis:

In a nutshell, this file is a weaponized Excel spreadsheet that was extracted from a zip file. The interesting part of this attack technique is the way the spreadsheet leverages Microsoft Equation Editor to download the Pony malware. The technique is well known and is tracked as CVE-2017-11882, a memory corruption vulnerability in Microsoft Equation Editor (EQNEDT32.EXE) that is typically exploited through RTF documents to run shellcode. This XLSX file contains an OLENativeStream instead of the commonly used Equation Native stream. Upon unzipping the file, we can find oleObject1.bin inside the xl/embeddings folder.
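Since an .xlsx file is a zip archive, the embedded objects can be listed without opening the document in Excel. Here is a small sketch using Python's standard zipfile module; the helper names are hypothetical, and it only assumes the xl/embeddings/ folder mentioned above.

```python
import zipfile


def list_embedded_objects(xlsx_path):
    """Return names of archive members stored under xl/embeddings/."""
    with zipfile.ZipFile(xlsx_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.startswith("xl/embeddings/") and not name.endswith("/")
        ]


def read_member(xlsx_path, member):
    """Return the raw bytes of one archive member without writing it to disk."""
    with zipfile.ZipFile(xlsx_path) as archive:
        return archive.read(member)
```

Reading the member into memory instead of extracting it keeps the embedded object off the file system, which is a little safer on an analysis machine.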







OLe10nATive stream

oledump.py shows that oleObject1.bin contains a stream called OLe10nATive. Streams of this kind hold the native data of a linked or embedded object; the stream is present when data from an embedded object in an OLE1.0 container document is converted to the OLE2.0 format. We extract the stream by using oledump.py to select object A1 and dump it to a file with the command "oledump.py -s A1 -v -d "FILE.xlsx" > output.bin". As you can see in the image below, there is an embedded OLE object in A1.
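As a rough illustration of what the dumped stream looks like, the OLENativeStream structure described in Microsoft's OLE specification starts with a 4-byte little-endian size field followed by the native data. The sketch below assumes output.bin holds the raw stream exactly as dumped; the function name is my own, and in malicious samples the declared size does not have to match the real data length.

```python
import struct


def split_ole10native(stream_bytes):
    """Split a dumped Ole10Native stream into (declared_size, native_data)."""
    if len(stream_bytes) < 4:
        raise ValueError("stream is too short to hold the 4-byte size field")
    # First field: NativeDataSize, an unsigned 32-bit little-endian integer.
    (declared_size,) = struct.unpack_from("<I", stream_bytes, 0)
    # Everything after the size field is treated as the native data.
    native_data = stream_bytes[4:]
    return declared_size, native_data
```

Comparing declared_size with len(native_data) is a quick sanity check; a mismatch is not proof of anything on its own, but it is worth noting.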



The command above parses the A1 element, decompresses it, and dumps it into a binary file. We then search this output for the hex string "E8 00 00 00 00" using a hex editor (HxD) to identify the presence of shellcode and extract it. The bytes E8 00 00 00 00 encode a call to the very next instruction, a common way for code to obtain its own address (GetEIP), so they can indicate where position-independent code starts. We also use the XORSearch tool to search the binary file for multiple 32-bit shellcode patterns using the command "xorsearch.exe -W output.bin". See below for output.
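The manual hex search can also be scripted. This is a minimal sketch that only looks for the E8 00 00 00 00 byte pattern; it is not a reimplementation of XORSearch, which checks several shellcode patterns, and the function name is hypothetical.

```python
# Opcode bytes for "call $+5": a call to the instruction that follows it.
CALL_NEXT = bytes.fromhex("E800000000")


def find_all(data, pattern=CALL_NEXT):
    """Return every offset in `data` where `pattern` begins."""
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets
```

Running find_all on the bytes of output.bin gives candidate offsets to inspect in the hex editor. A hit is only a hint, since the same five bytes can appear in ordinary data.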




Next, to analyze the shellcode, load the dumped stream (ole10native.bin) in scdbg with a start offset of 0x13E. As you saw above, a GetEIP method (the way shellcode finds its own location in memory) was found at offset 0x13E. We run scdbg with the options "/f filepath /foff hexnum", where hexnum is the file offset in hex. The output shows that we are on the right track, as we can see the unhooked call to ExpandEnvironmentStringsW. The decoded output from this shellcode shows the downloader URL, the saved file path, and the Windows API calls used by the second-stage shellcode, which will download and execute the final payload.

The URLDownloadToFileW call is used to download the Pony malware onto the host from the URL hxxp://mattgraumann[.]com/.../tochi3.exe. The file is saved in the current user's %APPDATA% folder, and the shellcode then attempts to create a new process from this binary. The key takeaway is the shift to delivering an OLENativeStream inside an Excel document, which makes detection more difficult because this OLE object is more commonly seen in RTF documents.



Thanks for reading!
