:: hiddenillusion :: ... look beyond

Go Prefetch Yourself



Overview

If you’re reading this then I’m sure you’re aware of what Prefetch on a Windows system is so I won’t bore you with a recap. Instead, I’d rather touch upon a different view of Prefetch and how I’ve leveraged it in non-traditional ways during my forensicating. Occasionally I’ve come across a few situations where I needed both sides of a Prefetch file. By two sides, I’m referring to:

  1. the prefetch filename (application name + path hash)
  2. the full path for where the file it was created for resided during execution

I’ve come across various verbiage when reading on this topic so for the remainder of this post, I’ll be referring to these two items, and some others as:

Term            Example
--------------  -------
original path   C:\Users\user\AppData\Local\Temp\svchost.exe
filename        svchost.exe
file directory  \Users\user\AppData\Local\Temp\
kernel path     \DEVICE\HARDDISKVOLUME1\USERS\USER\APPDATA\LOCAL\TEMP\SVCHOST.EXE
device path     \DEVICE\(HARDDISKVOLUME#|LANMANREDIRECTOR|HGFS)
prefetch file   SVCHOST.EXE-41CE8261.pf
path hash       41CE8261

TL;DR

If you only have details about the prefetch file, you can attempt to “bruteforce” the original path by iterating combinations of:

device path + known file directory + filename

Otherwise, if you only have details about the file directory\filename but aren’t sure which device held the file, you can attempt to “bruteforce” the original path by iterating combinations of:

all possible/known device paths + known file directory + filename

Hashing

If you look at libjoachim’s notes, the steps to generate the name of a prefetch file involve:

  1. Determine the full path for the executable, e.g. let’s assume the full path for “notepad.exe” is “C:\Windows\notepad.exe”.
  2. Convert the full path into an upper-case Windows device path: “\DEVICE\HARDDISKVOLUME1\WINDOWS\NOTEPAD.EXE”
  3. Convert the string into a UTF-16 little-endian stream without a byte-order-mark or an end-of-string character (2x 0-bytes)
  4. Apply the appropriate hash function.

To put this into perspective:

On a Windows XP (32-bit) system, calculating the prefetch hash of “\DEVICE\HARDDISKVOLUME1\WINDOWS\NOTEPAD.EXE” should generate the value 0x189578da. This in turn should correspond to the prefetch hash value in the prefetch file (e.g. - C:\Windows\Prefetch\NOTEPAD.EXE-189578DA.pf).
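The four steps can be sketched in Python. This is a minimal sketch rather than the code from any of the tools referenced in this post: the function names are hypothetical, the default device is only a guess, and the hash shown is assumed to match the Vista-style SCCA function described in the libscca documentation. XP and 2008/7 and later use different functions, which would be passed in through the `hash_func` parameter. Validate the output against a prefetch file you already trust before relying on it.

```python
def to_kernel_path(original_path, device="\\DEVICE\\HARDDISKVOLUME1"):
    # Step 2: swap the drive letter for a device path and upper-case it.
    if len(original_path) >= 2 and original_path[1] == ":":
        original_path = original_path[2:]  # drop "C:" style prefixes
    return (device.rstrip("\\") + "\\" + original_path.lstrip("\\")).upper()


def encode_kernel_path(kernel_path):
    # Step 3: UTF-16 little-endian; this codec adds no BOM and no NUL.
    return kernel_path.encode("utf-16-le")


def vista_style_hash(data):
    # Step 4 (ASSUMPTION: Vista-style function): rolling hash, 32-bit wrap.
    hash_value = 314159
    for byte in data:
        hash_value = (hash_value * 37 + byte) % 0x100000000
    return hash_value


def prefetch_filename(original_path, device="\\DEVICE\\HARDDISKVOLUME1",
                      hash_func=vista_style_hash):
    kernel_path = to_kernel_path(original_path, device)
    path_hash = hash_func(encode_kernel_path(kernel_path))
    executable = kernel_path.rsplit("\\", 1)[-1]
    # Zero-padding to 8 hex digits is a formatting choice made here.
    return "{}-{:08X}.pf".format(executable, path_hash)


if __name__ == "__main__":
    print(prefetch_filename(r"C:\Windows\notepad.exe"))
```

Keeping the hash function as a parameter means the same path handling can be reused for every SCCA variant, which is what makes the brute-force approach later in this post cheap to run.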

Those Ah-Ha’s

In addition to the hashing method described above, you may come across an instance where:

Note: On Windows Vista and Windows 7 the volume indicated by C: is often the second volume (where the boot partition is the first) hence the Windows device path for C: will be “\DEVICE\HARDDISKVOLUME2”. ref

While that note is definitely something to be aware of, you also need to consider situations where that may not be the case (e.g. - a Windows 7 virtual machine vs. a harddisk with Windows 7 pre-installed from Dell). If you’re unsure, the best approach is just to loop through “\device\harddiskvolume#”.
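Looping through the volume numbers is only a few lines. The sketch below is illustrative: `candidate_kernel_paths` and the `max_volume` cut-off are hypothetical names and values chosen for this example, not options of any tool mentioned here.

```python
def candidate_kernel_paths(file_directory, filename, max_volume=9):
    # Normalise the directory to "\DIR\SUBDIR\" (or "\" for the root)
    # so the pieces join without doubled or missing separators.
    stripped = file_directory.strip("\\")
    directory = "\\" + stripped + "\\" if stripped else "\\"
    for number in range(max_volume + 1):
        device = "\\DEVICE\\HARDDISKVOLUME{}".format(number)
        yield (device + directory + filename).upper()


if __name__ == "__main__":
    for path in candidate_kernel_paths("\\Users\\user\\AppData\\Local\\Temp\\",
                                       "svchost.exe", max_volume=3):
        print(path)
```

Each generated path can then be fed to whichever SCCA hash function matches the operating system in question.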

What about hosting applications and command line arguments?

In these cases, the Prefetch file name no longer relies on a device .exe path only. It does take it into account of course, but it also includes a command line used to launch an application itself and/or /Prefetch command line argument if it exists. ref

Tooling Around

Now that we have an understanding of how a Prefetch file’s hash is created we need to translate those steps into some usable code. There are other code snippets out in the interwebs but since these are commonly referenced and have been used successfully, here are three resources to help with the generation of Prefetch path hashes:

Code/Resource     Notes
----------------  -----
prefetch_hash.py  Standalone Python script that generates the name of the prefetch file given a kernel path to the program. Supports [1] XP/2003/Vista/2008/7 SCCA hashing algorithms.
libscca           Contains Python functions to produce the same as the above script but has support for newer SCCA hashing & doesn’t require an additional module. Supports XP/2003/Vista/2008/7/2012/8/8.1 SCCA hashing algorithms.
prefhashcalc.pl   Prefetch hash calculator and lookup table generator. It also supports calculating the prefetch hash when there are command line arguments (e.g. - dllhost, mmc, rundll32, svchost). Supports (some bitness of) XP/2003/Vista/2008/7 SCCA hashing algorithms.

[1] Even though it doesn’t say it, this script should likely support Windows 2012/8/8.1

Additionally, here’re some other resources leveraged for this post:

Code/Resource                              Notes
-----------------------------------------  -----
list_mft.py                                Python script that parses a $MFT and provides entry details (file paths of files on said system in this case) + supports jinja2 templating
prefetchparser                             volatility plugin that scans a memory dump for Prefetch files and provides the prefetch file/path hash/original path
generate_prefetch_hashes.py                Script I wrote to combine the above mentioned hashing algorithms; it allows one to supply filepaths a few ways & has the ability to try and brute force a filepath for you.
volatility                                 Because it will rock the memory out of you
Memory dumps from Jackcr’s DFIR challenge  Memory dumps used for testing + parsed $MFT’s to validate memory findings & SCCA hash calculations
jq                                         will be your new bestie for dealing with JSON data (but it might take some getting used to)

Why Do You Care?

There are several reasons why this blog post might ring a bell for you, or why you might bookmark it for future engagements if you’re not already aware of and leveraging this type of technique. Some of the more obvious reasons why I’m even writing about this are:

When you only have the path hash, the ability to map the path hash to an original path produces evidence that:

  1. the file resided at the original path at one point in time
  2. the file at the original path executed on the system

When you can determine where the file behind a prefetch file was originally located, you:

  1. can determine what device the application was actually located on
  2. have the ability to map the original file to having contact with said system
  3. have an indication that the file at the original path executed on the system

Use Cases

So what might some of those situations I mentioned previously actually entail, you ask…

  1. Did said file exist on the system, and if so, what was its original path?
  2. Were there any indications said file executed on the system?
  3. You recovered or carved the Prefetch file.
  4. You only have references to the prefetch file as a string (think keyword hit in unallocated space).
  5. A reference to the original path was found in another artifact (event logs, $MFT, A/V logs etc.) and no prefetch file was found, or what you’re analyzing doesn’t cover/contain those details.
  6. You know the prefetch file but can’t determine which device it was originally located on.

Scenario Uno

Through some means (timeline analysis, an entry within A/V logs etc.) we found a file of interest, or original path; however, this file was no longer present on the system at the time of analysis.

Q. What do we know?

In this scenario we only have the full path to the file of interest C:\Users\User\AppData\Local\Temp\svchost.exe. The physical prefetch file is either not present on the system or the evidence we’re sifting through doesn’t contain it (e.g. - just reviewing A/V logs or $MFT).

Q. Solution

We can leverage the known SCCA hashing code and try to determine what the prefetch file would have been.

    ~/Desktop/
    $ generate_prefetch_hashes.py -i 'C:\Users\User\AppData\Local\Temp\svchost.exe'
    {
        "2BF01587": [
            "xp_gleeda",
            "xp_libyal"
        ],
        "41CE8261": [
            "vista_gleeda",
            "vista_libyal",
            "2008_libyal"
        ],
        "device_used": "\\DEVICE\\HARDDISKVOLUME1\\",
        "filepath": "\\DEVICE\\HARDDISKVOLUME1\\USERS\\USER\\APPDATA\\LOCAL\\TEMP\\SVCHOST.EXE"
    }
    $ prefetch_hash.py -v -p '\DEVICE\HARDDISKVOLUME1\USERS\USER\APPDATA\LOCAL\TEMP\SVCHOST.EXE'
    SVCHOST.EXE-41CE8261.pf

In the above example, we:

  1. Used the original path that we knew and ran it through each SCCA hashing function with the device path as HARDDISKVOLUME1
  2. Validated with another script that the hex version (41CE8261) of the calculated path hash was correct.

But what if HARDDISKVOLUME1 isn’t the correct device? (refer back). In this situation, instead of supplying a device path we can use the --brute_force option in my proof-of-concept script and generate path hash values for multiple SCCA hashing algorithms & multiple (known) device paths. While the script may not be perfect, the thought process is on track.

    ~/Desktop/
    $ generate_prefetch_hashes.py -b -i 'Users\User\AppData\Local\Temp\svchost.exe'
    {
        "390E4197": [
            "xp_gleeda",
            "xp_libyal"
        ],
        "81D3D7CC": [
            "vista_gleeda",
            "vista_libyal",
            "2008_libyal"
        ],
        "device_used": "\\DEVICE\\HARDDISKVOLUME0\\",
        "filepath": "\\DEVICE\\HARDDISKVOLUME0\\USERS\\USER\\APPDATA\\LOCAL\\TEMP\\SVCHOST.EXE"
    }
    {
        "2BF01587": [
            "xp_gleeda",
            "xp_libyal"
        ],
        "41CE8261": [
            "vista_gleeda",
            "vista_libyal",
            "2008_libyal"
        ],
        "device_used": "\\DEVICE\\HARDDISKVOLUME1\\",
        "filepath": "\\DEVICE\\HARDDISKVOLUME1\\USERS\\USER\\APPDATA\\LOCAL\\TEMP\\SVCHOST.EXE"
    }
    ...
    {
        "22D6F8E6": [
            "xp_gleeda",
            "xp_libyal"
        ],
        "4ECE2F8": [
            "vista_gleeda",
            "vista_libyal",
            "2008_libyal"
        ],
        "device_used": "\\DEVICE\\LANMANREDIRECTOR\\X\\",
        "filepath": "\\DEVICE\\LANMANREDIRECTOR\\X\\USERS\\USER\\APPDATA\\LOCAL\\TEMP\\SVCHOST.EXE"
    }
    ...

In the above output, the --brute_force switch allows us to iterate known device paths and concatenate them to our known file directory\filename. As you can see, the HARDDISKVOLUME1 entry gives the same result as our previous attempt, “41CE8261”.

Since this route produces a lot of path hash values, one possible option afterwards would be to scan whatever evidence/artifacts are available to you for any of the newly generated prefetch file names; if you get a hit then you’ll know the original path.
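That follow-up scan can be as simple as a case-insensitive substring search. The sketch below assumes you have already turned the brute-force output into a mapping of candidate prefetch file name to kernel path, and that your evidence is available as lines of text (for example the output of a strings run). `find_candidate_hits` is a hypothetical helper and the evidence lines are made up for illustration.

```python
def find_candidate_hits(lines, candidates):
    """Return {prefetch name: kernel path} for each candidate seen in lines."""
    # Compare in upper case so "svchost.exe-41ce8261.PF" still matches.
    wanted = {name.upper(): name for name in candidates}
    hits = {}
    for line in lines:
        upper_line = line.upper()
        for upper_name, name in wanted.items():
            if upper_name in upper_line:
                hits[name] = candidates[name]
    return hits


if __name__ == "__main__":
    temp = "\\USERS\\USER\\APPDATA\\LOCAL\\TEMP\\SVCHOST.EXE"
    candidates = {
        "SVCHOST.EXE-81D3D7CC.pf": "\\DEVICE\\HARDDISKVOLUME0" + temp,
        "SVCHOST.EXE-41CE8261.pf": "\\DEVICE\\HARDDISKVOLUME1" + temp,
    }
    # Made-up evidence lines standing in for keyword-search output.
    evidence = [
        "C:\\WINDOWS\\Prefetch\\svchost.exe-41ce8261.PF",
        "something unrelated",
    ]
    for name, path in find_candidate_hits(evidence, candidates).items():
        print(name, "->", path)
```

For large evidence sets a compiled regular expression or a multi-pattern matcher would scale better; the nested loop is kept here for readability.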

Scenario Dos

A keyword search conducted on a physical image of the system yielded hits for various “svchost.exe” related prefetch files.

Q. What do we know?

We know that there were multiple hits for “svchost.exe” prefetch files but only have their filenames (e.g. - SVCHOST.EXE-41CE8261.pf)

Since this is a commonly used application with both malicious and legitimate use cases, knowing that said Prefetch file once resided on the system isn’t overly useful by itself (e.g. - did it execute from %windir%\System32 or somewhere else?).

Q. Solution

In this situation, we can:

  1. Build a list of known device paths (shares, virtual machines etc.)
  2. Build a list of directories on the system being investigated, from a golden image system etc. (refer here for guidance)

In short, we need some device paths and some file directories so we can build possible kernel paths with our filename.
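Put together, the reverse lookup is a product of device paths and directories, hashed and compared against the hash embedded in the prefetch file name. The sketch below is an illustration, not the logic of generate_prefetch_hashes.py: the helper names are hypothetical, and the default hash is assumed to be the Vista-style SCCA function, so pass a different `hash_func` for other Windows versions.

```python
import itertools
import re


def vista_style_hash(data):
    # ASSUMPTION: Vista-style SCCA hash over UTF-16LE bytes, 32-bit wrap.
    hash_value = 314159
    for byte in data:
        hash_value = (hash_value * 37 + byte) % 0x100000000
    return hash_value


def split_prefetch_name(prefetch_name):
    # "SVCHOST.EXE-41CE8261.pf" -> ("SVCHOST.EXE", 0x41CE8261)
    match = re.fullmatch(r"(.+)-([0-9A-F]{1,8})\.pf", prefetch_name,
                         re.IGNORECASE)
    if match is None:
        raise ValueError("not a prefetch file name: " + prefetch_name)
    return match.group(1), int(match.group(2), 16)


def brute_force_original_path(prefetch_name, device_paths, directories,
                              hash_func=vista_style_hash):
    filename, wanted_hash = split_prefetch_name(prefetch_name)
    matches = []
    for device, directory in itertools.product(device_paths, directories):
        middle = directory.strip("\\")
        parts = [device.rstrip("\\")] + ([middle] if middle else []) + [filename]
        kernel_path = "\\".join(parts).upper()
        if hash_func(kernel_path.encode("utf-16-le")) == wanted_hash:
            matches.append(kernel_path)
    return matches


if __name__ == "__main__":
    devices = ["\\DEVICE\\HARDDISKVOLUME{}".format(n) for n in range(4)]
    dirs = ["\\Windows\\System32\\", "\\Users\\user\\AppData\\Local\\Temp\\"]
    print(brute_force_original_path("SVCHOST.EXE-41CE8261.pf", devices, dirs))
```

An empty result means the real directory or device is missing from your lists, or the wrong hash function was used for that operating system. Note that this only covers the plain path case, not the hosting-application case where command line arguments feed into the hash.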

Thinking Outside the Box

Enumerating Directories

Keeping a list of device paths, directories and original paths -or- knowing how to quickly generate them can be handy in a pinch.

While some original paths are more constant, you should ensure your list contains any third party or system/environment specific directories/original paths not traditionally known (e.g. - special applications installed or mapped shares mean additional directories/original paths need to be accounted for).

Disk

One universal option we can use in this situation is leveraging TSK’s fls to recursively list the full paths to each directory of a given file system.

    ~/Desktop/
    $ fls -o 2048 -Drp /mnt/vmdk1 | awk -F'\t' '{print $2}' | sed 's/\//\\/g'
    $Extend
    $Extend\$RmMetadata
    $Recycle.Bin
    $Recycle.Bin\S-1-5-21-3670647999-409174923-3062832813-1000
    Boot
    Boot\cs-CZ
    ...
    Config.Msi
    Documents and Settings
    PerfLogs
    PerfLogs\Admin
    Program Files
    Program Files\7-Zip
    ...
    \Users\foo\Application Data
    ...
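Listings like the one above need a little normalisation before they can be joined to a device path: a leading and trailing backslash, upper case, and no duplicates. A minimal sketch follows; `normalise_directories` is a hypothetical helper, and adding the volume root as a candidate is a choice made for this example.

```python
def normalise_directories(lines):
    """Turn raw directory listings into unique, upper-cased file directories."""
    ordered = ["\\"]  # the volume root is always worth trying
    seen = {"\\"}
    for line in lines:
        # Drop the newline and any leading/trailing backslashes.
        directory = line.strip().strip("\\")
        if not directory:
            continue
        normalised = "\\" + directory.upper() + "\\"
        if normalised not in seen:
            seen.add(normalised)
            ordered.append(normalised)
    return ordered


if __name__ == "__main__":
    listing = [
        "Program Files\\7-Zip\n",
        "\\Users\\foo\\Application Data\n",
        "program files\\7-zip\n",
    ]
    for entry in normalise_directories(listing):
        print(entry)
```

The same helper works on any one-directory-per-line source, so it can be reused for directory lists pulled from other artifacts.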

Standalone Artifact

For this post, I’m just going to leverage the $MFT but there are certainly a number of other artifacts one could enumerate directories/original paths from as well.

We can use the default settings of list_mft.py and create a bodyfile which will contain the data we’re looking for.

    ~/Desktop/
    $ python INDXParse/list_mft.py \$MFT
    0|\\\$MFT|0|0|256|0|196870144|1318062771|1318062771|1318062771|1318062771
    0|\\\$MFT (filename)|0|0|256|0|196870144|1318062771|1318062771|1318062771|1318062771
    $ python INDXParse/list_mft.py \$MFT | sort -u | wc -l
    421049
    $ python INDXParse/list_mft.py \$MFT | awk -F '|' '{print $2}' | grep -v "(filename)" | sort -u | wc -l
    231010

…but, as indicated above, this will also include stuff we’re not interested in.

Have no fear, we still don’t need to rewrite anything because we can leverage jinja2 templating. You can see an example of how to provide a format, and in this instance, what variable to use here.

One option is to leverage jinja2 templating and only print the filepaths from the $MFT.

    ~/Desktop/
    $ python INDXParse/list_mft.py --format "{{ record.path }}" \$MFT
    \$MFT
    \$MFTMirr
    \$LogFile
    ...
    \$Extend
    \$Extend\$Quota
    ...
    \dell
    ...

…but that still means we have to slice-n-dice the output later since we just need unique directories. Did you know you could rock some more complex statements?

By looking into the code a bit more, we can provide an if test in the format so we only get directories (saves us slicing and dicing later); this will add some more processing time initially.

    ~/Desktop/
    $ python INDXParse/list_mft.py --format "{% if record.is_directory == 2 %} {{ record.path }} {% endif %}" \$MFT
    \$Extend
    \$Extend\$RmMetadata
    \$Extend\$RmMetadata\$TxfLog
    \$Extend\$RmMetadata\$Txf
    \dell
    \Users\user\AppData\LocalLow\Microsoft
    \Users\user\AppData\LocalLow\Microsoft\CryptnetUrlCache
    ...

sa-weet. In my testing this took a bit longer to parse the $MFT and provide those filtered results, but it’s geek-tastic.

Memory

Contained within the community plugins repository is a copy of the prefetch plugin for volatility. This plugin leverages volatility’s built-in scanning to look for the prefetch signature SCCA across pages of memory.

When a potential prefetch file is found, based on the profile assigned to the memory dump (currently supports XP -> 7), the plugin attempts to validate that it’s truly a prefetch file by looking at some of the PF_HEADER information. Based on the Windows Prefetch File format, parsing this initial information, which this plugin does, is simple. Unfortunately, however, this plugin doesn’t provide the full path of the file.

This may be due to:

  1. a limitation of the data resident in memory
  2. it being much easier just to present this initial information (jumping around offsets and parsing the various sections’ data to get all the details is a PITA).

Regardless, we can overcome both by leveraging volatility’s filescan.

While having the basic prefetch details is useful, having the original path is also important (if this is news to you, re-read everything). I didn’t see my issue getting any love so I created a PR.

    ~/Desktop/
    $ vol.py -f ENG-USTXHOU-148/memdump.bin prefetchparser --full_paths --output=json --output-file=prefetch.json
    Volatility Foundation Volatility Framework 2.5
    Outputting to: prefetch.json
    $ cat prefetch.json | jq .
    {
      "columns": [
        "Prefetch File",
        "Execution Time",
        "Times",
        "Size",
        "File Path"
      ],
      "rows": [
        [
          "IPCONFIG.EXE-2395F30B.pf",
          "2012-11-26 23:07:31 UTC+0000",
          "2",
          "26602",
          "\\DEVICE\\HARDDISKVOLUME1\\WINDOWS\\SYSTEM32\\IPCONFIG.EXE"
        ],
        ...

While there wasn’t a large addition to processing time with the modified prefetch plugin, remember that the possibility exists that we won’t be able to determine the original path that maps to a prefetch file. This could be due to a few things, but the most obvious is that it wasn’t contained within one of the file paths enumerated (resident) via FileScan.
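If you would rather post-process the JSON in Python than in jq, the columns/rows layout shown above maps neatly onto dictionaries. This is a small sketch with hypothetical helper names. Treating an empty or missing “File Path” as an unresolved entry is an assumption, since the output above doesn’t show what the plugin emits in that case.

```python
import json


def prefetch_rows(report):
    """Pair each row with the column names from the same report."""
    columns = report["columns"]
    return [dict(zip(columns, row)) for row in report["rows"]]


def paths_by_prefetch_file(report):
    """Map prefetch file name -> file path for rows that have a path."""
    return {
        row["Prefetch File"]: row["File Path"]
        for row in prefetch_rows(report)
        if row.get("File Path")  # ASSUMPTION: empty means unresolved
    }


if __name__ == "__main__":
    # Trimmed to the single row shown in the output above.
    sample = json.loads(r"""
    {
      "columns": ["Prefetch File", "Execution Time", "Times", "Size", "File Path"],
      "rows": [
        ["IPCONFIG.EXE-2395F30B.pf", "2012-11-26 23:07:31 UTC+0000", "2", "26602",
         "\\DEVICE\\HARDDISKVOLUME1\\WINDOWS\\SYSTEM32\\IPCONFIG.EXE"]
      ]
    }
    """)
    for name, path in paths_by_prefetch_file(sample).items():
        print(name, "->", path)
```

For a real run, replace the inline sample with `json.load()` on the prefetch.json file produced above.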

Happy forensicating.

Additional Reading