Wednesday, March 16, 2016

Jump List Forensics: AppID Master List (400+ AppIDs)

TL;DR: The list of 400+ manually generated Jump List application IDs can be found HERE.

About 5 years ago, I wrote two blog posts related to Windows Jump Lists [1] [2]. These two posts covered jump list basics and focused mainly on how each application that is run on a Windows machine has the potential to generate a %uniqueAppID%.automaticDestinations-ms file in the C:\Users\%user%\AppData\Roaming\Microsoft\Windows\Recent\AutomaticDestinations\ directory. The AppID lists I created in 2011 have been useful to me in the past, so I decided to expand them. Many tools use these lists, as well. With that, I've recently added over 100 more unique application AppIDs and combined them into one list.

As a refresher, each application (depending on the executable's path and filename) will have a unique Application ID (i.e. AppID) that will be included in the name of the .automaticDestinations-ms jump list file. Jump lists provide additional avenues in determining which applications were run on the system, the paths from which they were run, and the files recently used by them.
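Since the AppID is embedded right in the filename, you can read every AppID present on a system straight out of the AutomaticDestinations folder. A quick Python sketch (the function and regex names are mine, not from any tool mentioned here):

```python
import re

# The 16-hex-digit AppID is everything before ".automaticDestinations-ms".
APPID_RE = re.compile(r"^([0-9a-f]{16})\.automaticDestinations-ms$", re.IGNORECASE)

def extract_appids(filenames):
    """Return the lowercased AppIDs found in a list of filenames."""
    appids = []
    for name in filenames:
        m = APPID_RE.match(name)
        if m:
            appids.append(m.group(1).lower())
    return sorted(appids)
```

Feed it the output of os.listdir() on the AutomaticDestinations path, then look each returned AppID up in the master list.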

The catch is that you need to know which AppIDs will be generated for certain applications. And, at this point in the game, the only way to know that is to either (a) manually generate the .automaticDestinations-ms files or (b) know the executable's absolute path and use Hexacorn's AppID Calculator. Either way, you need to have some kind of starting information to come back with an answer.

As we already know, two ways in which the .automaticDestinations-ms files are generated are:


Both of these methods will show you the application's jump list, thereby generating/modifying the application's .automaticDestinations-ms file. In this case, that file is named:

9b9cdc69c1c24e2b.automaticDestinations-ms

...with 9b9cdc69c1c24e2b being the AppID for 64-bit Notepad.

In the AppID list, you will notice a few entries containing multiple versions of applications. Many of these applications retain their default installation location as they are updated to new versions, which essentially means that the AppID will stay the same. As an example, if we take a look at iTunes, we'll see that iTunes has an AppID of 83b03b46dcd30a0e; I tested and verified that in 2011. If we take a look at a more recent version, we can see that the AppID has remained the same. This is because the newer version is installed (and then run) from the same location as the old version, which causes the AppID to remain the same among different versions. If you want to learn more about how the AppID is actually generated, I highly recommend that you read through Hexacorn's blog post here.

With that, you can find the AppID master list at the following location:

Note that with the release of Eric Zimmerman's JLECmd (Jump List Explorer Command Line), an investigator can gain better insight into the applications for which the jump list files were generated.

As Eric explains in his Jump Lists In-Depth post, jump lists are (more or less) collections of LNK files. So, for example, if you have a jump list .automaticDestinations-ms file that has an unknown AppID and you see that the LNK files contained within it all point to a specific file type (say, AutoCAD .dwg drawing files), you might be able to conclude that the jump list belongs to an AutoCAD-related program. Obviously, this is a very simple example, but you get the idea. You have more information to work with now.
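That triage approach is easy to automate: given target paths already extracted from an unknown jump list (by JLECmd or any LNK parser), tally the file extensions so a dominant type stands out. A hedged sketch; the function name and sample paths are hypothetical:

```python
from collections import Counter
from pathlib import PureWindowsPath

def extension_profile(lnk_targets):
    """Count target-file extensions recovered from an unknown jump list.

    A heavy skew toward one extension (e.g. .dwg) hints at the kind of
    application that owns the jump list.
    """
    return Counter(PureWindowsPath(p).suffix.lower() for p in lnk_targets)
```

For example, extension_profile([r"C:\plans\house.dwg", r"C:\plans\deck.DWG", r"C:\tmp\notes.txt"]) puts ".dwg" on top with a count of 2.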

The AppID master list is a work in progress and will likely be updated occasionally throughout its life cycle.

-Dan (@4n6k)

1. Jump Lists In Depth (by Eric Zimmerman)
2. Introducing JLECmd! (by Eric Zimmerman)
3. JumpLists file names and AppID calculator (by Hexacorn)
4. Jump List Forensics: AppIDs Part 1 (by 4n6k)
5. Jump List Forensics: AppIDs Part 2 (by 4n6k)

Saturday, May 23, 2015

Forensics Quickie: NTUSER.DAT Analysis (SANS CEIC 2015 Challenge #1 Write-Up)

FORENSICS QUICKIES! These posts will consist of small tidbits of useful information that can be explained very succinctly.

SANS posted a quick challenge at CEIC this year. I had some downtime before the conference, so I decided to take part. In short, SANS provided an NTUSER.DAT hive and asked three questions about it. Below is a look at my process for answering these questions and ultimately solving the challenge. It's time to once again refresh our memories with the raw basics.

The Questions

Given an NTUSER.DAT hive [download], the questions were as follows:
  1. What was the most recent keyword that the user vibranium searched for using Windows Search on the nromanoff system?
  2. How many times did the vibranium account run excel.exe on the nromanoff system?
  3. What is the most recent Typed URL in the vibranium NTUSER.DAT?

The Answers

Right off the bat, we can see that these questions are pretty standard when it comes to registry analysis. Let's start with the first question.

Question #1: Find the most recent keyword searched using Windows Search.

First, we must understand what the question is asking. "Windows Search" refers to searches run using the following search fields within Windows:

Windows Search via the Start Menu.


Windows Search via Explorer.

The history of terms searched using Windows Search can be found in the following registry key:


For manual registry hive analysis, I use Eric Zimmerman's Registry Explorer. Once I open up the program and drag/drop the NTUSER.DAT onto it, I typically click the root hive (in the left sidebar) and just start typing whichever key I'd like to analyze. In this case, I started to type "wordwheel" and Registry Explorer quickly jumped to the registry key in question. Note that you can also use the "available bookmarks" tab in the top left to find a listing of some common artifacts within the loaded hive (pretty neat feature; try it out).

Registry Explorer displaying the WordWheelQuery regkey (MRUListEx value selected).

In the screenshot above, notice that the MRUListEx value is highlighted (sound familiar? We also saw the MRUListEx value when analyzing shellbag and RecentDocs artifacts). This value shows us the order in which the Windows Search terms were searched. The first entry in the MRUListEx value is "01 00 00 00." This means that the registry value named "1" holds the most recently searched item. If we analyze the MRUListEx value further, we notice that the next entry is "05 00 00 00," indicating that the value named "5" holds the term that was searched just before the most recent one. But we're only concerned with the most recently searched term, so let's look at what the value named "1" contains:

Registry Explorer displaying the WordWheelQuery regkey (value "1" selected).

We note that the Unicode representation of the hex values is "alloy." And just like that, we have our answer to question #1. The most recent Windows Search term is "alloy."
Note: MRUListEx Item Entries

Each entry in the MRUListEx value will be 4 bytes in length stored in little endian. That is, each entry is going to be a 32-bit integer with the least significant byte stored at the beginning of the entry. E.g. an entry for "7" would be shown as "07 00 00 00."
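The layout above translates into just a few lines of Python if you're parsing MRUListEx by hand. This sketch (my own naming) assumes the value is a run of 32-bit little-endian integers terminated by 0xFFFFFFFF:

```python
import struct

def parse_mrulistex(data):
    """Parse an MRUListEx value: 32-bit little-endian integers, most
    recently used first, terminated by 0xFFFFFFFF."""
    order = []
    for (entry,) in struct.iter_unpack("<I", data):
        if entry == 0xFFFFFFFF:
            break
        order.append(entry)
    return order
```

Running it against the data from this challenge would return a list starting with 1, then 5, matching the order described above.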

Question #2: Find the number of times excel.exe was run.

For question #2, we are concerned with program execution. And, as we already know, there is no shortage of artifacts that can be used to determine this (.lnk files, Windows Error Reporting crash logs, Prefetch, AppCompatCache, etc.). However, we are limited to only the NTUSER.DAT hive for this challenge. As such, the artifact we will want to look at will be UserAssist.

Remember that unlike Prefetch, UserAssist artifacts will show us run counts per user instead of globally per system. Since we would like to determine how many times excel.exe has been run by a specific user, UserAssist is the perfect candidate.

UserAssist artifacts can be found in the following registry key:


Just as with question #1, let's open up Registry Explorer and start typing "userassist" to quickly find our way to this key.

Registry Explorer displaying the UserAssist regkey. ROT13'd EXCEL.EXE, run counter, and last run time highlighted. 

Within the UserAssist key, there will be two subkeys that each contain a "Count" subkey. For this challenge, we will be looking in the {CEBFF5CD-ACE2-4F4F-9178-9926F41749EA} subkey. Each value's name within the "Count" subkey is ROT13 encoded, so let's decode the value for Excel.

ROT13:   {7P5N40RS-N0SO-4OSP-874N-P0S2R0O9SN8R}\Zvpebfbsg Bssvpr\Bssvpr14\RKPRY.RKR

Decoded: {7C5A40EF-A0FB-4BFC-874A-C0F2E0B9FA8E}\Microsoft Office\Office14\EXCEL.EXE

The first part of the decoded path is the Windows KNOWNFOLDERID GUID that translates to the %ProgramFiles% (x86) location.

We've now pinpointed the application in question. Next, we need to find out how many times it has been run.

The run counter for the EXCEL.EXE UserAssist entry can be found at offset 0x04. In this instance, the run counter happens to be 4, indicating that EXCEL.EXE was run four times. Note that on an XP machine, this counter would start at 5 instead of 1. So if you are manually parsing this data, remember to subtract 5 from the run counter when dealing with XP machines.

We've successfully answered question #2. EXCEL.EXE was run 4 times.

But wait, there's more! Check the 8-byte value starting at offset 0x3C (60d). That's the last time the program was run. Convert this 64-bit FILETIME value to a readable date/time using DCode.

DCode showing the decoded last run time of EXCEL.EXE

EXCEL.EXE was last run Wed, 04 April 2012 15:43:14 UTC. On to question #3.
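Pulling both fields out of a Windows 7 UserAssist value is straightforward with struct. A sketch, assuming the 72-byte Windows 7 value layout described above (run count at offset 0x04, FILETIME at offset 0x3C); the function name is my own:

```python
import struct
from datetime import datetime, timedelta, timezone

def parse_userassist_win7(data):
    """Pull the run count (offset 0x04) and last-run time (offset 0x3C)
    out of a 72-byte Windows 7 UserAssist value."""
    run_count = struct.unpack_from("<I", data, 0x04)[0]
    filetime = struct.unpack_from("<Q", data, 0x3C)[0]
    # FILETIME: 100-nanosecond intervals since 1601-01-01 UTC.
    last_run = datetime(1601, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=filetime // 10)
    return run_count, last_run
```

No need to fire up DCode for a one-off conversion if you already have Python handy.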

Note: Determining OS version with NTUSER.DAT only

As a side note, we can now tell that the machine housing this NTUSER.DAT was a post-XP/Server 2003 machine. How? Well, there are a few indicators:
  • UserAssist entries on Windows XP are 16 bytes in length, while Windows 7 UserAssist entries are 72 bytes in length.
  • The two subkeys under the root UserAssist key ({CEBFF5CD-ACE2-4F4F-9178-9926F41749EA} and {F4E57C4B-2036-45F0-A9AB-443BCFE33D9F}) are typically found on machines running Windows 7.
  • We see references to the "C:\Users" folder in other UserAssist entries instead of the "Documents and Settings" folder that is typically found on XP machines.
  • The run counter for EXCEL.EXE is less than 5 -- on an XP machine, this counter would be at LEAST 5.

Question #3: Find the most recent TypedURL.

The TypedURLs registry key stores URLs that are manually typed/pasted into Internet Explorer. Clicked links are not stored here.

Using our tried and true Registry Explorer process, let's look at what the TypedURLs registry key has to offer. Navigate to the NTUSER.DAT\Software\Microsoft\Internet Explorer\TypedURLs key by typing "typedurls" in Registry Explorer.

Registry Explorer showing the TypedURLs regkey.

As we can see above, the value labeled "url1" will always be the most recent TypedURL. Also, check the LastWrite time on the root TypedURLs key to determine the last time "url1" was typed.

Question #3: answered. Challenge complete.


Now, let's assume you've actually got some work to finish. You've gone through this stuff manually at least once and understand it inside and out. It's time to automate.

Harlan Carvey's RegRipper has plugins to quickly pull/parse the registry keys covered here and much, much more. Yes, we can answer all three of these questions with a one-liner.

C:\RegRipper2.8>rip.exe -r Vibranium-NTUSER.DAT -p wordwheelquery >> ntuser.log && rip.exe -r Vibranium-NTUSER.DAT -p userassist >> ntuser.log && rip.exe -r Vibranium-NTUSER.DAT -p typedurls >> ntuser.log

Remember, though: it's one thing to be able to run RegRipper. It's another to know where the output is coming from and why you're seeing what you're seeing.

Again, this is nothing new; this challenge is actually on the easier side of analysis. But, if at any point you had doubts about the artifacts covered here, it's worth going back and refreshing your memory.

-Dan Pullega (@4n6k)

1. UserAssist Forensics (by 4n6k)
2. INSECURE Magazine #10 (by Didier Stevens)
3. ROT13 is used in Windows? You’re joking! (by Didier Stevens)
4. KNOWNFOLDERID (by Microsoft)
5. FILETIME structure (by Microsoft)

Tuesday, August 26, 2014

Forensic FOSS: - Install Volatility For Linux Automatically

Introducing FORENSIC FOSS! These posts will consist of open source software for use in everyday forensic investigations.

[UPDATE #01 11/12/2015]: Volatility 2.5 was released recently. A standalone Linux executable is included with the 2.5 release. This installer is for Volatility 2.4. If you want to work with source code and get an idea of the dependencies needed by Volatility, review this installer's code. Otherwise, I'd recommend using the official Linux standalone executable. If you really want a working 2.5 installer update, see the fork of this project by @wzod.

What Is It?
A bash script that installs Volatility 2.4 (and all dependencies) for Ubuntu Linux with one command.

Why Do I Need It?
Compiling source on Linux can be a pain. Dependency hunting wastes time and drives people away from considering Linux builds of cross-platform software. With this script, you can (1) skip all of the dependency frustration, (2) get right into using the newest version of Volatility, and (3) leverage the arguably more powerful and versatile *nix shell. No longer do you have to worry about whether or not you "have everything."

What's Required?
An internet connection and an APT-based Linux distribution [for the time being]. This script has been tested on stock Ubuntu 12.04 and Ubuntu 14.04. Some testing has been done to support SIFT Workstation 3.0.

What Does It Do?
Specifically, the script does the following:
  • Downloads, verifies, extracts, and installs source archives for everything you will need to complete a full installation of Volatility 2.4:
    • Volatility 2.4
    • diStorm3
    • Yara (+ magic module) + Yara-Python
    • PyCrypto
    • Python Imaging Library + Library Fixes
    • OpenPyxl
    • IPython
    • pytz
  •  Adds "" to your system PATH so that you can run Volatility from any location.
  • Checks to see if you are using SIFT 3.0 and applies some fixes. 
How Do I Use It?
Volatility will be installed to the directory you specify.
  • From a terminal, run: 
    • sudo bash /home/dan
In the above example, the following directories will be created:
  • /home/dan/volatility_setup 
    • Contains dependency source code and the install_log.txt file.
  • /home/dan/volatility_2.4
    • Contains the Volatility 2.4 install.
You must enter a full path for your install directory.
Where Can I Download It?
You can download the script from my Github page.

Check the Github page for the script's SHA256 hash.

Bottom Line?
Don't be afraid of the terminal. Read the source for this script and understand how it works. Automation is acceptable only after you understand what is happening behind the scenes.

I'm open to making this script better. If you see a problem with the code or can suggest improvements, let me know and I'll see what I can do.

Thanks to the Volatility team for writing an amazing tool. Go to for more info. Thanks to @The_IMOL, Joachim Metz, @dunit50, and @iMHLv2 for feedback.

Wednesday, April 16, 2014

Forensics Quickie: Merging VMDKs & Delta/Snapshot Files (2 Solutions)

FORENSICS QUICKIES! These posts will consist of small tidbits of useful information that can be explained very succinctly.

I had a VM that was suspended. I needed to see the most recent version of the filesystem. Upon mounting the base .vmdk file, I was presented with the filesystem that existed before the snapshot was taken.

Solution #1
(turns out I ran into a similar problem before...see my post on Mounting Split VMDKs).

The issue lies within the fact that when you create a VM snapshot, "all writes to the original -flat.vmdk are halted and it becomes read-only; changes to the virtual disk are then written to these -delta files instead"[2]. So basically, when I was mounting the base .vmdk file (the "someVM-flat.vmdk"), I wasn't seeing anything that was written to the disk after the snapshot was created. I needed a way to merge the delta files into the -flat file.

To further explain what I was working with, I had three .vmdk files:
  1. Win7.vmdk (1KB in size; disk descriptor file; simply pointed to the -flat.vmdk)
  2. Win7-flat.vmdk (large file size; the base .vmdk file)
  3. Win7-000001.vmdk (delta/snapshot file; every write after the snapshot is stored here)
*Note that this image was not split. If it was, I would just use the method detailed in my other post

As mentioned, I needed to merge these all into one to be able to mount the .vmdk and see the most recent version of the filesystem. VMware includes a CLI tool for this. It is stored in the "Program Files\VMware" folder. Run this command.

vmware-vdiskmanager.exe -r Win7-000001.vmdk -t 0 singleFileResult.vmdk

Note that the .vmdk file being used as input should be the .vmdk for the latest snapshot. You can confirm which .vmdk file this is by checking the VM's settings.

You can also define what kind of disk you want to output. I have never found it necessary to use anything other than 0.

You can now mount the new .vmdk to see the most recent version of the file system. I *imagine* you could do this for previous snapshots if you define the proper .vmdk. But I have not tested that.

Solution #2
The other solution, which I wound up using after finding out how to do it correctly, is to import the .vmdk files in order within X-Ways. If you try to import a delta file before the base .vmdk, X-Ways will throw an error saying:

"In order to read a differencing VMDK/VHD image, the corresponding parent must be (and stay) opened first. They should be opened in order of their last modified dates - oldest first, skipping none."

So, I did as it said; I imported Win7.vmdk, then Win7-000001.vmdk. It's that easy.

Though this method may become cumbersome with many snapshots/delta files, you would be able to incrementally see what writes had been made to each snapshot. Just be careful when adding delta files for snapshots that depend on previous snapshots (see below).

VMware's Snapshot Manager showing snapshot dependencies.

Thanks to Eric Zimmerman, Jimmy Weg, and Tom Yarrish for helping with the X-Ways method.

-Dan Pullega (@4n6k)

1. Consolidating snapshots in VMware Fusion
2. Understanding the files that make up a VMware virtual machine

Saturday, March 29, 2014

Forensics Quickie: Verifying Program Behavior Using Source Code

FORENSICS QUICKIES! These posts will consist of small tidbits of useful information that can be explained very succinctly.

Recently, I stumbled upon a question related to the behavior of a given program. That program was Mozilla Firefox and the behavior in question was how profile directory names were generated. Through this post, I will cover how to approach this question and how to solve it with available resources (source code).

The Question
How are Firefox profile directory names generated?

The Answer (and the road to get there)
To answer this question, we first have to understand which artifacts we are examining. In this case, we are dealing with Firefox profiles. Those are located in a user's AppData folder. By navigating AppData, we eventually find C:\Users\someUser\AppData\Roaming\Mozilla\Firefox\Profiles.

The Firefox 'Profiles' folder showing the directory for the profile named "default."

We see a folder for the only Firefox profile used on the system. Reasonably, this profile is named "default" by default. But what about that seemingly random prefix?

Let's see if we can glean anything from trying to create a profile in Firefox. First, let's open up the Profile Manager by typing "firefox.exe -p" in the 'Run...' dialog (Windows key + R).

We can confirm that there is only one profile and it is named "default."

When we try to create a profile, we see the following window:

Great. We can actually see where it is going to be saved. And no matter what we enter in the text input field, the prefix stays the same. This tells us that the folder for this new profile isn't generated based on the profile name we enter. But there are other possibilities, such as the folder name being based on the current time.

There are many other tests we could run, but we actually don't need to -- the source code for Firefox is freely available online. Once we download and extract the source code, we can try to find the function that handles the generation of the profile's folder name.

Uncompressed, the Firefox source code is about 585MB. That's a lot of code to review. A better way to sift through all of this data is to either index it all and search it or to just recursively grep the root folder. I decided on the latter.

To find out where to look first, we can try to find a constant, unique phrase around the text in question. In the above image, the string "Your user settings, preferences and other user-related data will be stored in:" is right before the path name with which we are concerned. So let's grep for that and see if we can find anything.

There are many ways to grep data, but this was a quick and dirty job where I wasn't doing any parsing or modifying, so I used AstroGrep. I went ahead and searched for a watered-down version of the aforementioned unique phrase: "will be stored in." The results showed that the file named CreateProfileWizard.dtd contained this unique phrase (there were many responsive files, but based on the phrase's context and the name of the file in which the phrase was found, we can determine relevancy).

A snippet of the responsive "CreateProfileWizard.dtd"  file containing our unique phrase.

Now, it's just a matter of tracing this phrase back to its source. So we grab another unique phrase that is relevant to our discovery, such as "profileDirectoryExplanation," and see if we can find any more relevant files. Grepping for it comes back with more results -- one of which is createProfileWizard.xul. I didn't see much in this file at first glance (though there is quite a bit of good info, such as "<wizardpage id="createProfile" onpageshow="initSecondWizardPage();">" -- which will be seen soon), so I decided to see what else was in the same directory as this file. There, I found createProfileWizard.js.

A snippet of the "createProfileWizard.js" file showing the functions used to generate the profile directory names.

Skimming the code, we can see that a salt table and Math.random() are used to determine the 8-character prefix. At line 68, we see the concatenation of kSaltString ('0n38y60t' in the animated gif), a period ('.'), and aName (our text input field variable -- in our example, "abcdefg" or "4n6k").
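For illustration, that JavaScript logic roughly translates to the following Python. Note that the salt table here is an assumption on my part (lowercase letters plus digits, matching observed prefixes like "0n38y60t"); check createProfileWizard.js for the exact table Firefox uses:

```python
import random
import string

# Assumed salt alphabet: lowercase letters plus digits.
SALT_TABLE = string.ascii_lowercase + string.digits

def profile_dir_name(profile_name):
    """Mimic the Firefox scheme: an 8-char random salt, a period,
    then the user-supplied profile name."""
    salt = "".join(random.choice(SALT_TABLE) for _ in range(8))
    return "{}.{}".format(salt, profile_name)
```

This also makes the earlier observation obvious: the prefix depends only on the random salt, never on the name you type.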

In the end, the point is that having source code available makes research more streamlined. By looking at the Firefox source, we were able to determine that the profile directory names aren't based on a telling artifact.

Using what is available to you is invaluable. By having a program's source code [and approaching an issue with an analytic mindset], verifying program behavior is made so much easier.

I'll end with this: open source development is GOOD -- especially for forensics. Whether it's a tool meant to output forensic artifacts or just another Firefox, we are able to more confidently prove why we are seeing what we're seeing.

-Dan Pullega (@4n6k)

* Special thanks to Tom Yarrish (@CdtDelta) for spawning this post.

Monday, February 17, 2014

Forensics Quickie: PDF Metadata Forensics (Sunday Funday Answer)

FORENSICS QUICKIES! These posts will consist of small tidbits of useful information that can be explained very succinctly.

David Cowen's Sunday Funday contests are great. In short, a question is posed about a subject, and readers get the opportunity to answer and explain how it is relevant in a forensic investigation. Much can be learned from these contests, as a great answer is almost always delivered and posted publicly. I highly recommend checking it out and submitting answers (it's a weekly thing, so the subjects vary widely).

This week's (2/16/14) subject was PDF Metadata. I took a stab at answering and wound up winning, so I figured it wouldn't hurt to post my answer here. The original answer, along with a preface by David Cowen, can be found here.

The Questions
The subject's prompt was as follows:

1. What metadata can be present within a PDF document?
2. What affects the types of metadata that will be present within a PDF document?

The Answer
The following is my answer to the prompt. Disclaimer: this was written in the wee-hours of one night, so if there are any typos or errors, please feel free to contact me.

Much can be gleaned from PDF metadata. Of course, there are the standard fields that will provide you with relatively common metadata...but depending on the program you use to create the PDF, there could be much, much more.

Let's start with the most common metadata. Most PDFs will have embedded metadata showing the PDF version, creation date, creation program, document ID, and last modified date. There are definitely more, but I have left them out as they wouldn't be very useful in an investigation (e.g. page count). The use for the aforementioned metadata is fairly obvious, but I will explain nonetheless.

The more obvious PDF metadata entries are:
  • Creation program: program used to create the PDF (was it through desktop software, was it scanned, etc.)
  • Creator/Author/Producer: Username or full name of the PDF's author OR further details on the program used to create the PDF (is it a previous employer?)
  • Title: the title of the PDF that usually provides an outdated name for the document; good for identifying previous employer documents or documents that have been converted from one format into a PDF (e.g. SecretBluePrint.eps or oldCompanyFinances.doc shows up in the 'title' metadata entry)
Those are the easy ones. But what about the more overlooked metadata? As I mentioned before, the program used to create or modify the PDF may have a huge impact on what information you are given. With that, let's look into it.

First, timestamps. We know that file metadata could potentially serve as a better indicator of when a given document was created. If the PDF has been transferred across various volumes and systems -- and we would like to find the origin of the document -- the creation date in the file metadata is going to be more reliable than the file system creation date (as the filesystem date/time will have been updated with the copies/moves).
A More Reliable Creation Date

The metadata 'creation' date will [usually] preserve the REAL date of the file's creation. That is, if the PDF has been transferred across various volumes and systems, the 'creation' date in the file's metadata is going to give us a better idea of when the document was initially created.

The 'modified' date can be used in a similar way. We might even be able to tell how many programs through which the PDF was modified/saved. Say we have a PDF created using Adobe InDesign. If we were to open this PDF, modify it, and then save it as a new file using 'Save As...' in a program like Adobe Acrobat, we would see that the 'creation' date is still unchanged, but the 'modified' date had been updated (file system creation dates will tell us differently). Pretty standard stuff. Even if the PDF is saved using 'Save As...' (essentially creating a new file altogether with an updated file system creation time) AND it is moved from one system to the next, we will still have a genuine 'creation' date. Not only that, but we will have a metadata 'modified' date AND a new file system creation time to work with. Correlation among file metadata and file system timestamps is beyond the scope of this answer, but you get the point; 'creation' and 'modified' metadata dates are powerful and can be used creatively.
Also, with many PDF timestamps, we will be able to see a timezone offset. For example, a creation timestamp could be 2013:02:22 11:21:34-06:00. We now have a potential indication that the program that produced this PDF was set to UTC-6 (U.S. Central time, for a February timestamp).
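If you're scripting this, exiftool-style timestamps keep their offset when parsed with the %z directive (Python 3.7+ accepts the colon-separated offset). A small sketch using the creation date above:

```python
from datetime import datetime, timedelta

def parse_exiftool_timestamp(ts):
    """Parse an exiftool-style PDF timestamp, preserving the UTC offset."""
    return datetime.strptime(ts, "%Y:%m:%d %H:%M:%S%z")

created = parse_exiftool_timestamp("2013:02:22 11:21:34-06:00")
```

From here, created.utcoffset() gives you the -6 hour offset directly, which is handy when normalizing a pile of PDFs to UTC.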

I mentioned that we might be able to determine if a PDF was created and modified through more than one program. As a quick side note, and if we really wanted to dive into the PDF analysis, we could take a look at some of the other telling metadata. The example above suggested creating a PDF in InDesign, opening it up in Acrobat, modifying it, and then saving it as a new file. When this happens, some of the metadata in the new file (like the 'modified' time) is updated while all of the InDesign metadata stays intact. However, there is a significant difference this time around: the 'XMP Toolkit' metadata value is different. Adobe implements their XMP Toolkit in all of their applications and plugins. They even open sourced it, so other programs can use it (and many do). The point is, "the XMPCore component of this toolkit is what allows for the creation and modification of the metadata that follows the XMP Data Model" (more here and here). So we have two PDFs, but the metadata for each was manipulated by two different versions of Adobe's XMP Core.

InDesign used:
"Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27" and...

Acrobat used:
"Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03"

But why is this important? Well, we can now more accurately pinpoint the program used to create the PDF. Sure, we will likely already have a metadata entry that tells us the 'Creation Program,' but consider the above example; that tool (InDesign) may have been used to initially create the PDF, but it was NOT used to open, modify, and save a new version of it (Acrobat did that). Let's keep this in mind as we explain some other interesting metadata...

Remember: the amount of metadata that a program uses when creating files is limitless. XMP is built on XML, so any metadata tags can be defined. Let's take a real-world example of how powerful PDF metadata can be when created from certain programs. Download Trustwave's Global Security Report PDF from 2013. Run it in exiftool. What do you see? That's right, the "History" metadata fields will show you not only that the document was saved 497 times, but also the exact times that it was saved, the program used to save it each time, and the Document Instance ID for each save (less exciting).

While you have that open, take a look at the creation date (2013:02:22 11:21:34-06:00) and modify date (2013:05:09 10:47:39-07:00). The modify date is much later, but the last "History" save on the file was 2013:02:22 11:18:06-06:00. What's up with that? This is because the PDF was modified in a different program; one newer than InDesign CS5.5. How do I know this? Well, look at the XMP Core version. The XMP Core version used for InDesign CS5.5 is "Adobe XMP Core 5.4-c005 78.147326, 2012/08/23-13:03:03." I just so happen to have a PDF created with InDesign CS6 and that PDF uses "Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27." How can it be that CS5.5 is using a later XMP Core version than CS6?! Because another program was used to modify the CS5.5 PDF after the last save. On 2013:05:09 10:47:39-07:00 (the modify date), some program (let's just say it's Acrobat to satisfy my example from before) modified the PDF. The XMP Core version shown in the metadata is NOT from CS5.5.

Also from the 'History' metadata, we can tell that the creation date is actually "2012:12:29 11:20:49-06:00." and NOT "2013:02:22 11:21:34-06:00." My guess is that InDesign was keeping track of the saves, but when it came down to exporting the PDF, it tacked on the export date as the "Create Date" (as the last 'History' save of the file is 3 minutes before the alleged "Create Date").

If we really wanted to, we could use another metadata field (the PDF version) to further pinpoint the program used. If the PDF version is 1.7, we could look for programs on a suspect computer that save PDFs to version 1.7 by default. Believe it or not, many programs still save PDFs as version 1.4, 1.5, and 1.6.
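
The baseline PDF version is right in the file header, so a quick triage check doesn't even need a parser. A sketch (note the caveat that the `/Version` key in the document catalog can override the header version; this quick check ignores that):

```python
# Sketch: read the baseline PDF version from the "%PDF-1.x" header marker.
import re

def pdf_header_version(data: bytes):
    """Return the version string from the %PDF header, or None."""
    m = re.match(rb"%PDF-(\d\.\d)", data)
    return m.group(1).decode() if m else None

# first bytes of a typical PDF (binary comment line truncated)
sample = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"
print(pdf_header_version(sample))   # 1.7
```

Sweep a folder of known-good PDFs produced by each candidate program and you have a quick map of which tools default to which version.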

After all of this, I think it's safe to say that PDF metadata can be pretty valuable. You just need to know what's available to you and how to interpret it.

(Thanks to David Cowen for posing the question and offering up the G-C Partners Multi-Boot USB 3.0 flash drive prize!)

-Dan Pullega (@4n6k)

Thursday, February 6, 2014

Forensics Quickie: Pinpointing Recent File Activity - RecentDocs

FORENSICS QUICKIES! These posts will consist of small tidbits of useful information that can be explained very succinctly.

[UPDATE #01 05/01/2015]: A Python script has been created to automate the process detailed in this post. You can download Eric Opdyke's script here.

Let's revisit the basics...

A suspicious employee left your company on January 28, 2014. You'd like to know which files were most recently used (opened, saved) on the employee's system right before he/she left.

The Solution
Pull the user's NTUSER.DAT. Run RegRipper to easily output the values within the Software\Microsoft\Windows\CurrentVersion\Explorer\RecentDocs subkey.

To give some background, the RecentDocs subkey will contain a few things:
  • Subkeys named after every file extension used on the system (e.g. a subkey for .zip, .doc, .mp4, etc.). 
    • Each of these subkeys will contain its own MRUListEx value that keeps track of the order in which files were opened. (Sound familiar? We also saw use of the MRUListEx value upon analyzing shellbags artifacts). 
    • Each subkey will store up to 10 numbered values; each numbered value represents a recently opened file with the extension found in the subkey's name.

    The .gif subkey showing the 10 most recently opened .gif files for a specific user. The order is stored in MRUListEx.

  • A subkey for folders most recently opened.
    • Can store up to 30 numbered values (i.e. the 30 most recently opened folders).
  • A root MRUListEx value
    • Stores the 149 most recently opened documents and folders across all file extensions.
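
If you're parsing these values yourself rather than through RegRipper, the raw MRUListEx data is simple: a run of 32-bit little-endian integers, terminated by 0xFFFFFFFF, where each integer names a numbered value in the same subkey, most recent first. A minimal sketch:

```python
# Sketch: decode a raw MRUListEx registry value into its MRU order.
# Format: little-endian DWORDs, most recent first, 0xFFFFFFFF terminator.
import struct

def parse_mrulistex(raw: bytes):
    order = []
    for (n,) in struct.iter_unpack("<I", raw):
        if n == 0xFFFFFFFF:        # end-of-list marker
            break
        order.append(n)
    return order

# e.g. an MRUListEx of 4,2,3,1 as it would appear on disk
raw = struct.pack("<5I", 4, 2, 3, 1, 0xFFFFFFFF)
print(parse_mrulistex(raw))        # [4, 2, 3, 1]
```

The same decoding applies to the MRUListEx in every extension subkey, the Folder subkey, and the root of RecentDocs.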

With that, take note that MRU times are powerful; here's why:

Since every extension has its own subkey, we will have quite a few registry key LastWrite times with which to work. But so what? When dealing with RecentDocs, a subkey's LastWrite time only tells us when the most recently opened file within that subkey was opened. While this is certainly the case, we mustn't forget that we also have an all-encompassing MRUListEx in the root of the RecentDocs subkey.

So what does this mean? By using the LastWrite times of each subkey and the root MRUListEx list, we can more accurately determine when certain files were last opened. While we won't be able to get an EXACT time for non-MRU files (i.e. files that are not the first entry in an MRUListEx value), we may be able to determine a time range in which these files were opened. Consider the following example:

Our root MRUListEx shows:

**All values printed in MRUList\MRUListEx order.
LastWrite Time Thu Feb  6 14:08:49 2014 (UTC)

  26 = smiley.gif
  99 = strange_bird.jpg
  42 = books_to_read.csv
  130 = secret_clients.csv
  55 = zip_passwords.rtf
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt


At this point, if we look at the first few entries of the root MRUListEx, we can really only tell that smiley.gif was last opened on Feb 6, 2014. So let's mark it:

  26 = smiley.gif ............................Feb  6 14:08:49 2014
  99 = strange_bird.jpg
  42 = books_to_read.csv
  130 = secret_clients.csv
  55 = zip_passwords.rtf
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt

That secret_clients.csv looks pretty interesting. Let's look at the .csv subkey to see if we can find out when it was opened.

The .csv subkey's MRUListEx shows:

LastWrite Time Sat Jan 18 19:19:22 2014 (UTC)
MRUListEx = 6,2,1,7,3,5,4

  6 = books_to_read.csv  
  2 = secret_clients.csv
  1 =
  7 = contacts.csv
  3 = phillip.csv

Dang. Looks like the books_to_read.csv file is the most recently opened .csv, which means that the Jan 18 19:19:22 2014 LastWrite time of the .csv subkey is going to represent the last time that file was opened -- not the last time secret_clients.csv was opened. I guess we can give up on this one...

...or...we could look at the surrounding files to glean some more information. First, let's mark the known LastWrite time in our root MRUListEx list.

  26 = smiley.gif ............................Feb  6 14:08:49 2014
  99 = strange_bird.jpg
  42 = books_to_read.csv...............Jan 18 19:19:22 2014
  130 = secret_clients.csv
  55 = zip_passwords.rtf
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt

Looking at our list, we can see that there are two open times that we know for sure. We can also see that there is another file that was opened between these times. It looks like strange_bird.jpg was opened after smiley.gif, but before books_to_read.csv. Since this list is an all-encompassing list, it would be safe to say that between January 18 and Feb 6, strange_bird.jpg was opened. But let's go ahead and confirm that.

The .jpg subkey's MRUListEx shows:

LastWrite Time Sat Jan 18 21:54:57 2014 (UTC)
MRUListEx = 4,2,3,1

  4 = strange_bird.jpg
  2 = stone_tanooki.jpg
  3 = camel.jpg
  1 = fighting_frogs.jpg

It looks like strange_bird.jpg is the most recently opened .jpg on the system. That means that the Jan 18 21:54:57 2014 LastWrite time of the .jpg subkey represents the time at which strange_bird.jpg was opened. Again, let's mark it in our root MRUListEx list:

  26 = smiley.gif ............................Feb  6 14:08:49 2014
  99 = strange_bird.jpg...................Jan 18 21:54:57 2014
  42 = books_to_read.csv...............Jan 18 19:19:22 2014
  130 = secret_clients.csv
  55 = zip_passwords.rtf
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt

At this point, we still don't know when secret_clients.csv was last opened. Upon finding its LNK file, we note that it had been first opened in late 2013 (LNK creation time). If this machine was running XP, we would be able to grab the access time of this LNK file to see when it *might* have been last opened. Alas, we are using Windows 7 (disclaimer: there are tons of other things you should be looking at anyway; this is merely one example of why we would want to look at RecentDocs). Let's keep moving.

Taking a second look at our root MRUListEx list, we notice that the next item after secret_clients.csv is zip_passwords.rtf. That also looks interesting, so let's see when that was opened.

The .rtf subkey's MRUListEx shows:

LastWrite Time Sat Jan 17 14:15:31 2014 (UTC)
MRUListEx = 2,1

  2 = zip_passwords.rtf 
  1 = fiction_novel.rtf

It looks to be the first one, so we can use the LastWrite time to mark it in our list.

  26 = smiley.gif ............................Feb  6 14:08:49 2014
  99 = strange_bird.jpg...................Jan 18 21:54:57 2014
  42 = books_to_read.csv...............Jan 18 19:19:22 2014
  130 = secret_clients.csv
  55 = zip_passwords.rtf.................Jan 17 14:15:31 2014
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt

With that, we have found a pretty tight time frame as to when secret_clients.csv was last opened. We don't have the exact date and time, but we can now say that it was last opened somewhere between Jan 17 and Jan 18.
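
The bracketing technique we just walked through generalizes nicely: given the root MRUListEx order and the open times recovered from per-extension subkey LastWrite times, every unknown entry is bounded by the nearest known times above and below it. A sketch, using the (hypothetical) filenames from this example:

```python
# Sketch of the bracketing technique: bound each unknown entry's open time
# by the nearest known times around it in the root MRUListEx order.
def bracket_times(order, known):
    """order: filenames, most recent first; known: {filename: time_str}.
    Returns {filename: (no_earlier_than, no_later_than)}."""
    ranges = {}
    for i, name in enumerate(order):
        if name in known:
            ranges[name] = (known[name], known[name])
            continue
        # nearest known entry below (opened earlier) and above (opened later)
        earlier = next((known[n] for n in order[i + 1:] if n in known), None)
        later = next((known[n] for n in reversed(order[:i]) if n in known), None)
        ranges[name] = (earlier, later)
    return ranges

order = ["smiley.gif", "strange_bird.jpg", "books_to_read.csv",
         "secret_clients.csv", "zip_passwords.rtf"]
known = {"smiley.gif": "Feb  6 14:08:49 2014",
         "strange_bird.jpg": "Jan 18 21:54:57 2014",
         "books_to_read.csv": "Jan 18 19:19:22 2014",
         "zip_passwords.rtf": "Jan 17 14:15:31 2014"}
print(bracket_times(order, known)["secret_clients.csv"])
# ('Jan 17 14:15:31 2014', 'Jan 18 19:19:22 2014')
```

Which is exactly the Jan 17 to Jan 18 window we derived by hand.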

It looks as though there are some more .csv files that were recently opened, and by the looks of it, that .txt file might be able to help us home in on a time frame.

The .txt subkey's MRUListEx shows:

LastWrite Time Sat Jan 10 18:23:08 2014 (UTC)
MRUListEx = 4,5,2,1,3

  4 = notes.txt 
  5 = groceries.txt
  2 = how_to_make_toast.txt
  1 = my_novel.txt
  3 = address.txt

Just as before, let's mark the LastWrite time in our root MRUListEx list.

  26 = smiley.gif ............................Feb  6 14:08:49 2014
  99 = strange_bird.jpg...................Jan 18 21:54:57 2014
  42 = books_to_read.csv...............Jan 18 19:19:22 2014
  130 = secret_clients.csv
  55 = zip_passwords.rtf.................Jan 17 14:15:31 2014
  87 = my_diet.csv
  3 = contacts.csv
  74 = phillip.csv
  72 = notes.txt.........................Jan 10 18:23:08 2014

Again, we don't have an exact date and time as to when the my_diet.csv, contacts.csv, and phillip.csv files were last opened. But, based on the LastWrite time of the .txt subkey, we do know the time frame in which these files were opened.

Now imagine that a USB device was also plugged in at some point within this time frame. It is possible that the suspicious employee may have done a "Save As..." on contacts.csv and saved it to the external device. It is also possible that the employee opened the file and copy/pasted the information into an email. Of course, the possibilities are endless, but you get the point.

And it goes without saying that other artifacts, such as LNK files, jump lists, ComDlg32, OfficeDocs, and more should also be used to complement RecentDocs analysis. In particular, ComDlg32 would be of interest in this scenario, as it will show you files recently opened/saved. It offers more information (e.g. the executable used to open the file, the file's path, etc.) and stores saves/opens by extension, as well. As such, we can use this same technique to pinpoint file open/save times (though, it is a bit limited, as the * extension subkey will only show up to 20 entries).

This post quickly became longer than expected, but it's still a quickie. I'll end with this:

This is nothing new. All of this data is available to you in the Windows registry. This is a fairly basic technique, but it is also one that shouldn't be overlooked. And as always, use more than one artifact to make your case.

-Dan Pullega (@4n6k)