EC2 DFIR Workshop
Module Overview: File System Forensics - Part 1
- Mounting Additional Volumes
- Creating a Hash Database
- Using Hashes to Identify Changes
- Using “sorter”
- Recover the Unallocated Space
Mounting Additional Volumes
In addition to the EVIDENCE Volume that was attached to the SIFT Workstation, we will attach two more volumes:
- BASELINE – This is made from a snapshot of an EC2 Instance that has just booted (but before anyone has logged into it).
  - It needs to be made from the same AMI as the EVIDENCE Volume.
  - It is mounted Read-Only.
- DATA – This is a storage volume that will contain the evidence collected during the analysis.
  - It is mounted Read/Write.
Understanding Hash Databases
Brian Carrier created The Sleuth Kit, a suite of tools for file system forensics.
One of its tools, hfind, builds an index for a hash database and looks up hashes in that database.
An MD5 hash list of every file on the BASELINE volume will be created and added to the known_files hash database.
Next, a hash list will be created for all of the files on the EVIDENCE Volume, and these hashes will be compared with the hashes in the known_files hash database made from the BASELINE volume.
Any files that are not found in the known_files hash database are either new or changed.
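The comparison described above can be sketched in Python. This is only an illustration of the idea, not how hfind works internally: the function names are made up for this example, and a plain in-memory set stands in for the indexed known_files database.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each regular file under root (by relative path) to its MD5."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.isfile(full) and not os.path.islink(full):
                hashes[os.path.relpath(full, root)] = md5_of_file(full)
    return hashes


def new_or_changed(baseline_root, evidence_root):
    """List evidence files whose MD5 is absent from the baseline hash set."""
    known = set(hash_tree(baseline_root).values())
    evidence = hash_tree(evidence_root)
    return sorted(path for path, md5 in evidence.items() if md5 not in known)
```

For example, `new_or_changed("/mnt/baseline", "/mnt/evidence")` would return the relative paths worth examining first. Both mount points are hypothetical; substitute wherever the two volumes are actually mounted.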
The Sleuth Kit “sorter” Command
sorter is a Perl script that drives other tools in The Sleuth Kit. It starts by reading the configuration files from the installation directory. There is a general configuration file and a specific one for each operating system.
Each configuration file contains rules for processing the output of the ‘file’ command.
- One type of rule uses regular expressions to assign a given ‘file’ output (e.g. ‘image data’) to a category (e.g. ‘images’).
- Another type of rule lists the file extensions (e.g. .txt) that are valid for a given ‘file’ output (e.g. ASCII(.*?)text).
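The two kinds of rule can be modelled roughly as follows. This sketch does not reproduce sorter's real configuration syntax: the `CATEGORY_RULES` and `EXTENSION_RULES` tables, the ‘text’ category name, and both helper functions are hypothetical, with the patterns borrowed from the examples above.

```python
import re

# Hypothetical rule tables; sorter's configuration files use their own syntax.
# Category rules: (category name, regex matched against the `file` output).
CATEGORY_RULES = [
    ("images", re.compile(r"image data")),
    ("text", re.compile(r"ASCII(.*?)text")),
]

# Extension rules: (regex on the `file` output, extensions considered valid).
EXTENSION_RULES = [
    (re.compile(r"ASCII(.*?)text"), {"txt"}),
]


def categorize(file_output):
    """Return the first category whose pattern matches, or None."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(file_output):
            return category
    return None


def valid_extensions(file_output):
    """Collect every extension allowed for this `file` output."""
    allowed = set()
    for pattern, extensions in EXTENSION_RULES:
        if pattern.search(file_output):
            allowed |= extensions
    return allowed
```

With these tables, `categorize("PNG image data, 640 x 480")` returns `"images"` and `valid_extensions("ASCII text")` returns `{"txt"}`.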
The program then runs the ‘fls’ tool in The Sleuth Kit to identify the files in the file system image. Each identified file is viewed using the ‘icat’ tool. If a hash database is given, the hash of the file is calculated and looked up. If it is found in an ‘alert’ database, then it is added to a special ‘alert.txt’ file.
If it is found in the NSRL or ‘exclude’ database, then it is ignored as a known good file. Excluded files are recorded in an ‘exclude’ file for future reference, but they are not saved in the category files.
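The hash lookup step amounts to a small decision, sketched below. The function name and return labels are invented for illustration, and checking the ‘alert’ set before the ‘exclude’ set is an assumption; the sketch says nothing about what sorter does with a file after it has been listed in ‘alert.txt’.

```python
def triage_by_hash(file_hash, alert_hashes, exclude_hashes):
    """Decide which list a hash lands in: 'alert', 'exclude', or 'categorize'.

    alert_hashes and exclude_hashes are plain sets of hex digests standing in
    for the hash databases (the exclude set would include NSRL entries).
    """
    if file_hash in alert_hashes:
        return "alert"       # would be listed in alert.txt
    if file_hash in exclude_hashes:
        return "exclude"     # known good: logged, but left out of category files
    return "categorize"      # not in any database: goes on to type identification
```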
The Sleuth Kit “sorter” Command (2)
The ‘file’ command is then run to identify the file type (based on header information). The configuration file rules are used to identify which category it belongs to. An entry is added to the corresponding category file (in the ‘-d dir’ directory).
If the ‘-s’ flag is given, then a copy of the file is saved in a subdirectory with the same name as the category. If the HTML format is used, then hyperlinks make it easy to view the saved files and to see what is in each category.
Files that do not match any category are recorded in either the ‘data’ category or the ‘unknown’ category. ‘data’ is for files with a structure that ‘file’ does not know, and ‘unknown’ is for files with a structure that ‘file’ knows about but that no rule assigns to a category.
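The fallback between ‘data’ and ‘unknown’ can be illustrated as below. The single-entry rule table and the `category_for` helper are hypothetical. The sketch relies on the fact that the ‘file’ command prints just `data` when it cannot recognise a file's structure.

```python
import re

# Hypothetical rule table with one entry; real rule sets are much larger.
CATEGORY_RULES = [("images", re.compile(r"image data"))]


def category_for(file_output):
    """Pick a category for one line of `file` output, with fallbacks."""
    for category, pattern in CATEGORY_RULES:
        if pattern.search(file_output):
            return category
    # `file` reports plain "data" when the structure is not recognised.
    if file_output.strip() == "data":
        return "data"
    # `file` recognised the type, but no rule maps it to a category.
    return "unknown"
```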
The Sleuth Kit “sorter” Command (3)
A copy of the files can be saved by using the ‘-s’ flag. If so, the files are saved in a subdirectory named after the category. Each file is named using the file system image name followed by the metadata address and the original file extension. The category index file can be used to translate the actual name to the saved name. The HTML format makes viewing easier, as the category index file links to each file.
The program will also consult the rules about the file extension. If the file name has an extension (anything after a ‘.’), it will be compared to the rules. If the extension is not listed in the rules as a valid extension for the file type, the file will be added to the ‘mismatch’ file.
If the file does not have an extension, it will not be added to the ‘mismatch’ file, even if the file type has valid extensions. This check is done even if the file is found in one of the known good hash databases. If it is found in one of those, it will be added to a special file.
Files of type ‘data’ have no extension checks done by default (as they have an unknown structure).
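The extension check can be summarised in a few lines. This is a sketch with a hypothetical `extension_mismatch` helper, not sorter's code. Comparing extensions case-insensitively and skipping the check when a file type has no extension rule are both assumptions made for the example.

```python
import os


def extension_mismatch(filename, file_type, valid_extensions):
    """Return True if the file would be logged to the 'mismatch' file.

    valid_extensions is the set of extensions (without the dot) that the
    rules allow for this file type.
    """
    if file_type == "data":
        return False  # 'data' files get no extension checks by default
    if not valid_extensions:
        return False  # assumption: no rule for this type means no check
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if not extension:
        return False  # files without an extension are never recorded
    return extension not in valid_extensions
```

For instance, `extension_mismatch("photo.jpg", "ASCII text", {"txt"})` is `True`, while `extension_mismatch("README", "ASCII text", {"txt"})` is `False` because there is no extension to compare.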
File Carving
Foremost is another tool for recovering files from unallocated space. While sorter uses the output of the fls command, Foremost uses signatures of files that consist of distinct byte patterns within the header and footer of each file type it can extract.
This technique is called data carving.
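A toy version of signature-based carving is shown below. It is not Foremost's algorithm and does not use the foremost.conf format. It scans a byte string for the well-known JPEG start marker (FF D8 FF) and end marker (FF D9). The `carve` function and its `max_size` limit are assumptions made for the example.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker byte
JPEG_FOOTER = b"\xff\xd9"      # JPEG end-of-image marker


def carve(data, header=JPEG_HEADER, footer=JPEG_FOOTER, max_size=10 * 1024 * 1024):
    """Return byte strings that run from a header to the next footer.

    max_size caps how far past a header the footer search may look.
    """
    carved = []
    start = data.find(header)
    while start != -1:
        end = data.find(footer, start + len(header), start + max_size)
        if end == -1:
            # No footer in range: drop this header and try the next one.
            start = data.find(header, start + 1)
            continue
        stop = end + len(footer)
        carved.append(data[start:stop])
        start = data.find(header, stop)
    return carved
```

Stopping at the first footer is a simplification: a real JPEG can contain an embedded thumbnail with its own end marker, so a naive carver may truncate the file.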
The results depend on having accurate signatures for the file types to be carved. The current foremost.conf file that is distributed with the SIFT Workstation is focused on media files and Microsoft Windows