I mentioned in Gathering Hard Drive Serial Number and Information that I wanted a method to automatically document a hard drive’s information in Linux. I did come across a solution called Automated Image and Restore (AIR) that provides a GUI front-end to DC3DD and DD. I haven’t tested this as I really would prefer a command line utility that provides this functionality (let us know your comments on your experiences with AIR). Also, I thought that I preferred DCFLDD. I actually did until I started developing this script and I discovered some “unwanted functionality” which I will cover at the end of this post.
With all of this in mind, the Forensic Acquisition Information and Drive Data Script (faidds) was born. FAIDDS (available at Google Code) leverages the Python subprocess methods to run parted, hdparm, and sdparm to gather information about a specified hard drive (Python subprocess runs other executables, so MIND your file integrity). Once this information is gathered the hard drive is acquired using either DC3DD (default) or DCFLDD. The start and end times are recorded. All of the acquisition information is written to a log file. The script also generates a hash file associated with the acquired image leveraging the acquisition tool’s hashing functionality.
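For readers who have not used Python's subprocess module this way, here is a minimal sketch of the information-gathering step. It is not the actual FAIDDS code. The function names `info_commands` and `gather_drive_info` are hypothetical, and the tool paths are simply the ones that appear in the debug output later in this post.

```python
import subprocess

# Tool paths as shown in the debug output of this post; adjust for your system.
INFO_TOOLS = [
    ["/sbin/parted", "{dev}", "print"],
    ["/sbin/hdparm", "-I", "{dev}"],
    ["/usr/bin/sdparm", "--inquiry", "{dev}"],
]


def info_commands(device):
    """Return the argument lists to run against the given device file."""
    return [[part.format(dev=device) for part in cmd] for cmd in INFO_TOOLS]


def gather_drive_info(device):
    """Run each tool and return a list of (command string, output) pairs."""
    results = []
    for cmd in info_commands(device):
        try:
            proc = subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                check=False,
            )
            # Drives can return bytes that are not valid text, so decode leniently.
            output = proc.stdout.decode("utf-8", errors="replace")
        except OSError as exc:
            # Covers a missing or non-executable tool.
            output = "could not run: {}".format(exc)
        results.append((" ".join(cmd), output))
    return results
```

Passing an argument list (rather than a shell string) keeps the device name from being interpreted by a shell, and recording the exact command next to its output gives you something to paste into your acquisition notes.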
faidds.py -h
faidds.py [-h] [-D] [-dcfldd] [-m hash0,hash1] [-c size in G (1024*1024*1024)] [-s "serial number"] -d "drive location"
-h: help
-D: debug information
-s: user specified serial number. Default is to find serial number in drive info.
-d: device file to acquire
-c: size to split file in G (1024*1024*1024)
-dcfldd: use dcfldd (default: dc3dd)
-m: list of hash algorithms to use. Comma separated with no spaces. (default: md5)
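The option list above could be expressed with Python's argparse module roughly as follows. This is a sketch only. I am assuming argparse here for illustration, the script itself may parse its arguments differently, and the `dest` names (`debug`, `use_dcfldd`, `hashes`, `split_gb`, `serial`, `device`) are hypothetical.

```python
import argparse


def hash_list(text):
    """Split a comma separated value such as 'md5,sha1' into a list."""
    return [name for name in text.split(",") if name]


def build_parser():
    parser = argparse.ArgumentParser(prog="faidds.py")
    parser.add_argument("-D", dest="debug", action="store_true",
                        help="debug information")
    parser.add_argument("-dcfldd", dest="use_dcfldd", action="store_true",
                        help="use dcfldd (default: dc3dd)")
    parser.add_argument("-m", dest="hashes", type=hash_list, default=["md5"],
                        help="comma separated hash algorithms (default: md5)")
    parser.add_argument("-c", dest="split_gb", type=int, default=None,
                        help="size to split file in G (1024*1024*1024)")
    parser.add_argument("-s", dest="serial", default=None,
                        help="user specified serial number")
    parser.add_argument("-d", dest="device", required=True,
                        help="device file to acquire")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Note that `-d` is the only required option, which matches the usage line: everything else has a default.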
Here is how I use the script.
First I locate the hard drive I want to acquire. For this example I will attach a USB thumb drive to my system simulating that I am attaching a hard drive to the system using a write-blocker connected with a USB cable. To detect the newly attached drive I run the “dmesg | tail” command:
cutaway> dmesg | tail
[526855.404692] scsi 21:0:0:0: Direct-Access LEXAR JUMPDRIVE SECURE 2000 PQ: 0 ANSI: 0 CCS
[526855.406731] sd 21:0:0:0: Attached scsi generic sg6 type 0
[526855.409391] sd 21:0:0:0: [sdf] 502880 512-byte logical blocks: (257 MB/245 MiB)
[526855.415142] sd 21:0:0:0: [sdf] Write Protect is off
[526855.415152] sd 21:0:0:0: [sdf] Mode Sense: 43 00 00 00
[526855.415159] sd 21:0:0:0: [sdf] Assuming drive cache: write through
[526855.418722] sd 21:0:0:0: [sdf] Assuming drive cache: write through
[526855.418737] sdf: sdf1
[526855.471513] sd 21:0:0:0: [sdf] Assuming drive cache: write through
[526855.471520] sd 21:0:0:0: [sdf] Attached SCSI disk
cutaway>
As you can see, I attached a 256 MB LEXAR USB thumb drive and it was assigned the device file “sdf”. Next, I move into the directory where I want to acquire the drive image. This could be the local hard drive but more than likely it will be some other media such as an external USB hard drive or a shared drive. To acquire this drive I decided to have it output the MD5 and SHA1 hashes of the drive I am acquiring.
cutaway> sudo python ../faidds/trunk/faidds.py -D -m md5,sha1 -d /dev/sdf
[sudo] password for cutaway:
Enter YES to acquire/dev/sdf: YES
/sbin/parted /dev/sdf print
/sbin/hdparm -I /dev/sdf
/usr/bin/sdparm --inquiry /dev/sdf
_ �Nk
System Time Zone Is: CDT
Start Time: March 18 2012 17:49:36 UTC
Acquisition command: /usr/bin/dc3dd log=./_ �Nk_20120318174936_hash.txt rec=on if=/dev/sdf hash=md5 hash=sha1 of=./_ �Nk_20120318174936.dd
[!!] opening log `./_\001 \205N\024\027k_20120318174936_hash.txt': Invalid or incomplete multibyte or wide character
Try `/usr/bin/dc3dd --help' for more information.
dc3dd aborted at 2012-03-18 12:49:36 -0500
Stop Time: March 18 2012 17:49:36 UTC
Traceback (most recent call last):
  File "../faidds/trunk/faidds.py", line 192, in <module>
    ONF = open(onf,'w')
IOError: [Errno 84] Invalid or incomplete multibyte or wide character: 'drive_data__dev_sdf__\x01 \x85N\x14\x17k_20120318174936.txt'
cutaway>
Lucky for me I selected this thumb drive for testing. Apparently, LEXAR did not set up the serial number properly (easily readable). In this case it appears that the device returns serial number information that is “multibyte or wide character.” This situation is not unique for this USB thumb drive and could occur when acquiring older USB and Flash drives. Thus the “user specified serial number” option was born.
cutaway> sudo python ../faidds/trunk/faidds.py -D -m md5,sha1 -d /dev/sdf -s testing
Enter YES to acquire/dev/sdf: YES
/sbin/parted /dev/sdf print
/sbin/hdparm -I /dev/sdf
/usr/bin/sdparm --inquiry /dev/sdf
System Time Zone Is: CDT
Start Time: March 18 2012 18:03:07 UTC
Acquisition command: /usr/bin/dc3dd log=./testing_20120318180307_hash.txt rec=on if=/dev/sdf hash=md5 hash=sha1 of=./testing_20120318180307.dd
dc3dd 7.0.0 started at 2012-03-18 13:03:07 -0500
compiled options:
command line: /usr/bin/dc3dd log=./testing_20120318180307_hash.txt rec=on if=/dev/sdf hash=md5 hash=sha1 of=./testing_20120318180307.dd
device size: 502880 sectors (probed)
sector size: 512 bytes (probed)
257474560 bytes (246 M) copied (100%), 30.2548 s, 8.1 M/s
input results for device `/dev/sdf':
502880 sectors in
0 bad sectors replaced by zeros
eaa52cfb7b1b37d2b94b8d371e0e47a8 (md5)
602c713c00c9a05c8f0ec76f9c3f2f7581da7edd (sha1)
output results for file `./testing_20120318180307.dd':
502880 sectors out
dc3dd completed at 2012-03-18 13:03:37 -0500
Stop Time: March 18 2012 18:03:37 UTC
cutaway> ls -al
total 251452
drwx------ 1 cutaway cutaway 328 2012-03-18 13:03 .
drwx------ 1 cutaway cutaway 4096 2012-03-18 12:18 ..
-rwx------ 1 cutaway cutaway 2529 2012-03-18 13:03 drive_data__dev_sdf_testing_20120318180337.txt
-rwx------ 1 cutaway cutaway 257474560 2012-03-18 13:03 testing_20120318180307.dd
-rwx------ 1 cutaway cutaway 643 2012-03-18 13:03 testing_20120318180307_hash.txt
cutaway>
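The serial number fallback and the time-stamped file names seen in the run above can be sketched like this. Treat it as an illustration under assumptions, not as the FAIDDS source: the helper names `usable_serial`, `output_names`, and `dc3dd_command` are hypothetical, and the rule for what counts as a "safe" serial number is my own guess at a reasonable check.

```python
import string
from datetime import datetime, timezone

# Characters assumed safe to embed in a file name (an assumption, not a standard).
SAFE_CHARS = set(string.ascii_letters + string.digits + "-_.")


def usable_serial(detected, user_serial=None):
    """Prefer the user supplied serial; otherwise accept only a clean detected one."""
    if user_serial:
        return user_serial
    cleaned = detected.strip()
    if cleaned and all(ch in SAFE_CHARS for ch in cleaned):
        return cleaned
    raise ValueError("serial number is not usable in a file name; pass one with -s")


def output_names(serial, start):
    """Return (image name, hash log name) stamped with the start time."""
    base = "./{}_{}".format(serial, start.strftime("%Y%m%d%H%M%S"))
    return base + ".dd", base + "_hash.txt"


def dc3dd_command(device, image, log, hashes):
    """Build the dc3dd argument list in the order shown in the output above."""
    cmd = ["/usr/bin/dc3dd", "log=" + log, "rec=on", "if=" + device]
    cmd += ["hash=" + name for name in hashes]
    cmd.append("of=" + image)
    return cmd


if __name__ == "__main__":
    start = datetime(2012, 3, 18, 18, 3, 7, tzinfo=timezone.utc)
    image, log = output_names(usable_serial("\x01 \x85N\x14\x17k", "testing"), start)
    print(" ".join(dc3dd_command("/dev/sdf", image, log, ["md5", "sha1"])))
```

Failing loudly on an unusable serial number, instead of writing a file with a mangled name, is the behavior I would want from an acquisition tool.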
DC3DD provides us with an excellent record of our acquisition method and the hash values for the acquisition.
cutaway> cat testing_20120318180307_hash.txt
dc3dd 7.0.0 started at 2012-03-18 13:03:07 -0500
compiled options:
command line: /usr/bin/dc3dd log=./testing_20120318180307_hash.txt rec=on if=/dev/sdf hash=md5 hash=sha1 of=./testing_20120318180307.dd
device size: 502880 sectors (probed)
sector size: 512 bytes (probed)
257474560 bytes (246 M) copied (100%), 30.2548 s, 8.1 M/s
input results for device `/dev/sdf':
502880 sectors in
0 bad sectors replaced by zeros
eaa52cfb7b1b37d2b94b8d371e0e47a8 (md5)
602c713c00c9a05c8f0ec76f9c3f2f7581da7edd (sha1)
output results for file `./testing_20120318180307.dd':
502880 sectors out
dc3dd completed at 2012-03-18 13:03:37 -0500
cutaway>
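Since the log records the hashes in a predictable "digest (algorithm)" form, it is easy to re-check an image against it later. The following is a minimal sketch that assumes the log layout shown above. The function names (`logged_hashes`, `file_digest`, `verify_image`) are hypothetical and are not part of FAIDDS or dc3dd.

```python
import hashlib
import re

# Matches lines such as "eaa52cfb7b1b37d2b94b8d371e0e47a8 (md5)".
HASH_LINE = re.compile(r"^\s*([0-9a-f]{32,128}) \((\w+)\)", re.MULTILINE)


def logged_hashes(log_text):
    """Return {algorithm: hex digest} for every hash line found in the log text."""
    return {algo: digest for digest, algo in HASH_LINE.findall(log_text)}


def file_digest(path, algo, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images do not need to fit in memory."""
    hasher = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_image(image_path, log_text):
    """Return {algorithm: True/False} comparing the image against the log."""
    return {
        algo: file_digest(image_path, algo) == digest
        for algo, digest in logged_hashes(log_text).items()
    }
```

Usage would be along the lines of `verify_image("testing_20120318180307.dd", open("testing_20120318180307_hash.txt").read())`, which returns one True or False per algorithm found in the log.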
Of course, if you prefer (as I thought I did) you could use DCFLDD. The command line options are the same with the exception that the user needs to include the “-dcfldd” option.
cutaway> sudo python ../faidds/trunk/faidds.py -D -m md5,sha1 -d /dev/sdf -s testing -dcfldd
Enter YES to acquire/dev/sdf: YES
/sbin/parted /dev/sdf print
/sbin/hdparm -I /dev/sdf
/usr/bin/sdparm --inquiry /dev/sdf
System Time Zone Is: CDT
Start Time: March 18 2012 18:07:00 UTC
Acquisition command: /usr/bin/dcfldd hashlog=./testing_20120318180700_hash.txt conv=noerror,sync if=/dev/sdf of=./testing_20120318180700.dd
7680 blocks (240Mb) written.
7857+1 records in
7858+0 records out
Stop Time: March 18 2012 18:07:31 UTC
cutaway>
You can run this command in the same directory as the other acquisition because the output files are all time stamped. The hashlog file for DCFLDD is not as thorough as the log DC3DD produces, but don’t worry, I capture similar information in the drive acquisition data file.
cutaway> ls -al
total 502913
drwx------ 1 cutaway cutaway 328 2012-03-18 13:07 .
drwx------ 1 cutaway cutaway 4096 2012-03-18 12:18 ..
-rwx------ 1 cutaway cutaway 2529 2012-03-18 13:03 drive_data__dev_sdf_testing_20120318180337.txt
-rwx------ 1 cutaway cutaway 2519 2012-03-18 13:07 drive_data__dev_sdf_testing_20120318180731.txt
-rwx------ 1 cutaway cutaway 257474560 2012-03-18 13:03 testing_20120318180307.dd
-rwx------ 1 cutaway cutaway 643 2012-03-18 13:03 testing_20120318180307_hash.txt
-rwx------ 1 cutaway cutaway 257490944 2012-03-18 13:07 testing_20120318180700.dd
-rwx------ 1 cutaway cutaway 46 2012-03-18 13:07 testing_20120318180700_hash.txt
cutaway> cat testing_20120318180700_hash.txt
Total (md5): eaa52cfb7b1b37d2b94b8d371e0e47a8
cutaway> tail drive_data__dev_sdf_testing_20120318180731.txt
Integrity word not set (found 0x7c3c, expected 0x91a5)
/usr/bin/sdparm --inquiry /dev/sdf
/dev/sdf: LEXAR JUMPDRIVE SECURE 2000
System Time Zone Is:CDT
Start Time: March 18 2012 18:07:00UTC
Acquisition command: /usr/bin/dcfldd hashlog=./testing_20120318180700_hash.txt conv=noerror,sync if=/dev/sdf of=./testing_20120318180700.dd
Stop Time: March 18 2012 18:07:31UTC
cutaway>
Okay, who sees what I see? Notice anything different? What about the MD5 sums recorded for the captured data (the files with the “.dd” extension)? They are the same: eaa52cfb7b1b37d2b94b8d371e0e47a8. Hmm. Okay, what about the file sizes? The image created by DC3DD is 257474560 and the image created by DCFLDD is 257490944. Interesting. Let’s run the MD5SUM command against these files.
cutaway> md5sum testing_20120318180307.dd testing_20120318180700.dd
eaa52cfb7b1b37d2b94b8d371e0e47a8 testing_20120318180307.dd
a3aa87600be5559117c7450b5475cd0c testing_20120318180700.dd
cutaway>
Wait a minute!!!! That is not right. And it is not the same hash value recorded by DCFLDD during acquisition. What could be the issue here? Could the second image be corrupted? Let’s check their similarity using SSDEEP (space added in the following output to prevent the stupid emoticon).
cutaway> ssdeep testing_20120318180307.dd testing_20120318180700.dd
ssdeep,1.1--blocksize:hash:hash,filename
6291456:Io/uDcy8TiPJY+Mj7C6llllllllllllaG3KfJh8rliaS6H: DOcrTiPJY+Mj7CGKfJolNH,"/media/InG-Storage/Research/Dev/data_acquisition_test/testing_20120318180307.dd"
6291456:Io/uDcy8TiPJY+Mj7C6llllllllllllaG3KfJh8rliaS6: DOcrTiPJY+Mj7CGKfJolN,"/media/InG-Storage/Research/Dev/data_acquisition_test/testing_20120318180700.dd"
cutaway>
These files appear to be exactly the same except for something at the end of the file. What could that be?

Ahh, it does appear to be exactly the same except that there are 16,384 null bytes attached to the end of the file created by DCFLDD. Well, we could remove those if we REALLY needed to. I am guessing that this was caused by the fact that I used the “conv=noerror,sync” option to handle bad blocks during acquisition. To test I simply ran the command by itself without this option.
cutaway> sudo /usr/bin/dcfldd hashlog=./testing_20120318180700_hash_noerror.txt if=/dev/sdf of=./testing_20120318180700_noerror.dd
7680 blocks (240Mb) written.
7857+1 records in
7857+1 records out
cutaway> ls -al
total 754653
drwx------ 1 cutaway cutaway 4096 2012-03-18 13:40 .
drwx------ 1 cutaway cutaway 4096 2012-03-18 12:18 ..
-rwx------ 1 cutaway cutaway 2529 2012-03-18 13:03 drive_data__dev_sdf_testing_20120318180337.txt
-rwx------ 1 cutaway cutaway 2519 2012-03-18 13:07 drive_data__dev_sdf_testing_20120318180731.txt
-rwx------ 1 cutaway cutaway 300656 2012-03-18 13:27 oketa_image_diff.png
-rwx------ 1 cutaway cutaway 257474560 2012-03-18 13:03 testing_20120318180307.dd
-rwx------ 1 cutaway cutaway 643 2012-03-18 13:03 testing_20120318180307_hash.txt
-rwx------ 1 cutaway cutaway 257490944 2012-03-18 13:07 testing_20120318180700.dd
-rwx------ 1 cutaway cutaway 46 2012-03-18 13:40 testing_20120318180700_hash_noerror.txt
-rwx------ 1 cutaway cutaway 46 2012-03-18 13:07 testing_20120318180700_hash.txt
-rwx------ 1 cutaway cutaway 257474560 2012-03-18 13:40 testing_20120318180700_noerror.dd
cutaway> md5sum testing_20120318180307.dd testing_20120318180700.dd testing_20120318180700_noerror.dd
eaa52cfb7b1b37d2b94b8d371e0e47a8 testing_20120318180307.dd
a3aa87600be5559117c7450b5475cd0c testing_20120318180700.dd
eaa52cfb7b1b37d2b94b8d371e0e47a8 testing_20120318180700_noerror.dd
cutaway>
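Yes, that is it. With “conv=noerror,sync”, DCFLDD pads the final, partial input block with null bytes out to the full block size, which is why the record counts changed from “7857+1 records out” to “7858+0 records out”. Note that the hash DCFLDD logged matches the data read from the device, not the padded file it wrote to disk. The option that tells DC3DD to manage bad blocks is “rec=on”. I have this selected for the DC3DD command, and it does not appear to add these additional null bytes to the end of the file.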
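The arithmetic behind those 16,384 bytes is worth spelling out. The sketch below assumes a 32,768 byte block size, which I am inferring from the record counts (7858 records times 32,768 bytes is 257,490,944 bytes). The function names `padded_size` and `trailing_nulls` are hypothetical helpers for illustration.

```python
import os


def padded_size(source_bytes, block_size):
    """Output size when a short final block is padded out to block_size."""
    full_blocks, remainder = divmod(source_bytes, block_size)
    return (full_blocks + (1 if remainder else 0)) * block_size


def trailing_nulls(path, chunk_size=65536):
    """Count the null bytes at the very end of a file, reading backwards."""
    count = 0
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        while position > 0:
            step = min(chunk_size, position)
            position -= step
            handle.seek(position)
            chunk = handle.read(step)
            stripped = chunk.rstrip(b"\x00")
            count += len(chunk) - len(stripped)
            if stripped:
                # Found real data, so the run of nulls ends here.
                break
    return count


if __name__ == "__main__":
    device_bytes = 502880 * 512  # sectors times sector size from the dc3dd log
    print(padded_size(device_bytes, 32768) - device_bytes)
```

Keep in mind that `trailing_nulls` only gives an upper bound on the padding, because a device can legitimately end in null bytes of its own. Compare the file size against the device size from the log before trimming anything.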
In other words, “Know Your Tools.”
Go forth and do good things,
Don C. Weber










