Pointers to Linux Tools for Rescuing Data from Defective Hard Drives
Tilo Sloboda, Aug 2005
After the unfortunate loss of two hard drives in my RAID5 array, and a friend of mine losing his external drive,
I needed to do some research on tools for rescuing data from defective hard drives and damaged filesystems.
I was using a Linux software RAID5 at the time, and while I was trying to repair a one-disk failure, the
system would suddenly hang completely, caused by uncorrectable DMA errors on one of the other
drives.
The best advice in this case is:
- Don't write to the drive! Mount it read-only!
- Think hard before running fsck on a filesystem that lives on a defective drive.
- Rather, rescue the data from the drive by creating a disk image of it, and then run fsck on the image!
To create a disk image, plain dd is a poor choice, because by default it aborts on the first read error (with conv=noerror,sync it carries on, but it neither retries nor keeps track of the bad areas). That's why you need tools
like dd_rescue / dd_rhelp, or ddrescue (GNU), which don't abort on errors -- these tools try to create
a disk image of the defective drive, salvaging as many blocks as possible from the bad drive.
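As a rough illustration of the imaging step, here is a minimal sketch using GNU ddrescue in two passes. The function name rescue_plan, the device /dev/hdX1 and the paths under /mnt/rescue are placeholders I made up for the example. The function only prints the commands, so you can review them before pointing anything at a failing drive.

```shell
#!/bin/sh
# Sketch only: prints a two-pass GNU ddrescue plan instead of running it.
# rescue_plan is a hypothetical helper; device and paths are placeholders.
rescue_plan() {
    src=$1   # defective drive or partition
    img=$2   # image file on the temporary disk
    map=$3   # ddrescue mapfile; lets an interrupted run resume where it stopped
    # Pass 1: grab everything that reads cleanly, skip scraping the bad areas.
    echo "ddrescue -n $src $img $map"
    # Pass 2: return to the bad areas with direct disc access, up to 3 retry passes.
    echo "ddrescue -d -r3 $src $img $map"
}

rescue_plan /dev/hdX1 /mnt/rescue/hdX1.img /mnt/rescue/hdX1.map
```

Because both passes share the same mapfile, the second pass only touches the areas the first pass could not read, and a reboot in between does not throw away the work already done.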
Once you have created the disk image, make a copy of it! Creating the image takes a lot of time, and you don't
want to run fsck on the original image itself, because you may want to try several different tools to salvage your data. Always work on the copy.
The resulting disk-image file can be mounted via "mount -o loop" and then analyzed.
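The copy-then-check workflow could look roughly like the sketch below. It assumes the image holds a single ext2/ext3 partition (hence e2fsck; use the checker that matches your filesystem) and that the mount point already exists. The helper name work_on_copy_plan and all paths are made up for illustration, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: prints the steps for working on a copy of the rescued image.
# work_on_copy_plan is a hypothetical helper; paths are placeholders.
work_on_copy_plan() {
    img=$1             # master image, never modified
    mnt=$2             # existing, empty directory used as mount point
    copy="$img.work"   # working copy that the repair tools may change
    echo "cp $img $copy"
    # Assumes ext2/ext3; -f forces a check even if the filesystem looks clean.
    echo "e2fsck -f $copy"
    # Loop-mount the repaired copy read-only for inspection.
    echo "mount -o loop,ro $copy $mnt"
}

work_on_copy_plan /mnt/rescue/hdX1.img /mnt/inspect
```

If a repair attempt makes things worse, delete the working copy and start again from the untouched master image.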
I also want to mention one trick I learned a long time ago: you can try to swap the controller board of a defective hard drive with one from an identical, working drive -- this may help you if the controller board went bad, but not if the disk itself has bad blocks.
Temporary Data:
This is the time to purchase a large hard drive for storing the temporary data! You'll probably need at least 2 to 3 times as much temporary space as the size of your defective hard drive partition(s); the more the better.
It might be a good idea to put the temporary disk space on a different server and cross-mount it -- this way
you avoid doing a fsck on your (huge) temp partition every time the system doing the rescue hangs and needs to be rebooted... ;-)
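To make the space rule of thumb concrete, here is a small sketch that checks whether the free space is at least a given multiple of the partition size. The function enough_space and the default factor of 3 are my own assumptions for the example, not part of any tool. Both sizes are passed in bytes.

```shell
#!/bin/sh
# Sketch only: enough_space is a hypothetical helper.
# Usage: enough_space PARTITION_BYTES FREE_BYTES [FACTOR]
# Succeeds if FREE_BYTES >= PARTITION_BYTES * FACTOR (FACTOR defaults to 3).
enough_space() {
    size=$1
    free=$2
    factor=${3:-3}
    [ "$free" -ge $((size * factor)) ]
}

# Example numbers: an 80 GB partition and 250 GB of free temporary space.
if enough_space 80000000000 250000000000; then
    echo "enough room"
else
    echo "buy a bigger disk"
fi
```

On a real system the two numbers could come from something like 'blockdev --getsize64 /dev/hdX1' for the partition and 'df -B1 --output=avail /mnt/rescue | tail -n 1 | tr -d " "' for the free space (GNU df assumed).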
DMA Errors:
First step, if you see DMA errors that hang your system: run 'hdparm -d0 -r1 /dev/hdX'
on the raw device of your defective drive before using any of the tools below. This disables DMA for that drive and sets the drive to read-only.
Other Errors:
Once dd_rhelp has narrowed down the area where the error on the disk is, it will most likely hang your system every time it tries to access the location of the error. To reduce the pain of having to reboot, I ran 'hdparm -d0 -r1 -m0 -P0 -A0 -a0 /dev/hdg' to switch off all kinds of read-ahead on the defective disk (read the man page of hdparm!).
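For reference, the sketch below rebuilds that hdparm call one flag at a time, with a short note on each flag as I understand it from the hdparm man page. The helper hdparm_flags is made up for this example and only prints the command. Check the man page of your hdparm version before running it.

```shell
#!/bin/sh
# Sketch only: hdparm_flags is a hypothetical helper that prints the command.
hdparm_flags() {
    flags=""
    flags="$flags -d0"   # using_dma flag off (no DMA for this drive)
    flags="$flags -r1"   # set the read-only flag for the device
    flags="$flags -m0"   # multiple-sector (block mode) I/O off
    flags="$flags -P0"   # drive's internal prefetch sector count set to 0
    flags="$flags -A0"   # drive's own read-lookahead feature off
    flags="$flags -a0"   # filesystem (software) read-ahead set to 0 sectors
    echo "hdparm$flags $1"
}

hdparm_flags /dev/hdX
```

Calling hdparm with a flag but no value (for example 'hdparm -d -r /dev/hdX') reports the current setting, which is a handy way to confirm that the changes took effect.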
PREVENTION:
Hard disks die! They always(!) do, sooner or later... sooner if they are not properly cooled(!)... and they
tend to die all at the same time if they were purchased around the same time.
To avoid being surprised by the death of a hard drive, it's highly advisable
to monitor the S.M.A.R.T. status of your disks and to run tests on them on a regular basis (e.g. run smartd). This way you
can see problems before they become fatal.
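A basic manual check with smartctl might look like the following sketch. The helper smart_check_plan and the device name are placeholders of mine, and the commands are printed for review rather than executed. The smartd.conf line in the comment follows the scheduling example from the smartd.conf man page; treat the mail address and times as assumptions to adapt.

```shell
#!/bin/sh
# Sketch only: smart_check_plan is a hypothetical helper that prints commands.
smart_check_plan() {
    dev=$1
    echo "smartctl -H $dev"            # overall health self-assessment
    echo "smartctl -A $dev"            # vendor attributes (reallocated sectors, temperature, ...)
    echo "smartctl -t short $dev"      # start a short self-test in the background
    echo "smartctl -l selftest $dev"   # read the self-test log afterwards
}

smart_check_plan /dev/hdX

# For continuous monitoring, a smartd.conf entry along these lines (assumed
# values) watches all attributes, mails root, and schedules a short test
# daily at 02:00 and a long test on Saturdays at 03:00:
#   /dev/hdX -a -m root -s (S/../.././02|L/../../6/03)
```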
MOST IMPORTANT: Make backups of your valuable data! Don't trust just one hard drive!
Rescue Tools:
Here's a list of the tools I found, in no particular order(!). I hope it will help somebody out there.
- SmartMonTools : The smartmontools package contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology System (SMART) built into most modern ATA and SCSI hard disks. In many cases, these utilities will provide advance warning of disk degradation and failure.
- SMART LINUX : a bootable Linux image which contains forensic tools; not to be confused with S.M.A.R.T. or SmartMonTools.
- dd_rescue and dd_rhelp : dd_rhelp is a shell wrapper around dd_rescue that automates the process of creating a disk image of your defective drive.
- ddrescue : a GNU tool that serves the same purpose as dd_rescue; not to be confused with it.
- The Sleuth Kit and The Autopsy Browser : a collection of tools for data analysis and disaster recovery including the GUI Autopsy Browser
- The Coroner's Toolkit (TCT) : contains a whole lot of useful tools, such as unrm, grave-robber, lazarus -- google "Coroner's Toolkit" for articles!
- Foremost : A data recovery tool based on data carving, a process in which the disk (or image) is searched for the beginning and end of specific file formats (instead of analyzing the structure or metadata of the filesystem). More info in this Sys Admin Magazine article.
- e2salvage : a utility which tries to do in-place data recovery from damaged ext2 filesystems. Unlike e2fsck, it does not look for the data at particular places and it doesn't tend to believe the data it finds; thus it can handle much more heavily damaged filesystems.
- e2retrieve : a data recovery tool for ext2 filesystems. This means that e2retrieve will not try to repair the filesystem but will extract data to "copy" it to another place (another disk, NFS, Samba, ...).
- e2extract : a collection of scripts that function similarly to e2retrieve.
- Linux Disk Editor : lde is a disk editor for Linux, originally written to help recover deleted files. It has a simple ncurses interface that resembles an old version of Norton Disk Edit for DOS.
- testdisk : Tool to check and undelete partitions
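To give an idea of how a few of the tools above are pointed at a rescued image rather than at the drive, here is a minimal sketch. The helper image_tools_plan, the image path and the output directory are placeholders I invented, the file types passed to foremost are just an example, and the image is assumed to hold a single partition. The commands are printed, not run.

```shell
#!/bin/sh
# Sketch only: image_tools_plan is a hypothetical helper that prints commands.
image_tools_plan() {
    img=$1   # working copy of the rescued image
    out=$2   # output directory for carved files (foremost wants it empty or absent)
    echo "testdisk $img"                          # interactive partition check on the image
    echo "fls -r $img"                            # Sleuth Kit: list files recursively, deleted ones included
    echo "foremost -t jpg,pdf -i $img -o $out"    # carve JPEG and PDF files out of the image
}

image_tools_plan /mnt/rescue/hdX1.img.work /mnt/rescue/carved
```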
A big thank you to Kurt Garloff and Antonio Diaz Diaz!