E-discovery is all about the collection of electronically stored information (ESI) as part of the discovery process. The volume of ESI that exists in today’s world is enormous. Because there is so much ESI, and because it is so easily modified, there must be a way to copy ESI such that we can prove the copy is identical to the original. This is where forensic imaging becomes critical. A qualified person can create an image of the ESI, that is, a forensically sound copy that is often self-authenticating.
One of the most common ways to certify that a copy of ESI matches the original is to use a hash algorithm. A hash algorithm is a mathematical process that takes a file as input and produces a hash value, a fixed-length string that represents the file’s contents.
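As a minimal sketch of what a hash value looks like, the following Python snippet hashes two short byte strings that differ by a single character. The sample text and variable names are illustrative assumptions, and SHA-256 is chosen here only as an example algorithm.

```python
import hashlib

# Two inputs that differ by a single character (illustrative sample data).
original = b"Meeting moved to 3 PM on Friday."
altered = b"Meeting moved to 4 PM on Friday."

# hexdigest() returns the hash value as a hexadecimal string.
original_digest = hashlib.sha256(original).hexdigest()
altered_digest = hashlib.sha256(altered).hexdigest()

print(original_digest)
print(altered_digest)
print(original_digest == altered_digest)  # False: the inputs differ
```

Each SHA-256 value is 64 hexadecimal characters long regardless of the input size, and even a one-character change produces a different value.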
Federal Rules of Evidence, Article IX, Rule 902 (FRE 902), Committee Notes on Rules – 2017 Amendment states:
This amendment allows self-authentication by a certification of a qualified person that she checked the hash value of the proffered item and that it was identical to the original. The rule is flexible enough to allow certifications through processes other than comparison of hash value, including by other reliable means of identification provided by future technology.
Based on that note, the process of authenticating ESI should be straightforward. You hash the two files and compare the values. If the hash values are the same, the files are the same, and if you are qualified to hash a file and compare hash values, then the ESI is self-authenticating. Not quite.
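The basic comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a forensic procedure: the function names are hypothetical, and SHA-256 is an assumed choice because the text does not name an algorithm at this point.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original_path, copy_path):
    """Return True when both files produce the same SHA-256 value."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)
```

A call such as copies_match("original.img", "copy.img") would return True only when the two files hash to the same value; the file names are placeholders.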
Not all hash algorithms are equally resistant to collisions, that is, two different inputs that produce the same hash value. According to the National Institute of Standards and Technology (NIST), the hash algorithms MD5 and SHA-1 were broken in 2004 and 2005, respectively. While research performed by Xiaoyun Wang in 2005 showed it was theoretically possible to create two different files with the same SHA-1 hash, it was not until 2017 that researchers actually produced two different files with the same SHA-1 hash value (see: http://shattered.io/static/shattered.pdf).
While digital forensics experts should know that MD5 and SHA-1 should not be used to generate hash values, FRE 902 does not identify which hash algorithms should be used during e-discovery or digital forensics.
The NIST Policy on Hash Functions identifies the hash functions federal agencies may use (visit: https://csrc.nist.gov/Projects/Hash-Functions/NIST-Policy-on-Hash-Functions).
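One way to make an algorithm policy hard to bypass is to route every hash request through an allowlist. The sketch below assumes an allowlist of SHA-2 and SHA-3 family members available in Python's hashlib; the set contents and the function name are assumptions for illustration, and the NIST policy page remains the authoritative list.

```python
import hashlib

# Assumption: an allowlist drawn from the SHA-2 and SHA-3 families.
# Confirm against the current NIST policy before relying on it.
APPROVED_ALGORITHMS = {
    "sha224", "sha256", "sha384", "sha512",
    "sha3_224", "sha3_256", "sha3_384", "sha3_512",
}


def new_approved_hash(name):
    """Return a hashlib object for an allowlisted algorithm, else raise."""
    key = name.lower()
    if key not in APPROVED_ALGORITHMS:
        # MD5 and SHA-1 land here because they are not on the allowlist.
        raise ValueError(f"{name!r} is not on the approved list")
    return hashlib.new(key)
```

With this guard, new_approved_hash("sha256") returns a usable hash object, while new_approved_hash("md5") raises ValueError instead of silently producing a weak hash value.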
This is one reason why it is important that the ESI be authenticated by a qualified person who knows more than just the basics of e-discovery, and not by someone who only knows how to use the *nix command md5sum.
As cloud computing becomes more available, it is becoming easier and less expensive to increase the processing power of systems. According to the researchers behind the 2017 report, “The first collision for full SHA-1,” producing the SHA-1 collision “took the equivalent processing power as 6,500 years of single-CPU computations.” What history shows us is that advances in technology eventually make practical what once seemed infeasible, such as modifying a file without changing its hash value.
It is important to know which hash algorithms are broken and to cross-validate your forensic tools. Many people who work in information technology-related fields know how to generate a hash value, but that doesn’t make them digital forensics experts. It might not even be enough to make them a “qualified person.”
In the case of a forensic acquisition of a mobile device, you will want to work with a digital forensics expert. Obtaining a forensic image of a phone often involves a process known as rooting. We won’t go into the details of rooting a phone in this article, but if you need to forensically image deleted information on a smartphone, you will probably need to root the smartphone and will without any doubt want to work with a digital forensics expert.
Someone who can certify that the hash values are the same may not be good enough. In all cases, documentation and a properly maintained chain of custody are imperative.
Michael Zinn