Hashing is the process of computing a fixed-length string (called a "message digest") from a data stream usually for the purpose of validating, authenticating, or digitally signing that stream. The stream could be a disk file, an email message, or packets of data in network transport. Hashing is not encryption because the message digest cannot readily be transformed back into the original data from which it was computed. Instead, hashing is a mechanism for representing a block of data in a predictable way by the use of a standard, public algorithm.
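The fixed-length property is easy to see in practice. The following is a minimal sketch using Python's standard hashlib module; the helper name digest_hex is just an illustration and does not come from any of the programs discussed here.

```python
import hashlib


def digest_hex(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hexadecimal message digest of data for the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


small = digest_hex(b"abc")
large = digest_hex(b"abc" * 100_000)

# Both digests are 64 hex characters (256 bits), regardless of input size.
print(small)
print(len(small), len(large))
```

Whatever the size of the input, the SHA-256 digest is always 256 bits long, and nothing in the digest allows the input to be reconstructed from it.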
The usefulness of hashing arises partly from the ease with which message digests can be computed and partly from the fact that no two data streams should ever produce the same message digest. These characteristics suggest some important uses to which hashing algorithms can be put. Digitally signing an email message, for example, involves computing a cryptographic hash of the body of the message and all attachments that is then encrypted using the sender's private key. (Note: It is the hash that is encrypted and not the message itself when the message is only digitally signed and not encrypted. When the entire message is encrypted, the recipient's public key is used to do so.) The encrypted hash is then attached to the message for transport. On the receiving end, the digital signature accompanying the message is decrypted using the sender's public key, which could have accompanied the message or could have been drawn from a key escrow, and the resulting data, which was the original hash of the message body with attachments, is compared against a newly computed hash of the received message data. If the original hash and the newly computed one are identical, then there is a high degree of probability that the message was not altered in transit and that it did come from the person whose digital signature accompanied the message.
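The sign-then-verify flow described above can be sketched with a deliberately tiny, textbook RSA key. Everything here is an assumption made for illustration: the key (p=61, q=53) is far too small to be secure, the digest is reduced modulo the key size so that it fits, and real mail clients use padded signatures with full-size keys from a proper cryptographic library.

```python
import hashlib

# Toy textbook RSA key (p=61, q=53). Illustration only; never use in practice.
N = 3233   # public modulus
E = 17     # public exponent (the sender's public key)
D = 2753   # private exponent (the sender's private key)


def _hash_as_int(message: bytes) -> int:
    # Reduce the SHA-256 digest modulo N so that it fits the toy key.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: hash the message, then transform the hash with the private key."""
    return pow(_hash_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: recover the hash with the public key and compare it
    against a freshly computed hash of the received message."""
    return pow(signature, E, N) == _hash_as_int(message)


signature = sign(b"Meet me at noon.")
print(verify(b"Meet me at noon.", signature))
```

Note that only the hash passes through the private key; the message itself travels in the clear, exactly as in the signed-but-not-encrypted case described above.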
Another important use for cryptographic hashes is in the verification of an acquired data stream against a published hash for that stream. For example, individuals, companies, and organizations often provide file download services on web sites and in online databases. In addition to offering content in the form of downloadable files, hashes of those files are often published that the consumer can validate the downloaded content against. If the consumer's own computation of the hash of the content using his or her own tool, which can be different from the tool used by the content provider, is identical to the published hash, then there is a high degree of probability that the downloaded content is identical to the published data. This is useful for ensuring that the received file is what was published both from the standpoint of malicious alteration and from the standpoint of accidental alteration or truncation in transit, which is much more likely.
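As a rough sketch of what a validation tool does, the following reads a file in chunks, computes its digest, and compares the result to a published value. The function names and the 1 MB chunk size are illustrative choices, not taken from any program reviewed here.

```python
import hashlib


def file_digest_hex(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so that large downloads need not fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the computed digest to a published one, ignoring letter case
    and surrounding whitespace picked up when copying from a web page."""
    return file_digest_hex(path, algorithm) == published_hex.strip().lower()
```

Because the algorithm is public and standard, it does not matter that the publisher used a different tool to produce the published value.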
What the hash verification does not do is validate that the acquired data stream is harmless. Because of the ease with which hashes can be computed, malicious web site owners can publish hashes for infected content. Even "dear john" letters, which might be far from harmless to the recipient, can be digitally signed for email transport. The hashing involved in either case says nothing whatsoever about the nature of the content provided. Unsuspecting consumers might infer trust in the content from the existence of the published hash or digital signature when in fact all the hash can do is facilitate validation that the content received matches the content published. Trust in the content itself must be derived from other knowledge that the consumer/recipient possesses about the publisher/sender.
For those who are interested in knowing more about the various hashing algorithms in use, a technical discussion of these algorithms and their possible uses follows.
There are several hashing algorithms in common use having different purposes and varying degrees of reliability for error detection, data validation, and cryptographic security. Common algorithms include Cyclic Redundancy Check (CRC), Message Digest (MD), Secure Hash Algorithm (SHA), RACE Integrity Primitives Evaluation Message Digest (RIPEMD), and Whirlpool. Virtually all algorithms have gone through revisions or replacements to improve their inherent security (SHA-2, for example, being more cryptographically secure than SHA-1, and the final version of Whirlpool more than earlier versions). In addition, some algorithms, such as SHA and RIPEMD, offer variations that reduce the likelihood of accidental collisions (two messages having the same hash). SHA-512, for example, is SHA-2 with a 512 bit (64 byte) message digest size that reduces the likelihood of accidental collisions versus SHA-256, but a larger digest size does not make an otherwise identical hash algorithm more secure. The larger digest sizes satisfy the needs of encryption algorithms that require them. The security of a hashing algorithm, however, is defined by its resistance to certain kinds of attacks such as pre-image attacks and deliberate collision attacks irrespective of its digest size. (Digest size merely refers to the number of bits in the hash produced by the algorithm.)
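The digest sizes mentioned above can be checked directly. This small sketch uses Python's hashlib, which reports the digest size of each algorithm in bytes; the helper name digest_bits is an illustration only.

```python
import hashlib


def digest_bits(name: str) -> int:
    """Return the message digest size of the named algorithm in bits."""
    return hashlib.new(name).digest_size * 8


for name in ("md5", "sha1", "sha256", "sha512"):
    print(f"{name:8} {digest_bits(name):4d} bits")
```

For an n-bit digest, an accidental collision among random inputs is expected only after roughly 2^(n/2) messages, which is why a larger digest lowers the odds of accidental collisions without saying anything about resistance to deliberate attacks.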
CRC is a high-performance algorithm that can be implemented in hardware for the validation of data moving through the electronics of a computer or network device at high speed. Its purpose is to provide maximum performance in the detection of errors in the data stream. CRC is not suitable for cryptographic use because of its low collision resistance, but it does provide basic error checking when performance is paramount.
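A quick sketch of CRC used for error detection, relying on the CRC-32 routine in Python's standard zlib module. The single flipped bit stands in for a transmission error.

```python
import zlib

data = b"123456789"
checksum = zlib.crc32(data)
print(f"{checksum:08x}")  # cbf43926, the standard CRC-32 check value

# Flip one bit to simulate corruption in transit.
corrupted = bytes([data[0] ^ 0x01]) + data[1:]
print(zlib.crc32(corrupted) != checksum)  # a single-bit error is always detected
```

A 32-bit checksum is enough to catch accidental damage, but it offers no protection against someone who deliberately constructs data to match a given checksum.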
The widely-used MD5 algorithm is the latest of a series of algorithms in the same family. It produces a 128 bit message digest. It has been shown, however, that MD5 is not collision resistant. In 2007, two Danish researchers demonstrated that it is possible for two executable programs, one benign and the other not, to share the same MD5 message digest. It would be difficult for malicious coders to exploit the researchers' methodology because it would require the coders to insinuate themselves into the publication of the original program, but it is difficult to be sure that this could not lead to a practical attack vector. The researchers' conclusion was that MD5 ought not be used for code signing and cryptographic purposes.
SHA was created by the National Security Agency (NSA) of the United States Government, and it has been used as a general purpose algorithm for cryptographic applications since the mid-1990s. (Although hashing algorithms are not reversible encryption systems, they are used by such systems for various purposes.) Weaknesses in early versions, SHA-0 and SHA-1, led to the creation of SHA-2. (SHA-256 and SHA-512 are both SHA-2 algorithms with differing message digest sizes.) A public competition for a successor to SHA-2, which will become SHA-3, is currently being conducted by the National Institute of Standards and Technology (NIST). Among the current SHA algorithms, SHA-256 provides a good compromise between performance and security having no known collision vectors. One of the chief criticisms of SHA-1 and 2, however, has been that their development was conducted by a secret governmental agency.
Unlike SHA, RIPEMD was created by an open academic community--the COSIC group of Belgium's Katholieke Universiteit Leuven, which is the same group whose Rijndael encryption algorithm won the competition for the U.S. Government's Advanced Encryption Standard in 2001. RIPEMD comes in two versions, RIPEMD-128 (the faster) and RIPEMD-160 (the more secure) each of which has an extension for a larger hash result size (256 and 320 bits respectively). RIPEMD creators caution that the larger hash result sizes of the extensions should not be regarded as more secure than the base algorithms and are merely provided for applications that require larger message digests.
The Whirlpool hash algorithm was created by one of the co-creators, Vincent Rijmen, of the Rijndael encryption system that became the Advanced Encryption Standard. Whirlpool is actually based on Rijndael with certain key differences that make it a one-way hashing algorithm instead of a reversible encryption system. Whirlpool, which has a fixed message digest size of 512 bits, has been revised twice to deal with weaknesses found in early versions. These versions are referred to as Whirlpool-0, Whirlpool-T, and then just Whirlpool for the final published version. All implementations are expected to use the final version.
Programs that use any of these algorithms for file validation purposes ought to at least compute MD5 and SHA-1 hashes as these are the most widely used by software publishers. If both are used, the effects of their respective weaknesses can be canceled because it is extremely unlikely that a given malicious file could simultaneously exploit the weaknesses of both. For non-cryptographic purposes, this would be sufficient. Good supplemental algorithms to these would be SHA-256 and Whirlpool as these currently have no known weaknesses. The inclusion of other algorithms does not necessarily make a given program better. There are some differences in the algorithms used by the various programs reviewed here.
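Computing several hashes need not mean reading the data several times. The sketch below feeds each chunk of a stream to every selected algorithm in a single pass; the function names and the default pair of MD5 and SHA-1 are illustrative assumptions that follow the recommendation above.

```python
import hashlib
import io


def multi_digest(stream, algorithms=("md5", "sha1"), chunk_size=1 << 16):
    """Read the stream once and return {algorithm: hex digest}."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_all(stream, published):
    """True only if every published digest (algorithm -> hex) matches."""
    computed = multi_digest(stream, tuple(published))
    return all(computed[name] == value.strip().lower()
               for name, value in published.items())


print(multi_digest(io.BytesIO(b"abc")))
```

Requiring every published digest to match is what lets the two algorithms cover for each other's weaknesses.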
The various versions of SHA and RIPEMD and the latest version of Whirlpool are included in the International Organization for Standardization (ISO) standard 10118-3:2004 for dedicated hash functions.
One-Way Accumulators (OWA) offer a decentralized alternative to Digital Signatures. Their advantages include the following:
- Most notably, the main advantage of OWAs over Digital Signatures is that no one need know how to authenticate, sign, or time stamp a message, thereby dispensing with the need for a CA.
- More particularly:
- OWAs allow for a straight-forward and efficient method of producing collective signatures.
- 'Forgery' in the utilization of OWAs is infeasible because the putative forger cannot make a valid time-stamp of a document that was not expected at the time recorded on the stamp. For instance, a student who wishes to plagiarize a paper written on a given date (and so time-stamped) would be unable to change the time-stamp in order to misrepresent their (plagiarized) authorship to pre-date the original's authorship.
- OWAs are no less secure than one-way functions, and indeed many cryptographic protocols are based upon the presupposed 'hardness' of reversing one way functions.
- The relationship between OWAs and One-Way Trapdoor Functions is, at present, unknown.
In conclusion, it would appear that these one-way hash functions offer considerable advantages over traditional methods for authentication, membership testing, and time-stamping.
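The text above does not say how an accumulator is built. One well-known construction, from Benaloh and de Mare's original work on one-way accumulators, uses modular exponentiation, and the sketch below shows the two properties that matter: the accumulated value does not depend on the order in which items are added, and each participant can hold a "witness" that proves membership. The modulus, the seed, and the function names are toy assumptions for illustration; a real accumulator would use a very large modulus whose factors are unknown.

```python
# Toy modulus (61 * 53) and seed. Illustration only; far too small to be secure.
N = 3233
SEED = 2


def accumulate(values, start=SEED, modulus=N):
    """Fold the values into one number: acc = acc ** y mod N for each y."""
    acc = start
    for y in values:
        acc = pow(acc, y, modulus)
    return acc


def witness(values, member, start=SEED, modulus=N):
    """A member's witness is the accumulation of every value except its own."""
    rest = list(values)
    rest.remove(member)
    return accumulate(rest, start, modulus)


def is_member(member, wit, accumulated, modulus=N):
    """Anyone can check membership: adding the member to its witness
    must reproduce the published accumulated value."""
    return pow(wit, member, modulus) == accumulated


items = [5, 7, 11]
total = accumulate(items)
print(total == accumulate(reversed(items)))       # order does not matter
print(is_member(7, witness(items, 7), total))     # 7 can prove it was included
```

No central authority is needed for the check: each participant keeps only its own value and witness, and compares against the single published total.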
Focusing on the use of hashing for the validation of a data stream against published hashes, there are a number of useful programs that provide this functionality. Essentially, these programs 1) must be easy to use, 2) must accurately compute hashes according to published algorithms, and 3) must present the information in a usable form. It is not important whether hashing is the primary purpose of the software or just an incidental feature of a broader application. What is important is that a useful capability is provided attended with as little "noise" (bugs and fluff) as possible.
The programs reviewed here provide three levels of functionality:
- Programs that compute hashes.
- Programs that also provide hash validation.
- Programs that also include a database of hashes for revalidation.
The reviewed applications implement their user interfaces in one of three ways:
- Windows console application (DOS command line).
- Windows Explorer context menu entry.
- Windows Explorer property page tab.
It cannot really be said that any one of these approaches is better than the others because each provides its own capabilities. A console application, for example, allows for scripting and ad-hoc programming that is not possible with graphical applications, but its user interface is somewhat limited. A Windows Explorer context menu entry provides quick access to a full-scale application, but this also switches the user to a new application context. An Explorer property page tab offers a handy and familiar access to program controls without context switching, but the small physical window size places constraints on application features.
HashTab implements its user interface as a Windows Explorer file property page. To compute the hash of a file, you right-click on the file, select Properties, and then click the tab labeled "File Hashes". There are two zones in the tab panel. The top zone shows the hash values of the selected hashes. An Options link is provided to allow the user to change selected hashes, and the program remembers the selections in future sessions. Message digests of the selected file are automatically computed and displayed.
The bottom zone of the tab panel provides the hash comparison feature. A hash can be pasted into the Hash Comparison field, and it will be automatically compared against the selected hashes. If a match is found, a green checkmark is displayed below the field along with the name of the hash that was matched. If no match is found, a red "x" is shown. You should be sure that the desired hash is selected before concluding that there is a mismatch because the program does not report whether it matches an unselected algorithm.
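The comparison behavior described above can be approximated in a few lines. This is only a sketch of the idea, not HashTab's actual code; the function name and the default set of selected algorithms are assumptions.

```python
import hashlib


def identify_matching_algorithm(data, pasted_hash,
                                selected=("md5", "sha1", "sha256")):
    """Return the name of the first selected algorithm whose digest of data
    equals the pasted hash, or None if none of them match."""
    wanted = pasted_hash.strip().lower()
    for name in selected:
        if hashlib.new(name, data).hexdigest() == wanted:
            return name
    return None


sha256_of_abc = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
print(identify_matching_algorithm(b"abc", sha256_of_abc))
print(identify_matching_algorithm(b"abc", sha256_of_abc, selected=("md5", "sha1")))
```

The second call returns None even though the pasted hash is correct, which illustrates why an apparent mismatch should be double-checked against the list of selected algorithms.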
The comparison zone also provides a button that can be used to select a file to compare the current file against. On clicking the button, a dialog is presented that permits the user to browse to the desired file. The program remembers the last location the Open dialog was used to access, and subsequent dialog sessions return to that location, which may have been from a previous program session.
When comparing a hash that is pasted in, the one that matches is the one used, but when comparing another file, the first algorithm that matches in the alphabetically-sorted list is used. If you want to use a specific hash, you have to change the selected hashes in Options by removing all hashes from the list that alphabetically precede the desired algorithm. After so doing, you will have to reselect the file to compare because the Hash Comparison field will be blanked out on returning to the tab panel.
HashMyFiles is a full-scale Windows application that can be launched directly, or from the Explorer context menu when that feature is enabled. HashMyFiles not only computes hashes of files and compares them against each other or against any MD5 or SHA-1 hash that is in the Windows clipboard, but it can also hash all files in a file system identifying hash duplicates in the process. The program can compute hashes for a single file, a group of files, or an entire file system, but it only does so using CRC, MD5, and SHA-1.
The main program window provides a list of the files selected for hashing, and the hashes are computed automatically. If a file in the list matches an MD5 or SHA-1 hash that has been copied to the clipboard, that file is highlighted. If there are multiple hashes for multiple files in the clipboard, all matches are highlighted. In addition, files in the list that are duplicates of each other are similarly labeled and highlighted. The program can hook into the Windows Explorer context menu by enabling an option to do so. (It is disabled by default.) When enabled, right-clicking the selected files or folders and selecting HashMyFiles in the context menu will bring up the program with the file hashes computed and matches highlighted. Selecting a large number of files, or the base folder of a large tree, can result in a lengthy delay while the hashes are calculated.
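Duplicate detection of this kind amounts to grouping files by their hash. The following is a hedged sketch of the general technique rather than a description of how HashMyFiles itself is implemented; the function names and the choice of SHA-1 are assumptions.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of_file(path, chunk_size=1 << 20):
    """Hash one file in chunks and return its SHA-1 hex digest."""
    hasher = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicates(root):
    """Walk the tree under root and return lists of paths whose contents
    hash to the same value (each list has two or more entries)."""
    by_hash = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            by_hash[sha1_of_file(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Every file under the starting folder has to be read in full, which is why hashing the base folder of a large tree can take a long time.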
This program can be configured to operate from the Windows system tray. Closing the program with this feature enabled--it is disabled by default--will allow quick access to the program window for further use. Selecting another file to hash will restore the program window with the newly selected file and computed hashes added to the bottom of the list.
The NirSoft site for HashMyFiles reports support for all Windows versions since and including 2000. The program has a faulty interaction, however, with a security feature of Windows 7 and, because the feature also exists in Windows Vista, presumably with Vista as well. One of the more recent capabilities of Windows is to keep track of the origin of individual files and request approval to open files that came from an untrusted source (e.g., the internet). Sometimes, this causes HashMyFiles to launch multiple windows with the various selected files distributed among them or to open one window with multiple entries of the selected files present and marked as duplicates. Another problem with the program is the low-contrast highlighting used to identify matched entries. On some monitors, the highlighting is difficult to distinguish at some visual angles and virtually disappears at others.
HashCheck Shell Extension is an open source program that employs a Windows property page tab as its user interface. It computes hashes using CRC, MD4, MD5, and SHA-1, and it can work on single files, multiple files, and whole file systems. The program window has a text field for displaying computed hashes and a field for pasting in a hash to match. The program can save the computed hashes to a text file in various encodings that it can later use to re-validate the included files.
HashCheck is a sort of hybrid of HashTab and FCIV. It provides the same basic single-file hash computation and comparison as the other programs. To validate a single file, right-click on the file in Windows Explorer and select Properties. Click the Checksums tab to display the hashes for the selected file. Paste the published hash into the field at the bottom of the tab window, and the program will automatically highlight the matching hash or show a text bubble indicating that the hash was not found in the list.
If multiple files had been selected, hashes for all files will be computed and displayed. If a large number of files were selected, or the base folder of a large file system, it may take a long time to compute all the hashes. A convenient progress bar shows the program's progress in computing the hashes, and unlike progress bars in some programs, including those in software from large companies such as Microsoft and Oracle, this one is accurate, reaching the end only when the hashing is actually complete. There are two progress bars--one for the overall process and one for the individual files. Very large files can take a minute to hash.
The program goes beyond most others reviewed here in that its results can be saved to a text file for later revalidation. To do this, click the Save button at the bottom of the program window. In the Save As dialog that appears, enter a file name (the program computes a default), a file type, and click Save. The dialog opens in the location that contains the selected file(s) and/or folder(s), so you may want to browse to a different location to save the file. The program chooses which hash to save for the listed file(s) in the file based on the file type selected in the Save As dialog. It saves only one hash type in a given file, and the saved file takes an extension that is consistent with its type.
To re-validate the files that were included in the saved hash file, just double-click on the saved hash file to launch a HashCheck window that will proceed to recompute the hashes for the files and indicate the matches, mismatches, and unreadables--the latter category usually indicating a missing file.
HashCheck has many useful features, but it could be improved. Dropping the CRC and MD4 algorithms, which are not really needed for a program like this, and replacing them with SHA-256 and Whirlpool would be a big improvement. A reporting tool would be useful as well. It would be nice if the program could save its data into a single file containing all hashes computed for each file. As it is, a separate file is required to save the hashes for a given algorithm.
Like HashTab and HashCheck, Febooti fileTweak Hash & CRC works as a tab on the file property page. To compute the hash of a file, you right-click the file, select Properties, and then click the tab labeled "Hash / CRC". The upper portion of the tab panel shows the name of the file being hashed, if just one had been selected, or a count of the total number selected. It also shows the file system location of the selected file(s), although a deep location will be truncated.
The middle portion of the panel lists the available hashes which can be easily selected or excluded using a checkbox next to each one. Two of the algorithms, MD and RIPEMD, have drop-downs to the left that allow you to select the version of the algorithm to use in the main list. The algorithms that are selected when the program starts, which are remembered from the last session, are automatically computed for the selected file. If an algorithm is added, you must click the Compute button in the lower section to recompute the hashes to include the newly selected one.
The lower portion of the tab panel provides a mechanism for switching the file to use when multiple files were selected in Windows Explorer. To compute hashes for a different file from a group, just click the View file drop-down, select the desired file from the list, and click Compute.
Unlike the other programs reviewed here, Febooti Hash & CRC does not provide a hash comparison feature. There is no way to directly compare the hash of a file against a published hash or against the hash of another file. To compare the computed hash against a published hash, you must click the Copy button, select which hash (or all) to place on the clipboard, paste the hash(es) into another window such as an empty Notepad document, paste the published hash into the same window, and then visually compare them. If you need to compare two files, you have to go through this exercise for each file before comparing them. The program does, at least, make it easy to get the computed hashes onto the clipboard.
Microsoft File Checksum Integrity Verifier (FCIV) is a console application, which means that it only runs inside a command window. You might wonder why such a program would be considered here, but there is a unique capability provided by this program that is worth a look. For computing the hash of a single file, this would not be the tool to use, but it can be used to create a database of hashes for many files, including recursively through an entire file system, and then later use that database to re-validate those same files.
FCIV only computes hashes using the MD5 and SHA-1 algorithms. The purpose of this program is to provide a method by which large numbers of files can be validated very quickly thus exposing unauthorized modifications. The following command can be used to generate a database of hashes for all ".exe" and ".dll" files below C:\Program Files using both MD5 and SHA-1:
fciv "c:\program files" -xml c:\temp\pf.xml -r -both -type .exe -type .dll
If you leave out the type argument(s), it will compute hashes for all files that it finds. The following command can be used to validate the database against the same files at a later time:
fciv -v -both -xml c:\temp\pf.xml
The program will report any differences that it finds. It does not report the presence of new files, but it does report any files of the original set that are missing. The setup of the program is entirely manual. After extracting it from the download, the program must be copied to a location that is in the command path or its extracted folder must be added to the path. Once this is done, it can be executed from any command window. To get help information about the program, type "fciv -h". The help information includes examples, but there are some differences between the information provided and the way the program actually behaves.
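For readers who want to see the idea behind a hash database, here is a rough Python sketch of the same build-then-revalidate workflow. It is not FCIV and does not read or write FCIV's XML format; the JSON layout, the function names, and the report structure are all assumptions made for illustration.

```python
import hashlib
import json
import os


def hash_file(path, algorithm, chunk_size=1 << 20):
    """Hash one file in chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def build_database(root, db_path, extensions=None, algorithms=("md5", "sha1")):
    """Record digests for every file under root (optionally filtered by
    extension) in a JSON file, and return the recorded entries."""
    entries = {}
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if extensions and not name.lower().endswith(tuple(extensions)):
                continue
            path = os.path.join(folder, name)
            entries[path] = {alg: hash_file(path, alg) for alg in algorithms}
    with open(db_path, "w") as f:
        json.dump(entries, f, indent=2)
    return entries


def verify_database(db_path):
    """Re-hash every recorded file and report modified and missing ones.
    Files added since the database was built are not noticed."""
    with open(db_path) as f:
        entries = json.load(f)
    report = {"modified": [], "missing": []}
    for path, expected in entries.items():
        if not os.path.exists(path):
            report["missing"].append(path)
        elif any(hash_file(path, alg) != digest
                 for alg, digest in expected.items()):
            report["modified"].append(path)
    return report
```

As with the tool described above, this approach flags changed and missing files from the original set but cannot see new files, because only the recorded paths are revisited.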
Here are some more hash programs. I haven't downloaded them yet, but here's the info I got off their websites.
- Hasher is a Windows application that computes MD5, SHA-1/224/256/384/512 hashes of a text string, disk file, or group of files. Hasher can save hash values to disk for future verification. Informative website. VB6 source code available. Visual Basic runtime required.
- HashCalc is a Windows application that computes MD-2/4/5, SHA-1/256/384/512, RIPEMD-160, PANAMA, TIGER, ADLER32, CRC32, and eDonkey/eMule hashes of a text string or disk file. Doesn't look like it supports hash comparison.
- FSUM is a command-line application that computes MD-2/4/5, SHA-1/256/384/512, RIPEMD-160, PANAMA, TIGER, ADLER32, CRC32, and eDonkey/eMule hashes of one or more disk files. It can compare hashes against a list and recurse subdirectories.
- WinHasher is a Windows applet and command-line program that computes MD5, SHA-1/256/384/512, RIPEMD-160, Whirlpool (2003), and Tiger (1995) hashes of a text string, disk file, or group of files. C# source code available. .NET v2 required.
- WinMD5 is a Windows app that computes the MD5 of a disk file. It needs MD5SUM files to automate file verification. .NET required.
- Crypto Hash Calculator is a portable Windows app that computes the MD-2/4/5, SHA, SSL3, MAC and HMAC of a text string or disk file. Doesn't look like it automates hash comparison.
- digestIT 2004 is a Windows Explorer context menu that calculates the MD5 or SHA-1 hash of a file or files. 64-bit version available.
- WinMd5Sum Portable is a portable Windows app that computes the MD5 of a file via drag-and-drop. Looks like you can paste a comparison hash value in the app for an automated verification.
- Hash on click The freeware version of this context menu add-on can calculate the CRC32, MD5 and SHA-1 of a file. Doesn't look like it automates hash comparison.
- MD5Summer is a stand-alone application that computes MD5 and SHA-1 hashes of a disk file or group of files. Can read/write GNU MD5sum files. Source code available. Developer warns that "this [program] may be buggy". Beta software (3/2011).
- Checksum is a context menu add-on that uses the MD5 and SHA1 hash routines. It's a portable app that can create hashes for files, groups of files, and recurse subdirectories. It also supports file masks (*.mp3), custom mask groups (music=*.mp3,*.wav,*.ogg) and ignore lists. Checksum reads and writes .md5 and .sha files to support file verification and can create .m3u and .pls music playlists. Checksum can also be run from the command-line. Logging of the program's actions is supported. Separate "Hashdrop", "Batch Runner" and "Simple Checksum" programs extend Checksum's functionality by adding batch processing and drag-and-drop support.
- Gizmo Hasher (unrelated to this site) is an Explorer context menu entry that computes SHA-1/256/384/512, CRC-16/32, RIPEMD-128/160, MD-2/4/5, HAVAL-5-256, HASH-32-5, GHASH-32-3, GOST, SizeHash-32, FCS-16/32 and Tiger hashes of a disk file, group of files, or directory. Can read/write hash values to disk for future verification.
- Nero MD5-Checksum computes the MD5 hash for a file. There's not much information about this utility on the Nero website.
- RapidCRC computes CRC32 and MD5 hashes. Supports file names with embedded CRC32, e.g "MyFile [45DEF3A0].avi". Source code available via CVS. Program is in Beta (3/11).
- Easy Hash is a portable application and Explorer context menu that computes over 130 different hash functions for file directories, individual files and text strings. Compares two directories to find duplicates. It can save generated hashes to .CSV, .HTML, .SFV, .MD5 and .SHA1 file extensions. Can associate itself to .sfv, .md5 and .sha1 files. Can install itself in Total Commander, Unreal Commander and/or Free Commander. Easy Hash can also "reverse hash" CRC-32 hash values, which can be used to recover passwords. Claims to reset passwords in MaNDOS, RAdmin, Mantis, Joomla, Wordpress, Mambo, vBulletin, TYPO3, phpBB, Drupal, Prestashop and Magento.
- ExactFile is a Windows application that calculates MD-2/4/5, SHA-1/256/384/512, CRC32, Adler32, GOST, RIPEMD-128/160, TIGER-128/160/192 hashes for files, directories, and optionally subdirectories. The program can be associated with .md5, .sha1 and .sfv file extensions and can use these file formats to verify checksums. A command-line version of the program called EXF is available. Programs are in Beta (3/11).
- ilSFV uses .sfv, .md5 and .sha1 file extensions to verify file hashes. As of 3/11, website has no WOT rating, so be careful.
- Kana Checksum supports the CRC32 and MD5 hash algorithms. It can be used as a stand-alone program or integrated into Explorer as a context menu selection. Supports .md5 and .sfv files.
- Hasher supports the SHA1, MD5, CRC32 and ELF hash algorithms. It can calculate the hash of a file or text string.
- File Verifier ++ supports the CRC16/32, BZIP2 CRC, MPEG2 CRC, JamCRC, Posix CRC, ADLER32, MD4/5, EDONKEY2K, RIPEMD-128/160/256/320, SHA-1/224/256/384/512 and WHIRLPOOL algorithms. It's portable, but can also be integrated in to the Explorer context menu. A command-line version of the program is included. The program can calculate the hash for files, directories, subdirectories, and text strings. File selection can also be done using regular expressions. Can compare hashes to previously calculated values. Website has no WOT rating, so be careful. Beta software (3/11).
- Jacksum is a Java application that can work as a Windows or command-line application. It can also be a "send to" context menu option. It supports 58 hash algorithms and can calculate the hash values of text strings, files, directories and sub-directories. It can write hash values to files (e.g. .md5, .sfv). Java Runtime Environment required. Not sure if/how it compares hash values. Source code available. Informative website.
- MultiHasher supports the CRC32, MD5 and SHA-1/256/384/512 hashing functions. It can calculate hashes for files, multiple files and text strings. Support for .md5 and .sfv files. Program is in Beta (3/11).
- FlyingBit Hash Calculator is an Explorer context menu addition that supports CRC16/32/64, eDonkey/eMule, RIPEMD-160, MD5/MD4, Tiger and SHA-1 algorithms. Reads and writes .md5 files. Website has no WOT rating and Siteadvisor warns of a security risk as of 3/2011, so be careful.
- eXpress CheckSum Verifier (XCSV) & eXpress CheckSum Calculator (XCSC) WOT warns that the website has a poor reputation, so I didn't research these products any further (3/11).
Quick Selection Guide
HashCheck Shell Extension
Febooti fileTweak Hash & CRC
Microsoft File Checksum Integrity Verifier
This software review is maintained by volunteer editor CryptoSurfer.