Comparing Large Bodies of Text with Hash Codes

Tuesday Jun 22nd 2004 by Tom Archer - MSFT

While most people think of hash codes in relation to cryptography, Tom Archer illustrates how they can be used to quickly and easily compare text values of virtually any size.

Welcome to this week's installment of .NET Tips & Techniques! Each week, award-winning Architect and Lead Programmer Tom Archer demonstrates how to perform a practical .NET programming task.

While most people think of hash codes in relation to security, they are also a fast means of comparing large text values: each hash code has a small, fixed size (16 bytes for MD5), so two hash codes can be compared quickly no matter how large the original text is. Using the standard Windows CryptoAPI can be very cumbersome, but the classes defined in the .NET System::Security::Cryptography namespace make using hash codes, and other cryptographic functions, easier and more accessible than ever. In this article, I illustrate just how easy it is to compare two text values in a .NET application using hash codes.

Creating a hash code for a body of text is as simple as deciding which hashing algorithm you wish to use (for example, MD5, SHA1, and so forth), instantiating the appropriate .NET service provider object, and then calling that object's ComputeHash method. (All hash algorithm classes ultimately derive from the HashAlgorithm class and inherit its ComputeHash method; each derived class supplies its algorithm by overriding the protected HashCore and HashFinal methods.) Other than that, there's just the typical conversion between Byte (or Char) arrays and String objects, and you're done.
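To make the idea concrete, here is a minimal sketch in standard C++ (not Managed Extensions, so it does not need the .NET Framework). It assumes a simple non-cryptographic hash, FNV-1a, as a stand-in for MD5 or SHA1; the names fnv1a64 and same_hash are hypothetical and are not part of any .NET class.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, 64-bit: a simple non-cryptographic hash, used here only as a
// stand-in for MD5 or SHA1 (an assumption made to keep the sketch short).
std::uint64_t fnv1a64(const std::string& text)
{
    std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : text)
    {
        hash ^= byte;               // mix in the next byte of the text
        hash *= 1099511628211ULL;   // multiply by the FNV prime
    }
    return hash;
}

// Hypothetical helper: compare two texts by comparing their hash codes
// rather than the texts themselves.
bool same_hash(const std::string& a, const std::string& b)
{
    return fnv1a64(a) == fnv1a64(b);
}
```

Calling same_hash on two copies of a 100,000-character string returns true; changing a single character in one copy makes it return false. As with any hash, two different texts can in principle produce the same value, so a matching hash indicates a very probable match rather than a guaranteed one.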

Figure 1 contains a screen capture of the demo application included with this article.

Figure 1: Simple C++ Managed Extensions example illustrating the comparison of two text (string) values using hash codes

The application uses the MD5 hash algorithm to compare two input strings. The two fields below the input fields display the resulting hash codes. Below you'll find the code used to generate those hash codes and compare the results.

The code first calls Encoding::ASCII->GetBytes to convert the String values returned from the input controls to Byte arrays. An MD5CryptoServiceProvider object is then instantiated and its ComputeHash method is called for each Byte array, returning a second Byte array that contains the hash code for the text value. The hash values are then converted to String values, displayed on the demo dialog, and compared for equality, with the result of the comparison shown in a message box. That's it: just a few lines of code to compare two text values of virtually any length!

using namespace System::Security::Cryptography;
using namespace System::Text;


private: System::Void btnCompare_Click(System::Object *  sender,
                                       System::EventArgs *  e)
{
  try
  {
    // Convert the text values into Byte arrays
    Byte ba1[] = Encoding::ASCII->GetBytes(txt1->Text);
    Byte ba2[] = Encoding::ASCII->GetBytes(txt2->Text);

    MD5CryptoServiceProvider* md5csp = new MD5CryptoServiceProvider();

    // Get the hash values for each text value using ComputeHash
    Byte baHashCode1[] = md5csp->ComputeHash(ba1);
    Byte baHashCode2[] = md5csp->ComputeHash(ba2);

    // Convert the two hash code arrays into strings for display
    // and comparison
    txtHash1->Text = BitConverter::ToString(baHashCode1);
    txtHash2->Text = BitConverter::ToString(baHashCode2);

    // Display the results of the comparisons of the two hash codes
    MessageBox::Show(
      String::Format(S"The two values are {0}",
                     (0 == String::Compare(txtHash1->Text,
                                           txtHash2->Text)
                       ? S"the same" : S"different")));
  }
  catch(Exception* ex)
  {
    // The handler body is missing from this listing; showing the
    // exception message is an assumption.
    MessageBox::Show(ex->Message);
  }
}
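
The listing converts each hash to a String with BitConverter::ToString, which formats every byte as two uppercase hexadecimal digits and separates the bytes with hyphens. The following standard C++ sketch mimics that format and also shows one possible alternative, comparing the raw hash bytes directly. The names to_hyphenated_hex and same_bytes are hypothetical and used only for illustration; they are not .NET APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Format bytes in the style of BitConverter::ToString: two uppercase hex
// digits per byte, with the pairs separated by hyphens.
std::string to_hyphenated_hex(const std::vector<std::uint8_t>& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        if (i != 0)
            text += '-';                  // separator between byte pairs
        text += digits[bytes[i] >> 4];    // high nibble
        text += digits[bytes[i] & 0x0F];  // low nibble
    }
    return text;
}

// Hypothetical alternative to comparing the formatted strings: compare
// the raw hash bytes directly.
bool same_bytes(const std::vector<std::uint8_t>& a,
                const std::vector<std::uint8_t>& b)
{
    return a == b;  // equal only if sizes and all elements match
}
```

For an MD5 hash, which is 16 bytes long, the formatted string is 47 characters: 32 hex digits plus 15 hyphens.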