.net – GetHashCode for comparison and equality – Education Career Blog

I have a program that must check whether a URL already exists in the database. If it exists, I should select its ID; if not, I should insert it into the database.

My question: is it a good approach to save the GetHashCode value in the database and compare only the hash codes? Can I be sure that two or more different URLs will never have equal hash codes? And does the result depend on which .NET Framework version is installed?

Thanks

,

  1. Don’t use the out-of-the-box GetHashCode(); it is weak and might change in the next version.
  2. Write your own hash function using SHA-1/SHA-2.
  3. You need to deal with escaping, i.e. ‘A B’ == ‘A%20B’.
  4. You also need to consider what to do about case sensitivity.
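
As a rough illustration of point 2, here is a minimal sketch that hashes a URL string with SHA-256 and returns the digest as hex, which could be stored in a fixed-width column. The names UrlHasher and Sha256Hex are made up for this example. It hashes the string exactly as given, so escaping and case (points 3 and 4) would have to be settled before calling it.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UrlHasher
{
    // Hypothetical helper: SHA-256 digest of the URL's UTF-8 bytes, as lowercase hex.
    // Unlike string.GetHashCode(), the result is defined by the algorithm,
    // so it does not depend on the runtime version.
    public static string Sha256Hex(string url)
    {
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(url));
            var sb = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
            {
                sb.Append(b.ToString("x2")); // two hex digits per byte
            }
            return sb.ToString();
        }
    }
}
```

The result is always 64 hex characters. Collisions are still possible in principle, only far less likely than with a 32-bit value.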

,

No, it is not a good idea, because GetHashCode() might return different results in the next .NET Framework version. See the remarks on MSDN.

,

Don’t use it as an identity: GetHashCode may return the same value for different strings.

The GetHashCode result is an Int32, so it can take only about 4e9 (2^32) different values. Since the number of web pages is already around this value (http://everything2.com/index.pl?node_id=1268366), you can be almost sure that some different URLs generate the same hash.
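
In practice a collision becomes likely long before the table holds 4e9 URLs. The sketch below uses the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2·buckets)), and assumes the hash values are spread uniformly, which GetHashCode does not guarantee. The class and method names are made up for this example.

```csharp
using System;

public static class CollisionEstimate
{
    // Birthday approximation for the chance that at least two of n items
    // share a hash value when there are `buckets` possible values.
    // Assumes uniformly distributed hashes (an idealisation).
    public static double Probability(double n, double buckets)
    {
        return 1.0 - Math.Exp(-n * (n - 1.0) / (2.0 * buckets));
    }
}
```

With buckets = 4294967296 (2^32), Probability(77163, 4294967296.0) comes out at roughly 0.5, so about 77,000 URLs are already enough for an even chance of at least one collision.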

,

If you really want to make sure no duplicates exist, you should just store the URL. The only thing you could do with a hash is use it as a first indicator that the URL might exist, but basically you're doing the indexing manually when a good DB could do this for you.

Apart from how to store it, the same URL can be represented by different strings, so it might be a good idea to specify how unique you want the URLs to be.
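
To make the "first indicator" idea concrete, here is a small in-memory sketch. The dictionary stands in for a database table with a non-unique index on a hash column. The hash only narrows the search, and the stored URL is always compared in full before an ID is returned. UrlStore, GetOrAdd and the injected hash function are all hypothetical names for illustration, not a real API.

```csharp
using System;
using System.Collections.Generic;

public sealed class UrlStore
{
    // Hash value -> rows (id, url) that share it. Stand-in for an indexed hash column.
    private readonly Dictionary<int, List<KeyValuePair<int, string>>> _byHash =
        new Dictionary<int, List<KeyValuePair<int, string>>>();
    private readonly Func<string, int> _hash;
    private int _nextId = 1;

    // The hash function is passed in so that a stable one can be used
    // instead of string.GetHashCode().
    public UrlStore(Func<string, int> hash)
    {
        _hash = hash;
    }

    // Returns the existing ID for the URL, or inserts it and returns the new ID.
    public int GetOrAdd(string url)
    {
        int h = _hash(url);
        List<KeyValuePair<int, string>> bucket;
        if (!_byHash.TryGetValue(h, out bucket))
        {
            bucket = new List<KeyValuePair<int, string>>();
            _byHash[h] = bucket;
        }
        foreach (KeyValuePair<int, string> row in bucket)
        {
            // A matching hash is only a hint; the full URL decides.
            if (string.Equals(row.Value, url, StringComparison.Ordinal))
            {
                return row.Key;
            }
        }
        int id = _nextId++;
        bucket.Add(new KeyValuePair<int, string>(id, url));
        return id;
    }
}
```

Even with a deliberately bad hash such as `s => 0`, two different URLs still get different IDs, because the comparison falls back to the URL itself. A unique index on the URL column gives the same guarantee without any of this code.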
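
One possible way to pin down "how unique" is to bring every URL into a canonical form before storing or hashing it. The sketch below leans on System.Uri for that. UrlNormalizer and Normalize are made-up names, and the exact canonicalisation rules are those of the Uri class, which have varied somewhat between framework versions, so treat the result as an assumption to verify on your target runtime.

```csharp
using System;

public static class UrlNormalizer
{
    // Hypothetical helper: parses the URL and returns Uri's canonical form.
    // Uri lowercases the scheme and host and escapes characters such as spaces;
    // the path keeps its case, since paths can be case sensitive on the server.
    // Throws UriFormatException if the input is not an absolute URL.
    public static string Normalize(string url)
    {
        var uri = new Uri(url.Trim(), UriKind.Absolute);
        return uri.AbsoluteUri;
    }
}
```

With this, "HTTP://Example.COM/A B" and "http://example.com/A%20B" are expected to map to the same string, while "/a" and "/A" stay distinct. Whether to also drop fragments, trailing slashes or query parameter order is a policy decision this sketch leaves open.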
