C# SuperFastHash and MurmurHash2 implementations
I’ve been emailed about a SuperFastHash C# implementation, and I felt like doing low level stuff, since I’m knees deep in DDD at the moment. So I looked at my Pascal and BASM implementation of SuperFastHash and figured, I could totally make this in C#. Searching around if nobody else had done it already (then I could just send a link to that site as reply), I saw some articles analyzing SuperFastHash and breaking it. During that search I also found another Hasher called MurmurHash2 which passes some test very nicely and is a lot faster than SuperFastHash. But it seems that this hash is too simple and does have some vurnabilities as well, so I’ve implemented both in c# and left you with the choice. Important to know, these hashes are for hashtables and not meant for verifying or cryptology.
I'll discuss each hash separately and discuss my different implementation, in the end I’ll post the final implementation and let you chose yourself.
SuperFastHash
This one I already knew, so I implemented this one very fast to the following simple implementation. This implementation is very standard .NET code which just a few optimizations.
public class SuperFastHashSimple : IHashAlgorithm { public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = Convert.ToUInt32(dataLength); Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; while (numberOfLoops > 0) { hash += BitConverter.ToUInt16(dataToHash, currentIndex); UInt32 tmp = (UInt32)(BitConverter.ToUInt16(dataToHash, currentIndex + 2) << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; currentIndex += 4; numberOfLoops--; } switch (remainingBytes) { case 3: hash += BitConverter.ToUInt16(dataToHash, currentIndex); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex + 2]) << 18; hash += hash >> 11; break; case 2: hash += BitConverter.ToUInt16(dataToHash, currentIndex); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } }
The most performance was lost with the BitConverter.ToUInt16 which is implemented as a unsafe pointer cast, but does some checking before the cast, the next step was to inline the byte to UInt16 conversion.
public class SuperFastHashInlineBitConverter : IHashAlgorithm { public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; while (numberOfLoops > 0) { hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); UInt32 tmp = (UInt32)((UInt32)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8) << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; numberOfLoops--; } switch (remainingBytes) { case 3: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex]) << 18; hash += hash >> 11; break; case 2: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex] << 8); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } }
This is the fastest no-hacks managed implementation of the SuperFastHash, but the result will change if the Endianness changes. I’ve been looking for a way to cast the byte array to a int array without have to do Marshal.Copy, I found a dirty hack using FieldOffset(0), I have no idea if this hack is going to be supported on new versions of the CLR so using this is a risk. The only strange part is that the UInts.Length is the length of the bytes array, but the index is of the UInt16 array, so the max index for that array is UInts.Length / sizeof(UInt16). Below is the implementation of this hack.
public class SuperFastHashUInt16Hack : IHashAlgorithm { [StructLayout(LayoutKind.Explicit)] // no guarantee this will remain working struct BytetoUInt16Converter { [FieldOffset(0)] public Byte[] Bytes; [FieldOffset(0)] public UInt16[] UInts; } public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; UInt16[] arrayHack = new BytetoUInt16Converter { Bytes = dataToHash }.UInts; while (numberOfLoops > 0) { hash += arrayHack[currentIndex++]; UInt32 tmp = (UInt32)(arrayHack[currentIndex++] << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; numberOfLoops--; } currentIndex *= 2; // fix the length switch (remainingBytes) { case 3: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex]) << 18; hash += hash >> 11; break; case 2: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex] << 8); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } }
The last stap was to go to unsafe land and do real pointer stuff.. This implementation looks a lot like the original c implementation
public class SuperFastHashUnsafe : IHashAlgorithm { public unsafe UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 fixed (byte* firstByte = &(dataToHash[0])) { /* Main loop */ UInt16* data = (UInt16*)firstByte; for (; numberOfLoops > 0; numberOfLoops--) { hash += *data; UInt32 tmp = (UInt32)(*(data + 1) << 11) ^ hash; hash = (hash << 16) ^ tmp; data += 2; hash += hash >> 11; } switch (remainingBytes) { case 3: hash += *data; hash ^= hash << 16; hash ^= ((UInt32)(*(((Byte*)(data))+2))) << 18; hash += hash >> 11; break; case 2: hash += *data; hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += *((Byte*)data); hash ^= hash << 10; hash += hash >> 1; break; default: break; } } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } }
MurmurHash2
The next algorithm was MurmurHash2, the code is very simple, the only dificult part was the fall-trough case which luckely isn't supported in c#, below is the first implementation.
public class MurmurHash2Simple : IHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 currentIndex = 0; while (length >= 4) { UInt32 k = BitConverter.ToUInt32(data, currentIndex); k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; currentIndex += 4; length -= 4; } switch (length) { case 3: h ^= BitConverter.ToUInt16(data, currentIndex); h ^= (UInt32)data[currentIndex + 2] << 16; h *= m; break; case 2: h ^= BitConverter.ToUInt16(data, currentIndex); h *= m; break; case 1: h ^= data[currentIndex]; h *= m; break; default: break; } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } }
I've applied the same optimalizations as discussed with SuperFastHash so here is the fastest no-hacks managed implementation.
public class MurmurHash2InlineBitConverter : IHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 currentIndex = 0; while (length >= 4) { UInt32 k = (UInt32)(data[currentIndex++] | data[currentIndex++] << 8 | data[currentIndex++] << 16 | data[currentIndex++] << 24); k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; length -= 4; } switch (length) { case 3: h ^= (UInt16)(data[currentIndex++] | data[currentIndex++] << 8); h ^= (UInt32)(data[currentIndex] << 16); h *= m; break; case 2: h ^= (UInt16)(data[currentIndex++] | data[currentIndex] << 8); h *= m; break; case 1: h ^= data[currentIndex]; h *= m; break; default: break; } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } }
The dirty hack which I’ve already discussed worked miracles here as well. I've also added a different looping logic (stolen from SuperFastHash), this added a nice speed increase for the unsafe version.
public class MurmurHash2Unsafe : IHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public unsafe UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 remainingBytes = length & 3; // mod 4 Int32 numberOfLoops = length >> 2; // div 4 fixed (byte* firstByte = &(data[0])) { UInt32* realData = (UInt32*)firstByte; while (numberOfLoops != 0) { UInt32 k = *realData; k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; numberOfLoops--; realData++; } switch (remainingBytes) { case 3: h ^= (UInt16)(*realData); h ^= ((UInt32)(*(((Byte*)(realData)) + 2))) << 16; h *= m; break; case 2: h ^= (UInt16)(*realData); h *= m; break; case 1: h ^= *((Byte*)realData); h *= m; break; default: break; } } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } }
Measurements
I've implemented the same test case for the native version of SuperFastHash and MurmurHash2, the measured speed will be used as reference speed. SuperFastHash was clocked at a rate of 1611 MB/s and MurmurHash2 was clocked at 2312 MB/s.
I've based the tests on the speed test used on the MurmurHash2 page, I'm not sure they test every property of the hash function, but it's a nice way to compare the performance.
var data = new Byte[256 * 1024]; new Random().NextBytes(data); Thread.CurrentThread.Priority = ThreadPriority.Highest; Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.RealTime; if (Environment.ProcessorCount > 1) { Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(1 << (Environment.ProcessorCount - 1)); } foreach (var testSubject in toTest) { Stopwatch timer = Stopwatch.StartNew(); for (int i = 0; i < 9999; i++) { testSubject.Value.Hash(data); } timer.Stop(); Console.WriteLine("{0}:\t\t{1:F2} MB/s ({2})", testSubject.Key, (data.Length * (1000.0 / (timer.ElapsedMilliseconds / 9999.0))) / (1024.0 * 1024.0), timer.ElapsedMilliseconds); }
The test is pretty simple, test 256k random data 9999 times and extract the MB/s from it. (A test with 1k random data 99999 times showed the same speeds, so my implementations are stable enough). Below are the results for the different c# implementations.
Function | Speed | Relative to native |
---|---|---|
SuperFastHashSimple | 281 MB/s | 0.17x |
SuperFastHashInlineBitConverter | 780 MB/s | 0.48x |
SuperFastHashUInt16Hack | 1204 MB/s | 0.75x |
SuperFastHashUnsafe | 1308 MB/s | 0.82x |
MurmurHash2Simple | 486 MB/s | 0.21x |
MurmurHash2InlineBitConverter | 759 MB/s | 0.32x |
MurmurHash2UInt32Hack | 1430 MB/s | 0.62x |
MurmurHash2Unsafe | 2196 MB/s | 0.95x |
In conclusion the managed SuperFastHash implementation is only 25% slower than the unmanaged implementation, and if you really like speed you can get the Unsafe implementation at only 18% slower than unmanged. The MurmurHash2 managed implementation is 38% slower than the unmanaged (but still as fast as the unmanaged SuperFastHash) and the unsafe implementation is only 5% slower than the unmanaged version, I call that a nice result!
Final code
Here is the complete unit with all the above functions, as always you can also download it directly.
IHashingAlgorithm.cs
using System; namespace HashTableHashing { public interface IHashAlgorithm { UInt32 Hash(Byte[] data); } public interface ISeededHashAlgorithm : IHashAlgorithm { UInt32 Hash(Byte[] data, UInt32 seed); } }
SuperFastHash.cs
/***** BEGIN LICENSE BLOCK ***** * Version: MPL 1.1/GPL 2.0/LGPL 2.1 * * The contents of this file are subject to the Mozilla Public License Version * 1.1 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at * http://www.mozilla.org/MPL/ * * Software distributed under the License is distributed on an "AS IS" basis, * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License * for the specific language governing rights and limitations under the * License. * * The Original Code is HashTableHashing.SuperFastHash. * * The Initial Developer of the Original Code is * Davy Landman. * Portions created by the Initial Developer are Copyright (C) 2009 * the Initial Developer. All Rights Reserved. * * Contributor(s): * * * Alternatively, the contents of this file may be used under the terms of * either the GNU General Public License Version 2 or later (the "GPL"), or * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"), * in which case the provisions of the GPL or the LGPL are applicable instead * of those above. If you wish to allow use of your version of this file only * under the terms of either the GPL or the LGPL, and not to allow others to * use your version of this file under the terms of the MPL, indicate your * decision by deleting the provisions above and replace them with the notice * and other provisions required by the GPL or the LGPL. If you do not delete * the provisions above, a recipient may use your version of this file under * the terms of any one of the MPL, the GPL or the LGPL. * * ***** END LICENSE BLOCK ***** */ using System; using System.Runtime.InteropServices; namespace HashTableHashing { public class SuperFastHashSimple : IHashAlgorithm { public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = Convert.ToUInt32(dataLength); Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; while (numberOfLoops > 0) { hash += BitConverter.ToUInt16(dataToHash, currentIndex); UInt32 tmp = (UInt32)(BitConverter.ToUInt16(dataToHash, currentIndex + 2) << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; currentIndex += 4; numberOfLoops--; } switch (remainingBytes) { case 3: hash += BitConverter.ToUInt16(dataToHash, currentIndex); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex + 2]) << 18; hash += hash >> 11; break; case 2: hash += BitConverter.ToUInt16(dataToHash, currentIndex); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } } public class SuperFastHashInlineBitConverter : IHashAlgorithm { public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; while (numberOfLoops > 0) { hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); UInt32 tmp = (UInt32)((UInt32)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8) << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; numberOfLoops--; } switch (remainingBytes) { case 3: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex]) << 18; hash += hash >> 11; break; case 2: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex] << 8); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } } public class SuperFastHashUInt16Hack : IHashAlgorithm { [StructLayout(LayoutKind.Explicit)] // no guarantee this will remain working struct BytetoUInt16Converter { [FieldOffset(0)] public Byte[] Bytes; [FieldOffset(0)] public UInt16[] UInts; } public UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 Int32 currentIndex = 0; UInt16[] arrayHack = new BytetoUInt16Converter { Bytes = dataToHash }.UInts; while (numberOfLoops > 0) { hash += arrayHack[currentIndex++]; UInt32 tmp = (UInt32)(arrayHack[currentIndex++] << 11) ^ hash; hash = (hash << 16) ^ tmp; hash += hash >> 11; numberOfLoops--; } currentIndex *= 2; // fix the length switch (remainingBytes) { case 3: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex++] << 8); hash ^= hash << 16; hash ^= ((UInt32)dataToHash[currentIndex]) << 18; hash += hash >> 11; break; case 2: hash += (UInt16)(dataToHash[currentIndex++] | dataToHash[currentIndex] << 8); hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += dataToHash[currentIndex]; hash ^= hash << 10; hash += hash >> 1; break; default: break; } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } } public class SuperFastHashUnsafe : IHashAlgorithm { public unsafe UInt32 Hash(Byte[] dataToHash) { Int32 dataLength = dataToHash.Length; if (dataLength == 0) return 0; UInt32 hash = (UInt32)dataLength; Int32 remainingBytes = dataLength & 3; // mod 4 Int32 numberOfLoops = dataLength >> 2; // div 4 fixed (byte* firstByte = &(dataToHash[0])) { /* Main loop */ UInt16* data = (UInt16*)firstByte; for (; numberOfLoops > 0; numberOfLoops--) { hash += *data; UInt32 tmp = (UInt32)(*(data + 1) << 11) ^ hash; hash = (hash << 16) ^ tmp; data += 2; hash += hash >> 11; } switch (remainingBytes) { case 3: hash += *data; hash ^= hash << 16; hash ^= ((UInt32)(*(((Byte*)(data))+2))) << 18; hash += hash >> 11; break; case 2: hash += *data; hash ^= hash << 11; hash += hash >> 17; break; case 1: hash += *((Byte*)data); hash ^= hash << 10; hash += hash >> 1; break; default: break; } } /* Force "avalanching" of final 127 bits */ hash ^= hash << 3; hash += hash >> 5; hash ^= hash << 4; hash += hash >> 17; hash ^= hash << 25; hash += hash >> 6; return hash; } } }
MurmurHash2.cs
/***** BEGIN LICENSE BLOCK ***** * Version: MPL 1.1/GPL 2.0/LGPL 2.1 * * The contents of this file are subject to the Mozilla Public License Version * 1.1 (the "License"); you may not use this file except in compliance with * the License. You may obtain a copy of the License at * http://www.mozilla.org/MPL/ * * Software distributed under the License is distributed on an "AS IS" basis, * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License * for the specific language governing rights and limitations under the * License. * * The Original Code is HashTableHashing.MurmurHash2. * * The Initial Developer of the Original Code is * Davy Landman. * Portions created by the Initial Developer are Copyright (C) 2009 * the Initial Developer. All Rights Reserved. * * Contributor(s): * * * Alternatively, the contents of this file may be used under the terms of * either the GNU General Public License Version 2 or later (the "GPL"), or * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"), * in which case the provisions of the GPL or the LGPL are applicable instead * of those above. If you wish to allow use of your version of this file only * under the terms of either the GPL or the LGPL, and not to allow others to * use your version of this file under the terms of the MPL, indicate your * decision by deleting the provisions above and replace them with the notice * and other provisions required by the GPL or the LGPL. If you do not delete * the provisions above, a recipient may use your version of this file under * the terms of any one of the MPL, the GPL or the LGPL. * * ***** END LICENSE BLOCK ***** */ using System; using System.Runtime.InteropServices; namespace HashTableHashing { public class MurmurHash2Simple : ISeededHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 currentIndex = 0; while (length >= 4) { UInt32 k = BitConverter.ToUInt32(data, currentIndex); k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; currentIndex += 4; length -= 4; } switch (length) { case 3: h ^= BitConverter.ToUInt16(data, currentIndex); h ^= (UInt32)data[currentIndex + 2] << 16; h *= m; break; case 2: h ^= BitConverter.ToUInt16(data, currentIndex); h *= m; break; case 1: h ^= data[currentIndex]; h *= m; break; default: break; } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } } public class MurmurHash2InlineBitConverter : ISeededHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 currentIndex = 0; while (length >= 4) { UInt32 k = (UInt32)(data[currentIndex++] | data[currentIndex++] << 8 | data[currentIndex++] << 16 | data[currentIndex++] << 24); k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; length -= 4; } switch (length) { case 3: h ^= (UInt16)(data[currentIndex++] | data[currentIndex++] << 8); h ^= (UInt32)(data[currentIndex] << 16); h *= m; break; case 2: h ^= (UInt16)(data[currentIndex++] | data[currentIndex] << 8); h *= m; break; case 1: h ^= data[currentIndex]; h *= m; break; default: break; } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } } public class MurmurHash2UInt32Hack : ISeededHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; [StructLayout(LayoutKind.Explicit)] struct BytetoUInt32Converter { [FieldOffset(0)] public Byte[] Bytes; [FieldOffset(0)] public UInt32[] UInts; } public UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 currentIndex = 0; // array will be length of Bytes but contains Uints // therefore the currentIndex will jump with +1 while length will jump with +4 UInt32[] hackArray = new BytetoUInt32Converter { Bytes = data }.UInts; while (length >= 4) { UInt32 k = hackArray[currentIndex++]; k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; length -= 4; } currentIndex *= 4; // fix the length switch (length) { case 3: h ^= (UInt16)(data[currentIndex++] | data[currentIndex++] << 8); h ^= (UInt32)data[currentIndex] << 16; h *= m; break; case 2: h ^= (UInt16)(data[currentIndex++] | data[currentIndex] << 8); h *= m; break; case 1: h ^= data[currentIndex]; h *= m; break; default: break; } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } } public class MurmurHash2Unsafe : ISeededHashAlgorithm { public UInt32 Hash(Byte[] data) { return Hash(data, 0xc58f1a7b); } const UInt32 m = 0x5bd1e995; const Int32 r = 24; public unsafe UInt32 Hash(Byte[] data, UInt32 seed) { Int32 length = data.Length; if (length == 0) return 0; UInt32 h = seed ^ (UInt32)length; Int32 remainingBytes = length & 3; // mod 4 Int32 numberOfLoops = length >> 2; // div 4 fixed (byte* firstByte = &(data[0])) { UInt32* realData = (UInt32*)firstByte; while (numberOfLoops != 0) { UInt32 k = *realData; k *= m; k ^= k >> r; k *= m; h *= m; h ^= k; numberOfLoops--; realData++; } switch (remainingBytes) { case 3: h ^= (UInt16)(*realData); h ^= ((UInt32)(*(((Byte*)(realData)) + 2))) << 16; h *= m; break; case 2: h ^= (UInt16)(*realData); h *= m; break; case 1: h ^= *((Byte*)realData); h *= m; break; default: break; } } // Do a few final mixes of the hash to ensure the last few // bytes are well-incorporated. h ^= h >> 13; h *= m; h ^= h >> 15; return h; } } }
Tags: c#, optimization
Unknown said:
Hello,
at 17 March, 2009 14:00Any change to get Delphi version of MurMurHash2?
Davy Landman said:
Hello Tommi Prami,
at 17 March, 2009 15:09The MurMurHash2 is pretty simple, so why don't you try too implement it as a challenge? Using my Delphi implementation of SuperFastHash you could implement MurMurHash2 quickly.
If you run into troubles you're free to contact me for some tips/help.
Cheers,
Davy
Unknown said:
Hello,
at 18 March, 2009 06:08I was just hoping to get Lightning fast ASM-optimized version also ;)
I'll try to port it, I am not too good with shifts etc tough. Never done much of an bit Hacks...
I'll take our challenge, lets se am I worth it :)
-TP-
Unknown said:
Hello,
at 18 March, 2009 08:30Check .bas newsgroup of my try on this...
-TP-
brander said:
This comment has been removed by the author. at 07 February, 2011 21:34brander said:
curious if you've made a MurMurHash3 implementation now that the algorithm has been updated?
at 07 February, 2011 21:35Davy Landman said:
Hi Brander,
at 08 February, 2011 10:49I have not created a new version, your welcome to give it a go. You'll learn a lot from it.
Let me know if you implemented it, and I'll link to your post.
Cheers,
Davy Landman
Ian said:
Great article, thanks!
at 03 June, 2013 12:03You can do fall-through in a switch statement in C# as follows:
switch(x)
{
case 0:
// do something
goto case 1; // fall-through
case 1:
// do something
break; // exit
}
Ian said:
This comment has been removed by the author. at 03 June, 2013 12:03Ian said:
This comment has been removed by the author. at 03 June, 2013 12:04Ian said:
This comment has been removed by the author. at 03 June, 2013 12:06Ian said:
This comment has been removed by the author. at 03 June, 2013 12:06Ian said:
This comment has been removed by the author. at 03 June, 2013 12:06Unknown said:
Otarov Asylbek from Kazakhstan
at 18 May, 2014 09:29Hello , how are you ?
Thank you very much for your code.
I searched this murmurhash2 algorithm everywhere. If I found , this code was very difficult to understand .But your murmurhash2 code is very understandably for me. May I will use your code in my Diploma Project. I will wait your message , my email address otarov.1992@gmail.com. I will glad if you send me message.
Best regards
Otarov Asylbek from Kazakhsatn.
Unknown said:
Otarov Asylbek from Kazakhstan
at 18 May, 2014 09:30Hello , how are you ?
Thank you very much for your code.
I searched this murmurhash2 algorithm everywhere. If I found , this code was very difficult to understand .But your murmurhash2 code is very understandably for me. May I will use your code in my Diploma Project. I will wait your message , my email address otarov.1992@gmail.com. I will glad if you send me message.
Best regards
Otarov Asylbek from Kazakhsatn.
Davy Landman said:
Sure, license is liberal, just keep the license headers and you should be fine.
at 18 May, 2014 09:45Perhaps ask your supervisor about open source code being used?
flav3r said:
hey hello, i cannot understand where is IHashAlgorithm, can you help, thanks!
at 20 January, 2021 10:09flav3r said:
ok i got it, ignore that.
at 20 January, 2021 10:17