Landman Code Exploring my outer regions of coding.

A blog about C#, Delphi, assembler and general developer stuff.

Landman Code Exploring my outer regions of coding.

A blog about C#, Delphi, assembler and general developer stuff.

LINQ to Entities string based dynamic OrderBy

The String Maker picture by  hpk on http://flickr.com/photos/hpk/86047888/The problem I kept having was that at some point you’ll need to convert the GridView.SortExpression from your ASP.NET GridView to a Lamba Expression for your Queryable.OrderBy.

The challenge

You’ll get the string.

“OrderDate DESC”

And you have to translate that to:

ObjectContext.Orders.OrderByDescending(order => order.OrderDate)

In march/may I had spent a half day researching possibilities to solve this automatically. At that point I couldn’t find a clean solution. So I went with a generated solution (using T4 templates) which generated a big switch statement per entity.

A nice solution

We jump to november and I was browsing a little bit on StackOverflow to see if their were some interesting questions. My eye caught the question “How do I apply OrderBy on an IQueryable using a string column name within a generic extension method?”, and it got me started to solve the sorting challenge. I had some hours available so I thought I’d give it another try.

There are a few other’s who have created a solution using the Expression building method, and I’ve combined it and refactored it to make it match more with the current Entity Framework. I’ve also added support for an infinite amount of child entities.

/***** BEGIN LICENSE BLOCK *****
 * Version: MPL 1.1/GPL 2.0/LGPL 2.1
 *
 * The contents of this file are subject to the Mozilla Public License Version
 * 1.1 (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 * http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an "AS IS" basis,
 * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
 * for the specific language governing rights and limitations under the
 * License.
 *
 * The Original Code is LINQExtensions.StringFieldNameSortingSupport.
 *
 * The Initial Developer of the Original Code is
 * Davy Landman.
 * Portions created by the Initial Developer are Copyright (C) 2008
 * the Initial Developer. All Rights Reserved.
 *
 * Contributor(s):
 *
 *
 * Alternatively, the contents of this file may be used under the terms of
 * either the GNU General Public License Version 2 or later (the "GPL"), or
 * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
 * in which case the provisions of the GPL or the LGPL are applicable instead
 * of those above. If you wish to allow use of your version of this file only
 * under the terms of either the GPL or the LGPL, and not to allow others to
 * use your version of this file under the terms of the MPL, indicate your
 * decision by deleting the provisions above and replace them with the notice
 * and other provisions required by the GPL or the LGPL. If you do not delete
 * the provisions above, a recipient may use your version of this file under
 * the terms of any one of the MPL, the GPL or the LGPL.
 *
 * ***** END LICENSE BLOCK ***** */
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

namespace LINQExtensions
{
    public static class StringFieldNameSortingSupport
    {
        #region Private expression tree helpers

        private static LambdaExpression GenerateSelector<TEntity>(String propertyName, out Type resultType) where TEntity : class
        {
            // Create a parameter to pass into the Lambda expression (Entity => Entity.OrderByField).
            var parameter = Expression.Parameter(typeof(TEntity), "Entity");
            //  create the selector part, but support child properties
            PropertyInfo property;
            Expression propertyAccess;
            if (propertyName.Contains('.'))
            {
                // support to be sorted on child fields.
                String[] childProperties = propertyName.Split('.');
                property = typeof(TEntity).GetProperty(childProperties[0], BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
                propertyAccess = Expression.MakeMemberAccess(parameter, property);
                for (int i = 1; i < childProperties.Length; i++)
                {
                    property = property.PropertyType.GetProperty(childProperties[i], BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
                    propertyAccess = Expression.MakeMemberAccess(propertyAccess, property);
                }
            }
            else
            {
                property = typeof(TEntity).GetProperty(propertyName, BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
                propertyAccess = Expression.MakeMemberAccess(parameter, property);
            }
            resultType = property.PropertyType;
            // Create the order by expression.
            return Expression.Lambda(propertyAccess, parameter);
        }
        private static MethodCallExpression GenerateMethodCall<TEntity>(IQueryable<TEntity> source, string methodName, String fieldName) where TEntity : class
        {
            Type type = typeof(TEntity);
            Type selectorResultType;
            LambdaExpression selector = GenerateSelector<TEntity>(fieldName, out selectorResultType);
            MethodCallExpression resultExp = Expression.Call(typeof(Queryable), methodName,
                            new Type[] { type, selectorResultType },
                            source.Expression, Expression.Quote(selector));
            return resultExp;
        }
        #endregion
        public static IOrderedQueryable<TEntity> OrderBy<TEntity>(this IQueryable<TEntity> source, string fieldName) where TEntity : class
        {
            MethodCallExpression resultExp = GenerateMethodCall<TEntity>(source, "OrderBy", fieldName);
            return source.Provider.CreateQuery<TEntity>(resultExp) as IOrderedQueryable<TEntity>;
        }

        public static IOrderedQueryable<TEntity> OrderByDescending<TEntity>(this IQueryable<TEntity> source, string fieldName) where TEntity : class
        {
            MethodCallExpression resultExp = GenerateMethodCall<TEntity>(source, "OrderByDescending", fieldName);
            return source.Provider.CreateQuery<TEntity>(resultExp) as IOrderedQueryable<TEntity>;
        }
        public static IOrderedQueryable<TEntity> ThenBy<TEntity>(this IOrderedQueryable<TEntity> source, string fieldName) where TEntity : class
        {
            MethodCallExpression resultExp = GenerateMethodCall<TEntity>(source, "ThenBy", fieldName);
            return source.Provider.CreateQuery<TEntity>(resultExp) as IOrderedQueryable<TEntity>;
        }
        public static IOrderedQueryable<TEntity> ThenByDescending<TEntity>(this IOrderedQueryable<TEntity> source, string fieldName) where TEntity : class
        {
            MethodCallExpression resultExp = GenerateMethodCall<TEntity>(source, "ThenByDescending", fieldName);
            return source.Provider.CreateQuery<TEntity>(resultExp) as IOrderedQueryable<TEntity>;
        }
        public static IOrderedQueryable<TEntity> OrderUsingSortExpression<TEntity>(this IQueryable<TEntity> source, string sortExpression) where TEntity : class
        {
            String[] orderFields = sortExpression.Split(',');
            IOrderedQueryable<TEntity> result = null;
            for (int currentFieldIndex = 0; currentFieldIndex < orderFields.Length; currentFieldIndex++)
            {
                String[] expressionPart = orderFields[currentFieldIndex].Trim().Split(' ');
                String sortField = expressionPart[0];
                Boolean sortDescending = (expressionPart.Length == 2) && (expressionPart[1].Equals("DESC", StringComparison.OrdinalIgnoreCase));
                if (sortDescending)
                {
                    result = currentFieldIndex == 0 ? source.OrderByDescending(sortField) : result.ThenByDescending(sortField);
                }
                else
                {
                    result = currentFieldIndex == 0 ? source.OrderBy(sortField) : result.ThenBy(sortField);
                }
            }
            return result;
        }
    }
}

You can also download this file. All you have to do to use these extensions methods is add a using LINQExtensions;".

It's now very simple to write queries like this

var result = queryableOrders.OrderBy("Employee.LastName").ThenByDescending("Freight").ToList();

I’ve deliberately kept the parsing of the SortExpression out of these extension methods. In my opinion it should be kept outside of the string field names support. I’ve added a new extension method just for this purpose.

var result = queryableOrders.OrderUsingSortExpression("Employee.LastName, Freight DESC").ToList();

Bummer

When researching for writing this article I found the article Dynamic LINQ (Part 1: Using the LINQ Dynamic Query Library) by the Scott Guthrie, which does exactly as I developed. The only reason I still post this code is that it’s less heavy and not under de Microsoft Permissive License (but under a very flexible triple license).

So decide for yourself, use this code directly or download the samples pack and locate the DynamicLinq folder and use that library (which has some very cool stuff in it).

Tags: , ,

Binding the ObjectContext Life-Cycle to the current ASP.NET request

This spider thread picture from flickr is from Infinity Rain and has the nc-nd-2.0 license. My team has switched (with the coaching of class-a) to LINQ to Entities from the ADO.NET Entity Framework for our new applications. Because we came from ADO.NET Typed Datasets we hade to revise our architecture. Together with a team member we designed our new architecture (and asked Alex Thissen to review it). We used a new project to test our architecture and use pair programming to solve problems which arose during the development of the business layer.

We encapsulated the ObjectContext from the Entity Framework using the repository pattern. How we did this I might post about later, but now I’m going to discuss something else.

The problem was that each entity got it’s own repository, but you have to share the ObjectContext, or nothing works except the stuff that happens on that entity alone. Anyone who uses the LINQ to Entities (or LINQ to SQL) can guess what happened, a lot of Exceptions and non updating data. So there are two ways we could solve it, create an repository factory which would inject the correct ObjectContext or make some kind of singleton access to a ObjectContext. We chose for the second solution because we already had an repository factory which we’re planning to use as a unit of work pattern. So that repository factory couldn’t be a singleton and we still had to solve the same problem of the ObjectContext instances.

The singleton pattern I would have normally used was the fully lazy instantiation singleton by Jon Skeet which in my opinion should be mentioned in the MSDN P&P Article: Implementing Singleton in C#. But we wanted a ObjectContext per request, so it’s not really a singleton but more a more managed life-cycle. Although most of our applications are web based, I’d like it if it could survive outside of ASP.NET. Alex Thissen gave us a sample of how he did it, and we used it for a while. The code he provided is below.

public class DataContextFactory
{
    private const string DATACONTEXTKEY = "NorthwindataContext";
    private static object myLock = new object();

    internal static NorthwindEntities CurrentContext
    {
        get
        {
            NorthwindEntities context = DataContext;
            if (context == null)
            {
                lock (myLock)
                {
                    if (context == null)
                    {
                        context = new NorthwindEntities 
                        DataContext = context;
                    }
                }
            }
            return context;
        }
    }

    private static NorthwindEntities DataContext
    {
        get
        {
            if (IsInWebContext)
            {
                return (NorthwindEntities)HttpContext.Current.Items[DATACONTEXTKEY];
            }
            else
            {
                return (NorthwindEntities)CallContext.GetData(DATACONTEXTKEY);
            }
        }

        set
        {
            if (IsInWebContext)
            {
                HttpContext.Current.Items[DATACONTEXTKEY] = value;
            }
            else
            {
                CallContext.SetData(DATACONTEXTKEY, value);
            }
        }
    }

    private static bool IsInWebContext
    {
        get { return HttpContext.Current != null; }
    }
}

We used it in the beginning and it worked. But when I started developing a T4 template for our repositories I looked at the code again and it didn’t feel right, especially the locking. Because both the HttpContext.Current as the CallContext are unique for each thread, so why do we need to lock if only one thread access the same HttpContext.Current or CallContext (please correct me if I'm wrong). A little bit searching on the internet lead me to SimoneB's post about the ASP.NET Singleton-per-Page pattern, I combined this with the CallContext knowledge in my own DataContextFactory. The code for that factory is located below.

internal class DataContextFactory
{
    public static NorthwindEntities CurrentContext
    {
        get
        {
            if (IsInWebContext)
            {
                // support the difference between pages in Server.Transfer
                Page currentPage = HttpContext.Current.Handler as Page;
                if(currentPage == null)
                    throw new NullReferenceException("The page is null");
                return (currentPage.Items["NorthwindEntitiesSingleton"] ??
                       (currentPage.Items["NorthwindEntitiesSingleton"] =
                        new NorthwindEntities())) as NorthwindEntities;
            }
            else
            {
                NorthwindEntities result = CallContext.GetData("NorthwindEntitiesSingleton") as  NorthwindEntities;
                if (result == null)
                {
                    result = new NorthwindEntities();
                    CallContext.SetData("NorthwindEntitiesSingleton", result);
                }
                return result;
            }
        }
    }
    private static bool IsInWebContext
    {
        get { return HttpContext.Current != null; }
    }
}

Because our repositories are generated this DataContextFactory is also generated and therefore it has the string copied allover the place.

Our repositories and the repository factory both store the DataContext in a private field, so if the repository instance is stored in a session, the context will remain the same, but if you create a new repository you’ll have the current DataContext.  To make this more transparent we used the following rule: when we need to use the unit of work pattern, we store the repository factory in the session and only request repositories from that factory. If it’s just a simple page, request the desired repository directly. Below is an example of how a repository looks like using the ContextFactory.

public partial class CategoryRepository: ICategoryRepository
{
    private NorthwindEntities context;

    /// <summary>
    /// Implement this partial method to execute code after the constructor.
    /// </summary>
    partial void AfterConstructed();

    public CategoryRepository() :
        this (DataContextFactory.CurrentContext)
    {
    }
    internal CategoryRepository(NorthwindEntities context)
    {
        this.context = context;
        AfterConstructed();
    }
    // The rest
}

I just have one problem I haven’t figured out yet, the DataContext implements IDisposable, so when I’m done with it I should call Dispose to clean the resources. The problem is, I don’t know when to correctly call Dispose. So if anybody has a good idea how to solve this?

Tags: , , , , ,

SuperFastHash from Paul Hsieh Translated to Delphi and Borland Assembler (BASM)

This cheetah picture from flickr is from Don Van Dyke and has the nc-nd-2.0 license. I've been a member of the FastCode community for a little while. The FastCode community tries to optimize core functions from Delphi using mostly BASM. After they have a very intensive benchmark en validation process the winner is suggested to Borland/CodeGear through a QC report, and is likely included in the next version of Delphi. I even have a few functions which are winners in these benchmarks and have a QC report with my function as a suggestion.

How it started

Why this introduction? Well, the community lives in the Borland Newsgroup in the group borland.public.delphi.language.basm and it's also a general discussion place about 'hardcore' optimalizations for a Delphi function. More than a year ago Juhani Suhonen asked for a fast hash to use for his hashtable. I suggested the old but nicely performing elf-hash, but also noted a much better hash function I recently found. It was called SuperFastHash (SFH) and was created by Paul Hsieh to overcome his 'problems' with the hash functions from Bob Jenkins. Juhani asked if somebody could write the SFH function in basm. A few people worked on a basm implementation and posted it.

In this post I will roughly describe how I as a fresh assembler optimizer created my version of the SFH function.

The Delphi implementation

I started with a regular Delphi implementation of the algorithm, using the c code from Paul's article as reference. After I got my Delphi implementation compiled and working I looked at the BASM code using the CPU debug view and tried to find points where I could help the Delphi compiler generating better assembly. Below is my initial pascal version. (beware I will post my final version later on in this post).

function SuperFastHash(AData:pointer; ADataLength: integer):longword;
// Pascal translation of the SuperFastHash function by Paul Hsieh
// more info: http://www.azillionmonkeys.com/qed/hash.html
// Translation by: Davy Landman
// No warranties, but have fun :)
var
  TempPart: longword;
  RemainingBytes: integer;
begin
  if not Assigned(AData) or (ADataLength <= 0) then
  begin
    Result := 0;
    Exit;
  end;
  Result := ADataLength;
  RemainingBytes := ADataLength and 3;
  ADataLength := ADataLength shr 2; // div 4, so var name is not correct anymore..
  // main loop
  while ADataLength > 0 do
  begin
    inc(Result, PWord(AData)^);
    TempPart := (PWord(Pointer(Cardinal(AData)+2))^ shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    AData := Pointer(Cardinal(AData) + 4);
    inc(Result, Result shr 11);
    dec(ADataLength);
  end;
  // end case
  if RemainingBytes = 3 then
  begin
    inc(Result, PWord(AData)^);
    Result := Result xor (Result shl 16);
    Result := Result xor (PByte(Pointer(Cardinal(AData)+2))^ shl 18);
    inc(Result, Result shr 11);
  end
  else if RemainingBytes = 2 then
  begin
    inc(Result, PWord(AData)^);
    Result := Result xor (Result shl 11);
    inc(Result, Result shr 17);
  end
  else if RemainingBytes = 1 then
  begin
    inc(Result, PByte(AData)^);
    Result := Result xor (Result shl 10);
    inc(Result, Result shr 1);
  end;
  // avalance
  Result := Result xor (Result shl 3);
  inc(Result, Result shr 5);
  Result := Result xor (Result shl 4);
  inc(Result, Result shr 17);
  Result := Result xor (Result shl 25);
  inc(Result, Result shr 6);
end;

Warning this is not final code!

The BASM implementation

I posted this and later posted an assembly version of SFH, I used the CPU view and the code suggestions from Bob Gonder to create this BASM code. Using BASM you can control the exact usage of registers and you're able to fine tune the assembly much better (and easier) than writing Delphi code.

My enthusiasm got the better of me and I spent a whole lot of evenings optimizing the Delphi and assembly versions of the SFH function (while I should have been working on my graduation internship paper). The discussion on the newsgroup turned a few interesting corners about the different kind of definitions of a hashing function. Therefor I must note that this function should only be used for a hashtable! It's tuned for giving a good distribution but not for cryptographically or verification usage.

The final unit

I think I've posted my final functions in the discussion thread, but recently I created a separate function file including some new parts. For instance, I added an functions for larger data which is the same algorithm, but uses less jumps because I unrolled the function. Below is my final unit of the SuperFastHash function.

unit unSuperFastHash;
(*
  A Delphi and assembly translation of the SuperFastHash function by
  Paul Hsieh (http://www.azillionmonkeys.com/qed/hash.html).

  I got the idea for translating it due to borland.public.delphi.language.basm.
  See the full discussion at:
  http://groups.google.com/group/borland.public.delphi.language.basm/
  browse_thread/thread/96482ba4d1d5a016/7745466ab714c3b3

 ***** BEGIN LICENSE BLOCK *****
 * Version: MPL 1.1/GPL 2.0/LGPL 2.1
 *
 * The contents of this file are subject to the Mozilla Public License Version
 * 1.1 (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 * http://www.mozilla.org/MPL/
 *
 * Software distributed under the License is distributed on an "AS IS" basis,
 * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
 * for the specific language governing rights and limitations under the
 * License.
 *
 * The Original Code is SuperFastHash Delphi and BASM translation.
 *
 * The Initial Developer of the Original Code is
 * Davy Landman.
 * Portions created by the Initial Developer are Copyright (C) 2007
 * the Initial Developer. All Rights Reserved.
 *
 * Contributor(s):
 *
 *
 * Alternatively, the contents of this file may be used under the terms of
 * either the GNU General Public License Version 2 or later (the "GPL"), or
 * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
 * in which case the provisions of the GPL or the LGPL are applicable instead
 * of those above. If you wish to allow use of your version of this file only
 * under the terms of either the GPL or the LGPL, and not to allow others to
 * use your version of this file under the terms of the MPL, indicate your
 * decision by deleting the provisions above and replace them with the notice
 * and other provisions required by the GPL or the LGPL. If you do not delete
 * the provisions above, a recipient may use your version of this file under
 * the terms of any one of the MPL, the GPL or the LGPL.
 *
 * ***** END LICENSE BLOCK ***** *)

interface

{.$define ASMVersion}

function SuperFastHash(AData: pointer; ADataLength: Integer): Cardinal;
function SuperFastHashLargeData(AData: pointer; ADataLength: Integer): Cardinal;

implementation

// Pascal translation of the SuperFastHash function by Paul Hsieh
// more info: http://www.azillionmonkeys.com/qed/hash.html
function SuperFastHash(AData: pointer; ADataLength: Integer): Cardinal;
{$ifndef ASMVersion}
var
  TempPart: Cardinal;
  RemainingBytes: Integer;
  RemainingDWords: Integer;
begin
  if not Assigned(AData) or (ADataLength <= 0) then
  begin
    Result := 0;
    Exit;
  end;
  Result := ADataLength;
  RemainingBytes := ADataLength and 3; // mod 4
  RemainingDWords := ADataLength shr 2; // div 4

  // main loop
  while RemainingDWords > 0 do
  begin
    Result := Result + PWord(AData)^;
    // splitting the pointer math keeps the amount of registers pushed at 2
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));
    TempPart := (PWord(AData)^ shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));
    Result := Result + (Result shr 11);
    dec(RemainingDWords);
  end;
  // Handle end cases
  if RemainingBytes = 3 then
  begin
    Result := Result +    PWord(AData)^;
    Result := Result xor (Result shl 16);
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));   // skip to the last byte
    Result := Result xor ((PByte(AData)^ shl 18));
    Result := Result +   (Result shr 11);
  end
  else if RemainingBytes = 2 then
  begin
    Result := Result +    PWord(AData)^;
    Result := Result xor (Result shl 11);
    Result := Result +   (Result shr 17);
  end
  else if RemainingBytes = 1 then
  begin
    Result := Result +    PByte(AData)^;
    Result := Result xor (Result shl 10);
    Result := Result +   (Result shr 1);
  end;
  // Force "avalanching" of final 127 bits
  Result := Result xor (Result shl 3);
  Result := Result +   (Result shr 5);
  Result := Result xor (Result shl 4);
  Result := Result +   (Result shr 17);
  Result := Result xor (Result shl 25);
  Result := Result +   (Result shr 6);
{$else}
asm
    push  esi
    push  edi
    test  eax, eax // data
    jz    @Ret // eax is result
    xchg  edx, eax // swith data and length
    test  eax, eax // length, and hash
    jle    @Ret
@Start:
    mov   edi, eax
    mov   esi, eax
    and   edi, 3    // last few bytes
    shr   esi, 2    // number of DWORD loops
    jz    @Last3
@Loop:
    movzx ecx, word ptr [edx]
    add   eax, ecx
    movzx ecx, word ptr [edx + 2]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx
    add   edx, 4
    dec   esi
    jnz   @Loop
@Last3:
    test  edi, edi
    jz    @Done
    dec   edi
    jz    @OneLeft
    dec   edi
    jz    @TwoLeft

    movzx ecx, word ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 16
    xor   eax, ecx
    movsx ecx, byte ptr [edx + 2]
    shl   ecx, 18
    xor   eax, ecx
    mov   ecx, eax
    shr   ecx, 11
    add   eax, ecx
    jmp   @Done
@TwoLeft:
    movzx ecx, word ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 11
    xor   eax, ecx
    mov   ecx, eax
    shr   eax, 17
    add   eax, ecx
    jmp   @Done
@OneLeft:
    movsx ecx, byte ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 10
    xor   eax, ecx
    mov   ecx, eax
    shr   eax, 1
    add   eax, ecx
@Done:
    // avalanche
    mov   ecx, eax
    shl   eax, 3
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 5
    add   eax, ecx

    mov   ecx, eax
    shl   eax, 4
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 17
    add   eax, ecx

    mov   ecx, eax
    shl   eax, 25
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 6
    add   eax, ecx
@Ret:
    pop   edi
    pop   esi
    ret
{$endif}
end;

function SuperFastHashLargeData(AData: pointer; ADataLength: Integer): Cardinal;
{$ifndef ASMVersion}
type
  TWordArray = array[0..(MaxInt div SizeOf(Word)) - 1] of Word;
  PWordArray = ^TWordArray;
var
  TempPart: Cardinal;
  RemainingBytes: Integer;
  RemainingDWords: Integer;
begin
  if not Assigned(AData) or (ADataLength <= 0) then
  begin
    Result := 0;
    Exit;
  end;
  Result := ADataLength;
  RemainingBytes := ADataLength and 3;
  RemainingDWords := ADataLength shr 2; // div 4
  // large loop
  while RemainingDWords >= 4 do
  begin
    Result := Result + PWord(AData)^;
    TempPart := (PWordArray(AData)^[1] shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    Result := Result + (Result shr 11);

    Result := Result + PWordArray(AData)^[2];
    TempPart := (PWordArray(AData)^[3] shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    Result := Result + (Result shr 11);

    Result := Result + PWordArray(AData)^[4];
    TempPart := (PWordArray(AData)^[5] shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    Result := Result + (Result shr 11);

    Result := Result + PWordArray(AData)^[6];
    TempPart := (PWordArray(AData)^[7] shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    Result := Result + (Result shr 11);

    // update the pointer and the counter
    AData  := Pointer(Cardinal(AData) + (8 * SizeOf(Word)));
    RemainingDWords := RemainingDWords - 4;
  end;
  // small loop
  while RemainingDWords > 0 do
  begin
    Result := Result + PWord(AData)^;
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));
    TempPart := (PWord(AData)^ shl 11) xor Result;
    Result := (Result shl 16) xor TempPart;
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));
    Result := Result + (Result shr 11);
    dec(RemainingDWords);
  end;
  // Handle end cases
  if RemainingBytes = 3 then
  begin
    Result := Result +    PWord(AData)^;
    Result := Result xor (Result shl 16);
    AData  := Pointer(Cardinal(AData) + SizeOf(Word));   // skip to the last byte
    Result := Result xor ((PByte(AData)^ shl 18));
    Result := Result +   (Result shr 11);
  end
  else if RemainingBytes = 2 then
  begin
    Result := Result +    PWord(AData)^;
    Result := Result xor (Result shl 11);
    Result := Result +   (Result shr 17);
  end
  else if RemainingBytes = 1 then
  begin
    Result := Result +    PByte(AData)^;
    Result := Result xor (Result shl 10);
    Result := Result +   (Result shr 1);
  end;
    // Force "avalanching" of final 127 bits
  Result := Result xor (Result shl 3);
  Result := Result +   (Result shr 5);
  Result := Result xor (Result shl 4);
  Result := Result +   (Result shr 17);
  Result := Result xor (Result shl 25);
  Result := Result +   (Result shr 6);
{$else}
 asm
    push  esi
    push  edi
    test  eax, eax // test for nil pointer
    jz    @Ret     // eax is also result, so save ret here
    xchg  edx, eax // swith data and length
    test  eax, eax // length, and hash
    jle    @Ret
@Start:
    mov   edi, eax
    mov   esi, eax
    and   edi, 3    // last few bytes
    shr   esi, 2    // number of DWORD loops
    jz    @Last3
@LargeLoop:
    cmp esi,$04
    jl @Loop
    // first DWORD
    movzx ecx, word ptr [edx]
    add   eax, ecx
    movzx ecx, word ptr [edx + 2]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx
    // second DWORD
    movzx ecx, word ptr [edx + 4]
    add   eax, ecx
    movzx ecx, word ptr [edx + 6]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx

    // third DWORD
    movzx ecx, word ptr [edx + 8]
    add   eax, ecx
    movzx ecx, word ptr [edx + 10]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx

    // fourth DWORD
    movzx ecx, word ptr [edx + 12]
    add   eax, ecx
    movzx ecx, word ptr [edx + 14]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx

    add   edx, 16
    sub   esi, 4
    jz    @Last3
    jmp   @LargeLoop
@Loop:
    movzx ecx, word ptr [edx]
    add   eax, ecx
    movzx ecx, word ptr [edx + 2]

    shl   ecx, 11
    xor   ecx, eax
    shl   eax, 16

    xor   eax, ecx
    mov   ecx, eax

    shr   eax, 11
    add   eax, ecx
    add   edx, 4
    dec   esi
    jnz   @Loop
@Last3:
    test  edi, edi
    jz    @Done
    dec   edi
    jz    @OneLeft
    dec   edi
    jz    @TwoLeft

    movzx ecx, word ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 16
    xor   eax, ecx
    movsx ecx, byte ptr [edx + 2]
    shl   ecx, 18
    xor   eax, ecx
    mov   ecx, eax
    shr   ecx, 11
    add   eax, ecx
    jmp   @Done
@TwoLeft:
    movzx ecx, word ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 11
    xor   eax, ecx
    mov   ecx, eax
    shr   eax, 17
    add   eax, ecx
    jmp   @Done
@OneLeft:
    movsx ecx, byte ptr [edx]
    add   eax, ecx
    mov   ecx, eax
    shl   eax, 10
    xor   eax, ecx
    mov   ecx, eax
    shr   eax, 1
    add   eax, ecx
@Done:
    // avalanche
    mov   ecx, eax
    shl   eax, 3
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 5
    add   eax, ecx

    mov   ecx, eax
    shl   eax, 4
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 17
    add   eax, ecx

    mov   ecx, eax
    shl   eax, 25
    xor   eax, ecx

    mov   ecx, eax
    shr   eax, 6
    add   eax, ecx
@Ret:
    pop   edi
    pop   esi
    ret
{$endif}
end;

end.

To use this code you'll only have to download the unit and choose if you want to use the Pascal versions or the Assembly versions. You can switch between this using the {$define ASMVersion} preprocessor.

Difference between my version and Paul Hsieh's

The only difference between my Pascal code and Paul's c code is the pointer mathematics in the core loop. Below is the core loop c code.

for (;len > 0; len--) {
    hash  += get16bits (data);
    tmp    = (get16bits (data+2) << 11) ^ hash;
    hash   = (hash << 16) ^ tmp;
    data  += 2*sizeof (uint16_t);
    hash  += hash >> 11;
}

And this is my Pascal version of this code.

while RemainingDWords > 0 do
begin
  Result := Result + PWord(AData)^;
  // splitting the pointer math keeps the amount of registers pushed at 2
  AData  := Pointer(Cardinal(AData) + SizeOf(Word));
  TempPart := (PWord(AData)^ shl 11) xor Result;
  Result := (Result shl 16) xor TempPart;
  AData  := Pointer(Cardinal(AData) + SizeOf(Word));
  Result := Result + (Result shr 11);
  dec(RemainingDWords);
end;

Paul's code increments the pointer at the end data += 2*sizeof (uint16_t);, and uses get16bits (data+2) to get to the second Word while mine increases the data pointer in between, because the Delphi Compiler will not compile PWord(Pointer(Cardinal(AData)+2))^ into movzx ecx, word ptr [edx + 2] but will store the intermediate pointer in a extra register which causes performance loss.

The funny thing is, although I tried a lot of different ways to get a better performance I ended up with an implementation almost identical to Paul's. I hadn't found the assembly version he created until after I wrote my own, and I think it's nice to see that apart from the initialization and the case statement they are the same.

Next time I will post my benchmarking code.

Tags: , , ,