Landman Code Exploring my outer regions of coding.

A blog about C#, Delphi, assembler and general developer stuff.

Landman Code Exploring my outer regions of coding.

A blog about C#, Delphi, assembler and general developer stuff.

Getting (possibly) 500% speed gain on divisions

This post was migrated from My old blog delphi-snippets.blogspot.com, for explanation about this switch see my introduction post. One of my fellow students once made a remark about the old days when a division was much faster if it was written as multiplication.

This means, instead of devising with a 100, you multiply with 0.01 (1/100). Because some programs of mine have a lot of divisions in their core loops, I investigated the difference.

I created the following test program.

program DivVsMult;
{$APPTYPE CONSOLE}
uses
  SysUtils,
  Windows; 
 
var 
  Start,Stop, Start2, Stop2, Freq:int64;
  i : integer; 
  t : real; 
  CpuSpeed : integer; 
begin 
  { TODO -oUser -cConsole Main : Insert code here } 
  { Cpu Speed fastes cpu = 1 slower => 10 
    it's just to determin the number of time to do the loop 
    Maxint div CpuSpeed is calculated } 
  if ParamCount = 1 then 
    CpuSpeed := StrToIntDef(ParamStr(1),1) 
  else 
    CpuSpeed := 10; 
  Writeln('Simple Number division:'); 
  Writeln('Calculating'); 
  QueryPerformanceFrequency(freq); 
  QueryPerformanceCounter(Start); 
  for i:=0 to MaxInt div  CpuSpeed do 
    t := i / 100;
  QueryPerformanceCounter(Stop); 
  Writeln(Format('First Pass Result:  %f',[t])); 
   { This is needed because the compiler would optimize, 
     and would notice the result of the loop isn't used at all, 
     so therefor the result is useless.. so depending on the compiler, it will 
     choose what to do with it, this disables that optimization   } 
  QueryPerformanceCounter(Start2); 
  for i:=0 to MaxInt div CpuSpeed do 
    t := i * (1/100); 
  QueryPerformanceCounter(Stop2); 
  Writeln(Format('Second Pass Result: %15.6f',[t])); 
   { This is needed because the compiler would optimize, 
     and would notice the result of the loop isn't used at all, 
     so therefor the result is useless.. so depending on the compiler, it will 
     choose what to do with it, this disables that optimization   } 
  Writeln('Done, Results:'); 
  Writeln(Format('/ 100  Time:  %6.4f seconds'+#13#10+ 
                 '/ 100  Clock: %d ticks'+#13#10+ 
                 '* 0.01 Time:  %6.4f seconds'+#13#10+ 
                 '* 0.01 Clock: %d ticks',[(Stop-Start) / freq, (Stop-Start), (Stop2-Start2) / freq, (Stop2-Start2)]));
  Writeln; 
  Writeln('Odd Number division:'); 
  QueryPerformanceCounter(Start); 
  for i:=0 to high(i) div CpuSpeed do
    t := i / 556; 
  QueryPerformanceCounter(Stop); 
  Writeln(Format('First Pass Result:  %15.6f',[t])); 
   { This is needed because the compiler would optimize, 
     and would notice the result of the loop isn't used at all, 
     so therefor the result is useless.. so depending on the compiler, it will 
     choose what to do with it, this disables that optimization   } 
  QueryPerformanceCounter(Start2); 
  for i:=0 to high(i) div CpuSpeed do 
    t := i * (1/556); 
  QueryPerformanceCounter(Stop2); 
  Writeln(Format('Second Pass Result: %15.6f',[t])); 
   
  Writeln(Format('/ 556     Time:  %6.4f seconds'+#13#10+ 
                 '/ 556     Clock: %d ticks'+#13#10+ 
                 '* (1/556) Time:  %6.4f seconds'+#13#10+ 
                 '* (1/556) Clock: %d ticks',[(Stop-Start) / freq, (Stop-Start), (Stop2-Start2) / freq, (Stop2-Start2)]));
  Writeln(Format('  (1/556) = %15.14f (approximate)',[1/556])); 
  Readln;
 
end.



On an old P3 900Mhz:

Simple Number division:
Calculating
First Pass Result: 2147483,64
Second Pass Result: 2147483,64
Done, Results:
/ 100 Time: 10,3319 seconds
/ 100 Clock: 36983482 ticks
* 0.01 Time: 2,0378 seconds
* 0.01 Clock: 7294251 ticks

Odd Number division:
First Pass Result: 386238,0647482014610000
Second Pass Result: 386238,0647482014610000
Done, Results:
/ 556 Time: 10,0735 seconds
/ 556 Clock: 36058581 ticks
* (1/556) Time: 2,0446 seconds
* (1/556) Clock: 7318775 ticks
(1/556) = 0,0017985611510791 (approximate)

On a new P4 2.3 Ghz:

Simple Number division:
Calculating
First Pass Result: 2147483.64
Second Pass Result: 2147483.64
Done, Results:
/ 100 Time: 4.6227 seconds
/ 100 Clock: 16547055 ticks
* 0.01 Time: 1.0782 seconds
* 0.01 Clock: 3859508 ticks

Odd Number division:
First Pass Result: 386238.064748
Second Pass Result: 386238.064748
Done, Results:
/ 556 Time: 4.5820 seconds
/ 556 Clock: 16401425 ticks
* (1/556) Time: 12.1746 seconds
* (1/556) Clock: 43579366 ticks
(1/556) = 0.00179856115108 (approximate)

The results are variating, on simple numbers like 0.01 the speedup is allways working, but somehow the very complex numbers tend to be slower sometimes.

I use this tip a lot when working with percentage.

Tags: ,

Post a Comment