Fast reading of files using Memory Mapping
This post was migrated from my old blog delphi-snippets.blogspot.com; for an explanation of this switch, see my introduction post. It has been six months since I last posted something. Let's just say things got a little busy :). And posting source code on Blogspot seemed to be a bitch because Blogspot would filter out the line breaks. I solved that in the previous post by using a <br /> as a line break. But when copying and pasting from the page the newlines were lost (of course DelForExp fixes that, but still it sucked).
Now I have just a little bit of time, and a few articles I wanted to post. So after some testing I found out Blogspot fixed the line-break removal, and now I’ll try to post more frequently.
Now let’s get on topic. Memory Mapped Files (MMF) can be very helpful for reading large files. Looking around the internet you can find many advantages and disadvantages. The important thing is to think about what you're doing: MMF can be very fast in one application but slow in another. It all depends on the situation, and there are plenty of articles on the subject (for instance this one by the Delphi Compiler Team).
I like MMF a lot when using binary files of a certain format. Let’s assume we have the following file format:
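TCustomerStruct = packed record
  CustomerID: Longword;
  // Char is one byte in Delphi versions before 2009;
  // use AnsiChar in Unicode versions to keep the same layout on disk
  CustomerName: array[0..254] of Char;
  CustomerBirthDay: TDateTime;
  CustomerRate: Double;
  AccountManagerID: Longword;
end;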
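To make the on-disk layout concrete, here is a minimal sketch that prints the record size and derives the number of records from a file size. It assumes single-byte characters (AnsiChar, which is what Char meant before Delphi 2009) and a Windows target; RecordCount and the 2790-byte file size are made up for illustration.

```pascal
program RecordSizeDemo;
{$APPTYPE CONSOLE}

type
  TCustomerStruct = packed record
    CustomerID: Longword;
    CustomerName: array[0..254] of AnsiChar; // assumption: 1 byte per character
    CustomerBirthDay: TDateTime;
    CustomerRate: Double;
    AccountManagerID: Longword;
  end;

// Hypothetical helper: number of complete records in a file of FileBytes bytes.
function RecordCount(FileBytes: Int64): Int64;
begin
  Result := FileBytes div SizeOf(TCustomerStruct);
end;

begin
  // packed means no padding between fields: 4 + 255 + 8 + 8 + 4 bytes
  WriteLn('Record size: ', SizeOf(TCustomerStruct));
  WriteLn('Records in a 2790-byte file: ', RecordCount(2790));
end.
```

Because the record is packed, the file is simply one record after another, which is what makes both BlockRead and a mapped view able to treat it as an array.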
You could read this using BlockRead:
var
  CustomerFile: file of TCustomerStruct;
  Customers: array of TCustomerStruct;
  i: Integer;
begin
  AssignFile(CustomerFile, 'c:\customers.cus');
  Reset(CustomerFile); // open the file; if this fails there is nothing to close
  try
    SetLength(Customers, FileSize(CustomerFile)); // for a typed file, FileSize counts records
    if Length(Customers) > 0 then
      // pass the first element: the dynamic array variable itself is only a pointer to the data
      BlockRead(CustomerFile, Customers[0], Length(Customers)); // read the whole lot into the array
    for i := 0 to High(Customers) do
      // list all the customers in a memo
      memCustomerList.Lines.Add('Name: ' + Customers[i].CustomerName);
  finally
    CloseFile(CustomerFile);
  end;
end;
And now using MemoryMapping:
type
  TCustomerStructArray = array[0..MaxInt div SizeOf(TCustomerStruct) - 1] of TCustomerStruct;
  PCustomerStructArray = ^TCustomerStructArray;
var
  CustomerFile: TMappedFile;
  Customers: PCustomerStructArray;
  i: Integer;
begin
  // the constructor checks that the file exists and maps it
  CustomerFile := TMappedFile.Create('c:\customers.cus');
  try
    Customers := PCustomerStructArray(CustomerFile.Content); // not needed, but handy
    for i := 0 to CustomerFile.Size div SizeOf(TCustomerStruct) - 1 do
      memCustomerList.Lines.Add('Name: ' + Customers[i].CustomerName);
  finally
    CustomerFile.Free;
  end;
end;
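The MaxInt div SizeOf(TCustomerStruct) - 1 upper bound is the highest index the array type allows while keeping the type itself under the 2 GB limit. Declaring the type does not allocate anything; it only lets you index the mapped memory through a pointer, so the bound caps how many records (and thus how much memory) can be addressed at once.
The TMappedFile class is something I created myself so I can be lazy. Of course I will share that piece of code too.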
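The pointer-to-a-huge-array trick can be tried without any file at all. The following is only a sketch: TItem, TItemArray, PItemArray and SumIDs are hypothetical names, and a small stack array stands in for the mapped memory.

```pascal
program PointerArrayDemo;
{$APPTYPE CONSOLE}

type
  TItem = packed record
    ID: Longword;
    Rate: Double;
  end;
  // The bound only keeps the type below 2 GB; declaring it allocates nothing.
  TItemArray = array[0..MaxInt div SizeOf(TItem) - 1] of TItem;
  PItemArray = ^TItemArray;

// Sums the IDs of Count items that start at Buffer.
function SumIDs(Buffer: Pointer; Count: Integer): Int64;
var
  Items: PItemArray;
  i: Integer;
begin
  Items := PItemArray(Buffer); // reinterpret the raw memory as an array of records
  Result := 0;
  for i := 0 to Count - 1 do
    Inc(Result, Items^[i].ID);
end;

var
  Storage: array[0..2] of TItem; // stands in for the mapped file content
  i: Integer;
begin
  for i := 0 to 2 do
  begin
    Storage[i].ID := i + 1;
    Storage[i].Rate := 0.5 * i;
  end;
  WriteLn(SumIDs(@Storage, 3)); // 1 + 2 + 3
end.
```

The compiler never checks that the memory behind the pointer really holds that many records, so the caller has to pass a correct count, just as the loop above derives it from the file size.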
unit unFileMapping;
{
  Copyright (c) 2005-2006 by Davy Landman
  See the file COPYING.FPC, included in this distribution,
  for details about the copyright. Alternately, you may use this source under the provisions of MPL v1.x or later
  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
}
interface
uses
  Windows, SysUtils;
type
  TMappedFile = class
  private
    FMapping: THandle;
    FContent: Pointer;
    FSize: Integer;
    procedure MapFile(const AFileName: WideString);
  public
    constructor Create(const AFileName: WideString);
    destructor Destroy; override;
    property Content: Pointer read FContent;
    property Size: Integer read FSize;
  end;
implementation
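function FileExistsLongFileNames(const FileName: WideString): Boolean;
var
  Attributes: DWORD;
begin
  if Length(FileName) < 2 then
  begin
    Result := False;
    Exit;
  end;
  // compare characters rather than raw bytes: a WideChar is two bytes wide
  if (FileName[1] = '\') and (FileName[2] = '\') then
    Attributes := GetFileAttributesW(PWideChar(FileName)) // already a UNC path
  else
    Attributes := GetFileAttributesW(PWideChar(WideString('\\?\' + FileName)));
  // $FFFFFFFF (INVALID_FILE_ATTRIBUTES) means the call failed
  Result := (Attributes <> $FFFFFFFF) and ((Attributes and FILE_ATTRIBUTE_DIRECTORY) = 0);
end;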
{ TMappedFile }
constructor TMappedFile.Create(const AFileName: WideString);
begin
  inherited Create;
  if FileExistsLongFileNames(AFileName) then
    MapFile(AFileName)
  else
    raise Exception.Create('File "' + AFileName + '" does not exist.');
end;
destructor TMappedFile.Destroy;
begin
  if Assigned(FContent) then
    UnmapViewOfFile(FContent);
  // close the mapping even when no view was created
  if FMapping <> 0 then
    CloseHandle(FMapping);
  inherited;
end;
procedure TMappedFile.MapFile(const AFileName: WideString);
var
  FileHandle: THandle;
begin
  // compare characters: AFileName is a WideString, so a byte-wise compare against '\\' does not work
  if (Length(AFileName) >= 2) and (AFileName[1] = '\') and (AFileName[2] = '\') then
    { Already a UNC path }
    FileHandle := CreateFileW(PWideChar(AFileName), GENERIC_READ, FILE_SHARE_READ or
      FILE_SHARE_WRITE, nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0)
  else
    FileHandle := CreateFileW(PWideChar(WideString('\\?\' + AFileName)), GENERIC_READ, FILE_SHARE_READ or
      FILE_SHARE_WRITE, nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  // CreateFile reports failure with INVALID_HANDLE_VALUE, not 0
  Win32Check(FileHandle <> INVALID_HANDLE_VALUE);
  try
    // Size is an Integer, so files of 2 GB or more are not supported
    FSize := GetFileSize(FileHandle, nil);
    if FSize <> 0 then
    begin
      FMapping := CreateFileMappingW(FileHandle, nil, PAGE_READONLY, 0, 0, nil);
      Win32Check(FMapping <> 0);
    end;
  finally
    CloseHandle(FileHandle); // the mapping object keeps the file open
  end;
  if FSize = 0 then
    FContent := nil // an empty file cannot be mapped
  else
  begin
    FContent := MapViewOfFile(FMapping, FILE_MAP_READ, 0, 0, 0);
    Win32Check(FContent <> nil);
  end;
end;
end.
The big advantage is that with BlockRead you must either read the whole file into the array or buffer it in blocks yourself. With MMF there is no need to worry about that (unless you get very big files): Windows automatically pages the data into memory when it is accessed.
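For the very big files mentioned above, one option is to map only a window of the file instead of the whole thing. The sketch below is not part of TMappedFile; AlignDown and MapWindow are hypothetical helpers, and the offset of 100000 is an arbitrary example. It relies on the documented requirement that the offset passed to MapViewOfFile is a multiple of the system allocation granularity (typically 64 KB).

```pascal
program MapWindowDemo;
{$APPTYPE CONSOLE}
uses
  Windows, SysUtils;

// Rounds Offset down to a multiple of Granularity.
function AlignDown(Offset: Int64; Granularity: Cardinal): Int64;
begin
  Result := Offset - (Offset mod Granularity);
end;

// Maps Bytes bytes starting at Offset. ViewBase receives the address that
// must be passed to UnmapViewOfFile; the result points at Offset itself.
// The caller must make sure Offset + Bytes does not exceed the file size.
function MapWindow(Mapping: THandle; Offset: Int64; Bytes: Cardinal;
  out ViewBase: Pointer): Pointer;
var
  Info: TSystemInfo;
  Start: Int64;
  Delta: Cardinal;
begin
  GetSystemInfo(Info);
  Start := AlignDown(Offset, Info.dwAllocationGranularity);
  Delta := Cardinal(Offset - Start);
  ViewBase := MapViewOfFile(Mapping, FILE_MAP_READ,
    Int64Rec(Start).Hi, Int64Rec(Start).Lo, Bytes + Delta);
  Win32Check(ViewBase <> nil);
  Result := Pointer(PAnsiChar(ViewBase) + Delta);
end;

var
  FileHandle, Mapping: THandle;
  ViewBase, Data: Pointer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: MapWindowDemo <file of at least 100016 bytes>');
    Halt(1);
  end;
  FileHandle := CreateFile(PChar(ParamStr(1)), GENERIC_READ, FILE_SHARE_READ,
    nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
  Win32Check(FileHandle <> INVALID_HANDLE_VALUE);
  try
    Mapping := CreateFileMapping(FileHandle, nil, PAGE_READONLY, 0, 0, nil);
    Win32Check(Mapping <> 0);
    try
      Data := MapWindow(Mapping, 100000, 16, ViewBase);
      try
        WriteLn('Byte at offset 100000: ', PByte(Data)^);
      finally
        UnmapViewOfFile(ViewBase);
      end;
    finally
      CloseHandle(Mapping);
    end;
  finally
    CloseHandle(FileHandle);
  end;
end.
```

Moving through a large file then becomes a matter of unmapping the current window and mapping the next one, which keeps the address space in use small regardless of the file size.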