Delphi 2009 – String Performance

NOTE: Downloads are now fixed!

Andreas Hausladen generously took the time to make some detailed comments on my previous post, one of which prompted me to throw together some further performance test cases for String types specifically.  The results were something of a mixed bag and contained some surprises.

The Tests

Methodology

Methodology inevitably comes under scrutiny in any discussion of performance testing. I don’t intend to get into that discussion: relative differences are the point of interest here, and I have tried to level the playing field for the code under test as far as possible.  Of course, if anybody finds anything that completely invalidates any of the tests, that’s a different matter entirely.

Compiler settings were the same in all cases.  Specific settings of note are:

– For Delphi 2009, the new $STRINGCHECKS compiler setting is OFF.  It is worth noting that this is ON by default and incurs a performance penalty that is unnecessary unless your Delphi projects also make use of, or are made use of by, C++ Builder.

– For all versions of Delphi tested, FastMM 4.90 was used.

The memory manager in Delphi 2009 seems actually to be marginally more efficient even than this latest version of FastMM, which was interesting in itself, and for practical purposes of course we would use the built-in memory manager if that is the most efficient.

But for the purposes of comparative performance testing, using the same memory manager with all compilers ensures that results aren’t influenced by differences in memory manager implementations.
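As a rough sketch of the setup described above (not the actual test project; the program name is hypothetical), the string checks can be switched off with a directive guarded by a compiler version check, so that older compilers never see it. FastMM is conventionally listed as the first unit in the project's uses clause so that it is installed before any other unit allocates memory. The sketch assumes the FastMM4 unit is on the search path and that the directive only affects the file it appears in, in which case the project-level option is the more practical switch for a whole project:

```pascal
program StringPerfSetupSketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

// CompilerVersion 20.0 corresponds to Delphi 2009. Older compilers skip
// the guarded block, so they never see the unknown directive.
{$IF CompilerVersion >= 20.0}
  {$STRINGCHECKS OFF}
{$IFEND}

uses
  FastMM4,   // replacement memory manager; listed first so it is installed first
  SysUtils;

begin
  WriteLn('Test cases would run here.');
end.
```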

Areas Under Test

Andreas’ comments mentioned use of const parameter declarations specifically, so I looked at these in particular, comparing calls to a procedure with a const parameter and to a procedure with no const declaration on the same parameter.  Subsequent comments discussing the $STRINGCHECKS setting shed some light on Andreas’ initial findings, but still the results from these tests contained a surprise so I left them in.
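To make the const comparison concrete, here is a minimal sketch of the kind of pair being compared. The procedure names, the counter and the iteration count are all hypothetical and this is not the actual test code. Without const, the callee takes its own reference to the string, so the compiler emits reference counting and an implicit try/finally around the body; with const that bookkeeping is skipped.

```pascal
program ConstParamSketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  ITERATIONS = 1000000;  // arbitrary figure chosen for the sketch

var
  Total: Integer = 0;

// const parameter: no reference count adjustment on entry or exit
procedure TakesConst(const S: String);
begin
  Inc(Total, Length(S));
end;

// value parameter: the compiler adds a reference for the duration of the
// call and releases it in an implicit try/finally
procedure TakesValue(S: String);
begin
  Inc(Total, Length(S));
end;

var
  i: Integer;
  Data: String;
begin
  Data := 'The quick brown fox jumps over the lazy dog';

  for i := 1 to ITERATIONS do
    TakesConst(Data);

  for i := 1 to ITERATIONS do
    TakesValue(Data);

  WriteLn(Total);
end.
```

Timing each loop separately, with whatever timer is preferred, gives the relative difference between the two declarations.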

Simple tests were contrived for string assignment, concatenation and RTL routines Copy() and Delete().

I included IntToStr() in order to repeat the tests that previously gave me cause for concern with respect to ANSI string performance in Delphi 2009.

I also tested using Pos() to find a substring in a string – both where the substring of interest did exist within the subject string, and also where it did not.

The final test performed in each compiler version was for the Replace() function.

A number of additional tests were then run in specific compilers.

For Delphi 7 and Delphi 2007 I also included tests of the FastStrings FastPos() and FastReplace() functions.  I did not bother including these tests in Delphi 2009 following Andreas’ comments (confirmed by the tests) about FastStrings having been superseded by improvements in the RTL.

For Delphi 2009 I included repeats of the various tests using strings declared as ANSIString – no explicit conversions to Unicode were involved in any of these tests – all variables, parameters etc were declared consistently as ANSIString.

It was suggested that a comparison with WideString might be of interest.  I’m not convinced that this really is going to provide much insight – we already know that WideString is a very inefficient type that is not fundamentally changed in Delphi 2009, and that UnicodeString is vastly more efficient  – nevertheless I included these tests to satisfy those with curiosity in this area.

Also for Delphi 2009 I incorporated a repeat of the Concat test using a TStringBuilder, just to see how performance of this class compared with regular string building operations (specifically concatenation in this case).
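For readers unfamiliar with the class, the two styles of concatenation being compared look roughly like this. This is a sketch only, not the actual test code; the fragment and the iteration count are arbitrary, and it requires Delphi 2009, where TStringBuilder lives in SysUtils.

```pascal
program ConcatSketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;  // declares TStringBuilder in Delphi 2009

const
  ITERATIONS = 100000;   // arbitrary figure chosen for the sketch
  FRAGMENT   = 'abc';    // arbitrary piece appended on each pass

var
  i: Integer;
  s: String;
  sb: TStringBuilder;
begin
  // Regular string building: repeated concatenation onto a String
  s := '';
  for i := 1 to ITERATIONS do
    s := s + FRAGMENT;
  WriteLn(Length(s));

  // The same work through TStringBuilder; construction and destruction
  // sit outside the loop, as they would sit outside any timed region
  sb := TStringBuilder.Create;
  try
    for i := 1 to ITERATIONS do
      sb.Append(FRAGMENT);
    s := sb.ToString;
  finally
    sb.Free;
  end;
  WriteLn(Length(s));
end.
```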

The Tests and The Results

download: StringPerformance - Source (1.45KB)
added: 17/09/2008
description: Source code for string performance tests.

Without my Smoketest framework (coming soon, I promise!) this will not compile let alone produce any test results but I provide the source so that anyone can find artefacts in my test code that they feel might explain test results they believe to be inaccurate or unrepresentative.

download: StringPerformance - Test Results Data (19.21KB)
added: 17/09/2008
description: Test results data with pretty Excel formatting.

This isn’t how Smoketest emits the results by the way (I wish!).  The test project is compiled as a CONSOLE app and emits performance results as CSV output to the console which I redirected to files that I then imported into Excel.  As a result of this exercise I already have in mind some enhancements to Smoketest to make such comparative testing easier in the future.

For those who have no interest in the complete raw test results, here’s a capture of the summary tab showing the actual results data for Delphi 2009 and relative comparison of those results with those for Delphi 7 and Delphi 2007 (the result data for which is on separate sheets in the workbook).

The colour scheme should be fairly obvious, and follows a gradient scale from red (worse performance) to green (improved performance).  The third column of comparisons shows the relative performance in Delphi 2009 itself of ANSIString and WideString against String (i.e. UnicodeString).

First, The Good News

UnicodeString

On the basis of the tested operations, Unicode string performance appears – with some exceptions – to be generally only slightly slower than for ANSI strings in Delphi 7.  The gap between Delphi 2009 and 2007 is greater however, presumably because Delphi 2007 incorporates improvements over Delphi 7 as Andreas had suggested.

Unsurprisingly it is char-wise operations that suffer most – Copy() and Pos() for example.

ANSIString

Some surprising extremes here – some things are appreciably faster than Delphi 7, but some crucial operations are quite significantly slower and in some cases I’m at a loss to explain why.  The lack of an ANSI version of IntToStr() hurts badly if you are using ANSIString explicitly.

WideString

No real surprises here.  WideString is slow and performance isn’t really any different in Delphi 2009 compared to either Delphi 7 or Delphi 2007.

Two notable exceptions to this are WideChar indexing into a WideString and the Replace() function, both of which do seem to be improved compared to Delphi 7, but not compared to Delphi 2007.  That is, these improvements arrived in Delphi 2007 and remain in Delphi 2009, rather than being Delphi 2009 improvements per se.

FastStrings

Not highlighted in the above image, but available in the test result data, the comparative performance of FastStrings against the RTL routines in Delphi 7 and Delphi 2007 confirmed something that Andreas had suggested – FastStrings is no longer as generally useful as it once was.

Admittedly my tests only exercised FastPos() and FastReplace() as these are the routines most relevant to me.  Of these, FastPos() has clearly been superseded by improvements in the RTL, but not so FastReplace(), which is still twice as fast as the RTL StringReplace() routine (although the FastStrings version is admittedly slightly more cumbersome to invoke).

The Not-So Good News

As mentioned, overall UnicodeString performance is comparable to string handling performance in Delphi 7 and somewhat less efficient than Delphi 2007.

There are however a number of cases where the string handling in Delphi 2009 is not just slightly, but significantly worse.

The RTL Copy() routine is the most obvious and possibly significant example, being fully three times slower than the Delphi 2007 implementation.

The Pos() routine seems to have suffered quite badly in Delphi 2009, being only half as efficient as that in Delphi 2007.

The Bad News – ANSI Strings

When concerns about the impact of Unicode on applications have been raised, one suggestion has been to “ANSIfy” any necessary code.  That is, make string (and char etc) declarations explicitly ANSI in order to maintain previous ANSI string behaviour in those areas of an application where it is felt necessary.

Unfortunately if performance is a factor in such cases, this might not fly.
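A minimal sketch of what “ANSIfied” code looks like, and of where the IntToStr() cost mentioned earlier comes from (the names and loop count are hypothetical and this is not the actual test code). In Delphi 2009 IntToStr() returns a UnicodeString, so storing its result in an ANSIString involves a conversion on every call; in Delphi 7 and Delphi 2007 the same cast is effectively a no-op because String is already ANSIString.

```pascal
program AnsifySketch;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  ITERATIONS = 100000;  // arbitrary figure chosen for the sketch

var
  i: Integer;
  a: AnsiString;  // explicitly ANSI, regardless of what String maps to
  c: AnsiChar;
begin
  for i := 1 to ITERATIONS do
  begin
    // Delphi 2009: Unicode result converted down to ANSI on each pass.
    // The explicit cast only silences the implicit conversion warning;
    // the conversion itself still happens.
    a := AnsiString(IntToStr(i));
    c := a[1];
  end;

  WriteLn(a, ' ', c);
end.
```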

Conclusion

First it should go without saying that the significance of any of these results to you will vary according to your application needs.

For myself the findings were predominantly encouraging.

My two greatest concerns with the Unicode implementation were performance and memory footprint.  Having adopted UTF16, there is simply no avoiding the memory footprint issue, but it seems that the performance issue is – to a large extent – not a significant concern.

But this does not seem to be a universal truth.

If performance of string handling code is crucial to your application then it would perhaps be advisable to specifically test your implementation approach and ensure you are getting the most from the compiler.  When it was a simple question of ANSIString vs WideString it was pretty easy.

With UnicodeString and ANSIString it could be more complex.

For my own part, I am no longer too concerned about the FastStrings situation in Delphi 2009, although I might ANSIfy the FastStrings interfaces for my own use just in case, particularly FastReplace().

The project for which these concerns were most relevant for me is currently still in Delphi 5, so I suspect that an eventual migration to Delphi 2009 will actually show an overall improvement, relative to that Delphi 5 base (even though we already use FastStrings and FastMM).

Parting Thought:

- Wherefore art thou TStringBuilder?

The results for a simple string concatenation using TStringBuilder were staggering.  You will note from the test code that I was careful to not muddy the test results with construction and destruction of the TStringBuilder itself, but of course in real world usage some additional overhead is bound to be incurred by the need to erect and tear down TStringBuilder instances as needed.

I haven’t tested the advantage that TStringBuilder delivers in Delphi.NET, and if colleagues’ experiences with the C# equivalent are anything to go by, it could be huge.  But this simply doesn’t seem to apply on the Win32 side and in fact TStringBuilder seems to exact nothing but penalties – it results in harder to read code and seems to be slower – dramatically so – than “raw” string operations.

Certainly, if you were to use TStringBuilder with performance in mind, it looks like you could be making a terrible mistake, although the test in this exercise hardly constituted a comprehensive test of all TStringBuilder functionality.  Perhaps there are other operations where it is faster.

But it looks to me as if TStringBuilder is there primarily as a .NET compatibility fixture, rather than to provide any real benefit to developers of Win32 applications, with the possible exception of developers wishing or needing to single-source a Win32/.NET codebase where string handling performance isn’t a concern.


20 comments

  1. Bruce McGee

    Nice. Very detailed.

    I can’t download the source code or test results.

  2. Tjipke

    About TStringBuilder: even in .Net a stringbuilder is slower if you are only doing a (small) number of concatenations.
    And for Delphi stringbuilder is not needed: the stringbuilder way of concatenation is already built into string:
    1* a string will not be copied on assignment (a := a+b)
    2* (re)allocation will use FastMM, which uses a pretty efficient way of growing (like a stringbuilder should)

    I haven’t looked into the TStringBuilder implementation of D2009; but an implementation using a string as storage and inlining methods should be much slower than normal string concatenation ;-)

  3. Jolyon Smith

    Thanks for the alert on the downloads Bruce! I have to confess I didn’t test the downloads as I’ve not previously had problems with the download manager I’m using – it just works.

    Not this time though. I suspect it’s something to do with my ISP and the file types. Unfortunately I can’t fix it from here so it’ll have to wait until this evening.

  4. Kryvich

    The Bad News – ANSI Strings –

    It’s strange. I’ve just checked string indexing, and Delphi 2007 and Delphi 2009 generate the same code.

    var
      s: AnsiString;
      c: AnsiChar;
    begin
      s := 'The String';
      c := s[3];

    mov eax, [ebp-$04]
    movzx ebx, [eax+$02]

    I suggest double checking whether you set the Release configuration in Delphi 2009 when you ran your tests.

  5. Jolyon Smith

    Hi Kryvich.

    That’s interesting – my test involved indexing each char in a string, not picking out one particular char, so iterating from 1 to Length().

    My guess is that it’s the performance of Length() that is causing the problem then, even with stringchecks off.

    I’ve fixed the download links now so you can see the actual test code (I think my ISP didn’t like serving up the Excel and Delphi file extensions so I’ve zipped the files up now and everything is fine)

  6. Kryvich

    OK. I don’t have the Deltics.SmokeTest unit, so I copied your ANSIStringIndexing to a new project and compiled it with both Delphi 2007 and Delphi 2009.

    procedure ANSIStringIndexing;
    const
      ANSIDATA: ANSIString = 'The quick brown fox jumps over the lazy dog';
    var
      i: Integer;
      c: ANSIChar;
      s: ANSIString;
    begin
      s := ANSIDATA;
      for i := 1 to Length(s) do
        c := s[i];
    end;

    procedure TForm1.Button2Click(Sender: TObject);
    begin
      ANSIStringIndexing;
    end;

    I got identical code in both IDEs.

    [Kryvich provided identical assembler dumps from D2007 and 2009 which I have removed for the convenience of other readers of these comments - they were identical, I promise: Ed]

  7. Kryvich

    There are 2 calls: LStrLAsg and LStrClr.
    I looked deeper and found only 1 additional check, for Element Size, in LStrLAsg in Delphi 2009. That’s all.

    Why you got that difference – very strange.

  8. Jolyon Smith

    OK, well that’s interesting because it’s not actually what I was comparing in my test results.

    (btw – I hope you didn’t mind me snipping out the identical asm for the convenience of other readers/commenters)

    The ANSIString test case wasn’t actually performed in Delphi 2007 because the String test case is (or should be) identical. However, this made me go back and check and there was a small difference in the TString.Indexing case and the TANSIString.Indexing case – the ANSIString case assigned the typed const ANSIString to a local var, where the String case just indexed the typed const String directly.

    I corrected that discrepancy to end up with these cases:

    procedure TString.Indexing;
    var
      i: Integer;
      c: Char;
    begin
      for i := 1 to Length(STRDATA) do
        c := STRDATA[i];
    end;

    procedure TANSIString.Indexing;
    var
      i: Integer;
      c: ANSIChar;
    begin
      for i := 1 to Length(ANSIDATA) do
        c := ANSIDATA[i];
    end;

    And sure enough Delphi 2009 now gives up a test result the same as Delphi 2007 for ANSIString.

    But what’s really baking my noodle now is that in Delphi 2007 the two cases give different results – TANSIString is consistently 33% faster than TString! In Delphi 7 results are identical (matching the faster of the Delphi 2007 results).

    I’m beginning to wish I’d never started this! :)

  9. Kryvich

    > I hope you didn’t mind me snipping out the identical asm for the convenience of other readers/commenters

    Of course. Sorry for posting these dumps.

    > the two cases give different results – TANSIString is consistently 33% faster than TString!

    Jolyon, I think it’s hard to get correct results because of external factors: CPU load, multithreaded environment, CPU cache.

    To make results more reliable you can:
    – disable CPU cache,
    – load Windows in safe mode.

  10. Bruce McGee

    A couple of suggestions:

    1) In light of your CompilerVersion post, you should probably change these lines:

    {$if CompilerVersion > 18.5}

    to read

    {$if CompilerVersion >= 20.0}

    2) Maybe save the test results in an older format for people who don’t use Office 2007. There’s always the free Excel viewer, but you can’t do any “tweaking”.

    http://www.microsoft.com/downloadS/details.aspx?familyid=1CD6ACF9-CE06-4E1C-8DCF-F33F669DBC3A&displaylang=en

  11. Jolyon Smith

    @Kryvich – no worries about posting the dumps – please don’t think that I don’t appreciate it, it was just that once I was able to “certify” that they were identical the actual dumps themselves were surplus to requirement and a bit “noisy”, that’s all. But it was absolutely useful to have them posted in the first place, so all good.

    :)

    My feeling is that external factors should lead to variations and inconsistencies in results – i.e. in this case sometimes String might be faster than ANSIString and vice versa, or the difference may vary, but this strange discrepancy in Delphi 2007 is consistently repeatable even if I reverse the order of the tests (run ANSIString cases before String), for example.

    As far as multithreading influences might go – SmokeTest runs its test cases on a worker thread, separate from the main thread. I am running the tests on a dual core system, but SmokeTest is implemented to detect this and in an N-core situation it sets the affinity of the main thread to CPU 1 and the worker thread to CPU 2.

    It’s not perfect but it should, in theory at least, remove context switching artefacts from the results if nothing else.

    CPU Cache is a good point and I have a “note to self” to investigate adding calls to flush the CPU cache (something I discovered when researching my VMT patching hack) for performance test cases.

    I shall re-run the tests with these changes in place and post the results.

    @Bruce – I probably should but I’m not sure that SmokeTest itself will even compile on Delphi.NET and the compiler versions are accurate enough for Win32. :)

    I did think of saving in an older Excel format but the gradient colour scheme used in the results wasn’t portable, according to the file-save-as warning that I got anyway.

    But as well as the Excel viewer, there is also the Office Compatibility pack that, aiui, will allow older Office applications to open 2007 files:

    http://www.microsoft.com/downloads/details.aspx?FamilyID=941b3470-3ae9-4aee-8f43-c6bb74cd1466&displaylang=en

  12. Bruce McGee

    Just nit-picking a little.

    As for the results file, I was also thinking of people who don’t use Microsoft Office at all.

  13. Anders Isaksson

    Jolyon, if you followed the FastCode discussions in .basm, you might remember all the work that had to be done to get the benchmarks reliable, with statistical significance. It’s no easy feat to actually know what has been tested/measured…

    One thing I haven’t seen mentioned here is code alignment. From the FastCode experience, we know that code alignment can be the decisive factor for best performance – and we have no control over that!

    Experimenting with the code, adding nop instructions here and there to align the inner loops can give a large boost, which immediately is lost as soon as *any* code is changed *anywhere* else in the project…

    Then there’s also stack alignment of local variables, which also can give measurable differences for anything larger than two bytes. This can partly be controlled, by adding dummy declarations, but there isn’t any real control over this either.

    AndersI

  14. Jolyon Smith

    Hi Anders,

    Thanks for the detailed and insightful comments.

    No, I didn’t follow those discussions you refer to, but I am aware of the difficulty with metrics.

    I think the playing field was levelled as far as possible (following the “Redux” testing at least), given that I did not set out to establish the fastest possible code producible from each compiler, but rather to see how the results from each compare given the same set of typical conditions (bog standard project settings etc).

  15. Anders Isaksson

    Yes, sorry my comment didn’t come out clear – I was trying to explain why you can get some unintuitive results, like your

    “But what’s really baking my noodle now is that in Delphi 2007 the two cases give different results – TANSIString is consistently 33% faster than TString! In Delphi 7 results are identical (matching the faster of the Delphi 2007 results).”

    This may be just an artifact of code alignment in the library code, something which is more or less impossible to do anything about, and can drive anyone, trying to measure performance, crazy.

  16. Jolyon Smith

    Ah yes, I should also have clarified in my “Redux” post that this oddity disappeared, although that was probably apparent from the results in that later exercise, which to me “feel” right.

    There may be a couple of small surprises but there’s no longer anything truly befuddling or inexplicable in there.

    :)

  17. Pontus Berg

    POS() fails in this key instance. Any suggestion on how to fix that is appreciated.

    procedure TForm1.DBGrid1TitleClick(Column: TColumn);
    {$J+}
    const PreviousColumnIndex : integer = -1;
    {$J-}
    begin
      if DBGrid1.DataSource.DataSet is TCustomADODataSet then
        with TCustomADODataSet(DBGrid1.DataSource.DataSet) do
        begin
          try
            DBGrid1.Columns[PreviousColumnIndex].title.Font.Style :=
              DBGrid1.Columns[PreviousColumnIndex].title.Font.Style - [fsBold];
          except
          end;

          Column.title.Font.Style :=
            Column.title.Font.Style + [fsBold];
          PreviousColumnIndex := Column.Index;

          if (Pos(Column.Field.FieldName, Sort) = 1)
            and (Pos(' DESC', Sort) = 0) then
            Sort := Column.Field.FieldName + ' DESC'
          else
            Sort := Column.Field.FieldName + ' ASC';
        end;
    end;

  18. Jolyon Smith

    In what way is Pos() failing? I haven’t tried this code as of course I don’t have the necessary infrastructure in place.

    If you can post a self-contained example I’ll take a look, although CodeGear would probably be the most appropriate to look at any actual problems.

Comments are now closed.