What’s in a Word … ?

In an exchange with David Heffernan, both on SO and in the comments here on Te Waka, I had cause to climb into my own personal “Wayback Machine” and further investigate an apparent change in compiler behaviour between Delphi 2007 and 2009.

I first noticed this change when some code of mine stopped working in Delphi 2009. The instinctive reaction was “It must be some Unicode issue”, but it turned out that the “problem” was actually a fix to a compiler bug!

To illustrate the bug that was fixed, compile and run this code in Delphi 2007 or earlier and then compile and run it using Delphi 2009 or later:

{$apptype CONSOLE}

program overloads;

function Foo(const aWord: Word): Word; overload;
begin
  result := 42;
end;

function Foo(const aWord: Longword): Longword; overload;
begin
  result := 0;
end;

var
  i: Integer;
begin
  i := 300000;
  WriteLn(Foo(i));
end.

In Delphi 2007 and earlier, the compiler calls the Word version of Foo(), not the Longword version, so the program prints 42, even though i is declared as an Integer (32-bit), which is clearly too big for a Word (16-bit) parameter. Delphi 2009 and later call the Longword version and print 0.
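
If code depends on one particular overload being chosen, one way to take the decision away from the compiler is to cast the argument so that it matches a parameter type exactly. A minimal sketch, reusing the two Foo() overloads from the listing above (the program name is made up for illustration):

```pascal
program overloads_cast;

{$APPTYPE CONSOLE}

function Foo(const aWord: Word): Word; overload;
begin
  result := 42;
end;

function Foo(const aWord: Longword): Longword; overload;
begin
  result := 0;
end;

var
  i: Integer;
begin
  i := 300000;

  // The cast makes the argument type an exact match for the Longword
  // overload, so the choice no longer rests on how a given compiler
  // version resolves an Integer argument.
  WriteLn(Foo(Longword(i)));
end.
```

With the cast in place, the Longword version should be the one called regardless of compiler version, at the cost of stating the intended type at every call site.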

It’s staggering to me that this problem had not bitten me in the rear appendage before that occasion in 2008.

In hindsight it is even more staggering that the compiler did not – and does not – emit a warning when an ordinal value is truncated in this fashion.
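
To make the truncation concrete, here is a small sketch (the program name is hypothetical). It prints the low 16 bits of 300000, which is all that a Word parameter can hold, and then relies on range checking ({$R+}) so that an Integer-to-Byte assignment is at least caught at runtime:

```pascal
program truncation;

{$APPTYPE CONSOLE}
{$R+}  // range checking on: out-of-range assignments raise ERangeError

uses
  SysUtils;

var
  i: Integer;
  b: Byte;
begin
  i := 300000;            // $000493E0

  // Only the low 16 bits would survive in a Word: $93E0
  WriteLn(i and $FFFF);   // 37856

  try
    b := i;               // 300000 does not fit in a Byte
    WriteLn(b);
  except
    on E: ERangeError do
      WriteLn('ERangeError: ', E.Message);
  end;
end.
```

Without {$R+} the assignment silently keeps only the low 8 bits, so range checking moves the failure to runtime rather than preventing it at compile time.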

6 comments

  1. David Heffernan

    I think the compiler is still broken. There’s no rationale that I can see for picking one of these overloads. The documentation says (emphasis mine):

    You can pass to an overloaded routine parameters that are
    not identical in type with those in any of the routine’s
    declarations, but that are assignment-compatible with the
    parameters in more than one declaration. This happens
    most frequently when a routine is overloaded with different
    integer types or different real types – for example:

    procedure Store(X: Longint); overload;
    procedure Store(X: Shortint); overload;

    In these cases, when it is possible to do so without
    ambiguity, the compiler invokes the routine whose
    parameters are of the type with the smallest range that
    accommodates the actual parameters in the call.

    Well, neither Word nor LongWord accommodates Integer. So what gives? Why does this code compile at all?

    1. Deltics

      Well, of the two, Longword is the least broken when considering an Integer value. Longword and Integer are both 32 bits wide, so all the bits will get through at least, even if they might not necessarily mean on the other side what you meant when you passed them in. :)

      But yes, given the barrage of hints and warnings we get when moving strings around the place, the compiler could be a lot more helpful in this area, issuing hints and/or warnings when information is potentially lost in these sorts of cases, or even in simple assignments:

      var
        u: String;
        s: UTF8String;
        i: Integer;
        b: Byte;
      begin
        u := 'ASCII';   // These days a Unicode string
        s := u;         // WARNING! IMPLICIT CONVERSION! ALARM! ALARM!
      
        i := 300000;
        b := i;         // tumbleweeds ...
      end;
      

      Which to me is particularly annoying. The implicit string conversion is nevertheless a conversion which, in the case illustrated, won’t result in any data loss at all and should at most be a hint, whilst the compiler remains utterly silent about the truncation of a 32-bit value to just 8 bits (unless you have range checking enabled, in which case you, or your users, will at least get a slap in the face at runtime).

      1. David Heffernan

        Good comparison with string conversions.

        But forget about warnings/hints on these integer conversions. They should be errors. And your program in the post should simply not compile.

  2. David Heffernan

    One lesson to learn here is never to write overloads that are distinguished only by different integral types. The documentation is incomplete, the compiler’s behaviour is hard to predict, and the language sloppy.

    I’d name your functions: ReverseBytesWord and ReverseBytesLongWord.

  3. Denis

    I had the same problem earlier.

    In such cases I use logical names instead of overloads, like:

    procedure bswap16(var value: uint16);
    procedure bswap32(var value: uint32);
    procedure bswap64(var value: uint64);
    procedure bswapBuf(value: pbyte; size: uint32);
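
As a rough sketch of the naming approach suggested in the comments, here is what distinctly named routines might look like, using the names ReverseBytesWord and ReverseBytesLongWord proposed above. The bodies are an assumption for illustration only, not the original code that broke:

```pascal
program reversebytes;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// One name per width: the caller states which routine is meant,
// so overload resolution never enters into it.
function ReverseBytesWord(const aValue: Word): Word;
begin
  result := ((aValue and $00FF) shl 8) or (aValue shr 8);
end;

function ReverseBytesLongWord(const aValue: Longword): Longword;
begin
  result := ((aValue and $000000FF) shl 24)
         or ((aValue and $0000FF00) shl 8)
         or ((aValue and $00FF0000) shr 8)
         or (aValue shr 24);
end;

begin
  WriteLn(IntToHex(ReverseBytesWord($1234), 4));          // 3412
  WriteLn(IntToHex(ReverseBytesLongWord($12345678), 8));  // 78563412
end.
```

Passing an Integer to ReverseBytesWord still compiles, but the truncation now happens at a call site that names the 16-bit routine explicitly instead of being the result of an overload the compiler picked.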
