Optimize this compass directions code

Mark_Griffiths · 3 November 2021 09:05

Since you’ve got experience in the area and I couldn’t find a definitive answer, can I ask how are the edge cases handled? i.e. is 11.25 degrees North, or North North East? Similarly, is 33.75 degrees North North East, or North East.

I have a vague recollection that edge cases always went to the more significant bearing - i.e 11.25 would be North and 33.75 would be North East.

If my memory is correct, we’d have to make use of bankers rounding after all to achieve this:

Result := TCompassDirection(Round(inBearing / DEGREES_PER_DIRECTION) mod NUMBER_OF_DIRECTIONS);

Edit: Looking at your post again, it suggests that it always goes to the counter clockwise bearing, so we’d actually have to do:

Result := TCompassDirection(Ceil((inBearing - ANTI_CLOCKWISE_OFFSET) / DEGREES_PER_DIRECTION) mod NUMBER_OF_DIRECTIONS);

In this case, I think that the use of constants is making it harder to understand the code - with actual values instead:

Result := TCompassDirection(Round(inBearing / 22.5) mod 16);

Result := TCompassDirection(Ceil((inBearing - 11.25) / 22.5) mod 16);

Colins2 · 3 November 2021 09:48

In the method I outlined >=11.25° would be the lesser edge and <33.75° would be the upper edge.
I don’t see where the rounding comes into it, why not just compare the double bearing value with the const boundary values?
11.2499999999° would be N and 11.25° would be NNE.
The program appears to be rounding a double value of between 0.000 and 359.999 down to 1 of 16 values, so it is only going to be an approximate direction.
I don’t know the program but a GPS direction output of anywhere between the above figures is going to say “You are travelling in a NNE’ly direction” or point a (graphical) compass needle there.
It just seemed to me that the code was overly complicated to achieve this.

Mark_Griffiths · 3 November 2021 10:15

Sorry, I think that I’ve double confused myself and the Trunc method would match that.

One of the requested criteria was for the code to execute as quickly as possible. I’d expect that doing the calculation would be faster than doing up to 16 comparisons.

I’d expect that a lookup table would be even quicker, but I’d have to test it to be sure. A lookup should be possible on the following value without the need for 2 tables or anything special:

Trunc(inBearing * 2 + 0.5);  // Multiply by 2 and then round normally

The reason why I think this should be quicker is that multiplication is normally significantly faster than division.

Colins2 · 3 November 2021 14:16

@Mark_Griffiths @Paul_McGee @Lachlan
Paul, I couldn’t quite understand your times, what was the average time per enumeration?

Mark,
I wrote a small c++ program and created 1000 random doubles in an array,
I also created an empty string array of 1000,
I did no truncation or rounding and simply ran the array through a if/else if loop using 16 values.
The first run added the direction string to the array during the enumeration.
That took 56 milliseconds for the 1000 samples.
Then I ran the same array through the if/elseif loop but outputting the string just to the console and that took 878 milliseconds for the 1000 samples so just 0.8 milliseconds per record.
Lachlan,
How fast does it need to be?

Paul_McGee · 3 November 2021 14:48

Hi @Colins2 . The (140ms - 250 ms) range was total for ( 10,000 x 1000 = 10 million ) operations.

(On my laptop with 4 cores and 396 Chrome tabs open)

Paul_McGee · 3 November 2021 17:36

This seems right to me. I made a C++Builder version of the Delphi code and I think it’s interesting.
[NB I haven’t spent much time yet closely verifying any of the output produced]

First observation, c++ runs faster. (10 million operations)

     O[i] := CompassDirnByBinary ( data[i] );          // delphi = 140 ms   // c++ = 100 ms 
     O[i] := CompassDirectionOf2 ( data[i] );          // delphi = 170 ms   // c++ = 100 ms
     O[i] := CompassDirectionOf3 ( data[i] );          // delphi = 180 ms   // c++ =  40 ms

Second, in c++ there was no performance disadvantage to a normal loop like this.

  for var cd := cdNorth to cdNNW do                                                  
	  if I < CompPoints[cd] then begin
		Result := cd;
		Exit;
	  end;

(otoh - c++ downside - they don’t entirely gel with enum-indexed arrays.)

Last point. I took @Scott_Sedgwick 's calculation approach, but divided by 32, and used an appropriate array.

const
  halfpoints  : array [ 0..31 ] of TCompassDirection
					= (           cdNorth, cdNNE, cdNNE, cdNE, cdNE, cdENE, cdENE,
					     cdEast,  cdEast,  cdESE, cdESE, cdSE, cdSE, cdSSE, cdSSE,
                         cdSouth, cdSouth, cdSSW, cdSSW, cdSW, cdSW, cdWSW, cdWSW,
					     cdWest,  cdWest,  cdWNW, cdWNW, cdNW, cdNW, cdNNW, cdNNW, cdNorth );

In Delphi it slightly decreased the performance.
In C++ … it vastly increased performance.

From looking at the binary structure of the doubles, I also don’t think there should be any need for rounding when the division is by 16 or 32. I don’t think there’s any data being lost there.

My starting point, initial idea had been to try to get to integer operations instead of floating point ones.
But it looks like floating point operations can run pretty bloody fast, and you just have to try to get out of the cpu’s way as much as possible.

EDIT … I had to think about why there was a difference between Scott’s div 16 function and the div 32.
There isn’t any reason for them to be different … and it comes down to a cast to an enum vs an index into an array.
Interesting that the second is much faster in c++ than the first. And in Delphi they are about the same.
Changing the return expression yields the same 40 ms result with Scott’s function in c++.

const TCompassDirection points [ 16 ]
					  = { cdNorth, cdNNE, cdNE, cdENE,
						  cdEast,  cdESE, cdSE, cdSSE,
						  cdSouth, cdSSW, cdSW, cdWSW,
						  cdWest,  cdWNW, cdNW, cdNNW  };

AaronDC · 4 November 2021 01:18

From looking at the binary structure of the doubles, I also don’t think there should be any need for rounding when the division is by 16 or 32. I don’t think there’s any data being lost there.

Does the compiler make division by 16/32 logical shift rights (4/5) by default?

Paul_McGee · 4 November 2021 01:27

It must work a little different with doubles because there is an order of magnitude part to the structure.

Paul_McGee · 4 November 2021 01:31

I was thinking about why Delphi doesn’t show the same performance increase as the c++ with arrays. My guess is going to be bounds checking. I haven’t tried Delphi with bc turned off.

EDIT - well, yes I have. It is the default apparently. So, not range checking then.

I deliberately generated doubles up to 365.0 to also see what effect error-checking has - but I haven’t done anything with that (on purpose, anyway) yet.

AaronDC · 4 November 2021 01:48

Oh duh of course - IEEE 754 vs integer. My bad.

AaronDC · 4 November 2021 01:48

IEEE 754 or something

Paul_McGee · 4 November 2021 02:20

I only had that in my head from looking at the binary layout and trying to see some patterns.
The Wikipedia article is pretty nice: Double-precision floating-point format - Wikipedia

Colins2 · 4 November 2021 07:16

@Paul_McGee
Hi Paul,
Have you put the c++ code up anywhere? I’m struggling to fully understand the Delphi code and what it achieves. What does it output and where?
I’m still trying to understand the necessity of all these routines, unfortunately we don’t know the origin of the compass bearing, nor the destination of the result.
As far as I can tell - and I may very well be wrong - all these methods still end up enumerating a value and getting a direction, cdESE for example. is that then enumerated to 101.25°?
If the routine received an input bearing of (say) 105.3° what do the function(s) return “ESE” or 101.25?

Paul_McGee · 4 November 2021 08:32

@Colins2 … GitHub - pmcgee69/double-precision-to-enum-perf-test

The functions just return the enum (compass heading).

Colins2 · 4 November 2021 11:13

@Paul_McGee
Thanks Paul, I will have a play with it in C++ Builder.

Roger_Plant · 5 November 2021 04:47

I tried something, but it was slow

I tried making the function parameter a single.
That sped things up a lot in 32 bit windows version.
I am doubtful any error from using a single would be important, given they are being rounded
by a lot by definition, and assuming the measurements are actually real measurements.

NickR · 5 November 2021 04:59

I quite like Rudy’s Rudy's Delphi Corner - Floating point numbers - Sand or dirt

lexedmonds · 5 November 2021 05:29

Does using the Currency type (4 decimal places) improve things?

Lex

Paul_McGee · 6 November 2021 22:09

Another option … that I don’t know how to write yet … but going on from Erik van Bilsen on SIMD instructions, you could eg make 16 double precision comparisons in one 128 bit vector instruction … plus one instruction to see which one worked …

General idea here : SIMDization of switch statements

Scott_Sedgwick · 6 November 2021 22:36

I’ve not kept up with the more recent refinements of Delphi compiler design, but I think the problem with a case statement, or a bunch of comparisons, is that each comparison is a processor operation - and a case statement is, effectively, a long, nested if-then statement.

So, if there are 16 cardinal directions, then a case statement will (on average) process 8 comparisons before it finds a match and exits. The maximum number of operations is 15, and the minimum is 1.

On the other hand, an arithmetic operation is also one processor operation, so the divide and cast method does one addition, one rounding and one div (the cast is a free operation from the compilers perspective). So 3 operations (every time).

If you have studied your computational complexity, then comparisons (at their most efficient) have complexity:
Ωn = 15
On = 8
on = 1

Whereas divide and cast has:
Ωn = On = on = 3

The On is the value you should really worry about.