Using floats vs fixed point maths


#1

there have been a few interesting posts on this, and perhaps it deserves a thread of its own :slight_smile:


(More) Math help needed!
#2

@thetechnobear writes:

using floats and float functions (like powf) are expensive in terms of cpu cycles

Hmmm...

Cortex M4 clock cycles:

integer operations:

ADD - 1 clock cycle
SUB - 1 clock cycle
MUL - 1 clock cycle
SDIV - 2-12 clock cycles

Single precision float operations:

VADD.F32 - 1 clock cycle
VMUL.F32 - 1 clock cycle
VDIV.F32 - 14 clock cycles

Conclusion: single precision float operations on the cortex M4 are just as fast as integer operations. Division is costly for both ints and floats and should be avoided.
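
Those numbers are easy to sanity-check on the hardware itself. A minimal sketch, assuming a CMSIS environment (stm32f4xx.h and the DWT cycle counter; the function names here are mine, not from axoloti):

#include <stdint.h>
#include "stm32f4xx.h"  /* CMSIS device header; provides DWT/CoreDebug */

/* Use the Cortex-M4 DWT cycle counter to measure an operation. */
void cyccnt_init(void) {
  CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable trace/DWT */
  DWT->CYCCNT = 0;
  DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;             /* start cycle counter */
}

uint32_t time_float_mul(volatile float a, volatile float b) {
  uint32_t t0 = DWT->CYCCNT;
  volatile float c = a * b;  /* volatile stops the compiler folding this away */
  (void)c;
  return DWT->CYCCNT - t0;   /* raw count; includes some measurement overhead */
}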

I got the impression that axoloti's use of fixed point was more about the re-use of legacy code from prior platforms which didn't have FPU support - which is a pity (IMO) because DSP code written with floats is a lot easier to read.

As for powf - yeah, it's slow. I've benchmarked some of my lookup-table-based code as being about 10x faster, while staying within 5e-5 of the standard library result - which is good enough for most audio DSP.
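
For flavour, here is the shape of such a lookup-table approach - a sketch, not my actual benchmarked code. It approximates 2^x (the common audio case, since powf(a, b) can be rebuilt from exp2/log2) with a 256-entry table plus linear interpolation:

#include <math.h>

/* Table-based 2^x: split x into integer and fractional parts, look up
 * 2^frac with linear interpolation, apply the integer part as an
 * exponent shift. 256 entries keeps the worst-case interpolation
 * error in the ~1e-6 range, well within 5e-5. */
#define P2_BITS 8
#define P2_SIZE (1 << P2_BITS)

static float p2_tab[P2_SIZE + 1];

void pow2_init(void) {
  for (int i = 0; i <= P2_SIZE; i++)
    p2_tab[i] = powf(2.f, (float)i / P2_SIZE);  /* 2^(i/256), i = 0..256 */
}

float pow2_fast(float x) {
  float fl = floorf(x);
  int ip = (int)fl;               /* integer part of x */
  float t = (x - fl) * P2_SIZE;   /* fractional part, scaled to the table */
  int i = (int)t;
  float f = t - (float)i;
  float m = p2_tab[i] + f * (p2_tab[i + 1] - p2_tab[i]);  /* lerp */
  return ldexpf(m, ip);           /* multiply by 2^ip; on a known-IEEE
                                     target this could be a direct
                                     exponent-field tweak instead */
}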


#3

The only ‘justification’ and rebuttal I’ve seen is here

So not entirely clear, and certainly closer than I’d have thought - as long as you're careful.

And for sure, doing everything in float would make things much easier to read and write :slight_smile:


#4

Some nice features from the STM32F4:

  • The intrinsic conversions between float and fixed point (and back) take 1 cycle, so it is not a problem to use both q27 integers and floats in an object (see the sketch at the end of this post).
  • The FPU includes 32... yes, 32 float registers. These are not used for anything other than your math/DSP, while many of the integer registers are already reserved for specific purposes.
  • The float division - which takes 14 cycles - can be executed in parallel with integer instructions.

So, in some cases, it is a good idea to mix fixed point and floats!
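
For illustration, such a conversion might look like this - a sketch using GCC inline asm on an M4F target; the function names are mine, not axoloti's:

#include <stdint.h>
#include <string.h>

/* q27 <-> float via the M4's VCVT fixed-point conversions (the
 * single-cycle conversions mentioned above). VCVT with a #fbits
 * operand converts in place in one s-register, hence the "+t"
 * constraint and the bit-copy to reinterpret the register contents. */
static inline int32_t float_to_q27(float x) {
  __asm__ ("vcvt.s32.f32 %0, %0, #27" : "+t" (x));
  int32_t q;
  memcpy(&q, &x, sizeof q);  /* reinterpret: the register now holds an int */
  return q;
}

static inline float q27_to_float(int32_t q) {
  float x;
  memcpy(&x, &q, sizeof x);  /* place the integer bits into an s-register */
  __asm__ ("vcvt.f32.s32 %0, %0, #27" : "+t" (x));
  return x;
}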


#5

thanks @deadsy and @SmashedTransistors, some really interesting points.

this post was kind of related...

in particular:

yes this is a case in point...

so, as mentioned in the link above, Olivier from MI (pichenettes in the post above) stated that he uses float because he saw little advantage in fixed point maths, given the FPU present on the chips (and his experiments with converting the elements resonator reinforced this for him). I think we can all agree MI modules are very efficient in their use of the CPU - so this shows that, with proper use, floats can be efficient.(*)

so when we moved the MI code to axoloti, we weren't going to 'convert' the code to fixed point (that would be a complete re-write), so we wrapped it with conversion calls in/out (sketched below).
generally I'll say I've been happy with the performance of the MI objects... esp. bearing in mind clouds/elements are run on the same chip as axoloti.
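
In outline, the wrapping looks something like this (a sketch - process_float stands in for an MI-style routine, and all names here are illustrative, not the actual object code):

#include <stdint.h>

float process_float(float x);  /* the existing MI-style float DSP code */

/* Wrap-don't-rewrite: convert q27 -> float at the object boundary,
 * run the float code unchanged, convert back on the way out. */
static inline float q27_to_f(int32_t q) {
  return (float)q / (float)(1 << 27);   /* exact power of two: compiles
                                           to a multiply, not a divide */
}
static inline int32_t f_to_q27(float x) {
  return (int32_t)(x * (float)(1 << 27));
}

int32_t object_process(int32_t in_q27) {
  return f_to_q27(process_float(q27_to_f(in_q27)));
}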

so it does seem a valid conclusion that using floats can yield good performance - I think so.

however, it seems they do need to be used with the same care as int32.
you will see MI also uses tables rather than floating point functions for things like exp, and I'm sure he has many other optimisations.

so perhaps the take-away is: floats are not intrinsically 'bad', but be careful which operations/functions you use... it's very easy with floats to start calling std functions that are costly.


also, I think for clarity we need to remember to stay with float, and not use doubles, as these (I'm assuming) are 64-bit and so very expensive... aren't there times when floats get automatically coerced into doubles? do we need to take care to avoid this?

also float constants... I think we use the compiler options to assume floats, but we should really be explicit, e.g. use 24.0f rather than 24.0


(*) as a complete aside, I think the MI code also shows that, used 'intelligently', C++ can be used for audio code - you just have to know what to use, and what not to.


#6

I like to do an "objdump -d" on the produced binary, i.e. see what the compiler is actually doing rather than what I think it is doing. You can learn some interesting things:

float foo1(float x) {
  return x / 2.f;
}

float foo2(float x) {
  return x / 10.f;
}

is compiled to...

foo1:
 320:   eef6 7a00       vmov.f32        s15, #96        ; 0x3f000000  0.5
 324:   ee20 0a27       vmul.f32        s0, s0, s15
 328:   4770            bx      lr
 32a:   bf00            nop

foo2:
 32c:   eef2 7a04       vmov.f32        s15, #36        ; 0x41200000  10.0
 330:   ee80 0a27       vdiv.f32        s0, s0, s15
 334:   4770            bx      lr
 336:   bf00            nop

So in the first case /2 gets optimized to multiply by 0.5 - good.
In the second case /10 doesn't get optimized to multiply by 0.1 - why?

Answer: Because 0.1 does not have an exact binary floating-point representation. Multiplying by the rounded reciprocal would give slightly different results for some inputs, so the division has to stay to properly represent the will of the programmer. (GCC's -freciprocal-math, pulled in via -ffast-math, relaxes this at the cost of exactness.)

So if you thought that /k (constant k) would always get optimized to * (1/k) you'd be wrong - and you'd get undesired div operations in your code.
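
You can verify that on a host machine by walking consecutive float bit patterns and counting where the two forms disagree (a quick throwaway test, not axoloti code):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* For some inputs x, x / 10.f and x * 0.1f round to different floats -
 * which is exactly why the compiler can't substitute one for the other. */
int main(void) {
  uint32_t start = 0x3f800000u;  /* bit pattern of 1.0f */
  int mismatches = 0;
  for (uint32_t bits = start; bits < start + 1000000u; bits++) {
    float x;
    memcpy(&x, &bits, sizeof x);  /* step through consecutive floats */
    if (x / 10.f != x * 0.1f)
      mismatches++;
  }
  printf("%d mismatches in 1000000 consecutive floats\n", mismatches);
  return 0;
}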

On the point above about "use 24.0f rather than 24.0":

float foo3(float x) {
  return x * 0.1;
}

foo3:
 338:   b508            push    {r3, lr}
 33a:   ee10 0a10       vmov    r0, s0
 33e:   f7ff fffe       bl      0 <__aeabi_f2d>
 342:   a305            add     r3, pc, #20     ; (adr r3, 358 <foo3+0x20>)
 344:   e9d3 2300       ldrd    r2, r3, [r3]
 348:   f7ff fffe       bl      0 <__aeabi_dmul>
 34c:   f7ff fffe       bl      0 <__aeabi_d2f>
 350:   ee00 0a10       vmov    s0, r0
 354:   bd08            pop     {r3, pc}
 356:   bf00            nop
 358:   9999999a        .word   0x9999999a
 35c:   3fb99999        .word   0x3fb99999

Whoah. Soft emulation... the 0.1 literal is a double, so the float gets promoted and the whole multiply runs through the software double-precision routines (__aeabi_f2d / __aeabi_dmul / __aeabi_d2f).

float foo4(float x) {
  return x * 0.1f;
}

foo4:
 360:   eddf 7a02       vldr    s15, [pc, #8]   ; 36c <foo4+0xc>
 364:   ee20 0a27       vmul.f32        s0, s0, s15
 368:   4770            bx      lr
 36a:   bf00            nop
 36c:   3dcccccd        .word   0x3dcccccd

That's better....


#7

I see Johannes also did some testing on floats in his jt folder: library>community>jt>devel.

I don't know what to make of it though..... :=)


#8

Thus most of the time it's better to code

x = y * (1.0f / 3.0f);

than

x = y / 3.0f;
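
The compiler folds (1.0f / 3.0f) to a single constant at compile time, so the first form costs one 1-cycle vmul instead of a 14-cycle vdiv - you've made the rounding explicit yourself, which is the substitution the compiler wasn't allowed to make on its own in post #6. Where this really pays off is in per-sample loops; a minimal sketch, with illustrative names:

#include <stddef.h>

/* Hoist the reciprocal out of the loop: one division in total instead
 * of one per sample; each iteration is then a 1-cycle multiply. */
void scale_buffer(float *buf, size_t n, float divisor) {
  const float recip = 1.0f / divisor;  /* single vdiv, outside the loop */
  for (size_t i = 0; i < n; i++)
    buf[i] *= recip;                   /* vmul per sample */
}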