float takes less space, and when you keep arrays of floats, float is for
sure better (less space and less memory bandwidth used, so I guess floats
can be up to twice as fast in some respects) - but when you do
calculations on local variables (not arrays), is double slower?
On 11/3/2024 11:53 PM, fir wrote:
Ask the GPU.
Chris M. Thomasson wrote:
Ask the GPU.
Why? As to the CPU, I'm not sure: on older CPUs the calculations were
done in double-precision hardware anyway (?, I'm not so sure), and even
if you passed a float to a function, I'm not sure that at the assembly
level you weren't passing a double. Then after SSE, afair, you got scalar
code for both floats and doubles. But simply, I realized I don't know
whether double calculations on local variables (not arrays) are in fact
noticeably slower.
Chris M. Thomasson wrote:
Ask the GPU.

I'm writing some CPU-intensive experiment (something like alpha-blending
images on the CPU, mostly), and interestingly I just turned float into
double in that routine and it sped up. As far as I can see (I don't have
time for many tests), changing to double turned 35 ms per frame into
34 ms per frame.
On 04/11/2024 08:53, fir wrote:
float takes less space, and when you keep arrays of floats, float is for
sure better (less space and less memory bandwidth used, so I guess floats
can be up to twice as fast in some respects)
Certainly if you have a lot of them, then the memory bandwidth and cache
hit rate can make floats faster than doubles.
but when you do calculations on local variables (not arrays), is double
slower?
I assume that for the calculations in question, the accuracy and range
of float is enough - otherwise the answer is obviously to use doubles.
This is going to depend on the cpu, the type of instructions, the source
code in question, the compiler and the options. So there is no single
easy answer.
You can, as Bonita suggested, look up instruction timing information at agner.org for the cpu you are using (assuming it's an x86 device) to get
some idea of any fundamental differences in timings. Usually for modern "big" processors, basic operations such as addition and multiplication
are single cycle or faster (i.e., multiple instructions can be done in parallel) for float and double. But division, square root, and other
more complex operations can take a lot longer with doubles.
Next, consider if you can be using vector or SIMD operations. On some devices, you can do that with floats but not doubles - and even if you
can use doubles, you can usually run floats at twice the rate.
In the source code, remember it is very easy to accidentally promote to double when writing in C. If you want to stick to floats, make sure you don't use double-precision constants - a missing "f" suffix can change a whole expression into double calculations. Remember that it takes time
to convert between float and double.
Then look at your compiler flags - these can make a big difference to
the speed of floating point code. I'm giving gcc flags, because those
are the ones I know - if you are using another compiler, look at the
details of its flags.
Obviously you want optimisation enabled if speed is relevant - -O2 is a
good start. Make sure you are optimising for the cpu(s) you are using - "-march=native" is good for local programs, but you will want something
more specific if the binary needs to run on a variety of machines. The closer you are to the exact cpu model, the better the code scheduling
and instruction choice can be.
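Putting those flags together, a compile line for this kind of code might look like this (a sketch; "blend.c" is a made-up file name):

```shell
# Optimised build tuned for the local CPU, with float-related warnings on.
# Drop -march=native if the binary must run on other machines.
gcc -O2 -march=native \
    -Wdouble-promotion -Wfloat-conversion \
    blend.c -o blend -lm
```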
Look closely at "-ffast-math" in the gcc manual. If that is suitable
for your code (and it often is), it can make a huge difference to
floating point intensive code. If it is unsuitable because you have infinities, or need deterministic control of things like associativity,
it will make your results wrong.
"-Wdouble-promotion" can be helpful to spot accidental use of doubles in
what you think is a float expression. "-Wfloat-equal" is a good idea, especially if you are mixing floats and doubles. "-Wfloat-conversion"
will warn about implicit conversions from doubles to floats (or to
integers).
David Brown wrote:
The code that seems to have sped up a bit when turning float into double
is:
union Color
{
    unsigned u;
    struct { unsigned char b,g,r,a; };
};

inline float distance2d_(float x1, float y1, float x2, float y2)
{
    return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
}

inline unsigned GetPixelUnsafe_(int x, int y)
{
    return frame_bitmap[y*frame_size_x+x];
}

inline void SetPixelUnsafe_(int x, int y, unsigned color)
{
    frame_bitmap[y*frame_size_x+x]=color;
}

void DrawPoint(int i)
{
    // if(!point[i].enabled) return;

    int xq = point[i].x;
    int yq = point[i].y;

    Color c;
    Color bc;

    if(d_toggler)
    {
        // DrawCircle(xq,yq,point[i].radius,0xffffff);
        FillCircle(xq,yq,point[i].radius,point[i].c.u);
        return;
    }

    float R = point[i].radius*5;

    int y_start = max(0, yq-R);
    int y_end   = min(frame_size_y, yq+R);
    int x_start = max(0, xq-R);
    int x_end   = min(frame_size_x, xq+R);

    for(int y = y_start; y<y_end; y++)
    {
        for(int x = x_start; x<x_end; x++)
        {
            // here below this was float ->
            double p = (R - distance2d_(x,y,point[i].x,point[i].y));

            if(!i_toggler)
            {
                if(p<0.4*R) continue;
            }
            else
                if(p<0) continue;

            p/=R;

            bc.u = GetPixelUnsafe_(x,y);

            int r = bc.r + (point[i].c.r)* p*p*p;
            int g = bc.g + (point[i].c.g)* p*p*p;
            int b = bc.b + (point[i].c.b)* p*p*p;

            if(!r_toggler)
            {
                if(r>255) r = 255;
                if(g>255) g = 255;
                if(b>255) b = 255;
            }

            c.r = r;
            c.g = g;
            c.b = b;

            SetPixelUnsafe_(x,y,c.u);
        }
    }
}
This just draws something like a little light that darkens as 1/(r*r*r),
and is able to add n lights in place to mix colors and eventually
"overlight" (so this is kind of a blending). It's very time-consuming;
like, drawing 100 of them (when r is 9) was taking 35 ms on an old
machine, afair.
On 05/11/2024 10:49, fir wrote:
I've tried to snip this down to the bits that are important here.
inline float distance2d_(float x1, float y1, float x2, float y2)
{
    return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
}
What happens here depends on which #include files you use. If you have
#include <math.h>, then "sqrt" is defined with doubles. So the
sum-of-squares expression is calculated using floats, then that sum is
converted to a double (taking an extra instruction or two) before
calling double-precision sqrt, and then the result is converted back to
float to return it.
If you have "#include <tgmath.h>", then "sqrt" here will be done as
float sqrtf, rather than double. But the library version of sqrtf()
might actually call sqrt (double). If you want to be sure, be explicit
with sqrtf().
And on many platforms, sqrt (float or double) uses a library function
for full IEEE compatibility. With "-ffast-math", you are telling the compiler you promise that the operand for "sqrt" will be "nice", and it
can use a single hardware sqrt instruction. This will likely be a lot faster, especially if the float version is used. (Disclaimer - I
haven't looked at this on modern x86 targets. Check yourself - I
recommend putting your code into godbolt.org and examining the assembly.)
In the code that uses this function, you are starting with integer types
that need to be converted to float to pass to the distance function, and
the result of the call is used in a float expression before being
converted to double.
In short, it is a complete mess of conversions. And unless you are
using something like gcc's "-ffast-math" to say "don't worry about the
minor details of IEEE, optimise akin to integer arithmetic", then the compiler has to generate all these back-and-forth conversions.
Being consistent in your types is going to improve things, whether you
use floats or doubles. You might even be better off using integer
arithmetic in some places.
// here below this was float ->
double p = (R - distance2d_(x,y,point[i].x,point[i].y));
David Brown wrote:

Well, that's interesting.. especially, I was unaware of this sqrtf; I
will check it a bit later.
As to -ffast-math, I didn't notice a difference, though I wasn't testing
it beyond a simple look. I used it back years ago, but later I disabled
it, as I got some bug in one code which was, afair, caused by it (I'm
not sure though; today I rarely code at all, so I'm not too fresh on
various tests).
In fact I could optimise it harder just by building a table with that
fading circle of size 45x45 and doing a lookup there (back then I was
doing a big dose of this level of optimisation, but after all, I know
it's something to do at the final stage of an app, as it generally makes
it harder to work on the app live and test various changes; but as a
final stage it's generally worth it if something runs 30-50% faster).
Some can test it, BTW:
https://drive.google.com/file/d/1-Obb6F19h5yfCbCETP4-VFoV3XYGpRsN/view?usp=sharing
It's for Windows but works under Wine, afair, and in a Linux virtual
machine on Windows also (afair; I don't know, as I've only got Windows).
fir wrote:
You may also see it on YouTube if you're afraid to run the app (though
the app is much better):
https://www.youtube.com/watch?v=7_Fodb7ivZY