• is double slower?

    From fir@fir@grunge.pl to comp.lang.c on Mon Nov 4 08:53:00 2024
    From Newsgroup: comp.lang.c

    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?
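
    A minimal sketch for checking this on a given machine: the same scalar
    arithmetic timed once in float and once in double. The loop body, the
    iteration count and the file name are arbitrary, and the numbers are only
    indicative; they vary with CPU, compiler and flags (build with e.g.
    "gcc -O2 bench.c").

    #include <stdio.h>
    #include <time.h>

    #define N 100000000L

    static float run_float(void)
    {
        float acc = 0.0f;
        for (long i = 0; i < N; i++)
            acc += (float)i * 0.5f + 1.25f;   /* float-only arithmetic */
        return acc;
    }

    static double run_double(void)
    {
        double acc = 0.0;
        for (long i = 0; i < N; i++)
            acc += (double)i * 0.5 + 1.25;    /* the same work in double */
        return acc;
    }

    int main(void)
    {
        clock_t t0 = clock();
        volatile float f = run_float();       /* volatile: keep the loops */
        clock_t t1 = clock();
        volatile double d = run_double();
        clock_t t2 = clock();
        (void)f; (void)d;
        printf("float : %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("double: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }
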
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Nov 4 01:27:01 2024
    From Newsgroup: comp.lang.c

    On 11/3/2024 11:53 PM, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Ask the GPU.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to comp.lang.c on Mon Nov 4 15:43:30 2024
    From Newsgroup: comp.lang.c

    Chris M. Thomasson wrote:
    On 11/3/2024 11:53 PM, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Ask the GPU.

    why? as to the CPU I'm not sure, since on older CPUs the calculations were
    done in double-precision hardware anyway (?, I'm not so sure), even if you
    passed a float to a function; I'm not sure if at the assembly level you
    didn't in fact pass a double

    then after SSE, AFAIR, you get scalar code for both floats and doubles,
    but I simply realized I don't know whether double calculations on local
    variables (not arrays) are in fact notably slower


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Mon Nov 4 16:54:21 2024
    From Newsgroup: comp.lang.c

    Am 04.11.2024 um 08:53 schrieb fir:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Look at the instruction tables at agner.org.

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to comp.lang.c on Mon Nov 4 19:23:21 2024
    From Newsgroup: comp.lang.c

    fir wrote:
    Chris M. Thomasson wrote:
    On 11/3/2024 11:53 PM, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Ask the GPU.

    why? as to the CPU I'm not sure, since on older CPUs the calculations were
    done in double-precision hardware anyway (?, I'm not so sure), even if you
    passed a float to a function; I'm not sure if at the assembly level you
    didn't in fact pass a double

    then after SSE, AFAIR, you get scalar code for both floats and doubles,
    but I simply realized I don't know whether double calculations on local
    variables (not arrays) are in fact notably slower


    I'm writing some CPU-intensive experiment (something like alpha-blending
    images on the CPU, mostly) and interestingly I just turned float into
    double in that routine and it sped up (as far as I can see, as I don't
    have time for much testing; changing to double turned 35 ms per frame
    into 34 ms per frame)


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Nov 4 12:48:20 2024
    From Newsgroup: comp.lang.c

    On 11/4/2024 10:23 AM, fir wrote:
    fir wrote:
    Chris M. Thomasson wrote:
    On 11/3/2024 11:53 PM, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Ask the GPU.

    why? as to the CPU I'm not sure, since on older CPUs the calculations were
    done in double-precision hardware anyway (?, I'm not so sure), even if you
    passed a float to a function; I'm not sure if at the assembly level you
    didn't in fact pass a double

    then after SSE, AFAIR, you get scalar code for both floats and doubles,
    but I simply realized I don't know whether double calculations on local
    variables (not arrays) are in fact notably slower


    I'm writing some CPU-intensive experiment (something like alpha-blending
    images on the CPU, mostly) and interestingly I just turned float into
    double in that routine and it sped up (as far as I can see, as I don't
    have time for much testing; changing to double turned 35 ms per frame
    into 34 ms per frame)



    Well, do you need double precision anyway?
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Nov 4 12:53:30 2024
    From Newsgroup: comp.lang.c

    On 11/4/2024 6:43 AM, fir wrote:
    Chris M. Thomasson wrote:
    On 11/3/2024 11:53 PM, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    Ask the GPU.

    why? as to the CPU I'm not sure, since on older CPUs the calculations were
    done in double-precision hardware anyway (?, I'm not so sure), even if you
    passed a float to a function; I'm not sure if at the assembly level you
    didn't in fact pass a double

    In the realm of shaders, float is the way; double is not always available.



    then after SSE, AFAIR, you get scalar code for both floats and doubles,
    but I simply realized I don't know whether double calculations on local
    variables (not arrays) are in fact notably slower



    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Nov 5 09:14:15 2024
    From Newsgroup: comp.lang.c

    On 04/11/2024 08:53, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)


    Certainly if you have a lot of them, then the memory bandwidth and cache
    hit rate can make floats faster than doubles.

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    I assume that for the calculations in question, the accuracy and range
    of float is enough - otherwise the answer is obviously to use doubles.


    This is going to depend on the cpu, the type of instructions, the source
    code in question, the compiler and the options. So there is no single
    easy answer.

    You can, as Bonita suggested, look up instruction timing information at agner.org for the cpu you are using (assuming it's an x86 device) to get
    some idea of any fundamental differences in timings. Usually for modern
    "big" processors, basic operations such as addition and multiplication
    are single cycle or faster (i.e., multiple instructions can be done in parallel) for float and double. But division, square root, and other
    more complex operations can take a lot longer with doubles.

    Next, consider if you can be using vector or SIMD operations. On some devices, you can do that with floats but not doubles - and even if you
    can use doubles, you can usually run floats at twice the rate.
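
    As a rough illustration of the width difference (the names here are just
    for the example): loops like the ones below can be auto-vectorised by the
    compiler, and with 256-bit AVX registers each vector instruction covers
    8 floats but only 4 doubles.

    #include <stddef.h>

    void scale_add_f(float *dst, const float *src, float k, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] += k * src[i];     /* up to 8 lanes per 256-bit operation */
    }

    void scale_add_d(double *dst, const double *src, double k, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            dst[i] += k * src[i];     /* only 4 lanes per 256-bit operation */
    }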


    In the source code, remember it is very easy to accidentally promote to
    double when writing in C. If you want to stick to floats, make sure you
    don't use double-precision constants - a missing "f" suffix can change a
    whole expression into double calculations. Remember that it takes time
    to convert between float and double.


    Then look at your compiler flags - these can make a big difference to
    the speed of floating point code. I'm giving gcc flags, because those
    are the ones I know - if you are using another compiler, look at the
    details of its flags.

    Obviously you want optimisation enabled if speed is relevant - -O2 is a
    good start. Make sure you are optimising for the cpu(s) you are using - "-march=native" is good for local programs, but you will want something
    more specific if the binary needs to run on a variety of machines. The
    closer you are to the exact cpu model, the better the code scheduling
    and instruction choice can be.

    Look closely at "-ffast-math" in the gcc manual. If that is suitable
    for your code (and it often is), it can make a huge difference to
    floating point intensive code. If it is unsuitable because you have infinities, or need deterministic control of things like associativity,
    it will make your results wrong.

    "-Wdouble-promotion" can be helpful to spot accidental use of doubles in
    what you think is a float expression. "-Wfloat-equal" is a good idea, especially if you are mixing floats and doubles. "-Wfloat-conversion"
    will warn about implicit conversions from doubles to floats (or to
    integers).
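
    For example (a small sketch; the function and file names are made up), a
    single missing "f" suffix is enough to pull a float expression into
    double, and the warning flags above will point it out when compiling with
    something like "gcc -O2 -Wdouble-promotion -Wfloat-conversion -c scale.c":

    float half_brightness(float x)
    {
        return x * 0.5;    /* 0.5 is a double constant: x is promoted to
                              double, multiplied in double, then converted
                              back to float for the return - both warnings
                              trigger here */
    }

    float half_brightness_f(float x)
    {
        return x * 0.5f;   /* 0.5f keeps the whole expression in float */
    }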



    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Tue Nov 5 10:49:18 2024
    From Newsgroup: comp.lang.c

    David Brown wrote:
    On 04/11/2024 08:53, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)


    Certainly if you have a lot of them, then the memory bandwidth and cache
    hit rate can make floats faster than doubles.

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    I assume that for the calculations in question, the accuracy and range
    of float is enough - otherwise the answer is obviously use doubles.


    This is going to depend on the cpu, the type of instructions, the source
    code in question, the compiler and the options. So there is no single
    easy answer.

    You can, as Bonita suggested, look up instruction timing information at agner.org for the cpu you are using (assuming it's an x86 device) to get
    some idea of any fundamental differences in timings. Usually for modern "big" processors, basic operations such as addition and multiplication
    are single cycle or faster (i.e., multiple instructions can be done in parallel) for float and double. But division, square root, and other
    more complex operations can take a lot longer with doubles.

    Next, consider if you can be using vector or SIMD operations. On some devices, you can do that with floats but not doubles - and even if you
    can use doubles, you can usually run floats at twice the rate.


    In the source code, remember it is very easy to accidentally promote to double when writing in C. If you want to stick to floats, make sure you don't use double-precision constants - a missing "f" suffix can change a whole expression into double calculations. Remember that it takes time
    to convert between float and double.


    Then look at your compiler flags - these can make a big difference to
    the speed of floating point code. I'm giving gcc flags, because those
    are the ones I know - if you are using another compiler, look at the
    details of its flags.

    Obviously you want optimisation enabled if speed is relevant - -O2 is a
    good start. Make sure you are optimising for the cpu(s) you are using - "-march=native" is good for local programs, but you will want something
    more specific if the binary needs to run on a variety of machines. The closer you are to the exact cpu model, the better the code scheduling
    and instruction choice can be.

    Look closely at "-ffast-math" in the gcc manual. If that is suitable
    for your code (and it often is), it can make a huge difference to
    floating point intensive code. If it is unsuitable because you have infinities, or need deterministic control of things like associativity,
    it will make your results wrong.

    "-Wdouble-promotion" can be helpful to spot accidental use of doubles in
    what you think is a float expression. "-Wfloat-equal" is a good idea, especially if you are mixing floats and doubles. "-Wfloat-conversion"
    will warn about implicit conversions from doubles to floats (or to
    integers).



    the code that seems to have sped up a bit when turning float to double is:

    union Color
    {
        unsigned u;
        struct { unsigned char b,g,r,a; };
    };


    inline float distance2d_(float x1, float y1, float x2, float y2)
    {
        return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
    }

    inline unsigned GetPixelUnsafe_(int x, int y)
    {
        return frame_bitmap[y*frame_size_x+x];
    }
    inline void SetPixelUnsafe_(int x, int y, unsigned color)
    {
        frame_bitmap[y*frame_size_x+x]=color;
    }

    void DrawPoint(int i)
    {
        // if(!point[i].enabled) return;

        int xq = point[i].x;
        int yq = point[i].y;

        Color c;
        Color bc;

        if(d_toggler)
        {
            // DrawCircle(xq,yq,point[i].radius,0xffffff);
            FillCircle(xq,yq,point[i].radius,point[i].c.u);

            return;
        }

        float R = point[i].radius*5;

        int y_start = max(0, yq-R);
        int y_end   = min(frame_size_y, yq+R);
        int x_start = max(0, xq-R);
        int x_end   = min(frame_size_x, xq+R);

        for(int y = y_start; y<y_end; y++)
        {
            for(int x = x_start; x<x_end; x++)
            {
                // here below was float ->
                double p = (R - distance2d_(x,y,point[i].x,point[i].y));

                if(!i_toggler)
                {
                    if(p<0.4*R) continue;
                }
                else
                    if(p<0) continue;

                p/=R;

                bc.u = GetPixelUnsafe_(x,y);
                int r = bc.r + (point[i].c.r)* p*p*p;
                int g = bc.g + (point[i].c.g)* p*p*p;
                int b = bc.b + (point[i].c.b)* p*p*p;

                if(!r_toggler)
                {
                    if(r>255) r = 255;
                    if(g>255) g = 255;
                    if(b>255) b = 255;
                }

                c.r = r;
                c.g = g;
                c.b = b;

                SetPixelUnsafe_(x,y,c.u);
            }
        }
    }

    this just draws something like a little light that darkens as 1/(r*r*r)
    and is able to add n lights in one place to mix colors and eventually
    "overlight" (so this is kind of blending)

    it's very time consuming: like drawing 100 of them (when r is 9) was
    taking 35 ms on an old machine, AFAIR
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Tue Nov 5 10:51:19 2024
    From Newsgroup: comp.lang.c

    fir wrote:
    this just draws something like a little light that darkens as 1/(r*r*r)
    and is able to add n lights in one place to mix colors and eventually
    "overlight" (so this is kind of blending)

    it's very time consuming: like drawing 100 of them (when r is 9) was
    taking 35 ms on an old machine, AFAIR


    right now I can't test it (must work on some other problems), but my
    previous belief that float is never slower is possibly being put into
    question


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Tue Nov 5 11:03:10 2024
    From Newsgroup: comp.lang.c

    fir wrote:
    [snip: previous post quoted in full]

    someone can test it BTW


    https://drive.google.com/file/d/1-Obb6F19h5yfCbCETP4-VFoV3XYGpRsN/view?usp=sharing

    it's for Windows but works under Wine AFAIR, and on a Linux virtual
    machine on Windows also (AFAIR; I don't know, as I've got only Windows)
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Nov 5 11:25:15 2024
    From Newsgroup: comp.lang.c

    On 05/11/2024 10:49, fir wrote:
    David Brown wrote:
    On 04/11/2024 08:53, fir wrote:
    float takes less space, and when you keep arrays of floats, float is for
    sure better (less space and less memory bandwidth used, so I guess floats
    can be twice as fast in some respects)


    Certainly if you have a lot of them, then the memory bandwidth and cache
    hit rate can make floats faster than doubles.

    but when you do calculations on local variables, not arrays of floats, is
    double slower?

    I assume that for the calculations in question, the accuracy and range
    of float is enough - otherwise the answer is obviously use doubles.


    This is going to depend on the cpu, the type of instructions, the source
    code in question, the compiler and the options.  So there is no single
    easy answer.

    You can, as Bonita suggested, look up instruction timing information at
    agner.org for the cpu you are using (assuming it's an x86 device) to get
    some idea of any fundamental differences in timings.  Usually for modern
    "big" processors, basic operations such as addition and multiplication
    are single cycle or faster (i.e., multiple instructions can be done in
    parallel) for float and double.  But division, square root, and other
    more complex operations can take a lot longer with doubles.

    Next, consider if you can be using vector or SIMD operations.  On some
    devices, you can do that with floats but not doubles - and even if you
    can use doubles, you can usually run floats at twice the rate.


    In the source code, remember it is very easy to accidentally promote to
    double when writing in C.  If you want to stick to floats, make sure you
    don't use double-precision constants - a missing "f" suffix can change a
    whole expression into double calculations.  Remember that it takes time
    to convert between float and double.


    Then look at your compiler flags - these can make a big difference to
    the speed of floating point code.  I'm giving gcc flags, because those
    are the ones I know - if you are using another compiler, look at the
    details of its flags.

    Obviously you want optimisation enabled if speed is relevant - -O2 is a
    good start.  Make sure you are optimising for the cpu(s) you are using -
    "-march=native" is good for local programs, but you will want something
    more specific if the binary needs to run on a variety of machines.  The
    closer you are to the exact cpu model, the better the code scheduling
    and instruction choice can be.

    Look closely at "-ffast-math" in the gcc manual.  If that is suitable
    for your code (and it often is), it can make a huge difference to
    floating point intensive code.  If it is unsuitable because you have
    infinities, or need deterministic control of things like associativity,
    it will make your results wrong.

    "-Wdouble-promotion" can be helpful to spot accidental use of doubles in
    what you think is a float expression.  "-Wfloat-equal" is a good idea,
    especially if you are mixing floats and doubles.  "-Wfloat-conversion"
    will warn about implicit conversions from doubles to floats (or to
    integers).



    the code that seems to have sped up a bit when turning float to double is:


    I've tried to snip down to the bits that are important here.

    inline float distance2d_(float x1, float y1, float x2, float y2)
    {
        return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
    }


    What happens here depends on what #include files you use. If you have #include <math.h>, then "sqrt" is defined with doubles. So the
    sum-of-squares expression is calculated using floats. Then this sum is converted to a double (taking an extra instruction or two) before
    calling double-precision sqrt. Then it is converting that result back
    to float to return it.

    If you have "#include <tgmath.h>", then "sqrt" here will be done as
    float sqrtf, rather than double. But the library version of sqrtf()
    might actually call sqrt (double). If you want to be sure, be explicit
    with sqrtf().
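
    For instance, a float-only variant of the posted helper (a sketch; the
    name distance2d_f is made up, and it assumes the callers also stay in
    float) avoids the float<->double round trips around the call:

    #include <math.h>

    static inline float distance2d_f(float x1, float y1, float x2, float y2)
    {
        float dx = x2 - x1;
        float dy = y2 - y1;
        return sqrtf(dx * dx + dy * dy);   /* sqrtf: float in, float out */
    }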

    And on many platforms, sqrt (float or double) uses a library function
    for full IEEE compatibility. With "-ffast-math", you are telling the
    compiler you promise that the operand for "sqrt" will be "nice", and it
    can use a single hardware sqrt instruction. This will likely be a lot
    faster, especially if the float version is used. (Disclaimer - I
    haven't looked at this on modern x86 targets. Check yourself - I
    recommend putting your code into godbolt.org and examining the assembly.)


    In the code that uses this function, you are starting with integer types
    that need to be converted to float to pass to the distance function, and
    the result of the call is used in a float expression before being
    converted to double.

    In short, it is a complete mess of conversions. And unless you are
    using something like gcc's "-ffast-math" to say "don't worry about the
    minor details of IEEE, optimise akin to integer arithmetic", then the
    compiler has to generate all these back-and-forth conversions.


    Being consistent in your types is going to improve things, whether you
    use floats or doubles. You might even be better off using integer
    arithmetic in some points.


    // here below was float ->
    double p = (R - distance2d_(x,y,point[i].x,point[i].y));


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Tue Nov 5 11:42:38 2024
    From Newsgroup: comp.lang.c

    David Brown wrote:
    On 05/11/2024 10:49, fir wrote:
    [snip: earlier posts quoted in full]

    the code that seems to have sped up a bit when turning float to double is:


    I've tried to snip the bits that are important here.

    inline float distance2d_(float x1, float y1, float x2, float y2)
    {
        return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
    }


    What happens here depends on what #include files you use. If you have #include <math.h>, then "sqrt" is defined with doubles. So the sum-of-squares expression is calculated using floats. Then this sum is converted to a double (taking an extra instruction or two) before
    calling double-precision sqrt. Then it is converting that result back
    to float to return it.

    If you have "#include <tgmath.h>", then "sqrt" here will be done as
    float sqrtf, rather than double. But the library version of sqrtf()
    might actually call sqrt (double). If you want to be sure, be explicit
    with sqrtf().

    And on many platforms, sqrt (float or double) uses a library function
    for full IEEE compatibility. With "-ffast-math", you are telling the compiler you promise that the operand for "sqrt" will be "nice", and it
    can use a single hardware sqrt instruction. This will likely be a lot faster, especially if the float version is used. (Disclaimer - I
    haven't looked at this on modern x86 targets. Check yourself - I
    recommend putting your code into godbolt.org and examining the assembly.)


    In the code that uses this function, you are starting with integer types
    that need to be converted to float to pass to the distance function, and
    the result of the call is used in a float expression before being
    converted to double.

    In short, it is a complete mess of conversions. And unless you are
    using something like gcc's "-ffast-math" to say "don't worry about the
    minor details of IEEE, optimise akin to integer arithmetic", then the compiler has to generate all these back-and-forth conversions.


    Being consistent in your types is going to improve things, whether you
    use floats or doubles. You might even be better off using integer
    arithmetic in some points.


    // here below was float ->
    double p = (R - distance2d_(x,y,point[i].x,point[i].y));


    well, that's interesting.. especially, I was unaware of this sqrtf; I
    will look at it a bit later

    as to -ffast-math, I didn't notice a difference, though I wasn't testing
    it beyond a quick look.. I used it years back but later disabled it, as I
    got some bug in one piece of code which AFAIR was caused by it (I'm not
    sure though; today I rarely code at all, so I'm not too fresh on various
    tests)

    in fact I could optimise it harder just by building a table with that
    fading circle of size 45x45 and doing a lookup there (back then I was
    doing a big dose of this level of optimisation, but after all I know it
    is something to do at the final stage of an app, as it generally makes it
    harder to work on it live and test various changes, but as a final stage
    it's generally worth it if something runs 30-50% faster)
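
    A rough sketch of that lookup-table idea (untested, and the exact table
    size and layout are just a guess based on the posted code): precompute
    the cubic falloff once per radius, so the per-pixel sqrt and divisions
    move out of the inner loop.

    #include <math.h>
    #include <stdlib.h>

    /* Build a (2*R+1) x (2*R+1) table of the p*p*p falloff used in DrawPoint. */
    float *build_falloff_table(int R)
    {
        int size = 2 * R + 1;
        float *t = malloc((size_t)size * size * sizeof *t);
        if (!t) return NULL;

        for (int dy = -R; dy <= R; dy++)
            for (int dx = -R; dx <= R; dx++) {
                float p = R - sqrtf((float)(dx * dx + dy * dy));
                if (p < 0.0f) p = 0.0f;        /* outside the light radius */
                p /= R;
                t[(dy + R) * size + (dx + R)] = p * p * p;
            }
        return t;
    }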

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Tue Nov 5 14:42:22 2024
    From Newsgroup: comp.lang.c

    fir wrote:
    [snip: earlier posts quoted in full]


    well, that's interesting.. especially, I was unaware of this sqrtf; I
    will look at it a bit later

    as to -ffast-math, I didn't notice a difference, though I wasn't testing
    it beyond a quick look.. I used it years back but later disabled it, as I
    got some bug in one piece of code which AFAIR was caused by it (I'm not
    sure though; today I rarely code at all, so I'm not too fresh on various
    tests)

    in fact I could optimise it harder just by building a table with that
    fading circle of size 45x45 and doing a lookup there (back then I was
    doing a big dose of this level of optimisation, but after all I know it
    is something to do at the final stage of an app, as it generally makes it
    harder to work on it live and test various changes, but as a final stage
    it's generally worth it if something runs 30-50% faster)


    ok, I tested it, though not extensively (it depends on how many of those
    lights overlap each other etc.)

    the version I had was 17 ms; both full doubles and full floats (with
    sqrtf) are 15 ms, so it's a notable speedup; this mixing seems to have
    been wrong
    as to whether float is faster than double, it's 15 ms for both, though
    sometimes the float version blinks to 16 and the double sometimes blinks
    to 14.. so maybe double is slightly faster

    it probably depends on which machine it runs on, as those times really
    vary across different CPUs AFAIK
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From fir@fir@grunge.pl to David Brown on Wed Nov 6 20:20:41 2024
    From Newsgroup: comp.lang.c

    fir wrote:

    someone can test it BTW


    https://drive.google.com/file/d/1-Obb6F19h5yfCbCETP4-VFoV3XYGpRsN/view?usp=sharing


    it's for Windows but works under Wine AFAIR, and on a Linux virtual
    machine on Windows also (AFAIR; I don't know, as I've got only Windows)

    you may also see it on YouTube if you're afraid to run the app (though
    the app is much better)

    https://www.youtube.com/watch?v=7_Fodb7ivZY

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Wed Nov 6 12:55:42 2024
    From Newsgroup: comp.lang.c

    On 11/6/2024 11:20 AM, fir wrote:
    fir wrote:

    someone can test it BTW


    https://drive.google.com/file/d/1-Obb6F19h5yfCbCETP4-VFoV3XYGpRsN/view?usp=sharing


    it's for Windows but works under Wine AFAIR, and on a Linux virtual
    machine on Windows also (AFAIR; I don't know, as I've got only Windows)

    you may also see it on YouTube if you're afraid to run the app (though
    the app is much better)

    https://www.youtube.com/watch?v=7_Fodb7ivZY


    Pretty nice! :^)
    --- Synchronet 3.20a-Linux NewsLink 1.114