• signalling a condvar from inside vs. signalling a condvar from outside

    From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sat Apr 12 17:23:16 2025
    From Newsgroup: comp.lang.c++

    #include <atomic>
    #include <iostream>
    #include <thread>
    #include <mutex>
    #include <condition_variable>
    #include <sys/resource.h>

    using namespace std;

    int main()
    {
        // Per-thread mailbox: a flag guarded by a mutex, plus its condvar.
        struct synch_t
        {
            mutex mtx;
            bool signalled = false;
            condition_variable cv;
        };
        // Voluntary context switches, summed over both threads.
        atomic_long nVoluntary;
        // Ping-pong: wait for my own signal, then signal the other thread.
        auto thr = [&]( synch_t &me, synch_t &you, auto &&toYou )
        {
            for( size_t r = 10'000; r; --r )
            {
                unique_lock lockMe( me.mtx );
                me.cv.wait( lockMe, [&] { return me.signalled; } );
                me.signalled = false;
                lockMe.unlock();
                toYou( you );
            }
            // RUSAGE_THREAD is Linux-specific.
            rusage ru;
            getrusage( RUSAGE_THREAD, &ru );
            nVoluntary.fetch_add( ru.ru_nvcsw );
        };
        // Run the ping-pong with one signalling strategy and print the total.
        auto doIt = [&]( char const *what, auto &&toYou )
        {
            synch_t synchA { {}, true, {} }, synchB;
            nVoluntary.store( 0, memory_order_relaxed );
            jthread
                thrA( [&]() { thr( synchA, synchB, move( toYou ) ); } ),
                thrB( [&]() { thr( synchB, synchA, move( toYou ) ); } );
            thrA.join();
            thrB.join();
            cout << what << nVoluntary.load( memory_order_relaxed ) << endl;
        };
        // Variant 1: notify while still holding the mutex.
        doIt( "inside: ", []( synch_t &synch )
            {
                lock_guard lock( synch.mtx );
                synch.signalled = true;
                synch.cv.notify_one();
            } );
        // Variant 2: release the mutex first, then notify.
        doIt( "outside: ", []( synch_t &synch )
            {
                unique_lock lock( synch.mtx );
                synch.signalled = true;
                lock.unlock();
                synch.cv.notify_one();
            } );
    }

    In terms of the number of context switches, it makes almost no
    difference whether you signal a condvar from inside or outside the
    critical section. The above program, run on a Zen4 CPU with WSL2:

    inside: 20130
    outside: 19811

    On a 28 core Skylake-CPU with Ubuntu:

    inside: 19997
    outside: 19888

    --- Synchronet 3.20c-Linux NewsLink 1.2
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sat Apr 12 12:33:57 2025
    From Newsgroup: comp.lang.c++

    On 4/12/2025 8:23 AM, Bonita Montero wrote:
    [...]
    > It nearly doesn't matter in terms of numbers of context switches if
    > you signal a condvar from inside or outside. The above program run
    > on a Zen4-CPU with WSL2:
    >
    >     inside: 20130
    >     outside: 19811
    >
    > On a 28 core Skylake-CPU with Ubuntu:
    >
    >     inside: 19997
    >     outside: 19888

    There is a scalability problem wrt signalling inside the critical
    section. Does your condvar impl use wait morphing?
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Apr 13 17:38:19 2025
    From Newsgroup: comp.lang.c++

    On 12.04.2025 at 21:33, Chris M. Thomasson wrote:

    > There is a scalability problem wrt signalling inside the critical
    > section. Does your condvar impl use wait morphing?

    There's no scalability problem with that since the kernel call to
    release a thread happens only when the mutex is accessible *and*
    the cv is signalled.


  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Apr 13 12:32:45 2025
    From Newsgroup: comp.lang.c++

    On 4/13/2025 8:38 AM, Bonita Montero wrote:
    > On 12.04.2025 at 21:33, Chris M. Thomasson wrote:
    >
    >> There is a scalability problem wrt signalling inside the critical
    >> section. Does your condvar impl use wait morphing?
    >
    > There's no scalability problem with that since the kernel call to
    > release a thread happens only when the mutex is accessible *and*
    > the cv is signalled.

    No. When you signal a condvar while holding the lock it means that
    waiters can wake and just instantly wait on the lock. This is why wait
    morphing was created. It helps, but only so much...
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Sun Apr 13 22:08:15 2025
    From Newsgroup: comp.lang.c++

    On 13.04.2025 at 21:32, Chris M. Thomasson wrote:
    > On 4/13/2025 8:38 AM, Bonita Montero wrote:
    >> On 12.04.2025 at 21:33, Chris M. Thomasson wrote:
    >>
    >>> There is a scalability problem wrt signalling inside the critical
    >>> section. Does your condvar impl use wait morphing?
    >>
    >> There's no scalability problem with that since the kernel call to
    >> release a thread happens only when the mutex is accessible *and*
    >> the cv is signalled.
    >
    > No. ...

    The number of context switches my code shows says that there's only
    one context switch per wait.

    > When you signal a condvar while holding the lock it means that
    > waiters can wake and just instantly wait on the lock. This is why wait
    > morphing was created. It helps, but only so much...

    idiot.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Sun Apr 13 14:40:40 2025
    From Newsgroup: comp.lang.c++

    On 4/13/2025 1:08 PM, Bonita Montero wrote:
    > On 13.04.2025 at 21:32, Chris M. Thomasson wrote:
    >> On 4/13/2025 8:38 AM, Bonita Montero wrote:
    >>> On 12.04.2025 at 21:33, Chris M. Thomasson wrote:
    >>>
    >>>> There is a scalability problem wrt signalling inside the critical
    >>>> section. Does your condvar impl use wait morphing?
    >>>
    >>> There's no scalability problem with that since the kernel call to
    >>> release a thread happens only when the mutex is accessible *and*
    >>> the cv is signalled.
    >>
    >> No. ...
    >
    > The number of context switches my code shows says that there's only
    > one context switch per wait.

    Your code is hard to read. Sigh. Signalling while locked or unlocked was
    an old debate. Think of signalling while holding the lock. A thread gets
    woken and immediately sees that the lock is held. Oh well. Wait morphing
    can help with that. However, signal outside when you can...

    >> When you signal a condvar while holding the lock it means that
    >> waiters can wake and just instantly wait on the lock. This is why wait
    >> morphing was created. It helps, but only so much...
    >
    > idiot.

    Whatever.
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Tue Apr 15 11:41:06 2025
    From Newsgroup: comp.lang.c++

    On 13.04.2025 at 23:40, Chris M. Thomasson wrote:

    > Your code is hard to read. ...

    The code is beautiful.

    > Signalling while locked or unlocked was an old debate. Think of
    > signalling while holding the lock. A thread gets woken and immediately
    > sees that the lock is held. Oh well. Wait morphing can help with that.
    > However, signal outside when you can...

    The number of context switches determined via getrusage() is two per
    loop iteration, i.e. one switch to the thread and one switch from the
    thread; so everything is optimal with glibc.
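
    For reference, the program above sums ru_nvcsw (voluntary switches) over
    two threads that each perform 10,000 waits, so a total near 20,000 is
    what one blocking wait per iteration would produce. A minimal sketch of
    reading that counter for the calling thread follows. The helper names
    are hypothetical, and RUSAGE_THREAD is a Linux extension, so this is an
    illustration under those assumptions rather than a portable recipe.

```cpp
#include <sys/resource.h>  // getrusage, RUSAGE_THREAD (Linux-specific)

#include <chrono>
#include <thread>

// Voluntary context switches recorded so far for the calling thread.
// Returns -1 if getrusage() fails.
long voluntary_switches()
{
    rusage ru{};
    if (getrusage(RUSAGE_THREAD, &ru) != 0)
        return -1;
    return ru.ru_nvcsw;
}

// Hypothetical helper: block the calling thread n times with short sleeps
// and report how many voluntary switches were recorded in the meantime.
long switches_for_sleeps(int n)
{
    long before = voluntary_switches();
    for (int i = 0; i < n; ++i)
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    return voluntary_switches() - before;
}
```

    Calling switches_for_sleeps(100) from main() and printing the result
    gives a rough feel for how blocking calls map to the counter; the exact
    number depends on the kernel and scheduler.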

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Tue Apr 15 12:07:23 2025
    From Newsgroup: comp.lang.c++

    On 4/15/2025 2:41 AM, Bonita Montero wrote:
    > On 13.04.2025 at 23:40, Chris M. Thomasson wrote:
    >
    >> Your code is hard to read. ...
    >
    > The code is beautiful.

    I can understand it, but, well, shit happens. I would not say it's
    beautiful... But, that is just me. Again, shit happens.

    >> Signalling while locked or unlocked was an old debate. Think of
    >> signalling while holding the lock. A thread gets woken and immediately
    >> sees that the lock is held. Oh well. Wait morphing can help with that.
    >> However, signal outside when you can...
    >
    > The number of context switches determined via getrusage() is two per
    > loop iteration, i.e. one switch to the thread and one switch from the
    > thread; so everything is optimal with glibc.

    In real applications there is generally more going on in those critical
    sections vs your test... Well, I have seen some horror shows in my life.

    Again, think of a scenario where the lock is held. The thread signals...
    Another thread wakes up and has to instantly block on a wait morphing
    queue in the kernel. This is not "ideal". A signal outside of the mutex
    can be beneficial. Signalling outside can give a signaled thread a
    possible fast-path into the critical section, completely eliminating the
    need for kernel interaction. Now, an adaptive mutex can try to help with
    this via limited spinning... However, try to signal outside when you can.
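
    The "limited spinning" idea can be sketched in user code as a bounded
    number of try_lock() attempts before falling back to a blocking lock().
    This is only an illustration of the concept: the function name and the
    spin budget are hypothetical, and it is not a claim about how glibc or
    any other library implements its adaptive mutex.

```cpp
#include <mutex>
#include <thread>

// Try to take the mutex a bounded number of times before blocking.
// Returns true if the lock was acquired on the spin path, false if the
// function had to fall back to lock(). Either way the caller owns the
// mutex on return. `spins` is an arbitrary illustrative budget.
bool lock_with_limited_spin(std::mutex &m, int spins = 100)
{
    for (int i = 0; i < spins; ++i)
    {
        if (m.try_lock())
            return true;            // fast path: acquired without blocking
        std::this_thread::yield();  // give the current owner a chance to run
    }
    m.lock();                       // slow path: may block until released
    return false;
}
```

    A woken waiter that finds the mutex briefly held can get in through the
    spin path instead of blocking again, at the cost of some wasted CPU when
    the owner holds the lock for long.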

  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Wed Apr 16 06:12:20 2025
    From Newsgroup: comp.lang.c++

    On 15.04.2025 at 21:07, Chris M. Thomasson wrote:

    > In real applications there is generally more going on in those critical
    > sections vs your test... Well, I have seen some horror shows in my life.
    >
    > Again, think of a scenario where the lock is held. The thread signals...
    > Another thread wakes up and has to instantly block on a wait morphing
    > queue in the kernel.

    Not with glibc.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Wed Apr 16 15:59:09 2025
    From Newsgroup: comp.lang.c++

    On 4/15/2025 9:12 PM, Bonita Montero wrote:
    > On 15.04.2025 at 21:07, Chris M. Thomasson wrote:
    >
    >> In real applications there is generally more going on in those
    >> critical sections vs your test... Well, I have seen some horror shows
    >> in my life.
    >>
    >> Again, think of a scenario where the lock is held. The thread
    >> signals... Another thread wakes up and has to instantly block on a
    >> wait morphing queue in the kernel.
    >
    > Not with glibc.

    Sigh. I would need to see how glibc works internally. But that is
    beside the point. Try to signal/broadcast outside when possible.
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Thu Apr 17 05:58:31 2025
    From Newsgroup: comp.lang.c++

    On 17.04.2025 at 00:59, Chris M. Thomasson wrote:

    > Sigh. I would need to see how glibc works internally. But that is
    > beside the point. Try to signal/broadcast outside when possible.

    As I've shown, that's not necessary with glibc; the number of context
    switches and the CPU time are nearly the same in both cases.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Wed Apr 16 22:26:21 2025
    From Newsgroup: comp.lang.c++

    On 4/16/2025 8:58 PM, Bonita Montero wrote:
    > On 17.04.2025 at 00:59, Chris M. Thomasson wrote:
    >
    >> Sigh. I would need to see how glibc works internally. But that is
    >> beside the point. Try to signal/broadcast outside when possible.
    >
    > As I've shown, that's not necessary with glibc; the number of context
    > switches and the CPU time are nearly the same in both cases.

    So, signal wherever you like! I don't care. I will signal outside when I
    can. That's that.
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c++ on Thu Apr 17 13:20:40 2025
    From Newsgroup: comp.lang.c++

    On 17.04.2025 at 07:26, Chris M. Thomasson wrote:

    > So, signal wherever you like! I don't care. I will signal outside when
    > I can. That's that.

    Of course you can, but it doesn't matter if you signal from outside or
    inside.

  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c++ on Thu Apr 17 10:51:58 2025
    From Newsgroup: comp.lang.c++

    On 4/17/2025 4:20 AM, Bonita Montero wrote:
    > On 17.04.2025 at 07:26, Chris M. Thomasson wrote:
    >
    >> So, signal wherever you like! I don't care. I will signal outside when
    >> I can. That's that.
    >
    > Of course you can, but it doesn't matter if you signal from outside or
    > inside.

    The only thing I can say is that signalling, especially broadcasting,
    from the outside is ideal no matter what libs you are using. If the lib
    has a very clever wait morph, so be it. We are talking about scaling
    mutexes, so, well, ugggh.
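
    A broadcast from outside the critical section could look like the
    following sketch: the state change is published under the mutex, the
    mutex is released, and only then are all waiters notified. The Gate
    class and its member names are hypothetical, chosen only to illustrate
    the ordering; whether this measurably beats notifying under the lock
    depends on the library and the workload, as the thread shows.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical one-shot gate: wait() blocks until open() has been called.
class Gate
{
public:
    void wait()
    {
        std::unique_lock<std::mutex> lock(mtx_);
        cv_.wait(lock, [this] { return open_; });
    }

    void open()
    {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            open_ = true;   // publish the state change under the mutex
        }                   // mutex is released at the end of this scope
        cv_.notify_all();   // broadcast after unlocking
    }

    bool is_open()
    {
        std::lock_guard<std::mutex> lock(mtx_);
        return open_;
    }

private:
    std::mutex mtx_;
    std::condition_variable cv_;
    bool open_ = false;
};
```

    Because the flag is still written under the mutex, a waiter cannot miss
    the update; only the notify_all() call itself is moved out of the
    critical section.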