Oh that's interesting! I'm glad I asked about this then :)
I'm reproducing this on Summit (POWER arch) with Argobots 1.0 and gcc (everything built using spack). I cannot reproduce it on my laptop (x86_64 arch) with Argobots 1.0 and gcc, but there are so many differences between my laptop and Summit I wasn't sure where to start :)
I'll try using the most recent git revision on Summit and see what that does.
thanks,
-Phil
On 1/6/21 2:06 PM, Iwasaki, Shintaro wrote:
Hi Phil,
Thank you for a good question! I created an issue: https://github.com/pmodels/argobots/issues/287
Yes, what you expect is correct. A ULT calling lock/unlock will not yield if there is no contention. We guarantee this and will make it clear in the specification.
The current Argobots (assuming the current master) should work as you expect; ULT A should never yield in your case.
In Argobots 1.0 or 1.0.1, a ULT may yield for either of the following reasons, both of which are fixed in the current master:
1. The lock is not acquired with a "strong" atomic operation on architectures where atomics can be "weak" (e.g., ARM and POWER) (fixed by https://github.com/pmodels/argobots/pull/223)
2. If you explicitly pass `--disable-simple-mutex` at configuration time, the previous mutex-handover mechanism may have this issue (fixed by https://github.com/pmodels/argobots/pull/268)
Regarding 1., because some atomic instructions can fail spuriously ("weak"; see https://en.cppreference.com/w/c/atomic/atomic_compare_exchange), the current spinlock implementation in Argobots may cause this issue if you are using non-Intel hardware. I'd be happy if you could let us know what combination of hardware and compiler (with the compiler version) you are using. If you are using Intel hardware, I believe the current Argobots master works correctly unless you use a not-so-common compiler (e.g., PGI), but I will check.
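To illustrate point 1, here is a minimal C11 sketch (not the Argobots implementation) of how a single "weak" compare-and-swap attempt can turn into a yield even when the lock is free. All names (`demo_spinlock`, `demo_trylock_weak`, `demo_trylock_strong`, `demo_yield`, `demo_lock_weak`, `demo_lock_strong`, `demo_unlock`) are hypothetical, and `demo_yield` only counts calls as a stand-in for a real ULT yield.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock for illustration only; not the Argobots type. */
typedef struct {
    atomic_int locked; /* 0 = free, 1 = held */
} demo_spinlock;

/* Stand-in for a ULT yield: just counts how often the caller would
 * have given up the execution stream. */
static int demo_yield_count = 0;
static inline void demo_yield(void) { demo_yield_count++; }

/* One attempt with the weak compare-exchange. C11 allows it to fail
 * spuriously, so this can return false even if the lock is free. */
static inline bool demo_trylock_weak(demo_spinlock *l)
{
    int expected = 0;
    return atomic_compare_exchange_weak_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* One attempt with the strong compare-exchange. It returns false only
 * if the lock was actually held at the time of the attempt. */
static inline bool demo_trylock_strong(demo_spinlock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->locked, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Lock that yields after every failed attempt. With the weak attempt,
 * a spurious failure leads to a yield without any contention. */
static inline void demo_lock_weak(demo_spinlock *l)
{
    while (!demo_trylock_weak(l))
        demo_yield();
}

/* Same structure with the strong attempt: an uncontended lock is
 * acquired on the first try, so demo_yield() is never called. */
static inline void demo_lock_strong(demo_spinlock *l)
{
    while (!demo_trylock_strong(l))
        demo_yield();
}

static inline void demo_unlock(demo_spinlock *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

On x86_64 the weak and strong variants typically compile to the same instruction, which would be consistent with the behavior not showing up there. On architectures that implement compare-and-swap with load-linked/store-conditional pairs, such as POWER, the weak variant is allowed to fail and would take the yield path in this sketch.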
(Note that a priority lock/unlock is just a hint, so it will not help.)
Anyway, I should make this point clearer in the specification. At the same time, I will add a test to see if this is really the case. If the current mechanism is broken, I will fix it. Please expect this clarification and fix (if possible) to come this week.
Thanks,
Shintaro Iwasaki
------------------------------------------------------------------------
*From:* Phil Carns via discuss <[email protected]>
*Sent:* Wednesday, January 6, 2021 12:27 PM
*To:* [email protected] <[email protected]>
*Cc:* Carns, Philip H. <[email protected]>
*Subject:* [argobots-discuss] question about ABT_mutex and ULT scheduling
Hi all,
We've isolated a situation where the ABT_mutex construct is behaving a little differently than I expected. We have two ULTs running on a single ES. The ULTs are using ABT_mutex_lock/unlock() to protect a shared data structure. This specific configuration will never have lock contention (the mutex is really there to protect more complex configurations where there are more ESs and ULTs participating than what I described above).
Here is what puzzles me: I'm not 100% sure, but it really looks like ULT A yields to ULT B when attempting to lock the mutex sometimes, even though there is no contention.
This is a performance bug for us; ULT B is only supposed to execute when ULT A is idle in this configuration. We don't really want to give up execution when acquiring an uncontested mutex if we don't have to.
I'm sure we could work around it (restructuring our code, or using a spinlock or priority mutex or something), but I wanted to ask on the list first: is the behavior I described above (a ULT yielding on a mutex lock, even if the mutex is available) expected? Or is it an indication that we are doing something wrong somewhere? I want to make sure that I understand the problem before altering the code.
thanks!
-Phil
_______________________________________________
discuss mailing list
[email protected]
https://lists.argobots.org/mailman/listinfo/discuss