dumping stack information for running ULTs
Hi all,

I followed Shintaro's suggestions from the previous list email thread and was able to get a stack dump routine that works as expected. For all pools under my control I can see stack information like this for each ULT, with stack unwinding:

    == pool (0x557d5b8c4580) ==
    === ULT (0x7f47429590c1) ===
    id : 0
    ctx : 0x7f4742959120
    p_ctx : 0x7f4742958fe0
    p_link : (nil)
    stack : 0x7f47427590c0
    stacksize : 2097152
    #0 0x7f4744461760 in ythread_unwind_stack () <+16> (RSP = 0x7f4742959010)
    #1 0x7f474445e3dd in ABT_thread_yield () <+157> (RSP = 0x7f4742959020)
    #2 0x7f474447e255 in __margo_hg_progress_fn () <+645> (RSP = 0x7f4742959040)
    #3 0x7f474446381e in ABTD_ythread_func_wrapper () <+30> (RSP = 0x7f47429590b0)
    #4 0x7f47444639c1 in make_fcontext () <+33> (RSP = 0x7f47429590c0)
    00007f47427590c0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    00007f47427590e0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    ...

That only works for ULTs that are literally in the pool, though, I think. I don't believe that I am getting information for ULTs that are executing (and thus are not presently in a pool data structure).

Is there any way to accomplish that?

I tried to at least get the caller's own stack by doing something like this:

    ABT_thread_self(&self);
    ABT_info_print_thread_stack(outfile, self);

That almost works, but I'm just getting the raw address information and not the translated stack unwinding:

    p_ctx : 0x7f4742758fb8
    p_link : (nil)
    stack : 0x7f4742559000
    stacksize : 2097152
    00007f4742559000: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    00007f4742559020: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    ...

For completeness it would be nice if the caller's stack were also human readable, and even better if I could somehow find stack information for ULTs executing on other ESs as well.

For our use case we typically spawn many detached threads into a service pool, so we aren't tracking thread references.

thanks,
-Phil
Hi Phil,
> That only works for ULTs that are literally in the pool, though, I think.

Yes, that's right. If a ULT yields, the Argobots runtime can print its stack with either ABT_info_print_thread_stacks_in_pool() or ABT_info_print_thread_stack().

> I don't believe that I am getting information for ULTs that are executing

Technically, it is very challenging to dump a "running" ULT's stack from another ULT. libunwind needs a stack pointer (in reality, more than a stack pointer), but we cannot get it from a running ULT. For example, the following currently does not work as expected; the Argobots runtime cannot unwind the function stack:

    ABT_info_print_thread_stack(outfile, another_running_thread); // This does not work.

We need to stop a ULT to check its stack information.

- Complete "ABT_info_print_thread_stack"

  Printing the function call stack of a running ULT using libunwind is very challenging. The Argobots runtime needs to identify whether a ULT is running or not, and if it is running, it must be stopped by a signal. This is extremely error-prone. I don't know how to guarantee atomicity (e.g., what if two threads call this function for the same ULT?). Stopping a specific execution stream (i.e., a Pthread) might not be portable across OSs, so it might need a few fallback implementations (pthread_kill or some other signal-related function). Whether or not Argobots supports it, using a signal badly affects other runtimes and applications that use Argobots (e.g., system calls fail unexpectedly, pthread_cond_wait() wakes up, ...). Since this is stack unwinding of running Pthreads (i.e., nothing is ULT-specific), I am not sure this feature must be supported by the Argobots runtime, considering the harmfulness of a potential signal-based implementation.

  I understand the demand, and most users might feel the current implementation is incomplete. If this feature is really needed, we will consider the design more seriously, but please do not assume that we can provide this implementation very soon, though in the end the priority depends on urgency and significance.

- "ABT_self_print_thread_stack"

  The following implementation is easy. We can safely stop the caller ULT, so libunwind can print the function stack.

      ABT_self_print_thread_stack(FILE *fp);

  If this is sufficient, I can implement it quickly. To print a running ULT's stack, the user can basically install a signal handler on its underlying execution stream and call this function in that signal handler. We do not plan to guarantee async-signal safety for ABT_self_print_thread_stack(), so if this is the way to go, please let us know so that we can do our best to make it async-signal safe.
Thanks,
Shintaro

________________________________
From: Phil Carns via discuss <[email protected]>
Sent: Wednesday, April 14, 2021 6:44 PM
To: [email protected] <[email protected]>
Cc: Carns, Philip H. <[email protected]>
Subject: [argobots-discuss] dumping stack information for running ULTs
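The signal-based approach described in Shintaro's reply (interrupt the thread that is running, then dump the stack from inside the handler on that same thread) can be sketched roughly as follows. This is only an illustration under several assumptions and is not Argobots code: it uses a plain Pthread in place of an execution stream, glibc's backtrace() in place of libunwind, and SIGUSR1 as an arbitrary signal choice. The names dump_self_handler, worker, and demo_signal_dump are hypothetical; dump_self_handler merely stands in for the proposed ABT_self_print_thread_stack(), which did not exist at the time of this thread. The sketch also makes the side effect mentioned in the reply visible: the worker's blocking call will typically return early with EINTR.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <execinfo.h> /* backtrace(): glibc extension, used here instead of libunwind */
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static sem_t worker_ready, dump_done;
static volatile sig_atomic_t frames_seen;
static int sleep_errno;

/* Hypothetical stand-in for the proposed ABT_self_print_thread_stack().
 * It runs on the interrupted thread, so the unwinder starts from live registers. */
static void dump_self_handler(int signo)
{
    void *frames[32];
    int saved_errno = errno;
    (void)signo;
    int n = backtrace(frames, 32);
    backtrace_symbols_fd(frames, n, STDERR_FILENO); /* writes to the fd without malloc */
    frames_seen = n;
    sem_post(&dump_done); /* sem_post() is async-signal-safe */
    errno = saved_errno;
}

/* Plays the role of an execution stream that is busy (here: blocked in a sleep). */
static void *worker(void *arg)
{
    struct timespec ts = { 2, 0 };
    (void)arg;
    sem_post(&worker_ready);
    if (nanosleep(&ts, NULL) == -1)
        sleep_errno = errno; /* the side effect on the interrupted thread */
    return NULL;
}

/* Interrupts a worker thread with SIGUSR1 and returns the number of frames
 * the handler captured on that thread. */
int demo_signal_dump(void)
{
    void *warmup[4];
    struct sigaction sa;
    struct timespec settle = { 0, 100 * 1000 * 1000 };
    pthread_t tid;
    int rc;

    /* Call backtrace() once up front so the unwinder is loaded outside the handler. */
    backtrace(warmup, 4);
    frames_seen = 0;
    sleep_errno = 0;

    rc = sem_init(&worker_ready, 0, 0);
    assert(rc == 0);
    rc = sem_init(&dump_done, 0, 0);
    assert(rc == 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = dump_self_handler;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    sem_wait(&worker_ready);
    nanosleep(&settle, NULL);   /* heuristic: give the worker time to block */

    pthread_kill(tid, SIGUSR1); /* the handler runs on the worker thread */
    sem_wait(&dump_done);
    pthread_join(tid, NULL);

    fprintf(stderr, "worker nanosleep() errno: %s\n",
            sleep_errno ? strerror(sleep_errno) : "none");
    sem_destroy(&worker_ready);
    sem_destroy(&dump_done);
    return (int)frames_seen;
}
```

Calling demo_signal_dump() from main() prints the worker's frames to stderr. Note that backtrace() is not formally async-signal-safe, which mirrors the caveat in the reply about async-signal safety: a production version would need an unwinder and output path that are safe to use inside a handler.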
Hi Shintaro,

Thanks for the explanation, that all makes sense. In a perfect world it would be nice to display stack information from running ULTs, but it's not worth incurring the technical debt to pursue any of those methods to collect it right now.

I'll stick with the stack traces of ULTs in the pool for our use case (in conjunction with information that I can report from our own explicit tracking). This discussion was very helpful so that I know how to label the data clearly, though.

thanks!
-Phil

On 4/14/21 9:02 PM, Iwasaki, Shintaro wrote: