A Thursday 26 June 2008 14:32:51, Daniel Jacobowitz wrote: > On Sun, Jun 15, 2008 at 10:09:44PM +0100, Pedro Alves wrote: > > I've added three points where GDB makes the internal (inferior_ptid) > > thread switch to the user selected thread (user_selected_ptid): > > > >  1 - execute_command, before executing a command > >  2 - execute_command, after executing a command, > >  3 - fetch_inferior_event, after handling the event. > > > > With these in place, the user never notices when GDB switches > > between threads internally while handing events. > > The _internal name is a warning sign to me: it means we have two > functions that conceptually do the same thing, but one of them is only > "safe for internal use". We call both of them from internal to GDB, > so let's try to find a better name. How about > execute_command_in_current_context? > Ack. internal refered to "internal selected thread" vs "user/external selected thread". Please keep reading, though. > Is -interpreter-exec ever going to get a --thread argument? We have > to be careful because it currently restores the globally selected > thread via cli_interpreter_exec, and even if it didn't it may call > a user defined command. > > There are 22 calls to execute_command in GDB (including two from > Insight). This patch changes seven of them to > execute_command_internal: thread.c, mi-main.c, mi-cmd-env.c. > I checked all the others, just in case - they mostly look correct. > > The only cases I'm worried about are breakpoint commands and > conditions. For instance bpstat_do_actions to execute_control_command > to execute_command will mess with inferior_ptid. Should it? Ack. Keep reading. > What about bpstat_check_breakpoint_conditions to breakpoint_cond_eval > to evaluate_expression? > > Also, hook- and hookpost- commands will still be run after > execute_command_internal - but every call to execute_command_internal > manually sets the appropriate thread, so this looks fine. > Yeah, this can get twisty. I've been computing these issue in the background for a while, as these issues have been worrying me. I'd been through several completelly different versions of this patch, as it's tricky to get all issues covered correctly. I'd prefer that the next revision goes in the right direction. :-) Let me start with a bit of some goals and problem description, then work my way into a proposal, and then there's a patch attached that applies on top of this one you've reviewed, so the incremental changes can be seen better. Here are some thoughts I keep bumping myself into: - Ideally, at least the target interfaces and register frame and stack management components of GDB shouldn't know about a "current thread" (inferior_ptid) at all. I count over 750 references to it, so that is very far from being true. - In the long run, it should be possible to support more than one context. Say, the IDE can display two or even more frame info windows, each with its own thread selected. Something similar to what described here (3.4.1): http://www.sourceware.org/gdb/papers/multi-arch/real-multi-arch/ ... should be supportable. MI's --thread and --frame new switches should enable implementing that on the frontend side. - MI is gaining "stateless" --thread and --frame commands. Commands that take those arguments should not change the currently user selected thread and frame. - Non-stop shouldn't change the current user selected thread and frame. At least if the current thread is stopped. - In non-stop mode, the current thread may exit while the user has it selected. Switching current thread, and restoring it with a cleanup is problematic in non-stop mode. 1) The original thread may exit before we re-select it - current thread 1 is running - switch to thread 2 - - trying to restore thread 1 is messy * ideally, thread 1 is no longer in the thread list at this point. The main reason the current patch could keep exited threads listed was due to context switching needing inferior_ptid to be listed...) * restoring by ptid is ambiguous is the app/OS quickly reuses thread 1's ptid (remember it had exited). 2) The selected thread may change between creating the cleanup, and running it. E.g., To handle an event, we need to switch to the thread that got it --- Frame parsing/handling only being aplicable to inferior_ptid, and context switching are the main reasons behind this, off the top of my head. There's a bunch of cleanup work and core rework to do to be able to remove this requirement. Picture these examples: - User has thread 1 selected - thread 2 has an event - GDB switches to thread 2 to handle it - It was an internal breakpoint, thread 2 is immediatelly resumed, so the user should not notice the thread switch. - GDB has finished handling the event, and reverts to thread 2 (and also the frame the user had selected in it), (the user didn't notice the thread switching, unless the original current thread or frame were destroyed by the just handled event.) In this case, restoring with a cleanup might seem appropriate. Now this example: - User has thread 1 selected. - thread 2 has an event - GDB switches to thread 2 to handle it - It was a breakpoint, and has breakpoint commands attached to it. - The breakpoint has several commands attached actually: thread 3 print value1 up print value2 up print value3 end - GDB has finished handling the event, so should restore to the user selected thread and frame. But, which is it now? thread 1, or thread 3? In all-stop mode? In non-stop mode? After posting this patch I realized that the _internal variant indeed isn't good enough in situations like: (gdb) define mycomm >up >print a >end (gdb) thread 1 (gdb) thread apply 2 mycomm mycomm is executed with execute_command_internal (on thread 2), but "up" and "print a" are ran with execute_command (hence on thread 1)... In non stop mode, we don't want stop events to switch the current thread, but, breakpoint commands should apply to the context of the thread that just hit the breakpoint, e.g.: (gdb) b foo.c:100 Breakpoint 1 ... (gdb) commands >thread >print func_arg >c& >end (gdb) thread 1 (gdb) c -a& (thread 2 hits breakpoint 1) (GDB switches to it to handle the event) (GDB calls bpstat_do_actions, which apply in the context of thread 1) Boo! Breakpoint command became useless... Now another example. Imagine in non-stop mode, this case: (gdb) b foo.c:100 Breakpoint 1 ... (gdb) commands >thread 3 >c& >end (gdb) interrupt -a (all threads stopped) (gdb) thread 2 (gdb) c& (thread 2 resumes) (gdb) thread 1 (gdb) print foo_c (gdb) print foo_b (gdb) pri (thread 2 hits breakpoint 1, and runs its commands, which included switching to thread 3) nt foo_a (note that the breakpoint was hit while the user was inspecting thread 1, typing a print command) To which thread should the last "print foo_a" apply to? My opinion is that is should apply to thread 1. Does anyone disagree? We still need something else to make GDB behave this way. --------------------------------------------------------------- With all the above in mind, I thought of: - calling the { selected thread and frame } thing a "context". - having a context stack (The top level context being what the user/frontend considers current.) Instead of temporarilly switching to another thread and frame; execute commands in it, and restoring the selected thread and frame, we do: - push a new context on the context stack - switch the selected thread and frame in this new now top level context - execute command(s) which can switch thread and frame at will - pop context - Whenever a thread exit is detected, we go through all contexts and if it was the selected thread in it, invalidate it. Note that execute_command_internal is then gone. No need for it anymore. E.g. 1 - thread apply, pushing context to execute commands: (top level context selected) (gdb) thread 1 (gdb) thread apply 2 thread 3 (push new context) (selected thread 2) (execute "thread 3") (pop context) (gdb) thread [Current thread is 1 ...] From the user point of view, this is the current HEAD's behaviour actually. E.g. 2 - thread apply frame restoring broke with execute_command_internal, if command applied to the already selected thread, with contexts, it doesn't: (top level context selected) (gdb) thread 1 (gdb) frame #0 ... (gdb) thread apply 1 up (push new context) (selected thread 1) (execute "up") (pop context) (gdb) thread [Current thread is 1 ...] (gdb) frame #0 ... E.g. 2: (top level context selected) (gdb) b foo.c:100 Breakpoint 1 ... (gdb) commands >print func_arg >c& >end (gdb) thread 1 (thread 1 is running) (gdb) c -a& (thread 2 hits breakpoint 1) (GDB switches to it to handle the event) (GDB pushes new context) (GDB switches to the event thread (thread 2)) (calls bpstat_do_actions) (pops context) (restores current context, which is thread 1) E.g. 3 - thread exits: (gdb) b foo.c:100 Breakpoint 1 ... (gdb) commands >print inferior_func () >c& >end (top level context selected) (gdb) thread 1 (gdb) c -a& (all threads running now) (thread 2 hits breakpoint 1) (GDB switches to it to handle the event) (GDB pushes new context) (GDB switches to the event thread (thread 2)) (calls bpstat_do_actions) (inferior function call, inferior is set running for a bit (thread 1 exits (all contexts with thread 1 selected are invalidated) ) ) (pops context) (restores current context, which was thread 1, but is not invalidaded. i.e., there is now no selected thread in the current context.) (gdb) interrupt Cannot execute this command without a selected thread. See `help thread'. E.g. 4: (gdb) b foo.c:100 Breakpoint 1 ... (gdb) commands >thread 3 >c& >end (gdb) interrupt -a (all threads stopped) (gdb) thread 2 (gdb) c& (thread 2 resumes) (gdb) thread 1 (gdb) print foo_c (gdb) print foo_b (gdb) pri (thread 2 hits breakpoint 1 (a new context is pushed) (breakpoint commands are ran (thread 3; c&) ) (context is poped, now back in thread 1) nt foo_a (foo_a is evaluated in thread 1) Hope I've made some sense. Attached is a first draft at implementing this, needs a bit of cleanup, but is does work. What do you guys think? Is this the right direction? It does regtest regression free on x86_64-unknown-linux-gnu/all-stop. Vladimir, I've added a basic --thread switch to MI just for testing, based on older MI non-stop patches I had here, just so you could see what would change in your perspective. As you see, only a couple of lines should change. I'll remove it from any final patch. > > +enum thread_state > > +{ > > + THREAD_STOPPED, > > + THREAD_RUNNING, > > + THREAD_EXITED, > > +}; > > + > > +static enum thread_state main_thread_state = 0; > > THREAD_STOPPED instead of 0? Done. > > > - /* Backup current thread and selected frame. */ > > - if (!is_running (inferior_ptid)) > > - saved_frame_id = get_frame_id (get_selected_frame (NULL)); > > - else > > - saved_frame_id = null_frame_id; > > - > > - old_chain = make_cleanup_restore_current_thread (inferior_ptid, > > saved_frame_id); > > Is this sort of simplification really a good idea? Functions which > leave inferior_ptid with an essentially unspecified value, while it's > still global state used by most of GDB. > The idea is that execute_command restores the user selected thread and frame anyway (and MI's equivalent does as well), so this isn't needed. Let's discuss the other general issues first, as if the direction I'm proposing is better, this function will still be restoring the thread and frame, when the context is poped. > > +/* Only safe to use this cleanup on a stopped thread, and > > + inferior_ptid isn't ran before running the cleanup. */ > > + > > struct current_thread_cleanup > > Not sure what you mean by this comment. > I've changed the comment to this: /* It is only safe to use this cleanup iff inferior_ptid is alive and stopped, and, by if by the time it is ran, inferior_ptid represents the same thread, it is alive and it is stopped. */ Does it make it clearer? > > + /* In non-stop mode, we don't want GDB to switch threads on the > > + users back, to avoid races where the user is typing a command to > > user's Fixed, thanks. -- Pedro Alves