From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10007 invoked by alias); 12 Dec 2011 16:57:06 -0000 Received: (qmail 9993 invoked by uid 22791); 12 Dec 2011 16:57:04 -0000 X-SWARE-Spam-Status: No, hits=-1.8 required=5.0 tests=AWL,BAYES_00 X-Spam-Check-By: sourceware.org Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Mon, 12 Dec 2011 16:56:51 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=EU1-MAIL.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1Ra9BI-0001Ez-QL from pedro_alves@mentor.com ; Mon, 12 Dec 2011 08:56:49 -0800 Received: from scottsdale.localnet ([172.16.63.104]) by EU1-MAIL.mgc.mentorg.com with Microsoft SMTPSVC(6.0.3790.1830); Mon, 12 Dec 2011 16:56:46 +0000 From: Pedro Alves To: Jan Kratochvil Subject: Re: [PATCH] PR threads/10729: x86 hw watchpoints and non-stop mode Date: Mon, 12 Dec 2011 17:23:00 -0000 User-Agent: KMail/1.13.6 (Linux/2.6.38-13-generic; KDE/4.7.2; x86_64; ; ) Cc: Tristan Gingold , gdb-patches@sourceware.org References: <201112051601.59664.pedro@codesourcery.com> <201112091630.20916.pedro@codesourcery.com> <20111211233811.GA27629@host2.jankratochvil.net> In-Reply-To: <20111211233811.GA27629@host2.jankratochvil.net> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201112121656.44478.pedro@codesourcery.com> X-IsSubscribed: yes Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org X-SW-Source: 2011-12/txt/msg00350.txt.bz2 On Sunday 11 December 2011 23:38:11, Jan Kratochvil wrote: > Hi Pedro, > > on simple non-hit (inferior does not touch "j" at all) watchpoint case: > strace -o 2 -q ./gdb -nx ./36 -ex start -ex 'watch j' -ex stepi -ex 'set confirm no' -ex q > grep 'PTRACE_....USER' 1 >1b; grep 'PTRACE_....USER' 2 >2b > > It has performance regression of 15 ptrace syscalls -> 27 ptrace syscalls. Hmm. Before, a single step (si) (instead of all syscalls fro the begining) does: $ tail -f 1 | grep PTRACE_P...USER ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg), 0x601028) = 0 ptrace(PTRACE_PEEKUSER, 15618, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0 ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 48, 0x4000) = 0 ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0 ptrace(PTRACE_PEEKUSER, 15618, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0 ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg), 0) = 0 ptrace(PTRACE_POKEUSER, 15618, offsetof(struct user, u_debugreg) + 56, 0xd0100) = 0 So, 7 related ptrace calls. After the patch, just one single-step does: $ tail -f 2 | grep PTRACE_P...USER ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg), 0x601028) = 0 ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 8, 0) = 0 ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 16, 0) = 0 ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 24, 0) = 0 ptrace(PTRACE_POKEUSER, 16139, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0 ptrace(PTRACE_PEEKUSER, 16139, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0 ptrace(PTRACE_PEEKUSER, 16139, offsetof(struct user, u_debugreg) + 56, [0xd0101]) = 0 Still 7 related ptrace syscalls. The 15 -> 27 jump in all PTRACE_P...USER syscalls is because for all stops, we're now reading both DR_STATUS and DR_CONTROL, while before we were only reading DR_STATUS. And there are stops in starting up an inferior until it reaches main. That looks easily fixed. In addition, before, we'd only poke to DR[N] if the register was used, now (i386|amd64)_linux_prepare_to_resume always writes all of DR0-3, even when only one DR is used. That's looks easily fixed too. I'm trying out this patch: gdb/amd64-linux-nat.c | 3 ++- gdb/i386-nat.c | 26 ++++++++++++++++++-------- 2 files changed, 20 insertions(+), 9 deletions(-) diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c index 9699f84..dc6a735 100644 --- a/gdb/amd64-linux-nat.c +++ b/gdb/amd64-linux-nat.c @@ -390,7 +390,8 @@ amd64_linux_prepare_to_resume (struct lwp_info *lwp) struct i386_debug_reg_state *state = i386_debug_reg_state (); for (i = DR_FIRSTADDR; i <= DR_LASTADDR; i++) - amd64_linux_dr_set (lwp->ptid, i, state->dr_mirror[i]); + if (state->dr_ref_count[i] > 0) + amd64_linux_dr_set (lwp->ptid, i, state->dr_mirror[i]); amd64_linux_dr_set (lwp->ptid, DR_CONTROL, state->dr_control_mirror); diff --git a/gdb/i386-nat.c b/gdb/i386-nat.c index 94306a1..6d59b14 100644 --- a/gdb/i386-nat.c +++ b/gdb/i386-nat.c @@ -607,22 +607,32 @@ i386_stopped_data_address (struct target_ops *ops, CORE_ADDR *addr_p) int rc = 0; unsigned status; unsigned control; + unsigned control_p = 0; /* Get the current values the inferior has. If the thread was running when we last changed watchpoints, the mirror no longer represents what was set in this thread's debug registers. */ status = i386_dr_low.get_status (); - control = i386_dr_low.get_control (); ALL_DEBUG_REGISTERS(i) { - if (I386_DR_WATCH_HIT (status, i) - /* This second condition makes sure DRi is set up for a data - watchpoint, not a hardware breakpoint. The reason is - that GDB doesn't call the target_stopped_data_address - method except for data watchpoints. In other words, I'm - being paranoiac. */ - && I386_DR_GET_RW_LEN (control, i) != 0) + if (!I386_DR_WATCH_HIT (status, i)) + continue; + + /* Fetching DR_CONTROL may require another syscall. Avoid when + possible. */ + if (!control_p) + { + control = i386_dr_low.get_control (); + control_p = 1; + } + + /* This second condition makes sure DRi is set up for a data + watchpoint, not a hardware breakpoint. The reason is + that GDB doesn't call the target_stopped_data_address + method except for data watchpoints. In other words, I'm + being paranoiac. */ + if (I386_DR_GET_RW_LEN (control, i) != 0) { addr = i386_dr_low.get_addr (i); rc = 1; With those changes, we're actually even better than before. Your example when from 15 -> 12. A single stepi that doesn't trigger a watchpoint only does: ptrace(PTRACE_POKEUSER, 23332, offsetof(struct user, u_debugreg), 0x601028) = 0 ptrace(PTRACE_POKEUSER, 23332, offsetof(struct user, u_debugreg) + 56, 0xd0101) = 0 ptrace(PTRACE_PEEKUSER, 23332, offsetof(struct user, u_debugreg) + 48, [0x4000]) = 0 That is, write DR0 and DR_CONTROL on resume, and, read DR_STATUS on stop. > On Fri, 09 Dec 2011 17:30:20 +0100, Pedro Alves wrote: > > @@ -513,22 +499,7 @@ i386_update_inferior_debug_regs (struct i386_debug_reg_state *new_state) > > ALL_DEBUG_REGISTERS (i) > > { > > if (I386_DR_VACANT (new_state, i) != I386_DR_VACANT (&dr_mirror, i)) > > - { > > - if (!I386_DR_VACANT (new_state, i)) > > - { > > - i386_dr_low.set_addr (i, new_state->dr_mirror[i]); > > - > > > > - /* Only a sanity check for leftover bits (set possibly only > > - by inferior). */ > > - if (i386_dr_low.unset_status) > > - i386_dr_low.unset_status (I386_DR_WATCH_MASK (i)); > > Deleting this part is a regression. Testcase for that part is attached. Thanks a lot, I'll take a look. I had assumed this bit: if (lwp->stopped_by_watchpoint) amd64_linux_dr_set (lwp->ptid, DR_STATUS, 0); in amd64_linux_prepare_to_resume would fix the issue. > > > + && I386_DR_GET_RW_LEN (control, i) != 0) > > { > > - addr = state->dr_mirror[i]; > > + addr = i386_dr_low.get_addr (i); > > Why to do this change? Why we can no longer trust DR_MIRROR? This is > a performance regression. This is non-stop, so threads can be running while we change the global state->dr_mirror (and friends). Say, we set a watchpoint, and let the threads rusume. Now, say you delete the watchpoint, or add/remove watchpoints such that state->dr_mirror[*] changes. Inserting/deleting watchpoines updates state->dr_mirror[*]. Now threads haven't been updated with the mirror yet, and say a thread has meanwhile hit an old watchpoint, but we haven't handled the SIGTRAP yet. If we trusted state->dr_mirror[*], we'd mistake the real trapped address to whatever was currently state->dr_mirror[i]. So state->dr_mirror now represents intention. To get at the address that trapped, we need to read the state the thread had when it trapped. I'll add some comments to the code. -- Pedro Alves