From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id +HzkMluEjmLLWggAWB0awg (envelope-from ) for ; Wed, 25 May 2022 15:32:43 -0400 Received: by simark.ca (Postfix, from userid 112) id CD22B1E220; Wed, 25 May 2022 15:32:43 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-3.9 required=5.0 tests=BAYES_00,MAILING_LIST_MULTI, NICE_REPLY_A,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id 1A0E41E00D for ; Wed, 25 May 2022 15:32:42 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B3EF13830672 for ; Wed, 25 May 2022 19:32:41 +0000 (GMT) Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) by sourceware.org (Postfix) with ESMTPS id 348CA3856DD2 for ; Wed, 25 May 2022 19:32:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 348CA3856DD2 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=palves.net Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-wm1-f54.google.com with SMTP id c9-20020a7bc009000000b0039750ec5774so3935939wmb.5 for ; Wed, 25 May 2022 12:32:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:message-id:date:mime-version:user-agent:subject :content-language:to:cc:references:from:in-reply-to :content-transfer-encoding; bh=ANKZZoPCoBKzX6c72T9WbQ95RU/msbGPCWiwuQEtjlg=; b=hdmtijAmmxE4/7qBPkxtaXR3uNok9wIHBFa1Ycu/nEODLlp3yVsCD+OvFBOH5+9xDU sbGeGXj9N3kkqhU56/DXMvKL0JMijs4M/KeZa3HxQDdPvltQmsz9C2TCrANqsMFGiWBO hxLzC9XrsXAR/942JVjAsKdDVcKo4EtYKM4lzHCfc6MH1ETtZbnChY0DB6bn2ti1x0hG SZc/3t5WEYaWGM0fz5Nno6XqtHTcJi7W06uCFwqdx+FQuRg8hO5Nz00nbWwx5lt2RPz2 w24/sm51haicm3LE0X6xkFmUBsnP/xw4R2+cS+G6U5loFflbZLYLHWcSGABpjJpBZi1O 8exw== X-Gm-Message-State: AOAM531iCyonUItrlnsm+PkjfRU/ayqCsleCnWTpKjC3LDS68FDYWheQ AgRrw8IzTRNEo1KVebuGacKkeJO9RJg= X-Google-Smtp-Source: ABdhPJylTnFIT2pLLuGRti5OWPEPVyv0lb3NaUhd8cSbiKT0NotsiXUD7NRVVO9koFTP3m6mjIY6pQ== X-Received: by 2002:a05:600c:2182:b0:397:58f5:c6cf with SMTP id e2-20020a05600c218200b0039758f5c6cfmr9249893wme.86.1653507146700; Wed, 25 May 2022 12:32:26 -0700 (PDT) Received: from ?IPV6:2001:8a0:f924:2600:209d:85e2:409e:8726? ([2001:8a0:f924:2600:209d:85e2:409e:8726]) by smtp.gmail.com with ESMTPSA id c4-20020adfc6c4000000b0020e5d2a9d0bsm3455292wrh.54.2022.05.25.12.32.25 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 25 May 2022 12:32:25 -0700 (PDT) Message-ID: <4c7a9504-83e0-6c02-fda6-0254ab4eede4@palves.net> Date: Wed, 25 May 2022 20:32:24 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH v2 1/2] Always show locations for breakpoints & show canonical location spec Content-Language: en-US To: Eli Zaretskii References: <20220519215552.3254012-1-pedro@palves.net> <20220519215552.3254012-2-pedro@palves.net> <834k1kd7ne.fsf@gnu.org> <625057b2-1691-a472-fa93-0dabacbddd39@palves.net> <83ilpv5bd3.fsf@gnu.org> From: Pedro Alves In-Reply-To: <83ilpv5bd3.fsf@gnu.org> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: gdb-patches@sourceware.org Errors-To: gdb-patches-bounces+public-inbox=simark.ca@sourceware.org Sender: "Gdb-patches" On 2022-05-24 14:06, Eli Zaretskii wrote: > >>>> +Enabled breakpoints and breakpoint locations are marked with @samp{y}. >>> >>> What does "enabled breakpoint location" mean? One cannot enable a >>> location. >> >> Sure we can. > > We are mis-communicating. "Location" is a frequently-used word that > has certain natural meaning in many contexts. That natural meaning > doesn't support the notion of "enabling". Well, yeah, but above the text is very clearly talking about "breakpoint locations", i.e., the locations in the program where the breakpoint is armed. Enabling/disabling one of these locations is really about enabling/disabling the arming of a breakpoint at such a location. There is no need to be concerned about the general term here, the context doesn't leave scope for ambiguity. Even though, the term does make sense here. > You use "location" in a > very special (and IMO problematic, as I tried to explain later in my > comments) meaning, and in that meaning a "location" indeed _can_ be > enabled. But when reading this text, that meaning was not yet clear > to me, if it ever becomes clear later, and thus this text made me > raise a brow. I described the surprise which I felt while reading the > text as a feedback in the hope that we will be able to make the manual > more clear and less confusing upon first, linear, top-to-bottom > reading, and avoid the need for the reader to repeatedly read the same > text before its meaning dawns on the reader. I think we can get rid of any confusion by naming the input location spec, the arguments you pass to commands, differently. I've just sent a separate patch about this, which cleans up the manual in that direction, and you'll notice that it isn't really about "info breakpoints". In fact, it barely touches that section. It's a lot more general. Please take a fresh look at that one. I'll then get back to these "info breakpoints" changes on top of it, and I think we will have a much better foundation. Here is the URL to the patch in question: [PATCH] gdb/manual: Introduce locspecs https://sourceware.org/pipermail/gdb-patches/2022-May/189417.html > >> So I'm actually proposing to improve things by calling the user-input >> text as "location specification", distinct from the actual locations >> the specification matches or resolves to. > > My suggestion is to use "location" (with or without additional > qualifiers) for only one of these two, if only because we tend to > shorthand that by using just "location". We could use it for what you > now call "location specification", which would allow us to use just > "location" as it shorthand. With your proposed text, "location" is > ambiguous, because it could allude to any of these two. The other > meaning of "location" I propose to call "address in the code" or > "address" for short -- because that's what it conceptually is. (If > you are bothered by the case of unloaded shared library and similar > situations, we could say "unresolved address" in those cases -- which > are rather marginal, so once explained, they can be ignored in the > rest of the text. By contrast, these two "locations" are pervasive: > we need to refer to them all over the manual. So it's much more > important IMO to make that part of our terminology clear and > unequivocal.) I really think that this is the wrong direction, and working on the patch that I pointed at above really convinced me so even more. I believe that that patch will convince you as well. Please take a look there. As another data point, after writing that other patch, i.e., just now, I went to look what other debuggers call similar things, and here's what I saw: - Delve, the Go debugger, calls what I'm suggesting to call "Location Specification" + shorthand "locspec", as "Location Specifier". See: https://github.com/go-delve/delve/blob/master/Documentation/cli/locspec.md That's very close -- specification vs specifier. I could go with "Specifier" too instead of "Specification" if people prefer it. Note how their "break" command is documented as: "break [name] [locspec]" https://github.com/go-delve/delve/tree/master/Documentation/cli#break That's very close to what I proposed in the patch I just sent, as it uses the same "locspec" shorthand. Nice coincidence. They also have wording using "matching", as in "If called with the locspec argument it will delete all the breakpoints matching the locspec. If locspec is omitted all breakpoints are deleted." ... which is the wording I used in that patch as well. I swear I didn't find this until after I wrote my patch. It should give some credence at this being a good candidate for a term of art. - LLDB calls "locations" to what GDB calls breakpoint locations too: See here: https://lldb.llvm.org/use/tutorial.html#setting-breakpoints They use the "resolve to one or more locations" wording too. See: "Note that setting a breakpoint creates a logical breakpoint, which could resolve to one or more locations. For instance, break by selector would set a breakpoint on all the methods that implement that selector in the classes in your program. Similarly, a file and line breakpoint might result in multiple locations if that file and line were inlined in different places in your code." Here, lldb debugging gdb, note how the user interface uses "location" just like GDB. Setting a breakpoint: (lldb) b main Breakpoint 1: 24 locations. ^^^^^^^^^^^^ Listing breakpoints: (lldb) break list Current breakpoints: 1: name = 'main', locations = 24 ^^^^^^^^^^^^^^ 1.1: where = gdb`main + 34 at gdb.c:28:10, address = gdb[0x00000000000ed07b], unresolved, hit count = 0 1.2: where = gdb`main + 8 at 13650.cc:45:9, address = gdb[0x0000000000869cf3], unresolved, hit count = 0 1.3: where = gdb`main + 8 at 1.cc:39:9, address = gdb[0x0000000000869eb3], unresolved, hit count = 0 1.4: where = gdb`main + 8 at 1.cc:160:9, address = gdb[0x000000000086a476], unresolved, hit count = 0 1.5: where = gdb`main + 8 at 2.cc:158:9, address = gdb[0x000000000086a9d8], unresolved, hit count = 0 1.6: where = gdb`main + 8 at 3.cc:158:9, address = gdb[0x000000000086b039], unresolved, hit count = 0 1.7: where = gdb`main + 8 at 4.cc:40:9, address = gdb[0x000000000086b0e4], unresolved, hit count = 0 1.8: where = gdb`main + 8 at 1.cc:90:9, address = gdb[0x000000000086b6b7], unresolved, hit count = 0 1.9: where = gdb`main + 8 at 2.cc:48:9, address = gdb[0x000000000086b910], unresolved, hit count = 0 1.10: where = gdb`main + 8 at 3.cc:63:9, address = gdb[0x000000000086bcc4], unresolved, hit count = 0 1.11: where = gdb`main + 8 at 1.cc:74:9, address = gdb[0x000000000086c001], unresolved, hit count = 0 1.12: where = gdb`main + 8 at 2.cc:366:9, address = gdb[0x000000000086de53], unresolved, hit count = 0 1.13: where = gdb`main + 8 at 1.cc:41:9, address = gdb[0x0000000000869de2], unresolved, hit count = 0 1.14: where = gdb`main + 8 at 1.cc:127:9, address = gdb[0x0000000000869bb0], unresolved, hit count = 0 1.15: where = gdb`main + 8 at 1.cc:58:9, address = gdb[0x00000000008693ed], unresolved, hit count = 0 1.16: where = gdb`main + 8 at 1.cc:58:9, address = gdb[0x00000000008692a1], unresolved, hit count = 0 1.17: where = gdb`main + 8 at 2.cc:84:9, address = gdb[0x0000000000869135], unresolved, hit count = 0 1.18: where = gdb`main + 8 at front_back.cc:38:9, address = gdb[0x0000000000868d09], unresolved, hit count = 0 1.19: where = gdb`main + 27 at empty.cc:27:22, address = gdb[0x0000000000868b0a], unresolved, hit count = 0 1.20: where = gdb`main + 8 at 1.cc:66:9, address = gdb[0x0000000000868ae3], unresolved, hit count = 0 1.21: where = gdb`main + 8 at 3.cc:34:9, address = gdb[0x0000000000868912], unresolved, hit count = 0 1.22: where = gdb`main + 8 at 2.cc:41:9, address = gdb[0x00000000008688ac], unresolved, hit count = 0 1.23: where = gdb`main + 8 at 1.cc:62:9, address = gdb[0x00000000008687bb], unresolved, hit count = 0 1.24: where = gdb`main + 8 at 1.cc:167:9, address = gdb[0x000000000086839c], unresolved, hit count = 0 2: name = 'handle_inferior_event', locations = 1 2.1: where = gdb`::handle_inferior_event(execution_control_state *) + 44 at infrun.c:5335:21, address = gdb[0x00000000004ea21d], unresolved, hit count = 0 (lldb) Disabling a location: (lldb) break disable 1.50 error: '1.50' is not a currently valid breakpoint/location id. ^^^^^^^^^^^^^^^^^^^ Curiously, a single-location breakpoint is shown just like multi-location breakpoints too, there's no distinction. See breakpoint 2 above. And they show the original breakpoint spec in the "name = ..." field, it's not lost for them like it is in gdb's "info breakpoints". OOC, here's what they show for a disabled breakpoint with enabled locations (except location 1.3, I disabled that one): (lldb) break list Current breakpoints: 1: name = 'main', locations = 24 Options: disabled 1.1: where = gdb`main + 34 at gdb.c:28:10, address = gdb[0x00000000000ed07b], unresolved, hit count = 0 1.2: where = gdb`main + 8 at 13650.cc:45:9, address = gdb[0x0000000000869cf3], unresolved, hit count = 0 1.3: where = gdb`main + 8 at 1.cc:39:9, address = gdb[0x0000000000869eb3], unresolved, hit count = 0 Options: disabled 1.4: where = gdb`main + 8 at 1.cc:160:9, address = gdb[0x000000000086a476], unresolved, hit count = 0 1.5: where = gdb`main + 8 at 2.cc:158:9, address = gdb[0x000000000086a9d8], unresolved, hit count = 0 1.6: where = gdb`main + 8 at 3.cc:158:9, address = gdb[0x000000000086b039], unresolved, hit count = 0 ... > > Can you tel concretely why you object to changing the "location" > terminology along these lines? Even if you think the current > terminology is okay, what will we lose by switching the terminology I > propose above? You lose the fact that it is the right name for it, and the fact that everyone understands what it means today, so you force everyone to learn something new for no good reason. Also, the list of breakpoint locations in MI is exposed in the field called ... "locations". And there is a proposal pending on the list to expose the breakpoint locations to Python, via a "locations" attribute... > >> Even GDB's source code uses "location" for both -- struct bp_location >> for the breakpoint locations, the "1.1", "1.2" things, stored in each >> struct breakpoint in the breakpoint::loc field, as a linked list: > > I know. But I hope we can avoid terminology ambiguities even though > they are present in the code. The reader of the GDB manual will not > necessarily read the GDB source at the same time, and is not required > to do so if all they want is to learn how to use GDB. My point is that the ambiguity already exists, is old, but I plan to do something about, including fixing the code, and was am telling you what my plan for the source code was. > Preexisting text is not sacred. It's okay to modify or even delete > parts of the old text where they are sub-optimal or aren't > contributing to the manual enough to justify their existence. Certainly. But it is also the case that you should expect that a contributor will assume that the terms used in current text will have already been vetted when the text was originally added, so reusing them or moving text around shouldn't require rewriting everything. > > Also, please understand that when presented with a large rewrite of > the manual, I normally cannot go back to the old text and refrain from > commenting on some parts that were more or less verbatim copied from > the old version. Instead, I read the new text as a whole and comment > on that, since that is what will the reader see after the changes are > installed. I somewhat understand that (don't understand the "cannot", you could do that for terms you find suspicious, if not for the whole text), though I also believe that can lead to people feeling like they're getting unfair requirements. >>> I'm not sure it is a good idea to talk about "resolved locations", >>> here and elsewhere. Why not "resolved addresses", as the original >>> text did? Using "location" here introduces a conceptual ambiguity, >>> whereby a "location" could be in the source code and also in the >>> executable code, and that overloading of concepts makes this complex >>> issue even harder to understand. My suggestion is to use "location" >>> for source-level location specs, and for nothing else. >> >> I disagree with that, because that's not how GDB behaves. >> [...] > > I'm disappointed by such a wholesale rejection of these parts of my > comments. The terminology issue is very important, at least IMO, and > the rest of the review circles around it and its various aspects. If > you disagree with it so strongly, almost all the non-trivial parts of > my review can be just ignored, because they are based on this > difficulty in the terminology. > > I'm not sure I know how to proceed, given that we have such a basic > disagreement on these aspects of the changes for the manual. It should be OK to have disagreements, I hope. I tried to explain why I disagreed. You can disagree back and explain why. At some point we should reach somewhere in the middle. But let me try again. When you set a breakpoint, you pass the "break" command some arguments, which can be a function name, a file name, a line number, a stap probe, an address, and more. There are three formats you can use to specify where you want the breakpoints. Linespecs, explicit locations, and address locations. Once you pass this input to GDB, gdb will find the matching locations, and will tell you how many it found. For example: (top-gdb) b main Breakpoint 3 at 0xed06c: main. (24 locations) It is not correct to think of the locations as just "addresses", because they are more than that. For example here are three breakpoints set at different functions, one of them inlined. Note that both breakpoints' locations have the same address, though different function names and line numbers: (gdb) b func_extern_caller Breakpoint 1 at 0x5555555552d3: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c, line 208. (gdb) b func_inline_caller Note: breakpoint 1 also set at pc 0x5555555552d3. Breakpoint 2 at 0x5555555552d3: file /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c, line 199. (gdb) info breakpoints Num Type Disp Enb Address What 1 breakpoint keep y 0x00005555555552d3 in func_extern_caller at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:208 2 breakpoint keep y 0x00005555555552d3 in func_inline_caller at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:199 ^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^ ^^^ same! different... different... So they should be considered different locations. Running to each of the breakpoints will make GDB present the stop differently: (gdb) disable (gdb) enable 1 (gdb) r Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.opt/inline-break/inline-break Breakpoint 1, func_extern_caller (x=1) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:208 208 return func_inline_caller (x); (gdb) disable (gdb) enable 2 (gdb) r The program being debugged has been started already. Start it from the beginning? (y or n) y Starting program: /home/pedro/gdb/binutils-gdb/build/gdb/testsuite/outputs/gdb.opt/inline-break/inline-break Breakpoint 2, func_inline_caller (x=1) at /home/pedro/gdb/binutils-gdb/src/gdb/testsuite/gdb.opt/inline-break.c:199 199 return func_inline_callee (x); Also, the local variables you can access in each of the breakpoints's conditions are different, the breakpoints's commands will see a different context, etc., as the breakpoints are set in different functions, even though they have the same addresses. So calling breakpoint locations "addresses" instead if just incorrect. > >>>> +This field is present for ordinary breakpoints and tracepoints. >>> >>> What is an "ordinary" breakpoint, and how is it different from the >>> other kinds? (There are other places in the new text that use this >>> unexplained terminology.) >> >> I've just using the terminology already used in this table. See: > > When I say "not explained" I mean there's no text which says something > like "an ordinary breakpoint is ...", or some text that explains what > are the "not ordinary" breakpoints, and by doing that explains > implicitly what are the "ordinary" ones. If the old text didn't have > that, it means that old text was also sub-optimal. (And my reading of > these changes to the manual led me to believe that this distinction > will be more important once the behavior changes, because you modified > the original text even where the old one was not invalidated by the > changes in behavior.) Thus the comment. > OK, but you have to understand that if a contributor just copies a term from the surrounding text, they'll think that it should be OK to use it again, otherwise how did the term get into the manual in the first place? Requiring that contributions that just add more instances of the same clean up the prior mess isn't fair, because adding one more use of the same term will not make any difference to the user's understanding, nor to the work necessary to clarify elsewhere in some intro text what the term means. That can be done as an orthogonal change. > And this part of the patch: > >> -A location in a multi-location breakpoint is represented as a tuple with the >> -following fields: >> +Each location in an ordinary breakpoint or tracepoint is represented >> +as a tuple with the following fields: > > led me to believe that "ordinary" somehow is related to > "multi-location". > >> Thus "ordinary" here refers to actual code breakpoints, as opposed to the generic "breakpoint" >> term meaning all of code breakpoints, watchpoints, tracepoints, catchpoints. > > How about saying that explicitly, before we start relying on this > terminology? The text I'm changing _already_ relies on that terminology, so "before we start" doesn't apply. I agree it would be nice to clarify this, but I don't see why it has to be in this patch. I actually think it should be a separate patch. There are references to "regular breakpoints" too, to mean the same thing. And I'm starting to use "code breakpoints" more myself, as I think it has more meaning (contrast with "data breakpoint", which is a term many are familiar as alternative to our "watchpoint"). > >> Below is the updated patch. Only the documentation changed. > > Given that you rejected most of the important parts of my comments, > I'm not sure how to proceed with reviewing this new version. The main > part of my review is the suggestion to find better, more intuitive > terminology for "breakpoints" vs "header rows" in the "info break" > display, and also better terminology for the two types of "locations". > If this is left as it was in the original patch, what should I look at > in the revised patch? It wasn't left as it was, the v2 patch is different, as I had explained with some bullet points where I detailed the difference, which you've skipped. But nevermind, let's focus on discussing the patch I pointed at at the top of this email. The "info breakpoints" improvements changes should be much smaller once I rebase them on that patch (after it goes in). > > Please understand, like I explained above, that I read the new text > after the patch as a whole, and base my review on what it tells me and > how it describes the GDB features. It's not just diffs to me, it's a > more-or-less complete document, and it must make sense as a whole. > Currently, it doesn't, due to those terminology problems. Please understand that when writing the patch I too look at the manual as a whole. I constantly build the html of the manual when I'm editing it, and re-read the whole section or sections I am changing, trying to make sure everything fits together. I sometimes spend hours tweaking small details here and there until I have something that I think is good. The changes I proposed made sense to me from that angle. In the end, we all want the manual to improve. > > If we keep this terminology, the relevant sections of the manual will > keep confusing me. The result is that I will keep mentioning this in > my future reviews, and that makes little sense to me. I think the patch I posted to introduce locspecs will fix that. Please take a look at it.