From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from simark.ca by simark.ca with LMTP id Rsb9FD3ymWClMAAAWB0awg (envelope-from ) for ; Mon, 10 May 2021 22:55:57 -0400 Received: by simark.ca (Postfix, from userid 112) id 4872B1F11C; Mon, 10 May 2021 22:55:57 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on simark.ca X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,MAILING_LIST_MULTI,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.2 Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPS id B38D71E54D for ; Mon, 10 May 2021 22:55:55 -0400 (EDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 51E303848039; Tue, 11 May 2021 02:55:55 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 51E303848039 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1620701755; bh=4O6/iDJrzdKA/4uigbHFcq59jZ0Mg/F13v0a+hRHKMc=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=w+D9E7nUubnprfKXGuzw9ptavK47mDwZWDy2To6OF7+ygQ6+7pfIfW54kS4k83lck si+PYsUZs2pRpetDwF+JWCXfDFwbQItRA1MZGjruJLOH3RRHMcq2oh7lbrviau0PfC en4hxGm34j2kFKgXowMTUulOvOfjBi8oREy9j7+w= Received: from smtp.polymtl.ca (smtp.polymtl.ca [132.207.4.11]) by sourceware.org (Postfix) with ESMTPS id 2AED63857C40 for ; Tue, 11 May 2021 02:55:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 2AED63857C40 Received: from simark.ca (simark.ca [158.69.221.121]) (authenticated bits=0) by smtp.polymtl.ca (8.14.7/8.14.7) with ESMTP id 14B2tkdX030524 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 10 May 2021 22:55:51 -0400 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp.polymtl.ca 14B2tkdX030524 Received: from [10.0.0.11] (192-222-157-6.qc.cable.ebox.net [192.222.157.6]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by simark.ca (Postfix) with ESMTPSA id 144671E54D; Mon, 10 May 2021 22:55:45 -0400 (EDT) Subject: Re: Proposal: format GDB Python files with black To: Tom Tromey , Simon Marchi via Gdb-patches References: <87lf8p9pwg.fsf@tromey.com> Message-ID: Date: Mon, 10 May 2021 22:55:45 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1 MIME-Version: 1.0 In-Reply-To: <87lf8p9pwg.fsf@tromey.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Poly-FromMTA: (simark.ca [158.69.221.121]) at Tue, 11 May 2021 02:55:46 +0000 X-BeenThere: gdb-patches@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gdb-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , From: Simon Marchi via Gdb-patches Reply-To: Simon Marchi Errors-To: gdb-patches-bounces@sourceware.org Sender: "Gdb-patches" Changing the subject, so this gets more visibility. On 2021-05-08 12:00 p.m., Tom Tromey wrote: >>>>>> "Simon" == Simon Marchi via Gdb-patches writes: > > Simon> A few people I talked to about this and I have good experience > Simon> with the tool black to auto-format Python code. > > Inspired by this, I took another look at clang-format. If we could use > this, I think it would be a decent productivity improvement, because it > would mean contributors would no longer need to worry about the minutiae > of the GNU style. We could also stop reviewing this. I would love this. > My github has my current hacks, on the t/clang-format branch. This > branch just has .clang-format, I didn't reformat the sources. I'm using > clang-format 10.0.1. > > I managed to tweak the penalty values in the format to avoid the > bin-packing problem I was seeing previously. I haven't extensively > tested this (I usually only reformat gdb_bfd.c and look at it, since it > exercised several bugs before...), so I recommend that others play with > it a bit. I formatted the whole code base and checked a few files in Meld. In general, I would say that it does more good (fixing little mistakes, finding split lines that could actually be one line) than harm. I didn't see anything outrageously badly formatted (and if there is, we always have the option to disable formatting for a section of a file). There are a lot of changes that are neutral, where the before and after would be equally good. And of course, some changes where it does things not exactly like we would, like: gcc_type compile_cplus_instance::convert_reference_base (gcc_type base, enum gcc_cp_ref_qualifiers rquals) which becomes: gcc_type compile_cplus_instance::convert_reference_base ( gcc_type base, enum gcc_cp_ref_qualifiers rquals) But really, I don't think these should be blockers. It would take someone some really bad faith to say that the above changes how they read the code. Is the following what you don't like? In gdbsupport/common-debug.h, before: debug_prefixed_printf (m_module, m_func, "%s: <%s debugging was not enabled on entry>", m_end_prefix, m_module); after: debug_prefixed_printf ( m_module, m_func, "%s: <%s debugging was not enabled on entry>", m_end_prefix, m_module); Note: the current line is too long, my bad. I have only seen this when we make clang-format's life difficult by using long names or long string literals. If we don't like how it formats some piece of code, we always have the choice of tweaking it to make its like easier. It can be a good motivation to try to reduce indentation levels by splitting the code into functions, for example. I'm surprised you didn't use "UseTab: Always", that gives pretty much our scheme of tabs + spaces for indentation. By default it uses just spaces (I wouldn't mind using either, but since the goal is to get something close to our current style). I would use "IndentPPDirectives: AfterHash". Unlike what we use currently, it will indent with $IndentWidth columns, whereas we typically we indent these with a single column when we do it manually. But, I think I prefer 2 columns actually. You can try formatting gdbsupport/gdb_locale.h for a good example. Speaking of which, in order to be usable from gdbsupport and gdbserver, .clang-format would need to be in the top-level directory. Or maybe it could be in gdbsupport, and gdb/gdbserver symlink to it. We have a number of places that put the namespace brace on the same line: namespace foo { We could keep that by using "BreakBeforeBraces: Custom". But I don't think it's worth the trouble. It adds one extra line for the opening brace, but it's typically not in a region of the file that you care much about. I noticed this: btrace_block (CORE_ADDR begin, CORE_ADDR end) : begin (begin), end (end) { /* Nothing. */ } The initializer list is put on the same line as the constructor's prototype. If there's a way to tell it that we don't want that, it would be closer to our style. For AlignEscapedNewlines, I'd probably choose Right or DontAlign. The reason being that if you modify the longest line of a macro (make that line longer or shorter), all the macros line will change as a result. So it will be harder to spot what changed when reading the hunk in the diff. I have a slight preference for Right, because it puts the backslashes out of sight, thus improving readability. It's the kind of thing that is quite annoying to maintain by hand, but it's not if a tool does it. I would suggest setting SplitEmptyFunction, SplitEmptyRecord and SplitEmptyNamespace to false. We already do it for empty methods. Enough for now. > I did see a few issues, though perhaps they are minor. > > 1. It removes all the Control-L page breaks from the source. > I know these often confuse newcomers, so maybe it's fine. > I feel that Someone out there will cheer this :) Who dat? > > 2. clang-format can't handle GNU-style goto label out-denting. > There's a clang-format bug on this topic > https://bugs.llvm.org/show_bug.cgi?id=24118 Probably not a big deal for us, as we don't use goto much? > > 3. In gdb if we have a large 'if' condition that consists of expression > chained with "&&" or "||", we often put each sub-condition on its own > line. clang-format packs them instead, and I didn't see a way to > counteract this. I hate too long conditions anyway. If the condition spans too many lines or has deep parenthesis nesting, it would deserve to be refactored anyway. Like, put that code in a function and just have: if (my_new_function (...)) With a descriptive name for the function, the code ends up more readable. > 4. Gettext invocations like _("text") have a space added, like _ ("text"). > I didn't see a way to change this. A very little inconvenience IMO, compared to not having to think about formatting. We could live with this, and file a bug at the same time. There could be an option to add exceptions to SpaceBeforeParens. > Another possible problem is that, unlike with 'black', ensuring that > everyone has the same version is a pain. Indeed. It's perhaps easy for me to say, because I get to choose what Linux distro and version I work on (so I opt for something recent), but I would still lean towards just following whatever the current latest stable version is. There might be new options in the latest stable version we want to use, it would be nice not to have to wait years before we can use them. And I suppose there's a not too painful way to get it for all the major distros out there (and for Windows and macOS, there are binary releases). And you can always run it in Docker or something. But we would probably encourage people to use git-clang-format, which only reformats the lines you touched. So, it wouldn't be a big problem for people to use different versions, as it wouldn't create spurious changes elsewhere in the file. The differences in formatting would be in lines you touch anyway. Speaking of git-clang-format: from my experience, it is not super cool when the code base isn't already formatted with the target formatting style. It causes discrepancies in style, because you have a mix of new style (whatever lines were touched) and old style (whatever lines were not touched) which can be annoying and distracting. Or, it causes slightly bigger diffs than you would get otherwise. For example, I ran git-clang-format on this commit here: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=10e578d7e00d ... to see what would have been the resulting patch had the author used git-clang-format. This would have been the patch: https://pastebin.com/pxrZtnDU If you check the changes in mi/mi-cmd-break.c, you can see that git-clang-format re-formatted the whole enum and array initializer. Which is nice in itself, the resulting formatting is nice. But, the patch would be a bit harder to read. The other option is to start by doing a big re-format and then tell people to use git-clang-format. This way, there might be some small incoherences here and there due to different versions, but I think they'll be insignificant. With the Python code, we did a massive re-format. I don't think it did / will cause a big inconvenience, because there aren't many patches to the Python code. But for the C++ code, it might be different, as there are more patches in everyone's pipeline that touch that code. Another question is: would we re-format the C/C++ code in the testsuite? It might require a few adjustments to test cases (like there was one for black), but I would be fine with that. My concern is more if there are some tests where the formatting of the code was specifically important. We could disable formatting for them, but the difficulty is to find them. Simon