* [PATCH 1/2] gdb: make typing strict in gdb/copyright.py @ 2025-04-09 14:49 Simon Marchi 2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi 0 siblings, 1 reply; 6+ messages in thread From: Simon Marchi @ 2025-04-09 14:49 UTC (permalink / raw) To: gdb-patches; +Cc: Simon Marchi Add `pyright: strict` at the top of the file, then adjust the fallouts. This annotation is understood by pyright, and thus any IDE using pyright behind the scenes (VSCode and probably others). I presume that any GDB developer running this script is using a recent enough version of Python, so specify the type annotations using the actual types when possible (e.g. `list[str]` instead of `typing.List[str]`). I believe this required Python 3.9. Change-Id: I3698e28555e236a03126d4cd010dae4b5647ce48 --- gdb/copyright.py | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/gdb/copyright.py b/gdb/copyright.py index 5ec9944aae20..1c81358f1384 100755 --- a/gdb/copyright.py +++ b/gdb/copyright.py @@ -30,13 +30,15 @@ # # This removes the bulk of the changes which are most likely to be correct. +# pyright: strict + import argparse import locale import os import os.path import subprocess import sys -from typing import List, Optional +from typing import Iterable def get_update_list(): @@ -66,7 +68,7 @@ def get_update_list(): .split("\0") ) - def include_file(filename): + def include_file(filename: str): (dirname, basename) = os.path.split(filename) dirbasename = os.path.basename(dirname) return not ( @@ -83,7 +85,7 @@ def get_update_list(): return filter(include_file, result) -def update_files(update_list): +def update_files(update_list: Iterable[str]): """Update the copyright header of the files in the given list. We use gnulib's update-copyright script for that. @@ -128,7 +130,7 @@ def update_files(update_list): print("*** " + line) -def may_have_copyright_notice(filename): +def may_have_copyright_notice(filename: str): """Check that the given file does not seem to have a copyright notice. The filename is relative to the root directory. @@ -166,7 +168,7 @@ def get_parser() -> argparse.ArgumentParser: return parser -def main(argv: List[str]) -> Optional[int]: +def main(argv: list[str]) -> int | None: """The main subprogram.""" parser = get_parser() _ = parser.parse_args(argv) @@ -242,9 +244,9 @@ EXCLUDE_ALL_LIST = ( ) # The list of files to update by hand. -BY_HAND = ( +BY_HAND: list[str] = [ # Nothing at the moment :-). -) +] # Files containing multiple copyright headers. This script is only # fixing the first one it finds, so we need to finish the update base-commit: 2bbe439e72916c7d61bf0b7a8c898426bfabc0ba -- 2.49.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns 2025-04-09 14:49 [PATCH 1/2] gdb: make typing strict in gdb/copyright.py Simon Marchi @ 2025-04-09 14:49 ` Simon Marchi 2025-04-11 13:52 ` Guinevere Larsen 0 siblings, 1 reply; 6+ messages in thread From: Simon Marchi @ 2025-04-09 14:49 UTC (permalink / raw) To: gdb-patches; +Cc: Simon Marchi gdb/copyright.py currently changes some files that it shouldn't: - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does change the files under that directory - it is missing `sim/Makefile.in` Change the exclude list logic to use glob patterns. This makes it easier to specify exclusions of full directories or files by basename, while simplifying the code. Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason to keep them separate (they are treated identically). I kept the comment that explains that some files are excluded due to not being FSF-licensed. Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob patterns that match everywhere in the tree (e.g. `**/configure`). Tested by running the script on the parent commit of d01e823438c7 ("Update copyright dates to include 2025") and diff'ing the result with d01e823438c7. The only differences are: - the files that we don't want to modify (gnulib/import and sim/Makefile.in) - the files that need to be modified by hand Running the script on latest master produces no diff. Change-Id: I318dc3bff34e4b3a9b66ea305d0c3872f69cd072 --- gdb/copyright.py | 93 +++++++++++++++++++++--------------------------- 1 file changed, 40 insertions(+), 53 deletions(-) diff --git a/gdb/copyright.py b/gdb/copyright.py index 1c81358f1384..50902a8033d2 100755 --- a/gdb/copyright.py +++ b/gdb/copyright.py @@ -36,6 +36,7 @@ import argparse import locale import os import os.path +import pathlib import subprocess import sys from typing import Iterable @@ -68,19 +69,15 @@ def get_update_list(): .split("\0") ) + full_exclude_list = EXCLUDE_LIST + BY_HAND + def include_file(filename: str): - (dirname, basename) = os.path.split(filename) - dirbasename = os.path.basename(dirname) - return not ( - basename in EXCLUDE_ALL_LIST - or dirbasename in EXCLUDE_ALL_LIST - or dirname in EXCLUDE_LIST - or dirname in NOT_FSF_LIST - or dirname in BY_HAND - or filename in EXCLUDE_LIST - or filename in NOT_FSF_LIST - or filename in BY_HAND - ) + path = pathlib.Path(filename) + for pattern in full_exclude_list: + if path.full_match(pattern): + return False + + return True return filter(include_file, result) @@ -212,8 +209,14 @@ def main(argv: list[str]) -> int | None: # generated, non-FSF, or otherwise special (e.g. license text, # or test cases which must be sensitive to line numbering). # -# Filenames are relative to the root directory. -EXCLUDE_LIST = ( +# Entries are treated as glob patterns. +EXCLUDE_LIST = [ + "**/aclocal.m4", + "**/configure", + "**/COPYING.LIB", + "**/COPYING", + "**/fdl.texi", + "**/gpl.texi", "gdb/copying.c", "gdb/nat/glibc_thread_db.h", "gdb/CONTRIBUTE", @@ -221,45 +224,11 @@ EXCLUDE_LIST = ( "gdbsupport/unordered_dense.h", "gnulib/doc/gendocs_template", "gnulib/doc/gendocs_template_min", - "gnulib/import", + "gnulib/import/**", "gnulib/config.in", "gnulib/Makefile.in", -) - -# Files which should not be modified, either because they are -# generated, non-FSF, or otherwise special (e.g. license text, -# or test cases which must be sensitive to line numbering). -# -# Matches any file or directory name anywhere. Use with caution. -# This is mostly for files that can be found in multiple directories. -# Eg: We want all files named COPYING to be left untouched. - -EXCLUDE_ALL_LIST = ( - "COPYING", - "COPYING.LIB", - "configure", - "fdl.texi", - "gpl.texi", - "aclocal.m4", -) - -# The list of files to update by hand. -BY_HAND: list[str] = [ - # Nothing at the moment :-). -] - -# Files containing multiple copyright headers. This script is only -# fixing the first one it finds, so we need to finish the update -# by hand. -MULTIPLE_COPYRIGHT_HEADERS = ( - "gdb/doc/gdb.texinfo", - "gdb/doc/refcard.tex", - "gdb/syscalls/update-netbsd.sh", -) - -# The list of file which have a copyright, but not held by the FSF. -# Filenames are relative to the root directory. -NOT_FSF_LIST = ( + "sim/Makefile.in", + # The files below have a copyright, but not held by the FSF. "gdb/exc_request.defs", "gdb/gdbtk", "gdb/testsuite/gdb.gdbtk/", @@ -296,9 +265,27 @@ NOT_FSF_LIST = ( "sim/mips/sim-main.c", "sim/moxie/moxie-gdb.dts", # Not a single file in sim/ppc/ appears to be copyright FSF :-(. - "sim/ppc", + "sim/ppc/**", "sim/testsuite/mips/mips32-dsp2.s", -) +] + +# The list of files to update by hand. +# +# Entries are treated as glob patterns. +BY_HAND: list[str] = [ + # Nothing at the moment :-). +] + +# Files containing multiple copyright headers. This script is only +# fixing the first one it finds, so we need to finish the update +# by hand. +# +# Entries are treated as glob patterns. +MULTIPLE_COPYRIGHT_HEADERS = [ + "gdb/doc/gdb.texinfo", + "gdb/doc/refcard.tex", + "gdb/syscalls/update-netbsd.sh", +] if __name__ == "__main__": sys.exit(main(sys.argv[1:])) -- 2.49.0 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns 2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi @ 2025-04-11 13:52 ` Guinevere Larsen 2025-04-11 14:14 ` Simon Marchi 0 siblings, 1 reply; 6+ messages in thread From: Guinevere Larsen @ 2025-04-11 13:52 UTC (permalink / raw) To: Simon Marchi, gdb-patches On 4/9/25 11:49 AM, Simon Marchi wrote: > gdb/copyright.py currently changes some files that it shouldn't: > > - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does > change the files under that directory > - it is missing `sim/Makefile.in` > > Change the exclude list logic to use glob patterns. This makes it > easier to specify exclusions of full directories or files by basename, > while simplifying the code. > > Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason > to keep them separate (they are treated identically). I kept the > comment that explains that some files are excluded due to not being > FSF-licensed. > > Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob > patterns that match everywhere in the tree (e.g. `**/configure`). > > Tested by running the script on the parent commit of d01e823438c7 > ("Update copyright dates to include 2025") and diff'ing the result with > d01e823438c7. The only differences are: > > - the files that we don't want to modify (gnulib/import and > sim/Makefile.in) > - the files that need to be modified by hand > > Running the script on latest master produces no diff. Is there any reason why you turned all globals into lists instead of tuples? From what I can see in the script (and my understanding of it), there's no reason why the arrays should be mutable, so using the immutable tuple type would be better IMO. With that fixed, I think this is a good change, Reviewed-By: Guinevere Larsen <guinevere@redhat.com> > > Change-Id: I318dc3bff34e4b3a9b66ea305d0c3872f69cd072 > --- > gdb/copyright.py | 93 +++++++++++++++++++++--------------------------- > 1 file changed, 40 insertions(+), 53 deletions(-) > > diff --git a/gdb/copyright.py b/gdb/copyright.py > index 1c81358f1384..50902a8033d2 100755 > --- a/gdb/copyright.py > +++ b/gdb/copyright.py > @@ -36,6 +36,7 @@ import argparse > import locale > import os > import os.path > +import pathlib > import subprocess > import sys > from typing import Iterable > @@ -68,19 +69,15 @@ def get_update_list(): > .split("\0") > ) > > + full_exclude_list = EXCLUDE_LIST + BY_HAND > + > def include_file(filename: str): > - (dirname, basename) = os.path.split(filename) > - dirbasename = os.path.basename(dirname) > - return not ( > - basename in EXCLUDE_ALL_LIST > - or dirbasename in EXCLUDE_ALL_LIST > - or dirname in EXCLUDE_LIST > - or dirname in NOT_FSF_LIST > - or dirname in BY_HAND > - or filename in EXCLUDE_LIST > - or filename in NOT_FSF_LIST > - or filename in BY_HAND > - ) > + path = pathlib.Path(filename) > + for pattern in full_exclude_list: > + if path.full_match(pattern): > + return False > + > + return True > > return filter(include_file, result) > > @@ -212,8 +209,14 @@ def main(argv: list[str]) -> int | None: > # generated, non-FSF, or otherwise special (e.g. license text, > # or test cases which must be sensitive to line numbering). > # > -# Filenames are relative to the root directory. > -EXCLUDE_LIST = ( > +# Entries are treated as glob patterns. > +EXCLUDE_LIST = [ > + "**/aclocal.m4", > + "**/configure", > + "**/COPYING.LIB", > + "**/COPYING", > + "**/fdl.texi", > + "**/gpl.texi", > "gdb/copying.c", > "gdb/nat/glibc_thread_db.h", > "gdb/CONTRIBUTE", > @@ -221,45 +224,11 @@ EXCLUDE_LIST = ( > "gdbsupport/unordered_dense.h", > "gnulib/doc/gendocs_template", > "gnulib/doc/gendocs_template_min", > - "gnulib/import", > + "gnulib/import/**", > "gnulib/config.in", > "gnulib/Makefile.in", > -) > - > -# Files which should not be modified, either because they are > -# generated, non-FSF, or otherwise special (e.g. license text, > -# or test cases which must be sensitive to line numbering). > -# > -# Matches any file or directory name anywhere. Use with caution. > -# This is mostly for files that can be found in multiple directories. > -# Eg: We want all files named COPYING to be left untouched. > - > -EXCLUDE_ALL_LIST = ( > - "COPYING", > - "COPYING.LIB", > - "configure", > - "fdl.texi", > - "gpl.texi", > - "aclocal.m4", > -) > - > -# The list of files to update by hand. > -BY_HAND: list[str] = [ > - # Nothing at the moment :-). > -] > - > -# Files containing multiple copyright headers. This script is only > -# fixing the first one it finds, so we need to finish the update > -# by hand. > -MULTIPLE_COPYRIGHT_HEADERS = ( > - "gdb/doc/gdb.texinfo", > - "gdb/doc/refcard.tex", > - "gdb/syscalls/update-netbsd.sh", > -) > - > -# The list of file which have a copyright, but not held by the FSF. > -# Filenames are relative to the root directory. > -NOT_FSF_LIST = ( > + "sim/Makefile.in", > + # The files below have a copyright, but not held by the FSF. > "gdb/exc_request.defs", > "gdb/gdbtk", > "gdb/testsuite/gdb.gdbtk/", > @@ -296,9 +265,27 @@ NOT_FSF_LIST = ( > "sim/mips/sim-main.c", > "sim/moxie/moxie-gdb.dts", > # Not a single file in sim/ppc/ appears to be copyright FSF :-(. > - "sim/ppc", > + "sim/ppc/**", > "sim/testsuite/mips/mips32-dsp2.s", > -) > +] > + > +# The list of files to update by hand. > +# > +# Entries are treated as glob patterns. > +BY_HAND: list[str] = [ > + # Nothing at the moment :-). > +] > + > +# Files containing multiple copyright headers. This script is only > +# fixing the first one it finds, so we need to finish the update > +# by hand. > +# > +# Entries are treated as glob patterns. > +MULTIPLE_COPYRIGHT_HEADERS = [ > + "gdb/doc/gdb.texinfo", > + "gdb/doc/refcard.tex", > + "gdb/syscalls/update-netbsd.sh", > +] > > if __name__ == "__main__": > sys.exit(main(sys.argv[1:])) -- Cheers, Guinevere Larsen She/Her/Hers ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns 2025-04-11 13:52 ` Guinevere Larsen @ 2025-04-11 14:14 ` Simon Marchi 2025-04-11 14:45 ` Guinevere Larsen 0 siblings, 1 reply; 6+ messages in thread From: Simon Marchi @ 2025-04-11 14:14 UTC (permalink / raw) To: Guinevere Larsen, Simon Marchi, gdb-patches On 4/11/25 9:52 AM, Guinevere Larsen wrote: > On 4/9/25 11:49 AM, Simon Marchi wrote: >> gdb/copyright.py currently changes some files that it shouldn't: >> >> - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does >> change the files under that directory >> - it is missing `sim/Makefile.in` >> >> Change the exclude list logic to use glob patterns. This makes it >> easier to specify exclusions of full directories or files by basename, >> while simplifying the code. >> >> Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason >> to keep them separate (they are treated identically). I kept the >> comment that explains that some files are excluded due to not being >> FSF-licensed. >> >> Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob >> patterns that match everywhere in the tree (e.g. `**/configure`). >> >> Tested by running the script on the parent commit of d01e823438c7 >> ("Update copyright dates to include 2025") and diff'ing the result with >> d01e823438c7. The only differences are: >> >> - the files that we don't want to modify (gnulib/import and >> sim/Makefile.in) >> - the files that need to be modified by hand >> >> Running the script on latest master produces no diff. > > Is there any reason why you turned all globals into lists instead of tuples? > > From what I can see in the script (and my understanding of it), there's no reason why the arrays should be mutable, so using the immutable tuple type would be better IMO. BY_HAND being an empty tuple messes up the typing, so I made it a list[str] to say that if it contained something, it would be strings. And then because I'm adding EXCLUDE_LIST and BY_HAND, it's easier if EXCLUDE_LIST is also a list[str]. But I can work around that one by doing: full_exclude_list = list(EXCLUDE_LIST) + BY_HAND That all could be avoided if we just removed the BY_HAND thing, since it's unused at the moment. But I thought it would be cheap to keep it around just it case we need it in the future. Not sure which is better. Simon ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns 2025-04-11 14:14 ` Simon Marchi @ 2025-04-11 14:45 ` Guinevere Larsen 2025-04-11 15:28 ` Simon Marchi 0 siblings, 1 reply; 6+ messages in thread From: Guinevere Larsen @ 2025-04-11 14:45 UTC (permalink / raw) To: Simon Marchi, Simon Marchi, gdb-patches On 4/11/25 11:14 AM, Simon Marchi wrote: > On 4/11/25 9:52 AM, Guinevere Larsen wrote: >> On 4/9/25 11:49 AM, Simon Marchi wrote: >>> gdb/copyright.py currently changes some files that it shouldn't: >>> >>> - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does >>> change the files under that directory >>> - it is missing `sim/Makefile.in` >>> >>> Change the exclude list logic to use glob patterns. This makes it >>> easier to specify exclusions of full directories or files by basename, >>> while simplifying the code. >>> >>> Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason >>> to keep them separate (they are treated identically). I kept the >>> comment that explains that some files are excluded due to not being >>> FSF-licensed. >>> >>> Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob >>> patterns that match everywhere in the tree (e.g. `**/configure`). >>> >>> Tested by running the script on the parent commit of d01e823438c7 >>> ("Update copyright dates to include 2025") and diff'ing the result with >>> d01e823438c7. The only differences are: >>> >>> - the files that we don't want to modify (gnulib/import and >>> sim/Makefile.in) >>> - the files that need to be modified by hand >>> >>> Running the script on latest master produces no diff. >> Is there any reason why you turned all globals into lists instead of tuples? >> >> From what I can see in the script (and my understanding of it), there's no reason why the arrays should be mutable, so using the immutable tuple type would be better IMO. > BY_HAND being an empty tuple messes up the typing, so I made it a > list[str] to say that if it contained something, it would be strings. > And then because I'm adding EXCLUDE_LIST and BY_HAND, it's easier if > EXCLUDE_LIST is also a list[str]. But I can work around that one by > doing: > > full_exclude_list = list(EXCLUDE_LIST) + BY_HAND would flipping that expression to full_exclude_list = EXCLUDE_LIST + tuple(BY_HAND) also mess up the typing? If that's the case, I'm fine with keeping them as lists, it was a minor nit anyway > > That all could be avoided if we just removed the BY_HAND thing, since > it's unused at the moment. But I thought it would be cheap to keep it > around just it case we need it in the future. Not sure which is better. Yeah, I think recognizing that some files may need to be changed by hand, and keeping that easy is a good idea. > > Simon > -- Cheers, Guinevere Larsen She/Her/Hers ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns 2025-04-11 14:45 ` Guinevere Larsen @ 2025-04-11 15:28 ` Simon Marchi 0 siblings, 0 replies; 6+ messages in thread From: Simon Marchi @ 2025-04-11 15:28 UTC (permalink / raw) To: Guinevere Larsen, Simon Marchi, gdb-patches > would flipping that expression to > > full_exclude_list = EXCLUDE_LIST + tuple(BY_HAND) > > also mess up the typing? > > If that's the case, I'm fine with keeping them as lists, it was a minor nit anyway Your suggestion made me look at what the deduced type of `tuple(BY_HAND)` is, it is `tuple[str, ...]`. When I use that as the type for BY_HAND instead of `list[str]`, it all works fine. I sent a v2. Simon ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-04-11 15:29 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-04-09 14:49 [PATCH 1/2] gdb: make typing strict in gdb/copyright.py Simon Marchi 2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi 2025-04-11 13:52 ` Guinevere Larsen 2025-04-11 14:14 ` Simon Marchi 2025-04-11 14:45 ` Guinevere Larsen 2025-04-11 15:28 ` Simon Marchi
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox