* [PATCH 1/2] gdb: make typing strict in gdb/copyright.py
@ 2025-04-09 14:49 Simon Marchi
2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi
0 siblings, 1 reply; 6+ messages in thread
From: Simon Marchi @ 2025-04-09 14:49 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
Add `pyright: strict` at the top of the file, then adjust the fallouts.
This annotation is understood by pyright, and thus any IDE using pyright
behind the scenes (VSCode and probably others).
I presume that any GDB developer running this script is using a recent
enough version of Python, so specify the type annotations using the
actual types when possible (e.g. `list[str]` instead of
`typing.List[str]`). I believe this required Python 3.9.
Change-Id: I3698e28555e236a03126d4cd010dae4b5647ce48
---
gdb/copyright.py | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/gdb/copyright.py b/gdb/copyright.py
index 5ec9944aae20..1c81358f1384 100755
--- a/gdb/copyright.py
+++ b/gdb/copyright.py
@@ -30,13 +30,15 @@
#
# This removes the bulk of the changes which are most likely to be correct.
+# pyright: strict
+
import argparse
import locale
import os
import os.path
import subprocess
import sys
-from typing import List, Optional
+from typing import Iterable
def get_update_list():
@@ -66,7 +68,7 @@ def get_update_list():
.split("\0")
)
- def include_file(filename):
+ def include_file(filename: str):
(dirname, basename) = os.path.split(filename)
dirbasename = os.path.basename(dirname)
return not (
@@ -83,7 +85,7 @@ def get_update_list():
return filter(include_file, result)
-def update_files(update_list):
+def update_files(update_list: Iterable[str]):
"""Update the copyright header of the files in the given list.
We use gnulib's update-copyright script for that.
@@ -128,7 +130,7 @@ def update_files(update_list):
print("*** " + line)
-def may_have_copyright_notice(filename):
+def may_have_copyright_notice(filename: str):
"""Check that the given file does not seem to have a copyright notice.
The filename is relative to the root directory.
@@ -166,7 +168,7 @@ def get_parser() -> argparse.ArgumentParser:
return parser
-def main(argv: List[str]) -> Optional[int]:
+def main(argv: list[str]) -> int | None:
"""The main subprogram."""
parser = get_parser()
_ = parser.parse_args(argv)
@@ -242,9 +244,9 @@ EXCLUDE_ALL_LIST = (
)
# The list of files to update by hand.
-BY_HAND = (
+BY_HAND: list[str] = [
# Nothing at the moment :-).
-)
+]
# Files containing multiple copyright headers. This script is only
# fixing the first one it finds, so we need to finish the update
base-commit: 2bbe439e72916c7d61bf0b7a8c898426bfabc0ba
--
2.49.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns
2025-04-09 14:49 [PATCH 1/2] gdb: make typing strict in gdb/copyright.py Simon Marchi
@ 2025-04-09 14:49 ` Simon Marchi
2025-04-11 13:52 ` Guinevere Larsen
0 siblings, 1 reply; 6+ messages in thread
From: Simon Marchi @ 2025-04-09 14:49 UTC (permalink / raw)
To: gdb-patches; +Cc: Simon Marchi
gdb/copyright.py currently changes some files that it shouldn't:
- despite having a `gnulib/import` entry in EXCLUDE_LIST, it does
change the files under that directory
- it is missing `sim/Makefile.in`
Change the exclude list logic to use glob patterns. This makes it
easier to specify exclusions of full directories or files by basename,
while simplifying the code.
Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason
to keep them separate (they are treated identically). I kept the
comment that explains that some files are excluded due to not being
FSF-licensed.
Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob
patterns that match everywhere in the tree (e.g. `**/configure`).
Tested by running the script on the parent commit of d01e823438c7
("Update copyright dates to include 2025") and diff'ing the result with
d01e823438c7. The only differences are:
- the files that we don't want to modify (gnulib/import and
sim/Makefile.in)
- the files that need to be modified by hand
Running the script on latest master produces no diff.
Change-Id: I318dc3bff34e4b3a9b66ea305d0c3872f69cd072
---
gdb/copyright.py | 93 +++++++++++++++++++++---------------------------
1 file changed, 40 insertions(+), 53 deletions(-)
diff --git a/gdb/copyright.py b/gdb/copyright.py
index 1c81358f1384..50902a8033d2 100755
--- a/gdb/copyright.py
+++ b/gdb/copyright.py
@@ -36,6 +36,7 @@ import argparse
import locale
import os
import os.path
+import pathlib
import subprocess
import sys
from typing import Iterable
@@ -68,19 +69,15 @@ def get_update_list():
.split("\0")
)
+ full_exclude_list = EXCLUDE_LIST + BY_HAND
+
def include_file(filename: str):
- (dirname, basename) = os.path.split(filename)
- dirbasename = os.path.basename(dirname)
- return not (
- basename in EXCLUDE_ALL_LIST
- or dirbasename in EXCLUDE_ALL_LIST
- or dirname in EXCLUDE_LIST
- or dirname in NOT_FSF_LIST
- or dirname in BY_HAND
- or filename in EXCLUDE_LIST
- or filename in NOT_FSF_LIST
- or filename in BY_HAND
- )
+ path = pathlib.Path(filename)
+ for pattern in full_exclude_list:
+ if path.full_match(pattern):
+ return False
+
+ return True
return filter(include_file, result)
@@ -212,8 +209,14 @@ def main(argv: list[str]) -> int | None:
# generated, non-FSF, or otherwise special (e.g. license text,
# or test cases which must be sensitive to line numbering).
#
-# Filenames are relative to the root directory.
-EXCLUDE_LIST = (
+# Entries are treated as glob patterns.
+EXCLUDE_LIST = [
+ "**/aclocal.m4",
+ "**/configure",
+ "**/COPYING.LIB",
+ "**/COPYING",
+ "**/fdl.texi",
+ "**/gpl.texi",
"gdb/copying.c",
"gdb/nat/glibc_thread_db.h",
"gdb/CONTRIBUTE",
@@ -221,45 +224,11 @@ EXCLUDE_LIST = (
"gdbsupport/unordered_dense.h",
"gnulib/doc/gendocs_template",
"gnulib/doc/gendocs_template_min",
- "gnulib/import",
+ "gnulib/import/**",
"gnulib/config.in",
"gnulib/Makefile.in",
-)
-
-# Files which should not be modified, either because they are
-# generated, non-FSF, or otherwise special (e.g. license text,
-# or test cases which must be sensitive to line numbering).
-#
-# Matches any file or directory name anywhere. Use with caution.
-# This is mostly for files that can be found in multiple directories.
-# Eg: We want all files named COPYING to be left untouched.
-
-EXCLUDE_ALL_LIST = (
- "COPYING",
- "COPYING.LIB",
- "configure",
- "fdl.texi",
- "gpl.texi",
- "aclocal.m4",
-)
-
-# The list of files to update by hand.
-BY_HAND: list[str] = [
- # Nothing at the moment :-).
-]
-
-# Files containing multiple copyright headers. This script is only
-# fixing the first one it finds, so we need to finish the update
-# by hand.
-MULTIPLE_COPYRIGHT_HEADERS = (
- "gdb/doc/gdb.texinfo",
- "gdb/doc/refcard.tex",
- "gdb/syscalls/update-netbsd.sh",
-)
-
-# The list of file which have a copyright, but not held by the FSF.
-# Filenames are relative to the root directory.
-NOT_FSF_LIST = (
+ "sim/Makefile.in",
+ # The files below have a copyright, but not held by the FSF.
"gdb/exc_request.defs",
"gdb/gdbtk",
"gdb/testsuite/gdb.gdbtk/",
@@ -296,9 +265,27 @@ NOT_FSF_LIST = (
"sim/mips/sim-main.c",
"sim/moxie/moxie-gdb.dts",
# Not a single file in sim/ppc/ appears to be copyright FSF :-(.
- "sim/ppc",
+ "sim/ppc/**",
"sim/testsuite/mips/mips32-dsp2.s",
-)
+]
+
+# The list of files to update by hand.
+#
+# Entries are treated as glob patterns.
+BY_HAND: list[str] = [
+ # Nothing at the moment :-).
+]
+
+# Files containing multiple copyright headers. This script is only
+# fixing the first one it finds, so we need to finish the update
+# by hand.
+#
+# Entries are treated as glob patterns.
+MULTIPLE_COPYRIGHT_HEADERS = [
+ "gdb/doc/gdb.texinfo",
+ "gdb/doc/refcard.tex",
+ "gdb/syscalls/update-netbsd.sh",
+]
if __name__ == "__main__":
sys.exit(main(sys.argv[1:]))
--
2.49.0
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns
2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi
@ 2025-04-11 13:52 ` Guinevere Larsen
2025-04-11 14:14 ` Simon Marchi
0 siblings, 1 reply; 6+ messages in thread
From: Guinevere Larsen @ 2025-04-11 13:52 UTC (permalink / raw)
To: Simon Marchi, gdb-patches
On 4/9/25 11:49 AM, Simon Marchi wrote:
> gdb/copyright.py currently changes some files that it shouldn't:
>
> - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does
> change the files under that directory
> - it is missing `sim/Makefile.in`
>
> Change the exclude list logic to use glob patterns. This makes it
> easier to specify exclusions of full directories or files by basename,
> while simplifying the code.
>
> Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason
> to keep them separate (they are treated identically). I kept the
> comment that explains that some files are excluded due to not being
> FSF-licensed.
>
> Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob
> patterns that match everywhere in the tree (e.g. `**/configure`).
>
> Tested by running the script on the parent commit of d01e823438c7
> ("Update copyright dates to include 2025") and diff'ing the result with
> d01e823438c7. The only differences are:
>
> - the files that we don't want to modify (gnulib/import and
> sim/Makefile.in)
> - the files that need to be modified by hand
>
> Running the script on latest master produces no diff.
Is there any reason why you turned all globals into lists instead of tuples?
From what I can see in the script (and my understanding of it), there's
no reason why the arrays should be mutable, so using the immutable tuple
type would be better IMO.
With that fixed, I think this is a good change, Reviewed-By: Guinevere
Larsen <guinevere@redhat.com>
>
> Change-Id: I318dc3bff34e4b3a9b66ea305d0c3872f69cd072
> ---
> gdb/copyright.py | 93 +++++++++++++++++++++---------------------------
> 1 file changed, 40 insertions(+), 53 deletions(-)
>
> diff --git a/gdb/copyright.py b/gdb/copyright.py
> index 1c81358f1384..50902a8033d2 100755
> --- a/gdb/copyright.py
> +++ b/gdb/copyright.py
> @@ -36,6 +36,7 @@ import argparse
> import locale
> import os
> import os.path
> +import pathlib
> import subprocess
> import sys
> from typing import Iterable
> @@ -68,19 +69,15 @@ def get_update_list():
> .split("\0")
> )
>
> + full_exclude_list = EXCLUDE_LIST + BY_HAND
> +
> def include_file(filename: str):
> - (dirname, basename) = os.path.split(filename)
> - dirbasename = os.path.basename(dirname)
> - return not (
> - basename in EXCLUDE_ALL_LIST
> - or dirbasename in EXCLUDE_ALL_LIST
> - or dirname in EXCLUDE_LIST
> - or dirname in NOT_FSF_LIST
> - or dirname in BY_HAND
> - or filename in EXCLUDE_LIST
> - or filename in NOT_FSF_LIST
> - or filename in BY_HAND
> - )
> + path = pathlib.Path(filename)
> + for pattern in full_exclude_list:
> + if path.full_match(pattern):
> + return False
> +
> + return True
>
> return filter(include_file, result)
>
> @@ -212,8 +209,14 @@ def main(argv: list[str]) -> int | None:
> # generated, non-FSF, or otherwise special (e.g. license text,
> # or test cases which must be sensitive to line numbering).
> #
> -# Filenames are relative to the root directory.
> -EXCLUDE_LIST = (
> +# Entries are treated as glob patterns.
> +EXCLUDE_LIST = [
> + "**/aclocal.m4",
> + "**/configure",
> + "**/COPYING.LIB",
> + "**/COPYING",
> + "**/fdl.texi",
> + "**/gpl.texi",
> "gdb/copying.c",
> "gdb/nat/glibc_thread_db.h",
> "gdb/CONTRIBUTE",
> @@ -221,45 +224,11 @@ EXCLUDE_LIST = (
> "gdbsupport/unordered_dense.h",
> "gnulib/doc/gendocs_template",
> "gnulib/doc/gendocs_template_min",
> - "gnulib/import",
> + "gnulib/import/**",
> "gnulib/config.in",
> "gnulib/Makefile.in",
> -)
> -
> -# Files which should not be modified, either because they are
> -# generated, non-FSF, or otherwise special (e.g. license text,
> -# or test cases which must be sensitive to line numbering).
> -#
> -# Matches any file or directory name anywhere. Use with caution.
> -# This is mostly for files that can be found in multiple directories.
> -# Eg: We want all files named COPYING to be left untouched.
> -
> -EXCLUDE_ALL_LIST = (
> - "COPYING",
> - "COPYING.LIB",
> - "configure",
> - "fdl.texi",
> - "gpl.texi",
> - "aclocal.m4",
> -)
> -
> -# The list of files to update by hand.
> -BY_HAND: list[str] = [
> - # Nothing at the moment :-).
> -]
> -
> -# Files containing multiple copyright headers. This script is only
> -# fixing the first one it finds, so we need to finish the update
> -# by hand.
> -MULTIPLE_COPYRIGHT_HEADERS = (
> - "gdb/doc/gdb.texinfo",
> - "gdb/doc/refcard.tex",
> - "gdb/syscalls/update-netbsd.sh",
> -)
> -
> -# The list of file which have a copyright, but not held by the FSF.
> -# Filenames are relative to the root directory.
> -NOT_FSF_LIST = (
> + "sim/Makefile.in",
> + # The files below have a copyright, but not held by the FSF.
> "gdb/exc_request.defs",
> "gdb/gdbtk",
> "gdb/testsuite/gdb.gdbtk/",
> @@ -296,9 +265,27 @@ NOT_FSF_LIST = (
> "sim/mips/sim-main.c",
> "sim/moxie/moxie-gdb.dts",
> # Not a single file in sim/ppc/ appears to be copyright FSF :-(.
> - "sim/ppc",
> + "sim/ppc/**",
> "sim/testsuite/mips/mips32-dsp2.s",
> -)
> +]
> +
> +# The list of files to update by hand.
> +#
> +# Entries are treated as glob patterns.
> +BY_HAND: list[str] = [
> + # Nothing at the moment :-).
> +]
> +
> +# Files containing multiple copyright headers. This script is only
> +# fixing the first one it finds, so we need to finish the update
> +# by hand.
> +#
> +# Entries are treated as glob patterns.
> +MULTIPLE_COPYRIGHT_HEADERS = [
> + "gdb/doc/gdb.texinfo",
> + "gdb/doc/refcard.tex",
> + "gdb/syscalls/update-netbsd.sh",
> +]
>
> if __name__ == "__main__":
> sys.exit(main(sys.argv[1:]))
--
Cheers,
Guinevere Larsen
She/Her/Hers
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns
2025-04-11 13:52 ` Guinevere Larsen
@ 2025-04-11 14:14 ` Simon Marchi
2025-04-11 14:45 ` Guinevere Larsen
0 siblings, 1 reply; 6+ messages in thread
From: Simon Marchi @ 2025-04-11 14:14 UTC (permalink / raw)
To: Guinevere Larsen, Simon Marchi, gdb-patches
On 4/11/25 9:52 AM, Guinevere Larsen wrote:
> On 4/9/25 11:49 AM, Simon Marchi wrote:
>> gdb/copyright.py currently changes some files that it shouldn't:
>>
>> - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does
>> change the files under that directory
>> - it is missing `sim/Makefile.in`
>>
>> Change the exclude list logic to use glob patterns. This makes it
>> easier to specify exclusions of full directories or files by basename,
>> while simplifying the code.
>>
>> Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason
>> to keep them separate (they are treated identically). I kept the
>> comment that explains that some files are excluded due to not being
>> FSF-licensed.
>>
>> Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob
>> patterns that match everywhere in the tree (e.g. `**/configure`).
>>
>> Tested by running the script on the parent commit of d01e823438c7
>> ("Update copyright dates to include 2025") and diff'ing the result with
>> d01e823438c7. The only differences are:
>>
>> - the files that we don't want to modify (gnulib/import and
>> sim/Makefile.in)
>> - the files that need to be modified by hand
>>
>> Running the script on latest master produces no diff.
>
> Is there any reason why you turned all globals into lists instead of tuples?
>
> From what I can see in the script (and my understanding of it), there's no reason why the arrays should be mutable, so using the immutable tuple type would be better IMO.
BY_HAND being an empty tuple messes up the typing, so I made it a
list[str] to say that if it contained something, it would be strings.
And then because I'm adding EXCLUDE_LIST and BY_HAND, it's easier if
EXCLUDE_LIST is also a list[str]. But I can work around that one by
doing:
full_exclude_list = list(EXCLUDE_LIST) + BY_HAND
That all could be avoided if we just removed the BY_HAND thing, since
it's unused at the moment. But I thought it would be cheap to keep it
around just it case we need it in the future. Not sure which is better.
Simon
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns
2025-04-11 14:14 ` Simon Marchi
@ 2025-04-11 14:45 ` Guinevere Larsen
2025-04-11 15:28 ` Simon Marchi
0 siblings, 1 reply; 6+ messages in thread
From: Guinevere Larsen @ 2025-04-11 14:45 UTC (permalink / raw)
To: Simon Marchi, Simon Marchi, gdb-patches
On 4/11/25 11:14 AM, Simon Marchi wrote:
> On 4/11/25 9:52 AM, Guinevere Larsen wrote:
>> On 4/9/25 11:49 AM, Simon Marchi wrote:
>>> gdb/copyright.py currently changes some files that it shouldn't:
>>>
>>> - despite having a `gnulib/import` entry in EXCLUDE_LIST, it does
>>> change the files under that directory
>>> - it is missing `sim/Makefile.in`
>>>
>>> Change the exclude list logic to use glob patterns. This makes it
>>> easier to specify exclusions of full directories or files by basename,
>>> while simplifying the code.
>>>
>>> Merge EXCLUDE_LIST and NOT_FSF_LIST, since there's no fundamental reason
>>> to keep them separate (they are treated identically). I kept the
>>> comment that explains that some files are excluded due to not being
>>> FSF-licensed.
>>>
>>> Merge EXCLUDE_ALL_LIST in EXCLUDE_LIST, converting the entries to glob
>>> patterns that match everywhere in the tree (e.g. `**/configure`).
>>>
>>> Tested by running the script on the parent commit of d01e823438c7
>>> ("Update copyright dates to include 2025") and diff'ing the result with
>>> d01e823438c7. The only differences are:
>>>
>>> - the files that we don't want to modify (gnulib/import and
>>> sim/Makefile.in)
>>> - the files that need to be modified by hand
>>>
>>> Running the script on latest master produces no diff.
>> Is there any reason why you turned all globals into lists instead of tuples?
>>
>> From what I can see in the script (and my understanding of it), there's no reason why the arrays should be mutable, so using the immutable tuple type would be better IMO.
> BY_HAND being an empty tuple messes up the typing, so I made it a
> list[str] to say that if it contained something, it would be strings.
> And then because I'm adding EXCLUDE_LIST and BY_HAND, it's easier if
> EXCLUDE_LIST is also a list[str]. But I can work around that one by
> doing:
>
> full_exclude_list = list(EXCLUDE_LIST) + BY_HAND
would flipping that expression to
full_exclude_list = EXCLUDE_LIST + tuple(BY_HAND)
also mess up the typing?
If that's the case, I'm fine with keeping them as lists, it was a minor
nit anyway
>
> That all could be avoided if we just removed the BY_HAND thing, since
> it's unused at the moment. But I thought it would be cheap to keep it
> around just it case we need it in the future. Not sure which is better.
Yeah, I think recognizing that some files may need to be changed by
hand, and keeping that easy is a good idea.
>
> Simon
>
--
Cheers,
Guinevere Larsen
She/Her/Hers
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns
2025-04-11 14:45 ` Guinevere Larsen
@ 2025-04-11 15:28 ` Simon Marchi
0 siblings, 0 replies; 6+ messages in thread
From: Simon Marchi @ 2025-04-11 15:28 UTC (permalink / raw)
To: Guinevere Larsen, Simon Marchi, gdb-patches
> would flipping that expression to
>
> full_exclude_list = EXCLUDE_LIST + tuple(BY_HAND)
>
> also mess up the typing?
>
> If that's the case, I'm fine with keeping them as lists, it was a minor nit anyway
Your suggestion made me look at what the deduced type of
`tuple(BY_HAND)` is, it is `tuple[str, ...]`. When I use that as the
type for BY_HAND instead of `list[str]`, it all works fine.
I sent a v2.
Simon
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-04-11 15:29 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-09 14:49 [PATCH 1/2] gdb: make typing strict in gdb/copyright.py Simon Marchi
2025-04-09 14:49 ` [PATCH 2/2] gdb: fix bugs in gdb/copyright.py, make it use glob patterns Simon Marchi
2025-04-11 13:52 ` Guinevere Larsen
2025-04-11 14:14 ` Simon Marchi
2025-04-11 14:45 ` Guinevere Larsen
2025-04-11 15:28 ` Simon Marchi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox