Mirror of the gdb mailing list
* [RFC] Allowing GDB to use a more recent version of Python at runtime than it was compiled with
@ 2025-05-29 16:14 Matthieu Longo via Gdb
  2025-05-29 17:42 ` Tom Tromey
  0 siblings, 1 reply; 3+ messages in thread
From: Matthieu Longo via Gdb @ 2025-05-29 16:14 UTC
  To: gdb
  Cc: Tamar Christina, Andre Simoes Dias Vieira, Tom Tromey,
	Simon Marchi, Luis Machado, Andrew Burgess

## End goal

Allowing GDB to use a more recent version of Python at runtime than it 
was compiled with.

## Arm's context

We have a CI that builds release artifacts of the Arm GNU toolchains
(GDB is shipped along with the toolchain) on the Linux distribution with
the oldest glibc version among all the distributions that we want to
support.

The rationale for using the distribution with the oldest glibc version 
is to leverage the ABI backward compatibility of glibc. If we build a C 
binary on the old distribution, we expect it to work on more recent 
distributions.
Same thing for C++ artifacts. See 
https://gcc.gnu.org/onlinedocs/libstdc++/manual/abi.html for more 
information.

However, CPython does not provide the same backward compatibility
guarantees as glibc or libstdc++ do. This means that a GDB compiled
against an old CPython won't run with a more recent CPython due to ABI
divergences between the Python versions [1].

This situation obliges us to provide GDB with two flavors, one with 
Python support and one without (in case there is a conflicting version 
of Python already installed on the system).

## General packaging difficulties

Usually, Linux distributions embed a default Python version and don't
change it much over the lifetime of a release. Providing a GDB that
works with each distribution's Python means that we would have to
compile and bundle a GDB for each of the supported distributions. Given
the number of distributions out there, this approach does not look very
sustainable for us (even using Docker [2]).

Other platforms, like Windows, don't ship Python by default. This means
that we cannot know the version of Python that the user has installed or
will install on their machine.

## How does CPython propose to tackle this issue?

PEP 384 – Defining a Stable ABI [3] and PEP 652 – Maintaining the Stable 
ABI [4] defined a Limited API and Stable ABI, which allow extenders and 
embedders of CPython to compile extension modules that are 
binary-compatible with any subsequent 3.x version.

More details about the current state of this limited API are available
at https://docs.python.org/3/c-api/stable.html

### Experiment: build the master branch of GDB with the Python limited API

I added `#define Py_LIMITED_API 0x03020000` (Python 3.2: introduction
of the limited API) before `#include <Python.h>` in
gdb/python/python-internal.h, and tried to recompile GDB to get a
rough idea of how many incompatibilities there are.

I got errors about Py_buffer and PyBuffer_Release, which have been part
of the Stable ABI (including all members of Py_buffer) only since
version 3.11, so I changed Py_LIMITED_API to 0x030b0000:
https://docs.python.org/3/c-api/buffer.html#c.Py_buffer
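
As a side note on the two values above: Py_LIMITED_API uses the same
layout as PY_VERSION_HEX, with the major version in the top byte and the
minor version in the next one. The following is only a small sketch of
that encoding; the helper names `limited_api_hex` and
`runtime_satisfies` are made up for this illustration and are not part
of CPython or GDB.

```python
import sys


def limited_api_hex(major, minor):
    """Encode a major.minor version the way PY_VERSION_HEX does.

    Hypothetical helper: the micro, release level and serial fields are
    left at zero, which is how the two values in this mail are written.
    """
    return (major << 24) | (minor << 16)


def runtime_satisfies(limited_api, hexversion=sys.hexversion):
    """Return True if an interpreter is at least as new as the target.

    sys.hexversion is the runtime counterpart of PY_VERSION_HEX.
    """
    return hexversion >= limited_api


if __name__ == "__main__":
    print(hex(limited_api_hex(3, 2)))   # first attempt in the experiment
    print(hex(limited_api_hex(3, 11)))  # value needed for Py_buffer
    print(runtime_satisfies(limited_api_hex(3, 11)))
```

A binary built with the 3.11 value is expected to load on 3.11 and
later, which is what the second helper checks for the running
interpreter.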

All the remaining errors seem to be related to PyTypeObject:
https://docs.python.org/3/c-api/type.html#c.PyTypeObject
The Python doc says:
> Part of the Limited API (as an opaque struct).

Except that GDB uses it as a non-opaque structure.

### Experiment summary

At first glance:
- The main bottleneck in making GDB compliant with the Python limited
API is PyTypeObject, which is used in a lot of places (41 .c files out
of 51 in gdb/python).
- The secondary issue, which only matters for supporting versions older
than 3.11, is Py_buffer and related functions. Pragmatically, this does
not look like an issue we want to tackle unless we absolutely want to
support versions older than 3.11. For information, Python 3.11 should
reach its EOL in 2027 [5].

There might be more issues as the build process stopped and didn't go 
over all the files in gdb/python.

Regarding the first issue, from what I could understand of the usages of
PyTypeObject in the GDB code base, they seem to be used to declare
"static types" to Python, i.e. to expose C data structures to Python.
More information at
https://docs.python.org/3/c-api/typeobj.html#static-types

Making the usage of PyTypeObject opaque would consist of converting each
of those static PyTypeObject definitions to an equivalent PyType_Spec
with an array of PyType_Slot, and then creating the type object at
runtime with PyType_FromSpec() instead of initializing a static struct
with PyType_Ready(). The resulting object still has to be added to the
module; note that PyModule_AddObject() is "soft deprecated" since
Python 3.13.
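
One consequence of such a conversion is visible from the Python side:
types created with PyType_FromSpec() are heap types, whereas static
types are not. The sketch below only illustrates that distinction using
built-in types; it does not touch GDB's types. The flag value is copied
from CPython's headers, and `Demo`, `is_heap_type` and
`can_set_attribute` are hypothetical names used for illustration.

```python
# Value of Py_TPFLAGS_HEAPTYPE in CPython's Include/object.h.
Py_TPFLAGS_HEAPTYPE = 1 << 9


def is_heap_type(tp):
    """Return True if the type object was allocated on the heap."""
    return bool(tp.__flags__ & Py_TPFLAGS_HEAPTYPE)


def can_set_attribute(tp, name="_probe"):
    """Return True if an attribute can be set on the type itself."""
    try:
        setattr(tp, name, None)
    except TypeError:
        return False
    delattr(tp, name)
    return True


class Demo:
    """Stand-in heap type; a class statement also creates a heap type."""


if __name__ == "__main__":
    for tp in (int, Demo):
        print(tp.__name__, is_heap_type(tp), can_set_attribute(tp))
```

Heap types accept new attributes by default, while static types such as
`int` reject them. Since Python 3.10, a spec can request the stricter
behaviour with Py_TPFLAGS_IMMUTABLETYPE, which may be worth keeping in
mind when checking that a converted type behaves as before.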

The Python extensions seem to have a reasonable test coverage (tests in 
gdb/testsuite/gdb.python), but I don't have any information on its 
completeness. The testsuite should be a good enough safety net to at
least validate the approach, but might require new tests to make sure no
regressions are introduced.

Given the data above, and after reading an example of such a transition 
to the limited Python ABI in Qt [6], making GDB use the limited C API 
does not seem out of reach.

## Shared library naming issue in CPython

For several years, the CPython project seems to have had an issue with
the generation of libpython3.so, the expected name of the Python shared
library exposing the Python limited C API. The file is empty and doesn't
contain any symbols.

Ticket: https://github.com/python/cpython/issues/104612 (status: OPEN)

From what I could understand, libpython3.so is supposed to expose only
the symbols of the Python limited C API, and to have a runtime
dependency on libpython3.x.so. However, this has never been implemented
and, in the meantime, someone decided to make it an empty file. Since
nobody followed up on this, the Python build still generates an empty
file.

So packagers (those who cared) came up with a workaround for people
trying to use the Python limited C API, i.e. with a DT_NEEDED reference
in the binary set to libpython3.so. They either created a symbolic link
from libpython3.so to libpython3.x.so, or copied libpython3.x.so and
renamed the copy.

With this solution, the limited C API is not isolated from the whole
(unstable) Python C API, but a program using only the limited C API
should work in the same way.
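
To tell which of these situations a given system is in, one can try to
load the library by name and look up one stable-ABI symbol. This is only
a rough probe written under the assumptions above; `probe_library` and
its return strings are hypothetical, and Py_IsInitialized is simply used
as a representative stable-ABI function.

```python
import ctypes


def probe_library(name="libpython3.so", symbol="Py_IsInitialized"):
    """Classify a shared library as 'unloadable', 'no-symbol' or 'usable'.

    The library is only loaded and inspected; nothing in it is called.
    """
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        # Not installed, not on the loader search path, or not loadable.
        return "unloadable"
    try:
        getattr(lib, symbol)
    except AttributeError:
        # The library loads but does not provide the requested symbol.
        return "no-symbol"
    return "usable"


if __name__ == "__main__":
    print("libpython3.so:", probe_library())
```

Symbol lookup on a loaded library also searches its dependencies, so a
libpython3.so that merely depends on libpython3.x.so would be reported
as usable, just like a symbolic link or a renamed copy.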

### How does it impact GDB?

Let's assume that GDB can be compiled using only the Python limited C API.

Following the logic of PEP 384, linking against the Python limited C API
means linking against libpython3.so, not libpython3.x.so.
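
To see which of the two names a built binary actually requests, one can
read its DT_NEEDED entries. The sketch below parses the text printed by
`readelf -d` (binutils); it assumes the usual English output format, and
the function names and the sample text are hypothetical, made up for
this illustration.

```python
import re
import subprocess

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libc.so.6]
_NEEDED_RE = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output):
    """Extract the DT_NEEDED library names from `readelf -d` output."""
    return _NEEDED_RE.findall(readelf_output)


def python_dependency(needed):
    """Return the first libpython3* entry, or None if there is none."""
    for name in needed:
        if name.startswith("libpython3"):
            return name
    return None


def read_dynamic_section(path):
    """Run `readelf -d` on a binary and return its textual output."""
    result = subprocess.run(
        ["readelf", "-d", path], check=True, capture_output=True, text=True
    )
    return result.stdout


# Illustrative sample only, not taken from a real GDB build.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libpython3.12.so.1.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
"""

if __name__ == "__main__":
    print(python_dependency(needed_libraries(SAMPLE)))
```

A result of libpython3.so means the binary follows the PEP 384 naming;
a versioned name such as the one in the sample means it is still tied to
one minor release at load time.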

In the current state of Python packaging, this creates issues at
different stages:
1. Most packagers don't expose a symbolic link from libpython3.so to
libpython3.x.so, or a copy of libpython3.x.so named libpython3.so.
2. If it is not exposed in the build environment, we can still link
against libpython3.x.so, and manually patch the reference in DT_NEEDED
before bundling the artifact in the distributed archive.
```
   patchelf --replace-needed libpython3.12.so.1.0 libpython3.so gdb
```
3. If the runtime environment does not have libpython3.so, the user
needs to create the symbolic link themselves.
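
For the last step, creating the link amounts to the following. This is a
minimal sketch: the function name, the directory and the versioned file
name are assumptions for illustration, and a real system would normally
do this with the distribution's own tooling and the appropriate
permissions.

```python
import os


def ensure_stable_abi_link(lib_dir, versioned_name, link_name="libpython3.so"):
    """Create lib_dir/link_name as a symlink to versioned_name if absent.

    Returns the path of the link. An existing file or link is left
    untouched, so an empty libpython3.so shipped by the Python build
    would have to be handled separately.
    """
    target = os.path.join(lib_dir, versioned_name)
    link = os.path.join(lib_dir, link_name)
    if not os.path.exists(target):
        raise FileNotFoundError(target)
    if os.path.lexists(link):
        return link
    # Relative link, so the directory can be relocated as a whole.
    os.symlink(versioned_name, link)
    return link


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        # Empty placeholder standing in for the real shared library.
        open(os.path.join(tmp, "libpython3.12.so.1.0"), "wb").close()
        link = ensure_stable_abi_link(tmp, "libpython3.12.so.1.0")
        print(link, "->", os.readlink(link))
```

The link only helps if its directory is on the dynamic loader's search
path at the time GDB starts.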

### A glimmer of hope from some packagers

Homebrew (on Linux; I didn't test on macOS) exposes a symbolic link
libpython3.so (instead of the actual shared library for the limited C
API) which points to libpython3.x.so.

There are probably other good examples out there that I am not aware of.

## Conclusion

Clearly, today the user experience of GDB with Python enabled can vary a 
lot depending on how Python was installed on the end system, and how GDB 
was built by the packagers.

Allowing GDB to use a more recent version of Python at runtime than it
was compiled with would improve both the user experience and the
packagers' experience.

From my understanding, this goal can be achieved thanks to the two
following conditions:
- one internal to GDB: adapting GDB to use the Python limited C API.
- one external to GDB: lobbying the packagers to provide a default 
Python installation with libpython3.so so that a binary using the Python 
limited C API is supported without change to the build and runtime 
environment.

## RFC

Please let me know your thoughts on the above.
On the recommendation of Luis Machado, I added Andrew Burgess, Tom
Tromey and Simon Marchi in CC. Feel free to add anyone you think is
relevant to the loop.

I am particularly interested in:
- any experience using the Python limited C API.
- stories of transition to the Python limited C API, and any pain point 
you met on the road.
- testing. I have had a look at the test coverage of the Python GDB API,
and it seems OK at first glance. Please let me know if you can see any
difficulty with the current coverage.
- any performance impact caused by this transition that I should be
aware of. How did you measure it? How critical is performance with
regard to the Python limited C API?
- implementation details in GDB where you think that we might face 
problems with this approach.


## References

[1] This is true unless the program uses the Python limited C API, and
the name of the shared library in DT_NEEDED matches libpython3.so.

[2] This topic was already raised on the GDB mailing list in the past:
https://inbox.sourceware.org/gdb/0aa701db281b$b8ac9df0$2a05d9d0$@symas.com/
The proposed "right way" of packaging GDB consists of building GDB for
every supported distribution against the default Python version of that
distribution. This approach can certainly be made easier using Docker
instead of physical machines.

[3] PEP 384 – Defining a Stable ABI, https://peps.python.org/pep-0384/

[4] PEP 652 – Maintaining the Stable ABI, https://peps.python.org/pep-0652/

[5] Status of Python versions, https://devguide.python.org/versions/

[6] Qt: The Transition To The Limited Python API (PEP384), 
https://doc.qt.io/qtforpython-6/developer/limited_api.html

