On 01/09/2012 03:43 PM, Ulrich Weigand wrote:
> Pedro Alves wrote:
>> On 01/05/2012 07:53 PM, Ulrich Weigand wrote:
>>> Well, I guess one could implement some heuristics along those lines
>>> that do indeed work with Linux native and gdbserver targets.
>>> But this would still make a number of assumptions about how those
>>> fields are used -- which may be true at the moment, but not
>>> guaranteed.
>>>
>>> In particular, it seems to me that the remote protocol specification
>>> considers TID values to be defined by the remote side; the host side
>>> should only use them as identifiers without interpreting them.
>>
>> Yes, that is true in general.
>>
>>> While gdbserver does use the LWPID here, I don't think it is
>>> guaranteed that other implementations of the remote protocol do the
>>> same, even on Linux targets (i.e. where linux-tdep files get used).
>>
>> ... however, let's revisit the problem with a fresh start.
>>
>> We have a command "info proc FOO PID", that accepts any target
>> process ID, whose description is:
>>
>>   (gdb) help info proc
>>   Show /proc process information about any running process.
>>   Specify any process id, or use the program being debugged by default.
>>
>> Let's ignore the "or use the program being debugged by default" part
>> for now; it is a (big) convenience, and a separate problem, one of
>> mapping the program being debugged to a process id.  (The dropping
>> of the support for specifying any process id in the proposed
>> implementations seems more like accommodating the implementation, so
>> let's step back on that too.)

> It seems to me we're getting to the core of the problem here.  For
> the existing "info proc" command, it indeed makes sense to talk about
> "process IDs", *because* the command is specific to a (native) target
> that supports processes and uses process IDs.
>
> Except for this, common GDB user interface commands do not currently
> refer to "process IDs"; while GDB obviously uses "ptid_t" internally,
> it is *interpreted* only by target code; common code at most uses it
> for comparisons (is this the same process as another one).
>
> [ The one obvious exception is the "attach" command, which also takes
> a process ID.  However, this is arguably also a target-specific
> command, in that the generic implementation quickly calls through to
> target code that implements the operation, starting with *parsing
> the argument*. ]
>
> Instead of referring to process IDs, GDB commands usually operate on
> the "current inferior", or else take GDB inferior (or thread) IDs as
> argument, which have no direct connection to the underlying OS
> process IDs.
>
> It seems to me this was a deliberate decision to try and isolate the
> common GDB code and user interfaces from OS implementation details:
> GDB is supposed to work just the same on targets where there are no
> processes, or where they are identified by different means than just
> an integer ID.
>
>
> Now, given the "anomaly" of the "info proc" command as is, I guess
> there are two ways to look at how to move forward:
>
> - We could say the anomaly was a mistake that we want to get rid of.
>   In this view, it naturally follows that we remove the PID argument
>   option and have the command operate on the "current inferior" as
>   all other commands do.  This also makes an underlying interface
>   that would retrieve proc file content of the current inferior
>   natural.  (This is what my patch set has implemented.)
>
> - We say instead that, yes, we *want* OS PIDs to be used as first-
>   class user interface elements.  But this means to me that we need
>   to more generally make OS PIDs present in GDB common code, so that
>   common code is at least able to associate its internal notions of
>   "inferior" with those IDs.
>   At a minimum, this would imply we no longer consider "ptid_t"
>   values to be opaque to GDB common code, but rather enforce that
>   they agree with user-visible OS PID values.

Note that I'm not talking about letting the notion of a "process id"
escape to common/core code, but only to the Linux gdbarch/osabi
specific bits, or to other code that can deal with it, guarded by an
"is there a notion of a process id on this target?" check.

> Now, I'm not necessarily opposed to the second approach, I'm just
> not sure I see all the consequences ...  At a minimum, the use
> of the "magic 42000" becomes problematic if there is any chance
> the ptid_t gets exposed to the user at any point.
>
>> So this sub-problem can be simply specified as:
>>
>>   Given a gdb inferior or thread, return its target process id.
>>
>> Now, we can come up with a new method to retrieve that info from
>> the server.  However, all the Linux targets I know about that have
>> /proc contents access (gdbserver) have a 1:1 mapping in the RSP
>> between RSP process and target process, so we might as well start
>> out with defaulting to assume that, and add such a mechanism when
>> we find a need for it.  IMO.
>
> I don't quite understand what you mean here.  Could you elaborate
> how you propose to implement this routine "return the target process
> ID of a given GDB inferior/thread" without remote interface changes?
> This was exactly the "magic 42000" problem I was running into ...

As mentioned before, by broadcasting support for the multi-process
extensions even with "target remote" (while still not allowing
debugging of multiple processes), and by defaulting to assume a 1:1
mapping between the target process id and the RSP process id, until
some target needs to do something else, at which point we define a
new packet.

I've spent a bit of time today prototyping /proc access the way I was
imagining it.  See the attached patch series.  This is far from
complete; it's just enough to get something working.
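[For illustration only, here's how the 1:1 assumption could look as
code.  This is a stand-alone toy model, not GDB's actual ptid_t or
target vector; all names below are made up for the sketch.]

```c
#include <stddef.h>

/* Toy model of GDB's ptid_t; the field names are illustrative.  */
struct toy_ptid
{
  int pid;   /* Process id, as reported over the RSP.  */
  long lwp;  /* Lightweight process (kernel thread) id.  */
  long tid;  /* Target-defined thread id.  */
};

/* Sketch of "given a thread, return its target (OS) process id",
   under the assumption discussed above: the RSP process id and the
   OS process id map 1:1, so no new remote packet is needed.  A stub
   where this assumption fails would need a dedicated packet, and
   this helper would become a target/gdbarch hook instead.  */
static int
toy_target_os_pid (struct toy_ptid ptid)
{
  /* 1:1 mapping assumed: the RSP pid *is* the OS pid.  */
  return ptid.pid;
}
```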
>
>>> So all in all, this still looks like a violation of abstraction
>>> layers to me,
>>
>> So I see this as two distinct, but related, problems.  And when I
>> look at the /proc retrieval part alone, I favor the remote
>> filesystem access, as the most natural way to transparently make
>> the command work with remote targets.  And for cores too: as you
>> yourself mentioned, the kernel can put /proc contents in cores, and
>> I could imagine taking a snapshot of /proc for the whole process
>> tree (either above a given process (children, siblings, etc.), or
>> starting at PID 1), and storing that in, or along with, the core
>> could be something useful.
>
> I could imagine a core file of a process containing snapshots of
> /proc files *for that process*.  It would seem rather odd to me to
> have /proc files of *other* processes as part of a core file.  But
> that's certainly a side discussion ...
>
>>> which is the main reason why I advocated going back to the
>>> TARGET_OBJECT_PROC approach ...
>>
>> If you're still not convinced, and others don't support my view,
>> then I will step back and won't block your going forward with that.
>> However, what did you think of my earlier comments regarding
>> splitting that into separate objects?
>
> I must admit I don't see what the benefit of this is supposed to be.
> This seems to me to be the exact use case that "annex" is there to
> cover: a bunch of information with related content semantics, which
> are all accessed the same way, and whose exact set is somewhat
> dynamic.  Not using the annex would mean defining a whole bunch of
> new packet types, duplicated boilerplate code in GDB and gdbserver
> to hook them up, and then still the drawback that every new /proc
> file that may come up in the future will require extra code
> (including new gdbserver-side code!) to support.  And for all those
> drawbacks, I don't see any single benefit ...  Maybe you can
> elaborate?
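[For the record, the annex-based scheme being described could be
sketched like this.  This is a hypothetical stand-alone illustration,
not the actual gdbserver code: one object type, with the annex
selecting which /proc file is read, so that exposing a new file only
means adding a table entry rather than defining a new packet type.]

```c
#include <stdio.h>
#include <string.h>

/* Annexes this hypothetical stub exposes under one object type.  */
static const char *const known_annexes[] =
  { "cmdline", "exe", "maps", "stat", "status" };

/* Build "/proc/PID/ANNEX" into BUF if ANNEX is one we support.
   Returns 0 on success, -1 for an unknown annex (which a real stub
   would report back as an error reply).  */
static int
proc_annex_to_path (int pid, const char *annex, char *buf, size_t len)
{
  size_t i;

  for (i = 0; i < sizeof known_annexes / sizeof known_annexes[0]; i++)
    if (strcmp (annex, known_annexes[i]) == 0)
      {
        snprintf (buf, len, "/proc/%d/%s", pid, annex);
        return 0;
      }
  return -1;
}
```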
- Decoupling the objects in question from a "/proc" idea, so they can
  be more generally used in other scenarios, e.g., a remote protocol
  implementation of target_pid_to_str (TARGET_OBJECT_PROC/exe).

- Letting GDB have a finer-grained idea of what subset of /proc-ish
  objects is supported upfront (through qSupported), without any new
  mechanism.

> Bye,
> Ulrich
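[P.S. on the qSupported point above: checking a stub's per-object
support upfront could look roughly like this.  A minimal sketch, not
GDB's actual qSupported parser, and the "qXfer:proc-*:read" feature
names are hypothetical, chosen only to mirror the usual
"name+"/";"-separated reply syntax.]

```c
#include <string.h>

/* Return 1 if FEATURE is advertised as supported ("FEATURE+") in a
   semicolon-separated qSupported-style REPLY, else 0.  */
static int
stub_supports (const char *reply, const char *feature)
{
  size_t flen = strlen (feature);
  const char *p = reply;

  while ((p = strstr (p, feature)) != NULL)
    {
      /* A match only counts at the start of a ';'-delimited entry,
         followed by the '+' marker.  */
      int at_start = (p == reply || p[-1] == ';');

      if (at_start
          && p[flen] == '+'
          && (p[flen + 1] == ';' || p[flen + 1] == '\0'))
        return 1;
      p += 1;
    }
  return 0;
}
```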