From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jim Blandy
To: "Ben Combee"
Cc: ,
Subject: Re: IA32: printing FP register variables
Date: Fri, 09 Jul 1999 13:50:00 -0000
Message-id:
References: <199907091724.SAA31114@phal.cygnus.co.uk> <00d401beca31$3d752c10$3404010a@metrowerks.com>
X-SW-Source: 1999-q3/msg00030.html

I'm sorry, Ben, but I'm still confused about how you find the FP
register containing a variable.  I think there's a fundamental,
difficult problem here which you don't seem to be mentioning at all.
Perhaps you have solved it already, and assume we have too.

In code generated by your compiler, is the value of TOP (the three-bit
FPU stack pointer) at function entry known at compile time?  Or does
its value depend on the caller, and the caller's caller, etc.?

For a function like this:

    double
    dot_product (int n, double *a, double *b)
    {
      int i;
      double sum = 0;

      for (i = 0; i < n; i++)
        sum += a[i] * b[i];

      return sum;
    }

EGCS generates code like this:

    00000000 <dot_product>:
       0:   55                      pushl  %ebp
       1:   31 c0                   xorl   %eax,%eax
       3:   89 e5                   movl   %esp,%ebp
       5:   53                      pushl  %ebx
       6:   8b 5d 08                movl   0x8(%ebp),%ebx
       9:   d9 ee                   fldz
       b:   8b 4d 0c                movl   0xc(%ebp),%ecx
       e:   8b 55 10                movl   0x10(%ebp),%edx
      11:   39 d8                   cmpl   %ebx,%eax
      13:   7d 18                   jnl    2d
      15:   8d 74 26 00             leal   0x0(%esi,1),%esi
      19:   8d bc 27 00 00          leal   0x0(%edi,1),%edi
      1e:   00 00
      20:   dd 04 c1                fldl   (%ecx,%eax,8)
      23:   dc 0c c2                fmull  (%edx,%eax,8)
      26:   40                      incl   %eax
      27:   39 d8                   cmpl   %ebx,%eax
      29:   de c1                   faddp  %st,%st(1)
      2b:   7c f3                   jl     20
      2d:   5b                      popl   %ebx
      2e:   89 ec                   movl   %ebp,%esp
      30:   5d                      popl   %ebp
      31:   c3                      ret
      32:   8d b4 26 00 00          leal   0x0(%esi,1),%esi
      37:   00 00
      39:   8d bc 27 00 00          leal   0x0(%edi,1),%edi
      3e:   00 00

(The bizarre nop leal instructions are generated by the assembler to
get the loop aligned right.  Ignore them.)

The `fldz' at 9 pushes the value of `sum' on the FP stack.  That stack
register is where `sum' lives.  However, since we don't know the value
of TOP upon function entry, we don't know which physical register that is.
So GCC can't use physical FP register numbers to communicate the
variable's location to GDB.

At different points in the function, the stack will have different
depths.  From b to 20, `sum' is in ST(0).  But at 20 we push a[i] on
the stack, so `sum' is now in ST(1).  At 29 we add the product into
`sum', and pop it off the stack, so from 2b to 31, `sum' is in ST(0)
again.

I should be able to step through this function, one instruction at a
time, and print `sum' correctly at each point.  But whether GDB should
fetch ST(0) or ST(1) depends on where I am in the code.

Note that there is no "frame pointer" for the FP stack.  If I hit a
breakpoint in the middle of the function, GDB has no way of knowing
what TOP was at function entry.  GDB could figure this out by
disassembling the instruction stream and looking for FP instructions,
but that's a pain.

GCC always knows how to find the variables, of course, but it doesn't
include its knowledge in the debug info.

>From jimb@cygnus.com Fri Jul 09 13:52:00 1999
From: Jim Blandy
To: "Ben Combee"
Cc: ,
Subject: Re: IA32: printing FP register variables
Date: Fri, 09 Jul 1999 13:52:00 -0000
Message-id:
References: <199907090356.WAA01337@zwingli.cygnus.com> <000d01bec9c2$06f4fdb0$3404010a@metrowerks.com> <014f01beca41$c42b4e50$3404010a@metrowerks.com>
X-SW-Source: 1999-q3/msg00031.html
Content-length: 384

> At each instruction, the invariant that a is 0 from the bottom and b is 1
> from the bottom holds.  Note, the bottom of the stack can be known by the
> debugger via scanning the FPU tag word from the TOP element looking for a FP
> register that is empty.  All this info is provided by an FSAVE or FSTENV.

That could work, but what if you happen to have all eight registers in
use?
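
The depth bookkeeping from the dot_product walk-through earlier in this
thread (where `sum' moves between ST(0) and ST(1)) can be sketched in C.
This is only an illustration, not something GCC emits or GDB implements.
The names `fp_effect', `dot_product_fx' and `st_index_of_sum' are
hypothetical; the addresses are copied from the listing; and the sketch
assumes the stack depth at a given address is the same on every path
that reaches it, so a linear scan by address is enough.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per instruction that changes the x87 stack depth. */
struct fp_effect {
    unsigned addr;  /* address of the instruction in the listing */
    int delta;      /* +1 for a push, -1 for a pop */
};

/* Stack-changing instructions of dot_product, taken from the listing.
   (Hypothetical table; real debug info does not carry this.) */
static const struct fp_effect dot_product_fx[] = {
    { 0x09, +1 },   /* fldz:  pushes `sum' */
    { 0x20, +1 },   /* fldl:  pushes a[i] */
    { 0x29, -1 },   /* faddp: adds the product into `sum' and pops */
};

/* Return i such that `sum' is in ST(i) when execution is stopped at pc,
   i.e. before the instruction at pc has run.  Return -1 if `sum' has
   not been pushed yet. */
int st_index_of_sum(unsigned pc)
{
    size_t n = sizeof dot_product_fx / sizeof dot_product_fx[0];
    size_t i;
    int depth = 0;  /* entries pushed since function entry */

    assert(pc <= 0x31);  /* the ret at 0x31 is the last real instruction */

    for (i = 0; i < n; i++)
        if (dot_product_fx[i].addr < pc)
            depth += dot_product_fx[i].delta;

    /* `sum' is the first value this function pushes, so it sits at the
       bottom of the function's own entries: ST(depth - 1). */
    return depth >= 1 ? depth - 1 : -1;
}
```

Stopped at 0x23 or 0x29 this yields ST(1); anywhere else after the fldz
it yields ST(0), matching the ranges described in the walk-through.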
>From jimb@cygnus.com Fri Jul 09 13:57:00 1999
From: Jim Blandy
To: Joern Rennecke
Cc: law@cygnus.com, egcs@egcs.cygnus.com, gdb@sourceware.cygnus.com
Subject: Re: IA32: printing FP register variables
Date: Fri, 09 Jul 1999 13:57:00 -0000
Message-id:
References: <199907091822.TAA31184@phal.cygnus.co.uk>
X-SW-Source: 1999-q3/msg00032.html
Content-length: 853

> > STABS's live range splitting notation can certainly do the job
> > correctly, but I wonder whether it can do it efficiently.  For every
> > instruction that changes TOP, you have to start a new range for every
> > variable.  So the size of debug info is O(number of insns * average
> > number of live variables).
>
> You only have to care about variables that are live in the register stack
> before or after the operation.  So you can use an upper bound of 8.
> Hence the debug info is O(number of insns).

Okay, but do you really want to emit up to eight stabs at every FP
instruction that changes the stack?  I dunno --- maybe it's not a big
deal.  It may not be worth creating a whole new form of debugging info
to save that space.

It is certainly a problem the various LRS representations handle.  I'd
certainly agree we should try it first.

>From bcombee@metrowerks.com Fri Jul 09 14:00:00 1999
From: "Ben Combee"
To: "Jim Blandy"
Cc: ,
Subject: Re: IA32: printing FP register variables
Date: Fri, 09 Jul 1999 14:00:00 -0000
Message-id: <01cc01beca4d$f76868a0$3404010a@metrowerks.com>
References: <199907090356.WAA01337@zwingli.cygnus.com> <000d01bec9c2$06f4fdb0$3404010a@metrowerks.com> <014f01beca41$c42b4e50$3404010a@metrowerks.com>
X-SW-Source: 1999-q3/msg00033.html
Content-length: 999

> > At each instruction, the invariant that a is 0 from the bottom and b is 1
> > from the bottom holds.  Note, the bottom of the stack can be known by the
> > debugger via scanning the FPU tag word from the TOP element looking for a FP
> > register that is empty.  All this info is provided by an FSAVE or FSTENV.
>
> That could work, but what if you happen to have all eight registers in
> use?

Then the scan code sees that there are no empty registers, so the
eighth register is the bottom of the stack.  The IA32 FPU itself uses
this logic -- if you push something on the stack such that TOP gets
decremented to point to a non-empty value, you get a stack overflow.

Of course, all this is also based on the assumption, explicit in the
Win32 ABI, that the FPU stack is empty on entry to a function.  I think
this holds for the x86 Linux ABI as well -- I can't see any other
logical way to do it since you don't generally know how many levels of
nesting there are when your function is called.
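
The tag-word scan described above can be sketched in C as follows.
This is a minimal illustration, not code from any debugger.  The
function names are hypothetical, and the inputs are assumed to come
from an FSAVE or FSTENV image: `tag_word' is the 16-bit x87 tag word
(two bits per physical register, 11b meaning empty) and `top' is the
TOP field of the status word.  As in the message, it also assumes the
FPU stack was empty on entry to the function, so the bottom of the
stack is the bottom of the function's own entries.

```c
#include <assert.h>
#include <stdint.h>

enum { X87_TAG_EMPTY = 3 };  /* tag value 11b marks an empty register */

/* Extract the 2-bit tag of physical register phys (0..7). */
static unsigned x87_tag(uint16_t tag_word, unsigned phys)
{
    return (tag_word >> (2 * (phys & 7))) & 3;
}

/* Count the in-use registers starting at ST(0), stopping at the first
   empty one.  ST(i) is physical register (top + i) mod 8.  If no
   register is empty the loop runs to 8, so ST(7) is the bottom. */
unsigned x87_stack_depth(uint16_t tag_word, unsigned top)
{
    unsigned depth = 0;

    assert(top < 8);
    while (depth < 8 && x87_tag(tag_word, (top + depth) & 7) != X87_TAG_EMPTY)
        depth++;
    return depth;
}

/* Map "k from the bottom" to an ST(i) index, or -1 if the stack does
   not hold that many entries. */
int x87_st_index_from_bottom(uint16_t tag_word, unsigned top, unsigned k)
{
    unsigned depth = x87_stack_depth(tag_word, top);

    if (k >= depth)
        return -1;
    return (int)(depth - 1 - k);
}
```

With two values pushed onto an initially empty stack, the variable that
is 0 from the bottom maps to ST(1) and the one that is 1 from the
bottom maps to ST(0).  With all eight registers in use, 0 from the
bottom maps to ST(7).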