RFA: prologue value modules

Mirror of the gdb-patches mailing list

* RFA: prologue value modules
@ 2006-03-23  5:20 Jim Blandy
  2006-03-23  5:24 ` Daniel Jacobowitz
  2006-03-24 14:48 ` Eli Zaretskii
  0 siblings, 2 replies; 17+ messages in thread
From: Jim Blandy @ 2006-03-23  5:20 UTC
  To: gdb-patches


Okay, time to get things started again here.

To recap: this patch adds prologue-value.[ch], support libraries for
the new prologue analysis strategy used in the Renesas M32C port and
some internal Red Hat ports.  I think they make prologue analysis
simpler to write and more robust.  The S390 port already uses functions
like these internally, with a slightly different interface.

The last time I posted these, Daniel J. had some suggestions about how
the interface might be changed; in particular, there is logic in the
m32c port (to which I referred folks as an example) which ideally
would be shared somewhere.  I was skeptical that those could be easily
put behind a reusable interface.  Daniel tried using this code as it
stands in another port he was doing, and concluded that it was enough
of a step forward as is that it should go in.

Okay to commit?

src/gdb/ChangeLog:
2006-03-22  Jim Blandy  <jimb@codesourcery.com>

	* prologue-value.c, prologue-value.h: New files.
	* Makefile.in (prologue_value_h): New variable.
	(HFILES_NO_SRCDIR): List prologue-value.h.
	(ALLDEPFILES): List prologue-value.c.
	(prologue-value.o): New rule.

Index: src/gdb/prologue-value.c
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.c
@@ -0,0 +1,591 @@
+/* Prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#include "defs.h"
+#include "gdb_string.h"
+#include "gdb_assert.h"
+#include "prologue-value.h"
+#include "regcache.h"
+
+\f
+/* Constructors.  */
+
+pv_t
+pv_unknown (void)
+{
+  pv_t v = { pvk_unknown, 0, 0 };
+
+  return v;
+}
+
+
+pv_t
+pv_constant (CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_constant;
+  v.reg = -1;                   /* for debugging */
+  v.k = k;
+
+  return v;
+}
+
+
+pv_t
+pv_register (int reg, CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_register;
+  v.reg = reg;
+  v.k = k;
+
+  return v;
+}
+
+
+\f
+/* Arithmetic operations.  */
+
+/* If one of *A and *B is a constant, and the other isn't, swap the
+   values as necessary to ensure that *B is the constant.  This can
+   reduce the number of cases we need to analyze in the functions
+   below.  */
+static void
+constant_last (pv_t *a, pv_t *b)
+{
+  if (a->kind == pvk_constant
+      && b->kind != pvk_constant)
+    {
+      pv_t temp = *a;
+      *a = *b;
+      *b = temp;
+    }
+}
+
+
+pv_t
+pv_add (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can add a constant to a register.  */
+  if (a.kind == pvk_register
+      && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k + b.k);
+
+  /* We can add a constant to another constant.  */
+  else if (a.kind == pvk_constant
+           && b.kind == pvk_constant)
+    return pv_constant (a.k + b.k);
+
+  /* Anything else we don't know how to add.  We don't have a
+     representation for, say, the sum of two registers, or a multiple
+     of a register's value (adding a register to itself).  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_add_constant (pv_t v, CORE_ADDR k)
+{
+  /* Rather than thinking of all the cases we can and can't handle,
+     we'll just let pv_add take care of that for us.  */
+  return pv_add (v, pv_constant (k));
+}
+
+
+pv_t
+pv_subtract (pv_t a, pv_t b)
+{
+  /* This isn't quite the same as negating B and adding it to A, since
+     we don't have a representation for the negation of anything but a
+     constant.  For example, we can't negate { pvk_register, R1, 10 },
+     but we do know that { pvk_register, R1, 10 } minus { pvk_register,
+     R1, 5 } is { pvk_constant, <ignored>, 5 }.
+
+     This means, for example, that we could subtract two stack
+     addresses; they're both relative to the original SP.  Since the
+     frame pointer is set based on the SP, its value will be the
+     original SP plus some constant (probably zero), so we can use its
+     value just fine, too.  */
+
+  /* Subtraction is not commutative, so don't use constant_last.  */
+
+  /* We can subtract two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k - b.k);
+
+  /* We can subtract a constant from a register.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k - b.k);
+
+  /* We can subtract a register from itself, yielding a constant.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg)
+    return pv_constant (a.k - b.k);
+
+  /* We don't know how to subtract anything else.  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_logical_and (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can 'and' two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k & b.k);
+
+  /* We can 'and' anything with the constant zero.  */
+  else if (b.kind == pvk_constant
+           && b.k == 0)
+    return pv_constant (0);
+
+  /* We can 'and' anything with ~0.  */
+  else if (b.kind == pvk_constant
+           && b.k == ~ (CORE_ADDR) 0)
+    return a;
+
+  /* We can 'and' a register with itself.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg
+           && a.k == b.k)
+    return a;
+
+  /* Otherwise, we don't know.  */
+  else
+    return pv_unknown ();
+}
+
+
+\f
+/* Examining prologue values.  */
+
+int
+pv_is_identical (pv_t a, pv_t b)
+{
+  if (a.kind != b.kind)
+    return 0;
+
+  switch (a.kind)
+    {
+    case pvk_unknown:
+      return 1;
+    case pvk_constant:
+      return (a.k == b.k);
+    case pvk_register:
+      return (a.reg == b.reg && a.k == b.k);
+    default:
+      gdb_assert (0);
+    }
+}
+
+
+int
+pv_is_constant (pv_t a)
+{
+  return (a.kind == pvk_constant);
+}
+
+
+int
+pv_is_register (pv_t a, int r)
+{
+  return (a.kind == pvk_register
+          && a.reg == r);
+}
+
+
+int
+pv_is_register_k (pv_t a, int r, CORE_ADDR k)
+{
+  return (a.kind == pvk_register
+          && a.reg == r
+          && a.k == k);
+}
+
+
+enum pv_boolean
+pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                 pv_t array_addr, CORE_ADDR array_len,
+                 CORE_ADDR elt_size,
+                 int *i)
+{
+  /* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if
+     addr is *before* the start of the array, then this isn't going to
+     be negative...  */
+  pv_t offset = pv_subtract (addr, array_addr);
+
+  if (offset.kind == pvk_constant)
+    {
+      /* This is a rather odd test.  We want to know if the SIZE bytes
+         at ADDR don't overlap the array at all, so you'd expect it to
+         be an || expression: "if we're completely before || we're
+         completely after".  But with unsigned arithmetic, things are
+         different: since it's a number circle, not a number line, the
+         right values for offset.k are actually one contiguous range.  */
+      if (offset.k <= -size
+          && offset.k >= array_len * elt_size)
+        return pv_definite_no;
+      else if (offset.k % elt_size != 0
+               || size != elt_size)
+        return pv_maybe;
+      else
+        {
+          *i = offset.k / elt_size;
+          return pv_definite_yes;
+        }
+    }
+  else
+    return pv_maybe;
+}
+
+
+\f
+/* Areas.  */
+
+
+/* A particular value known to be stored in an area.
+
+   Entries form a ring, sorted by unsigned offset from the area's base
+   register's value.  Since entries can straddle the wrap-around point,
+   unsigned offsets form a circle, not a number line, so the list
+   itself is structured the same way --- there is no inherent head.
+   The entry with the lowest offset simply follows the entry with the
+   highest offset.  Entries may abut, but never overlap.  The area's
+   'entry' pointer points to an arbitrary node in the ring.  */
+struct area_entry
+{
+  /* Links in the doubly-linked ring.  */
+  struct area_entry *prev, *next;
+
+  /* Offset of this entry's address from the value of the base
+     register.  */
+  CORE_ADDR offset;
+
+  /* The size of this entry.  Note that an entry may wrap around from
+     the end of the address space to the beginning.  */
+  CORE_ADDR size;
+
+  /* The value stored here.  */
+  pv_t value;
+};
+
+
+struct pv_area
+{
+  /* This area's base register.  */
+  int base_reg;
+
+  /* The mask to apply to addresses, to make the wrap-around happen at
+     the right place.  */
+  CORE_ADDR addr_mask;
+
+  /* An element of the doubly-linked ring of entries, or zero if we
+     have none.  */
+  struct area_entry *entry;
+};
+
+
+struct pv_area *
+make_pv_area (int base_reg)
+{
+  struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a));
+
+  memset (a, 0, sizeof (*a));
+
+  a->base_reg = base_reg;
+  a->entry = 0;
+
+  /* Remember that shift amounts equal to the type's width are
+     undefined.  */
+  a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1;
+
+  return a;
+}
+
+
+/* Delete all entries from AREA.  */
+static void
+clear_entries (struct pv_area *area)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    {
+      /* This must be a do-while loop, so that we actually process
+         the node checked for in the terminating condition.  */
+      do
+        {
+          struct area_entry *next = e->next;
+          xfree (e);
+          e = next;
+        }
+      while (e != area->entry);
+
+      area->entry = 0;
+    }
+}
+
+
+void
+free_pv_area (struct pv_area *area)
+{
+  clear_entries (area);
+  xfree (area);
+}
+
+
+static void
+do_free_pv_area_cleanup (void *arg)
+{
+  free_pv_area ((struct pv_area *) arg);
+}
+
+
+struct cleanup *
+make_cleanup_free_pv_area (struct pv_area *area)
+{
+  return make_cleanup (do_free_pv_area_cleanup, (void *) area);
+}
+
+
+int
+pv_area_store_would_trash (struct pv_area *area, pv_t addr)
+{
+  /* It may seem odd that pvk_constant appears here --- after all,
+     that's the case where we know the most about the address!  But
+     pv_areas are always relative to a register, and we don't know the
+     value of the register, so we can't compare entry addresses to
+     constants.  */
+  return (addr.kind == pvk_unknown
+          || addr.kind == pvk_constant
+          || (addr.kind == pvk_register && addr.reg != area->base_reg));
+}
+
+
+/* Return a pointer to the first entry we hit in AREA starting at
+   OFFSET and going forward.
+
+   This may return zero, if AREA has no entries.
+
+   And since the entries are a ring, this may return an entry that
+   entirely precedes OFFSET.  This is the correct behavior: depending
+   on the sizes involved, we could still overlap such an area, with
+   wrap-around.  */
+static struct area_entry *
+find_entry (struct pv_area *area, CORE_ADDR offset)
+{
+  struct area_entry *e = area->entry;
+
+  if (! e)
+    return 0;
+
+  /* If the next entry would be better than the current one, then scan
+     forward.  Since we use '<' in this loop, it always terminates.
+
+     Note that, even setting aside the addr_mask stuff, we must not
+     simplify this, in high school algebra fashion, to
+     (e->next->offset < e->offset), because of the way < interacts
+     with wrap-around.  We have to subtract offset from both sides to
+     make sure both things we're comparing are on the same side of the
+     discontinuity.  */
+  while (((e->next->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->next;
+
+  /* If the previous entry would be better than the current one, then
+     scan backwards.  */
+  while (((e->prev->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->prev;
+
+  /* In case there's some locality to the searches, set the area's
+     pointer to the entry we've found.  */
+  area->entry = e;
+
+  return e;
+}
+
+
+/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
+   return zero otherwise.  AREA is the area to which ENTRY belongs.  */
+static int
+overlaps (struct pv_area *area,
+          struct area_entry *entry,
+          CORE_ADDR offset,
+          CORE_ADDR size)
+{
+  /* Think carefully about wrap-around before simplifying this.  */
+  return (((entry->offset - offset) & area->addr_mask) < size
+          || ((offset - entry->offset) & area->addr_mask) < entry->size);
+}
+
+
+void
+pv_area_store (struct pv_area *area,
+               pv_t addr,
+               CORE_ADDR size,
+               pv_t value)
+{
+  /* Remove any (potentially) overlapping entries.  */
+  if (pv_area_store_would_trash (area, addr))
+    clear_entries (area);
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* Delete all entries that we would overlap.  */
+      while (e && overlaps (area, e, offset, size))
+        {
+          struct area_entry *next = (e->next == e) ? 0 : e->next;
+          e->prev->next = e->next;
+          e->next->prev = e->prev;
+
+          xfree (e);
+          e = next;
+        }
+
+      /* Move the area's pointer to the next remaining entry.  This
+         will also zero the pointer if we've deleted all the entries.  */
+      area->entry = e;
+    }
+
+  /* Now, there are no entries overlapping us, and area->entry is
+     either zero or pointing at the closest entry after us.  We can
+     just insert ourselves before that.
+
+     But if we're storing an unknown value, don't bother --- that's
+     the default.  */
+  if (value.kind == pvk_unknown)
+    return;
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e));
+      e->offset = offset;
+      e->size = size;
+      e->value = value;
+
+      if (area->entry)
+        {
+          e->prev = area->entry->prev;
+          e->next = area->entry;
+          e->prev->next = e->next->prev = e;
+        }
+      else
+        {
+          e->prev = e->next = e;
+          area->entry = e;
+        }
+    }
+}
+
+
+pv_t
+pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size)
+{
+  /* If we have no entries, or we can't decide how ADDR relates to the
+     entries we do have, then the value is unknown.  */
+  if (! area->entry
+      || pv_area_store_would_trash (area, addr))
+    return pv_unknown ();
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* If this entry exactly matches what we're looking for, then
+         we're set.  Otherwise, say it's unknown.  */
+      if (e->offset == offset && e->size == size)
+        return e->value;
+      else
+        return pv_unknown ();
+    }
+}
+
+
+int
+pv_area_find_reg (struct pv_area *area,
+                  struct gdbarch *gdbarch,
+                  int reg,
+                  CORE_ADDR *offset_p)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    do
+      {
+        if (e->value.kind == pvk_register
+            && e->value.reg == reg
+            && e->value.k == 0
+            && e->size == register_size (gdbarch, reg))
+          {
+            if (offset_p)
+              *offset_p = e->offset;
+            return 1;
+          }
+
+        e = e->next;
+      }
+    while (e != area->entry);
+
+  return 0;
+}
+
+
+void
+pv_area_scan (struct pv_area *area,
+              void (*func) (void *closure,
+                            pv_t addr,
+                            CORE_ADDR size,
+                            pv_t value),
+              void *closure)
+{
+  struct area_entry *e = area->entry;
+  pv_t addr;
+
+  addr.kind = pvk_register;
+  addr.reg = area->base_reg;
+
+  if (e)
+    do
+      {
+        addr.k = e->offset;
+        func (closure, addr, e->size, e->value);
+        e = e->next;
+      }
+    while (e != area->entry);
+}
Index: src/gdb/prologue-value.h
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.h
@@ -0,0 +1,302 @@
+/* Interface to prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#ifndef PROLOGUE_VALUE_H
+#define PROLOGUE_VALUE_H
+
+/* When we analyze a prologue, we're really doing 'abstract
+   interpretation' or 'pseudo-evaluation': running the function's code
+   in simulation, but using conservative approximations of the values
+   it would have when it actually runs.  For example, if our function
+   starts with the instruction:
+
+      addi r1, 42     # add 42 to r1
+
+   we don't know exactly what value will be in r1 after executing this
+   instruction, but we do know it'll be 42 greater than its original
+   value.
+
+   If we then see an instruction like:
+
+      addi r1, 22     # add 22 to r1
+
+   we still don't know what r1's value is, but again, we can say it is
+   now 64 greater than its original value.
+
+   If the next instruction were:
+
+      mov r2, r1      # set r2 to r1's value
+
+   then we can say that r2's value is now the original value of r1
+   plus 64.
+
+   It's common for prologues to save registers on the stack, so we'll
+   need to track the values of stack frame slots, as well as the
+   registers.  So after an instruction like this:
+
+      mov (fp+4), r2
+
+   we'd know that the stack slot four bytes above the frame
+   pointer holds the original value of r1 plus 64.
+
+   And so on.
+
+   Of course, this can only go so far before it gets unreasonable.  If
+   we wanted to be able to say anything about the value of r1 after
+   the instruction:
+
+      xor r1, r3      # exclusive-or r1 and r3, place result in r1
+
+   then things would get pretty complex.  But remember, we're just
+   doing a conservative approximation; if exclusive-or instructions
+   aren't relevant to prologues, we can just say r1's value is now
+   'unknown'.  We can ignore things that are too complex, if that loss
+   of information is acceptable for our application.
+
+   So when I say "conservative approximation" here, what I mean is an
+   approximation that is either accurate, or marked "unknown", but
+   never inaccurate.
+
+   Once you've reached the current PC, or an instruction that you
+   don't know how to simulate, you stop.  Now you can examine the
+   state of the registers and stack slots you've kept track of.
+
+   - To see how large your stack frame is, just check the value of the
+     stack pointer register; if it's the original value of the SP
+     minus a constant, then that constant is the stack frame's size.
+     If the SP's value has been marked as 'unknown', then that means
+     the prologue has done something too complex for us to track, and
+     we don't know the frame size.
+
+   - To see where we've saved the previous frame's registers, we just
+     search the values we've tracked --- stack slots, usually, but
+     registers, too, if you want --- for something equal to the
+     register's original value.  If the ABI suggests a standard place
+     to save a given register, then we can check there first, but
+     really, anything that will get us back the original value will
+     probably work.
+
+   Sure, this takes some work.  But prologue analyzers aren't
+   quick-and-simple pattern matching to recognize a few fixed prologue
+   forms any more; they're big, hairy functions.  Along with inferior
+   function calls, prologue analysis accounts for a substantial
+   portion of the time needed to stabilize a GDB port.  So I think
+   it's worthwhile to look for an approach that will be easier to
+   understand and maintain.  In the approach used here:
+
+   - It's easier to see that the analyzer is correct: you just see
+     whether the analyzer properly (albeit conservatively) simulates
+     the effect of each instruction.
+
+   - It's easier to extend the analyzer: you can add support for new
+     instructions, and know that you haven't broken anything that
+     wasn't already broken before.
+
+   - It's orthogonal: to gather new information, you don't need to
+     complicate the code for each instruction.  As long as your domain
+     of conservative values is already detailed enough to tell you
+     what you need, then all the existing instruction simulations are
+     already gathering the right data for you.
+
+   A 'struct prologue_value' is a conservative approximation of the
+   real value the register or stack slot will have.  */
+
+struct prologue_value {
+
+  /* What sort of value is this?  This determines the interpretation
+     of subsequent fields.  */
+  enum {
+
+    /* We don't know anything about the value.  This is also used for
+       values we could have kept track of, when doing so would have
+       been too complex and we don't want to bother.  The bottom of
+       our lattice.  */
+    pvk_unknown,
+
+    /* A known constant.  K is its value.  */
+    pvk_constant,
+
+    /* The value that register REG originally had *UPON ENTRY TO THE
+       FUNCTION*, plus K.  If K is zero, this means, obviously, just
+       the value REG had upon entry to the function.  REG is a GDB
+       register number.  Before we start interpreting, we initialize
+       every register R to { pvk_register, R, 0 }.  */
+    pvk_register,
+
+  } kind;
+
+  /* The meanings of the following fields depend on 'kind'; see the
+     comments for the specific 'kind' values.  */
+  int reg;
+  CORE_ADDR k;
+};
+
+typedef struct prologue_value pv_t;
+
+
+/* Return the unknown prologue value --- { pvk_unknown, ?, ? }.  */
+pv_t pv_unknown (void);
+
+/* Return the prologue value representing the constant K.  */
+pv_t pv_constant (CORE_ADDR k);
+
+/* Return the prologue value representing the original value of
+   register REG, plus the constant K.  */
+pv_t pv_register (int reg, CORE_ADDR k);
+
+
+/* Return conservative approximations of the results of the following
+   operations.  */
+pv_t pv_add (pv_t a, pv_t b);               /* a + b */
+pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
+pv_t pv_subtract (pv_t a, pv_t b);          /* a - b */
+pv_t pv_logical_and (pv_t a, pv_t b);       /* a & b */
+
+
+/* Return non-zero iff A and B are identical expressions.
+
+   This is not the same as asking if the two values are equal; the
+   result of such a comparison would have to be a pv_boolean, and
+   asking whether two 'unknown' values were equal would give you
+   pv_maybe.  Same for comparing, say, { pvk_register, R1, 0 } and {
+   pvk_register, R2, 0}.
+
+   Instead, this function asks whether the two representations are the
+   same.  */
+int pv_is_identical (pv_t a, pv_t b);
+
+
+/* Return non-zero if A is known to be a constant.  */
+int pv_is_constant (pv_t a);
+
+/* Return non-zero if A is the original value of register number R
+   plus some constant, zero otherwise.  */
+int pv_is_register (pv_t a, int r);
+
+
+/* Return non-zero if A is the original value of register R plus the
+   constant K.  */
+int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
+
+/* A conservative boolean type, including "maybe", when we can't
+   figure out whether something is true or not.  */
+enum pv_boolean {
+  pv_maybe,
+  pv_definite_yes,
+  pv_definite_no,
+};
+
+
+/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
+   an element of an array.  The array starts at ARRAY_ADDR, and has
+   ARRAY_LEN values of ELT_SIZE bytes each.  If ADDR definitely does
+   refer to an array element, set *I to the index of the referenced
+   element in the array, and return pv_definite_yes.  If it definitely
+   doesn't, return pv_definite_no.  If we can't tell, return pv_maybe.
+
+   If the reference does touch the array, but doesn't fall exactly on
+   an element boundary, or doesn't refer to the whole element, return
+   pv_maybe.  */
+enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                                 pv_t array_addr, CORE_ADDR array_len,
+                                 CORE_ADDR elt_size,
+                                 int *i);
+
+
+/* A 'struct pv_area' keeps track of values stored in a particular
+   region of memory.  */
+struct pv_area;
+
+/* Create a new area, tracking stores relative to the original value
+   of BASE_REG.  If BASE_REG is SP, then this effectively records the
+   contents of the stack frame: the original value of the SP is the
+   frame's CFA, or some constant offset from it.
+
+   Stores to constant addresses, unknown addresses, or to addresses
+   relative to registers other than BASE_REG will trash this area; see
+   pv_area_store_would_trash.  */
+struct pv_area *make_pv_area (int base_reg);
+
+/* Free AREA.  */
+void free_pv_area (struct pv_area *area);
+
+
+/* Register a cleanup to free AREA.  */
+struct cleanup *make_cleanup_free_pv_area (struct pv_area *area);
+
+
+/* Store the SIZE-byte value VALUE at ADDR in AREA.
+
+   If ADDR is not relative to the same base register we used in
+   creating AREA, then we can't tell which values here the stored
+   value might overlap, and we'll have to mark everything as
+   unknown.  */
+void pv_area_store (struct pv_area *area,
+                    pv_t addr,
+                    CORE_ADDR size,
+                    pv_t value);
+
+/* Return the SIZE-byte value at ADDR in AREA.  This may return
+   pv_unknown ().  */
+pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size);
+
+/* Return true if storing to address ADDR in AREA would force us to
+   mark the contents of the entire area as unknown.  This could happen
+   if, say, ADDR is unknown, since we could be storing anywhere.  Or,
+   it could happen if ADDR is relative to a register other than
+   AREA's base register, since we don't know the relative
+   values of the two registers.
+
+   If you've reached such a store, it may be better to simply stop the
+   prologue analysis, and return the information you've gathered,
+   instead of losing all that information, most of which is probably
+   okay.  */
+int pv_area_store_would_trash (struct pv_area *area, pv_t addr);
+
+
+/* Search AREA for the original value of REG.  If we can't find
+   it, return zero; if we can find it, return a non-zero value, and if
+   OFFSET_P is non-zero, set *OFFSET_P to the register's offset within
+   AREA.  GDBARCH is the architecture of which REG is a member.
+
+   In the worst case, this takes time proportional to the number of
+   items stored in AREA.  If you plan to gather a lot of information
+   about registers saved in AREA, consider calling pv_area_scan
+   instead, and collecting all your information in one pass.  */
+int pv_area_find_reg (struct pv_area *area,
+                      struct gdbarch *gdbarch,
+                      int reg,
+                      CORE_ADDR *offset_p);
+
+
+/* For every part of AREA whose value we know, apply FUNC to CLOSURE,
+   the value's address, its size, and the value itself.  */
+void pv_area_scan (struct pv_area *area,
+                   void (*func) (void *closure,
+                                 pv_t addr,
+                                 CORE_ADDR size,
+                                 pv_t value),
+                   void *closure);
+
+
+#endif /* PROLOGUE_VALUE_H */
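
For readers who want to try the idea outside a GDB tree, the walk-through
in the prologue-value.h comment (addi, addi, mov) can be sketched as a
small standalone model.  Everything named here is an assumption made for
illustration: core_addr stands in for CORE_ADDR, struct pv mirrors struct
prologue_value, the register numbers R1/R2/SP are made up, and the final
"addi sp, -16" is an extra instruction added to show the frame-size check.
It is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR; assumed here to be 64-bit unsigned.  */
typedef uint64_t core_addr;

enum pv_kind { PVK_UNKNOWN, PVK_CONSTANT, PVK_REGISTER };

/* Same shape as struct prologue_value, redeclared so this compiles
   on its own.  */
struct pv
{
  enum pv_kind kind;
  int reg;
  core_addr k;
};

/* Hypothetical register numbers for the example machine.  */
enum { R1, R2, SP, NUM_REGS };

static struct pv
make_pv (enum pv_kind kind, int reg, core_addr k)
{
  struct pv v;

  v.kind = kind;
  v.reg = reg;
  v.k = k;
  return v;
}

/* Conservative a + k: constants and register-relative values absorb
   the addend; an unknown value stays unknown.  */
static struct pv
add_constant (struct pv a, core_addr k)
{
  if (a.kind == PVK_UNKNOWN)
    return a;
  return make_pv (a.kind, a.reg, a.k + k);
}

/* Simulate:  addi r1, 42;  addi r1, 22;  mov r2, r1;  addi sp, -16.  */
static void
simulate_prologue (struct pv regs[NUM_REGS])
{
  int i;

  assert (regs != 0);

  /* On entry, each register R holds { register, R, 0 }.  */
  for (i = 0; i < NUM_REGS; i++)
    regs[i] = make_pv (PVK_REGISTER, i, 0);

  regs[R1] = add_constant (regs[R1], 42);
  regs[R1] = add_constant (regs[R1], 22);
  regs[R2] = regs[R1];
  regs[SP] = add_constant (regs[SP], (core_addr) 0 - 16);
}

/* If SP_VALUE is the original SP minus a constant, store that
   constant in *SIZE and return 1; otherwise return 0.  */
static int
frame_size (struct pv sp_value, core_addr *size)
{
  assert (size != 0);

  if (sp_value.kind != PVK_REGISTER || sp_value.reg != SP)
    return 0;
  *size = (core_addr) 0 - sp_value.k;
  return 1;
}
```

After simulate_prologue, r2 is "original r1 plus 64", and frame_size
reports 16 for the SP.  A real analyzer would decode instructions from
the inferior instead of hard-coding them.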

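The wrap-around reasoning behind addr_mask and overlaps () is easy to get
wrong, so here is a hedged standalone sketch of the same arithmetic.  The
names core_addr, addr_mask_for_bits, overlaps_circular and overlaps_linear
are assumptions for this example only; the linear version is included
purely as the "obvious" test that misses an entry straddling the
wrap-around point.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR; assumed here to be 64-bit unsigned.  */
typedef uint64_t core_addr;

/* Build a mask of BITS low-order one bits without ever shifting by
   the full width of the type, which would be undefined.  */
static core_addr
addr_mask_for_bits (int bits)
{
  assert (bits >= 1 && bits <= 64);
  return ((((core_addr) 1 << (bits - 1)) - 1) << 1) | 1;
}

/* Non-zero if the SIZE bytes at OFFSET overlap the ENTRY_SIZE bytes
   at ENTRY_OFFSET, treating addresses as a circle of MASK + 1 bytes.
   Each distance is measured going forward around the circle.  */
static int
overlaps_circular (core_addr mask,
                   core_addr entry_offset, core_addr entry_size,
                   core_addr offset, core_addr size)
{
  return (((entry_offset - offset) & mask) < size
          || ((offset - entry_offset) & mask) < entry_size);
}

/* The number-line test, shown only for contrast: it does not see an
   entry that wraps past the top of the address space.  */
static int
overlaps_linear (core_addr entry_offset, core_addr entry_size,
                 core_addr offset, core_addr size)
{
  return (entry_offset < offset + size
          && offset < entry_offset + entry_size);
}
```

With a 16-bit address space, a 4-byte entry at 0xfffe covers 0xfffe,
0xffff, 0 and 1, so a 2-byte store at offset 0 overlaps it.  The circular
test reports that; the linear one does not.  Entries that merely abut
(a 4-byte entry at 0x10 and a 4-byte store at 0x14) are not overlaps.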

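The "rather odd test" in pv_is_array_ref can also be checked in isolation.
The sketch below takes the already-subtracted constant offset and applies
the same three-way decision; core_addr, enum tri and classify_array_ref
are names invented for this example, and it assumes a 64-bit unsigned
address type.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR; assumed here to be 64-bit unsigned.  */
typedef uint64_t core_addr;

/* Mirrors enum pv_boolean.  */
enum tri { TRI_MAYBE, TRI_DEFINITE_YES, TRI_DEFINITE_NO };

/* OFFSET is (addr - array_addr), already known to be a constant.
   Decide whether SIZE bytes at that offset are exactly one element of
   an array of ARRAY_LEN elements of ELT_SIZE bytes each.  */
static enum tri
classify_array_ref (core_addr offset, core_addr size,
                    core_addr array_len, core_addr elt_size,
                    int *i)
{
  assert (elt_size != 0 && i != 0);

  /* On the unsigned number circle, "entirely before" and "entirely
     after" form one contiguous range of offsets:
     [array_len * elt_size, -size].  */
  if (offset <= (core_addr) 0 - size
      && offset >= array_len * elt_size)
    return TRI_DEFINITE_NO;
  else if (offset % elt_size != 0 || size != elt_size)
    return TRI_MAYBE;
  else
    {
      *i = (int) (offset / elt_size);
      return TRI_DEFINITE_YES;
    }
}
```

For a 4-element array of 4-byte slots, a 4-byte access at offset 12 is
element 3; offsets 16 and -4 miss the array on either side; offset -2
touches the first two bytes without lining up, so the answer is "maybe".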

* Re: RFA: prologue value modules
  2006-03-23  5:20 RFA: prologue value modules Jim Blandy
@ 2006-03-23  5:24 ` Daniel Jacobowitz
  2006-03-23  5:32   ` Randolph Chung
  2006-03-23  5:39   ` Jim Blandy
  2006-03-24 14:48 ` Eli Zaretskii
  1 sibling, 2 replies; 17+ messages in thread
From: Daniel Jacobowitz @ 2006-03-23  5:24 UTC
  To: Jim Blandy; +Cc: gdb-patches

On Wed, Mar 22, 2006 at 08:37:20PM -0800, Jim Blandy wrote:
> Okay to commit?

As far as I'm concerned, yes; anyone else have comments on the new
code?

> 	* prologue-value.c, prologue-value.h: New files.
> 	* Makefile.in (prologue_value_h): New variable.
> 	(HFILES_NO_SRCDIR): List prologue-value.h.
> 	(ALLDEPFILES): List prologue-value.c.
> 	(prologue-value.o): New rule.

Hmm - why not always build it?  It won't get linked in if nothing needs
it, and then ports won't need to mention it in TDEPFILES.  And that'll
encourage people to use it I hope :-)

The other port I was using this code for was ARM - I only rewrote the
Thumb analyzer, not the ARM one, but it was simpler and better.  Once
this goes in I may try to find some time to do ARM mode too, and submit
it.

-- 
Daniel Jacobowitz
CodeSourcery


* Re: RFA: prologue value modules
  2006-03-23  5:24 ` Daniel Jacobowitz
@ 2006-03-23  5:32   ` Randolph Chung
  2006-03-23  5:39   ` Jim Blandy
  1 sibling, 0 replies; 17+ messages in thread
From: Randolph Chung @ 2006-03-23  5:32 UTC
  To: gdb-patches

> As far as I'm concerned, yes; anyone else have comments on the new
> code?

Not comments on the code per se, but I'm interested in converting the
hppa prologue analyzer to use this. It's got to be one of the ugliest of
all the ones in gdb (though it seems to work quite well in practice)....
we'll see how it goes :)

randolph


* Re: RFA: prologue value modules
  2006-03-23  5:24 ` Daniel Jacobowitz
  2006-03-23  5:32   ` Randolph Chung
@ 2006-03-23  5:39   ` Jim Blandy
  2006-03-23 14:06     ` Jim Blandy
  2006-03-24  1:03     ` Jim Blandy
  1 sibling, 2 replies; 17+ messages in thread
From: Jim Blandy @ 2006-03-23  5:39 UTC
  To: gdb-patches


Daniel Jacobowitz <drow@false.org> writes:
> On Wed, Mar 22, 2006 at 08:37:20PM -0800, Jim Blandy wrote:
>> Okay to commit?
>
> As far as I'm concerned, yes; anyone else have comments on the new
> code?
>
>> 	* prologue-value.c, prologue-value.h: New files.
>> 	* Makefile.in (prologue_value_h): New variable.
>> 	(HFILES_NO_SRCDIR): List prologue-value.h.
>> 	(ALLDEPFILES): List prologue-value.c.
>> 	(prologue-value.o): New rule.
>
> Hmm - why not always build it?  It won't get linked in if nothing needs
> it, and then ports won't need to mention it in TDEPFILES.  And that'll
> encourage people to use it I hope :-)

Okay.  This patch always builds prologue-value.o.

src/gdb/ChangeLog:
2006-03-22  Jim Blandy  <jimb@codesourcery.com>

	* prologue-value.c, prologue-value.h: New files.
	* Makefile.in (prologue_value_h): New variable.
	(HFILES_NO_SRCDIR): List prologue-value.h.
	(SFILES): List prologue-value.c.
	(COMMON_OBS): List prologue-value.o.
	(prologue-value.o): New rule.

Index: src/gdb/prologue-value.c
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.c
@@ -0,0 +1,591 @@
+/* Prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#include "defs.h"
+#include "gdb_string.h"
+#include "gdb_assert.h"
+#include "prologue-value.h"
+#include "regcache.h"
+
+\f
+/* Constructors.  */
+
+pv_t
+pv_unknown (void)
+{
+  pv_t v = { pvk_unknown, 0, 0 };
+
+  return v;
+}
+
+
+pv_t
+pv_constant (CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_constant;
+  v.reg = -1;                   /* for debugging */
+  v.k = k;
+
+  return v;
+}
+
+
+pv_t
+pv_register (int reg, CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_register;
+  v.reg = reg;
+  v.k = k;
+
+  return v;
+}
+
+
+\f
+/* Arithmetic operations.  */
+
+/* If one of *A and *B is a constant, and the other isn't, swap the
+   values as necessary to ensure that *B is the constant.  This can
+   reduce the number of cases we need to analyze in the functions
+   below.  */
+static void
+constant_last (pv_t *a, pv_t *b)
+{
+  if (a->kind == pvk_constant
+      && b->kind != pvk_constant)
+    {
+      pv_t temp = *a;
+      *a = *b;
+      *b = temp;
+    }
+}
+
+
+pv_t
+pv_add (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can add a constant to a register.  */
+  if (a.kind == pvk_register
+      && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k + b.k);
+
+  /* We can add a constant to another constant.  */
+  else if (a.kind == pvk_constant
+           && b.kind == pvk_constant)
+    return pv_constant (a.k + b.k);
+
+  /* Anything else we don't know how to add.  We don't have a
+     representation for, say, the sum of two registers, or a multiple
+     of a register's value (adding a register to itself).  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_add_constant (pv_t v, CORE_ADDR k)
+{
+  /* Rather than thinking of all the cases we can and can't handle,
+     we'll just let pv_add take care of that for us.  */
+  return pv_add (v, pv_constant (k));
+}
+
+
+pv_t
+pv_subtract (pv_t a, pv_t b)
+{
+  /* This isn't quite the same as negating B and adding it to A, since
+     we don't have a representation for the negation of anything but a
+     constant.  For example, we can't negate { pvk_register, R1, 10 },
+     but we do know that { pvk_register, R1, 10 } minus { pvk_register,
+     R1, 5 } is { pvk_constant, <ignored>, 5 }.
+
+     This means, for example, that we could subtract two stack
+     addresses; they're both relative to the original SP.  Since the
+     frame pointer is set based on the SP, its value will be the
+     original SP plus some constant (probably zero), so we can use its
+     value just fine, too.  */
+
+  /* Unlike pv_add, we must not swap A and B: A - B isn't B - A.  */
+
+  /* We can subtract two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k - b.k);
+
+  /* We can subtract a constant from a register.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k - b.k);
+
+  /* We can subtract a register from itself, yielding a constant.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg)
+    return pv_constant (a.k - b.k);
+
+  /* We don't know how to subtract anything else.  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_logical_and (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can 'and' two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k & b.k);
+
+  /* We can 'and' anything with the constant zero.  */
+  else if (b.kind == pvk_constant
+           && b.k == 0)
+    return pv_constant (0);
+
+  /* We can 'and' anything with ~0.  */
+  else if (b.kind == pvk_constant
+           && b.k == ~ (CORE_ADDR) 0)
+    return a;
+
+  /* We can 'and' a register with itself.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg
+           && a.k == b.k)
+    return a;
+
+  /* Otherwise, we don't know.  */
+  else
+    return pv_unknown ();
+}
+
+
+\f
+/* Examining prologue values.  */
+
+int
+pv_is_identical (pv_t a, pv_t b)
+{
+  if (a.kind != b.kind)
+    return 0;
+
+  switch (a.kind)
+    {
+    case pvk_unknown:
+      return 1;
+    case pvk_constant:
+      return (a.k == b.k);
+    case pvk_register:
+      return (a.reg == b.reg && a.k == b.k);
+    default:
+      gdb_assert (0);
+    }
+}
+
+
+int
+pv_is_constant (pv_t a)
+{
+  return (a.kind == pvk_constant);
+}
+
+
+int
+pv_is_register (pv_t a, int r)
+{
+  return (a.kind == pvk_register
+          && a.reg == r);
+}
+
+
+int
+pv_is_register_k (pv_t a, int r, CORE_ADDR k)
+{
+  return (a.kind == pvk_register
+          && a.reg == r
+          && a.k == k);
+}
+
+
+enum pv_boolean
+pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                 pv_t array_addr, CORE_ADDR array_len,
+                 CORE_ADDR elt_size,
+                 int *i)
+{
+  /* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if
+     addr is *before* the start of the array, then this isn't going to
+     be negative...  */
+  pv_t offset = pv_subtract (addr, array_addr);
+
+  if (offset.kind == pvk_constant)
+    {
+      /* This is a rather odd test.  We want to know if the SIZE bytes
+         at ADDR don't overlap the array at all, so you'd expect it to
+         be an || expression: "if we're completely before || we're
+         completely after".  But with unsigned arithmetic, things are
+         different: since it's a number circle, not a number line, the
+         right values for offset.k are actually one contiguous range.  */
+      if (offset.k <= -size
+          && offset.k >= array_len * elt_size)
+        return pv_definite_no;
+      else if (offset.k % elt_size != 0
+               || size != elt_size)
+        return pv_maybe;
+      else
+        {
+          *i = offset.k / elt_size;
+          return pv_definite_yes;
+        }
+    }
+  else
+    return pv_maybe;
+}
+
+
+\f
+/* Areas.  */
+
+
+/* A particular value known to be stored in an area.
+
+   Entries form a ring, sorted by unsigned offset from the area's base
+   register's value.  Since entries can straddle the wrap-around point,
+   unsigned offsets form a circle, not a number line, so the list
+   itself is structured the same way --- there is no inherent head.
+   The entry with the lowest offset simply follows the entry with the
+   highest offset.  Entries may abut, but never overlap.  The area's
+   'entry' pointer points to an arbitrary node in the ring.  */
+struct area_entry
+{
+  /* Links in the doubly-linked ring.  */
+  struct area_entry *prev, *next;
+
+  /* Offset of this entry's address from the value of the base
+     register.  */
+  CORE_ADDR offset;
+
+  /* The size of this entry.  Note that an entry may wrap around from
+     the end of the address space to the beginning.  */
+  CORE_ADDR size;
+
+  /* The value stored here.  */
+  pv_t value;
+};
+
+
+struct pv_area
+{
+  /* This area's base register.  */
+  int base_reg;
+
+  /* The mask to apply to addresses, to make the wrap-around happen at
+     the right place.  */
+  CORE_ADDR addr_mask;
+
+  /* An element of the doubly-linked ring of entries, or zero if we
+     have none.  */
+  struct area_entry *entry;
+};
+
+
+struct pv_area *
+make_pv_area (int base_reg)
+{
+  struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a));
+
+  memset (a, 0, sizeof (*a));
+
+  a->base_reg = base_reg;
+  a->entry = 0;
+
+  /* Remember that shift amounts equal to the type's width are
+     undefined.  */
+  a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1;
+
+  return a;
+}
+
+
+/* Delete all entries from AREA.  */
+static void
+clear_entries (struct pv_area *area)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    {
+      /* This must be a do-while loop, so that we actually process
+         the node the terminating condition checks for.  */
+      do
+        {
+          struct area_entry *next = e->next;
+          xfree (e);
+          e = next;
+        }
+      while (e != area->entry);
+
+      area->entry = 0;
+    }
+}
+
+
+void
+free_pv_area (struct pv_area *area)
+{
+  clear_entries (area);
+  xfree (area);
+}
+
+
+static void
+do_free_pv_area_cleanup (void *arg)
+{
+  free_pv_area ((struct pv_area *) arg);
+}
+
+
+struct cleanup *
+make_cleanup_free_pv_area (struct pv_area *area)
+{
+  return make_cleanup (do_free_pv_area_cleanup, (void *) area);
+}
+
+
+int
+pv_area_store_would_trash (struct pv_area *area, pv_t addr)
+{
+  /* It may seem odd that pvk_constant appears here --- after all,
+     that's the case where we know the most about the address!  But
+     pv_areas are always relative to a register, and we don't know the
+     value of the register, so we can't compare entry addresses to
+     constants.  */
+  return (addr.kind == pvk_unknown
+          || addr.kind == pvk_constant
+          || (addr.kind == pvk_register && addr.reg != area->base_reg));
+}
+
+
+/* Return a pointer to the first entry we hit in AREA starting at
+   OFFSET and going forward.
+
+   This may return zero, if AREA has no entries.
+
+   And since the entries are a ring, this may return an entry that
+   entirely precedes OFFSET.  This is the correct behavior: depending
+   on the sizes involved, we could still overlap such an area, with
+   wrap-around.  */
+static struct area_entry *
+find_entry (struct pv_area *area, CORE_ADDR offset)
+{
+  struct area_entry *e = area->entry;
+
+  if (! e)
+    return 0;
+
+  /* If the next entry would be better than the current one, then scan
+     forward.  Since we use '<' in this loop, it always terminates.
+
+     Note that, even setting aside the addr_mask stuff, we must not
+     simplify this, in high school algebra fashion, to
+     (e->next->offset < e->offset), because of the way < interacts
+     with wrap-around.  We have to subtract offset from both sides to
+     make sure both things we're comparing are on the same side of the
+     discontinuity.  */
+  while (((e->next->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->next;
+
+  /* If the previous entry would be better than the current one, then
+     scan backwards.  */
+  while (((e->prev->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->prev;
+
+  /* In case there's some locality to the searches, set the area's
+     pointer to the entry we've found.  */
+  area->entry = e;
+
+  return e;
+}
+
+
+/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
+   return zero otherwise.  AREA is the area to which ENTRY belongs.  */
+static int
+overlaps (struct pv_area *area,
+          struct area_entry *entry,
+          CORE_ADDR offset,
+          CORE_ADDR size)
+{
+  /* Think carefully about wrap-around before simplifying this.  */
+  return (((entry->offset - offset) & area->addr_mask) < size
+          || ((offset - entry->offset) & area->addr_mask) < entry->size);
+}
+
+
+void
+pv_area_store (struct pv_area *area,
+               pv_t addr,
+               CORE_ADDR size,
+               pv_t value)
+{
+  /* Remove any (potentially) overlapping entries.  */
+  if (pv_area_store_would_trash (area, addr))
+    clear_entries (area);
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* Delete all entries that we would overlap.  */
+      while (e && overlaps (area, e, offset, size))
+        {
+          struct area_entry *next = (e->next == e) ? 0 : e->next;
+          e->prev->next = e->next;
+          e->next->prev = e->prev;
+
+          xfree (e);
+          e = next;
+        }
+
+      /* Move the area's pointer to the next remaining entry.  This
+         will also zero the pointer if we've deleted all the entries.  */
+      area->entry = e;
+    }
+
+  /* Now, there are no entries overlapping us, and area->entry is
+     either zero or pointing at the closest entry after us.  We can
+     just insert ourselves before that.
+
+     But if we're storing an unknown value, don't bother --- that's
+     the default.  */
+  if (value.kind == pvk_unknown)
+    return;
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e));
+      e->offset = offset;
+      e->size = size;
+      e->value = value;
+
+      if (area->entry)
+        {
+          e->prev = area->entry->prev;
+          e->next = area->entry;
+          e->prev->next = e->next->prev = e;
+        }
+      else
+        {
+          e->prev = e->next = e;
+          area->entry = e;
+        }
+    }
+}
+
+
+pv_t
+pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size)
+{
+  /* If we have no entries, or we can't decide how ADDR relates to the
+     entries we do have, then the value is unknown.  */
+  if (! area->entry
+      || pv_area_store_would_trash (area, addr))
+    return pv_unknown ();
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* If this entry exactly matches what we're looking for, then
+         we're set.  Otherwise, say it's unknown.  */
+      if (e->offset == offset && e->size == size)
+        return e->value;
+      else
+        return pv_unknown ();
+    }
+}
+
+
+int
+pv_area_find_reg (struct pv_area *area,
+                  struct gdbarch *gdbarch,
+                  int reg,
+                  CORE_ADDR *offset_p)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    do
+      {
+        if (e->value.kind == pvk_register
+            && e->value.reg == reg
+            && e->value.k == 0
+            && e->size == register_size (gdbarch, reg))
+          {
+            if (offset_p)
+              *offset_p = e->offset;
+            return 1;
+          }
+
+        e = e->next;
+      }
+    while (e != area->entry);
+
+  return 0;
+}
+
+
+void
+pv_area_scan (struct pv_area *area,
+              void (*func) (void *closure,
+                            pv_t addr,
+                            CORE_ADDR size,
+                            pv_t value),
+              void *closure)
+{
+  struct area_entry *e = area->entry;
+  pv_t addr;
+
+  addr.kind = pvk_register;
+  addr.reg = area->base_reg;
+
+  if (e)
+    do
+      {
+        addr.k = e->offset;
+        func (closure, addr, e->size, e->value);
+        e = e->next;
+      }
+    while (e != area->entry);
+}
Index: src/gdb/prologue-value.h
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.h
@@ -0,0 +1,302 @@
+/* Interface to prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#ifndef PROLOGUE_VALUE_H
+#define PROLOGUE_VALUE_H
+
+/* When we analyze a prologue, we're really doing 'abstract
+   interpretation' or 'pseudo-evaluation': running the function's code
+   in simulation, but using conservative approximations of the values
+   it would have when it actually runs.  For example, if our function
+   starts with the instruction:
+
+      addi r1, 42     # add 42 to r1
+
+   we don't know exactly what value will be in r1 after executing this
+   instruction, but we do know it'll be 42 greater than its original
+   value.
+
+   If we then see an instruction like:
+
+      addi r1, 22     # add 22 to r1
+
+   we still don't know what r1's value is, but again, we can say it is
+   now 64 greater than its original value.
+
+   If the next instruction were:
+
+      mov r2, r1      # set r2 to r1's value
+
+   then we can say that r2's value is now the original value of r1
+   plus 64.
+
+   It's common for prologues to save registers on the stack, so we'll
+   need to track the values of stack frame slots, as well as the
+   registers.  So after an instruction like this:
+
+      mov (fp+4), r2
+
+   we'd know that the stack slot four bytes above the frame
+   pointer holds the original value of r1 plus 64.
+
+   And so on.
+
+   Of course, this can only go so far before it gets unreasonable.  If
+   we wanted to be able to say anything about the value of r1 after
+   the instruction:
+
+      xor r1, r3      # exclusive-or r1 and r3, place result in r1
+
+   then things would get pretty complex.  But remember, we're just
+   doing a conservative approximation; if exclusive-or instructions
+   aren't relevant to prologues, we can just say r1's value is now
+   'unknown'.  We can ignore things that are too complex, if that loss
+   of information is acceptable for our application.
+
+   So when I say "conservative approximation" here, what I mean is an
+   approximation that is either accurate, or marked "unknown", but
+   never inaccurate.
+
+   Once you've reached the current PC, or an instruction that you
+   don't know how to simulate, you stop.  Now you can examine the
+   state of the registers and stack slots you've kept track of.
+
+   - To see how large your stack frame is, just check the value of the
+     stack pointer register; if it's the original value of the SP
+     minus a constant, then that constant is the stack frame's size.
+     If the SP's value has been marked as 'unknown', then that means
+     the prologue has done something too complex for us to track, and
+     we don't know the frame size.
+
+   - To see where we've saved the previous frame's registers, we just
+     search the values we've tracked --- stack slots, usually, but
+     registers, too, if you want --- for something equal to the
+     register's original value.  If the ABI suggests a standard place
+     to save a given register, then we can check there first, but
+     really, anything that will get us back the original value will
+     probably work.
+
+   Sure, this takes some work.  But prologue analyzers aren't
+   quick-and-simple pattern matching to recognize a few fixed prologue
+   forms any more; they're big, hairy functions.  Along with inferior
+   function calls, prologue analysis accounts for a substantial
+   portion of the time needed to stabilize a GDB port.  So I think
+   it's worthwhile to look for an approach that will be easier to
+   understand and maintain.  In the approach used here:
+
+   - It's easier to see that the analyzer is correct: you just see
+     whether the analyzer properly (albeit conservatively) simulates
+     the effect of each instruction.
+
+   - It's easier to extend the analyzer: you can add support for new
+     instructions, and know that you haven't broken anything that
+     wasn't already broken before.
+
+   - It's orthogonal: to gather new information, you don't need to
+     complicate the code for each instruction.  As long as your domain
+     of conservative values is already detailed enough to tell you
+     what you need, then all the existing instruction simulations are
+     already gathering the right data for you.
+
+   A 'struct prologue_value' is a conservative approximation of the
+   real value the register or stack slot will have.  */
+
+struct prologue_value {
+
+  /* What sort of value is this?  This determines the interpretation
+     of subsequent fields.  */
+  enum {
+
+    /* We don't know anything about the value.  This is also used for
+       values we could have kept track of, when doing so would have
+       been too complex and we don't want to bother.  The bottom of
+       our lattice.  */
+    pvk_unknown,
+
+    /* A known constant.  K is its value.  */
+    pvk_constant,
+
+    /* The value that register REG originally had *UPON ENTRY TO THE
+       FUNCTION*, plus K.  If K is zero, this means, obviously, just
+       the value REG had upon entry to the function.  REG is a GDB
+       register number.  Before we start interpreting, we initialize
+       every register R to { pvk_register, R, 0 }.  */
+    pvk_register,
+
+  } kind;
+
+  /* The meanings of the following fields depend on 'kind'; see the
+     comments for the specific 'kind' values.  */
+  int reg;
+  CORE_ADDR k;
+};
+
+typedef struct prologue_value pv_t;
+
+
+/* Return the unknown prologue value --- { pvk_unknown, ?, ? }.  */
+pv_t pv_unknown (void);
+
+/* Return the prologue value representing the constant K.  */
+pv_t pv_constant (CORE_ADDR k);
+
+/* Return the prologue value representing the original value of
+   register REG, plus the constant K.  */
+pv_t pv_register (int reg, CORE_ADDR k);
+
+
+/* Return conservative approximations of the results of the following
+   operations.  */
+pv_t pv_add (pv_t a, pv_t b);               /* a + b */
+pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
+pv_t pv_subtract (pv_t a, pv_t b);          /* a - b */
+pv_t pv_logical_and (pv_t a, pv_t b);       /* a & b */
+
+
+/* Return non-zero iff A and B are identical expressions.
+
+   This is not the same as asking if the two values are equal; the
+   result of such a comparison would have to be a pv_boolean, and
+   asking whether two 'unknown' values were equal would give you
+   pv_maybe.  Same for comparing, say, { pvk_register, R1, 0 } and {
+   pvk_register, R2, 0}.
+
+   Instead, this function asks whether the two representations are the
+   same.  */
+int pv_is_identical (pv_t a, pv_t b);
+
+
+/* Return non-zero if A is known to be a constant.  */
+int pv_is_constant (pv_t a);
+
+/* Return non-zero if A is the original value of register number R
+   plus some constant, zero otherwise.  */
+int pv_is_register (pv_t a, int r);
+
+
+/* Return non-zero if A is the original value of register R plus the
+   constant K.  */
+int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
+
+/* A conservative boolean type, including "maybe", when we can't
+   figure out whether something is true or not.  */
+enum pv_boolean {
+  pv_maybe,
+  pv_definite_yes,
+  pv_definite_no,
+};
+
+
+/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
+   an element of an array.  The array starts at ARRAY_ADDR, and has
+   ARRAY_LEN values of ELT_SIZE bytes each.  If ADDR definitely does
+   refer to an array element, set *I to the index of the referenced
+   element in the array, and return pv_definite_yes.  If it definitely
+   doesn't, return pv_definite_no.  If we can't tell, return pv_maybe.
+
+   If the reference does touch the array, but doesn't fall exactly on
+   an element boundary, or doesn't refer to the whole element, return
+   pv_maybe.  */
+enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                                 pv_t array_addr, CORE_ADDR array_len,
+                                 CORE_ADDR elt_size,
+                                 int *i);
+
+
+/* A 'struct pv_area' keeps track of values stored in a particular
+   region of memory.  */
+struct pv_area;
+
+/* Create a new area, tracking stores relative to the original value
+   of BASE_REG.  If BASE_REG is SP, then this effectively records the
+   contents of the stack frame: the original value of the SP is the
+   frame's CFA, or some constant offset from it.
+
+   Stores to constant addresses, unknown addresses, or to addresses
+   relative to registers other than BASE_REG will trash this area; see
+   pv_area_store_would_trash.  */
+struct pv_area *make_pv_area (int base_reg);
+
+/* Free AREA.  */
+void free_pv_area (struct pv_area *area);
+
+
+/* Register a cleanup to free AREA.  */
+struct cleanup *make_cleanup_free_pv_area (struct pv_area *area);
+
+
+/* Store the SIZE-byte value VALUE at ADDR in AREA.
+
+   If ADDR is not relative to the same base register we used in
+   creating AREA, then we can't tell which values here the stored
+   value might overlap, and we'll have to mark everything as
+   unknown.  */
+void pv_area_store (struct pv_area *area,
+                    pv_t addr,
+                    CORE_ADDR size,
+                    pv_t value);
+
+/* Return the SIZE-byte value at ADDR in AREA.  This may return
+   pv_unknown ().  */
+pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size);
+
+/* Return true if storing to address ADDR in AREA would force us to
+   mark the contents of the entire area as unknown.  This could happen
+   if, say, ADDR is unknown, since we could be storing anywhere.  Or,
+   it could happen if ADDR is relative to a different register than
+   AREA's base register, since we don't know the relative values of
+   the two registers.
+
+   If you've reached such a store, it may be better to simply stop the
+   prologue analysis, and return the information you've gathered,
+   instead of losing all that information, most of which is probably
+   okay.  */
+int pv_area_store_would_trash (struct pv_area *area, pv_t addr);
+
+
+/* Search AREA for the original value of register REG.  If we can't
+   find it, return zero; if we can find it, return a non-zero value,
+   and if OFFSET_P is non-zero, set *OFFSET_P to the register's offset
+   within AREA.  GDBARCH is the architecture of which REG is a member.
+
+   In the worst case, this takes time proportional to the number of
+   items stored in AREA.  If you plan to gather a lot of information
+   about registers saved in AREA, consider calling pv_area_scan
+   instead, and collecting all your information in one pass.  */
+int pv_area_find_reg (struct pv_area *area,
+                      struct gdbarch *gdbarch,
+                      int reg,
+                      CORE_ADDR *offset_p);
+
+
+/* For every part of AREA whose value we know, apply FUNC to CLOSURE,
+   the value's address, its size, and the value itself.  */
+void pv_area_scan (struct pv_area *area,
+                   void (*func) (void *closure,
+                                 pv_t addr,
+                                 CORE_ADDR size,
+                                 pv_t value),
+                   void *closure);
+
+
+#endif /* PROLOGUE_VALUE_H */
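
As an illustration of the walk-through in the header comment above,
here is a minimal stand-alone sketch, not part of the patch, that
pseudo-evaluates the example instructions.  Everything prefixed ex_ or
EX_ is hypothetical, as is the final "addi sp, -16", which is assumed
here only to show the frame-size check.  A real analyzer would use
pv_t, pv_add_constant and a pv_area instead.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the patch's types; all ex_ names are hypothetical.  */
typedef uint64_t ex_addr;
enum ex_kind { ex_unknown, ex_constant, ex_register };
struct ex_pv { enum ex_kind kind; int reg; ex_addr k; };

enum { EX_R1, EX_R2, EX_SP, EX_NUM_REGS };

/* Add K to V.  Constants and register-relative values absorb K;
   an unknown value stays unknown.  */
static struct ex_pv
ex_add_constant (struct ex_pv v, ex_addr k)
{
  if (v.kind != ex_unknown)
    v.k += k;
  return v;
}

/* Simulate:  addi r1, 42;  addi r1, 22;  mov r2, r1;  addi sp, -16.  */
static void
ex_simulate (struct ex_pv regs[EX_NUM_REGS])
{
  int i;

  /* Every register starts as its own entry value plus zero.  */
  for (i = 0; i < EX_NUM_REGS; i++)
    {
      regs[i].kind = ex_register;
      regs[i].reg = i;
      regs[i].k = 0;
    }

  regs[EX_R1] = ex_add_constant (regs[EX_R1], 42);
  regs[EX_R1] = ex_add_constant (regs[EX_R1], 22);
  regs[EX_R2] = regs[EX_R1];
  regs[EX_SP] = ex_add_constant (regs[EX_SP], (ex_addr) -16);
}

/* If SP is still the entry SP minus a constant, store that constant
   in *SIZE and return 1; otherwise return 0.  */
static int
ex_frame_size (const struct ex_pv regs[EX_NUM_REGS], ex_addr *size)
{
  if (regs[EX_SP].kind != ex_register || regs[EX_SP].reg != EX_SP)
    return 0;
  *size = 0 - regs[EX_SP].k;
  return 1;
}
```

After ex_simulate, r2 holds the entry value of r1 plus 64, and
ex_frame_size reports a 16-byte frame, matching the reasoning in the
comment.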


* Re: RFA: prologue value modules
  2006-03-23  5:39   ` Jim Blandy
@ 2006-03-23 14:06     ` Jim Blandy
  2006-03-24  1:03     ` Jim Blandy
  1 sibling, 0 replies; 17+ messages in thread
From: Jim Blandy @ 2006-03-23 14:06 UTC (permalink / raw)
  To: gdb-patches


Jim Blandy <jimb@codesourcery.com> writes:
> Okay.  This patch always builds prologue-value.o.

Wait, no it doesn't.  Quilt is playing games with me...


* Re: RFA: prologue value modules
  2006-03-23  5:39   ` Jim Blandy
  2006-03-23 14:06     ` Jim Blandy
@ 2006-03-24  1:03     ` Jim Blandy
  2006-03-24  2:40       ` Jim Blandy
  1 sibling, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-24  1:03 UTC (permalink / raw)
  To: gdb-patches


Jim Blandy <jimb@codesourcery.com> writes:
> Daniel Jacobowitz <drow@false.org> writes:
>> On Wed, Mar 22, 2006 at 08:37:20PM -0800, Jim Blandy wrote:
>>> Okay to commit?
>>
>> As far as I'm concerned, yes; anyone else have comments on the new
>> code?
>>
>>> 	* prologue-value.c, prologue-value.h: New files.
>>> 	* Makefile.in (prologue_value_h): New variable.
>>> 	(HFILES_NO_SRCDIR): List prologue-value.h.
>>> 	(ALLDEPFILES): List prologue-value.c.
>>> 	(prologue-value.o): New rule.
>>
>> Hmm - why not always build it?  It won't get linked in if nothing needs
>> it, and then ports won't need to mention it in TDEPFILES.  And that'll
>> encourage people to use it I hope :-)
>
> Okay.  This patch always builds prologue-value.o.

Okay.  *This* patch always builds prologue-value.o.

src/gdb/ChangeLog:
2006-03-22  Jim Blandy  <jimb@codesourcery.com>

	* prologue-value.c, prologue-value.h: New files.
	* Makefile.in (prologue_value_h): New variable.
	(HFILES_NO_SRCDIR): List prologue-value.h.
	(SFILES): List prologue-value.c.
	(COMMON_OBS): List prologue-value.o.
	(prologue-value.o): New rule.

Index: src/gdb/prologue-value.c
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.c
@@ -0,0 +1,591 @@
+/* Prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#include "defs.h"
+#include "gdb_string.h"
+#include "gdb_assert.h"
+#include "prologue-value.h"
+#include "regcache.h"
+
+\f
+/* Constructors.  */
+
+pv_t
+pv_unknown (void)
+{
+  pv_t v = { pvk_unknown, 0, 0 };
+
+  return v;
+}
+
+
+pv_t
+pv_constant (CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_constant;
+  v.reg = -1;                   /* for debugging */
+  v.k = k;
+
+  return v;
+}
+
+
+pv_t
+pv_register (int reg, CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_register;
+  v.reg = reg;
+  v.k = k;
+
+  return v;
+}
+
+
+\f
+/* Arithmetic operations.  */
+
+/* If one of *A and *B is a constant, and the other isn't, swap the
+   values as necessary to ensure that *B is the constant.  This can
+   reduce the number of cases we need to analyze in the functions
+   below.  */
+static void
+constant_last (pv_t *a, pv_t *b)
+{
+  if (a->kind == pvk_constant
+      && b->kind != pvk_constant)
+    {
+      pv_t temp = *a;
+      *a = *b;
+      *b = temp;
+    }
+}
+
+
+pv_t
+pv_add (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can add a constant to a register.  */
+  if (a.kind == pvk_register
+      && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k + b.k);
+
+  /* We can add a constant to another constant.  */
+  else if (a.kind == pvk_constant
+           && b.kind == pvk_constant)
+    return pv_constant (a.k + b.k);
+
+  /* Anything else we don't know how to add.  We don't have a
+     representation for, say, the sum of two registers, or a multiple
+     of a register's value (adding a register to itself).  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_add_constant (pv_t v, CORE_ADDR k)
+{
+  /* Rather than thinking of all the cases we can and can't handle,
+     we'll just let pv_add take care of that for us.  */
+  return pv_add (v, pv_constant (k));
+}
+
+
+pv_t
+pv_subtract (pv_t a, pv_t b)
+{
+  /* This isn't quite the same as negating B and adding it to A, since
+     we don't have a representation for the negation of anything but a
+     constant.  For example, we can't negate { pvk_register, R1, 10 },
+     but we do know that { pvk_register, R1, 10 } minus { pvk_register,
+     R1, 5 } is { pvk_constant, <ignored>, 5 }.
+
+     This means, for example, that we could subtract two stack
+     addresses; they're both relative to the original SP.  Since the
+     frame pointer is set based on the SP, its value will be the
+     original SP plus some constant (probably zero), so we can use its
+     value just fine, too.  */
+
+  /* No constant_last here: subtraction is not commutative.  */
+
+  /* We can subtract two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k - b.k);
+
+  /* We can subtract a constant from a register.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k - b.k);
+
+  /* We can subtract a register from itself, yielding a constant.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg)
+    return pv_constant (a.k - b.k);
+
+  /* We don't know how to subtract anything else.  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_logical_and (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can 'and' two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k & b.k);
+
+  /* We can 'and' anything with the constant zero.  */
+  else if (b.kind == pvk_constant
+           && b.k == 0)
+    return pv_constant (0);
+
+  /* We can 'and' anything with ~0.  */
+  else if (b.kind == pvk_constant
+           && b.k == ~ (CORE_ADDR) 0)
+    return a;
+
+  /* We can 'and' a register with itself.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg
+           && a.k == b.k)
+    return a;
+
+  /* Otherwise, we don't know.  */
+  else
+    return pv_unknown ();
+}
+
+
+\f
+/* Examining prologue values.  */
+
+int
+pv_is_identical (pv_t a, pv_t b)
+{
+  if (a.kind != b.kind)
+    return 0;
+
+  switch (a.kind)
+    {
+    case pvk_unknown:
+      return 1;
+    case pvk_constant:
+      return (a.k == b.k);
+    case pvk_register:
+      return (a.reg == b.reg && a.k == b.k);
+    default:
+      gdb_assert (0);
+    }
+}
+
+
+int
+pv_is_constant (pv_t a)
+{
+  return (a.kind == pvk_constant);
+}
+
+
+int
+pv_is_register (pv_t a, int r)
+{
+  return (a.kind == pvk_register
+          && a.reg == r);
+}
+
+
+int
+pv_is_register_k (pv_t a, int r, CORE_ADDR k)
+{
+  return (a.kind == pvk_register
+          && a.reg == r
+          && a.k == k);
+}
+
+
+enum pv_boolean
+pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                 pv_t array_addr, CORE_ADDR array_len,
+                 CORE_ADDR elt_size,
+                 int *i)
+{
+  /* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if
+     addr is *before* the start of the array, then this isn't going to
+     be negative...  */
+  pv_t offset = pv_subtract (addr, array_addr);
+
+  if (offset.kind == pvk_constant)
+    {
+      /* This is a rather odd test.  We want to know if the SIZE bytes
+         at ADDR don't overlap the array at all, so you'd expect it to
+         be an || expression: "if we're completely before || we're
+         completely after".  But with unsigned arithmetic, things are
+         different: since it's a number circle, not a number line, the
+         right values for offset.k are actually one contiguous range.  */
+      if (offset.k <= -size
+          && offset.k >= array_len * elt_size)
+        return pv_definite_no;
+      else if (offset.k % elt_size != 0
+               || size != elt_size)
+        return pv_maybe;
+      else
+        {
+          *i = offset.k / elt_size;
+          return pv_definite_yes;
+        }
+    }
+  else
+    return pv_maybe;
+}
+
+
+\f
+/* Areas.  */
+
+
+/* A particular value known to be stored in an area.
+
+   Entries form a ring, sorted by unsigned offset from the area's base
+   register's value.  Since entries can straddle the wrap-around point,
+   unsigned offsets form a circle, not a number line, so the list
+   itself is structured the same way --- there is no inherent head.
+   The entry with the lowest offset simply follows the entry with the
+   highest offset.  Entries may abut, but never overlap.  The area's
+   'entry' pointer points to an arbitrary node in the ring.  */
+struct area_entry
+{
+  /* Links in the doubly-linked ring.  */
+  struct area_entry *prev, *next;
+
+  /* Offset of this entry's address from the value of the base
+     register.  */
+  CORE_ADDR offset;
+
+  /* The size of this entry.  Note that an entry may wrap around from
+     the end of the address space to the beginning.  */
+  CORE_ADDR size;
+
+  /* The value stored here.  */
+  pv_t value;
+};
+
+
+struct pv_area
+{
+  /* This area's base register.  */
+  int base_reg;
+
+  /* The mask to apply to addresses, to make the wrap-around happen at
+     the right place.  */
+  CORE_ADDR addr_mask;
+
+  /* An element of the doubly-linked ring of entries, or zero if we
+     have none.  */
+  struct area_entry *entry;
+};
+
+
+struct pv_area *
+make_pv_area (int base_reg)
+{
+  struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a));
+
+  memset (a, 0, sizeof (*a));
+
+  a->base_reg = base_reg;
+  a->entry = 0;
+
+  /* Remember that shift amounts equal to the type's width are
+     undefined.  */
+  a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1;
+
+  return a;
+}
+
+
+/* Delete all entries from AREA.  */
+static void
+clear_entries (struct pv_area *area)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    {
+      /* This must be a do-while loop, so that we actually process
+         the node tested for in the terminating condition.  */
+      do
+        {
+          struct area_entry *next = e->next;
+          xfree (e);
+          e = next;
+        }
+      while (e != area->entry);
+
+      area->entry = 0;
+    }
+}
+
+
+void
+free_pv_area (struct pv_area *area)
+{
+  clear_entries (area);
+  xfree (area);
+}
+
+
+static void
+do_free_pv_area_cleanup (void *arg)
+{
+  free_pv_area ((struct pv_area *) arg);
+}
+
+
+struct cleanup *
+make_cleanup_free_pv_area (struct pv_area *area)
+{
+  return make_cleanup (do_free_pv_area_cleanup, (void *) area);
+}
+
+
+int
+pv_area_store_would_trash (struct pv_area *area, pv_t addr)
+{
+  /* It may seem odd that pvk_constant appears here --- after all,
+     that's the case where we know the most about the address!  But
+     pv_areas are always relative to a register, and we don't know the
+     value of the register, so we can't compare entry addresses to
+     constants.  */
+  return (addr.kind == pvk_unknown
+          || addr.kind == pvk_constant
+          || (addr.kind == pvk_register && addr.reg != area->base_reg));
+}
+
+
+/* Return a pointer to the first entry we hit in AREA starting at
+   OFFSET and going forward.
+
+   This may return zero, if AREA has no entries.
+
+   And since the entries are a ring, this may return an entry that
+   entirely precedes OFFSET.  This is the correct behavior: depending
+   on the sizes involved, we could still overlap such an area, with
+   wrap-around.  */
+static struct area_entry *
+find_entry (struct pv_area *area, CORE_ADDR offset)
+{
+  struct area_entry *e = area->entry;
+
+  if (! e)
+    return 0;
+
+  /* If the next entry would be better than the current one, then scan
+     forward.  Since we use '<' in this loop, it always terminates.
+
+     Note that, even setting aside the addr_mask stuff, we must not
+     simplify this, in high school algebra fashion, to
+     (e->next->offset < e->offset), because of the way < interacts
+     with wrap-around.  We have to subtract offset from both sides to
+     make sure both things we're comparing are on the same side of the
+     discontinuity.  */
+  while (((e->next->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->next;
+
+  /* If the previous entry would be better than the current one, then
+     scan backwards.  */
+  while (((e->prev->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->prev;
+
+  /* In case there's some locality to the searches, set the area's
+     pointer to the entry we've found.  */
+  area->entry = e;
+
+  return e;
+}
+
+
+/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
+   return zero otherwise.  AREA is the area to which ENTRY belongs.  */
+static int
+overlaps (struct pv_area *area,
+          struct area_entry *entry,
+          CORE_ADDR offset,
+          CORE_ADDR size)
+{
+  /* Think carefully about wrap-around before simplifying this.  */
+  return (((entry->offset - offset) & area->addr_mask) < size
+          || ((offset - entry->offset) & area->addr_mask) < entry->size);
+}
+
+
+void
+pv_area_store (struct pv_area *area,
+               pv_t addr,
+               CORE_ADDR size,
+               pv_t value)
+{
+  /* Remove any (potentially) overlapping entries.  */
+  if (pv_area_store_would_trash (area, addr))
+    clear_entries (area);
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* Delete all entries that we would overlap.  */
+      while (e && overlaps (area, e, offset, size))
+        {
+          struct area_entry *next = (e->next == e) ? 0 : e->next;
+          e->prev->next = e->next;
+          e->next->prev = e->prev;
+
+          xfree (e);
+          e = next;
+        }
+
+      /* Move the area's pointer to the next remaining entry.  This
+         will also zero the pointer if we've deleted all the entries.  */
+      area->entry = e;
+    }
+
+  /* Now, there are no entries overlapping us, and area->entry is
+     either zero or pointing at the closest entry after us.  We can
+     just insert ourselves before that.
+
+     But if we're storing an unknown value, don't bother --- that's
+     the default.  */
+  if (value.kind == pvk_unknown)
+    return;
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e));
+      e->offset = offset;
+      e->size = size;
+      e->value = value;
+
+      if (area->entry)
+        {
+          e->prev = area->entry->prev;
+          e->next = area->entry;
+          e->prev->next = e->next->prev = e;
+        }
+      else
+        {
+          e->prev = e->next = e;
+          area->entry = e;
+        }
+    }
+}
+
+
+pv_t
+pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size)
+{
+  /* If we have no entries, or we can't decide how ADDR relates to the
+     entries we do have, then the value is unknown.  */
+  if (! area->entry
+      || pv_area_store_would_trash (area, addr))
+    return pv_unknown ();
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* If this entry exactly matches what we're looking for, then
+         we're set.  Otherwise, say it's unknown.  */
+      if (e->offset == offset && e->size == size)
+        return e->value;
+      else
+        return pv_unknown ();
+    }
+}
+
+
+int
+pv_area_find_reg (struct pv_area *area,
+                  struct gdbarch *gdbarch,
+                  int reg,
+                  CORE_ADDR *offset_p)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    do
+      {
+        if (e->value.kind == pvk_register
+            && e->value.reg == reg
+            && e->value.k == 0
+            && e->size == register_size (gdbarch, reg))
+          {
+            if (offset_p)
+              *offset_p = e->offset;
+            return 1;
+          }
+
+        e = e->next;
+      }
+    while (e != area->entry);
+
+  return 0;
+}
+
+
+void
+pv_area_scan (struct pv_area *area,
+              void (*func) (void *closure,
+                            pv_t addr,
+                            CORE_ADDR size,
+                            pv_t value),
+              void *closure)
+{
+  struct area_entry *e = area->entry;
+  pv_t addr;
+
+  addr.kind = pvk_register;
+  addr.reg = area->base_reg;
+
+  if (e)
+    do
+      {
+        addr.k = e->offset;
+        func (closure, addr, e->size, e->value);
+        e = e->next;
+      }
+    while (e != area->entry);
+}
Index: src/gdb/prologue-value.h
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.h
@@ -0,0 +1,302 @@
+/* Interface to prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#ifndef PROLOGUE_VALUE_H
+#define PROLOGUE_VALUE_H
+
+/* When we analyze a prologue, we're really doing 'abstract
+   interpretation' or 'pseudo-evaluation': running the function's code
+   in simulation, but using conservative approximations of the values
+   it would have when it actually runs.  For example, if our function
+   starts with the instruction:
+
+      addi r1, 42     # add 42 to r1
+
+   we don't know exactly what value will be in r1 after executing this
+   instruction, but we do know it'll be 42 greater than its original
+   value.
+
+   If we then see an instruction like:
+
+      addi r1, 22     # add 22 to r1
+
+   we still don't know what r1's value is, but again, we can say it is
+   now 64 greater than its original value.
+
+   If the next instruction were:
+
+      mov r2, r1      # set r2 to r1's value
+
+   then we can say that r2's value is now the original value of r1
+   plus 64.
+
+   It's common for prologues to save registers on the stack, so we'll
+   need to track the values of stack frame slots, as well as the
+   registers.  So after an instruction like this:
+
+      mov (fp+4), r2
+
+   we'd know that the stack slot four bytes above the frame
+   pointer holds the original value of r1 plus 64.
+
+   And so on.
+
+   Of course, this can only go so far before it gets unreasonable.  If
+   we wanted to be able to say anything about the value of r1 after
+   the instruction:
+
+      xor r1, r3      # exclusive-or r1 and r3, place result in r1
+
+   then things would get pretty complex.  But remember, we're just
+   doing a conservative approximation; if exclusive-or instructions
+   aren't relevant to prologues, we can just say r1's value is now
+   'unknown'.  We can ignore things that are too complex, if that loss
+   of information is acceptable for our application.
+
+   So when I say "conservative approximation" here, what I mean is an
+   approximation that is either accurate, or marked "unknown", but
+   never inaccurate.
+
+   Once you've reached the current PC, or an instruction that you
+   don't know how to simulate, you stop.  Now you can examine the
+   state of the registers and stack slots you've kept track of.
+
+   - To see how large your stack frame is, just check the value of the
+     stack pointer register; if it's the original value of the SP
+     minus a constant, then that constant is the stack frame's size.
+     If the SP's value has been marked as 'unknown', then that means
+     the prologue has done something too complex for us to track, and
+     we don't know the frame size.
+
+   - To see where we've saved the previous frame's registers, we just
+     search the values we've tracked --- stack slots, usually, but
+     registers, too, if you want --- for something equal to the
+     register's original value.  If the ABI suggests a standard place
+     to save a given register, then we can check there first, but
+     really, anything that will get us back the original value will
+     probably work.
+
+   Sure, this takes some work.  But prologue analyzers aren't
+   quick-and-simple pattern matching to recognize a few fixed prologue
+   forms any more; they're big, hairy functions.  Along with inferior
+   function calls, prologue analysis accounts for a substantial
+   portion of the time needed to stabilize a GDB port.  So I think
+   it's worthwhile to look for an approach that will be easier to
+   understand and maintain.  In the approach used here:
+
+   - It's easier to see that the analyzer is correct: you just see
+     whether the analyzer properly (albeit conservatively) simulates
+     the effect of each instruction.
+
+   - It's easier to extend the analyzer: you can add support for new
+     instructions, and know that you haven't broken anything that
+     wasn't already broken before.
+
+   - It's orthogonal: to gather new information, you don't need to
+     complicate the code for each instruction.  As long as your domain
+     of conservative values is already detailed enough to tell you
+     what you need, then all the existing instruction simulations are
+     already gathering the right data for you.
+
+   A 'struct prologue_value' is a conservative approximation of the
+   real value the register or stack slot will have.  */
+
+struct prologue_value {
+
+  /* What sort of value is this?  This determines the interpretation
+     of subsequent fields.  */
+  enum {
+
+    /* We don't know anything about the value.  This is also used for
+       values we could have kept track of, when doing so would have
+       been too complex and we don't want to bother.  The bottom of
+       our lattice.  */
+    pvk_unknown,
+
+    /* A known constant.  K is its value.  */
+    pvk_constant,
+
+    /* The value that register REG originally had *UPON ENTRY TO THE
+       FUNCTION*, plus K.  If K is zero, this means, obviously, just
+       the value REG had upon entry to the function.  REG is a GDB
+       register number.  Before we start interpreting, we initialize
+       every register R to { pvk_register, R, 0 }.  */
+    pvk_register,
+
+  } kind;
+
+  /* The meanings of the following fields depend on 'kind'; see the
+     comments for the specific 'kind' values.  */
+  int reg;
+  CORE_ADDR k;
+};
+
+typedef struct prologue_value pv_t;
+
+
+/* Return the unknown prologue value --- { pvk_unknown, ?, ? }.  */
+pv_t pv_unknown (void);
+
+/* Return the prologue value representing the constant K.  */
+pv_t pv_constant (CORE_ADDR k);
+
+/* Return the prologue value representing the original value of
+   register REG, plus the constant K.  */
+pv_t pv_register (int reg, CORE_ADDR k);
+
+
+/* Return conservative approximations of the results of the following
+   operations.  */
+pv_t pv_add (pv_t a, pv_t b);               /* a + b */
+pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
+pv_t pv_subtract (pv_t a, pv_t b);          /* a - b */
+pv_t pv_logical_and (pv_t a, pv_t b);       /* a & b */
+
+
+/* Return non-zero iff A and B are identical expressions.
+
+   This is not the same as asking if the two values are equal; the
+   result of such a comparison would have to be a pv_boolean, and
+   asking whether two 'unknown' values were equal would give you
+   pv_maybe.  Same for comparing, say, { pvk_register, R1, 0 } and {
+   pvk_register, R2, 0}.
+
+   Instead, this function asks whether the two representations are the
+   same.  */
+int pv_is_identical (pv_t a, pv_t b);
+
+
+/* Return non-zero if A is known to be a constant.  */
+int pv_is_constant (pv_t a);
+
+/* Return non-zero if A is the original value of register number R
+   plus some constant, zero otherwise.  */
+int pv_is_register (pv_t a, int r);
+
+
+/* Return non-zero if A is the original value of register R plus the
+   constant K.  */
+int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
+
+/* A conservative boolean type, including "maybe", when we can't
+   figure out whether something is true or not.  */
+enum pv_boolean {
+  pv_maybe,
+  pv_definite_yes,
+  pv_definite_no,
+};
+
+
+/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
+   an element of an array.  The array starts at ARRAY_ADDR, and has
+   ARRAY_LEN values of ELT_SIZE bytes each.  If ADDR definitely does
+   refer to an array element, set *I to the index of the referenced
+   element in the array, and return pv_definite_yes.  If it definitely
+   doesn't, return pv_definite_no.  If we can't tell, return pv_maybe.
+
+   If the reference does touch the array, but doesn't fall exactly on
+   an element boundary, or doesn't refer to the whole element, return
+   pv_maybe.  */
+enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                                 pv_t array_addr, CORE_ADDR array_len,
+                                 CORE_ADDR elt_size,
+                                 int *i);
+
+
+/* A 'struct pv_area' keeps track of values stored in a particular
+   region of memory.  */
+struct pv_area;
+
+/* Create a new area, tracking stores relative to the original value
+   of BASE_REG.  If BASE_REG is SP, then this effectively records the
+   contents of the stack frame: the original value of the SP is the
+   frame's CFA, or some constant offset from it.
+
+   Stores to constant addresses, unknown addresses, or to addresses
+   relative to registers other than BASE_REG will trash this area; see
+   pv_area_store_would_trash.  */
+struct pv_area *make_pv_area (int base_reg);
+
+/* Free AREA.  */
+void free_pv_area (struct pv_area *area);
+
+
+/* Register a cleanup to free AREA.  */
+struct cleanup *make_cleanup_free_pv_area (struct pv_area *area);
+
+
+/* Store the SIZE-byte value VALUE at ADDR in AREA.
+
+   If ADDR is not relative to the same base register we used in
+   creating AREA, then we can't tell which values here the stored
+   value might overlap, and we'll have to mark everything as
+   unknown.  */
+void pv_area_store (struct pv_area *area,
+                    pv_t addr,
+                    CORE_ADDR size,
+                    pv_t value);
+
+/* Return the SIZE-byte value at ADDR in AREA.  This may return
+   pv_unknown ().  */
+pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size);
+
+/* Return true if storing to address ADDR in AREA would force us to
+   mark the contents of the entire area as unknown.  This could happen
+   if, say, ADDR is unknown, since we could be storing anywhere.  Or,
+   it could happen if ADDR is relative to a different register than
+   the other stores' base register, since we don't know the relative
+   values of the two registers.
+
+   If you've reached such a store, it may be better to simply stop the
+   prologue analysis, and return the information you've gathered,
+   instead of losing all that information, most of which is probably
+   okay.  */
+int pv_area_store_would_trash (struct pv_area *area, pv_t addr);
+
+
+/* Search AREA for the original value of REG.  If we can't find
+   it, return zero; if we can find it, return a non-zero value, and if
+   OFFSET_P is non-zero, set *OFFSET_P to the register's offset within
+   AREA.  GDBARCH is the architecture of which REG is a member.
+
+   In the worst case, this takes time proportional to the number of
+   items stored in AREA.  If you plan to gather a lot of information
+   about registers saved in AREA, consider calling pv_area_scan
+   instead, and collecting all your information in one pass.  */
+int pv_area_find_reg (struct pv_area *area,
+                      struct gdbarch *gdbarch,
+                      int reg,
+                      CORE_ADDR *offset_p);
+
+
+/* For every part of AREA whose value we know, apply FUNC to CLOSURE,
+   the value's address, its size, and the value itself.  */
+void pv_area_scan (struct pv_area *area,
+                   void (*func) (void *closure,
+                                 pv_t addr,
+                                 CORE_ADDR size,
+                                 pv_t value),
+                   void *closure);
+
+
+#endif /* PROLOGUE_VALUE_H */
Index: src/gdb/Makefile.in
===================================================================
--- src.orig/gdb/Makefile.in
+++ src/gdb/Makefile.in
@@ -542,6 +542,7 @@ SFILES = ada-exp.y ada-lang.c ada-typepr
 	objc-exp.y objc-lang.c \
 	objfiles.c osabi.c observer.c \
 	p-exp.y p-lang.c p-typeprint.c p-valprint.c parse.c printcmd.c \
+	prologue-value.c \
 	regcache.c reggroups.c remote.c remote-fileio.c \
 	scm-exp.c scm-lang.c scm-valprint.c \
 	sentinel-frame.c \
@@ -756,6 +757,7 @@ ppcnbsd_tdep_h = ppcnbsd-tdep.h
 ppcobsd_tdep_h = ppcobsd-tdep.h
 ppc_tdep_h = ppc-tdep.h
 proc_utils_h = proc-utils.h
+prologue_value_h = prologue-value.h
 regcache_h = regcache.h
 reggroups_h = reggroups.h
 regset_h = regset.h
@@ -866,6 +868,7 @@ HFILES_NO_SRCDIR = bcache.h buildsym.h c
 	symfile.h stabsread.h target.h terminal.h typeprint.h \
 	xcoffsolib.h \
 	macrotab.h macroexp.h macroscope.h \
+	prologue-value.h \
 	ada-lang.h c-lang.h f-lang.h \
 	jv-lang.h \
 	m2-lang.h  p-lang.h \
@@ -947,7 +950,8 @@ COMMON_OBS = $(DEPFILES) $(CONFIG_OBS) $
 	reggroups.o regset.o \
 	trad-frame.o \
 	tramp-frame.o \
-	solib.o solib-null.o
+	solib.o solib-null.o \
+	prologue-value.o
 
 TSOBS = inflow.o
 
@@ -2434,6 +2438,8 @@ procfs.o: procfs.c $(defs_h) $(inferior_
 proc-service.o: proc-service.c $(defs_h) $(gdb_proc_service_h) $(inferior_h) \
 	$(symtab_h) $(target_h) $(gregset_h)
 proc-why.o: proc-why.c $(defs_h) $(proc_utils_h)
+prologue-value.o: prologue-value.c $(defs_h) $(gdb_string_h) $(gdb_assert_h) \
+	$(prologue_value_h) $(regcache_h)
 p-typeprint.o: p-typeprint.c $(defs_h) $(gdb_obstack_h) $(bfd_h) $(symtab_h) \
 	$(gdbtypes_h) $(expression_h) $(value_h) $(gdbcore_h) $(target_h) \
 	$(language_h) $(p_lang_h) $(typeprint_h) $(gdb_string_h)
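A note for readers of the archive: the pseudo-evaluation described in the prologue-value.h comment can be tried outside of GDB.  What follows is a minimal standalone sketch, not part of the patch.  The names (addr_t, struct val, val_add, val_sub, simulate_example) and the 64-bit address type are assumptions made for illustration; only the three-kind representation and the addi/addi/mov sequence are taken from the header comment.  Here val_sub treats constant-minus-register as unknown, since the representation has no way to express a negated register.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR (assumption: 64-bit unsigned).  */
typedef uint64_t addr_t;

enum kind { K_UNKNOWN, K_CONSTANT, K_REGISTER };

/* Same shape as struct prologue_value: unknown, the constant K, or
   "the value register REG had on entry, plus K".  */
struct val { enum kind kind; int reg; addr_t k; };

static struct val
val_unknown (void)
{
  struct val v = { K_UNKNOWN, 0, 0 };
  return v;
}

static struct val
val_constant (addr_t k)
{
  struct val v = { K_CONSTANT, -1, k };
  return v;
}

static struct val
val_register (int reg, addr_t k)
{
  struct val v = { K_REGISTER, reg, k };
  assert (reg >= 0);
  return v;
}

/* A + B.  Only register+constant and constant+constant are
   representable; everything else degrades to unknown.  */
static struct val
val_add (struct val a, struct val b)
{
  if (a.kind == K_CONSTANT && b.kind != K_CONSTANT)
    {
      struct val t = a;
      a = b;
      b = t;
    }
  if (a.kind == K_REGISTER && b.kind == K_CONSTANT)
    return val_register (a.reg, a.k + b.k);
  if (a.kind == K_CONSTANT && b.kind == K_CONSTANT)
    return val_constant (a.k + b.k);
  return val_unknown ();
}

/* A - B.  The operands are never swapped: constant minus register
   has no representation, so it is unknown.  */
static struct val
val_sub (struct val a, struct val b)
{
  if (a.kind == K_CONSTANT && b.kind == K_CONSTANT)
    return val_constant (a.k - b.k);
  if (a.kind == K_REGISTER && b.kind == K_CONSTANT)
    return val_register (a.reg, a.k - b.k);
  if (a.kind == K_REGISTER && b.kind == K_REGISTER && a.reg == b.reg)
    return val_constant (a.k - b.k);
  return val_unknown ();
}

enum { R1 = 1, R2 = 2, NUM_REGS = 4 };

/* Simulate "addi r1, 42; addi r1, 22; mov r2, r1" and return what we
   then know about r2.  */
static struct val
simulate_example (void)
{
  struct val regs[NUM_REGS];
  int i;

  /* Before interpreting, every register holds its own entry value.  */
  for (i = 0; i < NUM_REGS; i++)
    regs[i] = val_register (i, 0);

  regs[R1] = val_add (regs[R1], val_constant (42));
  regs[R1] = val_add (regs[R1], val_constant (22));
  regs[R2] = regs[R1];

  return regs[R2];
}
```

Under these assumptions, simulate_example () returns { K_REGISTER, R1, 64 }, which reads as "the original value of r1, plus 64".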

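The comments on find_entry and overlaps warn that offsets live on a number circle, not a number line.  The sketch below, again not part of the patch, isolates that one idea.  The function names, the addr_t type, and the explicit MASK parameter (standing in for the area's addr_mask) are illustrative assumptions.  The naive version is included only to show where number-line reasoning goes wrong.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for GDB's CORE_ADDR (assumption: 64-bit unsigned).  */
typedef uint64_t addr_t;

/* Return non-zero if the SIZE_A bytes at OFF_A and the SIZE_B bytes
   at OFF_B share a byte, on a circular address space whose addresses
   are taken modulo MASK + 1.  Two ranges overlap exactly when one
   range's start falls inside the other range; subtracting before
   masking puts both operands on the same side of the wrap-around
   point.  */
static int
ranges_overlap (addr_t mask,
                addr_t off_a, addr_t size_a,
                addr_t off_b, addr_t size_b)
{
  /* MASK must have the form 2^n - 1.  */
  assert ((mask & (mask + 1)) == 0);

  return (((off_a - off_b) & mask) < size_b
          || ((off_b - off_a) & mask) < size_a);
}

/* The number-line test one might write first.  It misses ranges that
   straddle the wrap-around point.  */
static int
naive_overlap (addr_t off_a, addr_t size_a,
               addr_t off_b, addr_t size_b)
{
  return off_a < off_b + size_b && off_b < off_a + size_a;
}
```

With a 32-bit mask, an 8-byte entry at offset 0xfffffffc covers bytes 0 through 3 after wrapping, so a 4-byte store at offset 0 overlaps it.  ranges_overlap reports that overlap and naive_overlap does not.  Entries that merely abut, such as 4 bytes at 16 and 4 bytes at 20, are reported as not overlapping.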


* Re: RFA: prologue value modules
  2006-03-24  1:03     ` Jim Blandy
@ 2006-03-24  2:40       ` Jim Blandy
  2006-03-28  7:51         ` Jim Blandy
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-24  2:40 UTC (permalink / raw)
  To: gdb-patches


Jim Blandy <jimb@codesourcery.com> writes:
> Okay.  *This* patch always builds prologue-value.o.
>
> src/gdb/ChangeLog:
> 2006-03-22  Jim Blandy  <jimb@codesourcery.com>
>
> 	* prologue-value.c, prologue-value.h: New files.
> 	* Makefile.in (prologue_value_h): New variable.
> 	(HFILES_NO_SRCDIR): List prologue-value.h.
> 	(SFILES): List prologue-value.c.
> 	(COMMON_OBS): List prologue-value.o.
> 	(prologue-value.o): New rule.

As an aside --- I'm actually going to commit these ChangeLog entries
using my old Red Hat email address.  I wrote this code while working
at Red Hat, and I'd like that to be apparent from the ChangeLog.



* Re: RFA: prologue value modules
  2006-03-23  5:20 RFA: prologue value modules Jim Blandy
  2006-03-23  5:24 ` Daniel Jacobowitz
@ 2006-03-24 14:48 ` Eli Zaretskii
  2006-03-24 22:43   ` Jim Blandy
  1 sibling, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2006-03-24 14:48 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

> From: Jim Blandy <jimb@codesourcery.com>
> Date: Wed, 22 Mar 2006 20:37:20 -0800
> 
> To recap: this patch adds prologue-value.[ch], support libraries for
> the new prologue analysis strategy used in the Renesas M32C port and
> some internal Red Hat ports.  I think they make prologue analysis
> simpler to write and more robust.

Are we going to have this documented in gdbint.texinfo?



* Re: RFA: prologue value modules
  2006-03-24 14:48 ` Eli Zaretskii
@ 2006-03-24 22:43   ` Jim Blandy
  2006-03-28  2:24     ` Jim Blandy
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-24 22:43 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches


Eli Zaretskii <eliz@gnu.org> writes:
>> From: Jim Blandy <jimb@codesourcery.com>
>> Date: Wed, 22 Mar 2006 20:37:20 -0800
>> 
>> To recap: this patch adds prologue-value.[ch], support libraries for
>> the new prologue analysis strategy used in the Renesas M32C port and
>> some internal Red Hat ports.  I think they make prologue analysis
>> simpler to write and more robust.
>
> Are we going to have this documented in gdbint.texinfo?

Yes --- I promised that before.  Thanks for reminding me.



* Re: RFA: prologue value modules
  2006-03-24 22:43   ` Jim Blandy
@ 2006-03-28  2:24     ` Jim Blandy
  2006-03-28 18:58       ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-28  2:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches


Jim Blandy <jimb@codesourcery.com> writes:
> Eli Zaretskii <eliz@gnu.org> writes:
>>> From: Jim Blandy <jimb@codesourcery.com>
>>> Date: Wed, 22 Mar 2006 20:37:20 -0800
>>> 
>>> To recap: this patch adds prologue-value.[ch], support libraries for
>>> the new prologue analysis strategy used in the Renesas M32C port and
>>> some internal Red Hat ports.  I think they make prologue analysis
>>> simpler to write and more robust.
>>
>> Are we going to have this documented in gdbint.texinfo?
>
> Yes --- I promised that before.  Thanks for reminding me.

How's this doc patch?

src/gdb/doc/ChangeLog:
2006-03-27  Jim Blandy  <jimb@redhat.com>

	* gdbint.texinfo (Prologue Analysis): New section.

Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame 
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
 
+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved.  Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames.  This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit Dwarf call frame information (``CFI'')
+that gives @value{GDBN} all the information it needs to do this.  But
+CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers.  A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on.  Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful.  Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers.  Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate.  However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile.  Keeping the prologue analyzers working as GCC (and the
+ISA's themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in
+@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provide a
+general framework for writing prologue analyzers that are simpler and
+more robust than ad-hoc analyzers.  When we analyze a prologue using
+the prologue-value framework, we're really doing ``abstract
+interpretation'' or ``pseudo-evaluation'': running the function's code
+in simulation, but using conservative approximations of the values
+registers and memory would hold when the code actually runs.  For
+example, if our function starts with the instruction:
+
+@example
+addi r1, 42     # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22     # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+   mov r2, r1      # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers.  So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+Then we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable.  If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3      # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex.  But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''.  We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values.  Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the
+register's original value.  If the ABI suggests a standard place
+to save a given register, then we can check there first, but
+really, anything that will get us back the original value will
+probably work.
+@end itemize
+
+This does take some work.  But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions.  Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port.  So I think it's
+worthwhile to look for an approach that will be easier to understand
+and maintain.  In the approach used here:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction.  As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling
 
 @cindex breakpoints
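
As an illustrative aside (not part of the patch): the pseudo-evaluation
walked through in the doc text can be sketched in a few lines of
self-contained C.  The names here (struct val, add_constant,
simulate_example, the register numbering) are simplified stand-ins
invented for this sketch; they are not the declarations in
prologue-value.h.

```c
#include <assert.h>

/* Simplified stand-in for a prologue value; hypothetical, not the
   pv_t declared in prologue-value.h.  */
enum val_kind { VAL_UNKNOWN, VAL_CONSTANT, VAL_REGISTER };

struct val
{
  enum val_kind kind;
  int reg;              /* Only meaningful for VAL_REGISTER.  */
  long k;               /* Constant, or offset from REG's original value.  */
};

/* Hypothetical register numbering for the example machine.  */
enum { R1, R2, R3, FP, NUM_REGS };

struct state
{
  struct val regs[NUM_REGS];
  struct val slot_fp_4;         /* The stack slot at fp+4.  */
};

static struct val
val_unknown (void)
{
  struct val v = { VAL_UNKNOWN, -1, 0 };
  return v;
}

static struct val
val_register (int reg, long k)
{
  struct val v = { VAL_REGISTER, reg, k };
  return v;
}

/* Add K to V.  An unknown value stays unknown.  */
static struct val
add_constant (struct val v, long k)
{
  if (v.kind != VAL_UNKNOWN)
    v.k += k;
  return v;
}

/* Simulate the example instructions from the text, in order.  */
void
simulate_example (struct state *st)
{
  int i;

  assert (st != 0);

  /* On entry, every register holds its own original value.  */
  for (i = 0; i < NUM_REGS; i++)
    st->regs[i] = val_register (i, 0);
  st->slot_fp_4 = val_unknown ();

  st->regs[R1] = add_constant (st->regs[R1], 42);   /* addi r1, 42 */
  st->regs[R1] = add_constant (st->regs[R1], 22);   /* addi r1, 22 */
  st->regs[R2] = st->regs[R1];                      /* mov r2, r1 */
  st->slot_fp_4 = st->regs[R2];                     /* mov (fp+4), r2 */
  st->regs[R1] = val_unknown ();                    /* xor r1, r3 */
}
```

After simulate_example returns, r2 and the fp+4 slot both hold "original
r1 plus 64", while r1 itself is unknown because the sketch gives up on
the xor rather than guess.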



* Re: RFA: prologue value modules
  2006-03-24  2:40       ` Jim Blandy
@ 2006-03-28  7:51         ` Jim Blandy
  0 siblings, 0 replies; 17+ messages in thread
From: Jim Blandy @ 2006-03-28  7:51 UTC (permalink / raw)
  To: gdb-patches


Jim Blandy <jimb@codesourcery.com> writes:
> Jim Blandy <jimb@codesourcery.com> writes:
>> Okay.  *This* patch always builds prologue-value.o.
>>
>> src/gdb/ChangeLog:
>> 2006-03-22  Jim Blandy  <jimb@codesourcery.com>
>>
>> 	* prologue-value.c, prologue-value.h: New files.
>> 	* Makefile.in (prologue_value_h): New variable.
>> 	(HFILES_NO_SRCDIR): List prologue-value.h.
>> 	(SFILES): List prologue-value.c.
>> 	(COMMON_OBS): List prologue-value.o.
>> 	(prologue-value.o): New rule.

I haven't seen any further comments on this, other than people saying
they might like to use it, so once Eli has signed off on the docs,
I'll commit this.



* Re: RFA: prologue value modules
  2006-03-28  2:24     ` Jim Blandy
@ 2006-03-28 18:58       ` Eli Zaretskii
  2006-03-28 19:15         ` Jim Blandy
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2006-03-28 18:58 UTC (permalink / raw)
  To: Jim Blandy; +Cc: gdb-patches

> Cc: gdb-patches@sources.redhat.com
> From: Jim Blandy <jimb@codesourcery.com>
> Date: Mon, 27 Mar 2006 15:59:33 -0800
> 
> >>
> >> Are we going to have this documented in gdbint.texinfo?
> >
> > Yes --- I promised that before.  Thanks for reminding me.
> 
> How's this doc patch?

It's fine, modulo some minor comments below.  Thanks.

> +Modern versions of GCC emit Dwarf call frame information (``CFI'')
> +that gives @value{GDBN} all the information it needs to do this.

AFAICS, this is the first time the manual mentions "CFI", so I think
at least an index entry is in order, if not a short explanation of
what it is (perhaps in a footnote for now, pending some real
documentation in the future), or a pointer to some doc on the net.

> +and fragile.  Keeping the prologue analyzers working as GCC (and the
> +ISA's themselves) evolved became a substantial task.

Ditto for "ISA".

> +@cindex @file{prologue-value.c}
> +@cindex abstract interpretation of function prologues
> +@cindex pseudo-evaluation of function prologues
> +To try to address this problem, the code in
> +@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provide a

Should these file names include the leading "gdb/" directory?  I'm not
sure it's really required; OTOH, having too long strings in @file{}
might produce ugly printed version, because TeX does not know how to
hyphenate inside @file{}.

> +@example
> +   mov r2, r1      # set r2 to r1's value
> +@end example

This @example is indented differently than the rest.  Any reason?

> +@example
> +mov (fp+4), r2
> +@end example
> +@noindent
> +Then we'd know that the stack slot four bytes above the frame pointer
   ^^^^
This should be a lowercase "then", since it doesn't start a sentence.

> +register's original value.  If the ABI suggests a standard place

We have a section about the ABI, so I think a cross-reference there
will be a good idea.

>                                                      So I think it's
> +worthwhile to look for an approach that will be easier to understand
> +and maintain.  In the approach used here:

I think we need to reword ``I'' and ``here'', so that they look
natural in the context of the manual (as opposed to a mail message or
a source file).

> +The file @file{prologue-value.h} contains detailed comments explaining
> +the framework and how to use it.

Would it be a good idea to have the listing of the API in the manual,
with short explanations?  I'm not saying it is necessarily required,
but please give it a thought.


* Re: RFA: prologue value modules
  2006-03-28 18:58       ` Eli Zaretskii
@ 2006-03-28 19:15         ` Jim Blandy
  2006-03-28 21:17           ` Eli Zaretskii
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-28 19:15 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Jim Blandy, gdb-patches

On 3/28/06, Eli Zaretskii <eliz@gnu.org> wrote:
> > +Modern versions of GCC emit Dwarf call frame information (``CFI'')
> > +that gives @value{GDBN} all the information it needs to do this.
>
> AFAICS, this is the first time the manual mentions "CFI", so I think
> at least an index entry is in order, if not a short explanation of
> what it is (perhaps in a footnote for now, pending some real
> documentation in the future), or a pointer to some doc on the net.

I had an entry for "call frame information"; I've added an entry for
"CFI" next to it.  The preceding paragraph is an explanation of what
call frame information does; I've changed the text to make that a bit
clearer.

> > +and fragile.  Keeping the prologue analyzers working as GCC (and the
> > +ISA's themselves) evolved became a substantial task.
>
> Ditto for "ISA".

I just wrote out "instruction set".  I was being cryptic, for no good reason.

> > +@cindex @file{prologue-value.c}
> > +@cindex abstract interpretation of function prologues
> > +@cindex pseudo-evaluation of function prologues
> > +To try to address this problem, the code in
> > +@file{gdb/prologue-value.h} and @file{gdb/prologue-value.c} provide a
>
> Should these file names include the leading "gdb/" directory?  I'm not
> sure it's really required; OTOH, having too long strings in @file{}
> might produce ugly printed version, because TeX does not know how to
> hyphenate inside @file{}.

Good point; I've dropped the "gdb/".

> > +@example
> > +   mov r2, r1      # set r2 to r1's value
> > +@end example
>
> This @example is indented differently than the rest.  Any reason?

Oversight.

> > +@example
> > +mov (fp+4), r2
> > +@end example
> > +@noindent
> > +Then we'd know that the stack slot four bytes above the frame pointer
>    ^^^^
> This should be a lowercase "then", since it doesn't start a sentence.

Right you are.

> > +register's original value.  If the ABI suggests a standard place
>
> We have a section about the ABI, so I think a cross-reference there
> will be a good idea.

"ABI" is a bad term here, because it includes all sorts of stuff that
isn't relevant to prologue analysis at all.  I've replaced it with
"calling conventions".

> >                                                      So I think it's
> > +worthwhile to look for an approach that will be easier to understand
> > +and maintain.  In the approach used here:
>
> I think we need to reword ``I'' and ``here'', so that they look
> natural in the context of the manual (as opposed to a mail message or
> a source file).

Sorry --- I'd meant to fix the first-person references.  I changed
"here" to "above", which is more writerly.

> > +The file @file{prologue-value.h} contains detailed comments explaining
> > +the framework and how to use it.
>
> Would it be a good idea to have the listing of the API in the manual,
> with short explanations?  I'm not saying it is necessarily required,
> but please give it a thought.

I'd rather have the API described in the header file.  I don't think
we, as a project, can afford to maintain two copies of that sort of
documentation, and I think the header files are the best place for
detailed, function-by-function information.

How does this look?

Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.

+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+@cindex CFI (call frame information)
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved.  Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames.  This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit Dwarf call frame information (``CFI''),
+which describes how to find frame base addresses and saved registers.
+But CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers.  A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on.  Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful.  Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers.  Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate.  However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile.  Keeping the prologue analyzers working as GCC (and the
+instruction sets themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in @file{prologue-value.h}
+and @file{prologue-value.c} provides a general framework for writing
+prologue analyzers that are simpler and more robust than ad-hoc
+analyzers.  When we analyze a prologue using the prologue-value
+framework, we're really doing ``abstract interpretation'' or
+``pseudo-evaluation'': running the function's code in simulation, but
+using conservative approximations of the values registers and memory
+would hold when the code actually runs.  For example, if our function
+starts with the instruction:
+
+@example
+addi r1, 42     # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22     # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+mov r2, r1      # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers.  So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+then we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable.  If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3      # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex.  But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''.  We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values.  Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the register's
+original value.  If the calling conventions suggest a standard place
+to save a given register, then we can check there first, but really,
+anything that will get us back the original value will probably work.
+@end itemize
+
+This does take some work.  But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions.  Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port.  So it's
+worthwhile to look for an approach that will be easier to understand
+and maintain.  In the approach described above:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction.  As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling

 @cindex breakpoints
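
As an illustrative aside (not part of the patch): once the simulation
stops, the two queries described in the doc text (frame size, and where
a register was saved) reduce to simple checks on the tracked values.
The sketch below assumes a downward-growing stack and uses invented
names (struct val, struct slot, frame_size, find_saved_register); it
does not use the actual prologue-value.h interface.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a prologue value; hypothetical, not the
   pv_t declared in prologue-value.h.  */
enum val_kind { VAL_UNKNOWN, VAL_CONSTANT, VAL_REGISTER };

struct val
{
  enum val_kind kind;
  int reg;              /* Only meaningful for VAL_REGISTER.  */
  long k;               /* Constant, or offset from REG's original value.  */
};

/* One tracked stack slot: its offset from the original SP, and the
   value known to be stored there.  */
struct slot
{
  long offset;
  struct val value;
};

/* If SP is the original value of register SP_REGNUM minus a
   non-negative constant, store that constant in *SIZE and return 1.
   Otherwise the frame size is not known; return 0.  */
int
frame_size (struct val sp, int sp_regnum, long *size)
{
  assert (size != NULL);

  if (sp.kind == VAL_REGISTER && sp.reg == sp_regnum && sp.k <= 0)
    {
      *size = -sp.k;
      return 1;
    }

  return 0;
}

/* Search the N entries of SLOTS for one holding the unmodified
   original value of register REGNUM.  Return a pointer to it, or
   NULL if no tracked slot holds that value.  */
const struct slot *
find_saved_register (const struct slot *slots, size_t n, int regnum)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (slots[i].value.kind == VAL_REGISTER
        && slots[i].value.reg == regnum
        && slots[i].value.k == 0)
      return &slots[i];

  return NULL;
}
```

A slot holding "original r6 plus 12" is deliberately not reported as a
save of r6: only an exact copy of the original value can be used to
restore the caller's register.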



* Re: RFA: prologue value modules
  2006-03-28 19:15         ` Jim Blandy
@ 2006-03-28 21:17           ` Eli Zaretskii
  2006-03-28 21:59             ` Jim Blandy
  0 siblings, 1 reply; 17+ messages in thread
From: Eli Zaretskii @ 2006-03-28 21:17 UTC (permalink / raw)
  To: Jim Blandy; +Cc: jimb, gdb-patches

> Date: Tue, 28 Mar 2006 10:29:37 -0800
> From: "Jim Blandy" <jimb@red-bean.com>
> Cc: "Jim Blandy" <jimb@codesourcery.com>, gdb-patches@sources.redhat.com
> 
> > Would it be a good idea to have the listing of the API in the manual,
> > with short explanations?  I'm not saying it is necessarily required,
> > but please give it a thought.
> 
> I'd rather have the API described in the header file.  I don't think
> we, as a project, can afford to maintain two copies of that sort of
> documentation, and I think the header files are best place for
> detailed, function-by-function information.

But that's contrary to the rest of gdbint.texinfo, which does document
the more important parts of the respective APIs.

> How does this look?

Looks good, thanks.



* Re: RFA: prologue value modules
  2006-03-28 21:17           ` Eli Zaretskii
@ 2006-03-28 21:59             ` Jim Blandy
  2006-04-07 19:19               ` Ulrich Weigand
  0 siblings, 1 reply; 17+ messages in thread
From: Jim Blandy @ 2006-03-28 21:59 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: gdb-patches


Eli Zaretskii <eliz@gnu.org> writes:
>> Date: Tue, 28 Mar 2006 10:29:37 -0800
>> From: "Jim Blandy" <jimb@red-bean.com>
>> Cc: "Jim Blandy" <jimb@codesourcery.com>, gdb-patches@sources.redhat.com
>> 
>> > Would it be a good idea to have the listing of the API in the manual,
>> > with short explanations?  I'm not saying it is necessarily required,
>> > but please give it a thought.
>> 
>> I'd rather have the API described in the header file.  I don't think
>> we, as a project, can afford to maintain two copies of that sort of
>> documentation, and I think the header files are best place for
>> detailed, function-by-function information.
>
> But that's contrary to the rest of gdbint.texinfo, which does document
> the more important parts of the respective APIs.

That's true.  I think I have philosophical issues with gdbint.texinfo,
but I don't want to argue about it.  :)

>> How does this look?
>
> Looks good, thanks.

Okay, I've committed the following:

src/gdb/ChangeLog:
2006-03-28  Jim Blandy  <jimb@codesourcery.com>

	* prologue-value.c, prologue-value.h: New files.
	* Makefile.in (prologue_value_h): New variable.
	(HFILES_NO_SRCDIR): List prologue-value.h.
	(SFILES): List prologue-value.c.
	(COMMON_OBS): List prologue-value.o.
	(prologue-value.o): New rule.

src/gdb/doc/ChangeLog:
2006-03-28  Jim Blandy  <jimb@codesourcery.com>

	* gdbint.texinfo (Prologue Analysis): New section.

Index: src/gdb/prologue-value.c
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.c
@@ -0,0 +1,591 @@
+/* Prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#include "defs.h"
+#include "gdb_string.h"
+#include "gdb_assert.h"
+#include "prologue-value.h"
+#include "regcache.h"
+
+
+/* Constructors.  */
+
+pv_t
+pv_unknown (void)
+{
+  pv_t v = { pvk_unknown, 0, 0 };
+
+  return v;
+}
+
+
+pv_t
+pv_constant (CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_constant;
+  v.reg = -1;                   /* for debugging */
+  v.k = k;
+
+  return v;
+}
+
+
+pv_t
+pv_register (int reg, CORE_ADDR k)
+{
+  pv_t v;
+
+  v.kind = pvk_register;
+  v.reg = reg;
+  v.k = k;
+
+  return v;
+}
+
+
+
+/* Arithmetic operations.  */
+
+/* If one of *A and *B is a constant, and the other isn't, swap the
+   values as necessary to ensure that *B is the constant.  This can
+   reduce the number of cases we need to analyze in the functions
+   below.  */
+static void
+constant_last (pv_t *a, pv_t *b)
+{
+  if (a->kind == pvk_constant
+      && b->kind != pvk_constant)
+    {
+      pv_t temp = *a;
+      *a = *b;
+      *b = temp;
+    }
+}
+
+
+pv_t
+pv_add (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can add a constant to a register.  */
+  if (a.kind == pvk_register
+      && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k + b.k);
+
+  /* We can add a constant to another constant.  */
+  else if (a.kind == pvk_constant
+           && b.kind == pvk_constant)
+    return pv_constant (a.k + b.k);
+
+  /* Anything else we don't know how to add.  We don't have a
+     representation for, say, the sum of two registers, or a multiple
+     of a register's value (adding a register to itself).  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_add_constant (pv_t v, CORE_ADDR k)
+{
+  /* Rather than thinking of all the cases we can and can't handle,
+     we'll just let pv_add take care of that for us.  */
+  return pv_add (v, pv_constant (k));
+}
+
+
+pv_t
+pv_subtract (pv_t a, pv_t b)
+{
+  /* This isn't quite the same as negating B and adding it to A, since
+     we don't have a representation for the negation of anything but a
+     constant.  For example, we can't negate { pvk_register, R1, 10 },
+     but we do know that { pvk_register, R1, 10 } minus { pvk_register,
+     R1, 5 } is { pvk_constant, <ignored>, 5 }.
+
+     This means, for example, that we could subtract two stack
+     addresses; they're both relative to the original SP.  Since the
+     frame pointer is set based on the SP, its value will be the
+     original SP plus some constant (probably zero), so we can use its
+     value just fine, too.  */
+
+  /* Subtraction is not commutative, so we must not swap A and B.  */
+
+  /* We can subtract two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k - b.k);
+
+  /* We can subtract a constant from a register.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_constant)
+    return pv_register (a.reg, a.k - b.k);
+
+  /* We can subtract a register from itself, yielding a constant.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg)
+    return pv_constant (a.k - b.k);
+
+  /* We don't know how to subtract anything else.  */
+  else
+    return pv_unknown ();
+}
+
+
+pv_t
+pv_logical_and (pv_t a, pv_t b)
+{
+  constant_last (&a, &b);
+
+  /* We can 'and' two constants.  */
+  if (a.kind == pvk_constant
+      && b.kind == pvk_constant)
+    return pv_constant (a.k & b.k);
+
+  /* We can 'and' anything with the constant zero.  */
+  else if (b.kind == pvk_constant
+           && b.k == 0)
+    return pv_constant (0);
+
+  /* We can 'and' anything with ~0.  */
+  else if (b.kind == pvk_constant
+           && b.k == ~ (CORE_ADDR) 0)
+    return a;
+
+  /* We can 'and' a register with itself.  */
+  else if (a.kind == pvk_register
+           && b.kind == pvk_register
+           && a.reg == b.reg
+           && a.k == b.k)
+    return a;
+
+  /* Otherwise, we don't know.  */
+  else
+    return pv_unknown ();
+}
+
+
+
+/* Examining prologue values.  */
+
+int
+pv_is_identical (pv_t a, pv_t b)
+{
+  if (a.kind != b.kind)
+    return 0;
+
+  switch (a.kind)
+    {
+    case pvk_unknown:
+      return 1;
+    case pvk_constant:
+      return (a.k == b.k);
+    case pvk_register:
+      return (a.reg == b.reg && a.k == b.k);
+    default:
+      gdb_assert (0);
+    }
+}
+
+
+int
+pv_is_constant (pv_t a)
+{
+  return (a.kind == pvk_constant);
+}
+
+
+int
+pv_is_register (pv_t a, int r)
+{
+  return (a.kind == pvk_register
+          && a.reg == r);
+}
+
+
+int
+pv_is_register_k (pv_t a, int r, CORE_ADDR k)
+{
+  return (a.kind == pvk_register
+          && a.reg == r
+          && a.k == k);
+}
+
+
+enum pv_boolean
+pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                 pv_t array_addr, CORE_ADDR array_len,
+                 CORE_ADDR elt_size,
+                 int *i)
+{
+  /* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if
+     addr is *before* the start of the array, then this isn't going to
+     be negative...  */
+  pv_t offset = pv_subtract (addr, array_addr);
+
+  if (offset.kind == pvk_constant)
+    {
+      /* This is a rather odd test.  We want to know if the SIZE bytes
+         at ADDR don't overlap the array at all, so you'd expect it to
+         be an || expression: "if we're completely before || we're
+         completely after".  But with unsigned arithmetic, things are
+         different: since it's a number circle, not a number line, the
+         right values for offset.k are actually one contiguous range.  */
+      if (offset.k <= -size
+          && offset.k >= array_len * elt_size)
+        return pv_definite_no;
+      else if (offset.k % elt_size != 0
+               || size != elt_size)
+        return pv_maybe;
+      else
+        {
+          *i = offset.k / elt_size;
+          return pv_definite_yes;
+        }
+    }
+  else
+    return pv_maybe;
+}
+
+
+
+/* Areas.  */
+
+
+/* A particular value known to be stored in an area.
+
+   Entries form a ring, sorted by unsigned offset from the area's base
+   register's value.  Since entries can straddle the wrap-around point,
+   unsigned offsets form a circle, not a number line, so the list
+   itself is structured the same way --- there is no inherent head.
+   The entry with the lowest offset simply follows the entry with the
+   highest offset.  Entries may abut, but never overlap.  The area's
+   'entry' pointer points to an arbitrary node in the ring.  */
+struct area_entry
+{
+  /* Links in the doubly-linked ring.  */
+  struct area_entry *prev, *next;
+
+  /* Offset of this entry's address from the value of the base
+     register.  */
+  CORE_ADDR offset;
+
+  /* The size of this entry.  Note that an entry may wrap around from
+     the end of the address space to the beginning.  */
+  CORE_ADDR size;
+
+  /* The value stored here.  */
+  pv_t value;
+};
+
+
+struct pv_area
+{
+  /* This area's base register.  */
+  int base_reg;
+
+  /* The mask to apply to addresses, to make the wrap-around happen at
+     the right place.  */
+  CORE_ADDR addr_mask;
+
+  /* An element of the doubly-linked ring of entries, or zero if we
+     have none.  */
+  struct area_entry *entry;
+};
+
+
+struct pv_area *
+make_pv_area (int base_reg)
+{
+  struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a));
+
+  memset (a, 0, sizeof (*a));
+
+  a->base_reg = base_reg;
+  a->entry = 0;
+
+  /* Remember that shift amounts equal to the type's width are
+     undefined.  */
+  a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1;
+
+  return a;
+}
+
+
+/* Delete all entries from AREA.  */
+static void
+clear_entries (struct pv_area *area)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    {
+      /* This needs to be a do-while loop, in order to actually
+         process the node being checked for in the terminating
+         condition.  */
+      do
+        {
+          struct area_entry *next = e->next;
+          xfree (e);
+          e = next;
+        }
+      while (e != area->entry);
+
+      area->entry = 0;
+    }
+}
+
+
+void
+free_pv_area (struct pv_area *area)
+{
+  clear_entries (area);
+  xfree (area);
+}
+
+
+static void
+do_free_pv_area_cleanup (void *arg)
+{
+  free_pv_area ((struct pv_area *) arg);
+}
+
+
+struct cleanup *
+make_cleanup_free_pv_area (struct pv_area *area)
+{
+  return make_cleanup (do_free_pv_area_cleanup, (void *) area);
+}
+
+
+int
+pv_area_store_would_trash (struct pv_area *area, pv_t addr)
+{
+  /* It may seem odd that pvk_constant appears here --- after all,
+     that's the case where we know the most about the address!  But
+     pv_areas are always relative to a register, and we don't know the
+     value of the register, so we can't compare entry addresses to
+     constants.  */
+  return (addr.kind == pvk_unknown
+          || addr.kind == pvk_constant
+          || (addr.kind == pvk_register && addr.reg != area->base_reg));
+}
+
+
+/* Return a pointer to the first entry we hit in AREA starting at
+   OFFSET and going forward.
+
+   This may return zero, if AREA has no entries.
+
+   And since the entries are a ring, this may return an entry that
+   entirely precedes OFFSET.  This is the correct behavior: depending
+   on the sizes involved, we could still overlap such an area, with
+   wrap-around.  */
+static struct area_entry *
+find_entry (struct pv_area *area, CORE_ADDR offset)
+{
+  struct area_entry *e = area->entry;
+
+  if (! e)
+    return 0;
+
+  /* If the next entry would be better than the current one, then scan
+     forward.  Since we use '<' in this loop, it always terminates.
+
+     Note that, even setting aside the addr_mask stuff, we must not
+     simplify this, in high school algebra fashion, to
+     (e->next->offset < e->offset), because of the way < interacts
+     with wrap-around.  We have to subtract offset from both sides to
+     make sure both things we're comparing are on the same side of the
+     discontinuity.  */
+  while (((e->next->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->next;
+
+  /* If the previous entry would be better than the current one, then
+     scan backwards.  */
+  while (((e->prev->offset - offset) & area->addr_mask)
+         < ((e->offset - offset) & area->addr_mask))
+    e = e->prev;
+
+  /* In case there's some locality to the searches, set the area's
+     pointer to the entry we've found.  */
+  area->entry = e;
+
+  return e;
+}
+
+
+/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
+   return zero otherwise.  AREA is the area to which ENTRY belongs.  */
+static int
+overlaps (struct pv_area *area,
+          struct area_entry *entry,
+          CORE_ADDR offset,
+          CORE_ADDR size)
+{
+  /* Think carefully about wrap-around before simplifying this.  */
+  return (((entry->offset - offset) & area->addr_mask) < size
+          || ((offset - entry->offset) & area->addr_mask) < entry->size);
+}
+
+
+void
+pv_area_store (struct pv_area *area,
+               pv_t addr,
+               CORE_ADDR size,
+               pv_t value)
+{
+  /* Remove any (potentially) overlapping entries.  */
+  if (pv_area_store_would_trash (area, addr))
+    clear_entries (area);
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* Delete all entries that we would overlap.  */
+      while (e && overlaps (area, e, offset, size))
+        {
+          struct area_entry *next = (e->next == e) ? 0 : e->next;
+          e->prev->next = e->next;
+          e->next->prev = e->prev;
+
+          xfree (e);
+          e = next;
+        }
+
+      /* Move the area's pointer to the next remaining entry.  This
+         will also zero the pointer if we've deleted all the entries.  */
+      area->entry = e;
+    }
+
+  /* Now, there are no entries overlapping us, and area->entry is
+     either zero or pointing at the closest entry after us.  We can
+     just insert ourselves before that.
+
+     But if we're storing an unknown value, don't bother --- that's
+     the default.  */
+  if (value.kind == pvk_unknown)
+    return;
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e));
+      e->offset = offset;
+      e->size = size;
+      e->value = value;
+
+      if (area->entry)
+        {
+          e->prev = area->entry->prev;
+          e->next = area->entry;
+          e->prev->next = e->next->prev = e;
+        }
+      else
+        {
+          e->prev = e->next = e;
+          area->entry = e;
+        }
+    }
+}
+
+
+pv_t
+pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size)
+{
+  /* If we have no entries, or we can't decide how ADDR relates to the
+     entries we do have, then the value is unknown.  */
+  if (! area->entry
+      || pv_area_store_would_trash (area, addr))
+    return pv_unknown ();
+  else
+    {
+      CORE_ADDR offset = addr.k;
+      struct area_entry *e = find_entry (area, offset);
+
+      /* If this entry exactly matches what we're looking for, then
+         we're set.  Otherwise, say it's unknown.  */
+      if (e->offset == offset && e->size == size)
+        return e->value;
+      else
+        return pv_unknown ();
+    }
+}
+
+
+int
+pv_area_find_reg (struct pv_area *area,
+                  struct gdbarch *gdbarch,
+                  int reg,
+                  CORE_ADDR *offset_p)
+{
+  struct area_entry *e = area->entry;
+
+  if (e)
+    do
+      {
+        if (e->value.kind == pvk_register
+            && e->value.reg == reg
+            && e->value.k == 0
+            && e->size == register_size (gdbarch, reg))
+          {
+            if (offset_p)
+              *offset_p = e->offset;
+            return 1;
+          }
+
+        e = e->next;
+      }
+    while (e != area->entry);
+
+  return 0;
+}
+
+
+void
+pv_area_scan (struct pv_area *area,
+              void (*func) (void *closure,
+                            pv_t addr,
+                            CORE_ADDR size,
+                            pv_t value),
+              void *closure)
+{
+  struct area_entry *e = area->entry;
+  pv_t addr;
+
+  addr.kind = pvk_register;
+  addr.reg = area->base_reg;
+
+  if (e)
+    do
+      {
+        addr.k = e->offset;
+        func (closure, addr, e->size, e->value);
+        e = e->next;
+      }
+    while (e != area->entry);
+}
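
The wrap-around comparisons in find_entry and overlaps above are easy to
misread, so here is a minimal standalone sketch of the same overlap test.
It is not part of the patch. It assumes a 32-bit address space: uint32_t
stands in for CORE_ADDR, and ranges_overlap and its parameter names are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for CORE_ADDR; the real type depends on the
   target.  */
typedef uint32_t addr_t;

/* Return non-zero if the A_SIZE bytes at A_OFF and the B_SIZE bytes at
   B_OFF overlap on a number circle of MASK + 1 addresses.  Offsets are
   unsigned, so each subtraction measures the distance going forward
   around the circle from one start to the other; the ranges overlap
   exactly when one start lies inside the other range.  */
static int
ranges_overlap (addr_t mask,
                addr_t a_off, addr_t a_size,
                addr_t b_off, addr_t b_size)
{
  /* Zero-sized ranges are outside what this sketch models.  */
  assert (a_size > 0 && b_size > 0);

  return (((b_off - a_off) & mask) < a_size
          || ((a_off - b_off) & mask) < b_size);
}
```

With a mask of 0xffffffff, a 4-byte store at offset -2 covers offsets
0xfffffffe through 1, so it overlaps an entry covering [0, 4) but not
one covering [2, 6).  Two ranges that merely abut, such as [8, 12) and
[12, 16), do not overlap.
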
Index: src/gdb/prologue-value.h
===================================================================
--- /dev/null
+++ src/gdb/prologue-value.h
@@ -0,0 +1,302 @@
+/* Interface to prologue value handling for GDB.
+   Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This file is part of GDB.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to:
+
+        Free Software Foundation, Inc.
+        51 Franklin St - Fifth Floor
+        Boston, MA 02110-1301
+        USA */
+
+#ifndef PROLOGUE_VALUE_H
+#define PROLOGUE_VALUE_H
+
+/* When we analyze a prologue, we're really doing 'abstract
+   interpretation' or 'pseudo-evaluation': running the function's code
+   in simulation, but using conservative approximations of the values
+   it would have when it actually runs.  For example, if our function
+   starts with the instruction:
+
+      addi r1, 42     # add 42 to r1
+
+   we don't know exactly what value will be in r1 after executing this
+   instruction, but we do know it'll be 42 greater than its original
+   value.
+
+   If we then see an instruction like:
+
+      addi r1, 22     # add 22 to r1
+
+   we still don't know what r1's value is, but again, we can say it is
+   now 64 greater than its original value.
+
+   If the next instruction were:
+
+      mov r2, r1      # set r2 to r1's value
+
+   then we can say that r2's value is now the original value of r1
+   plus 64.
+
+   It's common for prologues to save registers on the stack, so we'll
+   need to track the values of stack frame slots, as well as the
+   registers.  So after an instruction like this:
+
+      mov (fp+4), r2
+
+   then we'd know that the stack slot four bytes above the frame
+   pointer holds the original value of r1 plus 64.
+
+   And so on.
+
+   Of course, this can only go so far before it gets unreasonable.  If
+   we wanted to be able to say anything about the value of r1 after
+   the instruction:
+
+      xor r1, r3      # exclusive-or r1 and r3, place result in r1
+
+   then things would get pretty complex.  But remember, we're just
+   doing a conservative approximation; if exclusive-or instructions
+   aren't relevant to prologues, we can just say r1's value is now
+   'unknown'.  We can ignore things that are too complex, if that loss
+   of information is acceptable for our application.
+
+   So when I say "conservative approximation" here, what I mean is an
+   approximation that is either accurate, or marked "unknown", but
+   never inaccurate.
+
+   Once you've reached the current PC, or an instruction that you
+   don't know how to simulate, you stop.  Now you can examine the
+   state of the registers and stack slots you've kept track of.
+
+   - To see how large your stack frame is, just check the value of the
+     stack pointer register; if it's the original value of the SP
+     minus a constant, then that constant is the stack frame's size.
+     If the SP's value has been marked as 'unknown', then that means
+     the prologue has done something too complex for us to track, and
+     we don't know the frame size.
+
+   - To see where we've saved the previous frame's registers, we just
+     search the values we've tracked --- stack slots, usually, but
+     registers, too, if you want --- for something equal to the
+     register's original value.  If the ABI suggests a standard place
+     to save a given register, then we can check there first, but
+     really, anything that will get us back the original value will
+     probably work.
+
+   Sure, this takes some work.  But prologue analyzers aren't
+   quick-and-simple pattern matching to recognize a few fixed prologue
+   forms any more; they're big, hairy functions.  Along with inferior
+   function calls, prologue analysis accounts for a substantial
+   portion of the time needed to stabilize a GDB port.  So I think
+   it's worthwhile to look for an approach that will be easier to
+   understand and maintain.  In the approach used here:
+
+   - It's easier to see that the analyzer is correct: you just see
+     whether the analyzer properly (albeit conservatively) simulates
+     the effect of each instruction.
+
+   - It's easier to extend the analyzer: you can add support for new
+     instructions, and know that you haven't broken anything that
+     wasn't already broken before.
+
+   - It's orthogonal: to gather new information, you don't need to
+     complicate the code for each instruction.  As long as your domain
+     of conservative values is already detailed enough to tell you
+     what you need, then all the existing instruction simulations are
+     already gathering the right data for you.
+
+   A 'struct prologue_value' is a conservative approximation of the
+   real value the register or stack slot will have.  */
+
+struct prologue_value {
+
+  /* What sort of value is this?  This determines the interpretation
+     of subsequent fields.  */
+  enum {
+
+    /* We don't know anything about the value.  This is also used for
+       values we could have kept track of, when doing so would have
+       been too complex and we don't want to bother.  The bottom of
+       our lattice.  */
+    pvk_unknown,
+
+    /* A known constant.  K is its value.  */
+    pvk_constant,
+
+    /* The value that register REG originally had *UPON ENTRY TO THE
+       FUNCTION*, plus K.  If K is zero, this means, obviously, just
+       the value REG had upon entry to the function.  REG is a GDB
+       register number.  Before we start interpreting, we initialize
+       every register R to { pvk_register, R, 0 }.  */
+    pvk_register,
+
+  } kind;
+
+  /* The meanings of the following fields depend on 'kind'; see the
+     comments for the specific 'kind' values.  */
+  int reg;
+  CORE_ADDR k;
+};
+
+typedef struct prologue_value pv_t;
+
+
+/* Return the unknown prologue value --- { pvk_unknown, ?, ? }.  */
+pv_t pv_unknown (void);
+
+/* Return the prologue value representing the constant K.  */
+pv_t pv_constant (CORE_ADDR k);
+
+/* Return the prologue value representing the original value of
+   register REG, plus the constant K.  */
+pv_t pv_register (int reg, CORE_ADDR k);
+
+
+/* Return conservative approximations of the results of the following
+   operations.  */
+pv_t pv_add (pv_t a, pv_t b);               /* a + b */
+pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
+pv_t pv_subtract (pv_t a, pv_t b);          /* a - b */
+pv_t pv_logical_and (pv_t a, pv_t b);       /* a & b */
+
+
+/* Return non-zero iff A and B are identical expressions.
+
+   This is not the same as asking if the two values are equal; the
+   result of such a comparison would have to be a pv_boolean, and
+   asking whether two 'unknown' values were equal would give you
+   pv_maybe.  Same for comparing, say, { pvk_register, R1, 0 } and {
+   pvk_register, R2, 0}.
+
+   Instead, this function asks whether the two representations are the
+   same.  */
+int pv_is_identical (pv_t a, pv_t b);
+
+
+/* Return non-zero if A is known to be a constant.  */
+int pv_is_constant (pv_t a);
+
+/* Return non-zero if A is the original value of register number R
+   plus some constant, zero otherwise.  */
+int pv_is_register (pv_t a, int r);
+
+
+/* Return non-zero if A is the original value of register R plus the
+   constant K.  */
+int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
+
+/* A conservative boolean type, including "maybe", when we can't
+   figure out whether something is true or not.  */
+enum pv_boolean {
+  pv_maybe,
+  pv_definite_yes,
+  pv_definite_no,
+};
+
+
+/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
+   an element of an array.  The array starts at ARRAY_ADDR, and has
+   ARRAY_LEN values of ELT_SIZE bytes each.  If ADDR definitely does
+   refer to an array element, set *I to the index of the referenced
+   element in the array, and return pv_definite_yes.  If it definitely
+   doesn't, return pv_definite_no.  If we can't tell, return pv_maybe.
+
+   If the reference does touch the array, but doesn't fall exactly on
+   an element boundary, or doesn't refer to the whole element, return
+   pv_maybe.  */
+enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
+                                 pv_t array_addr, CORE_ADDR array_len,
+                                 CORE_ADDR elt_size,
+                                 int *i);
+
+
+/* A 'struct pv_area' keeps track of values stored in a particular
+   region of memory.  */
+struct pv_area;
+
+/* Create a new area, tracking stores relative to the original value
+   of BASE_REG.  If BASE_REG is SP, then this effectively records the
+   contents of the stack frame: the original value of the SP is the
+   frame's CFA, or some constant offset from it.
+
+   Stores to constant addresses, unknown addresses, or to addresses
+   relative to registers other than BASE_REG will trash this area; see
+   pv_area_store_would_trash.  */
+struct pv_area *make_pv_area (int base_reg);
+
+/* Free AREA.  */
+void free_pv_area (struct pv_area *area);
+
+
+/* Register a cleanup to free AREA.  */
+struct cleanup *make_cleanup_free_pv_area (struct pv_area *area);
+
+
+/* Store the SIZE-byte value VALUE at ADDR in AREA.
+
+   If ADDR is not relative to the same base register we used in
+   creating AREA, then we can't tell which values here the stored
+   value might overlap, and we'll have to mark everything as
+   unknown.  */
+void pv_area_store (struct pv_area *area,
+                    pv_t addr,
+                    CORE_ADDR size,
+                    pv_t value);
+
+/* Return the SIZE-byte value at ADDR in AREA.  This may return
+   pv_unknown ().  */
+pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size);
+
+/* Return true if storing to address ADDR in AREA would force us to
+   mark the contents of the entire area as unknown.  This could happen
+   if, say, ADDR is unknown, since we could be storing anywhere.  Or,
+   it could happen if ADDR is relative to a different register than
+   the other stores' base register, since we don't know the relative
+   values of the two registers.
+
+   If you've reached such a store, it may be better to simply stop the
+   prologue analysis, and return the information you've gathered,
+   instead of losing all that information, most of which is probably
+   okay.  */
+int pv_area_store_would_trash (struct pv_area *area, pv_t addr);
+
+
+/* Search AREA for the original value of register REG.  If we can't
+   find it, return zero; if we can find it, return a non-zero value,
+   and if OFFSET_P is non-zero, set *OFFSET_P to the register's offset
+   within AREA.  GDBARCH is the architecture of which REG is a member.
+
+   In the worst case, this takes time proportional to the number of
+   items stored in AREA.  If you plan to gather a lot of information
+   about registers saved in AREA, consider calling pv_area_scan
+   instead, and collecting all your information in one pass.  */
+int pv_area_find_reg (struct pv_area *area,
+                      struct gdbarch *gdbarch,
+                      int reg,
+                      CORE_ADDR *offset_p);
+
+
+/* For every part of AREA whose value we know, apply FUNC to CLOSURE,
+   the value's address, its size, and the value itself.  */
+void pv_area_scan (struct pv_area *area,
+                   void (*func) (void *closure,
+                                 pv_t addr,
+                                 CORE_ADDR size,
+                                 pv_t value),
+                   void *closure);
+
+
+#endif /* PROLOGUE_VALUE_H */
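
To make the header's long comment concrete, here is a small standalone
sketch of the pseudo-evaluation it describes.  It is a toy model, not
the code in prologue-value.c: every toy_* and TOY_* name is
hypothetical, uint32_t stands in for CORE_ADDR, and the stack
adjustment "addi sp, -16" is an instruction added for illustration that
does not appear in the comment's example.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the patch's pv_t; all names here are hypothetical.  */
enum toy_kind { toy_unknown, toy_constant, toy_register };

struct toy_pv
{
  enum toy_kind kind;
  int reg;       /* Meaningful only when KIND is toy_register.  */
  uint32_t k;    /* The constant, or the offset from REG's entry value.  */
};

enum { TOY_SP = 0, TOY_R1 = 1, TOY_R2 = 2, TOY_R3 = 3, TOY_NUM_REGS = 4 };

static struct toy_pv
toy_make_register (int reg, uint32_t k)
{
  struct toy_pv v = { toy_register, reg, k };
  return v;
}

static struct toy_pv
toy_make_unknown (void)
{
  struct toy_pv v = { toy_unknown, 0, 0 };
  return v;
}

/* Conservative V + K: an unknown value stays unknown; otherwise only
   the constant part changes.  */
static struct toy_pv
toy_add_constant (struct toy_pv v, uint32_t k)
{
  if (v.kind != toy_unknown)
    v.k += k;
  return v;
}

/* Simulate a short instruction sequence, leaving the abstract register
   state in REGS.  */
static void
toy_simulate_example (struct toy_pv regs[TOY_NUM_REGS])
{
  int i;

  /* On entry, every register holds its own original value.  */
  for (i = 0; i < TOY_NUM_REGS; i++)
    regs[i] = toy_make_register (i, 0);

  /* addi sp, -16 (hypothetical frame allocation) */
  regs[TOY_SP] = toy_add_constant (regs[TOY_SP], (uint32_t) -16);
  /* addi r1, 42 */
  regs[TOY_R1] = toy_add_constant (regs[TOY_R1], 42);
  /* addi r1, 22 */
  regs[TOY_R1] = toy_add_constant (regs[TOY_R1], 22);
  /* mov r2, r1 */
  regs[TOY_R2] = regs[TOY_R1];
  /* xor r1, r3: too complex to track, so r1 becomes unknown.  */
  regs[TOY_R1] = toy_make_unknown ();
}

/* If SP is its entry value minus a constant, store that constant in
   *SIZE and return 1; otherwise return 0.  */
static int
toy_frame_size (struct toy_pv sp, uint32_t *size)
{
  assert (size != 0);

  if (sp.kind != toy_register || sp.reg != TOY_SP)
    return 0;

  *size = (uint32_t) 0 - sp.k;
  return 1;
}
```

After toy_simulate_example runs, r2 is still "original r1 plus 64" even
though r1 itself has become unknown, and toy_frame_size reports a
16-byte frame.  A real analyzer would use pv_register, pv_add_constant
and a pv_area for the stack slots instead of these toy helpers.
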
Index: src/gdb/Makefile.in
===================================================================
--- src.orig/gdb/Makefile.in
+++ src/gdb/Makefile.in
@@ -542,6 +542,7 @@ SFILES = ada-exp.y ada-lang.c ada-typepr
 	objc-exp.y objc-lang.c \
 	objfiles.c osabi.c observer.c \
 	p-exp.y p-lang.c p-typeprint.c p-valprint.c parse.c printcmd.c \
+	prologue-value.c \
 	regcache.c reggroups.c remote.c remote-fileio.c \
 	scm-exp.c scm-lang.c scm-valprint.c \
 	sentinel-frame.c \
@@ -757,6 +758,7 @@ ppcnbsd_tdep_h = ppcnbsd-tdep.h
 ppcobsd_tdep_h = ppcobsd-tdep.h
 ppc_tdep_h = ppc-tdep.h
 proc_utils_h = proc-utils.h
+prologue_value_h = prologue-value.h
 regcache_h = regcache.h
 reggroups_h = reggroups.h
 regset_h = regset.h
@@ -867,6 +869,7 @@ HFILES_NO_SRCDIR = bcache.h buildsym.h c
 	symfile.h stabsread.h target.h terminal.h typeprint.h \
 	xcoffsolib.h \
 	macrotab.h macroexp.h macroscope.h \
+	prologue-value.h \
 	ada-lang.h c-lang.h f-lang.h \
 	jv-lang.h \
 	m2-lang.h  p-lang.h \
@@ -2437,6 +2440,8 @@ procfs.o: procfs.c $(defs_h) $(inferior_
 proc-service.o: proc-service.c $(defs_h) $(gdb_proc_service_h) $(inferior_h) \
 	$(symtab_h) $(target_h) $(gregset_h)
 proc-why.o: proc-why.c $(defs_h) $(proc_utils_h)
+prologue-value.o: prologue-value.c $(defs_h) $(gdb_string_h) $(gdb_assert_h) \
+	$(prologue_value_h) $(regcache_h)
 p-typeprint.o: p-typeprint.c $(defs_h) $(gdb_obstack_h) $(bfd_h) $(symtab_h) \
 	$(gdbtypes_h) $(expression_h) $(value_h) $(gdbcore_h) $(target_h) \
 	$(language_h) $(p_lang_h) $(typeprint_h) $(gdb_string_h)
Index: src/gdb/doc/gdbint.texinfo
===================================================================
--- src.orig/gdb/doc/gdbint.texinfo
+++ src/gdb/doc/gdbint.texinfo
@@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame 
 @code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
 @code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
 
+@section Prologue Analysis
+
+@cindex prologue analysis
+@cindex call frame information
+@cindex CFI (call frame information)
+To produce a backtrace and allow the user to manipulate older frames'
+variables and arguments, @value{GDBN} needs to find the base addresses
+of older frames, and discover where those frames' registers have been
+saved.  Since a frame's ``callee-saves'' registers get saved by
+younger frames if and when they're reused, a frame's registers may be
+scattered unpredictably across younger frames.  This means that
+changing the value of a register-allocated variable in an older frame
+may actually entail writing to a save slot in some younger frame.
+
+Modern versions of GCC emit DWARF call frame information (``CFI''),
+which describes how to find frame base addresses and saved registers.
+But CFI is not always available, so as a fallback @value{GDBN} uses a
+technique called @dfn{prologue analysis} to find frame sizes and saved
+registers.  A prologue analyzer disassembles the function's machine
+code starting from its entry point, and looks for instructions that
+allocate frame space, save the stack pointer in a frame pointer
+register, save registers, and so on.  Obviously, this can't be done
+accurately in general, but it's tractable to do well enough to be very
+helpful.  Prologue analysis predates the GNU toolchain's support for
+CFI; at one time, prologue analysis was the only mechanism
+@value{GDBN} used for stack unwinding at all, when the function
+calling conventions didn't specify a fixed frame layout.
+
+In the olden days, function prologues were generated by hand-written,
+target-specific code in GCC, and treated as opaque and untouchable by
+optimizers.  Looking at this code, it was usually straightforward to
+write a prologue analyzer for @value{GDBN} that would accurately
+understand all the prologues GCC would generate.  However, over time
+GCC became more aggressive about instruction scheduling, and began to
+understand more about the semantics of the prologue instructions
+themselves; in response, @value{GDBN}'s analyzers became more complex
+and fragile.  Keeping the prologue analyzers working as GCC (and the
+instruction sets themselves) evolved became a substantial task.
+
+@cindex @file{prologue-value.c}
+@cindex abstract interpretation of function prologues
+@cindex pseudo-evaluation of function prologues
+To try to address this problem, the code in @file{prologue-value.h}
+and @file{prologue-value.c} provides a general framework for writing
+prologue analyzers that are simpler and more robust than ad-hoc
+analyzers.  When we analyze a prologue using the prologue-value
+framework, we're really doing ``abstract interpretation'' or
+``pseudo-evaluation'': running the function's code in simulation, but
+using conservative approximations of the values registers and memory
+would hold when the code actually runs.  For example, if our function
+starts with the instruction:
+
+@example
+addi r1, 42     # add 42 to r1
+@end example
+@noindent
+we don't know exactly what value will be in @code{r1} after executing
+this instruction, but we do know it'll be 42 greater than its original
+value.
+
+If we then see an instruction like:
+
+@example
+addi r1, 22     # add 22 to r1
+@end example
+@noindent
+we still don't know what @code{r1}'s value is, but again, we can say
+it is now 64 greater than its original value.
+
+If the next instruction were:
+
+@example
+mov r2, r1      # set r2 to r1's value
+@end example
+@noindent
+then we can say that @code{r2}'s value is now the original value of
+@code{r1} plus 64.
+
+It's common for prologues to save registers on the stack, so we'll
+need to track the values of stack frame slots, as well as the
+registers.  So after an instruction like this:
+
+@example
+mov (fp+4), r2
+@end example
+@noindent
+then we'd know that the stack slot four bytes above the frame pointer
+holds the original value of @code{r1} plus 64.
+
+And so on.
+
+Of course, this can only go so far before it gets unreasonable.  If we
+wanted to be able to say anything about the value of @code{r1} after
+the instruction:
+
+@example
+xor r1, r3      # exclusive-or r1 and r3, place result in r1
+@end example
+@noindent
+then things would get pretty complex.  But remember, we're just doing
+a conservative approximation; if exclusive-or instructions aren't
+relevant to prologues, we can just say @code{r1}'s value is now
+``unknown''.  We can ignore things that are too complex, if that loss of
+information is acceptable for our application.
+
+So when we say ``conservative approximation'' here, what we mean is an
+approximation that is either accurate, or marked ``unknown'', but
+never inaccurate.
+
+Using this framework, a prologue analyzer is simply an interpreter for
+machine code, but one that uses conservative approximations for the
+contents of registers and memory instead of actual values.  Starting
+from the function's entry point, you simulate instructions up to the
+current PC, or an instruction that you don't know how to simulate.
+Now you can examine the state of the registers and stack slots you've
+kept track of.
+
+@itemize @bullet
+
+@item
+To see how large your stack frame is, just check the value of the
+stack pointer register; if it's the original value of the SP
+minus a constant, then that constant is the stack frame's size.
+If the SP's value has been marked as ``unknown'', then that means
+the prologue has done something too complex for us to track, and
+we don't know the frame size.
+
+@item
+To see where we've saved the previous frame's registers, we just
+search the values we've tracked --- stack slots, usually, but
+registers, too, if you want --- for something equal to the register's
+original value.  If the calling conventions suggest a standard place
+to save a given register, then we can check there first, but really,
+anything that will get us back the original value will probably work.
+@end itemize
+
+This does take some work.  But prologue analyzers aren't
+quick-and-simple pattern matching to recognize a few fixed prologue
+forms any more; they're big, hairy functions.  Along with inferior
+function calls, prologue analysis accounts for a substantial portion
+of the time needed to stabilize a @value{GDBN} port.  So it's
+worthwhile to look for an approach that will be easier to understand
+and maintain.  In the approach described above:
+
+@itemize @bullet
+
+@item
+It's easier to see that the analyzer is correct: you just see
+whether the analyzer properly (albeit conservatively) simulates
+the effect of each instruction.
+
+@item
+It's easier to extend the analyzer: you can add support for new
+instructions, and know that you haven't broken anything that
+wasn't already broken before.
+
+@item
+It's orthogonal: to gather new information, you don't need to
+complicate the code for each instruction.  As long as your domain
+of conservative values is already detailed enough to tell you
+what you need, then all the existing instruction simulations are
+already gathering the right data for you.
+
+@end itemize
+
+The file @file{prologue-value.h} contains detailed comments explaining
+the framework and how to use it.
+
+
 @section Breakpoint Handling
 
 @cindex breakpoints



* Re: RFA: prologue value modules
  2006-03-28 21:59             ` Jim Blandy
@ 2006-04-07 19:19               ` Ulrich Weigand
  2006-04-08 23:40                 ` Jim Blandy
  0 siblings, 1 reply; 17+ messages in thread
From: Ulrich Weigand @ 2006-04-07 19:19 UTC (permalink / raw)
  To: Jim Blandy; +Cc: Eli Zaretskii, gdb-patches

Jim Blandy wrote:

> 2006-03-28  Jim Blandy  <jimb@codesourcery.com>
> 
> 	* prologue-value.c, prologue-value.h: New files.
> 	* Makefile.in (prologue_value_h): New variable.
> 	(HFILES_NO_SRCDIR): List prologue-value.h.
> 	(SFILES): List prologue-value.c.
> 	(COMMON_OBS): List prologue-value.o.

However, the patch doesn't contain the latter change 
to COMMON_OBS.  Was this an oversight?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  Linux on zSeries Development
  Ulrich.Weigand@de.ibm.com



* Re: RFA: prologue value modules
  2006-04-07 19:19               ` Ulrich Weigand
@ 2006-04-08 23:40                 ` Jim Blandy
  0 siblings, 0 replies; 17+ messages in thread
From: Jim Blandy @ 2006-04-08 23:40 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: Jim Blandy, Eli Zaretskii, gdb-patches

On 4/7/06, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> Jim Blandy wrote:
>
> > 2006-03-28  Jim Blandy  <jimb@codesourcery.com>
> >
> >       * prologue-value.c, prologue-value.h: New files.
> >       * Makefile.in (prologue_value_h): New variable.
> >       (HFILES_NO_SRCDIR): List prologue-value.h.
> >       (SFILES): List prologue-value.c.
> >       (COMMON_OBS): List prologue-value.o.
>
> However, the patch doesn't contain the latter change
> to COMMON_OBS.  Was this an oversight?

Yes!  Thanks for noticing that.  I've committed the below as obvious.


Index: ChangeLog
===================================================================
RCS file: /cvs/src/src/gdb/ChangeLog,v
retrieving revision 1.7690
diff -u -p -r1.7690 ChangeLog
--- ChangeLog   8 Apr 2006 21:15:26 -0000       1.7690
+++ ChangeLog   8 Apr 2006 23:39:42 -0000
@@ -1,3 +1,8 @@
+2006-04-08  Jim Blandy  <jimb@codesourcery.com>
+
+       * Makefile.in (COMMON_OBS): List prologue-value.o.  (Omitted from
+       last patch.)
+
 2006-04-08  David S. Miller  <davem@sunset.davemloft.net>

        * sparc-linux-tdep.c (sparc32_linux_step_trap): New.
Index: Makefile.in
===================================================================
RCS file: /cvs/src/src/gdb/Makefile.in,v
retrieving revision 1.806
diff -u -p -r1.806 Makefile.in
--- Makefile.in 8 Apr 2006 21:15:26 -0000       1.806
+++ Makefile.in 8 Apr 2006 23:39:43 -0000
@@ -952,7 +952,8 @@ COMMON_OBS = $(DEPFILES) $(CONFIG_OBS) $
        reggroups.o regset.o \
        trad-frame.o \
        tramp-frame.o \
-       solib.o solib-null.o
+       solib.o solib-null.o \
+       prologue-value.o

 TSOBS = inflow.o


