[RFA] Add la_getstr member to language

Mirror of the gdb-patches mailing list
 help / color / mirror / Atom feed

* [RFA] Add la_getstr member to language_defn
@ 2008-11-24 13:24 Thiago Jung Bauermann
  2008-11-24 13:32 ` Daniel Jacobowitz
  2008-11-24 20:03 ` Tom Tromey
  0 siblings, 2 replies; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-24 13:24 UTC (permalink / raw)
  To: gdb-patches ml

Hi,

In the python branch, I needed to read a string fro the inferior in a
language-agnostic way. There's currently no way to do that. This patch
addresses the issue by providing a la_getstr function to language_defn.
Each language supported by GDB can provide a way to read strings from a
value by suppling its version of the function. I implemented it for C,
in c_getstr. It assumes NULL-terminated strings, but doesn't assume that
characters are 1-byte wide. I installed c_getstr for C++ and asm as
well.

I also factored out a read_string function out of val_print_string,
which is handy to use as a base for la_getstr implementations.

I intend to post a patch which extends the value binding in python to
use this code Real Soon Nowâ„¢.

One improvement which is easy to make but I left for later (this patch
is big enough already) is to nuke target_read_string and make its
callers use read_string instead (the former implements just a subset of
what the latter provides). It would be a simple modification, but would
affect code specific to targets to which I don't have access to
(target_read_string is almost exclusively used in solib-*.c). I can work
on a subsequent patch to do that if you think it's safe enough to not
require my testing on all affected targets.

Tested on ppc-linux, introduces no regression. Ok to commit?
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2008-11-23  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_getstr member.
	(LA_GET_STRING): New macro.
	(default_getstr): New prototype.
	* language.c (default_getstr): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_getstr for la_getstr.
	* c-lang.c (c_getstr): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_getstr for la_getstr.
	(minimal_language_defn): Use default_getstr for la_getstr.
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Likewise.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

Index: gdb.git/gdb/ada-lang.c
===================================================================
--- gdb.git.orig/gdb/ada-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/ada-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -11084,6 +11084,7 @@ const struct language_defn ada_language_
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/c-lang.c
===================================================================
--- gdb.git.orig/gdb/c-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/c-lang.c	2008-11-23 02:55:44.000000000 -0200
@@ -181,6 +181,104 @@ c_printstr (struct ui_file *stream, cons
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior, storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  If VALUE is an
+   array with known length, the function will copy all of its contents to
+   the buffer.  If the length is not known, read until a null byte is found.
+   LENGTH will contain the size of the string (not counting the NULL
+   character).
+
+   Assumes strings are terminated by a NULL character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a NULL byte present in a multi-byte character will not
+   terminate the string unless the whole character is NULL.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check LENGTH to recognize this situation.
+
+   Return 0 on success, errno on failure.  */
+
+static int
+c_getstr (struct value *value, gdb_byte **buffer, int *length)
+{
+  int fetchlimit, err, width;
+  struct type *type = value_type (value);
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if ((TYPE_NFIELDS (type) == 1)
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = -1;
+    }
+  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
+    fetchlimit = -1;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
+      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the string.  */
+  if (((VALUE_LVAL (value) == not_lval)
+      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != -1))
+    {
+      *length = fetchlimit;
+      *buffer = xmalloc (*length * width);
+      memcpy (*buffer, value_contents (value), *length * width);
+      err = 0;
+    }
+  else
+    err = read_string (value_as_address (value), -1, width, fetchlimit,
+			 buffer, length);
+
+  /* If the last character is NULL, subtract it from length.  */
+  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
+      *length -= width;
+
+  return err;
+
+error:
+  {
+    char *type_str;
+    struct cleanup *old_chain;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	old_chain = make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -419,6 +517,7 @@ const struct language_defn c_language_de
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -537,6 +636,7 @@ const struct language_defn cplus_languag
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -574,6 +674,7 @@ const struct language_defn asm_language_
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -616,6 +717,7 @@ const struct language_defn minimal_langu
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/f-lang.c
===================================================================
--- gdb.git.orig/gdb/f-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/f-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -343,6 +343,7 @@ const struct language_defn f_language_de
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/jv-lang.c
===================================================================
--- gdb.git.orig/gdb/jv-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/jv-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -1128,6 +1128,7 @@ const struct language_defn java_language
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/language.c
===================================================================
--- gdb.git.orig/gdb/language.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/language.c	2008-11-23 02:55:37.000000000 -0200
@@ -1043,6 +1043,12 @@ default_print_array_index (struct value 
   fprintf_filtered (stream, "] = ");
 }
 
+int
+default_getstr (struct value *value, gdb_byte **buffer, int *length)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1171,7 @@ const struct language_defn unknown_langu
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
@@ -1203,6 +1210,7 @@ const struct language_defn auto_language
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
@@ -1240,6 +1248,7 @@ const struct language_defn local_languag
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 \f
Index: gdb.git/gdb/language.h
===================================================================
--- gdb.git.orig/gdb/language.h	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/language.h	2008-11-23 02:55:37.000000000 -0200
@@ -282,6 +282,12 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual NULL terminating character).  */
+    int (*la_getstr) (struct value *value, gdb_byte **buffer, int *length);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +386,8 @@ extern enum language set_language (enum 
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length) \
+  (current_language->la_getstr(value, buffer, length))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +497,6 @@ int default_pass_by_reference (struct ty
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+int default_getstr (struct value *value, gdb_byte **buffer, int *length);
+
 #endif /* defined (LANGUAGE_H) */
Index: gdb.git/gdb/m2-lang.c
===================================================================
--- gdb.git.orig/gdb/m2-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/m2-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -394,6 +394,7 @@ const struct language_defn m2_language_d
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/objc-lang.c
===================================================================
--- gdb.git.orig/gdb/objc-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/objc-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -531,6 +531,7 @@ const struct language_defn objc_language
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/p-lang.c
===================================================================
--- gdb.git.orig/gdb/p-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/p-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -433,6 +433,7 @@ const struct language_defn pascal_langua
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/scm-lang.c
===================================================================
--- gdb.git.orig/gdb/scm-lang.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/scm-lang.c	2008-11-23 02:55:37.000000000 -0200
@@ -273,6 +273,7 @@ const struct language_defn scm_language_
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/typeprint.c
===================================================================
--- gdb.git.orig/gdb/typeprint.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/typeprint.c	2008-11-23 02:55:37.000000000 -0200
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *var
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
Index: gdb.git/gdb/valprint.c
===================================================================
--- gdb.git.orig/gdb/valprint.c	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/valprint.c	2008-11-23 02:55:37.000000000 -0200
@@ -1177,40 +1177,40 @@ partial_memory_read (CORE_ADDR memaddr, 
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
-
-/* FIXME: Use target_read_string.  */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
+
+   If LEN is -1, stops at the first null character (not necessarily the first
+   null byte), otherwise reading proceeds (including null characters) until
+   FETCHLIMIT characters have been read.  Set FETCHLIMIT to UINT_MAX to read
+   as many characters as possible from the string.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
+
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps target_read_string should be rewritten to
+   use this function?  */
 
 int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
 {
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
 
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
-
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
-
-  /* Now decide how large of chunks to try to read in one operation.  This
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
      are looking for a null terminator to end the fetching, so we might as
@@ -1222,16 +1222,16 @@ val_print_string (CORE_ADDR addr, int le
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
   /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1241,36 @@ val_print_string (CORE_ADDR addr, int le
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
 	  /* Scan this chunk for the null byte that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
 	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1287,20 +1291,65 @@ val_print_string (CORE_ADDR addr, int le
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
 	     && !found_nul);	/* haven't found nul yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      *buffer = bufptr = NULL;
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  CORE_ADDR addr_next;		/* Address right after the last read byte.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr_next = addr + bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
@@ -1312,30 +1361,28 @@ val_print_string (CORE_ADDR addr, int le
 
       peekbuf = (gdb_byte *) alloca (width);
 
-      if (target_read_memory (addr, peekbuf, width) == 0
+      if (target_read_memory (addr_next, peekbuf, width) == 0
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1343,19 +1390,21 @@ val_print_string (CORE_ADDR addr, int le
       if (errcode == EIO)
 	{
 	  fprintf_filtered (stream, " <Address ");
-	  fputs_filtered (paddress (addr), stream);
+	  fputs_filtered (paddress (addr_next), stream);
 	  fprintf_filtered (stream, " out of bounds>");
 	}
       else
 	{
 	  fprintf_filtered (stream, " <Error reading address ");
-	  fputs_filtered (paddress (addr), stream);
+	  fputs_filtered (paddress (addr_next), stream);
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
Index: gdb.git/gdb/valprint.h
===================================================================
--- gdb.git.orig/gdb/valprint.h	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/valprint.h	2008-11-23 02:55:37.000000000 -0200
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_f
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
Index: gdb.git/gdb/value.h
===================================================================
--- gdb.git.orig/gdb/value.h	2008-11-23 02:33:57.000000000 -0200
+++ gdb.git/gdb/value.h	2008-11-23 02:55:37.000000000 -0200
@@ -514,6 +514,8 @@ extern void modify_field (gdb_byte *addr
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char * type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 13:24 [RFA] Add la_getstr member to language_defn Thiago Jung Bauermann
@ 2008-11-24 13:32 ` Daniel Jacobowitz
  2008-11-24 14:59   ` Thiago Jung Bauermann
  2008-11-24 20:03 ` Tom Tromey
  1 sibling, 1 reply; 33+ messages in thread
From: Daniel Jacobowitz @ 2008-11-24 13:32 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: gdb-patches ml

On Sun, Nov 23, 2008 at 03:14:37AM -0200, Thiago Jung Bauermann wrote:
> One improvement which is easy to make but I left for later (this patch
> is big enough already) is to nuke target_read_string and make its
> callers use read_string instead (the former implements just a subset of
> what the latter provides). It would be a simple modification, but would
> affect code specific to targets to which I don't have access to
> (target_read_string is almost exclusively used in solib-*.c). I can work
> on a subsequent patch to do that if you think it's safe enough to not
> require my testing on all affected targets.

What's the difference between them?

> +/* Obtain a C string from the inferior, storing it in a newly allocated
> +   buffer in BUFFER, which should be freed by the caller.  If VALUE is an
> +   array with known length, the function will copy all of its contents to
> +   the buffer.  If the length is not known, read until a null byte is found.
> +   LENGTH will contain the size of the string (not counting the NULL
> +   character).

This is the right behavior for gdb's "print VAR".  But I'm not sure
it's the right behavior for a method named la_getstr - in fact I think
it isn't.  Suppose (to pick a random example with no relation to
anything - no, wait, it's from the Linux kernel...):

struct task_struct
{
	volatile long state;
	...
	char comm[16];
	...
};

If I ask for the contents of the array, I should get sixteen bytes.
But if I ask for a string I ought to get, in my opinion, up to sixteen
bytes.  Characters up to but not including the first zero, or sixteen
at most.  A ps implementation which prints "comm\0er command" is not
very helpful :-)

BTW, it's not a NULL character; it's NUL, the null character.  NULL
in uppercase is the pointer.

-- 
Daniel Jacobowitz
CodeSourcery


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 13:32 ` Daniel Jacobowitz
@ 2008-11-24 14:59   ` Thiago Jung Bauermann
  2008-11-24 16:19     ` Daniel Jacobowitz
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-24 14:59 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches ml

Hi Daniel,

Thanks for your review.

Am Sonntag, den 23.11.2008, 11:10 -0500 schrieb Daniel Jacobowitz:
> On Sun, Nov 23, 2008 at 03:14:37AM -0200, Thiago Jung Bauermann wrote:
> > One improvement which is easy to make but I left for later (this patch
> > is big enough already) is to nuke target_read_string and make its
> > callers use read_string instead (the former implements just a subset of
> > what the latter provides). It would be a simple modification, but would
> > affect code specific to targets to which I don't have access to
> > (target_read_string is almost exclusively used in solib-*.c). I can work
> > on a subsequent patch to do that if you think it's safe enough to not
> > require my testing on all affected targets.
> 
> What's the difference between them?

target_read_string assumes 1-byte characters, so it is equivalent to
calling read_string with a character width of 1.

Also, the way they fetch the string differs (I don't know how relevant
is the difference): target_read_string reads in steps of at most 4
bytes, and doubles its allocated buffer each time it gets full, while
read_string reads the whole string at once if its size is known, or else
reads in steps of at most 8 bytes, and linearly increases its allocated
buffer in steps of 8 bytes.

I mentioned substituting one for the other because I tend to dislike
having two implementations of basically the same thing.

> > +/* Obtain a C string from the inferior, storing it in a newly allocated
> > +   buffer in BUFFER, which should be freed by the caller.  If VALUE is an
> > +   array with known length, the function will copy all of its contents to
> > +   the buffer.  If the length is not known, read until a null byte is found.
> > +   LENGTH will contain the size of the string (not counting the NULL
> > +   character).
> 
> This is the right behavior for gdb's "print VAR".  But I'm not sure
> it's the right behavior for a method named la_getstr - in fact I think
> it isn't.

The comment is wrong. c_getstr actually reads until a null character is
found, but doesn't read past the end of an array with known length. I
updated the comment to reflect that.

Except that there was a bug for GDB-hosted strings, in which case the
function behaved as described by the comment. I fixed it now.

> If I ask for the contents of the array, I should get sixteen bytes.
> But if I ask for a string I ought to get, in my opinion, up to sixteen
> bytes.  Characters up to but not including the first zero, or sixteen
> at most.  A ps implementation which prints "comm\0er command" is not
> very helpful :-)

Agreed, and that's what I (mostly) implemented, but got confused by the
GDB-hosted special case. Thanks for the detailed example anyway.

By the way, the special case is for not_lval and lval_internalvar.
Should I test for lval_internalvar_component as well?

> BTW, it's not a NULL character; it's NUL, the null character.  NULL
> in uppercase is the pointer.

Fixed comments.

What about this version?
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2008-11-23  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_getstr member.
	(LA_GET_STRING): New macro.
	(default_getstr): New prototype.
	* language.c (default_getstr): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_getstr for la_getstr.
	* c-lang.c (c_getstr): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_getstr for la_getstr.
	(minimal_language_defn): Use default_getstr for la_getstr.
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Likewise.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

Index: gdb.git/gdb/ada-lang.c
===================================================================
--- gdb.git.orig/gdb/ada-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/ada-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -11084,6 +11084,7 @@ const struct language_defn ada_language_
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/c-lang.c
===================================================================
--- gdb.git.orig/gdb/c-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/c-lang.c	2008-11-23 23:14:04.000000000 -0200
@@ -181,6 +181,113 @@ c_printstr (struct ui_file *stream, cons
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  The string is
+   read until a null character is found. If VALUE is an array with known
+   length, the function will not read past the end of the array.  LENGTH
+   will contain the size of the string in bytes (not counting the null
+   character).
+
+   Assumes strings are terminated by a null character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a null byte present in a multi-byte character will not
+   terminate the string unless the whole character is null.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check LENGTH to recognize this situation.
+
+   Return 0 on success, errno on failure.  */
+
+static int
+c_getstr (struct value *value, gdb_byte **buffer, int *length)
+{
+  int fetchlimit, err, width;
+  struct type *type = value_type (value);
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if ((TYPE_NFIELDS (type) == 1)
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = -1;
+    }
+  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
+    fetchlimit = -1;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
+      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the array.  */
+  if (((VALUE_LVAL (value) == not_lval)
+      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != -1))
+    {
+      int i;
+      const gdb_byte *contents = value_contents (value);
+
+      /* Look for a null character.  */
+      for (i = 0; i < fetchlimit; i++)
+	if (extract_unsigned_integer (contents + i*width, width) == 0)
+	  break;
+
+      /* i is now either the number of non-null characters, or fetchlimit.  */
+      *length = i*width;
+      *buffer = xmalloc (*length);
+      memcpy (*buffer, contents, *length);
+      err = 0;
+    }
+  else
+    err = read_string (value_as_address (value), -1, width, fetchlimit,
+		       buffer, length);
+
+  /* If the last character is null, subtract it from length.  */
+  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
+      *length -= width;
+
+  return err;
+
+error:
+  {
+    char *type_str;
+    struct cleanup *old_chain;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	old_chain = make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -419,6 +526,7 @@ const struct language_defn c_language_de
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -537,6 +645,7 @@ const struct language_defn cplus_languag
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -574,6 +683,7 @@ const struct language_defn asm_language_
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_getstr,
   LANG_MAGIC
 };
 
@@ -616,6 +726,7 @@ const struct language_defn minimal_langu
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/f-lang.c
===================================================================
--- gdb.git.orig/gdb/f-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/f-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -343,6 +343,7 @@ const struct language_defn f_language_de
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/jv-lang.c
===================================================================
--- gdb.git.orig/gdb/jv-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/jv-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -1128,6 +1128,7 @@ const struct language_defn java_language
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/language.c
===================================================================
--- gdb.git.orig/gdb/language.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/language.c	2008-11-23 23:13:59.000000000 -0200
@@ -1043,6 +1043,12 @@ default_print_array_index (struct value 
   fprintf_filtered (stream, "] = ");
 }
 
+int
+default_getstr (struct value *value, gdb_byte **buffer, int *length)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1171,7 @@ const struct language_defn unknown_langu
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
@@ -1203,6 +1210,7 @@ const struct language_defn auto_language
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
@@ -1240,6 +1248,7 @@ const struct language_defn local_languag
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 \f
Index: gdb.git/gdb/language.h
===================================================================
--- gdb.git.orig/gdb/language.h	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/language.h	2008-11-23 23:13:59.000000000 -0200
@@ -282,6 +282,12 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual terminating null character).  */
+    int (*la_getstr) (struct value *value, gdb_byte **buffer, int *length);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +386,8 @@ extern enum language set_language (enum 
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length) \
+  (current_language->la_getstr(value, buffer, length))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +497,6 @@ int default_pass_by_reference (struct ty
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+int default_getstr (struct value *value, gdb_byte **buffer, int *length);
+
 #endif /* defined (LANGUAGE_H) */
Index: gdb.git/gdb/m2-lang.c
===================================================================
--- gdb.git.orig/gdb/m2-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/m2-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -394,6 +394,7 @@ const struct language_defn m2_language_d
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/objc-lang.c
===================================================================
--- gdb.git.orig/gdb/objc-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/objc-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -531,6 +531,7 @@ const struct language_defn objc_language
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/p-lang.c
===================================================================
--- gdb.git.orig/gdb/p-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/p-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -433,6 +433,7 @@ const struct language_defn pascal_langua
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/scm-lang.c
===================================================================
--- gdb.git.orig/gdb/scm-lang.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/scm-lang.c	2008-11-23 23:13:59.000000000 -0200
@@ -273,6 +273,7 @@ const struct language_defn scm_language_
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_getstr,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/typeprint.c
===================================================================
--- gdb.git.orig/gdb/typeprint.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/typeprint.c	2008-11-23 23:13:59.000000000 -0200
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *var
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
Index: gdb.git/gdb/valprint.c
===================================================================
--- gdb.git.orig/gdb/valprint.c	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/valprint.c	2008-11-23 23:37:25.000000000 -0200
@@ -1177,43 +1177,44 @@ partial_memory_read (CORE_ADDR memaddr, 
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
-
-/* FIXME: Use target_read_string.  */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
+
+   If LEN is -1, stops at the first null character (not necessarily the first
+   null byte) up to a maximum of FETCHLIMIT characters, otherwise reading
+   proceeds (including null characters) until LEN characters have been read.
+   Set FETCHLIMIT to UINT_MAX to read as many characters as possible from the
+   string.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
+
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps callers of target_read_string should use
+   this function instead?  */
 
 int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
 {
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
 
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
-
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
-
-  /* Now decide how large of chunks to try to read in one operation.  This
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
-     are looking for a null terminator to end the fetching, so we might as
+     are looking for a NUL terminator to end the fetching, so we might as
      well read in blocks that are large enough to be efficient, but not so
      large as to be slow if fetchlimit happens to be large.  So we choose the
      minimum of 8 and fetchlimit.  We used to use 200 instead of 8 but
@@ -1221,17 +1222,17 @@ val_print_string (CORE_ADDR addr, int le
 
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
-  /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+  /* Loop until we either have all the characters, or we encounter
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1242,36 @@ val_print_string (CORE_ADDR addr, int le
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
-	  /* Scan this chunk for the null byte that terminates the string
+	  /* Scan this chunk for the null character that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
-	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     after the null character, or at the next character after the end
+	     of the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1279,7 +1284,7 @@ val_print_string (CORE_ADDR addr, int le
 	      if (c == 0)
 		{
 		  /* We don't care about any error which happened after
-		     the NULL terminator.  */
+		     the NUL terminator.  */
 		  errcode = 0;
 		  found_nul = 1;
 		  break;
@@ -1287,26 +1292,70 @@ val_print_string (CORE_ADDR addr, int le
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
-	     && !found_nul);	/* haven't found nul yet */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
+	     && !found_nul);	/* haven't found NUL yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      *buffer = bufptr = NULL;
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr += bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
       gdb_byte *peekbuf;
 
-      /* We didn't find a null terminator we were looking for.  Attempt
+      /* We didn't find a NUL terminator we were looking for.  Attempt
          to peek at the next character.  If not successful, or it is not
          a null byte, then force ellipsis to be printed.  */
 
@@ -1316,26 +1365,24 @@ val_print_string (CORE_ADDR addr, int le
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1353,9 +1400,11 @@ val_print_string (CORE_ADDR addr, int le
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
Index: gdb.git/gdb/valprint.h
===================================================================
--- gdb.git.orig/gdb/valprint.h	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/valprint.h	2008-11-23 23:13:59.000000000 -0200
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_f
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
Index: gdb.git/gdb/value.h
===================================================================
--- gdb.git.orig/gdb/value.h	2008-11-23 22:51:43.000000000 -0200
+++ gdb.git/gdb/value.h	2008-11-23 23:13:59.000000000 -0200
@@ -514,6 +514,8 @@ extern void modify_field (gdb_byte *addr
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char * type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 14:59   ` Thiago Jung Bauermann
@ 2008-11-24 16:19     ` Daniel Jacobowitz
  2008-11-24 20:22       ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Daniel Jacobowitz @ 2008-11-24 16:19 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: gdb-patches ml

On Sun, Nov 23, 2008 at 11:40:21PM -0200, Thiago Jung Bauermann wrote:
> The comment is wrong. c_getstr actually reads until a null character is
> found, but doesn't read past the end of an array with known length. I
> updated the comment to reflect that.
> 
> Except that there was a bug for GDB-hosted strings, in which case the
> function behaved as described by the comment. I fixed it now.

Are you sure?  I took a look at read_string and its comments and it
seemed to do what I described - but I might be wrong.

+   If LEN is -1, stops at the first null character (not necessarily the first
+   null byte) up to a maximum of FETCHLIMIT characters, otherwise reading
+   proceeds (including null characters) until LEN characters have been read.
+   Set FETCHLIMIT to UINT_MAX to read as many characters as possible from the
+   string.

-- 
Daniel Jacobowitz
CodeSourcery


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 16:19     ` Daniel Jacobowitz
@ 2008-11-24 20:22       ` Thiago Jung Bauermann
  2008-11-25  2:16         ` Daniel Jacobowitz
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-24 20:22 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches ml

El dom, 23-11-2008 a las 21:28 -0500, Daniel Jacobowitz escribiÃ³:
> On Sun, Nov 23, 2008 at 11:40:21PM -0200, Thiago Jung Bauermann wrote:
> > The comment is wrong. c_getstr actually reads until a null character is
> > found, but doesn't read past the end of an array with known length. I
> > updated the comment to reflect that.
> > 
> > Except that there was a bug for GDB-hosted strings, in which case the
> > function behaved as described by the comment. I fixed it now.
> 
> Are you sure?  I took a look at read_string and its comments and it
> seemed to do what I described - but I might be wrong.
> 
> +   If LEN is -1, stops at the first null character (not necessarily the first
> +   null byte) up to a maximum of FETCHLIMIT characters, otherwise reading
> +   proceeds (including null characters) until LEN characters have been read.
> +   Set FETCHLIMIT to UINT_MAX to read as many characters as possible from the
> +   string.

c_getstr allways passes -1 as the LEN argument, and if the length of the
array is known, it passes the length as FETCHLIMIT. If it is not known,
FETCHLIMIT will be UINT_MAX (actually, the code currently passes -1,
I'll change it to pass UINT_MAX explicitly).

Would the comment above be clearer if it read as follows?

  If LEN > 0, reads exactly LEN characters (including eventual NULs in
  the middle or end of the string).  If LEN is -1, stops at the first
  null character (not necessarily the first null byte) up to a maximum
  of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
  characters as possible from the string.

By the way, I just realised that if LEN is 0, BUFFER is not allocated,
contradicting the property I mention in the comment that "unless an
exception is thrown, BUFFER will always be allocated, even on failure".
I'll change it to allocate a 1 byte buffer in this case, to keep the
caller's life simple.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 20:22       ` Thiago Jung Bauermann
@ 2008-11-25  2:16         ` Daniel Jacobowitz
  2008-11-25  8:53           ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Daniel Jacobowitz @ 2008-11-25  2:16 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: gdb-patches ml

On Mon, Nov 24, 2008 at 04:34:19PM -0200, Thiago Jung Bauermann wrote:
> Would the comment above be clearer if it read as follows?
> 
>   If LEN > 0, reads exactly LEN characters (including eventual NULs in
>   the middle or end of the string).  If LEN is -1, stops at the first
>   null character (not necessarily the first null byte) up to a maximum
>   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
>   characters as possible from the string.

Yes, I do think that's clearer.  I didn't realize on a quick read that
LEN and FETCHLIMIT were different... sorry for the confusion.

> By the way, I just realised that if LEN is 0, BUFFER is not allocated,
> contradicting the property I mention in the comment that "unless an
> exception is thrown, BUFFER will always be allocated, even on failure".
> I'll change it to allocate a 1 byte buffer in this case, to keep the
> caller's life simple.

Speaking of BUFFER, this is GDB - it's written in a language with
null-terminated strings.  Would a definition of read_string that
always null-terminated the buffer be more useful, or just confusing?
Now the result is null-terminated iff the target string is.  Of
course the string might include zero bytes if it's an array, but
the caller knows this.  It would make it easier to use the returned
buffer when reading a variable-length string, I think.

-- 
Daniel Jacobowitz
CodeSourcery

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-25  2:16         ` Daniel Jacobowitz
@ 2008-11-25  8:53           ` Thiago Jung Bauermann
  2009-01-03  2:27             ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-25  8:53 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches ml

El lun, 24-11-2008 a las 15:21 -0500, Daniel Jacobowitz escribiÃ³:
> On Mon, Nov 24, 2008 at 04:34:19PM -0200, Thiago Jung Bauermann wrote:
> >   If LEN > 0, reads exactly LEN characters (including eventual NULs in
> >   the middle or end of the string).  If LEN is -1, stops at the first
> >   null character (not necessarily the first null byte) up to a maximum
> >   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
> >   characters as possible from the string.
> 
> Yes, I do think that's clearer.  I didn't realize on a quick read that
> LEN and FETCHLIMIT were different... sorry for the confusion.

Ok, this version of the patch has the following changes from my last
submission:

- changed read_string comment to use the wording above;
- renamed la_getstr, c_getstr and default_getstr to la_get_string, 
  c_get_string and default_get_string;
- added const char **encoding argument to {la,c,default}_get_string,
  so that the charset of the string can be returned to the caller;
- fixed c_get_string to use UINT_MAX for fetchlimit instead of -1;
- fixed c_get_string to allocate 1 byte when LEN is 0.

WDYT?

> > By the way, I just realised that if LEN is 0, BUFFER is not allocated,
> > contradicting the property I mention in the comment that "unless an
> > exception is thrown, BUFFER will always be allocated, even on failure".
> > I'll change it to allocate a 1 byte buffer in this case, to keep the
> > caller's life simple.
> 
> Speaking of BUFFER, this is GDB - it's written in a language with
> null-terminated strings.  Would a definition of read_string that
> always null-terminated the buffer be more useful, or just confusing?
> Now the result is null-terminated iff the target string is.  Of
> course the string might include zero bytes if it's an array, but
> the caller knows this.  It would make it easier to use the returned
> buffer when reading a variable-length string, I think.

Actually, GDB is written in a language with no concept of strings. :-)
I don't think it's helpful to encourage GDB code to assume strings are
terminated by a null byte, since multibyte encodings (including UTF-16
and UCS-4) can have them in the middle of the string as part of a
character.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2008-11-24  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_get_string member.
	(LA_GET_STRING): New macro.
	(default_get_string): New prototype.
	* language.c (default_get_string): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_get_string for la_get_string.
	* c-lang.c (c_get_string): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_get_string for la_get_string.
	(minimal_language_defn): Use default_get_string for la_get_string.
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Likewise.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

Index: gdb.git/gdb/ada-lang.c
===================================================================
--- gdb.git.orig/gdb/ada-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/ada-lang.c	2008-11-24 19:55:23.000000000 -0200
@@ -11084,6 +11084,7 @@ const struct language_defn ada_language_
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/c-lang.c
===================================================================
--- gdb.git.orig/gdb/c-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/c-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -181,6 +181,118 @@ c_printstr (struct ui_file *stream, cons
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  The string is
+   read until a null character is found. If VALUE is an array with known
+   length, the function will not read past the end of the array.  LENGTH
+   will contain the size of the string in bytes (not counting the null
+   character).
+
+   Assumes strings are terminated by a null character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a null byte present in a multi-byte character will not
+   terminate the string unless the whole character is null.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check LENGTH to recognize this situation.
+
+   CHARSET is always set to the target charset.
+
+   Return 0 on success, errno on failure.  */
+
+static int
+c_get_string (struct value *value, gdb_byte **buffer, int *length,
+	      const char **charset)
+{
+  int err, width;
+  unsigned int fetchlimit;
+  struct type *type = value_type (value);
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if ((TYPE_NFIELDS (type) == 1)
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = UINT_MAX;
+    }
+  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
+    fetchlimit = UINT_MAX;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
+      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the array.  */
+  if (((VALUE_LVAL (value) == not_lval)
+      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != UINT_MAX))
+    {
+      int i;
+      const gdb_byte *contents = value_contents (value);
+
+      /* Look for a null character.  */
+      for (i = 0; i < fetchlimit; i++)
+	if (extract_unsigned_integer (contents + i*width, width) == 0)
+	  break;
+
+      /* i is now either the number of non-null characters, or fetchlimit.  */
+      *length = i*width;
+      *buffer = xmalloc (*length);
+      memcpy (*buffer, contents, *length);
+      err = 0;
+    }
+  else
+    err = read_string (value_as_address (value), -1, width, fetchlimit,
+		       buffer, length);
+
+  /* If the last character is null, subtract it from length.  */
+  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
+      *length -= width;
+
+  *charset = target_charset ();
+
+  return err;
+
+error:
+  {
+    char *type_str;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -419,6 +531,7 @@ const struct language_defn c_language_de
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -537,6 +650,7 @@ const struct language_defn cplus_languag
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -574,6 +688,7 @@ const struct language_defn asm_language_
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -616,6 +731,7 @@ const struct language_defn minimal_langu
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/f-lang.c
===================================================================
--- gdb.git.orig/gdb/f-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/f-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -343,6 +343,7 @@ const struct language_defn f_language_de
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/jv-lang.c
===================================================================
--- gdb.git.orig/gdb/jv-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/jv-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -1128,6 +1128,7 @@ const struct language_defn java_language
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/language.c
===================================================================
--- gdb.git.orig/gdb/language.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/language.c	2008-11-24 19:55:24.000000000 -0200
@@ -1043,6 +1043,13 @@ default_print_array_index (struct value 
   fprintf_filtered (stream, "] = ");
 }
 
+int
+default_get_string (struct value *value, gdb_byte **buffer, int *length,
+		    const char **charset)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1172,7 @@ const struct language_defn unknown_langu
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1203,6 +1211,7 @@ const struct language_defn auto_language
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1240,6 +1249,7 @@ const struct language_defn local_languag
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 \f
Index: gdb.git/gdb/language.h
===================================================================
--- gdb.git.orig/gdb/language.h	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/language.h	2008-11-24 19:55:24.000000000 -0200
@@ -282,6 +282,14 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual terminating null character).  CHARSET will hold the encoding
+       used in the string.  */
+    int (*la_get_string) (struct value *value, gdb_byte **buffer, int *length,
+			  const char **charset);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +388,8 @@ extern enum language set_language (enum 
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length, encoding) \
+  (current_language->la_get_string(value, buffer, length, encoding))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +499,7 @@ int default_pass_by_reference (struct ty
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+int default_get_string (struct value *value, gdb_byte **buffer, int *length,
+			const char **charset);
+
 #endif /* defined (LANGUAGE_H) */
Index: gdb.git/gdb/m2-lang.c
===================================================================
--- gdb.git.orig/gdb/m2-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/m2-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -394,6 +394,7 @@ const struct language_defn m2_language_d
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/objc-lang.c
===================================================================
--- gdb.git.orig/gdb/objc-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/objc-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -531,6 +531,7 @@ const struct language_defn objc_language
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/p-lang.c
===================================================================
--- gdb.git.orig/gdb/p-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/p-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -433,6 +433,7 @@ const struct language_defn pascal_langua
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/scm-lang.c
===================================================================
--- gdb.git.orig/gdb/scm-lang.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/scm-lang.c	2008-11-24 19:55:24.000000000 -0200
@@ -273,6 +273,7 @@ const struct language_defn scm_language_
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
Index: gdb.git/gdb/typeprint.c
===================================================================
--- gdb.git.orig/gdb/typeprint.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/typeprint.c	2008-11-24 19:55:16.000000000 -0200
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *var
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
Index: gdb.git/gdb/valprint.c
===================================================================
--- gdb.git.orig/gdb/valprint.c	2008-11-24 19:15:49.000000000 -0200
+++ gdb.git/gdb/valprint.c	2008-11-24 19:55:24.000000000 -0200
@@ -1177,43 +1177,44 @@ partial_memory_read (CORE_ADDR memaddr, 
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
-
-/* FIXME: Use target_read_string.  */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
+
+   If LEN > 0, reads exactly LEN characters (including eventual NULs in
+   the middle or end of the string).  If LEN is -1, stops at the first
+   null character (not necessarily the first null byte) up to a maximum
+   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
+   characters as possible from the string.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
+
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps callers of target_read_string should use
+   this function instead?  */
 
 int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
 {
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
 
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
-
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
-
-  /* Now decide how large of chunks to try to read in one operation.  This
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
-     are looking for a null terminator to end the fetching, so we might as
+     are looking for a NUL terminator to end the fetching, so we might as
      well read in blocks that are large enough to be efficient, but not so
      large as to be slow if fetchlimit happens to be large.  So we choose the
      minimum of 8 and fetchlimit.  We used to use 200 instead of 8 but
@@ -1221,17 +1222,17 @@ val_print_string (CORE_ADDR addr, int le
 
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
-  /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+  /* Loop until we either have all the characters, or we encounter
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1242,36 @@ val_print_string (CORE_ADDR addr, int le
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
-	  /* Scan this chunk for the null byte that terminates the string
+	  /* Scan this chunk for the null character that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
-	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     after the null character, or at the next character after the end
+	     of the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1279,7 +1284,7 @@ val_print_string (CORE_ADDR addr, int le
 	      if (c == 0)
 		{
 		  /* We don't care about any error which happened after
-		     the NULL terminator.  */
+		     the NUL terminator.  */
 		  errcode = 0;
 		  found_nul = 1;
 		  break;
@@ -1287,26 +1292,71 @@ val_print_string (CORE_ADDR addr, int le
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
-	     && !found_nul);	/* haven't found nul yet */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
+	     && !found_nul);	/* haven't found NUL yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      /* We always allocate *buffer.  */
+      *buffer = bufptr = xmalloc (1);
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr += bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
       gdb_byte *peekbuf;
 
-      /* We didn't find a null terminator we were looking for.  Attempt
+      /* We didn't find a NUL terminator we were looking for.  Attempt
          to peek at the next character.  If not successful, or it is not
          a null byte, then force ellipsis to be printed.  */
 
@@ -1316,26 +1366,24 @@ val_print_string (CORE_ADDR addr, int le
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1353,9 +1401,11 @@ val_print_string (CORE_ADDR addr, int le
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
Index: gdb.git/gdb/valprint.h
===================================================================
--- gdb.git.orig/gdb/valprint.h	2008-11-24 19:15:50.000000000 -0200
+++ gdb.git/gdb/valprint.h	2008-11-24 19:55:16.000000000 -0200
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_f
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
Index: gdb.git/gdb/value.h
===================================================================
--- gdb.git.orig/gdb/value.h	2008-11-24 19:15:50.000000000 -0200
+++ gdb.git/gdb/value.h	2008-11-24 19:55:16.000000000 -0200
@@ -514,6 +514,8 @@ extern void modify_field (gdb_byte *addr
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char * type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-25  8:53           ` Thiago Jung Bauermann
@ 2009-01-03  2:27             ` Thiago Jung Bauermann
  2009-02-02 18:42               ` Tom Tromey
  2009-02-03  0:23               ` Joel Brobecker
  0 siblings, 2 replies; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-01-03  2:27 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: gdb-patches ml

El lun, 24-11-2008 a las 20:09 -0200, Thiago Jung Bauermann escribiÃ³: 
> Ok, this version of the patch has the following changes from my last
> submission:
> 
> - changed read_string comment to use the wording above;
> - renamed la_getstr, c_getstr and default_getstr to la_get_string, 
>   c_get_string and default_get_string;
> - added const char **encoding argument to {la,c,default}_get_string,
>   so that the charset of the string can be returned to the caller;
> - fixed c_get_string to use UINT_MAX for fetchlimit instead of -1;
> - fixed c_get_string to allocate 1 byte when LEN is 0.
> 
> WDYT?

The patch doesn't apply anymore. This is the same patch, refreshed
against HEAD as of Dec 28th. Ok?
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2009-01-03  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_get_string member.
	(LA_GET_STRING): New macro.
	(default_get_string): New prototype.
	* language.c (default_get_string): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_get_string for la_get_string.
	* c-lang.c (c_get_string): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_get_string for la_get_string.
	(minimal_language_defn): Use default_get_string for la_get_string.
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Likewise.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 71d99b0..f8b980b 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -1,7 +1,7 @@
 /* Ada language support routines for GDB, the GNU debugger.  Copyright (C)
 
-   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007
-   Free Software Foundation, Inc.
+   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007, 2008,
+   2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -11057,6 +11057,7 @@ const struct language_defn ada_language_defn = {
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index dc7b059..3324405 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -1,7 +1,7 @@
 /* C language support routines for GDB, the GNU debugger.
 
    Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2002, 2003,
-   2004, 2005, 2007, 2008 Free Software Foundation, Inc.
+   2004, 2005, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -181,6 +181,118 @@ c_printstr (struct ui_file *stream, const gdb_byte *string,
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  The string is
+   read until a null character is found. If VALUE is an array with known
+   length, the function will not read past the end of the array.  LENGTH
+   will contain the size of the string in bytes (not counting the null
+   character).
+
+   Assumes strings are terminated by a null character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a null byte present in a multi-byte character will not
+   terminate the string unless the whole character is null.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check LENGTH to recognize this situation.
+
+   CHARSET is always set to the target charset.
+
+   Return 0 on success, errno on failure.  */
+
+static int
+c_get_string (struct value *value, gdb_byte **buffer, int *length,
+	      const char **charset)
+{
+  int err, width;
+  unsigned int fetchlimit;
+  struct type *type = value_type (value);
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if ((TYPE_NFIELDS (type) == 1)
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = UINT_MAX;
+    }
+  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
+    fetchlimit = UINT_MAX;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
+      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the array.  */
+  if (((VALUE_LVAL (value) == not_lval)
+      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != UINT_MAX))
+    {
+      int i;
+      const gdb_byte *contents = value_contents (value);
+
+      /* Look for a null character.  */
+      for (i = 0; i < fetchlimit; i++)
+	if (extract_unsigned_integer (contents + i*width, width) == 0)
+	  break;
+
+      /* i is now either the number of non-null characters, or fetchlimit.  */
+      *length = i*width;
+      *buffer = xmalloc (*length);
+      memcpy (*buffer, contents, *length);
+      err = 0;
+    }
+  else
+    err = read_string (value_as_address (value), -1, width, fetchlimit,
+		       buffer, length);
+
+  /* If the last character is null, subtract it from length.  */
+  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
+      *length -= width;
+
+  *charset = target_charset ();
+
+  return err;
+
+error:
+  {
+    char *type_str;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -314,6 +426,7 @@ const struct language_defn c_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -432,6 +545,7 @@ const struct language_defn cplus_language_defn =
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -469,6 +583,7 @@ const struct language_defn asm_language_defn =
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -511,6 +626,7 @@ const struct language_defn minimal_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/f-lang.c b/gdb/f-lang.c
index 4d4d4d7..6359841 100644
--- a/gdb/f-lang.c
+++ b/gdb/f-lang.c
@@ -1,7 +1,7 @@
 /* Fortran language support routines for GDB, the GNU debugger.
 
    Copyright (C) 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002, 2003,
-   2004, 2005, 2007, 2008 Free Software Foundation, Inc.
+   2004, 2005, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    Contributed by Motorola.  Adapted from the C parser by Farooq Butt
    (fmbutt@engage.sps.mot.com).
@@ -343,6 +343,7 @@ const struct language_defn f_language_defn =
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/jv-lang.c b/gdb/jv-lang.c
index 0470eef..0728866 100644
--- a/gdb/jv-lang.c
+++ b/gdb/jv-lang.c
@@ -1,6 +1,6 @@
 /* Java language support routines for GDB, the GNU debugger.
 
-   Copyright (C) 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007, 2008
+   Copyright (C) 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007, 2008, 2009
    Free Software Foundation, Inc.
 
    This file is part of GDB.
@@ -1128,6 +1128,7 @@ const struct language_defn java_language_defn =
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/language.c b/gdb/language.c
index 46e238d..36198c3 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -1,7 +1,7 @@
 /* Multiple source language support for GDB.
 
    Copyright (C) 1991, 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001,
-   2002, 2003, 2004, 2005, 2007, 2008 Free Software Foundation, Inc.
+   2002, 2003, 2004, 2005, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    Contributed by the Department of Computer Science at the State University
    of New York at Buffalo.
@@ -1043,6 +1043,13 @@ default_print_array_index (struct value *index_value, struct ui_file *stream,
   fprintf_filtered (stream, "] = ");
 }
 
+int
+default_get_string (struct value *value, gdb_byte **buffer, int *length,
+		    const char **charset)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1172,7 @@ const struct language_defn unknown_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1203,6 +1211,7 @@ const struct language_defn auto_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1240,6 +1249,7 @@ const struct language_defn local_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 \f
diff --git a/gdb/language.h b/gdb/language.h
index c92c57c..46f8f34 100644
--- a/gdb/language.h
+++ b/gdb/language.h
@@ -1,7 +1,7 @@
 /* Source-language-related definitions for GDB.
 
    Copyright (C) 1991, 1992, 1993, 1994, 1995, 1998, 1999, 2000, 2003, 2004,
-   2007, 2008 Free Software Foundation, Inc.
+   2007, 2008, 2009 Free Software Foundation, Inc.
 
    Contributed by the Department of Computer Science at the State University
    of New York at Buffalo.
@@ -282,6 +282,14 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual terminating null character).  CHARSET will hold the encoding
+       used in the string.  */
+    int (*la_get_string) (struct value *value, gdb_byte **buffer, int *length,
+			  const char **charset);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +388,8 @@ extern enum language set_language (enum language);
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length, encoding) \
+  (current_language->la_get_string(value, buffer, length, encoding))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +499,7 @@ int default_pass_by_reference (struct type *type);
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+int default_get_string (struct value *value, gdb_byte **buffer, int *length,
+			const char **charset);
+
 #endif /* defined (LANGUAGE_H) */
diff --git a/gdb/m2-lang.c b/gdb/m2-lang.c
index e09b64b..9e4bb1b 100644
--- a/gdb/m2-lang.c
+++ b/gdb/m2-lang.c
@@ -1,7 +1,7 @@
 /* Modula 2 language support routines for GDB, the GNU debugger.
 
    Copyright (C) 1992, 1993, 1994, 1995, 1996, 1998, 2000, 2002, 2003, 2004,
-   2005, 2007, 2008 Free Software Foundation, Inc.
+   2005, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -394,6 +394,7 @@ const struct language_defn m2_language_defn =
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/objc-lang.c b/gdb/objc-lang.c
index 3a952f5..a6c74a3 100644
--- a/gdb/objc-lang.c
+++ b/gdb/objc-lang.c
@@ -1,6 +1,6 @@
 /* Objective-C language support routines for GDB, the GNU debugger.
 
-   Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008
+   Copyright (C) 2002, 2003, 2004, 2005, 2007, 2008, 2009
    Free Software Foundation, Inc.
 
    Contributed by Apple Computer, Inc.
@@ -531,6 +531,7 @@ const struct language_defn objc_language_defn = {
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/p-lang.c b/gdb/p-lang.c
index cd4285d..4ba6136 100644
--- a/gdb/p-lang.c
+++ b/gdb/p-lang.c
@@ -1,6 +1,6 @@
 /* Pascal language support routines for GDB, the GNU debugger.
 
-   Copyright (C) 2000, 2002, 2003, 2004, 2005, 2007, 2008
+   Copyright (C) 2000, 2002, 2003, 2004, 2005, 2007, 2008, 2009
    Free Software Foundation, Inc.
 
    This file is part of GDB.
@@ -433,6 +433,7 @@ const struct language_defn pascal_language_defn =
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/scm-lang.c b/gdb/scm-lang.c
index 0b25590..345befd 100644
--- a/gdb/scm-lang.c
+++ b/gdb/scm-lang.c
@@ -1,7 +1,7 @@
 /* Scheme/Guile language support routines for GDB, the GNU debugger.
 
    Copyright (C) 1995, 1996, 1998, 2000, 2001, 2002, 2003, 2004, 2005, 2007,
-   2008 Free Software Foundation, Inc.
+   2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -273,6 +273,7 @@ const struct language_defn scm_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/typeprint.c b/gdb/typeprint.c
index edf87cd..1f824fa 100644
--- a/gdb/typeprint.c
+++ b/gdb/typeprint.c
@@ -1,7 +1,7 @@
 /* Language independent support for printing types for GDB, the GNU debugger.
 
    Copyright (C) 1986, 1988, 1989, 1991, 1992, 1993, 1994, 1995, 1998, 1999,
-   2000, 2001, 2003, 2006, 2007, 2008 Free Software Foundation, Inc.
+   2000, 2001, 2003, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *varstring, struct ui_file *stream,
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
diff --git a/gdb/valprint.c b/gdb/valprint.c
index 6bcb2f8..4654d24 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -1,8 +1,8 @@
 /* Print values for GDB, the GNU debugger.
 
    Copyright (C) 1986, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996,
-   1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008
-   Free Software Foundation, Inc.
+   1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
+   2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -1177,43 +1177,44 @@ partial_memory_read (CORE_ADDR memaddr, gdb_byte *myaddr, int len, int *errnoptr
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
 
-/* FIXME: Use target_read_string.  */
+   If LEN > 0, reads exactly LEN characters (including eventual NULs in
+   the middle or end of the string).  If LEN is -1, stops at the first
+   null character (not necessarily the first null byte) up to a maximum
+   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
+   characters as possible from the string.
 
-int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
-{
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
-
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
 
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps callers of target_read_string should use
+   this function instead?  */
 
-  /* Now decide how large of chunks to try to read in one operation.  This
+int
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
+{
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
-     are looking for a null terminator to end the fetching, so we might as
+     are looking for a NUL terminator to end the fetching, so we might as
      well read in blocks that are large enough to be efficient, but not so
      large as to be slow if fetchlimit happens to be large.  So we choose the
      minimum of 8 and fetchlimit.  We used to use 200 instead of 8 but
@@ -1221,17 +1222,17 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
-  /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+  /* Loop until we either have all the characters, or we encounter
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1242,36 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
-	  /* Scan this chunk for the null byte that terminates the string
+	  /* Scan this chunk for the null character that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
-	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     after the null character, or at the next character after the end
+	     of the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1279,7 +1284,7 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	      if (c == 0)
 		{
 		  /* We don't care about any error which happened after
-		     the NULL terminator.  */
+		     the NUL terminator.  */
 		  errcode = 0;
 		  found_nul = 1;
 		  break;
@@ -1287,26 +1292,71 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
-	     && !found_nul);	/* haven't found nul yet */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
+	     && !found_nul);	/* haven't found NUL yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      /* We always allocate *buffer.  */
+      *buffer = bufptr = xmalloc (1);
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr += bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
       gdb_byte *peekbuf;
 
-      /* We didn't find a null terminator we were looking for.  Attempt
+      /* We didn't find a NUL terminator we were looking for.  Attempt
          to peek at the next character.  If not successful, or it is not
          a null byte, then force ellipsis to be printed.  */
 
@@ -1316,26 +1366,24 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1353,9 +1401,11 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
diff --git a/gdb/valprint.h b/gdb/valprint.h
index 47a2c4f..8b65af6 100644
--- a/gdb/valprint.h
+++ b/gdb/valprint.h
@@ -1,7 +1,7 @@
 /* Declarations for value printing routines for GDB, the GNU debugger.
 
    Copyright (C) 1986, 1988, 1989, 1991, 1992, 1993, 1994, 2000, 2005, 2007,
-   2008 Free Software Foundation, Inc.
+   2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_file *, const gdb_byte *,
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
diff --git a/gdb/value.h b/gdb/value.h
index 2961919..c6e29dc 100644
--- a/gdb/value.h
+++ b/gdb/value.h
@@ -2,7 +2,7 @@
 
    Copyright (C) 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995,
    1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007,
-   2008 Free Software Foundation, Inc.
+   2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -520,6 +520,8 @@ extern void modify_field (gdb_byte *addr, LONGEST fieldval, int bitpos,
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char * type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-01-03  2:27             ` Thiago Jung Bauermann
@ 2009-02-02 18:42               ` Tom Tromey
  2009-02-03 12:51                 ` Thiago Jung Bauermann
  2009-02-03  0:23               ` Joel Brobecker
  1 sibling, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2009-02-02 18:42 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

[ la_getstr ]
Thiago> The patch doesn't apply anymore. This is the same patch, refreshed
Thiago> against HEAD as of Dec 28th. 

Thiago> +   Return 0 on success, errno on failure.  */
Thiago> +
Thiago> +static int
Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int *length,
Thiago> +	      const char **charset)

The return value here is a bit funny.  For some errors this function
calls error, but for others it returns the result of read_string.

When would we want the latter behavior?  I think perhaps la_getstr
should simply have 'void' type and then call error if read_string
returns an error.

What do you think?

Otherwise, this patch seems good to me.  I'm not completely
comfortable approving it outright; but if there are no objections
after a week or so, I will.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-02 18:42               ` Tom Tromey
@ 2009-02-03 12:51                 ` Thiago Jung Bauermann
  2009-02-03 17:44                   ` Tom Tromey
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-03 12:51 UTC (permalink / raw)
  To: tromey; +Cc: Daniel Jacobowitz, gdb-patches ml

El lun, 02-02-2009 a las 11:41 -0700, Tom Tromey escribiÃ³:
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> 
> [ la_getstr ]
> Thiago> The patch doesn't apply anymore. This is the same patch, refreshed
> Thiago> against HEAD as of Dec 28th. 
> 
> Thiago> +   Return 0 on success, errno on failure.  */
> Thiago> +
> Thiago> +static int
> Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int *length,
> Thiago> +	      const char **charset)
> 
> The return value here is a bit funny.  For some errors this function
> calls error, but for others it returns the result of read_string.
> 
> When would we want the latter behavior?  I think perhaps la_getstr
> should simply have 'void' type and then call error if read_string
> returns an error.

The function throws an exception when it can't even start to read a
string, and returns an error when it may have already read something,
but hit an error halfway through.

> What do you think?

I don't have a strong preference. The function could always throw an
exception on error, and the caller would then check LENGTH to see if
something was read. I can change it to do so if you prefer it that way.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03 12:51                 ` Thiago Jung Bauermann
@ 2009-02-03 17:44                   ` Tom Tromey
  2009-02-04 12:37                     ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2009-02-03 17:44 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Tom> When would we want the latter behavior?  I think perhaps la_getstr
Tom> should simply have 'void' type and then call error if read_string
Tom> returns an error.

[...]

Thiago> I don't have a strong preference. The function could always throw an
Thiago> exception on error, and the caller would then check LENGTH to see if
Thiago> something was read. I can change it to do so if you prefer it that way.

When would a caller want to use a partial result?  If there is a case
where we'd want that, then the current approach used in the patch
seems ok.  However, if there is no case where we'd want to do that,
then this approach is more complex without a benefit.

The reason I ask is that I noticed that the only caller of this
function -- the new Value.string method -- discards a partial result.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03 17:44                   ` Tom Tromey
@ 2009-02-04 12:37                     ` Thiago Jung Bauermann
  2009-02-04 19:19                       ` Tom Tromey
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-04 12:37 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, gdb-patches ml

El mar, 03-02-2009 a las 10:44 -0700, Tom Tromey escribiÃ³:
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> 
> Tom> When would we want the latter behavior?  I think perhaps la_getstr
> Tom> should simply have 'void' type and then call error if read_string
> Tom> returns an error.
> 
> [...]
> 
> Thiago> I don't have a strong preference. The function could always throw an
> Thiago> exception on error, and the caller would then check LENGTH to see if
> Thiago> something was read. I can change it to do so if you prefer it that way.
> 
> When would a caller want to use a partial result?  If there is a case
> where we'd want that, then the current approach used in the patch
> seems ok.  However, if there is no case where we'd want to do that,
> then this approach is more complex without a benefit.
> 
> The reason I ask is that I noticed that the only caller of this
> function -- the new Value.string method -- discards a partial result.

You're worried about complexity in the code for c_get_string, or its
callers? The former is not a problem because we get this behaviour for
free from read_string, which is also called by val_print_string. This
function uses a partial result.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-04 12:37                     ` Thiago Jung Bauermann
@ 2009-02-04 19:19                       ` Tom Tromey
  2009-02-04 22:26                         ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2009-02-04 19:19 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> You're worried about complexity in the code for c_get_string, or its
Thiago> callers? The former is not a problem because we get this behaviour for
Thiago> free from read_string, which is also called by val_print_string. This
Thiago> function uses a partial result.

FYI -- we talked about this on irc, and Thiago is going to change
la_getstr to return void and throw an exception on a short read.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-04 19:19                       ` Tom Tromey
@ 2009-02-04 22:26                         ` Thiago Jung Bauermann
  2009-02-05  0:55                           ` Tom Tromey
  2009-02-05 15:55                           ` Tom Tromey
  0 siblings, 2 replies; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-04 22:26 UTC (permalink / raw)
  To: Tom Tromey; +Cc: Daniel Jacobowitz, gdb-patches ml

El miÃ©, 04-02-2009 a las 12:19 -0700, Tom Tromey escribiÃ³:
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> 
> Thiago> You're worried about complexity in the code for c_get_string, or its
> Thiago> callers? The former is not a problem because we get this behaviour for
> Thiago> free from read_string, which is also called by val_print_string. This
> Thiago> function uses a partial result.
> 
> FYI -- we talked about this on irc, and Thiago is going to change
> la_getstr to return void and throw an exception on a short read.

Right. Here's the new version. It also uses c_get_string for Ada and
minimal...
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2009-02-04  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_get_string member.
	(LA_GET_STRING): New macro.
	(default_get_string): New prototype.
	* language.c (default_get_string): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_get_string for la_get_string.
	* c-lang.c (c_get_string): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_get_string for la_get_string.
	(minimal_language_defn): Likewise
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Use default_get_string for
	la_get_string.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 656e771..a01d45c 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -1,7 +1,7 @@
 /* Ada language support routines for GDB, the GNU debugger.  Copyright (C)
 
-   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007
-   Free Software Foundation, Inc.
+   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007, 2008,
+   2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -11055,6 +11055,7 @@ const struct language_defn ada_language_defn = {
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 309a0b0..b5126b0 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -181,6 +181,121 @@ c_printstr (struct ui_file *stream, const gdb_byte *string,
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  The string is
+   read until a null character is found. If VALUE is an array with known
+   length, the function will not read past the end of the array.  LENGTH
+   will contain the size of the string in bytes (not counting the null
+   character).
+
+   Assumes strings are terminated by a null character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a null byte present in a multi-byte character will not
+   terminate the string unless the whole character is null.
+
+   CHARSET is always set to the target charset.  */
+
+void
+c_get_string (struct value *value, gdb_byte **buffer, int *length,
+	      const char **charset)
+{
+  int err, width;
+  unsigned int fetchlimit;
+  struct type *type = value_type (value);
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if (TYPE_NFIELDS (type) == 1
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = UINT_MAX;
+    }
+  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
+    fetchlimit = UINT_MAX;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if (TYPE_CODE (element_type) != TYPE_CODE_INT
+      && TYPE_CODE (element_type) != TYPE_CODE_CHAR)
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the array.  */
+  if ((VALUE_LVAL (value) == not_lval
+       || VALUE_LVAL (value) == lval_internalvar)
+      && fetchlimit != UINT_MAX)
+    {
+      int i;
+      const gdb_byte *contents = value_contents (value);
+
+      /* Look for a null character.  */
+      for (i = 0; i < fetchlimit; i++)
+	if (extract_unsigned_integer (contents + i*width, width) == 0)
+	  break;
+
+      /* i is now either the number of non-null characters, or fetchlimit.  */
+      *length = i*width;
+      *buffer = xmalloc (*length);
+      memcpy (*buffer, contents, *length);
+      err = 0;
+    }
+  else
+    {
+      err = read_string (value_as_address (value), -1, width, fetchlimit,
+			 buffer, length);
+      if (err)
+	{
+	  xfree (buffer);
+	  error (_("Error reading string from inferior: %s"),
+		 safe_strerror (err));
+	}
+    }
+
+  /* If the last character is null, subtract it from length.  */
+  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
+      *length -= width;
+
+  *charset = target_charset ();
+
+  return;
+
+error:
+  {
+    char *type_str;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -314,6 +429,7 @@ const struct language_defn c_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -432,6 +548,7 @@ const struct language_defn cplus_language_defn =
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -469,6 +586,7 @@ const struct language_defn asm_language_defn =
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -511,6 +629,7 @@ const struct language_defn minimal_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/f-lang.c b/gdb/f-lang.c
index cf4abaa..6359841 100644
--- a/gdb/f-lang.c
+++ b/gdb/f-lang.c
@@ -343,6 +343,7 @@ const struct language_defn f_language_defn =
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/jv-lang.c b/gdb/jv-lang.c
index 83fc064..0728866 100644
--- a/gdb/jv-lang.c
+++ b/gdb/jv-lang.c
@@ -1128,6 +1128,7 @@ const struct language_defn java_language_defn =
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/language.c b/gdb/language.c
index bdf2ed7..3c37a64 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -1043,6 +1043,13 @@ default_print_array_index (struct value *index_value, struct ui_file *stream,
   fprintf_filtered (stream, "] = ");
 }
 
+void
+default_get_string (struct value *value, gdb_byte **buffer, int *length,
+		    const char **charset)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1172,7 @@ const struct language_defn unknown_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1203,6 +1211,7 @@ const struct language_defn auto_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1240,6 +1249,7 @@ const struct language_defn local_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 \f
diff --git a/gdb/language.h b/gdb/language.h
index e7dd01e..85826fd 100644
--- a/gdb/language.h
+++ b/gdb/language.h
@@ -282,6 +282,14 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual terminating null character).  CHARSET will hold the encoding
+       used in the string.  */
+    void (*la_get_string) (struct value *value, gdb_byte **buffer, int *length,
+			  const char **charset);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +388,8 @@ extern enum language set_language (enum language);
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length, encoding) \
+  (current_language->la_get_string(value, buffer, length, encoding))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +499,10 @@ int default_pass_by_reference (struct type *type);
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+void default_get_string (struct value *value, gdb_byte **buffer, int *length,
+			 const char **charset);
+
+void c_get_string (struct value *value, gdb_byte **buffer, int *length,
+		   const char **charset);
+
 #endif /* defined (LANGUAGE_H) */
diff --git a/gdb/m2-lang.c b/gdb/m2-lang.c
index a8dd8ad..9e4bb1b 100644
--- a/gdb/m2-lang.c
+++ b/gdb/m2-lang.c
@@ -394,6 +394,7 @@ const struct language_defn m2_language_defn =
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/objc-lang.c b/gdb/objc-lang.c
index 466221c..a6c74a3 100644
--- a/gdb/objc-lang.c
+++ b/gdb/objc-lang.c
@@ -531,6 +531,7 @@ const struct language_defn objc_language_defn = {
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/p-lang.c b/gdb/p-lang.c
index 0e0de2e..4ba6136 100644
--- a/gdb/p-lang.c
+++ b/gdb/p-lang.c
@@ -433,6 +433,7 @@ const struct language_defn pascal_language_defn =
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/scm-lang.c b/gdb/scm-lang.c
index b604e40..345befd 100644
--- a/gdb/scm-lang.c
+++ b/gdb/scm-lang.c
@@ -273,6 +273,7 @@ const struct language_defn scm_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/typeprint.c b/gdb/typeprint.c
index 678aaa1..1f824fa 100644
--- a/gdb/typeprint.c
+++ b/gdb/typeprint.c
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *varstring, struct ui_file *stream,
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
diff --git a/gdb/valprint.c b/gdb/valprint.c
index b61da54..d13dc4c 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -1177,43 +1177,44 @@ partial_memory_read (CORE_ADDR memaddr, gdb_byte *myaddr, int len, int *errnoptr
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
-
-/* FIXME: Use target_read_string.  */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
+
+   If LEN > 0, reads exactly LEN characters (including eventual NULs in
+   the middle or end of the string).  If LEN is -1, stops at the first
+   null character (not necessarily the first null byte) up to a maximum
+   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
+   characters as possible from the string.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
+
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps callers of target_read_string should use
+   this function instead?  */
 
 int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
 {
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
-
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
-
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
-
-  /* Now decide how large of chunks to try to read in one operation.  This
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
-     are looking for a null terminator to end the fetching, so we might as
+     are looking for a NUL terminator to end the fetching, so we might as
      well read in blocks that are large enough to be efficient, but not so
      large as to be slow if fetchlimit happens to be large.  So we choose the
      minimum of 8 and fetchlimit.  We used to use 200 instead of 8 but
@@ -1221,17 +1222,17 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
-  /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+  /* Loop until we either have all the characters, or we encounter
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1242,36 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
-	  /* Scan this chunk for the null byte that terminates the string
+	  /* Scan this chunk for the null character that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
-	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     after the null character, or at the next character after the end
+	     of the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1279,7 +1284,7 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	      if (c == 0)
 		{
 		  /* We don't care about any error which happened after
-		     the NULL terminator.  */
+		     the NUL terminator.  */
 		  errcode = 0;
 		  found_nul = 1;
 		  break;
@@ -1287,26 +1292,71 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
-	     && !found_nul);	/* haven't found nul yet */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
+	     && !found_nul);	/* haven't found NUL yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      /* We always allocate *buffer.  */
+      *buffer = bufptr = xmalloc (1);
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr += bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
       gdb_byte *peekbuf;
 
-      /* We didn't find a null terminator we were looking for.  Attempt
+      /* We didn't find a NUL terminator we were looking for.  Attempt
          to peek at the next character.  If not successful, or it is not
          a null byte, then force ellipsis to be printed.  */
 
@@ -1316,26 +1366,24 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1353,9 +1401,11 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
diff --git a/gdb/valprint.h b/gdb/valprint.h
index 6e85bb4..8b65af6 100644
--- a/gdb/valprint.h
+++ b/gdb/valprint.h
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_file *, const gdb_byte *,
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
diff --git a/gdb/value.h b/gdb/value.h
index 724aa5a..5dec547 100644
--- a/gdb/value.h
+++ b/gdb/value.h
@@ -524,6 +524,8 @@ extern void modify_field (gdb_byte *addr, LONGEST fieldval, int bitpos,
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char * type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-04 22:26                         ` Thiago Jung Bauermann
@ 2009-02-05  0:55                           ` Tom Tromey
  2009-02-05 12:21                             ` Thiago Jung Bauermann
  2009-02-05 15:55                           ` Tom Tromey
  1 sibling, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2009-02-05  0:55 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> Right. Here's the new version. It also uses c_get_string for Ada and
Thiago> minimal...

I read it a little more closely and I'm afraid I have a couple more
comments.  Mostly they are little formatting nits, nothing important.

If you agree with the more substantive comments, then I think this
patch is ok with the suggested changes -- no need for another review.
If you don't agree, let's discuss it.

Thiago> +void
Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int *length,
Thiago> +	      const char **charset)
Thiago> +{
Thiago> +  int err, width;
Thiago> +  unsigned int fetchlimit;
Thiago> +  struct type *type = value_type (value);

I think this should probably use check_typedef.  Otherwise it seems
like we would not extract some strings properly.  E.g., consider a
case like "typedef char *string;" ... what will happen?

Thiago> +  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)

This should use 'type' instead of calling value_type again.

Thiago> +	if (extract_unsigned_integer (contents + i*width, width) == 0)

Need spaces around the "*".

Thiago> +      /* i is now either the number of non-null characters, or fetchlimit.  */

Use "I", not "i" .. a tiny nit for the GNU style, where the value of a
variable is referred to by uppercasing the name.

Thiago> +      *length = i*width;

Spaces.

Thiago> +  /* If the last character is null, subtract it from length.  */
Thiago> +  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
Thiago> +      *length -= width;

This second line looks like it has too much indentation.

Also, I suspect this should start "if (*length > 0 &&".
Otherwise, it seems like it will read memory before *BUFFER when
*LENGTH==0.

Thiago> +error:

Emacs claims that labels are indented one space in the GNU style.
I didn't see this in the standards manual though.

Thiago> +++ b/gdb/value.h
[...]
Thiago> +extern char * type_to_string (struct type *type);

Extra space after the "*".

thanks for persevering,
Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-05  0:55                           ` Tom Tromey
@ 2009-02-05 12:21                             ` Thiago Jung Bauermann
  2009-02-05 16:01                               ` Pierre Muller
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-05 12:21 UTC (permalink / raw)
  To: tromey; +Cc: Daniel Jacobowitz, gdb-patches ml

El miÃ©, 04-02-2009 a las 17:53 -0700, Tom Tromey escribiÃ³:
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> I read it a little more closely and I'm afraid I have a couple more
> comments.  Mostly they are little formatting nits, nothing important.
> 
> If you agree with the more substantive comments, then I think this
> patch is ok with the suggested changes -- no need for another review.
> If you don't agree, let's discuss it.
> 
> Thiago> +void
> Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int *length,
> Thiago> +	      const char **charset)
> Thiago> +{
> Thiago> +  int err, width;
> Thiago> +  unsigned int fetchlimit;
> Thiago> +  struct type *type = value_type (value);
> 
> I think this should probably use check_typedef.  Otherwise it seems
> like we would not extract some strings properly.  E.g., consider a
> case like "typedef char *string;" ... what will happen?

Nice catch. Fixed.

> Thiago> +  else if (TYPE_CODE (value_type (value)) == TYPE_CODE_PTR)
> 
> This should use 'type' instead of calling value_type again.

Fixed.

> Thiago> +	if (extract_unsigned_integer (contents + i*width, width) == 0)
> 
> Need spaces around the "*".

Fixed.

> Thiago> +      /* i is now either the number of non-null characters, or fetchlimit.  */
> 
> Use "I", not "i" .. a tiny nit for the GNU style, where the value of a
> variable is referred to by uppercasing the name.

Ok. I also uppercased fetchlimit.

> Thiago> +      *length = i*width;
> 
> Spaces.

Fixed.

> Thiago> +  /* If the last character is null, subtract it from length.  */
> Thiago> +  if (extract_unsigned_integer (*buffer + *length - width, width) == 0)
> Thiago> +      *length -= width;
> 
> This second line looks like it has too much indentation.

Fixed. I also uppercased length in the comment.

> Also, I suspect this should start "if (*length > 0 &&".
> Otherwise, it seems like it will read memory before *BUFFER when
> *LENGTH==0.

Nice catch again. Fixed.

> Thiago> +error:
> 
> Emacs claims that labels are indented one space in the GNU style.
> I didn't see this in the standards manual though.

Ok, changed.

> Thiago> +++ b/gdb/value.h
> [...]
> Thiago> +extern char * type_to_string (struct type *type);
> 
> Extra space after the "*".

Fixed.

> thanks for persevering,

:-)

I committed the following. Thanks for the review.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


2009-02-05  Thiago Jung Bauermann  <bauerman@br.ibm.com>

	* language.h (language_dfn): Add la_get_string member.
	(LA_GET_STRING): New macro.
	(default_get_string): New prototype.
	* language.c (default_get_string): New function.
	(unknown_language_defn, auto_language_defn, local_language_defn): Use
	default_get_string for la_get_string.
	* c-lang.c (c_get_string): New function.
	(c_language_defn, cplus_language_defn, asm_language_defn): Use
	c_get_string for la_get_string.
	(minimal_language_defn): Likewise
	* ada-lang.c (ada_language_defn): Likewise.
	* f-lang.c (f_language_defn): Use default_get_string for
	la_get_string.
	* jv-lang.c (java_language_defn): Likewise.
	* m2-lang.c (m2_language_defn): Likewise.
	* objc-lang.c (objc_language_defn): Likewise.
	* p-lang.c (p_language_defn): Likewise.
	* scm-lang.c (scm_language_defn): Likewise.
	* typeprint.c (type_to_string): New function.
	* value.h (type_to_string): New prototype.
	* valprint.c (val_print_string): Factor out code for reading string
	from the inferior into its own function.  Put 2 spaces after period
	in comments.
	(read_string): New function.
	* valprint.h (read_string): New prototype.

diff --git a/gdb/ada-lang.c b/gdb/ada-lang.c
index 656e771..a01d45c 100644
--- a/gdb/ada-lang.c
+++ b/gdb/ada-lang.c
@@ -1,7 +1,7 @@
 /* Ada language support routines for GDB, the GNU debugger.  Copyright (C)
 
-   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007
-   Free Software Foundation, Inc.
+   1992, 1993, 1994, 1997, 1998, 1999, 2000, 2003, 2004, 2005, 2007, 2008,
+   2009 Free Software Foundation, Inc.
 
    This file is part of GDB.
 
@@ -11055,6 +11055,7 @@ const struct language_defn ada_language_defn = {
   ada_language_arch_info,
   ada_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/c-lang.c b/gdb/c-lang.c
index 309a0b0..8b5410f 100644
--- a/gdb/c-lang.c
+++ b/gdb/c-lang.c
@@ -181,6 +181,122 @@ c_printstr (struct ui_file *stream, const gdb_byte *string,
   if (force_ellipses || i < length)
     fputs_filtered ("...", stream);
 }
+
+/* Obtain a C string from the inferior storing it in a newly allocated
+   buffer in BUFFER, which should be freed by the caller.  The string is
+   read until a null character is found. If VALUE is an array with known
+   length, the function will not read past the end of the array.  LENGTH
+   will contain the size of the string in bytes (not counting the null
+   character).
+
+   Assumes strings are terminated by a null character.  The size of a character
+   is determined by the length of the target type of the pointer or array.
+   This means that a null byte present in a multi-byte character will not
+   terminate the string unless the whole character is null.
+
+   CHARSET is always set to the target charset.  */
+
+void
+c_get_string (struct value *value, gdb_byte **buffer, int *length,
+	      const char **charset)
+{
+  int err, width;
+  unsigned int fetchlimit;
+  struct type *type = check_typedef (value_type (value));
+  struct type *element_type = TYPE_TARGET_TYPE (type);
+
+  if (element_type == NULL)
+    goto error;
+
+  if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+    {
+      /* If we know the size of the array, we can use it as a limit on the
+	 number of characters to be fetched.  */
+      if (TYPE_NFIELDS (type) == 1
+	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
+	{
+	  LONGEST low_bound, high_bound;
+
+	  get_discrete_bounds (TYPE_FIELD_TYPE (type, 0),
+			       &low_bound, &high_bound);
+	  fetchlimit = high_bound - low_bound + 1;
+	}
+      else
+	fetchlimit = UINT_MAX;
+    }
+  else if (TYPE_CODE (type) == TYPE_CODE_PTR)
+    fetchlimit = UINT_MAX;
+  else
+    /* We work only with arrays and pointers.  */
+    goto error;
+
+  element_type = check_typedef (element_type);
+  if (TYPE_CODE (element_type) != TYPE_CODE_INT
+      && TYPE_CODE (element_type) != TYPE_CODE_CHAR)
+    /* If the elements are not integers or characters, we don't consider it
+       a string.  */
+    goto error;
+
+  width = TYPE_LENGTH (element_type);
+
+  /* If the string lives in GDB's memory intead of the inferior's, then we
+     just need to copy it to BUFFER.  Also, since such strings are arrays
+     with known size, FETCHLIMIT will hold the size of the array.  */
+  if ((VALUE_LVAL (value) == not_lval
+       || VALUE_LVAL (value) == lval_internalvar)
+      && fetchlimit != UINT_MAX)
+    {
+      int i;
+      const gdb_byte *contents = value_contents (value);
+
+      /* Look for a null character.  */
+      for (i = 0; i < fetchlimit; i++)
+	if (extract_unsigned_integer (contents + i * width, width) == 0)
+	  break;
+
+      /* I is now either the number of non-null characters, or FETCHLIMIT.  */
+      *length = i * width;
+      *buffer = xmalloc (*length);
+      memcpy (*buffer, contents, *length);
+      err = 0;
+    }
+  else
+    {
+      err = read_string (value_as_address (value), -1, width, fetchlimit,
+			 buffer, length);
+      if (err)
+	{
+	  xfree (buffer);
+	  error (_("Error reading string from inferior: %s"),
+		 safe_strerror (err));
+	}
+    }
+
+  /* If the last character is null, subtract it from LENGTH.  */
+  if (*length > 0
+      && extract_unsigned_integer (*buffer + *length - width, width) == 0)
+    *length -= width;
+
+  *charset = target_charset ();
+
+  return;
+
+ error:
+  {
+    char *type_str;
+
+    type_str = type_to_string (type);
+    if (type_str)
+      {
+	make_cleanup (xfree, type_str);
+	error (_("Trying to read string with inappropriate type `%s'."),
+	       type_str);
+      }
+    else
+      error (_("Trying to read string with inappropriate type."));
+  }
+}
+
 \f
 /* Preprocessing and parsing C and C++ expressions.  */
 
@@ -314,6 +430,7 @@ const struct language_defn c_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -432,6 +549,7 @@ const struct language_defn cplus_language_defn =
   cplus_language_arch_info,
   default_print_array_index,
   cp_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -469,6 +587,7 @@ const struct language_defn asm_language_defn =
   c_language_arch_info, /* FIXME: la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
@@ -511,6 +630,7 @@ const struct language_defn minimal_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  c_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/f-lang.c b/gdb/f-lang.c
index cf4abaa..6359841 100644
--- a/gdb/f-lang.c
+++ b/gdb/f-lang.c
@@ -343,6 +343,7 @@ const struct language_defn f_language_defn =
   f_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/jv-lang.c b/gdb/jv-lang.c
index 83fc064..0728866 100644
--- a/gdb/jv-lang.c
+++ b/gdb/jv-lang.c
@@ -1128,6 +1128,7 @@ const struct language_defn java_language_defn =
   java_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/language.c b/gdb/language.c
index bdf2ed7..3c37a64 100644
--- a/gdb/language.c
+++ b/gdb/language.c
@@ -1043,6 +1043,13 @@ default_print_array_index (struct value *index_value, struct ui_file *stream,
   fprintf_filtered (stream, "] = ");
 }
 
+void
+default_get_string (struct value *value, gdb_byte **buffer, int *length,
+		    const char **charset)
+{
+  error (_("Getting a string is unsupported in this language."));
+}
+
 /* Define the language that is no language.  */
 
 static int
@@ -1165,6 +1172,7 @@ const struct language_defn unknown_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1203,6 +1211,7 @@ const struct language_defn auto_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
@@ -1240,6 +1249,7 @@ const struct language_defn local_language_defn =
   unknown_language_arch_info,	/* la_language_arch_info.  */
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 \f
diff --git a/gdb/language.h b/gdb/language.h
index e7dd01e..85826fd 100644
--- a/gdb/language.h
+++ b/gdb/language.h
@@ -282,6 +282,14 @@ struct language_defn
        reference at the language level.  */
     int (*la_pass_by_reference) (struct type *type);
 
+    /* Obtain a string from the inferior, storing it in a newly allocated
+       buffer in BUFFER, which should be freed by the caller.  LENGTH will
+       hold the size in bytes of the string (only actual characters, excluding
+       an eventual terminating null character).  CHARSET will hold the encoding
+       used in the string.  */
+    void (*la_get_string) (struct value *value, gdb_byte **buffer, int *length,
+			  const char **charset);
+
     /* Add fields above this point, so the magic number is always last. */
     /* Magic number for compat checking */
 
@@ -380,6 +388,8 @@ extern enum language set_language (enum language);
 				 force_ellipses,options))
 #define LA_EMIT_CHAR(ch, stream, quoter) \
   (current_language->la_emitchar(ch, stream, quoter))
+#define LA_GET_STRING(value, buffer, length, encoding) \
+  (current_language->la_get_string(value, buffer, length, encoding))
 
 #define LA_PRINT_ARRAY_INDEX(index_value, stream, optins) \
   (current_language->la_print_array_index(index_value, stream, options))
@@ -489,4 +499,10 @@ int default_pass_by_reference (struct type *type);
 void default_print_typedef (struct type *type, struct symbol *new_symbol,
 			    struct ui_file *stream);
 
+void default_get_string (struct value *value, gdb_byte **buffer, int *length,
+			 const char **charset);
+
+void c_get_string (struct value *value, gdb_byte **buffer, int *length,
+		   const char **charset);
+
 #endif /* defined (LANGUAGE_H) */
diff --git a/gdb/m2-lang.c b/gdb/m2-lang.c
index a8dd8ad..9e4bb1b 100644
--- a/gdb/m2-lang.c
+++ b/gdb/m2-lang.c
@@ -394,6 +394,7 @@ const struct language_defn m2_language_defn =
   m2_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/objc-lang.c b/gdb/objc-lang.c
index 466221c..a6c74a3 100644
--- a/gdb/objc-lang.c
+++ b/gdb/objc-lang.c
@@ -531,6 +531,7 @@ const struct language_defn objc_language_defn = {
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/p-lang.c b/gdb/p-lang.c
index 0e0de2e..4ba6136 100644
--- a/gdb/p-lang.c
+++ b/gdb/p-lang.c
@@ -433,6 +433,7 @@ const struct language_defn pascal_language_defn =
   pascal_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/scm-lang.c b/gdb/scm-lang.c
index b604e40..345befd 100644
--- a/gdb/scm-lang.c
+++ b/gdb/scm-lang.c
@@ -273,6 +273,7 @@ const struct language_defn scm_language_defn =
   c_language_arch_info,
   default_print_array_index,
   default_pass_by_reference,
+  default_get_string,
   LANG_MAGIC
 };
 
diff --git a/gdb/typeprint.c b/gdb/typeprint.c
index 678aaa1..1f824fa 100644
--- a/gdb/typeprint.c
+++ b/gdb/typeprint.c
@@ -33,6 +33,7 @@
 #include "cp-abi.h"
 #include "typeprint.h"
 #include "gdb_string.h"
+#include "exceptions.h"
 #include "valprint.h"
 #include <errno.h>
 
@@ -78,6 +79,34 @@ type_print (struct type *type, char *varstring, struct ui_file *stream,
   LA_PRINT_TYPE (type, varstring, stream, show, 0);
 }
 
+/* Print TYPE to a string, returning it.  The caller is responsible for
+   freeing the string.  */
+
+char *
+type_to_string (struct type *type)
+{
+  char *s = NULL;
+  long dummy;
+  struct ui_file *stb;
+  struct cleanup *old_chain;
+  volatile struct gdb_exception except;
+
+  stb = mem_fileopen ();
+  old_chain = make_cleanup_ui_file_delete (stb);
+
+  TRY_CATCH (except, RETURN_MASK_ALL)
+    {
+      type_print (type, "", stb, -1);
+      s = ui_file_xstrdup (stb, &dummy);
+    }
+  if (except.reason < 0)
+    s = NULL;
+
+  do_cleanups (old_chain);
+
+  return s;
+}
+
 /* Print type of EXP, or last thing in value history if EXP == NULL.
    show is passed to type_print.  */
 
diff --git a/gdb/valprint.c b/gdb/valprint.c
index b61da54..d13dc4c 100644
--- a/gdb/valprint.c
+++ b/gdb/valprint.c
@@ -1177,43 +1177,44 @@ partial_memory_read (CORE_ADDR memaddr, gdb_byte *myaddr, int len, int *errnoptr
   return (nread);
 }
 
-/*  Print a string from the inferior, starting at ADDR and printing up to LEN
-   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
-   stops at the first null byte, otherwise printing proceeds (including null
-   bytes) until either print_max or LEN characters have been printed,
-   whichever is smaller. */
-
-/* FIXME: Use target_read_string.  */
+/* Read a string from the inferior, at ADDR, with LEN characters of WIDTH bytes
+   each.  Fetch at most FETCHLIMIT characters.  BUFFER will be set to a newly
+   allocated buffer containing the string, which the caller is responsible to
+   free, and BYTES_READ will be set to the number of bytes read.  Returns 0 on
+   success, or errno on failure.
+
+   If LEN > 0, reads exactly LEN characters (including eventual NULs in
+   the middle or end of the string).  If LEN is -1, stops at the first
+   null character (not necessarily the first null byte) up to a maximum
+   of FETCHLIMIT characters.  Set FETCHLIMIT to UINT_MAX to read as many
+   characters as possible from the string.
+
+   Unless an exception is thrown, BUFFER will always be allocated, even on
+   failure.  In this case, some characters might have been read before the
+   failure happened.  Check BYTES_READ to recognize this situation.
+
+   Note: There was a FIXME asking to make this code use target_read_string,
+   but this function is more general (can read past null characters, up to
+   given LEN). Besides, it is used much more often than target_read_string
+   so it is more tested.  Perhaps callers of target_read_string should use
+   this function instead?  */
 
 int
-val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
-		  const struct value_print_options *options)
+read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+	     gdb_byte **buffer, int *bytes_read)
 {
-  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero. */
-  int errcode;			/* Errno returned from bad reads. */
-  unsigned int fetchlimit;	/* Maximum number of chars to print. */
-  unsigned int nfetch;		/* Chars to fetch / chars fetched. */
-  unsigned int chunksize;	/* Size of each fetch, in chars. */
-  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer. */
-  gdb_byte *bufptr;		/* Pointer to next available byte in buffer. */
-  gdb_byte *limit;		/* First location past end of fetch buffer. */
-  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain. */
-  int found_nul;		/* Non-zero if we found the nul char */
-
-  /* First we need to figure out the limit on the number of characters we are
-     going to attempt to fetch and print.  This is actually pretty simple.  If
-     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
-     LEN is -1, then the limit is print_max.  This is true regardless of
-     whether print_max is zero, UINT_MAX (unlimited), or something in between,
-     because finding the null byte (or available memory) is what actually
-     limits the fetch. */
-
-  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
-
-  /* Now decide how large of chunks to try to read in one operation.  This
+  int found_nul;		/* Non-zero if we found the nul char.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  unsigned int nfetch;		/* Chars to fetch / chars fetched.  */
+  unsigned int chunksize;	/* Size of each fetch, in chars.  */
+  gdb_byte *bufptr;		/* Pointer to next available byte in buffer.  */
+  gdb_byte *limit;		/* First location past end of fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* Decide how large of chunks to try to read in one operation.  This
      is also pretty simple.  If LEN >= zero, then we want fetchlimit chars,
      so we might as well read them all in one operation.  If LEN is -1, we
-     are looking for a null terminator to end the fetching, so we might as
+     are looking for a NUL terminator to end the fetching, so we might as
      well read in blocks that are large enough to be efficient, but not so
      large as to be slow if fetchlimit happens to be large.  So we choose the
      minimum of 8 and fetchlimit.  We used to use 200 instead of 8 but
@@ -1221,17 +1222,17 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 
   chunksize = (len == -1 ? min (8, fetchlimit) : fetchlimit);
 
-  /* Loop until we either have all the characters to print, or we encounter
-     some error, such as bumping into the end of the address space. */
+  /* Loop until we either have all the characters, or we encounter
+     some error, such as bumping into the end of the address space.  */
 
   found_nul = 0;
   old_chain = make_cleanup (null_cleanup, 0);
 
   if (len > 0)
     {
-      buffer = (gdb_byte *) xmalloc (len * width);
-      bufptr = buffer;
-      old_chain = make_cleanup (xfree, buffer);
+      *buffer = (gdb_byte *) xmalloc (len * width);
+      bufptr = *buffer;
+      old_chain = make_cleanup (xfree, *buffer);
 
       nfetch = partial_memory_read (addr, bufptr, len * width, &errcode)
 	/ width;
@@ -1241,32 +1242,36 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
   else if (len == -1)
     {
       unsigned long bufsize = 0;
+
+      *buffer = NULL;
+
       do
 	{
 	  QUIT;
 	  nfetch = min (chunksize, fetchlimit - bufsize);
 
-	  if (buffer == NULL)
-	    buffer = (gdb_byte *) xmalloc (nfetch * width);
+	  if (*buffer == NULL)
+	    *buffer = (gdb_byte *) xmalloc (nfetch * width);
 	  else
 	    {
 	      discard_cleanups (old_chain);
-	      buffer = (gdb_byte *) xrealloc (buffer, (nfetch + bufsize) * width);
+	      *buffer = (gdb_byte *) xrealloc (*buffer,
+					       (nfetch + bufsize) * width);
 	    }
 
-	  old_chain = make_cleanup (xfree, buffer);
-	  bufptr = buffer + bufsize * width;
+	  old_chain = make_cleanup (xfree, *buffer);
+	  bufptr = *buffer + bufsize * width;
 	  bufsize += nfetch;
 
-	  /* Read as much as we can. */
+	  /* Read as much as we can.  */
 	  nfetch = partial_memory_read (addr, bufptr, nfetch * width, &errcode)
-	    / width;
+		    / width;
 
-	  /* Scan this chunk for the null byte that terminates the string
+	  /* Scan this chunk for the null character that terminates the string
 	     to print.  If found, we don't need to fetch any more.  Note
 	     that bufptr is explicitly left pointing at the next character
-	     after the null byte, or at the next character after the end of
-	     the buffer. */
+	     after the null character, or at the next character after the end
+	     of the buffer.  */
 
 	  limit = bufptr + nfetch * width;
 	  while (bufptr < limit)
@@ -1279,7 +1284,7 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	      if (c == 0)
 		{
 		  /* We don't care about any error which happened after
-		     the NULL terminator.  */
+		     the NUL terminator.  */
 		  errcode = 0;
 		  found_nul = 1;
 		  break;
@@ -1287,26 +1292,71 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	    }
 	}
       while (errcode == 0	/* no error */
-	     && bufptr - buffer < fetchlimit * width	/* no overrun */
-	     && !found_nul);	/* haven't found nul yet */
+	     && bufptr - *buffer < fetchlimit * width	/* no overrun */
+	     && !found_nul);	/* haven't found NUL yet */
     }
   else
-    {				/* length of string is really 0! */
-      buffer = bufptr = NULL;
+    {				/* Length of string is really 0!  */
+      /* We always allocate *buffer.  */
+      *buffer = bufptr = xmalloc (1);
       errcode = 0;
     }
 
   /* bufptr and addr now point immediately beyond the last byte which we
      consider part of the string (including a '\0' which ends the string).  */
+  *bytes_read = bufptr - *buffer;
+
+  QUIT;
+
+  discard_cleanups (old_chain);
+
+  return errcode;
+}
+
+/* Print a string from the inferior, starting at ADDR and printing up to LEN
+   characters, of WIDTH bytes a piece, to STREAM.  If LEN is -1, printing
+   stops at the first null byte, otherwise printing proceeds (including null
+   bytes) until either print_max or LEN characters have been printed,
+   whichever is smaller.  */
+
+int
+val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
+		  const struct value_print_options *options)
+{
+  int force_ellipsis = 0;	/* Force ellipsis to be printed if nonzero.  */
+  int errcode;			/* Errno returned from bad reads.  */
+  int found_nul;		/* Non-zero if we found the nul char */
+  unsigned int fetchlimit;	/* Maximum number of chars to print.  */
+  int bytes_read;
+  gdb_byte *buffer = NULL;	/* Dynamically growable fetch buffer.  */
+  struct cleanup *old_chain = NULL;	/* Top of the old cleanup chain.  */
+
+  /* First we need to figure out the limit on the number of characters we are
+     going to attempt to fetch and print.  This is actually pretty simple.  If
+     LEN >= zero, then the limit is the minimum of LEN and print_max.  If
+     LEN is -1, then the limit is print_max.  This is true regardless of
+     whether print_max is zero, UINT_MAX (unlimited), or something in between,
+     because finding the null byte (or available memory) is what actually
+     limits the fetch.  */
+
+  fetchlimit = (len == -1 ? options->print_max : min (len, options->print_max));
+
+  errcode = read_string (addr, len, width, fetchlimit, &buffer, &bytes_read);
+  old_chain = make_cleanup (xfree, buffer);
+
+  addr += bytes_read;
 
   /* We now have either successfully filled the buffer to fetchlimit, or
-     terminated early due to an error or finding a null char when LEN is -1. */
+     terminated early due to an error or finding a null char when LEN is -1.  */
+
+  /* Determine found_nul by looking at the last character read.  */
+  found_nul = extract_unsigned_integer (buffer + bytes_read - width, width) == 0;
 
   if (len == -1 && !found_nul)
     {
       gdb_byte *peekbuf;
 
-      /* We didn't find a null terminator we were looking for.  Attempt
+      /* We didn't find a NUL terminator we were looking for.  Attempt
          to peek at the next character.  If not successful, or it is not
          a null byte, then force ellipsis to be printed.  */
 
@@ -1316,26 +1366,24 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  && extract_unsigned_integer (peekbuf, width) != 0)
 	force_ellipsis = 1;
     }
-  else if ((len >= 0 && errcode != 0) || (len > (bufptr - buffer) / width))
+  else if ((len >= 0 && errcode != 0) || (len > bytes_read / width))
     {
       /* Getting an error when we have a requested length, or fetching less
          than the number of characters actually requested, always make us
-         print ellipsis. */
+         print ellipsis.  */
       force_ellipsis = 1;
     }
 
-  QUIT;
-
   /* If we get an error before fetching anything, don't print a string.
      But if we fetch something and then get an error, print the string
      and then the error message.  */
-  if (errcode == 0 || bufptr > buffer)
+  if (errcode == 0 || bytes_read > 0)
     {
       if (options->addressprint)
 	{
 	  fputs_filtered (" ", stream);
 	}
-      LA_PRINT_STRING (stream, buffer, (bufptr - buffer) / width, width, force_ellipsis, options);
+      LA_PRINT_STRING (stream, buffer, bytes_read / width, width, force_ellipsis, options);
     }
 
   if (errcode != 0)
@@ -1353,9 +1401,11 @@ val_print_string (CORE_ADDR addr, int len, int width, struct ui_file *stream,
 	  fprintf_filtered (stream, ": %s>", safe_strerror (errcode));
 	}
     }
+
   gdb_flush (stream);
   do_cleanups (old_chain);
-  return ((bufptr - buffer) / width);
+
+  return (bytes_read / width);
 }
 \f
 
diff --git a/gdb/valprint.h b/gdb/valprint.h
index 6e85bb4..8b65af6 100644
--- a/gdb/valprint.h
+++ b/gdb/valprint.h
@@ -136,4 +136,7 @@ extern void print_hex_chars (struct ui_file *, const gdb_byte *,
 
 extern void print_char_chars (struct ui_file *, const gdb_byte *,
 			      unsigned int, enum bfd_endian);
+
+int read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
+		 gdb_byte **buffer, int *bytes_read);
 #endif
diff --git a/gdb/value.h b/gdb/value.h
index 724aa5a..9a11190 100644
--- a/gdb/value.h
+++ b/gdb/value.h
@@ -524,6 +524,8 @@ extern void modify_field (gdb_byte *addr, LONGEST fieldval, int bitpos,
 extern void type_print (struct type *type, char *varstring,
 			struct ui_file *stream, int show);
 
+extern char *type_to_string (struct type *type);
+
 extern gdb_byte *baseclass_addr (struct type *type, int index,
 				 gdb_byte *valaddr,
 				 struct value **valuep, int *errp);



^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFA] Add la_getstr member to language_defn
  2009-02-05 12:21                             ` Thiago Jung Bauermann
@ 2009-02-05 16:01                               ` Pierre Muller
  2009-02-05 16:30                                 ` Thiago Jung Bauermann
  2009-02-05 16:46                                 ` Tom Tromey
  0 siblings, 2 replies; 33+ messages in thread
From: Pierre Muller @ 2009-02-05 16:01 UTC (permalink / raw)
  To: 'Thiago Jung Bauermann'; +Cc: 'gdb-patches ml'


  Hi Thiago,

  I just wrote a pascal version of your c_get_string,
which basically calls c_get_string but also handles
Pascal string types specifically.

  The problem is that I found no reference to 
this la_get_string in the code yet, so that I am
unable to really check if my patch is correct.

  Is this in some submitted patch for python
or isn't it ready for submission yet?
Could you please send me a pointer to that patch?


Pierre Muller
Pascal language support maintainer for GDB


This is the function I wrote:


static void
pascal_get_string (struct value *value, gdb_byte **buffer, int *length,
              const char **charset)
{
  int err, width;
  unsigned int fetchlimit;
  struct type *type = check_typedef (value_type (value));
  struct type *element_type;
  int length_pos, length_size, string_pos;

  if (is_pascal_string_type (type, &length_pos, &length_size,
                             &string_pos, &width, NULL))
    {
      CORE_ADDR addr = value_as_address (value);
      ULONGEST string_length;
      *buffer = xmalloc (length_size);
      if (target_read_memory (addr + length_pos, *buffer, length_size)
          == length_size)
        {
          string_length = extract_unsigned_integer (*buffer, length_size);
          xfree (*buffer);
          *buffer = xmalloc (length_size * width);
          fetchlimit = string_length;
          err = read_string (addr + string_pos, string_length,
                             width, fetchlimit, buffer, length);
          *charset = target_charset ();
        }
      else
        err = 1;
      if (err)
        {
          xfree (*buffer);
          error (_("Error reading pascal string from inferior: %s"),
                 safe_strerror (err));
        }
      return;
    }
  else
    c_get_string (value, buffer, length, charset);
}


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFA] Add la_getstr member to language_defn
  2009-02-05 16:01                               ` Pierre Muller
@ 2009-02-05 16:30                                 ` Thiago Jung Bauermann
  2009-02-05 16:46                                 ` Tom Tromey
  1 sibling, 0 replies; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-05 16:30 UTC (permalink / raw)
  To: Pierre Muller; +Cc: 'gdb-patches ml'

El jue, 05-02-2009 a las 17:01 +0100, Pierre Muller escribiÃ³:
>   I just wrote a pascal version of your c_get_string,
> which basically calls c_get_string but also handles
> Pascal string types specifically.

This is great. Thanks!

>   The problem is that I found no reference to 
> this la_get_string in the code yet, so that I am
> unable to really check if my patch is correct.

I committed the patch a few hours ago. If you update your CVS checkout,
you should see it (unless I messed up the commit, which I do
sometimes!).

> This is the function I wrote:

Your function looks good AFAICS.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-05 16:01                               ` Pierre Muller
  2009-02-05 16:30                                 ` Thiago Jung Bauermann
@ 2009-02-05 16:46                                 ` Tom Tromey
  1 sibling, 0 replies; 33+ messages in thread
From: Tom Tromey @ 2009-02-05 16:46 UTC (permalink / raw)
  To: Pierre Muller; +Cc: 'Thiago Jung Bauermann', 'gdb-patches ml'

>>>>> "Pierre" == Pierre Muller <muller@ics.u-strasbg.fr> writes:

Pierre>   The problem is that I found no reference to 
Pierre> this la_get_string in the code yet, so that I am
Pierre> unable to really check if my patch is correct.

Pierre>   Is this in some submitted patch for python
Pierre> or isn't it ready for submission yet?
Pierre> Could you please send me a pointer to that patch?

It is here:

http://sourceware.org/ml/gdb-patches/2009-01/msg00004.html

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-04 22:26                         ` Thiago Jung Bauermann
  2009-02-05  0:55                           ` Tom Tromey
@ 2009-02-05 15:55                           ` Tom Tromey
  2009-02-05 16:07                             ` Pierre Muller
  1 sibling, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2009-02-05 15:55 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> Right. Here's the new version. It also uses c_get_string for Ada and
Thiago> minimal...

Last night I thought of something else...

Thiago> +   Assumes strings are terminated by a null character.
[...]
Thiago> +void
Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int *length,
Thiago> +	      const char **charset)

My understanding is that the contract for la_getstr says that the
resulting string will be zero-terminated.  But if that is the case...

Thiago> +read_string (CORE_ADDR addr, int len, int width, unsigned int fetchlimit,
[...]
Thiago> +    {				/* Length of string is really 0!  */
Thiago> +      /* We always allocate *buffer.  */
Thiago> +      *buffer = bufptr = xmalloc (1);

... I think this code ought to allocate a buffer of WIDTH bytes, and
set them all to 0.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFA] Add la_getstr member to language_defn
  2009-02-05 15:55                           ` Tom Tromey
@ 2009-02-05 16:07                             ` Pierre Muller
  2009-02-05 16:33                               ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Pierre Muller @ 2009-02-05 16:07 UTC (permalink / raw)
  To: 'Tom Tromey', 'Thiago Jung Bauermann'
  Cc: 'Daniel Jacobowitz', 'gdb-patches ml'



> -----Message d'origine-----
> De : gdb-patches-owner@sourceware.org [mailto:gdb-patches-
> owner@sourceware.org] De la part de Tom Tromey
> Envoyé : Thursday, February 05, 2009 4:55 PM
> À : Thiago Jung Bauermann
> Cc : Daniel Jacobowitz; gdb-patches ml
> Objet : Re: [RFA] Add la_getstr member to language_defn
> 
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> 
> Thiago> Right. Here's the new version. It also uses c_get_string for
> Ada and
> Thiago> minimal...
> 
> Last night I thought of something else...
> 
> Thiago> +   Assumes strings are terminated by a null character.
> [...]
> Thiago> +void
> Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int
> *length,
> Thiago> +	      const char **charset)
> 
> My understanding is that the contract for la_getstr says that the
> resulting string will be zero-terminated.  But if that is the case...

But this would also mean that the string could not contain
zero in the middle, which sometimes happens for pascal strings...

 Isn't the length argument there for this?


Pierre Muller
Pascal language support maintainer for GDB





^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [RFA] Add la_getstr member to language_defn
  2009-02-05 16:07                             ` Pierre Muller
@ 2009-02-05 16:33                               ` Thiago Jung Bauermann
  2009-02-05 16:46                                 ` Tom Tromey
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-05 16:33 UTC (permalink / raw)
  To: Pierre Muller
  Cc: 'Tom Tromey', 'Daniel Jacobowitz',
	'gdb-patches ml'

El jue, 05-02-2009 a las 17:06 +0100, Pierre Muller escribiÃ³:
> > Thiago> +   Assumes strings are terminated by a null character.
> > [...]
> > Thiago> +void
> > Thiago> +c_get_string (struct value *value, gdb_byte **buffer, int
> > *length,
> > Thiago> +	      const char **charset)
> > 
> > My understanding is that the contract for la_getstr says that the
> > resulting string will be zero-terminated.  But if that is the case...
> 
> But this would also mean that the string could not contain
> zero in the middle, which sometimes happens for pascal strings...
> 
>  Isn't the length argument there for this?

Right. The contract (the comment above la_get_string in language_defn,
assuming it is legally binding :-) ) doesn't say that the string should
be null-terminated. And the length argument doesn't count the ending
null character, should it be present.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-05 16:33                               ` Thiago Jung Bauermann
@ 2009-02-05 16:46                                 ` Tom Tromey
  0 siblings, 0 replies; 33+ messages in thread
From: Tom Tromey @ 2009-02-05 16:46 UTC (permalink / raw)
  To: Thiago Jung Bauermann
  Cc: Pierre Muller, 'Daniel Jacobowitz', 'gdb-patches ml'

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> Right. The contract (the comment above la_get_string in language_defn,
Thiago> assuming it is legally binding :-) ) doesn't say that the string should
Thiago> be null-terminated. And the length argument doesn't count the ending
Thiago> null character, should it be present.

Ah, thanks.  I misunderstood the comment in language.h.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-01-03  2:27             ` Thiago Jung Bauermann
  2009-02-02 18:42               ` Tom Tromey
@ 2009-02-03  0:23               ` Joel Brobecker
  2009-02-03 13:02                 ` Thiago Jung Bauermann
  1 sibling, 1 reply; 33+ messages in thread
From: Joel Brobecker @ 2009-02-03  0:23 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

> The patch doesn't apply anymore. This is the same patch, refreshed
> against HEAD as of Dec 28th. Ok?

Just a few comments and questions in addition to Tom's comments...

> +      if ((TYPE_NFIELDS (type) == 1)
> +	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)

The extra parens around "TYPE_NFIELDS (type) == 1" shouldn't be
necessary, right? In this case, it's pretty harmless, but a little
bit below, this really starts making it hard to read a condition...

> +  if (((VALUE_LVAL (value) == not_lval)
> +      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != UINT_MAX))

Can this be formatted as follow:

  if ((VALUE_LVAL (value) == not_lval
       || VALUE_LVAL (value) == lval_internalvar)
      && fetchlimit != UINT_MAX)

? (assuming my reading is correct!) The reason I like my suggestion
is because there are less parentheses, so it's easier to match them
without using my favorite editor; also, the formatting makes it clear
at which level the || and the && operators are.

> +  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
> +      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))

Same here.

> @@ -511,6 +626,7 @@ const struct language_defn minimal_language_defn =
>    c_language_arch_info,
>    default_print_array_index,
>    default_pass_by_reference,
> +  default_get_string,
>    LANG_MAGIC
>  };

I was wondering if it wouldn't be more useful to use the c_get_string
function as the default rather than the default_get_string stub.
What do you guys think? Granted, I know that there are cases for Ada
where this isn't going to be enough, but I also know that it's going
to handle some cases fine. I don't know how other languages such as
Fortran or Pascal encode their strings, but chances are that if we
call this function with a value whose type is an array of chars/ints
or a char/int pointer, the c_get_string value would probably work.

As for the Ada implementation, it's probably going to look like this:

  if (value_type is char/int_array or char/int_pointer)
    {
      /* A string a-la-C: Call the C get_string routine...  */
      c_get_string (...);
      return;
    }

  /* Ada-specific strings handled here.  */

So setting the la_get_string method to c_get_string would be a good
starting point for Ada.  The implementation is robust enough that
we'd get an error if the value type is not handled properly, which
is exactly what the default_get_string routine always does.

-- 
Joel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03  0:23               ` Joel Brobecker
@ 2009-02-03 13:02                 ` Thiago Jung Bauermann
  2009-02-03 17:01                   ` Joel Brobecker
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-03 13:02 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: Daniel Jacobowitz, gdb-patches ml

El lun, 02-02-2009 a las 16:23 -0800, Joel Brobecker escribiÃ³:
> > The patch doesn't apply anymore. This is the same patch, refreshed
> > against HEAD as of Dec 28th. Ok?
> 
> Just a few comments and questions in addition to Tom's comments...

Great. Thanks!

> > +      if ((TYPE_NFIELDS (type) == 1)
> > +	  && TYPE_CODE (TYPE_FIELD_TYPE (type, 0)) == TYPE_CODE_RANGE)
> 
> The extra parens around "TYPE_NFIELDS (type) == 1" shouldn't be
> necessary, right? In this case, it's pretty harmless, but a little
> bit below, this really starts making it hard to read a condition...

Right. Changed.

> > +  if (((VALUE_LVAL (value) == not_lval)
> > +      || (VALUE_LVAL (value) == lval_internalvar)) && (fetchlimit != UINT_MAX))
> 
> Can this be formatted as follow:
> 
>   if ((VALUE_LVAL (value) == not_lval
>        || VALUE_LVAL (value) == lval_internalvar)
>       && fetchlimit != UINT_MAX)
> 
> ? (assuming my reading is correct!) The reason I like my suggestion
> is because there are less parentheses, so it's easier to match them
> without using my favorite editor; also, the formatting makes it clear
> at which level the || and the && operators are.

Changed.

> > +  if ((TYPE_CODE (element_type) != TYPE_CODE_INT)
> > +      && (TYPE_CODE (element_type) != TYPE_CODE_CHAR))
> 
> Same here.

Ok.

> > @@ -511,6 +626,7 @@ const struct language_defn minimal_language_defn =
> >    c_language_arch_info,
> >    default_print_array_index,
> >    default_pass_by_reference,
> > +  default_get_string,
> >    LANG_MAGIC
> >  };
> 
> I was wondering if it wouldn't be more useful to use the c_get_string
> function as the default rather than the default_get_string stub.
> What do you guys think?

I don't have an opinion here, since I know little about how other
languages represent strings...
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03 13:02                 ` Thiago Jung Bauermann
@ 2009-02-03 17:01                   ` Joel Brobecker
  2009-02-03 17:40                     ` Tom Tromey
  2009-02-05 17:01                     ` Thiago Jung Bauermann
  0 siblings, 2 replies; 33+ messages in thread
From: Joel Brobecker @ 2009-02-03 17:01 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

> > I was wondering if it wouldn't be more useful to use the c_get_string
> > function as the default rather than the default_get_string stub.
> > What do you guys think?
> 
> I don't have an opinion here, since I know little about how other
> languages represent strings...

I think it's worth a try, but probably Tom and others might want
to comment on that. From my end, for Ada, c_get_string is  a good
intermediate solution until the full solution is implemented.
For the "minimal language", I think that this routine would be
the best we could do. Not sure about m2 and Fortran, but maybe
this can equally help...

-- 
Joel


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03 17:01                   ` Joel Brobecker
@ 2009-02-03 17:40                     ` Tom Tromey
  2009-02-05 17:01                     ` Thiago Jung Bauermann
  1 sibling, 0 replies; 33+ messages in thread
From: Tom Tromey @ 2009-02-03 17:40 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: Thiago Jung Bauermann, Daniel Jacobowitz, gdb-patches ml

>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:

Joel> I think it's worth a try, but probably Tom and others might want
Joel> to comment on that. From my end, for Ada, c_get_string is  a good
Joel> intermediate solution until the full solution is implemented.
Joel> For the "minimal language", I think that this routine would be
Joel> the best we could do. Not sure about m2 and Fortran, but maybe
Joel> this can equally help...

It isn't the right thing for Java, but I don't think that matters much.
I'll look at writing the correct function for Java as a follow-up.

I'm ok either with the current patch (plus changes to Ada and the
minimal language to use c_get_string) or with your suggestion.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-03 17:01                   ` Joel Brobecker
  2009-02-03 17:40                     ` Tom Tromey
@ 2009-02-05 17:01                     ` Thiago Jung Bauermann
  2009-02-05 23:51                       ` Joel Brobecker
  1 sibling, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2009-02-05 17:01 UTC (permalink / raw)
  To: Joel Brobecker; +Cc: Daniel Jacobowitz, gdb-patches ml

El mar, 03-02-2009 a las 09:01 -0800, Joel Brobecker escribiÃ³:
> > > I was wondering if it wouldn't be more useful to use the c_get_string
> > > function as the default rather than the default_get_string stub.
> > > What do you guys think?
> > 
> > I don't have an opinion here, since I know little about how other
> > languages represent strings...
> 
> I think it's worth a try, but probably Tom and others might want
> to comment on that. From my end, for Ada, c_get_string is  a good
> intermediate solution until the full solution is implemented.
> For the "minimal language", I think that this routine would be
> the best we could do. Not sure about m2 and Fortran, but maybe
> this can equally help...

For the time being I committed the patch with default_get_string, but
with Ada and minimal using c_get_string, I hope you don't mind... We can
change this when someone knowledgeable about m2 and fortran speaks up.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2009-02-05 17:01                     ` Thiago Jung Bauermann
@ 2009-02-05 23:51                       ` Joel Brobecker
  0 siblings, 0 replies; 33+ messages in thread
From: Joel Brobecker @ 2009-02-05 23:51 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: Daniel Jacobowitz, gdb-patches ml

> For the time being I committed the patch with default_get_string, but
> with Ada and minimal using c_get_string, I hope you don't mind... We can
> change this when someone knowledgeable about m2 and fortran speaks up.

Not at all. I think this is fine, it was actually an open question.
Thanks for doing this.

-- 
Joel


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 13:24 [RFA] Add la_getstr member to language_defn Thiago Jung Bauermann
  2008-11-24 13:32 ` Daniel Jacobowitz
@ 2008-11-24 20:03 ` Tom Tromey
  2008-11-24 21:19   ` Thiago Jung Bauermann
  1 sibling, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2008-11-24 20:03 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> Each language supported by GDB can provide a way to read
Thiago> strings from a value by suppling its version of the
Thiago> function. I implemented it for C, in c_getstr.

Thanks.  For those not tracking the Python branch, this functionality
is important to many scripts.

Thiago> 	* language.h (language_dfn): Add la_getstr member.
Thiago> 	(LA_GET_STRING): New macro.

A nit: the macro is called LA_GET_STRING, but the field is la_getstr.
How about la_get_string for the field instead?

Thiago> +    int (*la_getstr) (struct value *value, gdb_byte **buffer, int *length);

I was thinking about writing this function for Java, sort of as a
proof of the API.  One oddity here is that a String there has a fixed
encoding, which may or may not be the same as the target charset (and
in any case, is not convertible using the charset.c code).

One idea for fixing this is to let this new method optionally return
an encoding.  That way a language implementation could fill in this
info if it is known.  The C implementation would simply do nothing
here.

What do you think?

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 20:03 ` Tom Tromey
@ 2008-11-24 21:19   ` Thiago Jung Bauermann
  2008-11-25  0:55     ` Tom Tromey
  0 siblings, 1 reply; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-24 21:19 UTC (permalink / raw)
  To: tromey; +Cc: gdb-patches ml

El lun, 24-11-2008 a las 09:44 -0700, Tom Tromey escribiÃ³:
> Thiago> 	* language.h (language_dfn): Add la_getstr member.
> Thiago> 	(LA_GET_STRING): New macro.
> 
> A nit: the macro is called LA_GET_STRING, but the field is la_getstr.
> How about la_get_string for the field instead?

I was just following the la_printstr/LA_PRINT_STRING example.
I agree la_get_string is better. Since there doesn't seem to be a strict
pattern for these names, I'll make the change.

> Thiago> +    int (*la_getstr) (struct value *value, gdb_byte **buffer, int *length);
> 
> I was thinking about writing this function for Java, sort of as a
> proof of the API.  One oddity here is that a String there has a fixed
> encoding, which may or may not be the same as the target charset (and
> in any case, is not convertible using the charset.c code).
> 
> One idea for fixing this is to let this new method optionally return
> an encoding.  That way a language implementation could fill in this
> info if it is known.  The C implementation would simply do nothing
> here.
> 
> What do you think?

I'm fine with that. What about adding a const char **encoding argument?
c_getstr could return the value from target_charset.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-24 21:19   ` Thiago Jung Bauermann
@ 2008-11-25  0:55     ` Tom Tromey
  2008-11-25 11:27       ` Thiago Jung Bauermann
  0 siblings, 1 reply; 33+ messages in thread
From: Tom Tromey @ 2008-11-25  0:55 UTC (permalink / raw)
  To: Thiago Jung Bauermann; +Cc: gdb-patches ml

>>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:

Thiago> I was just following the la_printstr/LA_PRINT_STRING example.
Thiago> I agree la_get_string is better. Since there doesn't seem to
Thiago> be a strict pattern for these names, I'll make the change.

Oh, I didn't realize there were already discrepancies.  Sorry about that.

Thiago> I'm fine with that. What about adding a const char **encoding argument?
Thiago> c_getstr could return the value from target_charset.

Sounds reasonable to me.

Tom

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [RFA] Add la_getstr member to language_defn
  2008-11-25  0:55     ` Tom Tromey
@ 2008-11-25 11:27       ` Thiago Jung Bauermann
  0 siblings, 0 replies; 33+ messages in thread
From: Thiago Jung Bauermann @ 2008-11-25 11:27 UTC (permalink / raw)
  To: Tom Tromey; +Cc: gdb-patches ml

El lun, 24-11-2008 a las 13:10 -0700, Tom Tromey escribiÃ³:
> >>>>> "Thiago" == Thiago Jung Bauermann <bauerman@br.ibm.com> writes:
> 
> Thiago> I was just following the la_printstr/LA_PRINT_STRING example.
> Thiago> I agree la_get_string is better. Since there doesn't seem to
> Thiago> be a strict pattern for these names, I'll make the change.
> 
> Oh, I didn't realize there were already discrepancies.  Sorry about that.

No problem. I renamed the function.

> Thiago> I'm fine with that. What about adding a const char **encoding argument?
> Thiago> c_getstr could return the value from target_charset.
> 
> Sounds reasonable to me.

It is present in the version of the patch I just sent.
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2009-02-05 23:51 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-24 13:24 [RFA] Add la_getstr member to language_defn Thiago Jung Bauermann
2008-11-24 13:32 ` Daniel Jacobowitz
2008-11-24 14:59   ` Thiago Jung Bauermann
2008-11-24 16:19     ` Daniel Jacobowitz
2008-11-24 20:22       ` Thiago Jung Bauermann
2008-11-25  2:16         ` Daniel Jacobowitz
2008-11-25  8:53           ` Thiago Jung Bauermann
2009-01-03  2:27             ` Thiago Jung Bauermann
2009-02-02 18:42               ` Tom Tromey
2009-02-03 12:51                 ` Thiago Jung Bauermann
2009-02-03 17:44                   ` Tom Tromey
2009-02-04 12:37                     ` Thiago Jung Bauermann
2009-02-04 19:19                       ` Tom Tromey
2009-02-04 22:26                         ` Thiago Jung Bauermann
2009-02-05  0:55                           ` Tom Tromey
2009-02-05 12:21                             ` Thiago Jung Bauermann
2009-02-05 16:01                               ` Pierre Muller
2009-02-05 16:30                                 ` Thiago Jung Bauermann
2009-02-05 16:46                                 ` Tom Tromey
2009-02-05 15:55                           ` Tom Tromey
2009-02-05 16:07                             ` Pierre Muller
2009-02-05 16:33                               ` Thiago Jung Bauermann
2009-02-05 16:46                                 ` Tom Tromey
2009-02-03  0:23               ` Joel Brobecker
2009-02-03 13:02                 ` Thiago Jung Bauermann
2009-02-03 17:01                   ` Joel Brobecker
2009-02-03 17:40                     ` Tom Tromey
2009-02-05 17:01                     ` Thiago Jung Bauermann
2009-02-05 23:51                       ` Joel Brobecker
2008-11-24 20:03 ` Tom Tromey
2008-11-24 21:19   ` Thiago Jung Bauermann
2008-11-25  0:55     ` Tom Tromey
2008-11-25 11:27       ` Thiago Jung Bauermann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox