* bfd function to read ELF file image from memory
@ 2003-05-14 1:28 Roland McGrath
2003-05-16 11:24 ` Nick Clifton
0 siblings, 1 reply; 9+ messages in thread
From: Roland McGrath @ 2003-05-14 1:28 UTC (permalink / raw)
To: binutils, gdb-patches
This bfd patch adds a function that creates an in-memory BFD by reading
from process memory to read ELF headers and from that determine the size of
the whole file image to read out of the process. I am using this in gdb to
read the Linux vsyscall DSO image from a live process; the details of what
that is all about have been discussed on the gdb@sources.redhat.com list.
I wrote this as a BFD function that takes a callback function pointer
with the same signature as gdb's `target_read_memory'.
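For illustration, a callback of that shape might look like the sketch below. The names `vma_t`, `read_memory_fn`, `fake_memory`, and `fake_read_memory` are hypothetical, `unsigned long` stands in for `bfd_vma`, and the convention that 0 means success and a nonzero value is an errno-style code is inferred from how the patch uses the return value.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for bfd_vma, to keep the sketch self-contained.  */
typedef unsigned long vma_t;

/* Same shape as the callback parameter in the patch: returns 0 on
   success, otherwise an errno-style code (an assumption, see above).  */
typedef int (*read_memory_fn) (vma_t vma, char *myaddr, int len);

/* A fake "process": 64 bytes of memory mapped at address 0x1000.  */
static const vma_t fake_base = 0x1000;
static char fake_memory[64];

static int
fake_read_memory (vma_t vma, char *myaddr, int len)
{
  if (len < 0 || vma < fake_base
      || vma - fake_base > sizeof fake_memory
      || (unsigned long) len > sizeof fake_memory - (vma - fake_base))
    return EIO;			/* Out of range: report an error code.  */
  memcpy (myaddr, fake_memory + (vma - fake_base), (size_t) len);
  return 0;
}
```

A caller would pass `fake_read_memory` wherever a `read_memory_fn` is expected; a debugger would pass a function that reads from the inferior instead.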
This function is not really all that generally useful as it is. It assumes
that the memory image simply maps the whole file image verbatim. That
works great for the vsyscall DSO. But for a normal DSO that has a data
segment, that is not how the memory looks. For those, this will create a
bfd that has bogus contents for the parts of the file after the end of the
text segment; the data segment will be bogus, and so will the section
headers because they appear at the end of the file (and perhaps not at all
in the memory image). Ultimately a more useful function would be one that
only tries to read the contents of the PT_LOAD segments and places their
contents appropriately in the bfd's in-memory file image. As things stand
in gdb we need the section headers for the vsyscall DSO, but it so happens
that its sole PT_LOAD segment when taken with alignment covers the whole
file image. So I may rewrite it that way.
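The PT_LOAD-based alternative described above can be sketched roughly as follows. This is a simplified illustration, not the patch: `struct seg`, `read_fn`, `fill_image`, and `demo_read` are hypothetical names, alignment handling is left out, and the load address is assumed to equal the address in the header.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified program header; real code would use the
   ELF Phdr structures.  */
struct seg
{
  int loadable;			/* Nonzero for a PT_LOAD-like segment.  */
  unsigned long offset;		/* Position in the file image.  */
  unsigned long vaddr;		/* Address in memory.  */
  unsigned long filesz;		/* Bytes backed by the file.  */
};

typedef int (*read_fn) (unsigned long vma, char *buf, int len);

/* Copy each loadable segment from memory into IMAGE at its file
   offset.  Returns 0 on success, the callback's error code if a read
   fails, or -1 if a segment does not fit in IMAGE.  */
static int
fill_image (char *image, size_t image_size,
	    const struct seg *segs, size_t nsegs, read_fn read_mem)
{
  size_t i;

  for (i = 0; i < nsegs; ++i)
    {
      int err;

      if (!segs[i].loadable)
	continue;
      if (segs[i].offset > image_size
	  || segs[i].filesz > image_size - segs[i].offset)
	return -1;
      err = read_mem (segs[i].vaddr, image + segs[i].offset,
		      (int) segs[i].filesz);
      if (err)
	return err;
    }
  return 0;
}

/* Demo "process memory": reads below 0x9000 yield 'T' (text), reads at
   or above it yield 'D' (data).  */
static int
demo_read (unsigned long vma, char *buf, int len)
{
  memset (buf, vma >= 0x9000 ? 'D' : 'T', (size_t) len);
  return 0;
}
```

Bytes of the image not covered by any loadable segment are left untouched, so a caller would want to zero the buffer first.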
But all of that is a bit of a digression. I'm posting now because I don't
quite know where to put this function even if its implementation were
perfect. It was by far easier to write it in elfcode.h than to put it
elsewhere. It gets to use the local helper code there, and automagically
define 32 and 64 bit flavors, which keeps the code quite simple. It would
be a lot more hair to put the code elsewhere, copy the header swapping
code, and do the 32/64 versions half as cleanly.
The proper way to put it there is to make it yet another bfd target vector
function. I don't know what all rigmarole is involved in doing that, and
it seems like overkill for something probably only ever used by gdb and
only with ELF.
For testing purposes, I have added a gdb command that makes use of this.
That code doesn't need to be target-specific except for the name of the
function it calls. Right now I just have it kludged to do runtime
checks with bfd_get_flavour and bfd_get_arch_size, but that is code that
won't build properly if libbfd doesn't have some elf32 and elf64 targets
configured in.
Can anyone offer advice on where this function ought to live? If it
doesn't live in the bfd elf backend, then I'll have to copy or
hand-integrate some sanity checking and byte-swapping code for ELF headers.
Thanks,
Roland
--- elfcode.h.~1.41.~ Sat May 10 15:09:29 2003
+++ elfcode.h Tue May 13 17:49:15 2003
@@ -1568,6 +1568,201 @@ elf_symbol_flags (flags)
}
#endif
\f
+/* Create a new BFD as if by bfd_openr. Rather than opening a file, read
+ an ELF file image out of remote memory based on the ELF file header
+ found at EHDR_VMA. The function TARGET_READ_MEMORY is called to read
+ remote memory regions by VMA. TEMPL must be a BFD for a target with the
+ word size and byte order found in the remote memory. */
+
+bfd *
+NAME(bfd_elf,bfd_from_memory)
+ (bfd *templ, bfd_vma ehdr_vma,
+ int (*target_read_memory) (bfd_vma vma, char *myaddr, int len))
+{
+ Elf_External_Ehdr x_ehdr; /* Elf file header, external form */
+ Elf_Internal_Ehdr ehdr; /* Elf file header, internal form */
+ bfd *nbfd;
+ struct bfd_in_memory *bim;
+ int contents_size;
+ char *contents;
+ int err;
+
+ /* Read in the ELF header in external format. */
+ err = target_read_memory (ehdr_vma, (char *) &x_ehdr, sizeof x_ehdr);
+ if (err)
+ {
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+
+ /* Now check to see if we have a valid ELF file, and one that BFD can
+ make use of. The magic number must match, the address size ('class')
+ and byte-swapping must match our XVEC entry. */
+
+ if (! elf_file_p (&x_ehdr)
+ || x_ehdr.e_ident[EI_VERSION] != EV_CURRENT
+ || x_ehdr.e_ident[EI_CLASS] != ELFCLASS)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ /* Check that file's byte order matches xvec's */
+ switch (x_ehdr.e_ident[EI_DATA])
+ {
+ case ELFDATA2MSB: /* Big-endian */
+ if (! bfd_header_big_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATA2LSB: /* Little-endian */
+ if (! bfd_header_little_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATANONE: /* No data encoding specified */
+ default: /* Unknown data encoding specified */
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ elf_swap_ehdr_in (templ, &x_ehdr, &ehdr);
+
+ /* The file header tells where to find the phdrs and section headers,
+ which tells us how big the file is. */
+
+ contents_size = ehdr.e_ehsize;
+#define BUMP(to) if (contents_size < (int) (to)) contents_size = (to)
+
+ BUMP (ehdr.e_phoff + ehdr.e_phnum * ehdr.e_phentsize);
+ if (ehdr.e_phentsize == sizeof (Elf_External_Phdr))
+ {
+ unsigned int i;
+ Elf_Internal_Phdr phdr;
+ Elf_External_Phdr *x_phdrs
+ = (Elf_External_Phdr *) bfd_malloc (ehdr.e_phnum * sizeof *x_phdrs);
+ if (x_phdrs == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ err = target_read_memory (ehdr_vma + ehdr.e_phoff,
+ (char *) x_phdrs,
+ ehdr.e_phnum * sizeof *x_phdrs);
+ if (err)
+ {
+ free (x_phdrs);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ for (i = 0; i < ehdr.e_phnum; ++i)
+ {
+ elf_swap_phdr_in (templ, &x_phdrs[i], &phdr);
+ BUMP (phdr.p_offset + phdr.p_filesz);
+ }
+ free (x_phdrs);
+ }
+ else if (ehdr.e_phentsize != 0 && ehdr.e_phnum != 0)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ BUMP (ehdr.e_shoff + ehdr.e_shnum * ehdr.e_shentsize);
+ if (ehdr.e_shentsize == sizeof (Elf_External_Shdr))
+ {
+ unsigned int i;
+ Elf_Internal_Shdr shdr;
+ Elf_External_Shdr *x_shdrs
+ = (Elf_External_Shdr *) bfd_malloc (ehdr.e_shnum * sizeof *x_shdrs);
+ if (x_shdrs == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ err = target_read_memory (ehdr_vma + ehdr.e_shoff,
+ (char *) x_shdrs,
+ ehdr.e_shnum * sizeof *x_shdrs);
+ if (err)
+ {
+ free (x_shdrs);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ for (i = 0; i < ehdr.e_shnum; ++i)
+ {
+ elf_swap_shdr_in (templ, &x_shdrs[i], &shdr);
+ BUMP (shdr.sh_offset
+ + ((shdr.sh_flags & SHF_ALLOC) ? shdr.sh_size : 0));
+ }
+ free (x_shdrs);
+ }
+ else if (ehdr.e_shentsize != 0 && ehdr.e_shnum != 0)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+#undef BUMP
+
+ /* Now we know the size of the whole image we want read in. */
+
+ contents = (char *) bfd_malloc ((bfd_size_type) contents_size);
+ if (contents == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ /* We already read the ELF header, no need to copy it in again. */
+ memcpy (contents, &x_ehdr, sizeof x_ehdr);
+ err = target_read_memory (ehdr_vma + sizeof x_ehdr, contents + sizeof x_ehdr,
+ contents_size - sizeof x_ehdr);
+ if (err)
+ {
+ free (contents);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+
+ /* Now we have a memory image of the ELF file contents. Make a BFD. */
+ bim = ((struct bfd_in_memory *)
+ bfd_malloc ((bfd_size_type) sizeof (struct bfd_in_memory)));
+ if (bim == NULL)
+ {
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd = _bfd_new_bfd ();
+ if (nbfd == NULL)
+ {
+ free (bim);
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd->filename = "<in-memory>";
+ if (templ)
+ nbfd->xvec = templ->xvec;
+ bim->size = contents_size;
+ bim->buffer = contents;
+ nbfd->iostream = (PTR) bim;
+ nbfd->flags = BFD_IN_MEMORY;
+ nbfd->direction = read_direction;
+ nbfd->mtime = time (NULL);
+ nbfd->mtime_set = TRUE;
+
+ return nbfd;
+}
+\f
#include "elfcore.h"
#include "elflink.h"
\f
* Re: bfd function to read ELF file image from memory
2003-05-14 1:28 bfd function to read ELF file image from memory Roland McGrath
@ 2003-05-16 11:24 ` Nick Clifton
2003-05-16 21:30 ` Roland McGrath
2003-05-17 5:39 ` Andrew Cagney
0 siblings, 2 replies; 9+ messages in thread
From: Nick Clifton @ 2003-05-16 11:24 UTC (permalink / raw)
To: Roland McGrath; +Cc: binutils, gdb-patches
Hi Roland,
> But all of that is a bit of a digression. I'm posting now because I
> don't quite know where to put this function even if its
> implementation were perfect. It was by far easier to write it in
> elfcode.h than to put it elsewhere. It gets to use the local helper
> code there, and automagically define 32 and 64 bit flavors, which
> keeps the code quite simple. It would be a lot more hair to put the
> code elsewhere, copy the header swapping code, and do the 32/64
> versions half as cleanly.
>
> The proper way to put it there is to make it yet another bfd target
> vector function. I don't know what all rigmarole is involved in
> doing that, and it seems like overkill for something probably only
> ever used by gdb and only with ELF.
Well since your function is creating a new bfd, one obvious place for
it would be in opncls.c. Have you considered using an interface like
the one provided by BFD_IN_MEMORY ? You could make the in-memory
interface generic and provide different memory accessing functions
(for local memory, remote memory, mmap'ed files, etc).
I do not think that treating this in-memory bfd as a new target is
appropriate, since it is not really a new file format or instruction
set architecture, but a new method of loading (part of) a binary file
into a normal bfd structure.
> Can anyone offer advice on where this function ought to live? If it
> doesn't live in the bfd elf backend, then I'll have to copy or
> hand-integrate some sanity checking and byte-swapping code for ELF
> headers.
I am all for keeping things simple, so if it is easier to put it in
elfcode.h then that is probably where it should live. Given the
probably limited utility of this function it probably ought to be only
conditionally defined, based on the particular configured target or
maybe a configure time switch.
Cheers
Nick
* Re: bfd function to read ELF file image from memory
2003-05-16 11:24 ` Nick Clifton
@ 2003-05-16 21:30 ` Roland McGrath
2003-05-19 10:29 ` Nick Clifton
2003-05-17 5:39 ` Andrew Cagney
1 sibling, 1 reply; 9+ messages in thread
From: Roland McGrath @ 2003-05-16 21:30 UTC (permalink / raw)
To: Nick Clifton; +Cc: binutils, gdb-patches
> Well since your function is creating a new bfd, one obvious place for
> it would be in opncls.c. Have you considered using an interface like
> the one provided by BFD_IN_MEMORY ? You could make the in-memory
> interface generic and provide different memory accessing functions
> (for local memory, remote memory, mmap'ed files, etc).
>
> I do not think that treating this in-memory bfd as a new target is
> appropriate, since it is not really a new file format or instruction
> set architecture, but a new method of loading (part of) a binary file
> into a normal bfd structure.
I think you have misunderstood what the code does, or else I am just very
confused about what your comments mean. The terminology in my remarks
about the code was surely sloppy, since I was mostly expecting folks to
look at the code itself.
This is a new function that bfd_malloc's a buffer, fills in its contents,
and makes that into a bfd with BFD_IN_MEMORY set. It has nothing to do
with adding a new bfd backend. The function's implementation is
format-specific, but its calling interface is generic. So it seems like it
ought to be accessed through a new function pointer in struct bfd_target,
which we would at least so far only have implemented in the ELF backend.
But, as I said, I don't really know what all the pieces are that one ought
to write to properly add another dispatched function to bfd.
> I am all for keeping things simple, so if it is easier to put it in
> elfcode.h then that is probably where it should live. Given the
> probably limited utility of this function it probably ought to be only
> conditionally defined, based on the particular configured target or
> maybe a configure time switch.
I am not aware of functions in libbfd that are conditionalized this way.
Can you point me to an example that would be a model of what you are
suggesting? I guess I can figure out what compile-time check to use in gdb
for an ELF target and call bfd_elf_bfd_from_memory from libbfd under that
#ifdef, but it doesn't seem very BFDish. Still, I would just like to have one
calling interface that handles 32-bit and 64-bit ELF. To implement that
front-end function I need a compile-time check I can use in elfcode.h that
tells me whether 32-bit and/or 64-bit targets are being built. What works?
It so happens that I rewrote the function in the fashion I was describing.
So here's the new version. This one should be robustly generally useful to
extract the runtime-loaded portions of any ELF DSO from inferior memory.
Please take a look at the code.
For any DSO that isn't very very tiny, if it's not stripped (and perhaps
even if it is) then its section headers won't be in the memory image and so
what the extraction gives you appears to be an ELF ET_DYN file that never
had section headers at all. Right now the ELF backend doesn't really cope
with such objects, though it could certainly be made to (grokking the phdrs
and synthesizing fake sections similar to how ELF core files are handled).
But this function produces a valid ELF file image and feeds it to the rest
of BFD to choke on. (The Linux vsyscall DSO is in fact very very tiny and
so you get the whole thing with section headers and the ELF backend is
perfectly happy.)
This version also figures out the offset between actual VMAs read from and
those in the ELF headers for you. With that, the new gdb command I've
added using this function works in the general case.
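The load-base calculation can be sketched as below, under stated assumptions: `struct load_seg` and `compute_load_base` are hypothetical names, `unsigned long` stands in for `bfd_vma`, and each alignment is taken to be a nonzero power of two. The rule mirrors the patch: the PT_LOAD segment whose page-aligned file offset is 0 contains the ELF header, so the difference between where the header was actually found and that segment's aligned `p_vaddr` is the bias.

```c
/* Hypothetical, simplified view of one PT_LOAD program header.  */
struct load_seg
{
  unsigned long offset;		/* p_offset */
  unsigned long vaddr;		/* p_vaddr */
  unsigned long align;		/* p_align, assumed a nonzero power of two */
};

/* Return the bias to add to addresses in the ELF headers to get the
   addresses actually used in memory, given that the ELF header was
   found at EHDR_VMA.  */
static unsigned long
compute_load_base (unsigned long ehdr_vma,
		   const struct load_seg *segs, unsigned int n)
{
  unsigned long base = ehdr_vma;	/* Default when no segment matches.  */
  unsigned int i;

  for (i = 0; i < n; ++i)
    {
      unsigned long mask = ~(segs[i].align - 1);

      /* The segment that starts at file offset 0 holds the header.  */
      if ((segs[i].offset & mask) == 0)
	base = ehdr_vma - (segs[i].vaddr & mask);
    }
  return base;
}
```

For an object linked at address 0 the bias equals the header address; for one found exactly where its headers say it should be, the bias is 0.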
Thanks,
Roland
--- elfcode.h.~1.41.~ Sat May 10 15:09:29 2003
+++ elfcode.h Thu May 15 21:44:46 2003
@@ -1568,6 +1568,221 @@ elf_symbol_flags (flags)
}
#endif
\f
+/* Create a new BFD as if by bfd_openr. Rather than opening a file,
+ reconstruct an ELF file by reading the segments out of remote memory
+ based on the ELF file header at EHDR_VMA and the ELF program headers it
+ points to. The function TARGET_READ_MEMORY is called to read remote
+ memory regions by VMA. TEMPL must be a BFD for a target with the word
+ size and byte order found in the remote memory. */
+
+bfd *
+NAME(bfd_elf,bfd_from_memory)
+ (bfd *templ, bfd_vma ehdr_vma, bfd_vma *loadbasep,
+ int (*target_read_memory) (bfd_vma vma, char *myaddr, int len))
+{
+ Elf_External_Ehdr x_ehdr; /* Elf file header, external form */
+ Elf_Internal_Ehdr ehdr; /* Elf file header, internal form */
+ Elf_External_Phdr *x_phdrs;
+ Elf_Internal_Phdr *i_phdrs;
+ bfd *nbfd;
+ struct bfd_in_memory *bim;
+ int contents_size;
+ char *contents;
+ int err;
+ unsigned int i;
+ bfd_vma loadbase;
+
+ /* Read in the ELF header in external format. */
+ err = target_read_memory (ehdr_vma, (char *) &x_ehdr, sizeof x_ehdr);
+ if (err)
+ {
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+
+ /* Now check to see if we have a valid ELF file, and one that BFD can
+ make use of. The magic number must match, the address size ('class')
+ and byte-swapping must match our XVEC entry. */
+
+ if (! elf_file_p (&x_ehdr)
+ || x_ehdr.e_ident[EI_VERSION] != EV_CURRENT
+ || x_ehdr.e_ident[EI_CLASS] != ELFCLASS)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ /* Check that file's byte order matches xvec's */
+ switch (x_ehdr.e_ident[EI_DATA])
+ {
+ case ELFDATA2MSB: /* Big-endian */
+ if (! bfd_header_big_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATA2LSB: /* Little-endian */
+ if (! bfd_header_little_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATANONE: /* No data encoding specified */
+ default: /* Unknown data encoding specified */
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ elf_swap_ehdr_in (templ, &x_ehdr, &ehdr);
+
+ /* The file header tells where to find the program headers.
+ These are what we use to actually choose what to read. */
+
+ if (ehdr.e_phentsize != sizeof (Elf_External_Phdr) || ehdr.e_phnum == 0)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ x_phdrs = (Elf_External_Phdr *)
+ bfd_malloc (ehdr.e_phnum * (sizeof *x_phdrs + sizeof *i_phdrs));
+ if (x_phdrs == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ err = target_read_memory (ehdr_vma + ehdr.e_phoff, (char *) x_phdrs,
+ ehdr.e_phnum * sizeof x_phdrs[0]);
+ if (err)
+ {
+ free (x_phdrs);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ i_phdrs = (Elf_Internal_Phdr *) &x_phdrs[ehdr.e_phnum];
+
+ contents_size = ehdr.e_phoff + ehdr.e_phnum;
+ if ((bfd_vma) contents_size < ehdr.e_ehsize)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ loadbase = ehdr_vma;
+ for (i = 0; i < ehdr.e_phnum; ++i)
+ {
+ elf_swap_phdr_in (templ, &x_phdrs[i], &i_phdrs[i]);
+ if (i_phdrs[i].p_type == PT_LOAD)
+ {
+ bfd_vma segment_end;
+ segment_end = (i_phdrs[i].p_offset + i_phdrs[i].p_filesz
+ + i_phdrs[i].p_align - 1) & -i_phdrs[i].p_align;
+ if (segment_end > (bfd_vma) contents_size)
+ contents_size = segment_end;
+
+ if ((i_phdrs[i].p_offset & -i_phdrs[i].p_align) == 0)
+ loadbase = ehdr_vma - (i_phdrs[i].p_vaddr & -i_phdrs[i].p_align);
+ }
+ }
+
+ /* Back up to the last PT_LOAD header. */
+ do
+ --i;
+ while (i > 0 && i_phdrs[i].p_type != PT_LOAD);
+
+ /* Trim the last segment so we don't bother with zeros in the last page
+ that are off the end of the file. However, if the extra bit in that
+ page includes the section headers, keep them. */
+ if ((bfd_vma) contents_size > i_phdrs[i].p_offset + i_phdrs[i].p_filesz
+ && (bfd_vma) contents_size >= (ehdr.e_shoff
+ + ehdr.e_shnum * ehdr.e_shentsize))
+ {
+ contents_size = i_phdrs[i].p_offset + i_phdrs[i].p_filesz;
+ if ((bfd_vma) contents_size < (ehdr.e_shoff
+ + ehdr.e_shnum * ehdr.e_shentsize))
+ contents_size = ehdr.e_shoff + ehdr.e_shnum * ehdr.e_shentsize;
+ }
+ else
+ contents_size = i_phdrs[i].p_offset + i_phdrs[i].p_filesz;
+
+ /* Now we know the size of the whole image we want read in. */
+ contents = (char *) bfd_zmalloc ((bfd_size_type) contents_size);
+ if (contents == NULL)
+ {
+ free (x_phdrs);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+
+ for (i = 0; i < ehdr.e_phnum; ++i)
+ if (i_phdrs[i].p_type == PT_LOAD)
+ {
+ bfd_vma start = i_phdrs[i].p_offset & -i_phdrs[i].p_align;
+ bfd_vma end = (i_phdrs[i].p_offset + i_phdrs[i].p_filesz
+ + i_phdrs[i].p_align - 1) & -i_phdrs[i].p_align;
+ if (end > (bfd_vma) contents_size)
+ end = contents_size;
+ err = target_read_memory ((loadbase + i_phdrs[i].p_vaddr)
+ & -i_phdrs[i].p_align,
+ contents + start, end - start);
+ if (err)
+ {
+ free (x_phdrs);
+ free (contents);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ }
+ free (x_phdrs);
+
+ /* If the segments visible in memory didn't include the section headers,
+ then clear them from the file header. */
+ if ((bfd_vma) contents_size < ehdr.e_shoff + ehdr.e_shnum * ehdr.e_shentsize)
+ {
+ memset (&x_ehdr.e_shoff, 0, sizeof x_ehdr.e_shoff);
+ memset (&x_ehdr.e_shnum, 0, sizeof x_ehdr.e_shnum);
+ }
+
+ /* This will normally have been in the first PT_LOAD segment. But it
+ conceivably could be missing, and we might have just changed it. */
+ memcpy (contents, &x_ehdr, sizeof x_ehdr);
+
+ /* Now we have a memory image of the ELF file contents. Make a BFD. */
+ bim = ((struct bfd_in_memory *)
+ bfd_malloc ((bfd_size_type) sizeof (struct bfd_in_memory)));
+ if (bim == NULL)
+ {
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd = _bfd_new_bfd ();
+ if (nbfd == NULL)
+ {
+ free (bim);
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd->filename = "<in-memory>";
+ nbfd->xvec = templ->xvec;
+ bim->size = contents_size;
+ bim->buffer = contents;
+ nbfd->iostream = (PTR) bim;
+ nbfd->flags = BFD_IN_MEMORY;
+ nbfd->direction = read_direction;
+ nbfd->mtime = time (NULL);
+ nbfd->mtime_set = TRUE;
+
+ *loadbasep = loadbase;
+ return nbfd;
+}
+\f
#include "elfcore.h"
#include "elflink.h"
\f
* Re: bfd function to read ELF file image from memory
2003-05-16 21:30 ` Roland McGrath
@ 2003-05-19 10:29 ` Nick Clifton
2003-05-19 20:24 ` Roland McGrath
0 siblings, 1 reply; 9+ messages in thread
From: Nick Clifton @ 2003-05-19 10:29 UTC (permalink / raw)
To: Roland McGrath; +Cc: binutils, gdb-patches
Hi Roland,
>> Well since your function is creating a new bfd, one obvious place for
>> it would be in opncls.c. Have you considered using an interface like
>> the one provided by BFD_IN_MEMORY ? You could make the in-memory
>> interface generic and provide different memory accessing functions
>> (for local memory, remote memory, mmap'ed files, etc).
>>
>> I do not think that treating this in-memory bfd as a new target is
>> appropriate, since it is not really a new file format or instruction
>> set architecture, but a new method of loading (part of) a binary file
>> into a normal bfd structure.
>
> I think you have misunderstood what the code does, or else I am just
> very confused about what your comments mean. The terminology in my
> remarks about the code was surely sloppy, since I was mostly
> expecting folks to look at the code itself.
>
> This is a new function that bfd_malloc's a buffer, fills in its contents,
> and makes that into a bfd with BFD_IN_MEMORY set. It has nothing to do
> with adding a new bfd backend.
Agreed - hence treating it as a new target would be inappropriate.
> The function's implementation is format-specific, but its calling
> interface is generic. So it seems like it ought to be accessed
> through a new function pointer in struct bfd_target, which we would
> at least so far only have implemented in the ELF backend.
Right - my theory was that you could extend and clean up the current
BFD_IN_MEMORY implementation so that the memory was either local (as
it currently is), remote (as your patch implements) or even an mmap'ed
file (something that people have been asking for in bfd for ages, and
is partially implemented in bfdwin.c).
This would allow for lazy reading in of remote contents (ie only when
they are needed) and, in theory, would mean that it could be used by
almost any bfd backend. Any remote-memory-bfd would just appear to the
rest of the bfd code as a normal bfd structure, and when one of the
standard bfd function calls was used to read some of the contents, a
remote fetch would be performed and that region would then be marked
as being held in local memory.
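A rough sketch of that lazy-fetch idea, assuming fixed-size chunks and a small image, might look like this. Nothing here is existing BFD code: `struct lazy_image`, `lazy_read`, `remote_read_fn`, and `demo_remote_read` are all hypothetical names, and the chunk size and image size are arbitrary.

```c
#include <string.h>

#define CHUNK 256		/* Arbitrary fetch granularity.  */
#define NCHUNKS 8		/* Arbitrary image size, in chunks.  */

typedef int (*remote_read_fn) (unsigned long vma, char *buf, int len);

struct lazy_image
{
  unsigned long base;		/* Remote address of the image start.  */
  remote_read_fn read_mem;	/* How to fetch remote memory.  */
  char data[NCHUNKS * CHUNK];	/* Local copy, filled on demand.  */
  unsigned char present[NCHUNKS];	/* Which chunks are local.  */
  int fetches;			/* Number of remote fetches so far.  */
};

/* Copy LEN bytes at image offset OFF into OUT, fetching any chunk not
   yet held locally.  Returns 0 on success, -1 if the range is outside
   the image, or the callback's error code.  */
static int
lazy_read (struct lazy_image *img, unsigned long off,
	   char *out, unsigned long len)
{
  unsigned long first, last, c;

  if (off > sizeof img->data || len > sizeof img->data - off)
    return -1;
  if (len == 0)
    return 0;
  first = off / CHUNK;
  last = (off + len - 1) / CHUNK;
  for (c = first; c <= last; ++c)
    if (!img->present[c])
      {
	int err = img->read_mem (img->base + c * CHUNK,
				 img->data + c * CHUNK, CHUNK);
	if (err)
	  return err;
	img->present[c] = 1;	/* Mark the region as held locally.  */
	img->fetches++;
      }
  memcpy (out, img->data + off, len);
  return 0;
}

/* Demo remote memory: each byte is the low 7 bits of its address.  */
static int
demo_remote_read (unsigned long vma, char *buf, int len)
{
  int i;

  for (i = 0; i < len; ++i)
    buf[i] = (char) ((vma + (unsigned long) i) & 0x7f);
  return 0;
}
```

Repeated reads of the same region cost one remote fetch; only a read that touches a new chunk goes back to the remote side.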
Which is probably more than you wanted to implement, but it seemed
like the right way to handle remote memory reading, at least in the
long run. Your patch as it stands, looks fine to me, although I would
probably rename the functions to ..._create_bfd_from_remote_memory().
I would also recommend adding a couple of prototypes to elf-bfd.h.
>> I am all for keeping things simple, so if it is easier to put it in
>> elfcode.h then that is probably where it should live. Given the
>> probably limited utility of this function I feel that it ought to
>> be only conditionally defined, based on the particular configured
>> target or maybe a configure time switch.
>
> I am not aware of functions in libbfd that are conditionalized this
> way. Can you point me to an example that would be a model of what
> you are suggesting?
Certainly. If you have the code in a separate file, then you could do
the same thing as the peXXigen.c file, which is conditionally compiled
and linked into the bfd library based on the configuration target.
(See bfd/configure.in). Alternatively you could use a system like the
definition of COFF_WITH_PE, which is defined in a variety of different
source files (eg bfd/pe-arm.c) and then tested for in various other
files (eg bfd/coffcode.h).
> I guess I can figure out what compile-time check to use in gdb
> for an ELF target and call bfd_elf_bfd_from_memory from libbfd under
> that #ifdef, but it doesn't seem very BFDish.
Hmm, I take your point - it would be easier if the function were
always defined in the bfd library, even if it was not used/not
intended to be used by a particular bfd target. (I think it would be
OK if it were only defined in ELF targeted versions of the library
though).
Given that the code is not target specific itself, (just the
availability of the ELF headers/file contents in remote memory), I
withdraw my suggestion that the code be conditionalised.
> Still, I would just like to have one calling interface that handles
> 32-bit and 64-bit ELF. To implement that front-end function I need
> a compile-time check I can use in elfcode.h that tells me whether
> 32-bit and/or 64-bit targets are being built. What works?
You can check "#if ARCH_SIZE == 64" or "#if ARCH_SIZE == 32".
> It so happens that I rewrote the function in the fashion I was
> describing. So here's the new version. This one should be robustly
> generally useful to extract the runtime-loaded portions of any ELF
> DSO from inferior memory. Please take a look at the code.
> +/* Create a new BFD as if by bfd_openr. Rather than opening a file,
> + reconstruct an ELF file by reading the segments out of remote memory
> + based on the ELF file header at EHDR_VMA and the ELF program headers it
> + points to. The function TARGET_READ_MEMORY is called to read remote
> + memory regions by VMA. TEMPL must be a BFD for a target with the word
> + size and byte order found in the remote memory. */
I would recommend including a description of how TARGET_READ_MEMORY is
supposed to work - what parameters it takes, what values it returns,
whether it is possible for it to timeout, etc.
Also - please document the purpose of the LOADBASEP parameter.
> + Elf_External_Ehdr x_ehdr; /* Elf file header, external form */
> + Elf_Internal_Ehdr ehdr; /* Elf file header, internal form */
> + Elf_External_Phdr *x_phdrs;
> + Elf_Internal_Phdr *i_phdrs;
I think that the choice of name for the internal Ehdr is slightly
confusing. I would suggest that you change it to "i_ehdr".
> + contents_size = ehdr.e_phoff + ehdr.e_phnum;
> + if ((bfd_vma) contents_size < ehdr.e_ehsize)
> + {
> + bfd_set_error (bfd_error_wrong_format);
> + return NULL;
> + }
This bit of code has me confused. I assume that 'contents_size' is
being used to reserve space in the (to be allocated) 'contents'
buffer for the ELF header, but why are e_phoff and e_phnum being
used ?
> + /* Back up to the last PT_LOAD header. */
> + do
> + --i;
> + while (i > 0 && i_phdrs[i].p_type != PT_LOAD);
Couldn't you just remember the value of 'i' each time you encounter a
PT_LOAD section in the for() loop, so saving yourself this do...while
loop ? It would also be a good idea to check that there actually *is*
a PT_LOAD section.
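The suggestion can be sketched as a single pass that records the most recent PT_LOAD index and reports when there is none. The names `SKETCH_PT_LOAD` and `last_load_index` are hypothetical; the value 1 matches the ELF `PT_LOAD` constant. As far as one can tell from the quoted do/while loop, it stops at index 0 when no PT_LOAD entry exists and then treats entry 0 as if it were one, which is the case the -1 return guards against here.

```c
#define SKETCH_PT_LOAD 1	/* Same value as the ELF PT_LOAD constant.  */

/* Return the index of the last PT_LOAD entry among the N program
   header types in TYPES, or -1 if there is none.  */
static int
last_load_index (const unsigned int *types, unsigned int n)
{
  int last = -1;
  unsigned int i;

  for (i = 0; i < n; ++i)
    if (types[i] == SKETCH_PT_LOAD)
      last = (int) i;		/* Remember it as we go.  */
  return last;
}
```

A caller would reject the image as malformed when the result is -1, and otherwise use the index directly instead of scanning backwards.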
> + /* Now we know the size of the whole image we want read in. */
> + contents = (char *) bfd_zmalloc ((bfd_size_type) contents_size);
Hmmm, what about files with disjoint segments ? ie files which load
some segments to low addresses and some to high addresses ? With the
current code 'contents_size' will be some huge number and so this malloc
will fail. Wouldn't it be better to load each segment contiguously
into the 'contents' buffer and adjust the p_vaddr values in the
program header appropriately ?
Cheers
Nick
* Re: bfd function to read ELF file image from memory
2003-05-19 10:29 ` Nick Clifton
@ 2003-05-19 20:24 ` Roland McGrath
2003-05-19 20:30 ` Ian Lance Taylor
0 siblings, 1 reply; 9+ messages in thread
From: Roland McGrath @ 2003-05-19 20:24 UTC (permalink / raw)
To: Nick Clifton; +Cc: binutils, gdb-patches
> Right - my theory was that you could extend and clean up the current
> BFD_IN_MEMORY implementation so that the memory was either local (as
> it currently is), remote (as your patch implements) or even an mmap'ed
> file (something that people have been asking for in bfd for ages, and
> is partially implemented in bfdwin.c).
My patch implements copying from remote memory once, not generalized access
to it. Lazy reading would be pretty straightforward to implement. But it
is rather a digression from the motivating purpose here. The case at hand
is an ELF file image of about 2kb, so it would surely be more overhead to
split it up and access memory piecemeal than just to read the whole thing.
Moreover, I have a problem to solve in gdb and the BFD function I've
written is the naturally clean and somewhat general BFD component of it.
It's pretty trivial to implement because of the existing BFD_IN_MEMORY
functionality. I have no impetus to add a more complex BFD facility for
which there is no actual demand, and everyone's time is probably better
spent elsewhere. (All that said, what you've described is a few hours'
hacking on bfdio.c--much more time writing something using such a facility
so as to test it.)
> probably rename the functions to ..._create_bfd_from_remote_memory().
> I would also recommend adding a couple of prototypes to elf-bfd.h.
Of course. I posted the function for comments about how to integrate it,
not as a finished addition. (And I never care what the names are.)
> Certainly. If you have the code in a separate file, then you could do
> the same thing as the peXXigen.c file, which is conditionally compiled
> and linked into the bfd library based on the configuration target.
> (See bfd/configure.in). Alternatively you could use a system like the
> definition of COFF_WITH_PE, which is defined in a variety of different
> source files (eg bfd/pe-arm.c) and then tested for in various other
> files (eg bfd/coffcode.h).
I don't see how any of this is apropos. Your examples are about different
backends sharing common code. For that purpose, elfcode.h is already the
right place for the new function's implementation. The question is about
libbfd interfaces used by applications, that are available only conditionally.
I don't think you've cited an example like that for me to examine. If you
have, please be more specific about what functions I should look at.
> Hmm, I take your point - it would be easier if the function were
> always defined in the bfd library, even if it was not used/not
> intended to be used by a particular bfd target. (I think it would be
> OK if it were only defined in ELF targeted versions of the library
> though).
>
> Given that the code is not target specific itself, (just the
> availability of the ELF headers/file contents in remote memory), I
> withdraw my suggestion that the code be conditionalised.
Ok. That still doesn't tell me explicitly what to do. Are you saying that
it should in fact be a new backend function pointer in struct bfd_target?
Can you tell me where to find the checklist of things I need to do to
properly add such an interface?
> > Still, I would just like to have one calling interface that handles
> > 32-bit and 64-bit ELF. To implement that front-end function I need
> > a compile-time check I can use in elfcode.h that tells me whether
> > 32-bit and/or 64-bit targets are being built. What works?
>
> You can check "#if ARCH_SIZE == 64" or "#if ARCH_SIZE == 32".
I must be misunderstanding you, or else that can't be right. I asked about
how front-end source code can know whether it needs to call just the 32-bit
function from the 32-bit compile of elfcode.h, just the 64-bit function
from the 64-bit compile of elfcode.h, or switch between the two at runtime
based on bfd_get_arch_size.
> I would recommend including a description of how TARGET_READ_MEMORY is
> supposed to work - what parameters it takes, what values it returns,
> whether it is possible for it to timeout, etc.
>
> Also - please document the purpose of the LOADBASEP parameter.
Sure.
> I think that the choice of name for the internal Ehdr is
> slightly confusing. I would suggest that you change it to "i_ehdr",
Whatever floats your boat.
>
> > + contents_size = ehdr.e_phoff + ehdr.e_phnum;
> > + if ((bfd_vma) contents_size < ehdr.e_ehsize)
> > + {
> > + bfd_set_error (bfd_error_wrong_format);
> > + return NULL;
> > + }
>
> This bit of code has me confused. I assume that 'contents_size' is
> being used to reserve space in the (to be allocated) 'contents'
> buffer for the ELF header, but why are e_phoff and e_phnum being
> used ?
No. That variable is tallying the total size of the buffer that will be
allocated to hold the file image. A valid ELF file can have its phdrs not
covered by any PT_LOAD segment, placed anywhere in the file. The original
version of this code tried to read a whole contiguous file and so to be
pedantically correct this calculation made absolutely sure the phdrs were
covered. The current version of the function only reads from the PT_LOAD
segments, so that bit of code doesn't make much sense any more.
> Couldn't you just remember the value of 'i' each time you encounter a
> PT_LOAD section in the for() loop, so saving yourself this do...while
> loop ?
Sure. The loop costs very little, but there is nothing wrong with saving
the last index.
> It would also be a good idea to check that there actually *is* a PT_LOAD
> section.
Yes, another holdover from the original implementation. The first version
would be happy just reading an ELF file header and phdrs with no PT_LOAD
segments and delivering a file image containing just those headers.
That no longer makes sense.
> Hmmm, what about files with disjoint segments ? ie files which load
> some segments to low addresses and some to high addresses ? With the
> current code 'contents' will be some huge number and so this malloc
> will fail.
You are mistaken. The load addresses have no bearing on those
calculations. Perhaps you saw `p_offset' in the code and thought
`p_vaddr'. The size computed is that of the file image, not the
memory image. In an ELF file with a segment loaded at 0x1000 and a
segment loaded at 0x1000000, the two segments are contiguous in the
file (modulo alignment padding).
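To make the point concrete, here is a minimal sketch of the size calculation, not the patch itself. The struct demo_phdr and the demo_* helpers are hypothetical stand-ins for the internal phdr type and the loop in the function; the only thing they are meant to show is that the buffer size comes from p_offset and p_filesz, rounded up to p_align, and that p_vaddr is never consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified program header: only the fields used here.  */
struct demo_phdr
{
  uint64_t p_offset;  /* Offset of the segment in the file.  */
  uint64_t p_vaddr;   /* Address the segment is loaded at.  */
  uint64_t p_filesz;  /* Bytes of the segment present in the file.  */
  uint64_t p_align;   /* Power-of-two alignment.  */
};

/* Round VALUE up to a multiple of ALIGN, which must be a power of two.  */
static uint64_t
demo_align_up (uint64_t value, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (value + align - 1) & -align;
}

/* Return the size of the file image needed to hold all N segments.
   Only file offsets are consulted; p_vaddr never enters into it.  */
static uint64_t
demo_file_image_size (const struct demo_phdr *phdrs, size_t n)
{
  uint64_t size = 0;
  size_t i;

  for (i = 0; i < n; ++i)
    {
      uint64_t end = demo_align_up (phdrs[i].p_offset + phdrs[i].p_filesz,
                                    phdrs[i].p_align);
      if (end > size)
        size = end;
    }
  return size;
}
```

With one segment at file offset 0 loaded at 0x1000 and a second at file offset 0x1000 loaded at 0x1000000, both aligned to 0x1000, this yields 0x2000 regardless of how far apart the load addresses are.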
> Wouldn't it be better to load each segment contiguously into the
> 'contents' buffer and adjust the p_vaddr values in the program header
> appropriately ?
Changing p_vaddr values would completely break all address-using
calculations ever done on anything in the file. It would be possible
to pack segments together by adjusting their p_offset values while
keeping them congruent to p_vaddr modulo p_align. However, in
practice you will never see an ELF image that isn't already packed
that way. Given that such overly clever code would in reality never
do anything different, I prefer the simplicity of the current
approach wherein I am replicating the file as it was (according to
its original headers) rather than doing any calculations (into which
I could introduce bugs) to freshly lay out the file. I will change
it if you prefer, but be aware that the added complexity will never
get any testing unless we contrive ELF images unlike any that exist
in nature.
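For reference, the packing alternative described above would amount to something like the following sketch. It is not part of the patch; demo_offset_congruent and demo_pack_offset are hypothetical names, and the code assumes p_align is zero, one, or a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Return nonzero if a segment's file offset and load address agree
   modulo its alignment.  An alignment of 0 or 1 imposes no constraint.  */
static int
demo_offset_congruent (uint64_t p_offset, uint64_t p_vaddr, uint64_t p_align)
{
  if (p_align <= 1)
    return 1;
  assert ((p_align & (p_align - 1)) == 0);
  return ((p_offset ^ p_vaddr) & (p_align - 1)) == 0;
}

/* Return the smallest file offset that is >= MIN_OFFSET and congruent
   to P_VADDR modulo P_ALIGN.  This is where a repacked segment would
   have to start if it followed data ending at MIN_OFFSET.  */
static uint64_t
demo_pack_offset (uint64_t min_offset, uint64_t p_vaddr, uint64_t p_align)
{
  uint64_t mask, want, have;

  if (p_align <= 1)
    return min_offset;
  assert ((p_align & (p_align - 1)) == 0);
  mask = p_align - 1;
  want = p_vaddr & mask;     /* Required low bits of the offset.  */
  have = min_offset & mask;  /* Low bits we would start with.  */
  return min_offset + ((want - have) & mask);
}
```

Even this small amount of arithmetic is extra surface for bugs, which is the trade-off argued above for simply replicating the offsets found in the original headers.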
I've changed the couple of small things you cited, and here is the
new version of the function.
Thanks,
Roland
\f
/* Create a new BFD as if by bfd_openr. Rather than opening a file,
reconstruct an ELF file by reading the segments out of remote memory
based on the ELF file header at EHDR_VMA and the ELF program headers it
points to. If not null, *LOADBASEP is filled in with the difference
between the VMAs from which the segments were read, and the VMAs the
file headers (and hence BFD's idea of each section's VMA) put them at.
The function TARGET_READ_MEMORY is called to copy LEN bytes from the
remote memory at target address VMA into the local buffer at MYADDR; it
should return zero on success or an `errno' code on failure. TEMPL must
be a BFD for a target with the word size and byte order found in the
remote memory. */
bfd *
NAME(bfd_elf,bfd_from_memory)
(bfd *templ, bfd_vma ehdr_vma, bfd_vma *loadbasep,
int (*target_read_memory) (bfd_vma vma, char *myaddr, int len))
{
Elf_External_Ehdr x_ehdr; /* Elf file header, external form */
Elf_Internal_Ehdr i_ehdr; /* Elf file header, internal form */
Elf_External_Phdr *x_phdrs;
Elf_Internal_Phdr *i_phdrs, *last_phdr;
bfd *nbfd;
struct bfd_in_memory *bim;
int contents_size;
char *contents;
int err;
unsigned int i;
bfd_vma loadbase;
/* Read in the ELF header in external format. */
err = target_read_memory (ehdr_vma, (char *) &x_ehdr, sizeof x_ehdr);
if (err)
{
bfd_set_error (bfd_error_system_call);
errno = err;
return NULL;
}
/* Now check to see if we have a valid ELF file, and one that BFD can
make use of. The magic number must match, the address size ('class')
and byte-swapping must match our XVEC entry. */
if (! elf_file_p (&x_ehdr)
|| x_ehdr.e_ident[EI_VERSION] != EV_CURRENT
|| x_ehdr.e_ident[EI_CLASS] != ELFCLASS)
{
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
/* Check that file's byte order matches xvec's */
switch (x_ehdr.e_ident[EI_DATA])
{
case ELFDATA2MSB: /* Big-endian */
if (! bfd_header_big_endian (templ))
{
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
break;
case ELFDATA2LSB: /* Little-endian */
if (! bfd_header_little_endian (templ))
{
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
break;
case ELFDATANONE: /* No data encoding specified */
default: /* Unknown data encoding specified */
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
elf_swap_ehdr_in (templ, &x_ehdr, &i_ehdr);
/* The file header tells where to find the program headers.
These are what we use to actually choose what to read. */
if (i_ehdr.e_phentsize != sizeof (Elf_External_Phdr) || i_ehdr.e_phnum == 0)
{
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
x_phdrs = (Elf_External_Phdr *)
bfd_malloc (i_ehdr.e_phnum * (sizeof *x_phdrs + sizeof *i_phdrs));
if (x_phdrs == NULL)
{
bfd_set_error (bfd_error_no_memory);
return NULL;
}
err = target_read_memory (ehdr_vma + i_ehdr.e_phoff, (char *) x_phdrs,
i_ehdr.e_phnum * sizeof x_phdrs[0]);
if (err)
{
free (x_phdrs);
bfd_set_error (bfd_error_system_call);
errno = err;
return NULL;
}
i_phdrs = (Elf_Internal_Phdr *) &x_phdrs[i_ehdr.e_phnum];
contents_size = 0;
last_phdr = NULL;
loadbase = ehdr_vma;
for (i = 0; i < i_ehdr.e_phnum; ++i)
{
elf_swap_phdr_in (templ, &x_phdrs[i], &i_phdrs[i]);
if (i_phdrs[i].p_type == PT_LOAD)
{
bfd_vma segment_end;
segment_end = (i_phdrs[i].p_offset + i_phdrs[i].p_filesz
+ i_phdrs[i].p_align - 1) & -i_phdrs[i].p_align;
if (segment_end > (bfd_vma) contents_size)
contents_size = segment_end;
if ((i_phdrs[i].p_offset & -i_phdrs[i].p_align) == 0)
loadbase = ehdr_vma - (i_phdrs[i].p_vaddr & -i_phdrs[i].p_align);
last_phdr = &i_phdrs[i];
}
}
if (last_phdr == NULL)
{
/* There were no PT_LOAD segments, so there is nothing to read. */
free (x_phdrs);
bfd_set_error (bfd_error_wrong_format);
return NULL;
}
/* Trim the last segment so we don't bother with zeros in the last page
that are off the end of the file. However, if the extra bit in that
page includes the section headers, keep them. */
if ((bfd_vma) contents_size > last_phdr->p_offset + last_phdr->p_filesz
&& (bfd_vma) contents_size >= (i_ehdr.e_shoff
+ i_ehdr.e_shnum * i_ehdr.e_shentsize))
{
contents_size = last_phdr->p_offset + last_phdr->p_filesz;
if ((bfd_vma) contents_size < (i_ehdr.e_shoff
+ i_ehdr.e_shnum * i_ehdr.e_shentsize))
contents_size = i_ehdr.e_shoff + i_ehdr.e_shnum * i_ehdr.e_shentsize;
}
else
contents_size = last_phdr->p_offset + last_phdr->p_filesz;
/* Now we know the size of the whole image we want read in. */
contents = (char *) bfd_zmalloc ((bfd_size_type) contents_size);
if (contents == NULL)
{
free (x_phdrs);
bfd_set_error (bfd_error_no_memory);
return NULL;
}
for (i = 0; i < i_ehdr.e_phnum; ++i)
if (i_phdrs[i].p_type == PT_LOAD)
{
bfd_vma start = i_phdrs[i].p_offset & -i_phdrs[i].p_align;
bfd_vma end = (i_phdrs[i].p_offset + i_phdrs[i].p_filesz
+ i_phdrs[i].p_align - 1) & -i_phdrs[i].p_align;
if (end > (bfd_vma) contents_size)
end = contents_size;
err = target_read_memory ((loadbase + i_phdrs[i].p_vaddr)
& -i_phdrs[i].p_align,
contents + start, end - start);
if (err)
{
free (x_phdrs);
free (contents);
bfd_set_error (bfd_error_system_call);
errno = err;
return NULL;
}
}
free (x_phdrs);
/* If the segments visible in memory didn't include the section headers,
then clear them from the file header. */
if ((bfd_vma) contents_size < (i_ehdr.e_shoff
+ i_ehdr.e_shnum * i_ehdr.e_shentsize))
{
memset (&x_ehdr.e_shoff, 0, sizeof x_ehdr.e_shoff);
memset (&x_ehdr.e_shnum, 0, sizeof x_ehdr.e_shnum);
memset (&x_ehdr.e_shstrndx, 0, sizeof x_ehdr.e_shstrndx);
}
/* This will normally have been in the first PT_LOAD segment. But it
conceivably could be missing, and we might have just changed it. */
memcpy (contents, &x_ehdr, sizeof x_ehdr);
/* Now we have a memory image of the ELF file contents. Make a BFD. */
bim = ((struct bfd_in_memory *)
bfd_malloc ((bfd_size_type) sizeof (struct bfd_in_memory)));
if (bim == NULL)
{
free (contents);
bfd_set_error (bfd_error_no_memory);
return NULL;
}
nbfd = _bfd_new_bfd ();
if (nbfd == NULL)
{
free (bim);
free (contents);
bfd_set_error (bfd_error_no_memory);
return NULL;
}
nbfd->filename = "<in-memory>";
nbfd->xvec = templ->xvec;
bim->size = contents_size;
bim->buffer = contents;
nbfd->iostream = (PTR) bim;
nbfd->flags = BFD_IN_MEMORY;
nbfd->direction = read_direction;
nbfd->mtime = time (NULL);
nbfd->mtime_set = TRUE;
if (loadbasep)
*loadbasep = loadbase;
return nbfd;
}
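As an illustration of the callback contract documented in the comment at the top of the function, here is a minimal sketch of a reader that serves a fake "remote" memory from a local buffer. Everything named demo_* or DEMO_* is hypothetical, and demo_vma merely stands in for bfd_vma so the example builds without the BFD headers. The callback copies LEN bytes from address VMA into MYADDR and returns zero on success or an errno code on failure.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

typedef unsigned long demo_vma;   /* Stand-in for bfd_vma.  */

/* Pretend the remote process has 64 bytes mapped at DEMO_BASE, and
   that they begin with the ELF magic number.  */
#define DEMO_BASE 0x10000UL
static char demo_remote[64] = "\177ELF";

/* Copy LEN bytes from simulated remote address VMA into MYADDR.
   Return zero on success, or EFAULT if any byte of the request falls
   outside the simulated mapping.  */
static int
demo_read_memory (demo_vma vma, char *myaddr, int len)
{
  unsigned long off;

  assert (myaddr != NULL);
  if (len < 0 || vma < DEMO_BASE)
    return EFAULT;
  off = vma - DEMO_BASE;
  if (off > sizeof demo_remote
      || (unsigned long) len > sizeof demo_remote - off)
    return EFAULT;
  memcpy (myaddr, demo_remote + off, (size_t) len);
  return 0;
}
```

A real caller such as gdb would pass its own target-reading routine instead; the sketch only shows the return convention the function above relies on when it sets errno from the callback's result.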
* Re: bfd function to read ELF file image from memory
2003-05-19 20:24 ` Roland McGrath
@ 2003-05-19 20:30 ` Ian Lance Taylor
2003-05-19 20:46 ` Roland McGrath
0 siblings, 1 reply; 9+ messages in thread
From: Ian Lance Taylor @ 2003-05-19 20:30 UTC (permalink / raw)
To: Roland McGrath; +Cc: Nick Clifton, binutils, gdb-patches
Roland McGrath <roland@redhat.com> writes:
> > > Still, I would just like to have one calling interface that handles
> > > 32-bit and 64-bit ELF. To implement that front-end function I need
> > > a compile-time check I can use in elfcode.h that tells me whether
> > > 32-bit and/or 64-bit targets are being built. What works?
> >
> > You can check "#if ARCH_SIZE == 64" or "#if ARCH_SIZE == 32".
>
> I must be misunderstanding you, or else that can't be right. I asked about
> how front-end source code can know whether it needs to call just the 32-bit
> function from the 32-bit compile of elfcode.h, just the 64-bit function
> from the 64-bit compile of elfcode.h, or switch between the two at runtime
> based on bfd_get_arch_size.
I doubt there are any publicly visible functions with name changes
from 32-bit to 64-bit. But there are certainly functions like
bfd_get_gp_size() which look at the BFD to decide where to get the
backend specific information. You could do something like that,
assuming you put the appropriate hook in the elf_backend_data struct,
which is less intrusive than putting it in the bfd_target struct.
Ian
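A rough sketch of the hook approach suggested above, under loudly stated assumptions: struct demo_backend_data stands in for elf_backend_data, struct demo_bfd for a bfd, and the from_memory member is a hypothetical hook that does not exist in BFD. The single front-end function looks at the template BFD to find the backend data and calls whichever flavor that backend supplies, so the caller never names the 32-bit or 64-bit function directly.

```c
#include <assert.h>
#include <stddef.h>

struct demo_bfd;

/* Hypothetical per-backend data, standing in for elf_backend_data.  */
struct demo_backend_data
{
  int arch_size;  /* 32 or 64.  */
  /* Hypothetical hook: build an in-memory BFD for this flavor.  */
  struct demo_bfd *(*from_memory) (struct demo_bfd *templ,
                                   unsigned long ehdr_vma);
};

/* Hypothetical BFD: only the backend pointer matters here.  */
struct demo_bfd
{
  const struct demo_backend_data *backend;  /* NULL if not ELF.  */
};

/* Placeholder results so the two flavors can be told apart.  */
static struct demo_bfd demo_result32, demo_result64;

static struct demo_bfd *
demo_elf32_from_memory (struct demo_bfd *templ, unsigned long ehdr_vma)
{
  (void) ehdr_vma;
  assert (templ->backend->arch_size == 32);
  return &demo_result32;
}

static struct demo_bfd *
demo_elf64_from_memory (struct demo_bfd *templ, unsigned long ehdr_vma)
{
  (void) ehdr_vma;
  assert (templ->backend->arch_size == 64);
  return &demo_result64;
}

static const struct demo_backend_data demo_elf32_backend
  = { 32, demo_elf32_from_memory };
static const struct demo_backend_data demo_elf64_backend
  = { 64, demo_elf64_from_memory };

/* The one calling interface: dispatch through the backend's hook, or
   fail if the target has no backend data or no hook.  */
static struct demo_bfd *
demo_bfd_from_memory (struct demo_bfd *templ, unsigned long ehdr_vma)
{
  if (templ == NULL || templ->backend == NULL
      || templ->backend->from_memory == NULL)
    return NULL;
  return templ->backend->from_memory (templ, ehdr_vma);
}
```

Because the front end only dereferences a pointer, it links whether the library was configured with 32-bit ELF targets, 64-bit ELF targets, both, or neither.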
* Re: bfd function to read ELF file image from memory
2003-05-16 11:24 ` Nick Clifton
2003-05-16 21:30 ` Roland McGrath
@ 2003-05-17 5:39 ` Andrew Cagney
1 sibling, 0 replies; 9+ messages in thread
From: Andrew Cagney @ 2003-05-17 5:39 UTC (permalink / raw)
To: Nick Clifton; +Cc: Roland McGrath, binutils, gdb-patches
> Hi Roland,
>
>
>> But all of that is a bit of a digression. I'm posting now because I
>> don't quite know where to put this function even if its
>> implementation were perfect. It was by far easier to write it in
>> elfcode.h than to put it elsewhere. It gets to use the local helper
>> code there, and automagically define 32 and 64 bit flavors, which
>> keeps the code quite simple. It would be a lot more hair to put the
>> code elsewhere, copy the header swapping code, and do the 32/64
>> versions half as cleanly.
>>
>> The proper way to put it there is to make it yet another bfd target
>> vector function. I don't know what all rigmarole is involved in
>> doing that, and it seems like overkill for something probably only
>> ever used by gdb and only with ELF.
>
>
> Well since your function is creating a new bfd, one obvious place for
> it would be in opncls.c. Have you considered using an interface like
> the one provided by BFD_IN_MEMORY ? You could make the in-memory
> interface generic and provide different memory accessing functions
> (for local memory, remote memory, mmap'ed files, etc).
That sounds right. It lets BFD create a bfd using an arbitrary [memory]
read (or even memory map) function.
It may also open the way for mmaping the entire executable and then
creating a bfd pointing at that (assuming that isn't yet possible :-).
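One possible shape for such a generic interface, purely as a sketch: a reader is a function pointer plus an opaque context, so local memory, remote memory, and mmap'ed files can each supply their own pair. None of these names exist in BFD; struct demo_reader, struct demo_local_region, and the demo_* functions are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical reader interface: a function plus an opaque context.  */
struct demo_reader
{
  /* Copy LEN bytes at ADDR into BUF; return 0 or an errno code.  */
  int (*read) (void *ctx, unsigned long addr, void *buf, size_t len);
  void *ctx;
};

/* Context for a reader backed by a buffer in local memory.  */
struct demo_local_region
{
  const char *base;     /* Where the bytes actually live.  */
  unsigned long start;  /* Address the first byte pretends to have.  */
  size_t size;          /* Number of bytes available.  */
};

/* Reader implementation for a local buffer.  */
static int
demo_local_read (void *ctx, unsigned long addr, void *buf, size_t len)
{
  const struct demo_local_region *r = ctx;

  if (addr < r->start
      || addr - r->start > r->size
      || len > r->size - (addr - r->start))
    return EFAULT;
  memcpy (buf, r->base + (addr - r->start), len);
  return 0;
}

/* What a consumer would call, without knowing which kind of memory
   is behind the reader.  */
static int
demo_read (const struct demo_reader *rd, unsigned long addr,
           void *buf, size_t len)
{
  assert (rd != NULL && rd->read != NULL);
  return rd->read (rd->ctx, addr, buf, len);
}
```

The context pointer is the main difference from a bare callback with the signature of gdb's target_read_memory: it lets one process hold several readers at once without resorting to global state.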
> I do not think that treating this in-memory bfd as a new target is
> appropriate, since it is not really a new file format or instruction
> set architecture, but a new method of loading (part of) a binary file
> into a normal bfd structure.
Yes.
>> Can anyone offer advice on where this function ought to live? If it
>> doesn't live in the bfd elf backend, then I'll have to copy or
>> hand-integrate some sanity checking and byte-swapping code for ELF
>> headers.
>
>
> I am all for keeping things simple, so if it is easier to put it in
> elfcode.h then that is probably where it should live. Given the
> probably limited utility of this function it probably ought to be only
> conditionally defined, based on the particular configured target or
> maybe a configure time switch.
It could go in a separate file? That way, if *-pc-linux-gnu-gdb needs
it, it will link it in.
However, seriously, if bfd did have a generic and clean memory map
interface I think people would use it.
Andrew
PS: To explain what's, er, gone wrong
Recent Linux kernels have taken to including an entire ELF object in
the target process's address space (the object file comes from the
kernel and is not part of the file system). You'll see it referred to as
the vsyscall region. It appears that GDB is going to need to open this
in-target-memory (or in-core) object, as GDB needs things from it (such
as .eh_frame info) if GDB is to manage to unwind from a syscall.
Just in case you're getting any false impressions over the current
status of GDB, nothing has really been resolved. Right now, the kernel/core
file don't even provide the address of the above beastie. (It's been
suggested on gdb@ that the kernel provide it via /proc/PID/auxv and a
.note in the core file, but I don't know if it's been taken to the kernel
list.) Hmm, dig dig,
http://www.ussg.iu.edu/hypermail/linux/kernel/0305.1/2566.html
seems Roland is following it through but forgot to advise the GDB group.
* bfd function to read ELF file image from memory
@ 2003-05-14 1:20 Roland McGrath
0 siblings, 0 replies; 9+ messages in thread
From: Roland McGrath @ 2003-05-14 1:20 UTC (permalink / raw)
To: bfd, gdb-patches
This bfd patch adds a function that creates an in-memory BFD by reading
from process memory to read ELF headers and from that determine the size of
the whole file image to read out of the process. I am using this in gdb to
read the Linux vsyscall DSO image from a live process; the details of what
that is all about have been discussed on the gdb@sources.redhat.com list.
I wrote this as a BFD function that takes a callback function pointer
with the same signature as gdb's `target_read_memory'.
This function is not really all that generally useful as it is. It assumes
that the memory image simply maps the whole file image verbatim. That
works great for the vsyscall DSO. But for a normal DSO that has a data
segment, that is not how the memory looks. For those, this will create a
bfd that has bogus contents for the parts of the file after the end of the
text segment; the data segment will be bogus, and so will the section
headers because they appear at the end of the file (and perhaps not at all
in the memory image). Ultimately a more useful function would be one that
only tries to read the contents of the PT_LOAD segments and places their
contents appropriately in the bfd's in-memory file image. As things stand
in gdb we need the section headers for the vsyscall DSO, but it so happens
that its sole PT_LOAD segment when taken with alignment covers the whole
file image. So I may rewrite it that way.
But all of that is a bit of a digression. I'm posting now because I don't
quite know where to put this function even if its implementation were
perfect. It was by far easier to write it in elfcode.h than to put it
elsewhere. It gets to use the local helper code there, and automagically
define 32 and 64 bit flavors, which keeps the code quite simple. It would
be a lot more hair to put the code elsewhere, copy the header swapping
code, and do the 32/64 versions half as cleanly.
The proper way to put it there is to make it yet another bfd target vector
function. I don't know what all rigmarole is involved in doing that, and
it seems like overkill for something probably only ever used by gdb and
only with ELF.
For testing purposes, I have added a gdb command that makes use of this.
It's code that doesn't need to be target-specific except for by what name
to call this function. Right now I just have it kludged to do runtime
checks with bfd_get_flavour and bfd_get_arch_size, but that is code that
won't build properly if libbfd doesn't have some elf32 and elf64 targets
configured in.
Can anyone offer advice on where this function ought to live? If it
doesn't live in the bfd elf backend, then I'll have to copy or
hand-integrate some sanity checking and byte-swapping code for ELF headers.
Thanks,
Roland
--- elfcode.h.~1.41.~ Sat May 10 15:09:29 2003
+++ elfcode.h Tue May 13 17:49:15 2003
@@ -1568,6 +1568,201 @@ elf_symbol_flags (flags)
}
#endif
\f
+/* Create a new BFD as if by bfd_openr. Rather than opening a file, read
+ an ELF file image out of remote memory based on the ELF file header
+ found at EHDR_VMA. The function TARGET_READ_MEMORY is called to read
+ remote memory regions by VMA. TEMPL must be a BFD for a target with the
+ word size and byte order found in the remote memory. */
+
+bfd *
+NAME(bfd_elf,bfd_from_memory)
+ (bfd *templ, bfd_vma ehdr_vma,
+ int (*target_read_memory) (bfd_vma vma, char *myaddr, int len))
+{
+ Elf_External_Ehdr x_ehdr; /* Elf file header, external form */
+ Elf_Internal_Ehdr ehdr; /* Elf file header, internal form */
+ bfd *nbfd;
+ struct bfd_in_memory *bim;
+ int contents_size;
+ char *contents;
+ int err;
+
+ /* Read in the ELF header in external format. */
+ err = target_read_memory (ehdr_vma, (char *) &x_ehdr, sizeof x_ehdr);
+ if (err)
+ {
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+
+ /* Now check to see if we have a valid ELF file, and one that BFD can
+ make use of. The magic number must match, the address size ('class')
+ and byte-swapping must match our XVEC entry. */
+
+ if (! elf_file_p (&x_ehdr)
+ || x_ehdr.e_ident[EI_VERSION] != EV_CURRENT
+ || x_ehdr.e_ident[EI_CLASS] != ELFCLASS)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ /* Check that file's byte order matches xvec's */
+ switch (x_ehdr.e_ident[EI_DATA])
+ {
+ case ELFDATA2MSB: /* Big-endian */
+ if (! bfd_header_big_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATA2LSB: /* Little-endian */
+ if (! bfd_header_little_endian (templ))
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+ break;
+ case ELFDATANONE: /* No data encoding specified */
+ default: /* Unknown data encoding specified */
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ elf_swap_ehdr_in (templ, &x_ehdr, &ehdr);
+
+ /* The file header tells where to find the phdrs and section headers,
+ which tells us how big the file is. */
+
+ contents_size = ehdr.e_ehsize;
+#define BUMP(to) if (contents_size < (int) (to)) contents_size = (to)
+
+ BUMP (ehdr.e_phoff + ehdr.e_phnum * ehdr.e_phentsize);
+ if (ehdr.e_phentsize == sizeof (Elf_External_Phdr))
+ {
+ unsigned int i;
+ Elf_Internal_Phdr phdr;
+ Elf_External_Phdr *x_phdrs
+ = (Elf_External_Phdr *) bfd_malloc (ehdr.e_phnum * sizeof *x_phdrs);
+ if (x_phdrs == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ err = target_read_memory (ehdr_vma + ehdr.e_phoff,
+ (char *) x_phdrs,
+ ehdr.e_phnum * sizeof *x_phdrs);
+ if (err)
+ {
+ free (x_phdrs);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ for (i = 0; i < ehdr.e_phnum; ++i)
+ {
+ elf_swap_phdr_in (templ, &x_phdrs[i], &phdr);
+ BUMP (phdr.p_offset + phdr.p_filesz);
+ }
+ free (x_phdrs);
+ }
+ else if (ehdr.e_phentsize != 0 && ehdr.e_phnum != 0)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+ BUMP (ehdr.e_shoff + ehdr.e_shnum * ehdr.e_shentsize);
+ if (ehdr.e_shentsize == sizeof (Elf_External_Shdr))
+ {
+ unsigned int i;
+ Elf_Internal_Shdr shdr;
+ Elf_External_Shdr *x_shdrs
+ = (Elf_External_Shdr *) bfd_malloc (ehdr.e_shnum * sizeof *x_shdrs);
+ if (x_shdrs == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ err = target_read_memory (ehdr_vma + ehdr.e_shoff,
+ (char *) x_shdrs,
+ ehdr.e_shnum * sizeof *x_shdrs);
+ if (err)
+ {
+ free (x_shdrs);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+ for (i = 0; i < ehdr.e_shnum; ++i)
+ {
+ elf_swap_shdr_in (templ, &x_shdrs[i], &shdr);
+ BUMP (shdr.sh_offset
+ + ((shdr.sh_flags & SHF_ALLOC) ? shdr.sh_size : 0));
+ }
+ free (x_shdrs);
+ }
+ else if (ehdr.e_shentsize != 0 && ehdr.e_shnum != 0)
+ {
+ bfd_set_error (bfd_error_wrong_format);
+ return NULL;
+ }
+
+#undef BUMP
+
+ /* Now we know the size of the whole image we want read in. */
+
+ contents = (char *) bfd_malloc ((bfd_size_type) contents_size);
+ if (contents == NULL)
+ {
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ /* We already read the ELF header, no need to copy it in again. */
+ memcpy (contents, &x_ehdr, sizeof x_ehdr);
+ err = target_read_memory (ehdr_vma + sizeof x_ehdr, contents + sizeof x_ehdr,
+ contents_size - sizeof x_ehdr);
+ if (err)
+ {
+ free (contents);
+ bfd_set_error (bfd_error_system_call);
+ errno = err;
+ return NULL;
+ }
+
+ /* Now we have a memory image of the ELF file contents. Make a BFD. */
+ bim = ((struct bfd_in_memory *)
+ bfd_malloc ((bfd_size_type) sizeof (struct bfd_in_memory)));
+ if (bim == NULL)
+ {
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd = _bfd_new_bfd ();
+ if (nbfd == NULL)
+ {
+ free (bim);
+ free (contents);
+ bfd_set_error (bfd_error_no_memory);
+ return NULL;
+ }
+ nbfd->filename = "<in-memory>";
+ if (templ)
+ nbfd->xvec = templ->xvec;
+ bim->size = contents_size;
+ bim->buffer = contents;
+ nbfd->iostream = (PTR) bim;
+ nbfd->flags = BFD_IN_MEMORY;
+ nbfd->direction = read_direction;
+ nbfd->mtime = time (NULL);
+ nbfd->mtime_set = TRUE;
+
+ return nbfd;
+}
+\f
#include "elfcore.h"
#include "elflink.h"
\f
Thread overview: 9+ messages
2003-05-14 1:28 bfd function to read ELF file image from memory Roland McGrath
2003-05-16 11:24 ` Nick Clifton
2003-05-16 21:30 ` Roland McGrath
2003-05-19 10:29 ` Nick Clifton
2003-05-19 20:24 ` Roland McGrath
2003-05-19 20:30 ` Ian Lance Taylor
2003-05-19 20:46 ` Roland McGrath
2003-05-17 5:39 ` Andrew Cagney
-- strict thread matches above, loose matches on Subject: below --
2003-05-14 1:20 Roland McGrath