From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 109148 invoked by alias); 19 Aug 2015 01:27:39 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Received: (qmail 109138 invoked by uid 89); 19 Aug 2015 01:27:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.2 required=5.0 tests=AWL,BAYES_50,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD,SPF_HELO_PASS autolearn=no version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Wed, 19 Aug 2015 01:27:36 +0000 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 0BCE0341AC4; Wed, 19 Aug 2015 01:27:34 +0000 (UTC) Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t7J1RVFT009445; Tue, 18 Aug 2015 21:27:32 -0400 Message-ID: <55D3DB83.4050204@redhat.com> Date: Wed, 19 Aug 2015 01:27:00 -0000 From: Pedro Alves User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Sandra Loosemore , Gary Benson CC: Joel Brobecker , Doug Evans , Jan Kratochvil , gdb-patches , =?windows-1252?Q?Andr=E9_P=F6?= =?windows-1252?Q?nitz?= , Paul Koning Subject: Re: [PATCH 0/2] Better handling of slow remote transfers References: <001a11c301b0388ac5051d0c5ab8@google.com> <20150811185519.GA28644@host1.jankratochvil.net> <20150811195943.GC22245@adacore.com> <20150812094831.GD11096@blade.nx> <20150814182648.GO22245@adacore.com> <55CE6AA3.8000300@codesourcery.com> <20150816184913.GA2998@adacore.com> <20150817085310.GC25320@blade.nx> <55D1EE96.9060202@codesourcery.com> <20150818095858.GB9815@blade.nx> <55D3625B.40101@codesourcery.com> In-Reply-To: <55D3625B.40101@codesourcery.com> Content-Type: multipart/mixed; boundary="------------090305060906070406000000" X-SW-Source: 2015-08/txt/msg00489.txt.bz2 This is a multi-part message in MIME format. --------------090305060906070406000000 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Content-length: 8879 On 08/18/2015 05:50 PM, Sandra Loosemore wrote: > I would be happy to try to help debug what is going wrong here, but I'm > not a GDB expert by any means and I'm totally unfamiliar with this part > of the code and don't know where to start. If somebody has a theory on > this, is there someplace I could put in printfs to trace what is > happening? From the previous discussion, it seemed like the flow > through the different layers of abstraction to what the RSP was actually > doing was very obscure, in terms of where the chunking is happening, etc. That was just a misunderstanding. It's not really obscure. GDB has a file io abstraction that wires the open/close/pread/pwrite syscalls to RSP packets. Callers just need to use the target_fileio_open/pread etc. functions instead of the usual syscalls. Whenever BFD or GDB accesses a file that is prefixed target: or remote:, the open/read/write requests go through those calls. For remote targets, those end up in remote.c:remote_hostio_pread, etc. The only chunking going on is the server trimming the pread results to the maximum packet size it can return to gdb. Both gdb and gdbserver have that maximum buffer size hardcoded at 16384 for historical reasons. >> Your PandaBoard takes 8 seconds. That doesn't seem so bad to me. >> If this Altera board is the only one with the massive slowdown then >> I don't think we should delay 7.10 any further on this issue--and I >> certainly don't think we should lose the functionality that the >> default sysroot of "target:" brings. > > FWIW, I have just tried on a random sampling of our MIPS target boards > as well, with the same test program and same scenario of continuing to a > breakpoint on main. The timings were 1:40 (SEAD-3 LX110 board, M14Kc > microMips), 1:38 (MALTA 74Kc MIPS16), and 1:20 (MALTA 5Kef MIPS64). I > could try some PowerPC boards as well if we need more datapoints. Thanks for doing this. As yet another data point, it'd be interesting to know whether Gary's rate-limit patch speeds up any other of these boards as well. Did you try getting the nfs mount out of the picture? That is, copy the debug-build files to the target, and have gdb pull them automatically. It's expected that reading a file out of remote filesystem is going to be slower than not reading it. Obviously some slowdown is expected. There's no free lunch. The fact that Gary's chunk size limiting patch made things much faster on the nios2 board is still mysterious to me. I'd expect the slowness to be latency bound, given the request/response nature of the RSP, but then I'd expect that more chunking would slow things down, not speed it up. I'm trying to build a more complete picture of the potential for trouble. The more data, the better... So tonight I played with this. To amplify latency issues against a remote target, I've built gdbserver on gcc76 on the gcc compile farm. Anyone can request access to the gcc compile farm machines and to reproduce this. That's an x86-64 in France almost 2000km away from me. I've got a ping time of ~85ms to that machine, and I'm behind a ~16Mbit ADSL. As an illustration, here's what it takes to download the gdbserver binary using ssh/scp. Not the fastest file transfer method, but the most convenient: $ time scp gcc76.fsffrance.org:/home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver gdbserver 100% 1451KB 1.4MB/s 00:01 real 0m1.296s user 0m0.015s sys 0m0.013s Now let's compare with GDB's file retrieval: [palves@gcc76] $ ./gdbserver --multi :9999 [palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org [palves@home] $ cat /tmp/remote-get define getfile shell date remote get /home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver shell date end [palves@home] $ ./gdb -data-directory=data-directory -ex "tar extended-rem :9999" -x /tmp/remote-get (gdb) getfile Detaching after fork from child process 31567. Wed Aug 19 01:04:13 WEST 2015 Detaching after fork from child process 31568. Wed Aug 19 01:04:23 WEST 2015 (gdb) ~10s GDB/vFile vs ~1.3s scp/ssh. Not that impressive. So then I tried bumping the RSP max packet buffer size. That required a patch to gdb and another to gdbserver. I bumped the buffer size 16 times. The result was: (gdb) getfile Wed Aug 19 01:13:01 WEST 2015 Wed Aug 19 01:13:04 WEST 2015 (gdb) getfile Wed Aug 19 01:13:12 WEST 2015 Wed Aug 19 01:13:14 WEST 2015 (gdb) getfile Wed Aug 19 01:13:19 WEST 2015 Wed Aug 19 01:13:21 WEST 2015 So 2s/3s. Not bad. Bumping the packet buffer size to 4MB, to make sure a single vFile:pread covers the ~1.5MB file, I consistently get ~1s: (gdb) getfile Wed Aug 19 01:17:52 WEST 2015 Wed Aug 19 01:17:53 WEST 2015 (gdb) getfile Wed Aug 19 01:17:55 WEST 2015 Wed Aug 19 01:17:56 WEST 2015 (gdb) getfile Wed Aug 19 01:17:57 WEST 2015 Wed Aug 19 01:17:58 WEST 2015 (gdb) which is then in the ball park of scp. Seems clear to me that for whole-file transfers, the bottleneck (on this use case) is the number of chunks / round trips. For file transfer, obviously a stream-based transfer, like scp's, that avoids roundtrip hiccups, wins. For the use case of connecting to gdbserver with "target remote", and having gdb fetch the binary out of gdbserver with vFile, I tried something else. This time, I'll simply debug gdbserver itself remotely, letting gdb fetch the gdbserver binary, and run to main. Once main is reached, quit: [palves@gcc76] $ ./gdbserver :9999 ./gdbserver [palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org [palves@home] time ./gdb -data-directory=data-directory -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit" Pristine gdb 7.10.50.20150816-cvs gets me: ... Remote debugging using :9999 Reading symbols from target:/home/palves/gdb/build/gdb/gdbserver/gdbserver...done. Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done. 0x00007ffff7ddd190 in ?? () from target:/lib64/ld-linux-x86-64.so.2 Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635. Continuing. Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635 3635 ../../../src/gdb/gdbserver/server.c: No such file or directory. /home/palves/gdb/build/gdb/gdbserver/gdbserver: No such file or directory. real 0m55.315s user 0m0.400s sys 0m0.087s Bumping the remote max packet buffer size to (16384*16) helps a bit, though not impressive: real 0m51.600s user 0m0.404s sys 0m0.085s Bumping to 0x100000 helps a bit further: real 0m48.335s user 0m0.421s sys 0m0.091s Looking at "set debug remote 1" output, I noticed that gdb often issues a sequence of small vFile:pread requests for the same file. Ctrl-c'ing gdb while it was doing that and getting a backtrace shows that those requests often came from bfd. So I thought that maybe adding a readahead buffer/cache to gdb's vFile:pread requests could help. And indeed, that dropped the time for the same use case further down to: real 0m29.055s user 0m2.625s sys 0m0.090s I added a few counters to show cache hit/miss, and got: readahead cache miss 44 readahead cache hit 377 Not sure exactly what makes the "user" time go up that much (maybe reading too much ahead causes too much RSP escaping, etc.?) If I repeat the same leaving gdbserver's maximum buffer size set at the original 16384, I still get <30s: real 0m27.169s user 0m0.464s sys 0m0.061s though notice the user time goes down again. Maybe this indicates that bigger chunk sizes indeed increase CPU usage, which on slower boards would have a more profound effect than on the x86_64 machines I'm using. An observation in line with Sandra's observation of Gary's patch speeding up the nios2 board. Or maybe that's just a coincidence. BTW, the transfers seem to be always interruptible for me, even without Gary's patch, and even the slow ones. And finally, here's the time using a local sysroot and specifying a local program: $ time $g -ex "set sysroot /" -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit" /tmp/gdbserver Reading symbols from /tmp/gdbserver...done. Remote debugging using :9999 Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.18.so.debug...done. done. 0x00007ffff7ddd190 in ?? () Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635. Continuing. Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635 3635 ../../../src/gdb/gdbserver/server.c: No such file or directory. real 0m7.385s user 0m0.132s sys 0m0.028s I'm attaching the 3 patches that I used. I wouldn't be surprised if they contain obvious blunders; it's getting quite late here... Thanks, Pedro Alves --------------090305060906070406000000 Content-Type: text/x-patch; name="0001-gdb-remove-packet-size-limit.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0001-gdb-remove-packet-size-limit.patch" Content-length: 4014 >From 5fec4911ec7cd9ea2ee4b5d3be0502e7a7df1f9f Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Tue, 18 Aug 2015 22:53:56 +0100 Subject: [PATCH 1/3] gdb: remove packet size limit --- gdb/remote.c | 36 +++++++++++++++--------------------- 1 file changed, 15 insertions(+), 21 deletions(-) diff --git a/gdb/remote.c b/gdb/remote.c index ca1f0df..a839adf 100644 --- a/gdb/remote.c +++ b/gdb/remote.c @@ -894,14 +894,6 @@ get_memory_packet_size (struct memory_packet_config *config) struct remote_state *rs = get_remote_state (); struct remote_arch_state *rsa = get_remote_arch_state (); - /* NOTE: The somewhat arbitrary 16k comes from the knowledge (folk - law?) that some hosts don't cope very well with large alloca() - calls. Eventually the alloca() code will be replaced by calls to - xmalloc() and make_cleanups() allowing this restriction to either - be lifted or removed. */ -#ifndef MAX_REMOTE_PACKET_SIZE -#define MAX_REMOTE_PACKET_SIZE 16384 -#endif /* NOTE: 20 ensures we can write at least one byte. */ #ifndef MIN_REMOTE_PACKET_SIZE #define MIN_REMOTE_PACKET_SIZE 20 @@ -910,7 +902,7 @@ get_memory_packet_size (struct memory_packet_config *config) if (config->fixed_p) { if (config->size <= 0) - what_they_get = MAX_REMOTE_PACKET_SIZE; + what_they_get = 16384; else what_they_get = config->size; } @@ -929,8 +921,6 @@ get_memory_packet_size (struct memory_packet_config *config) && what_they_get > rsa->actual_register_packet_size) what_they_get = rsa->actual_register_packet_size; } - if (what_they_get > MAX_REMOTE_PACKET_SIZE) - what_they_get = MAX_REMOTE_PACKET_SIZE; if (what_they_get < MIN_REMOTE_PACKET_SIZE) what_they_get = MIN_REMOTE_PACKET_SIZE; @@ -3911,6 +3901,7 @@ remote_check_symbols (void) char *msg, *reply, *tmp; struct bound_minimal_symbol sym; int end; + struct cleanup *old_chain; /* The remote side has no concept of inferiors that aren't running yet, it only knows about running processes. If we're connected @@ -3929,7 +3920,8 @@ remote_check_symbols (void) /* Allocate a message buffer. We can't reuse the input buffer in RS, because we need both at the same time. */ - msg = alloca (get_remote_packet_size ()); + msg = xmalloc (get_remote_packet_size ()); + old_chain = make_cleanup (xfree, msg); /* Invite target to request symbol lookups. */ @@ -3967,6 +3959,8 @@ remote_check_symbols (void) getpkt (&rs->buf, &rs->buf_size, 0); reply = rs->buf; } + + do_cleanups (old_chain); } static struct serial * @@ -4089,13 +4083,6 @@ remote_packet_size (const struct protocol_feature *feature, return; } - if (packet_size > MAX_REMOTE_PACKET_SIZE) - { - warning (_("limiting remote suggested packet size (%d bytes) to %d"), - packet_size, MAX_REMOTE_PACKET_SIZE); - packet_size = MAX_REMOTE_PACKET_SIZE; - } - /* Record the new maximum packet size. */ rs->explicit_packet_size = packet_size; } @@ -7645,7 +7632,8 @@ putpkt_binary (const char *buf, int cnt) struct remote_state *rs = get_remote_state (); int i; unsigned char csum = 0; - char *buf2 = alloca (cnt + 6); + char *buf2 = xmalloc (cnt + 6); + struct cleanup *old_chain = make_cleanup (xfree, buf2); int ch; int tcount = 0; @@ -7738,6 +7726,7 @@ putpkt_binary (const char *buf, int cnt) case '+': if (remote_debug) fprintf_unfiltered (gdb_stdlog, "Ack\n"); + do_cleanups (old_chain); return 1; case '-': if (remote_debug) @@ -7746,7 +7735,10 @@ putpkt_binary (const char *buf, int cnt) case SERIAL_TIMEOUT: tcount++; if (tcount > 3) - return 0; + { + do_cleanups (old_chain); + return 0; + } break; /* Retransmit buffer. */ case '$': { @@ -7833,6 +7825,8 @@ putpkt_binary (const char *buf, int cnt) } #endif } + + do_cleanups (old_chain); return 0; } -- 1.9.3 --------------090305060906070406000000 Content-Type: text/x-patch; name="0002-gdbserver-bump-max-packet-buffer-size.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="0002-gdbserver-bump-max-packet-buffer-size.patch" Content-length: 935 >From 0becb8d88cc5e364ce413258de32d326c6395ad6 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Wed, 19 Aug 2015 00:04:18 +0200 Subject: [PATCH 2/3] gdbserver: bump max packet buffer size --- gdb/gdbserver/server.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/gdb/gdbserver/server.h b/gdb/gdbserver/server.h index 9080151..9ec3708 100644 --- a/gdb/gdbserver/server.h +++ b/gdb/gdbserver/server.h @@ -127,6 +127,8 @@ extern int handle_target_event (int err, gdb_client_data client_data); /* Buffer sizes for transferring memory, registers, etc. Set to a constant value to accomodate multiple register formats. This value must be at least as large as the largest register set supported by gdbserver. */ -#define PBUFSIZ 16384 - +//#define PBUFSIZ (16384) +#define PBUFSIZ (16384*16) +//#define PBUFSIZ 0x100000 +//#define PBUFSIZ 0x400000 #endif /* SERVER_H */ -- 1.9.3 --------------090305060906070406000000 Content-Type: text/x-patch; name="0003-add-readahead-cache-to-gdb-s-vFile-pread.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename*0="0003-add-readahead-cache-to-gdb-s-vFile-pread.patch" Content-length: 4512 >From d426ce0a36830378a8ec2e2cbdd94d9fcb47b891 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Tue, 18 Aug 2015 23:27:20 +0100 Subject: [PATCH 3/3] add readahead cache to gdb's vFile:pread --- gdb/remote.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 99 insertions(+), 3 deletions(-) diff --git a/gdb/remote.c b/gdb/remote.c index a839adf..8a739c8 100644 --- a/gdb/remote.c +++ b/gdb/remote.c @@ -10311,6 +10311,26 @@ remote_hostio_send_command (int command_bytes, int which_packet, return ret; } +struct readahead_cache +{ + int fd; + gdb_byte *buf; + ULONGEST offset; + size_t bufsize; +}; + +static struct readahead_cache *readahead_cache; + +static void +readahead_cache_invalidate (void) +{ + if (readahead_cache != NULL) + readahead_cache->fd = -1; +} + +static unsigned int readahead_cache_hit_count; +static unsigned int readahead_cache_miss_count; + /* Set the filesystem remote_hostio functions that take FILENAME arguments will use. Return 0 on success, or -1 if an error occurs (and set *REMOTE_ERRNO). */ @@ -10325,6 +10345,8 @@ remote_hostio_set_filesystem (struct inferior *inf, int *remote_errno) char arg[9]; int ret; + readahead_cache_invalidate (); + if (packet_support (PACKET_vFile_setfs) == PACKET_DISABLE) return 0; @@ -10389,6 +10411,9 @@ remote_hostio_pwrite (struct target_ops *self, int left = get_remote_packet_size (); int out_len; + if (readahead_cache != NULL && readahead_cache->fd == fd) + readahead_cache_invalidate (); + remote_buffer_add_string (&p, &left, "vFile:pwrite:"); remote_buffer_add_int (&p, &left, fd); @@ -10407,9 +10432,9 @@ remote_hostio_pwrite (struct target_ops *self, /* Implementation of to_fileio_pread. */ static int -remote_hostio_pread (struct target_ops *self, - int fd, gdb_byte *read_buf, int len, - ULONGEST offset, int *remote_errno) +remote_hostio_pread_1 (struct target_ops *self, + int fd, gdb_byte *read_buf, int len, + ULONGEST offset, int *remote_errno) { struct remote_state *rs = get_remote_state (); char *p = rs->buf; @@ -10443,6 +10468,74 @@ remote_hostio_pread (struct target_ops *self, return ret; } +static int +remote_hostio_pread_from_cache (struct readahead_cache *cache, + int fd, gdb_byte *read_buf, int len, + ULONGEST offset) +{ + if (cache->fd == fd + && cache->offset <= offset + && offset < cache->offset + cache->bufsize) + { + ULONGEST max = cache->offset + cache->bufsize; + + if (offset + len > max) + len = max - offset; + + if (remote_debug) + fprintf_unfiltered (gdb_stdlog, + "readahead cache hit %d\n", + ++readahead_cache_hit_count); + memcpy (read_buf, cache->buf + offset - cache->offset, len); + return len; + } + + return -1; +} + +static int +remote_hostio_pread (struct target_ops *self, + int fd, gdb_byte *read_buf, int len, + ULONGEST offset, int *remote_errno) +{ + int ret; + + if (readahead_cache != NULL) + { + ret = remote_hostio_pread_from_cache (readahead_cache, fd, + read_buf, len, offset); + if (ret >= 0) + return ret; + + readahead_cache_invalidate (); + } + + if (remote_debug) + fprintf_unfiltered (gdb_stdlog, "readahead cache miss %d\n", + ++readahead_cache_miss_count); + + if (readahead_cache == NULL) + readahead_cache = XCNEW (struct readahead_cache); + readahead_cache->fd = fd; + readahead_cache->offset = offset; + readahead_cache->bufsize = get_remote_packet_size (); + readahead_cache->buf = xrealloc (readahead_cache->buf, readahead_cache->bufsize); + + ret = remote_hostio_pread_1 (self, + readahead_cache->fd, + readahead_cache->buf, readahead_cache->bufsize, + readahead_cache->offset, remote_errno); + if (ret <= 0) + { + readahead_cache_invalidate (); + return ret; + } + + readahead_cache->bufsize = ret; + return remote_hostio_pread_from_cache (readahead_cache, fd, + read_buf, len, offset); +} + /* Implementation of to_fileio_close. */ static int @@ -10452,6 +10545,9 @@ remote_hostio_close (struct target_ops *self, int fd, int *remote_errno) char *p = rs->buf; int left = get_remote_packet_size () - 1; + if (readahead_cache != NULL && readahead_cache->fd == fd) + readahead_cache_invalidate (); + remote_buffer_add_string (&p, &left, "vFile:close:"); remote_buffer_add_int (&p, &left, fd); -- 1.9.3 --------------090305060906070406000000--