From: Tom Tromey
Reply-To: tromey@redhat.com
To: Andrew Cagney
Cc: gdb-patches@sources.redhat.com
Subject: Re: RFA: start of i18n
Date: Fri, 21 Jun 2002 10:49:00 -0000
Message-ID: <874rfwwlw1.fsf@fleche.redhat.com>
In-Reply-To: <3D1360ED.6030201@cygnus.com>
References: <87elf0wnp2.fsf@fleche.redhat.com> <3D1360ED.6030201@cygnus.com>

>>>>> "Andrew" == Andrew Cagney writes:

Andrew> I guess you're stepping up to take the bat on this one (a nice
Andrew> surprise).

Well, I'm not going to commit to it. But I thought I could make some
useful progress even if I can't do it all.

Andrew> Can you perhaps sketch out exactly what internationalizing
Andrew> GDB is going to entail? So people, including me, can
Andrew> understand what we can expect to be coming our way.

Sure, no problem.

There are many places in gdb that could be made i18n-aware.
This patch only addresses the start of one of them: translating the
messages printed by gdb into the user's native language. The idea
here is that "help foo" would print the help in German, or French, or
Farsi, according to the locale.

In the GNU world this is done with a message catalog tool called
gettext. Gettext works using source markup of the actual text strings
(some tools use tokens or numbers instead; the gettext manual has a
nice rationale for why its approach is superior).

There are basically 3 parts to gettextizing an application:

* The code infrastructure (this patch).

* The build infrastructure. This would be the omnipresent `po'
  directory, `.pot' files, and accompanying Makefile hacks. I haven't
  done this part for gdb yet. You can look in bfd/, gas/, etc, to see
  how this is done. (Actually I suppose there are several plausible
  ways to do it, but for gdb it seems easiest to follow what the rest
  of the binutils do.)

* Marking all the strings. This is the bulk of the work. The idea
  here is that whenever gdb prints a message:

    printf ("Hi\n");

  you must wrap the string so that (1) xgettext can extract the string
  and put it into the message catalog, and (2) at runtime the string
  will be translated:

    printf (_("Hi\n"));

  In gdb we'll be using the `gettext' function to do this
  translation. But it is typical in GNU programs to define `_' as a
  macro that expands to the `gettext' call, rather than writing
  `gettext' out each time. This is shorter and doesn't clutter the
  source as much. (My patch adds this macro.)

I haven't investigated this particular issue in gdb in too much depth.
So for instance I don't know how marking will or will not affect MI.

I do know, from the last time I looked at gettextizing gdb (that would
have been April 98), that parts of gdb aren't i18n-safe. For instance,
add_show_from_set is not. This means that as part of the work marking
strings, some code changes will have to be made.
(Or something else will have to be done; for instance you could
imagine an ugly implementation where we have a translated sed
expression that is run over the `set' help text. I don't know whether
that would actually work though.)

Exactly what changes will need to be made is hard to predict. If you
look at the ChangeLog-9899 for bfd/gas/ld/..., you can see what I had
to do for those. Generally speaking it isn't very much, just
reworking computed printf()s and stuff like that. For instance,
something like this isn't safe:

    printf ("No such %s %s\n", is_var ? "variable" : "macro", name);

That's because in other languages things may need different orderings,
or different endings in different contexts, etc. So here you'd
typically rewrite this as an `if' and two `printf's, each with its own
complete message string. There's a long section on this in the
gettext manual that explains it well.

Maintenance-wise there are also some things to be aware of. Only the
first one is difficult.

* When new strings are added, they must be marked. This is a coding
  style change and unfortunately isn't really automatable.

* You have to regenerate the primary message catalog whenever strings
  are added, changed, or deleted.

* You have to periodically send the master catalog to the appropriate
  site so that translation teams can work on it. I know some projects
  (Gnome) try to do a string freeze before a release so that the
  translation teams can catch up.

* You have to periodically incorporate new translation catalogs from
  the translation teams into the source repository.

A couple of other ways gdb could be made i18n-aware, plus my comments:

* gdb could be aware of the encoding of strings in the inferior, and
  convert to the correct output encoding when printing. In some
  situations this might be nice. I don't plan to look at this.

* In theory gdb commands could be translated. However, I think there
  is some consensus that in general translating commands and arguments
  is too difficult in practice.
  I don't think anybody is doing this, so for now I think gdb should
  not either. (For instance: program names and command-line arguments
  aren't translated, Emacs function names aren't translated.)

Tom