improving i18n-to-gettext.pl
Commit Message
Hi there!
This patch should improve i18n-to-gettext.pl so that it works with all plugins:
1. It also checks i18n.h, as some plugin writers seem to put their
translations there.
2. It also strips // comments on lines that do not end with ",".
3. It strips /* ... */ comments contained in a single line.
4. It strips /* ... */ comments that span multiple lines.
5. It simply ignores #if and #endif lines (as used to make plugins compile
with older VDR versions that supported fewer languages).
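The comment and preprocessor handling in points 2 to 5 can be sketched roughly as follows. This is only an illustration in Python, not the Perl code from the patch; the function name strip_comments is hypothetical, and the sketch assumes that "//" and "/*" never occur inside string literals or inside each other.

```python
import re

def strip_comments(lines):
    """Yield source lines with C/C++ comments removed.

    Blank lines and preprocessor lines starting with #if... or #endif
    are skipped. Assumption: comment markers never appear inside strings.
    """
    in_comment = False
    for line in lines:
        line = line.rstrip("\n").replace("\t", " ")
        if in_comment:
            end = line.find("*/")
            if end < 0:
                continue                      # still inside a multi-line comment
            line = line[end + 2:]             # keep only the text after "*/"
            in_comment = False
        line = re.sub(r"/\*.*?\*/", "", line)  # comments closed on the same line
        start = line.find("/*")
        if start >= 0:                        # comment continues on later lines
            line = line[:start]
            in_comment = True
        line = re.sub(r" *//.*", "", line)    # trailing // comments
        line = line.rstrip()
        if not line.strip():
            continue                          # nothing left on this line
        if re.match(r" *#\s*(if|endif)", line):
            continue                          # matches #if, #ifdef, #ifndef, #endif
        yield line
```

Unlike a single if/elsif chain, this variant also copes with a line that closes one comment and still carries text afterwards.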
General question: Why is the locale dir called de_DE and not just de, as
most other programs on my system seem to do?
Matthias
Comments
Hi Matthias,
Matthias Schwarzott wrote:
> General question: Why is the locale dir called de_DE and not just de - as that
> seems what most other programs on my system do?
Because German is spoken in more than one country: de_DE, de_AT and, I
think, de_CH and more. I don't have a list of all locales at hand right now.
Best regards
Matthias
On Mittwoch, 15. August 2007, Matthias Fechner wrote:
> Hi Matthias,
>
> Matthias Schwarzott wrote:
> > General question: Why is the locale dir called de_DE and not just de - as
> > that seems what most other programs on my system do?
>
> because German is spoken in more then one country: de_DE, de_AT and I
> think de_CH and more. I havn't not list with all locales here now.
>
Yeah, German is spoken in other countries. Is there a reason, then, to restrict
the translation to Germany?
An example:
wget installs the file /usr/share/locale/de/LC_MESSAGES/wget.mo
This provides translations for "all" de* locales: not just the German
one, but also the Austrian and Swiss ones.
Matthias
On 08/15/07 14:02, Matthias Schwarzott wrote:
> On Mittwoch, 15. August 2007, Matthias Fechner wrote:
>> Hi Matthias,
>>
>> Matthias Schwarzott wrote:
>>> General question: Why is the locale dir called de_DE and not just de - as
>>> that seems what most other programs on my system do?
>> because German is spoken in more then one country: de_DE, de_AT and I
>> think de_CH and more. I havn't not list with all locales here now.
>>
>
> Yeah, german is spoken in other countries. Is there then a reason to restrict
> the translation to germany?
>
> some example:
> wget installs the file /usr/share/locale/de/LC_MESSAGES/wget.mo
> this is to provide translations for "all" de* locales. Not just the german
> one, but also for austria and swiss.
I just tried renaming VDR's "de_DE" locale to "de" and did
LC_ALL=de_AT ./vdr
but it came up with the default English texts. Then I renamed
"de" to "de_AT" and did the same again, and I got the German texts.
I was hoping that gettext would be a little more intelligent and
look for
- an exact match ("de_AT")
- a default ("de")
- any suitable language ("de_DE")
but apparently that's not the case - unless I'm doing something wrong.
Klaus
On Mittwoch, 15. August 2007, Klaus Schmidinger wrote:
> On 08/15/07 14:02, Matthias Schwarzott wrote:
> > On Mittwoch, 15. August 2007, Matthias Fechner wrote:
> >> Hi Matthias,
> >>
> >>
> >> because German is spoken in more then one country: de_DE, de_AT and I
> >> think de_CH and more. I havn't not list with all locales here now.
> >
> > Yeah, german is spoken in other countries. Is there then a reason to
> > restrict the translation to germany?
> >
> > some example:
> > wget installs the file /usr/share/locale/de/LC_MESSAGES/wget.mo
> > this is to provide translations for "all" de* locales. Not just the
> > german one, but also for austria and swiss.
>
> I just tried renaming VDR's "de_DE" locale to "de" and did
>
> LC_ALL=de_AT ./vdr
>
This will work, but only if the de_AT locale you set actually exists (i.e. it
appears in the output of locale -a).
> but it came up with the default English texts. Then I renamed
> "de" to "de_AT" and did the same again, and I got the German texts.
>
> I was hoping that gettext would be a little more intelligent and
> look for
>
> - an exact match ("de_AT")
> - a default ("de")
> - any suitable language ("de_DE")
I think it does this, except for the "any suitable language" step.
Trying it with ls:
# LC_ALL=de_DE strace ls xxx
open("/usr/share/locale/de_DE.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/de_DE/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/de.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/usr/share/locale/de/LC_MESSAGES/coreutils.mo", O_RDONLY) = 3
You can see that gettext does this:
1. It tries the configured locale with different charsets (de_DE.utf8, de_DE).
2. It strips the country and tries the bare language with different charsets
(de.utf8, de).
The only condition is that the locale LC_MESSAGES is set to must exist.
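To make the order explicit, here is a small Python sketch that reproduces only the sequence of paths visible in the strace output above. It is not gettext's actual implementation, which covers more cases (modifiers, aliases, the LANGUAGE variable); the function name catalog_candidates and its parameters are made up for illustration.

```python
def catalog_candidates(locale_name, domain,
                       base="/usr/share/locale", charset="utf8"):
    """Return the catalog paths in the order seen in the trace.

    Assumption: only the "language_COUNTRY" form is handled; a charset
    suffix such as ".utf8" on the input is dropped before building names.
    """
    full = locale_name.split(".")[0]          # e.g. "de_DE"
    names = [full]
    if "_" in full:
        names.append(full.split("_")[0])      # language only, e.g. "de"
    paths = []
    for name in names:
        # Each name is tried with the charset first, then without it.
        for variant in (f"{name}.{charset}", name):
            paths.append(f"{base}/{variant}/LC_MESSAGES/{domain}.mo")
    return paths


for path in catalog_candidates("de_DE", "coreutils"):
    print(path)
```

Note that there is no step that falls back from de_AT to de_DE; that matches the "any suitable language" case not being covered.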
Now my tests: I created de_DE and de_AT locales on my system, but not de_CH.
# LC_ALL=de_DE ls zzzz
ls: Zugriff auf zzzz nicht möglich: Datei oder Verzeichnis nicht gefunden
# LC_ALL=de_AT ls zzzz
ls: Zugriff auf zzzz nicht möglich: Datei oder Verzeichnis nicht gefunden
# LC_ALL=de_CH ls zzzz
ls: cannot access zzzz: No such file or directory
# LC_ALL=de ls zzzz
ls: cannot access zzzz: No such file or directory
VDR does not work with a directory called de for the same reason that
LC_ALL=de does not work:
there is no locale called de, even if the directory is called de.
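Whether a given locale name is usable can also be checked programmatically instead of reading the output of locale -a. The following is a minimal sketch in Python, assuming a POSIX system where LC_MESSAGES is available; the helper name locale_exists is hypothetical.

```python
import locale

def locale_exists(name):
    """Return True if setlocale() accepts `name` for LC_MESSAGES.

    The previous setting is restored afterwards, so the check has no
    lasting effect on the process.
    """
    saved = locale.setlocale(locale.LC_MESSAGES)   # query the current setting
    try:
        locale.setlocale(locale.LC_MESSAGES, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_MESSAGES, saved)


for name in ("de_DE", "de_AT", "de_CH", "de"):
    print(name, locale_exists(name))
```

The result depends on which locales are generated on the machine, and some C libraries accept names that others reject, so the output is not shown here.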
Matthias
On 08/15/07 15:07, Matthias Schwarzott wrote:
> On Mittwoch, 15. August 2007, Klaus Schmidinger wrote:
>> On 08/15/07 14:02, Matthias Schwarzott wrote:
>>> On Mittwoch, 15. August 2007, Matthias Fechner wrote:
>>>> Hi Matthias,
>>>>
>>>>
>>>> because German is spoken in more then one country: de_DE, de_AT and I
>>>> think de_CH and more. I havn't not list with all locales here now.
>>> Yeah, german is spoken in other countries. Is there then a reason to
>>> restrict the translation to germany?
>>>
>>> some example:
>>> wget installs the file /usr/share/locale/de/LC_MESSAGES/wget.mo
>>> this is to provide translations for "all" de* locales. Not just the
>>> german one, but also for austria and swiss.
>> I just tried renaming VDR's "de_DE" locale to "de" and did
>>
>> LC_ALL=de_AT ./vdr
>>
> This will work, but only if the locale de_AT you set does exist (being in
> output of locale -a).
>> but it came up with the default English texts. Then I renamed
>> "de" to "de_AT" and did the same again, and I got the German texts.
>>
>> I was hoping that gettext would be a little more intelligent and
>> look for
>>
>> - an exact match ("de_AT")
>> - a default ("de")
>> - any suitable language ("de_DE")
>
> I think it does this but not doing "any suitable language".
>
> trying it with ls:
> # LC_ALL=de_DE strace ls xxx
>
> open("/usr/share/locale/de_DE.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/de_DE/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT
> (No such file or directory)
> open("/usr/share/locale/de.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1
> ENOENT (No such file or directory)
> open("/usr/share/locale/de/LC_MESSAGES/coreutils.mo", O_RDONLY) = 3
>
> You can see that gettext does this:
> 1. Trying the set locale with some different charsets (de_DE.utf8, de_DE)
> 2. stripping of country and trying language with different charsets (de.utf8,
> de).
>
> Only condition is that the locale one sets LC_MESSAGES to must exist.
>
>
> Now my tests: I created de_DE and de_AT locales on my system, but not de_CH.
>
> # LC_ALL=de_DE ls zzzz
> ls: Zugriff auf zzzz nicht möglich: Datei oder Verzeichnis nicht gefunden
> # LC_ALL=de_AT ls zzzz
> ls: Zugriff auf zzzz nicht möglich: Datei oder Verzeichnis nicht gefunden
> # LC_ALL=de_CH ls zzzz
> ls: cannot access zzzz: No such file or directory
>
> # LC_ALL=de ls zzz
> ls: cannot access zzzz: No such file or directory
>
> The reason vdr does not work with directory called de is the same as LC_ALL=de
> will not work.
> There is no locale called de even if the directory is called de.
Well, if setlocale() can only be called with something like "de_DE"
and not just "de", then it is clear that VDR's locale directories
must be named "de_DE" etc., because VDR uses these names to call
setlocale() when switching the language during runtime.
I'll add the missing intelligence to I18nInitialize...
Klaus
===================================================================
@@ -75,7 +75,7 @@ die "can't find plugin name!" unless ($P
# Locate the file containing the texts:
$I18NFILE = "";
-for ("i18n.c", `ls *.c`) { # try i18n.c explicitly first
+for ("i18n.c", "i18n.h", `ls *.c`) { # try i18n.c explicitly first
chomp($f = $_);
if (-f $f && `grep tI18nPhrase $f`) {
$I18NFILE = $f;
@@ -204,13 +204,26 @@ $POTFILE = "$PODIR/$PLUGIN.pot";
# Collect all translated texts:
open(F, $I18NFILE) || die "$I18NFILE: $!\n";
+$in_comment=0;
while (<F>) {
chomp;
s/\t/ /g; # get rid of tabs
s/ *$//; # get rid of trailing blanks
s/^ *\/\/.*//; # remove comment lines
- s/, *\/\/.*/,/; # strip trailing comments
+ s/ *\/\/.*//; # strip trailing comments
+ s/\/\*.*\*\///g; # strip c comments
+ if (/\/\*/) {
+ $in_comment=1;
+ s/\/\*.*$//; # remove starting of comment
+ } elsif (/\*\//) {
+ $in_comment=0;
+ s/^.*\*\///; # remove end of comment
+ } elsif ($in_comment) {
+ next;
+ }
next if (/^ *$/); # skip empty lines
+ next if (/#if/);
+ next if (/#endif/);
next unless ($found or $found = /const *tI18nPhrase .*{/); # sync on phrases
next if (/const *tI18nPhrase .*{/); # skip sync line
last if (/{ *NULL *}/); # stop after last phrase