dvb-apps: charset support

Message ID 20110411174841.268990@gmx.net (mailing list archive)
State Not Applicable, archived

Commit Message

wk April 11, 2011, 5:48 p.m. UTC
  Hi Mauro,
 
> I added some patches to dvb-apps/util/scan.c in order to properly support
> EN 300 468 charsets.
> Before the patch, scan was producing invalid UTF-8 codes here for
> ISO-8859-15 charsets, as scan was simply filling the service/provider name
> with whatever non-control characters were there. So, if your computer uses
> the same charset as your service provider, you're lucky.
> Otherwise, invalid characters will appear in the scan tables.
> 
> After the changes, scan gets the locale environment charset and uses it as
> the output charset in the output files.

This implementation in scan expects the environment settings to be 'language_country.encoding', but I think the more general form is 'language_country.encoding@variant'.

I get the following error from scan, because iconv doesn't know 'ISO-8859-15@euro'.

<snip>
WARNING: Conversion from ISO-8859-9 to ISO-8859-15@euro not supported
WARNING: Conversion from ISO-8859-9 to ISO-8859-15@euro not supported
...
WARNING: Conversion from ISO-8859-15 to ISO-8859-15@euro not supported
WARNING: Conversion from ISO-8859-15 to ISO-8859-15@euro not supported
</snap>

I suggest changing scan.c as shown in the patch below.

This cuts the '@variant' part from charset, so that iconv can find the encoding.

cheers,
Winfried
  

Comments

Mauro Carvalho Chehab April 11, 2011, 6:24 p.m. UTC | #1
On 11-04-2011 14:48, handygewinnspiel@gmx.de wrote:
> Hi Mauro,
>  
>> I added some patches to dvb-apps/util/scan.c in order to properly support
>> EN 300 468 charsets.
>> Before the patch, scan was producing invalid UTF-8 codes here for
>> ISO-8859-15 charsets, as scan was simply filling the service/provider name
>> with whatever non-control characters were there. So, if your computer uses
>> the same charset as your service provider, you're lucky.
>> Otherwise, invalid characters will appear in the scan tables.
>>
>> After the changes, scan gets the locale environment charset and uses it as
>> the output charset in the output files.
> 
> This implementation in scan expects the environment settings to be 'language_country.encoding', but I think the more general form is 'language_country.encoding@variant'.
> 
> I get the following error from scan, because iconv doesn't know 'ISO-8859-15@euro'.

Ah, ok. I never saw such syntax. Thanks for pinging me about that!

> 
> <snip>
> WARNING: Conversion from ISO-8859-9 to ISO-8859-15@euro not supported
> WARNING: Conversion from ISO-8859-9 to ISO-8859-15@euro not supported
> ...
> WARNING: Conversion from ISO-8859-15 to ISO-8859-15@euro not supported
> WARNING: Conversion from ISO-8859-15 to ISO-8859-15@euro not supported
> </snap>
> 
> I suggest to change scan.c as follows:
> 
> --- dvb-apps-5e68946b0e0d_orig/util/scan/scan.c 2011-04-10 20:22:52.000000000 +0200
> +++ dvb-apps-5e68946b0e0d/util/scan/scan.c      2011-04-11 19:41:21.460000060 +0200
> @@ -2570,14 +2570,14 @@
>         if ((charset = getenv("LC_ALL")) ||
>             (charset = getenv("LC_CTYPE")) ||
>             (charset = getenv ("LANG"))) {
> -               while (*charset != '.' && *charset)
> -                       charset++;
> -               if (*charset == '.')
> -                       charset++;
> -               if (*charset)
> -                       output_charset = charset;
> -               else
> -                       output_charset = nl_langinfo(CODESET);
> +               // assuming 'language_country.encoding@variant'
> +               char * p;
> +
> +               if ((p = strchr(charset, '.')))
> +                       charset = p + 1;
> +               if ((p = strchr(charset, '@')))
> +                       *p = 0;
> +               output_charset = charset;

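This will fail if LANG=C: the value contains no '.', so charset is left pointing at
the whole string and output_charset ends up as "C" instead of a charset name.

Basically, if charset doesn't contain '.', this block should not set output_charset
from the environment. It should fall back to nl_langinfo(CODESET), as the old code did.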
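
One way to cover both points (strip '@variant', and fall back when there is no '.') is sketched below. This is only an illustration, not the code that went into scan.c: locale_codeset() and pick_output_charset() are hypothetical helper names, and the sketch assumes the caller supplies a buffer, so the string returned by getenv() is copied rather than modified in place.

```c
#include <assert.h>
#include <langinfo.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of scan.c): copy the encoding part of a
 * locale string of the form 'language_country.encoding@variant' into dst.
 * Returns 1 if a non-empty encoding was found, 0 otherwise (e.g. for "C").
 */
static int locale_codeset(const char *locale, char *dst, size_t dst_len)
{
	const char *dot, *end;
	size_t len;

	assert(dst != NULL && dst_len > 0);
	dst[0] = '\0';

	/* No '.' means no encoding part: "C", "POSIX", "de_DE", ... */
	if (!locale || !(dot = strchr(locale, '.')))
		return 0;

	dot++;				/* first character of the encoding */
	end = strchr(dot, '@');		/* optional '@variant' suffix */
	len = end ? (size_t)(end - dot) : strlen(dot);
	if (len == 0 || len >= dst_len)
		return 0;

	memcpy(dst, dot, len);
	dst[len] = '\0';
	return 1;
}

/*
 * Hypothetical wrapper: same lookup order as the patch (LC_ALL, LC_CTYPE,
 * LANG), falling back to nl_langinfo(CODESET) when the first variable
 * that is set carries no usable encoding.
 */
static const char *pick_output_charset(char *buf, size_t buf_len)
{
	const char *locale;

	if (((locale = getenv("LC_ALL")) ||
	     (locale = getenv("LC_CTYPE")) ||
	     (locale = getenv("LANG"))) &&
	    locale_codeset(locale, buf, buf_len))
		return buf;

	return nl_langinfo(CODESET);
}
```

With this shape, "de_DE.ISO-8859-15@euro" yields "ISO-8859-15", while "C" yields nothing from the environment and the nl_langinfo(CODESET) fallback is used instead.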


>         } else
>                 output_charset = nl_langinfo(CODESET);
> 
> 
> This cuts the '@variant' part from charset, so that iconv will find its way.
> 
> cheers,
> Winfried
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
  

Patch

--- dvb-apps-5e68946b0e0d_orig/util/scan/scan.c 2011-04-10 20:22:52.000000000 +0200
+++ dvb-apps-5e68946b0e0d/util/scan/scan.c      2011-04-11 19:41:21.460000060 +0200
@@ -2570,14 +2570,14 @@ 
        if ((charset = getenv("LC_ALL")) ||
            (charset = getenv("LC_CTYPE")) ||
            (charset = getenv ("LANG"))) {
-               while (*charset != '.' && *charset)
-                       charset++;
-               if (*charset == '.')
-                       charset++;
-               if (*charset)
-                       output_charset = charset;
-               else
-                       output_charset = nl_langinfo(CODESET);
+               // assuming 'language_country.encoding@variant'
+               char * p;
+
+               if ((p = strchr(charset, '.')))
+                       charset = p + 1;
+               if ((p = strchr(charset, '@')))
+                       *p = 0;
+               output_charset = charset;
        } else
                output_charset = nl_langinfo(CODESET);