[Templates] UTF8 support and issues

Mark Proctor m.proctor@bigfoot.com
Tue, 19 Nov 2002 17:27:24 -0000


I've just read that CGI variables default to ISO-8859-1. I don't know if
this is true, but if it is, could the problem be the combining of
Latin-1 CGI variables with UTF-8 variables from XML::Simple?

Under Perl 5.6.1, assuming that all browsers are IE 5.5+ with
charset=UTF-8, how can I ensure that all CGI variables are UTF-8 flagged
as well? Sorry if I'm going a little off topic here.
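
One way to get flagged values is to decode each parameter explicitly. The following is only a minimal sketch, and it assumes Perl 5.8 or later, where the core Encode module is available (it is not bundled with 5.6.1). The helper name decode_param is hypothetical, and the raw value is hard-coded here to stand in for what a browser would submit from a charset=UTF-8 page.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw UTF-8 bytes (as submitted by a browser
# from a charset=UTF-8 page) into a Perl character string.
sub decode_param {
    my ($bytes) = @_;
    return decode('UTF-8', $bytes);
}

# Stand-in for a CGI parameter value: "Léon" as five raw UTF-8 bytes.
my $raw  = "L\xC3\xA9on";
my $text = decode_param($raw);

# $raw is 5 bytes long; $text is 4 characters long and UTF-8 flagged.
printf "%d bytes -> %d characters, flag %s\n",
    length($raw), length($text), utf8::is_utf8($text) ? 'on' : 'off';
```

In a real script the same helper would be applied to every value returned by the CGI layer before it is combined with other text.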

Mark

-----Original Message-----
From: templates-admin@template-toolkit.org
[mailto:templates-admin@template-toolkit.org] On Behalf Of Mark Proctor
Sent: 19 November 2002 15:47
To: templates@template-toolkit.org
Subject: RE: [Templates] UTF8 support and issues


I've managed to prove it's nothing to do with Template Toolkit, by
replicating the problem outside of Template Toolkit. It actually happens
when you concatenate a string derived from a file read in by XML::Simple
and a CGI variable.

I'm still trying to ascertain why this happens; I thought XML was
supposed to default to UTF-8.
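
The effect can be sketched without XML::Simple or CGI at all. This is an illustration under assumptions, not a diagnosis of the actual script: it assumes Perl 5.8+ with the core Encode module, that the XML-derived value is a decoded character string, and that the CGI value is still raw UTF-8 bytes. The variable names are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed: an XML parser hands back a decoded character string.
my $from_xml = decode('UTF-8', "L\xC3\xA9on");
# Assumed: the CGI value is still raw, undecoded UTF-8 bytes.
my $from_cgi = "Andr\xC3\xA9";

# Concatenation upgrades the byte string as if it were Latin-1,
# so its two UTF-8 bytes become two separate characters.
my $mixed = "$from_xml and $from_cgi";
my $bad   = encode('UTF-8', $mixed);

# Decoding the CGI value first keeps both sides consistent.
my $fixed = encode('UTF-8', $from_xml . ' and ' . decode('UTF-8', $from_cgi));

print unpack('H*', $bad),   "\n";   # the second e-acute comes out as c383c2a9
print unpack('H*', $fixed), "\n";   # both e-acutes come out as c3a9
```

The double-encoded sequence in the first line of output is the kind of corruption that shows up when flagged and unflagged strings are mixed.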

Mark

-----Original Message-----
From: templates-admin@template-toolkit.org
[mailto:templates-admin@template-toolkit.org] On Behalf Of Richard
Tietjen
Sent: 19 November 2002 14:50
To: templates@template-toolkit.org
Subject: Re: [Templates] UTF8 support and issues


> Message: 2
> Date: Tue, 19 Nov 2002 13:04:53 +0000
> From: Andy Wardley <abw@andywardley.com>
> To: Mark Proctor <m.proctor@bigfoot.com>
> Cc: templates@template-toolkit.org,
> 	"Leslie Fuller (lefuller)" <lefuller@cisco.com>
> Subject: Re: [Templates] UTF8 support and issues

> Mark Proctor wrote:
> > Is there something I need to do to tell template toolkit to use utf8?
> > Will upgrading to the latest version fix this? We have:

...

> I haven't been able to reliably reproduce the problem.  For example,
> this test works fine for me under 5.6.1 with TT 2.08c.

>   use strict;
>   use Template;

>   my $leon = 'Léon Brocard';

I don't think 'Léon' (with é as the single byte \xE9) is an example of
UTF-8 encoding, but rather of ISO-8859-1 encoding.

In UTF-8 an &eacute; would be represented with a two-byte sequence:
\xC3 \xA9.
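
The difference between the two encodings can be checked directly. A small sketch, assuming Perl 5.8+ with the core Encode module (not available on 5.6.1):

```perl
use strict;
use warnings;
use Encode qw(encode);

# The character e-acute, U+00E9.
my $e_acute = "\x{E9}";

# Encode the same character both ways and look at the resulting bytes.
my $latin1 = encode('ISO-8859-1', $e_acute);
my $utf8   = encode('UTF-8',      $e_acute);

print 'ISO-8859-1: ', unpack('H*', $latin1), "\n";   # e9
print 'UTF-8:      ', unpack('H*', $utf8),   "\n";   # c3a9
```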

In perl 5.6 I think you'd also need to

  use utf8;

to turn it on.
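
For what it's worth, in current Perls the utf8 pragma only declares that the source file itself is UTF-8 encoded, so it affects literals written in the script rather than data read from elsewhere. The sketch below assumes a modern Perl and assumes the script file is saved as UTF-8; it does not claim to show 5.6 behaviour.

```perl
use strict;
use warnings;

# Without the pragma the literal is taken as raw bytes
# (five bytes when this file is saved as UTF-8).
my $as_bytes = do { 'Léon' };

# With the pragma (lexically scoped) the same literal is four characters.
my $as_chars = do { use utf8; 'Léon' };

printf "without pragma: %d, with pragma: %d\n",
    length($as_bytes), length($as_chars);
```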

It's confusing, and the only thing I can offer regarding the actual
problem is that I only see UTF-8 data when I use XML data, and that
XML::DOM's toString() turns double-byte characters into &#xxxx;
numeric character references, I think.  Maybe there's a clue or hint
of a technique in the XML approach.
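
As a rough sketch of that kind of technique (a plain substitution, not XML::DOM's own code, and assuming the input is already a character string), every non-ASCII character can be replaced with a numeric character reference so the output is pure ASCII:

```perl
use strict;
use warnings;

# A character string containing e-acute (U+00E9).
my $text = "L\x{E9}on Brocard";

# Replace each non-ASCII character with a hexadecimal character reference.
(my $escaped = $text) =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;

print "$escaped\n";   # L&#xE9;on Brocard
```

The result is safe to concatenate with either Latin-1 or UTF-8 data, at the cost of only being meaningful in HTML or XML output.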
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Richard Tietjen <rdtietjen@pobox.com>       www.pobox.com/~rdtietjen
          "Irony is what they make two-edged swords from."


_______________________________________________
templates mailing list
templates@template-toolkit.org
http://lists.ourshack.com/mailman/listinfo/templates

