[Templates] UTF8 support and issues

Richard Tietjen rdtietjen@pobox.com
Tue, 19 Nov 2002 08:50:02 -0600 (CST)


> Message: 2
> Date: Tue, 19 Nov 2002 13:04:53 +0000
> From: Andy Wardley <abw@andywardley.com>
> To: Mark Proctor <m.proctor@bigfoot.com>
> Cc: templates@template-toolkit.org,
> 	"Leslie Fuller (lefuller)" <lefuller@cisco.com>
> Subject: Re: [Templates] UTF8 support and issues

> Mark Proctor wrote:
> > Is there something I need to do to tell template toolkit to use utf8?
> > Will upgrading to the latest version fix this? We have:

...

> I haven't been able to reliably reproduce the problem.  For example, 
> this test works fine for me under 5.6.1 with TT 2.08c.

>   use strict;
>   use Template;

>   my $leon = 'Léon Brocard';

I don't think 'Léon' is an example of UTF8 encoding, but rather of
ISO-8859-1 encoding.

In UTF8 an &eacute; would be represented by a two-byte sequence: é,
or in case of email mangling, \xc3 \xa9.
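
To make the difference concrete, here is a minimal sketch that encodes the
same character both ways and prints the resulting bytes. It assumes the
Encode module, which ships with Perl 5.8 and later and is not part of a
stock 5.6.1 install, so treat it as an illustration rather than something
to drop into the setup discussed above.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E9 (e with acute accent), written as an escape so this source
# file stays pure ASCII and survives email transport.
my $e_acute = "\x{e9}";

# Encode the same character in both encodings.
my $latin1 = encode('ISO-8859-1', $e_acute);
my $utf8   = encode('UTF-8',      $e_acute);

# Render a byte string as space-separated hex pairs.
sub hex_bytes { return join ' ', unpack '(H2)*', shift }

printf "ISO-8859-1: %d byte(s): %s\n", length $latin1, hex_bytes($latin1);
printf "UTF-8:      %d byte(s): %s\n", length $utf8,   hex_bytes($utf8);
```

The ISO-8859-1 form is the single byte e9, while the UTF-8 form is the
two bytes c3 a9.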

In perl 5.6 I think you'd also need to

  use utf8;

to turn UTF8 support on.

It's confusing, and the only thing I can offer regarding the actual
problem is that I only see UTF8 data when I work with XML, and that
XML::DOM's toString() turns double-byte characters into &#xxxx;
references, I think.  Maybe there's a clue or a hint of a technique in
the XML approach.
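
As a rough sketch of that technique, the following replaces every
non-ASCII character with a numeric character reference, so the output is
plain ASCII whatever encoding it later passes through. The helper name
to_char_refs is hypothetical, and this is not XML::DOM's own code; the
exact reference format XML::DOM emits is not confirmed here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::DOM): replace every character
# above U+007F with a hexadecimal numeric character reference.
sub to_char_refs {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7f])/sprintf('&#x%X;', ord($1))/ge;
    return $text;
}

# The accented character is written as an escape to keep the source ASCII.
my $name = "L\x{e9}on Brocard";
print to_char_refs($name), "\n";   # prints "L&#xE9;on Brocard"
```

The result only makes sense where something downstream decodes the
references again, such as a browser or an XML parser.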
 
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Richard Tietjen <rdtietjen@pobox.com>       www.pobox.com/~rdtietjen
          "Irony is what they make two-edged swords from."