[Templates] UTF8 support and issues
Mark Proctor (mproctor)
mproctor@cisco.com
Wed, 20 Nov 2002 19:05:17 -0000
Success - I found this
<http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&threadm=20020429145407.00874.00005678%40mb-me.aol.com&rnum=1&prev=/groups%3Fq%3Dperl%2Bpack%2Bcgi%2Butf%2BOR%2Butf8%26hl%3Den%26lr%3D%26ie%3DUTF-8%26oe%3DUTF-8%26as_qdr%3Dall%26selm%3D20020429145407.00874.00005678%2540mb-me.aol.com%26rnum%3D1>
This line takes UTF-8 input and tags it as UTF-8:
$text = pack('U*', unpack('U*', $q->param('text')));
This is actually what Peter Guzis suggested. Sorry for not understanding
it the first time, Peter.
Is this the only way to tag a string that has come in from CGI as UTF-8?
I will also pose this question on perl-unicode.
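As a rough illustration of what the pack/unpack line does, here is a minimal sketch that replaces the CGI parameter with a hard-coded byte string. The variable $bytes and its value are assumptions made for the example and are not from the original script; real input would come from $q->param('text').

```perl
use strict;
use warnings;

# Hypothetical stand-in for $q->param('text'): the two raw bytes that
# encode U+00F3 (o with acute accent) in UTF-8.
my $bytes = "\xC3\xB3";

# unpack('U*') reads the bytes as UTF-8 and returns code points;
# pack('U*') rebuilds them into a string that Perl treats as characters.
my $text = pack('U*', unpack('U*', $bytes));

# Compare the length of the raw input with the length of the result.
printf "bytes in: %d, characters out: %d\n", length($bytes), length($text);
```

If the conversion behaves as described, the second number printed should be smaller than the first, since the two input bytes collapse into a single character.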
I'm not sure how this fits in with the Template Toolkit, other than
requiring that people who work with UTF-8 do this to their variables
themselves. I don't think you want to do this by default, as I expect
there is a small performance penalty.
Thanks
Mark
-----Original Message-----
From: Mark Proctor [mailto:m.proctor@bigfoot.com]
Sent: 20 November 2002 18:47
To: 'Barry Caplan'; 'Andreas J. Koenig'
Cc: perl-unicode@perl.org
Subject: RE: CGI and UTF
Unfortunately, I have asked the Cisco admins if we can have Perl 5.8 and
they said no way.
I have tried doing stuff like this:
$text = $q->param('text');
if ($q->param('text')) {
    print $text . $xml->{message};
} else {
    print "\x{00F3}" . $xml->{message};
}
This works and displays fine. But when I display it in a textarea so
that I can resubmit it, it still comes back mangled :(
Mark
-----Original Message-----
From: Barry Caplan [mailto:bcaplan@i18n.com]
Sent: 20 November 2002 18:42
To: Mark Proctor; 'Andreas J. Koenig'
Cc: perl-unicode@perl.org
Subject: RE: CGI and UTF
Mark,
I think 5.8 has an Encode module with a normalize function. CPAN probably
has something similar. The Perl docs for those modules are probably a
good place to start to understand Unicode normalization. unicode.org is
the definitive source but could be pretty pedantic if this is your first
exposure.
Barry Caplan
www.i18n.com
At 05:38 PM 11/20/2002 +0000, Mark Proctor wrote:
>I have checked with the sysadmins at cisco and they said "no way" :(
>So I have to get this working. Someone has said that I need to
>"normalise" the params from cgi - but I have no idea what that means.
>
>Mark