[Templates] UTF8 support and issues
   
Barry Caplan
bcaplan@i18n.com
Tue, 19 Nov 2002 16:00:54 -0800

>I don't think 'L=E9on' is an example of UTF8 encoding, but rather of
>ISO-8859-1 encoding.
>
>In UTF8 an é would be represented with a double-byte sequence: =C3=A9
>or in case of email mangling \xc3 \xa9.
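Unicode and ISO-8859-1 are the same for the first 256 code points
(U+0000 through U+00FF). Having it otherwise would slow or halt adoption
of Unicode because of the amount of legacy data in 8859-1 and the
programs that are built to use it. UTF-8 as a byte encoding is identical
only for the first 128 (ASCII); from U+0080 up each character takes two
or more bytes, which is where the two-byte sequence quoted above comes
from.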
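
A quick way to check what bytes each encoding actually produces for é is
to encode the same character both ways. This is only a minimal sketch in
Python 3; the variable names are made up for illustration.

```python
# Encode U+00E9 (LATIN SMALL LETTER E WITH ACUTE) under both encodings.
ch = "\u00e9"

latin1_bytes = ch.encode("iso-8859-1")  # one byte
utf8_bytes = ch.encode("utf-8")         # two bytes

print(latin1_bytes.hex())  # e9
print(utf8_bytes.hex())    # c3a9

# Plain ASCII text comes out byte-identical in both encodings.
ascii_same = "Leon".encode("iso-8859-1") == "Leon".encode("utf-8")

# The Unicode code point number equals the ISO-8859-1 byte value.
same_code_point = ord(ch) == latin1_bytes[0]

print(ascii_same, same_code_point)  # True True
```
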
When using Unicode, European characters with accents can be represented
in more than one way: precomposed and decomposed. ISO-8859-1 contains
only the precomposed versions. See www.unicode.org for more details.
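
The two forms can be compared with the normalization functions in
Python's standard unicodedata module. This is an illustrative sketch, not
anything specific to the template code under discussion.

```python
import unicodedata

precomposed = "\u00e9"  # single code point: e with acute

# NFD splits it into a base letter plus a combining accent.
decomposed = unicodedata.normalize("NFD", precomposed)
decomposed_points = [f"U+{ord(c):04X}" for c in decomposed]
print(decomposed_points)  # ['U+0065', 'U+0301']

# NFC puts the pieces back together into the precomposed form.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True

# The decomposed form has its own, different UTF-8 byte sequence.
decomposed_utf8 = decomposed.encode("utf-8")
print(decomposed_utf8.hex())  # 65cc81
```
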
In my Unicode 3.0 book, code point \xc3 is LATIN CAPITAL LETTER A WITH
TILDE and \xa9 is COPYRIGHT SIGN. LATIN SMALL LETTER E WITH ACUTE is
\xe9.
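
The same lookups can be done without the book, using the character
database that ships with Python. The last step is a sketch of what
happens when UTF-8 bytes are read back as if they were ISO-8859-1, which
is one way those two characters end up on screen in place of é.

```python
import unicodedata

# Look up the official character names for the three code points.
names = {cp: unicodedata.name(chr(cp)) for cp in (0xC3, 0xA9, 0xE9)}
for cp, name in names.items():
    print(f"U+{cp:04X} {name}")

# Reading the UTF-8 bytes of é as ISO-8859-1 yields two characters:
# A with tilde followed by the copyright sign.
misread = "\u00e9".encode("utf-8").decode("iso-8859-1")
print(misread == "\u00c3\u00a9")  # True
```
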
Barry Caplan
www.i18n.com