[mythtvnz] XMLTV headers changed
David Moore
dmoo1790 at ihug.co.nz
Sat Jul 7 09:39:56 BST 2012
On 07/07/12 19:25, Robin Gilks wrote:
> Greetings
>
> It seems that the data contained in http://nzepg.org/freeview.xml.gz had a
> change of header last weekend.
>
> I followed the conversation about mhegsnoop having a header change but
> didn't realise it would propagate to the online data.
>
> This is a real problem for me as I merge data from epgsnoop with the
> online stuff (which has more detail for FreeView) using 'tv_cat' from the
> xmltv package and it barfs with:
> "/tmp/listings-freeview-31681.xml: this file's encoding utf-8 differs from
> others' ISO-8859-1 - aborting"
>
> So the online data now has utf-8 in the header but the epgsnoop data is
> ISO-8859-1. I tried changing outputter.py (in epgsnoop) to utf-8 but the
> data from satellite EPG has some interesting 8 bit characters (which I
> assume really are ISO-8859-1 codes).
>
> So who is right - utf-8 or ISO-8859-1 and how can I merge the two
> different encodings if they are both right!!
>
> Cheers
>
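Try iconv to convert the encoding of one file to match the other. For
example:
iconv -c -f UTF-8 -t ISO_8859-1 this_file -o that_file
The -c makes iconv drop any chars it can't convert instead of aborting,
so expect to lose the odd char going in this direction.
Also change the encoding named in the XML header (the <?xml ...?> line)
to match, or delete the header. Deleting it only makes sense if the file
ends up as UTF-8, since that is what XML parsers assume when no encoding
is declared. Then check the encoding with "file -i that_file".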
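If you would rather do the conversion and the header change in one step,
here is a rough Python sketch of the same idea. The function name
transcode_xmltv and the regex are my own inventions, not anything from
epgsnoop or xmltv, and it assumes the file is small enough to read into
memory and that the header is on the first line.

```python
import re

# Matches the encoding attribute of a leading XML declaration.
# The declaration is plain ASCII, so it is safe to edit it as bytes.
_DECL_RE = re.compile(rb'^(\s*<\?xml[^>]*?encoding=)(["\'])[^"\']*\2')


def transcode_xmltv(data: bytes, src: str = "utf-8", dst: str = "iso-8859-1") -> bytes:
    """Re-encode XML bytes from src to dst and update the declared encoding."""
    text = data.decode(src)
    # errors="ignore" drops chars with no equivalent in dst, like iconv -c.
    out = text.encode(dst, errors="ignore")
    new_name = dst.upper().encode("ascii")
    # Rewrite encoding="..." in the declaration, if there is one.
    return _DECL_RE.sub(
        lambda m: m.group(1) + m.group(2) + new_name + m.group(2), out, count=1
    )


# Example use (file names are placeholders):
#   with open("this_file", "rb") as f:
#       converted = transcode_xmltv(f.read())
#   with open("that_file", "wb") as f:
#       f.write(converted)
```

Swap src and dst to go from ISO-8859-1 to UTF-8 instead, which never
drops anything.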
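UTF-8 vs ISO_8859-1? Well, UTF-8 can encode every Unicode char, whereas
ISO_8859-1 is a single-byte encoding that only covers most Western
European languages. So converting ISO_8859-1 to UTF-8 never loses
anything, but the reverse can. I believe UTF-8 is or is becoming the
preferred encoding.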
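To see why the direction of conversion matters, here is a small Python
illustration using only the standard codecs. The sample string is made
up for the demo.

```python
# Every ISO-8859-1 byte maps to a Unicode code point, so this never fails.
latin1_bytes = bytes(range(256))
as_utf8 = latin1_bytes.decode("iso-8859-1").encode("utf-8")

# Converting back gives the original bytes: the round trip is lossless.
round_trip_ok = as_utf8.decode("utf-8").encode("iso-8859-1") == latin1_bytes

# The reverse direction can fail: U+2019 (a curly apostrophe) has no
# ISO-8859-1 equivalent.
curly = "It\u2019s".encode("utf-8")
try:
    curly.decode("utf-8").encode("iso-8859-1")
    lossless = True
except UnicodeEncodeError:
    lossless = False
```

That is the argument for standardising on UTF-8 when merging: the
ISO-8859-1 file can always be brought up to UTF-8 intact.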
Interestingly you may have revealed a bug. The UTF-8 created by
mhegepgsnoop is displayed properly (e.g., by "less file" and myth) but
iconv choked on one character. It probably isn't a byte order problem,
since a multi-byte UTF-8 char always has its lead byte first, followed
by its continuation bytes. More likely that char is either a malformed
sequence, which most apps tolerate, or a valid char with no ISO_8859-1
equivalent. iconv is less tolerant and by default simply aborts in
either case.
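One way to narrow down which char iconv objected to is to scan the file
yourself. Below is a rough Python sketch (find_problem_chars is a name I
made up) that reports bytes which are not valid UTF-8 and, separately,
valid chars that have no ISO-8859-1 equivalent. It assumes the whole
file fits in memory.

```python
def find_problem_chars(data: bytes, target: str = "iso-8859-1"):
    """Return (invalid_offsets, unconvertible) for supposedly UTF-8 input.

    invalid_offsets: byte offsets in data where the bytes are not valid UTF-8.
    unconvertible: (index, char) pairs that cannot be encoded in target;
                   index counts chars in the text with invalid bytes removed.
    """
    invalid_offsets = []
    chunks = []
    pos = 0
    while pos < len(data):
        try:
            chunks.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as exc:
            # exc.start and exc.end are relative to the slice data[pos:].
            chunks.append(data[pos:pos + exc.start].decode("utf-8"))
            invalid_offsets.append(pos + exc.start)
            pos += exc.end  # skip the bad byte(s) and carry on
    text = "".join(chunks)

    unconvertible = []
    for i, ch in enumerate(text):
        try:
            ch.encode(target)
        except UnicodeEncodeError:
            unconvertible.append((i, ch))
    return invalid_offsets, unconvertible
```

If the first list is non-empty the file really does contain broken
UTF-8; if only the second list has entries, the file is fine and the
char just cannot be represented in the target encoding.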