[mythtvnz] XMLTV headers changed

David Moore dmoo1790 at ihug.co.nz
Sat Jul 7 10:40:16 BST 2012


On 07/07/12 21:16, Robin Gilks wrote:
>
>> On 07/07/12 19:25, Robin Gilks wrote:
>>> Greetings
>>>
>>> It seems that the data contained in http://nzepg.org/freeview.xml.gz had a
>>> change of header last weekend.
>>>
>>> I followed the conversation about mhegepgsnoop having a header change but
>>> didn't realise it would propagate to the online data.
>>>
>>> This is a real problem for me as I merge data from epgsnoop with the
>>> online stuff (which has more detail for FreeView) using 'tv_cat' from the
>>> xmltv package and it barfs with:
>>> "/tmp/listings-freeview-31681.xml: this file's encoding utf-8 differs from
>>> others' ISO-8859-1 - aborting"
>>>
>>> So the online data now has utf-8 in the header but the epgsnoop data is
>>> ISO-8859-1. I tried changing outputter.py (in epgsnoop) to utf-8 but the
>>> data from satellite EPG has some interesting 8 bit characters (which I
>>> assume really are ISO-8859-1 codes).
>>>
>>> So who is right - utf-8 or ISO-8859-1 and how can I merge the two
>>> different encodings if they are both right!!
>>>
>>> Cheers
>>>
>>
>> Try iconv to convert the encoding of one file to match the other. For
>> example:
>>
>> iconv -c -f UTF-8 -t ISO_8859-1 this_file -o that_file
>>
>> The -c will skip invalid chars.
>>
>> Also change the header or delete it. And check the encoding with "file
>> -i that_file".
>>
>> UTF-8 vs ISO_8859-1? Well UTF-8 is a more universal char set. ISO_8859-1
>> is mostly for Western European or Latin languages. I believe UTF-8 is or
>> is becoming the preferred encoding.
>>
>> Interestingly you may have revealed a bug. UTF-8 encoding created by
>> mhegepgsnoop is displayed properly (e.g., by "less file" and myth) but
>> iconv choked on one character. Seems the byte order might be backwards
>> for this char but most apps handle it because bytes in multi-byte UTF-8
>> chars are unambiguous so order doesn't really matter. iconv may be less
>> tolerant and simply abort if it gets bytes in the wrong order.
>
>
> So the online data is now created by mhegepgsnoop? I'm surprised it
> doesn't follow the existing (as of the last 4 years at least) encoding of
> ISO_8859-1 from epgsnoop or at least check for compatibility with an
> existing schema.
>

No, I believe nzepg is only using mhegepgsnoop for some channels, e.g., 
ChoiceTV?
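
If scripting it is easier than running iconv and editing the header by hand, the two steps from my earlier suggestion (convert the bytes, then fix the encoding named in the XML header) could be combined roughly as below. This is only a sketch under assumptions: utf8_to_latin1 is a made-up name, it assumes the input really is UTF-8 with the XML declaration at the very start of the file, and like "iconv -c" it silently drops anything that can't be converted, so some characters may go missing.

```python
import re


def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 XML bytes as ISO-8859-1 and update the XML header.

    Hypothetical helper, not part of epgsnoop, mhegepgsnoop or xmltv.
    """
    # errors="ignore" skips byte sequences that are not valid UTF-8,
    # similar in spirit to "iconv -c". "utf-8-sig" also strips a
    # leading byte order mark if one is present.
    text = data.decode("utf-8-sig", errors="ignore")

    # Rewrite only the encoding named in the leading <?xml ...?> line.
    text = re.sub(
        r'^(<\?xml[^>]*?encoding=)(["\'])[^"\']*\2',
        r'\1\2ISO-8859-1\2',
        text,
        count=1,
    )

    # Characters with no ISO-8859-1 equivalent are dropped here.
    return text.encode("iso-8859-1", errors="ignore")
```

To use it, read the UTF-8 file in binary mode, pass the bytes through utf8_to_latin1, write the result out in binary mode, and then check it with "file -i" before handing both files to tv_cat.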


