[mythtvnz] CTV (CanterburyTV) EPG scraper anyone?

Steven Ellis mythtvnz@lists.linuxnut.co.nz
Mon, 20 Nov 2006 15:24:05 +1300 (NZDT)


Robin Gilks wrote:
>
>> Robin Gilks wrote:
>>>
>>>> Ok we now have permission from CTV to redistribute their EPG data.
>>>> I'll tidy this up in the next couple of days, and put up an icon file,
>>>> but for the moment can someone please test the following.
>>>>
>>>> http://www.mythtv.co.nz/epg/ctv.xml.gz
>>>>
>>>> This means we now have permission for AltTV, Triangle Auckland and CTV.
>>>>
>>>> Steve
>>>>
>>>
>>> I've just compared it to the CTV web site (and info I had heard "down
>>> the grapevine") and the data is totally different. For example I was
>>> going to schedule a recording of "Irish Last Night of the Proms" at
>>> 19:00 to 20:30 but according to the XML file, that is some footy match
>>> program running from 19:00 to 21:00.
>>>
>>> I hope they are not trying to pull a fast one...
>>
>> The info is actually scraped off their website. I'll re-run the scraper
>> and repost it later today.
>>
>> At the moment they provide their data in a Word document, so their
>> website is actually easier to parse.
>>
>> The nice bit is I have official permission to do this and re-distribute
>> the information.
>>
>> If it all appears to be Ok I'll arrange for the information to be
>> updated every night around 12:30
>
> Looks like the list is only updated on a Monday morning - that means for
> example that right now, Sat and Sun are out of date (i.e. not days 6 & 7
> from now). Does this mean some scraping magic is required to ensure we
> don't go back in time thinking we have next week's data? I certainly found
> that on Saturday I was trying to check from the program duration whether
> the Last Night of the Proms was a repeat on Monday or another part - but
> Monday was still a week behind, not a week ahead.

Ok, the EPG data should be good until 6am Saturday. Can you check this?
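
For anyone who wants to check the coverage locally, here is a rough
Python sketch. It assumes the file is ordinary XMLTV, with <programme>
elements carrying start/stop attributes in the full
"YYYYMMDDhhmmss +hhmm" form. The function names (parse_xmltv_time,
coverage, is_stale) are made up for illustration and are not part of the
scraper. Shortened timestamps and named time zones are not handled.

```python
import gzip
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_xmltv_time(value):
    """Parse an XMLTV timestamp such as '20061125060000 +1300'."""
    value = value.strip()
    if " " in value:
        # Full form with a numeric UTC offset.
        return datetime.strptime(value, "%Y%m%d%H%M%S %z")
    # No offset given: treated as UTC here (an assumption of this sketch).
    return datetime.strptime(value[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def coverage(xml_bytes):
    """Return (earliest, latest) time covered by the listings, or None if empty."""
    root = ET.fromstring(xml_bytes)
    times = []
    for prog in root.iter("programme"):
        times.append(parse_xmltv_time(prog.get("start")))
        stop = prog.get("stop")
        if stop:
            times.append(parse_xmltv_time(stop))
    if not times:
        return None
    return min(times), max(times)


def is_stale(xml_bytes, now):
    """True if the listings end at or before 'now' (a timezone-aware datetime)."""
    span = coverage(xml_bytes)
    return span is None or span[1] <= now


# Example use, assuming the file has been saved locally as ctv.xml.gz:
#
#   with gzip.open("ctv.xml.gz", "rb") as fh:
#       data = fh.read()
#   print(coverage(data))
#   print(is_stale(data, datetime.now(timezone.utc)))
```

If the last stop time is already in the past, the file is still last
week's data and should not be loaded as if it were next week's.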

> Perhaps a prod at them if they only update once a week!!

I'll see what I can do.

> If the Word document is more up-to-date then perhaps catdoc (or antiword)
> will help get the data out in a format that can be xml-ified.

Thanks for the tips. I'll see what they can do.
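
As a starting point for the catdoc/antiword idea, a minimal Python
sketch that pulls plain text out of a Word document. It assumes antiword
or catdoc is installed and on the PATH; the helper name word_to_text is
hypothetical. Turning the resulting text into XMLTV depends on the
layout of CTV's document, which is not shown here.

```python
import shutil
import subprocess


def word_to_text(path, tools=("antiword", "catdoc")):
    """Convert a .doc file to plain text with the first converter found."""
    for tool in tools:
        exe = shutil.which(tool)
        if exe:
            # Both tools write the extracted text to standard output.
            result = subprocess.run(
                [exe, path], check=True, capture_output=True, text=True
            )
            return result.stdout
    raise RuntimeError("no Word-to-text converter found on PATH")


# Example use, with a made-up file name:
#
#   text = word_to_text("ctv-schedule.doc")
#   for line in text.splitlines():
#       print(line)
```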

And thanks for the testing.

Steve


--------------------------------------------
Steven Ellis - Technical Director
OpenMedia Limited
email   - steven@openmedia.co.nz
sales   - sales@openmedia.co.nz
support - support@openmedia.co.nz
website - http://www.openmedia.co.nz