vidconf
Posts: 42
Joined: Thu Mar 08, 2007 3:21 pm

Postby vidconf » Fri Mar 16, 2007 3:42 pm

Hi All:

I've got several feeds generated via Webscrape and all work well except for one.

I get "The script timed out." in show last feed option.

When I run the feed manually via Webscrapesettings.exe, I get the error "The plugin time out"

The website scraped is online and working via IE.

The weird part is that this feed works fine (on schedule) about 25% of the time.

Any ideas how I can troubleshoot/remedy?

Thanks
Vidconf

diabloNL
Posts: 55
Joined: Mon Feb 26, 2007 6:08 am

Postby diabloNL » Fri Mar 16, 2007 6:38 pm

Maybe the website you're scraping has something dynamic in its code that causes the script to malfunction sometimes. What is the URL, and can you post your config file?

vidconf
Posts: 42
Joined: Thu Mar 08, 2007 3:21 pm

Postby vidconf » Fri Mar 16, 2007 7:20 pm

Here's the config file.
--------------------------------------------------------

[ChannelParameters]
URL=http://www.canada.com/ottawacitizen/news/business/index.html
Title=Bus News
Description=
BaseUrl=http://www.canada.com/ottawacitizen/news/business/
MaxItems=15
Shorthand=
SectionPattern=<strong>Page d01 / front</strong></div>(.*?)<b>More Ottawa Citizen</b>
ItemPattern-1=.*?<a href="(?P<L>.*?)".*?<strong>(?P<T>.*?)<
ItemPattern-2=
ItemPattern-3=(?P<D>.*?)

------------------------------------------------------------

Thanks


Vidconf

diabloNL
Posts: 55
Joined: Mon Feb 26, 2007 6:08 am

Postby diabloNL » Fri Mar 16, 2007 11:39 pm

The problem is what I suspected. Your config file looks for "Page d01", but their page changes dynamically. They are using "Page e01" now.
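To see why a hard-coded page label breaks the feed, here is a rough Python sketch. It assumes the patterns behave like Python regular expressions with "dot matches newline" semantics (the `(?P<name>...)` syntax suggests this, but it is an assumption), and the HTML fragments are made up for illustration, not copied from the real page.

```python
import re

# SectionPattern from the original config; it hard-codes the page label "d01".
SECTION_PATTERN = r"<strong>Page d01 / front</strong></div>(.*?)<b>More Ottawa Citizen</b>"

def find_section(html):
    """Return the captured section text, or None if the pattern does not match."""
    m = re.search(SECTION_PATTERN, html, re.DOTALL)
    return m.group(1) if m else None

# Made-up page fragments (assumption: the real markup is only roughly like this).
page_d01 = "<div><strong>Page d01 / front</strong></div><p>stories</p><b>More Ottawa Citizen</b>"
page_e01 = page_d01.replace("d01", "e01")

section_d01 = find_section(page_d01)  # the section is found
section_e01 = find_section(page_e01)  # no match once the label changes
```

On days when the site serves a different page label, the section pattern finds nothing, which would fit a feed that only works part of the time.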


Try it on this URL:

http://www.canada.com/ottawacitizen/new ... escan.html

But to save you the trouble here is the config file:

--------------------------------------------
[ChannelParameters]
URL=http://www.canada.com/ottawacitizen/news/headlinescan.html
Title=Bus News
Description=
BaseUrl=http://www.canada.com/ottawacitizen/news/
MaxItems=15
Shorthand=
SectionPattern=<H4>Business</H4>(.*)
ItemPattern-1=<a href="(?P<L>.*?)".*?">(?P<T>.*?)<(?P<D>)
ItemPattern-2=
ItemPattern-3=
--------------------------------------------

I've made another one based on your first URL. This one also shows the description and should keep working even if they change the page. If not, you can always use the one above without the description. ;)

--------------------------------------------
[ChannelParameters]
URL=http://www.canada.com/ottawacitizen/news/business/index.html
Title=Bus News
Description=
BaseUrl=http://www.canada.com/ottawacitizen/news/business/
MaxItems=15
Shorthand=
SectionPattern=<h2>Business</h2>(.*)More Ottawa Citizen
ItemPattern-1=.*?<a href="(?P<L>.*?)".*?<strong>(?P<T>.*?)<
ItemPattern-2=.*?<br />(?P<D>.*?)</p>
ItemPattern-3=
---------------------------------------------
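If you want to test patterns like these outside Awasu, something along these lines might help. It is only a sketch: it assumes Python regex semantics, assumes ItemPattern-1 and ItemPattern-2 are simply joined into one pattern (I have not confirmed how WebScrape combines them), and uses a made-up HTML sample.

```python
import re

# Patterns copied from the last config above.
SECTION_PATTERN = r"<h2>Business</h2>(.*)More Ottawa Citizen"
ITEM_PATTERN_1 = r'.*?<a href="(?P<L>.*?)".*?<strong>(?P<T>.*?)<'
ITEM_PATTERN_2 = r".*?<br />(?P<D>.*?)</p>"

# Assumption: the item patterns are concatenated into a single regex.
ITEM_PATTERN = re.compile(ITEM_PATTERN_1 + ITEM_PATTERN_2, re.DOTALL)

# Made-up sample page, for illustration only.
SAMPLE_HTML = """<h2>Business</h2>
<p><a href="story1.html"><strong>Title one</strong></a><br />Desc one</p>
<p><a href="story2.html"><strong>Title two</strong></a><br />Desc two</p>
More Ottawa Citizen"""

def scrape(html):
    """Return (link, title, description) tuples found inside the section."""
    section = re.search(SECTION_PATTERN, html, re.DOTALL)
    if section is None:
        return []  # section marker not found, so there are no items
    return [
        (m.group("L"), m.group("T"), m.group("D"))
        for m in ITEM_PATTERN.finditer(section.group(1))
    ]

items = scrape(SAMPLE_HTML)
```

L, T and D are the link, title and description groups from the config. Because this section pattern anchors on the "Business" heading instead of a page label, it should be less sensitive to the daily page change.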

support
Site Admin
Posts: 3021
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Postby support » Sat Mar 17, 2007 2:45 am

vidconf wrote: I get "The script timed out." in show last feed option.

You can change the amount of time Awasu will wait for plugins in the Program Options, Advanced tab, "Script timeout" setting. This will apply to *all* plugins.

If you want to set it just for your WebScrape plugins, all plugins support a special parameter called ScriptTimeout (in addition to the ones I described here). Add the following to your WebScrape.plugin file and you will see a new parameter appear in each plugin's parameters (a blank value means "use the global setting from the Program Options").


[ChannelParameterDefinition-6]
Name=ScriptTimeout
Type=int
DefaultValue=
Description=Script timeout (in seconds)

vidconf
Posts: 42
Joined: Thu Mar 08, 2007 3:21 pm

Postby vidconf » Sat Mar 17, 2007 2:39 pm

Thanks diabloNL for the new config files.

Much appreciated.

Seems to work fine now. :D

Cheers

Vidconf

