diabloNL
Posts: 55
Joined: Mon Feb 26, 2007 6:08 am

Postby diabloNL » Fri Mar 09, 2007 6:53 pm

I'm working on a config file that checks a game forum for newly posted items. This is the config file I wrote:

Code:

[ChannelParameters]
URL=http://www.gamingonly.nl/forum/forumdisplay.php?f=20
Title=PSP Forum
Description=
BaseUrl=http://www.gamingonly.nl/forum/
MaxItems=15
Shorthand=
SectionPattern=
ItemPattern-1=skin/firstnew.*?<a href="(?P<L>.*?)
ItemPattern-2=".*?bold">(?P<T>.*?)</a>(?P<D>)
ItemPattern-3=
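
One way to sanity-check item patterns like these outside the tool is to run them against a saved copy of the page. The sketch below assumes that WebScrape joins the ItemPattern-N lines into a single regular expression, that `.` is allowed to match newlines, and that BaseUrl is simply prepended to relative links; none of that is confirmed here, and the HTML snippet is made up for illustration.

```python
import re

# The ItemPattern-1 and ItemPattern-2 values from the config, joined in order.
# ASSUMPTION: WebScrape concatenates the ItemPattern-N lines into one regex.
ITEM_PATTERN = (
    r'skin/firstnew.*?<a href="(?P<L>.*?)'
    r'".*?bold">(?P<T>.*?)</a>(?P<D>)'
)

# Hypothetical HTML resembling one "new post" row of the forum page.
SAMPLE_HTML = '''
<img src="skin/firstnew.gif" alt="new">
<a href="showthread.php?t=3626" style="font-weight:bold">Wat heb je liever?</a>
'''


def extract_items(html, base_url=""):
    """Return a list of (title, link) tuples matched by ITEM_PATTERN."""
    items = []
    # re.DOTALL lets ".*?" span line breaks (an assumption about the tool).
    for match in re.finditer(ITEM_PATTERN, html, re.DOTALL):
        items.append((match.group("T").strip(), base_url + match.group("L")))
    return items


print(extract_items(SAMPLE_HTML, "http://www.gamingonly.nl/forum/"))
```

If this finds items in the HTML that WebScrapeSettings sees but not in the HTML that Awasu downloads, the two pages differ and the patterns themselves are probably fine.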


If I do a "preview" with WebScraperSettings.exe, it shows me this:

Code:

<xml>
<rss>
<channel>
   <generator>Scrape Web page and convert to RSS, by Allan B. Wilson; abwilson@awasu.com</generator>
   <title>XPSP F</title>
   <link>file:///C|\DOCUME~1\Bobby\LOCALS~1\Temp\awasu34</link>
   <description>yuuh</description>
<item>
      <title>Wat heb je liever?</title>
      <link>http://www.gamingonly.nl/forum/showthread.php?t=3626</link>
      <guid>http://www.gamingonly.nl/forum/showthread.php?t=3626</guid>
      <description>
</description>
</item>
<item>
      <title>Mc Donald&apos;s of Burger King</title>
      <link>http://www.gamingonly.nl/forum/showthread.php?t=3625</link>
      <guid>http://www.gamingonly.nl/forum/showthread.php?t=3625</guid>
      <description>
</description>
</item>
</channel>
</rss>


So that is working great. It shows there are two new items in that section of the forum. But when I create a channel in Awasu with the webscraper plug-in and connect it to this config file, the feed stays empty. :(

I've tried everything, but I don't understand why it works in the WebScraperSettings.exe preview yet gives me nothing in Awasu.

Do you have an idea what I'm doing wrong?

Thanks!

Postby diabloNL » Sat Mar 10, 2007 3:54 pm

What does webscraper do when you click the preview button? I ask because after I start webscrapersettings.exe and do a preview, it gets all the new items from the forum, and when I check the forum everything matches.

But when new items appear and I click the preview button again, it keeps showing me the old results/feed until I restart webscrapersettings.exe.

support
Site Admin
Posts: 3022
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Postby support » Sun Mar 11, 2007 3:23 am

diabloNL wrote: But when new items appear and I click the preview button again, it keeps showing me the old results/feed until I restart webscrapersettings.exe.

That's right. The first time it downloads the feed, it keeps it and re-uses it. Not everyone is on broadband :-)

Postby diabloNL » Sun Mar 11, 2007 11:08 am

I understand that it keeps the old results in the list, but shouldn't it add the newly found results?

Postby support » Sun Mar 11, 2007 11:22 am

diabloNL wrote: I understand that it keeps the old results in the list, but shouldn't it add the newly found results?

No, the web page is downloaded once and then cached so it doesn't have to be re-fetched every time. Actually, even on broadband it's very annoying to have to get it every time.

The intent of the tool is to let you fiddle with the config file until it's right, not to actually monitor the web page.
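
The download-once behaviour described here can be pictured with a small sketch. This is only an illustration of the idea, not the tool's actual code; `PreviewCache` and the stand-in fetch function are made up for the example.

```python
class PreviewCache:
    """Download each URL at most once; later requests reuse the saved copy."""

    def __init__(self, fetch):
        # `fetch` is any callable that takes a URL and returns the page text.
        self._fetch = fetch
        self._pages = {}

    def get(self, url):
        # Only call fetch the first time a URL is requested.
        if url not in self._pages:
            self._pages[url] = self._fetch(url)
        return self._pages[url]


# Stand-in for a real download: the "site" gains a new post between calls.
versions = iter(["page with 2 new posts", "page with 3 new posts"])
cache = PreviewCache(lambda url: next(versions))

url = "http://www.gamingonly.nl/forum/forumdisplay.php?f=20"
first = cache.get(url)
second = cache.get(url)
print(first == second)
```

The second preview returns the same text as the first even though the site has changed, which matches what you see until the program is restarted and the cache starts out empty again.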

Postby diabloNL » Sun Mar 11, 2007 2:25 pm

Aaahhh, ok, I understand. Thanks!

:wink:

Postby diabloNL » Sun Mar 11, 2007 3:53 pm

Is it possible to add characters to the feed created with Webscraper?

For example: let's say I extract a link to an image on a website, but the "<img src..." tag is not there. Could I extract the item from the website and add the HTML tag so that Awasu shows the image?

I even thought about creating a feed with webscraper and then calling that feed from a plug-in so I can add the "<img src..." tag.

Postby support » Sun Mar 11, 2007 4:26 pm

diabloNL wrote: Is it possible to add characters to the feed created with Webscraper?

I don't have access to the source for the WebScrape plugin but I don't think so. For something like that you'd have to write a plugin.
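
As a rough idea of the transformation such a plugin would perform, here is a sketch that takes RSS text and wraps bare image URLs in an `<img>` tag. It assumes each item's description holds nothing but the image URL; the function name, the suffix list, and the sample feed are made up for illustration, and how the script would be hooked into Awasu is not shown.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Hypothetical list of suffixes treated as image links.
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")


def add_img_tags(rss_text):
    """Replace bare image URLs in item descriptions with an <img> tag."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        desc = item.find("description")
        if desc is None:
            continue
        url = (desc.text or "").strip()
        if url.lower().endswith(IMAGE_SUFFIXES):
            # quoteattr adds the surrounding quotes and escapes the URL.
            desc.text = "<img src=%s>" % quoteattr(url)
    # ElementTree escapes the tag when serialising, as RSS descriptions expect.
    return ET.tostring(root, encoding="unicode")


SAMPLE_FEED = """<rss version="2.0"><channel><title>Demo</title>
<item><title>A picture</title><description>http://example.com/pic.png</description></item>
</channel></rss>"""

print(add_img_tags(SAMPLE_FEED))
```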

Postby diabloNL » Sat Apr 14, 2007 11:35 am

I really can't get this thing to work. In webscrapersettings it works perfectly, but in Awasu it shows nothing. Any ideas how this can be? :(

Postby support » Sat Apr 14, 2007 4:38 pm

diabloNL wrote: In webscrapersettings it works perfectly, but in Awasu it shows nothing. Any ideas how this can be? :(

Check the raw feed that Awasu is getting and compare it with what WebScrapeSettings has.

Postby diabloNL » Sat Apr 14, 2007 6:03 pm

support wrote:
diabloNL wrote: In webscrapersettings it works perfectly, but in Awasu it shows nothing. Any ideas how this can be? :(

Check the raw feed that Awasu is getting and compare it with what WebScrapeSettings has.

You mean right-clicking the channel and choosing "show last feed"?

Postby support » Sat Apr 14, 2007 6:13 pm

diabloNL wrote: You mean right-clicking the channel and choosing "show last feed"?

Yep.

Postby diabloNL » Sat Apr 14, 2007 6:21 pm

Well, that shows this:


Code:

<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
   <generator>Scrape Web page and convert to RSS, by Allan B. Wilson; abwilson@awasu.com</generator>
   <title>PSP Forum</title>
   <link>http://www.gamingonly.nl/forum/forumdisplay.php?f=20</link>
   <description></description>
</channel>
</rss>


So there are no items, but this is what webscrapersettings generates:

Code:

<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
   <generator>Scrape Web page and convert to RSS, by Allan B. Wilson; abwilson@awasu.com</generator>
   <title>PSP Forum</title>
   <link>file:///C|\DOCUME~1\Bobby\LOCALS~1\Temp\awasu34</link>
   <description></description>
<item>
      <title>It's a beautiful day!</title>
      <link>http://www.gamingonly.nl/forum/showthread.php?t=424</link>
      <guid>http://www.gamingonly.nl/forum/showthread.php?t=424</guid>
      <description>
</description>
</item>
</channel>
</rss>
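
For longer feeds, a comparison like this can be done mechanically. The following is a minimal sketch that lists the item links present in one feed but missing from the other; the function names are made up, and the two sample feeds are trimmed-down versions of the ones above.

```python
import xml.etree.ElementTree as ET


def item_links(rss_text):
    """Return the <link> of every <item> in the feed, in document order."""
    root = ET.fromstring(rss_text)
    return [item.findtext("link", default="") for item in root.iter("item")]


def missing_items(awasu_feed, preview_feed):
    """Links that the preview feed has but the Awasu feed lacks."""
    seen = set(item_links(awasu_feed))
    return [link for link in item_links(preview_feed) if link not in seen]


AWASU_FEED = """<rss version="2.0"><channel>
<title>PSP Forum</title>
</channel></rss>"""

PREVIEW_FEED = """<rss version="2.0"><channel>
<title>PSP Forum</title>
<item>
<title>It's a beautiful day!</title>
<link>http://www.gamingonly.nl/forum/showthread.php?t=424</link>
</item>
</channel></rss>"""

print(missing_items(AWASU_FEED, PREVIEW_FEED))
```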

Postby support » Sun Apr 15, 2007 1:36 am

Didn't we have a similar problem with another site? There are probably some cookies set somewhere that are causing the web site to return a different page when you're running WebScrapeSettings vs. Awasu.

There are no hooks in Awasu to dump the web page it downloaded for WebScrape (and the 2.3 binaries have already been built so it's too late to add something) but there is a way to take a look at it.

Awasu downloads the requested web page and passes it to the plugin via the DownloadUrlFile parameter in the INI file, so whip up a quick script that dumps this file to a known location and then invokes WebScrape.exe. Then just subscribe to this script. This way you can see what HTML is being passed to WebScrape and how it differs from what you've been working with in WebScrapeSettings.
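
A wrapper along those lines might look like the sketch below. Several details are assumptions that are not confirmed anywhere in this thread: that Awasu hands the script the INI file path as its first command-line argument, that WebScrape.exe accepts the same argument, and both placeholder paths. The helper names are made up as well.

```python
import shutil
import subprocess
import sys

# ASSUMPTIONS (not confirmed in this thread):
# - Awasu passes the INI file path as the first command-line argument.
# - WebScrape.exe accepts that same INI path as its argument.
# - Both paths below are placeholders to adjust for your machine.
WEBSCRAPE_EXE = r"C:\path\to\WebScrape.exe"
DUMP_PATH = r"C:\temp\webscrape-input.html"


def find_download_file(ini_path):
    """Return the value of the DownloadUrlFile parameter, or None if absent."""
    with open(ini_path, "r", encoding="utf-8", errors="replace") as ini:
        for line in ini:
            key, sep, value = line.partition("=")
            if sep and key.strip().lower() == "downloadurlfile":
                return value.strip()
    return None


def dump_download(ini_path, dump_path):
    """Copy the page Awasu downloaded to dump_path; return the source path."""
    source = find_download_file(ini_path)
    if source:
        shutil.copyfile(source, dump_path)
    return source


def main(argv):
    dump_download(argv[1], DUMP_PATH)
    # Hand over to the real plugin; its output and exit code pass through.
    return subprocess.run([WEBSCRAPE_EXE] + argv[1:]).returncode


# Only run when an INI path was actually supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

After the channel updates, the file at DUMP_PATH holds the HTML that WebScrape received, which can then be compared with the page WebScrapeSettings was working from.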

Postby diabloNL » Sun Apr 15, 2007 11:17 am

Thanks Taka.

As soon as the website changes and I update the channel, it gives an error. This is the last feed, from when the error occurred. Do you know what it means?

Code:

<HEAD>
<META HTTP-EQUIV='Content-Type' CONTENT='text/html; charset=utf-8'>
</HEAD>
<HTML><BODY>
<P>The script failed: rc=0
</BODY></HTML>

