chad

Postby chad » Thu Jul 24, 2003 10:44 am

How can I add and view feeds that contain un-escaped ampersands? I have two feeds that use them intermittently, but since they are major sites I am unlikely to get them to change anything.

I can add the feeds if I wait until they don't have an ampersand showing, but then I get errors later when they list a new headline with an ampersand.

Any suggestions/comments on this would be greatly appreciated.

support
Site Admin
Posts: 3022
Joined: Fri Feb 07, 2003 12:48 pm
Location: Melbourne, Australia

Postby support » Thu Jul 24, 2003 10:58 am

chad wrote:Die ampersand die!


Sigh... I know the feeling :-(

Unfortunately, I don't think that there's any quick workaround, unless you know of some way to force Expat to ignore the error. From their FAQ: "All well-formedness errors stop processing. Note that the XML Recommendation does not permit conforming XML processors to continue normal processing after a fatal error."
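To see the failure described here, Python's xml.parsers.expat module (a binding to Expat) can be used as a quick check. This is only a sketch: the helper name parses_cleanly and the sample strings are made up for illustration, and it says nothing about how Awasu itself calls Expat.

```python
import xml.parsers.expat


def parses_cleanly(xml_text):
    """Return True if Expat accepts the text, False on a well-formedness error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Hypothetical headlines: the first contains a bare ampersand.
bad_headline = "<title>Tom & Jerry</title>"
good_headline = "<title>Tom &amp; Jerry</title>"
```

Under these assumptions, the bare ampersand makes the parse fail outright, while the escaped version goes through.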

I'll take a look at this again soon. The only thing I can suggest in the meantime is to write a channel plugin that downloads the feed, fixes up any errors and then pumps the new feed into Awasu.

There's probably something floating around on the net in Python or Perl that does this already. Let me know if you find something and I'll be happy to help you out with the rest.

chad

Postby chad » Fri Jul 25, 2003 11:59 am

Can you point me to the XML specification (or just a list) that provides the complete list of ampersand escape codes (e.g., &amp;amp;, &amp;copy;, etc.)?

Also are the plugins specific to Awasu, or to another product?

Thanks!

support

Postby support » Fri Jul 25, 2003 12:28 pm

chad wrote:Can you point me to the XML specification (or just a list) that provides the complete list of ampersand escape codes (e.g., &amp;amp;, &amp;copy;, etc.)?


This is a list for HTML 4: http://www.w3.org/TR/html4/sgml/entities.html
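Worth noting alongside that list: XML itself predefines only five entities (amp, lt, gt, apos and quot), so an HTML name such as copy is not valid in a feed unless the feed's DTD declares it. One possible workaround is to turn HTML names into numeric character references, which XML always accepts. The sketch below uses Python's standard html.entities table; the function name is hypothetical and this is not something Awasu is known to do.

```python
import re
from html.entities import name2codepoint

# The five entity names that XML predefines; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def html_entities_to_numeric(text):
    """Rewrite HTML named entities (e.g. copy) as numeric character references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Already valid in XML, or not a known HTML 4 name: leave as-is.
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)
```

Unknown names are deliberately left alone rather than guessed at, so a feed containing them would still fail to parse.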

chad wrote:Also are the plugins specific to Awasu, or to another product?


Plugins are Awasu-only.

chad

Postby chad » Sun Aug 17, 2003 11:51 pm

I do not really want to learn the intricacies of plugin programming, but this would not be a difficult task for someone who has already programmed other plugins.

Here is the perl code to do a half-baked job of ampersand substitution:
$raw_data =~ s/ & / & /sg;
(Assuming the news feed data stream is in the variable raw_data, this will replace any occurrences of standalone ampersands with the appropriate escape code. By standalone, I mean an ampersand with a space on each side.)

If you have a template for plugins, all that would be necessary is to put this line into it. Any volunteers?

I've tried this on a different news reader that is written in perl, and it works for the site with which I am having problems.

support

Postby support » Mon Aug 18, 2003 1:36 am

chad wrote:If you have a template for plugins, all that would be necessary is to put this line into it. Any volunteers?


Thanks for the input. If you look in the Samples sub-directory, you will find a whole bunch of different sample plugins. I don't really know Perl but I wrote one for you in Python - I couldn't be bothered trying to figure out how to read an INI file but it should be easy to translate.

1) Create a file called EscapeAmpersands.plugin somewhere (recommended location is $/ChannelPlugins):

Code:

[Config]
AuthorName=Awasu
AuthorEmailAddress=support@awasu.com
PluginNotes=This channel plugin fixes un-escaped ampersands in a feed.

' --------------------------------------------------------------------

[ChannelParameterDefinition-1]
Name=DownloadUrl
Type=string
DefaultValue=
Description=The URL of the RSS feed to fix.


2) Translate the following script to Perl and save it in the same directory:

Code:

import sys
import re
import win32api  # part of the pywin32 package

# get the name of the INI file (first command-line parameter)
configFilename = sys.argv[1]

# look up the name of the file that holds the downloaded RSS feed
rssFeedFilename = win32api.GetProfileVal( "System" , "DownloadUrlFile" , "" , configFilename )

# read the RSS feed
with open( rssFeedFilename , "r" ) as fp :
    rssFeed = fp.read()

# fix any unescaped ampersands (only those with a space on each side)
fixedRssFeed = re.sub( " & " , " &amp; " , rssFeed )

# output the fixed RSS feed
sys.stdout.write( fixedRssFeed )


3) Start the Channel Wizard and browse to your new script. Enter the URL of the RSS feed when asked.
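The Python script above depends on win32api to read the INI file. If pywin32 isn't installed, something along these lines might do instead, using only the standard library. It assumes the config file passed to the plugin is a plain INI file with a [System] section containing DownloadUrlFile, which is inferred from the GetProfileVal call and not confirmed anywhere; the function name is made up.

```python
import configparser


def feed_filename_from_ini(ini_text):
    """Return the DownloadUrlFile value from the [System] section, or "" if absent."""
    # interpolation=None keeps any '%' characters in paths from being interpreted;
    # strict=False tolerates duplicate sections or keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.get("System", "DownloadUrlFile", fallback="")


# Hypothetical config contents, for illustration only.
sample_ini = "[System]\nDownloadUrlFile=C:\\temp\\feed.xml\n"
```

In a real plugin the text would come from reading the file named by sys.argv[1]. If the real config file contains lines that aren't valid INI syntax, configparser will raise an error, in which case a plain line-by-line scan for "DownloadUrlFile=" is the safer fallback.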

Let me know how you go. If you send me your script, I'll include it in the samples and post it on the web site.

This is, of course, only a short-term work-around and I'll try to fold this fix into the next beta.
Last edited by support on Mon Aug 18, 2003 6:07 am, edited 1 time in total.

chad

Postby chad » Mon Aug 18, 2003 2:52 am

So my sample code line got translated and was wrong. Here it is, double-escaped:

$raw_data =~ s/ & / &amp; /sg;

Thanks for the info. I knew you were programming in something starting with "P", but forgot it was Python and not perl. I don't really know how to translate that code to perl, but I guess I can plug away at it in my spare time.

support

Postby support » Mon Aug 18, 2003 3:16 am

Give this a go (save it as EscapeAmpersands.pl):

Code:

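# get the name of the INI file (first command-line parameter)
$configFilename = $ARGV[0] ;

# locate the RSS feed
open( FILE , $configFilename ) or die "Can't open $configFilename: $!" ;
while( $var = <FILE> )
{
   if ( $var =~ m"^DownloadUrlFile=" )
   {
      # strip the line ending so it doesn't end up in the filename
      $var =~ s/[\r\n]+$// ;
      # split on the first "=" only, in case the filename contains one
      ( $ignore , $rssFeedFilename ) = split( /=/ , $var , 2 ) ;
   }
}
close( FILE ) ;

# read the RSS feed
open( FILE , $rssFeedFilename ) or die "Can't open $rssFeedFilename: $!" ;
$rssFeed = "" ;
while( $var = <FILE> )
{
   $rssFeed .= $var ;
}
close( FILE ) ;

# fix any unescaped ampersands (only those with a space on each side)
$rssFeed =~ s/ & / &amp; /sg ;

# output the fixed RSS feed
print $rssFeed ;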

Like I said, I don't really know Perl :roll: :-)

chad

Postby chad » Thu Aug 21, 2003 2:39 am

Thanks! The plugin script works so far. There currently are no un-escaped ampersands in the feed, but give them a week or two and I'm sure there will be another. I'll post back and let you know how it works.

support

Postby support » Thu Aug 21, 2003 2:46 am

chad wrote:The plugin script works so far.


Thanks for letting us know.

I'm a Perl programmer now! Woo hoo! :-) :roll: :cry: :oops:

chad

Postby chad » Sat Aug 30, 2003 1:19 am

Got a feed today from starwars.com where the normal feed failed to parse and the plugin feed worked fine.

Note that it was necessary to install a perl scripting program in order for the plugin to work. I used ActivePerl.

support

Postby support » Sat Aug 30, 2003 1:25 am

chad wrote:Got a feed today from starwars.com where the normal feed failed to parse and the plugin feed worked fine.


Cool. Thanks for letting us know. I've already put something to handle this into the main app which will be released in a week or two.

But it's nice to hear that my first ever Perl program worked :roll: :lol: :oops:

chad

Postby chad » Sat Aug 30, 2003 8:49 am

Note, however, that it only covers a limited subset--an ampersand with spaces on each side of it. This is a stopgap measure at best, but still a useful one.
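For anyone wanting to cover more than the space-delimited case, one common approach is to escape every ampersand that does not already begin an entity or character reference. The Python sketch below is just that idea under stated assumptions; the names are made up and it is not the fix that went into Awasu.

```python
import re

# An ampersand is "bare" unless it starts a named entity (&name;),
# a decimal reference (&#38;) or a hexadecimal reference (&#x26;).
_BARE_AMPERSAND = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)"
)


def escape_bare_ampersands(text):
    """Replace ampersands that do not start an entity or character reference."""
    return _BARE_AMPERSAND.sub("&amp;", text)
```

This still has gaps: text like "Q&A;" looks like an entity and is left alone, and ampersands inside CDATA sections (where they are legal) would be escaped too, so it remains a stopgap.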

support

Postby support » Sat Aug 30, 2003 8:55 am

chad wrote:Note, however, that it only covers a limited subset--an ampersand with spaces on each side of it. This is a stopgap measure at best, but still a useful one.


Yep. The one I wrote for Awasu is a lot smarter and should handle just about everything (famous last words :roll:).

