XmlCleaner Plugin Channel
From AwasuWiki
This plugin cleans up a malformed XML feed by running it through uTidyLib, so that Awasu receives a corrected copy.
Installation
- Install Python, PyWin32, and uTidyLib.
- uTidyLib bundles an older version of the ctypes Python module. The bundled version works with Python 2.3; for Python 2.4, update ctypes to a newer version.
- Create the two files listed below, preferably in the `ChannelPlugins` directory under Awasu's installation directory.
XmlCleaner.plugin:
[Config]
AuthorName=Tomi Junnila
AuthorEmailAddress=notlisted
PluginNotes=This plugin uses uTidylib to clean up an erroneous XML feed

' --------------------------------------------------------------------------------

[ChannelParameterDefinition-1]
Name=DownloadUrl
Type=string
DefaultValue=
Description=URL to read erroneous feed from.
XmlCleaner.py:
# -*- coding: iso-8859-1 -*-
import sys
import win32api
import tidy
# Set options for uTidyLib
tidyopts = dict(output_xml=1, input_xml=1, add_xml_decl=1, indent=1, tidy_mark=0, output_encoding='utf8')
# Awasu will already have downloaded the DownloadUrl and stored it into a
# temporary file pointed to by DownloadUrlFile. Get the file name:
filename = win32api.GetProfileVal("System","DownloadUrlFile","",sys.argv[1])
# Then read the file:
page = open(filename, 'rb')
try:
    # read() with no size argument returns the whole file in one call
    xml = page.read()
finally:
    page.close()
# Clean up the XML:
xml = tidy.parseString(xml, **tidyopts)
print xml
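The `DownloadUrlFile` lookup in the script can be tried without PyWin32. The following is a minimal sketch in Python 3, assuming the file Awasu passes as the first argument is a plain INI file with a `[System]` section, as the `GetProfileVal` call suggests. The helper name `read_download_file` and the sample path are hypothetical and only serve the illustration.

```python
import configparser
import os
import tempfile


def read_download_file(ini_path):
    """Return the DownloadUrlFile value from the [System] section, or ''."""
    # interpolation=None keeps any '%' characters in the path literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(ini_path)  # a missing file is silently ignored
    return parser.get("System", "DownloadUrlFile", fallback="")


# Build a hypothetical stand-in for the INI file passed as sys.argv[1].
with tempfile.TemporaryDirectory() as tmp:
    ini_path = os.path.join(tmp, "params.ini")
    with open(ini_path, "w") as handle:
        handle.write("[System]\nDownloadUrlFile=C:\\Temp\\feed.xml\n")
    result = read_download_file(ini_path)
    print(result)
```

As with the empty default passed to `GetProfileVal` in the script, the sketch returns an empty string when the section or key is missing.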
Usage
To use the XmlCleaner plugin, select File -> New channel, choose "Generated by a channel plugin", and browse to the XmlCleaner.py file. Enter the feed URL in the plugin's DownloadUrl parameter, and you're done.
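If the channel still fails to load, it may help to check whether a feed is well-formed XML before and after cleaning. The sketch below is written in Python 3 and uses only the standard library. The function name `is_well_formed` and the sample feeds are hypothetical, and it tests well-formedness only, not whether the feed is valid RSS or Atom.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        parseString(xml_text)
    except ExpatError:
        return False
    return True


# Hypothetical samples: the first is missing the closing </title> tag.
broken = "<rss><channel><title>News</channel></rss>"
fixed = "<rss><channel><title>News</title></channel></rss>"
print(is_well_formed(broken), is_well_formed(fixed))
```

Saving the script's output to a file and passing its contents to `is_well_formed` is one way to see whether the cleaning step produced something a feed reader can parse.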
