Difference between revisions of "XmlCleaner Plugin Channel"

From AwasuWiki

Revision as of 12:07, 19 June 2006

This plugin uses uTidyLib to clean up errors in a malformed XML feed.


Installation

  • Install Python (http://www.python.org/), PyWin32 (http://sourceforge.net/projects/pywin32/), and uTidyLib (http://utidylib.berlios.de/).
  • Create the two files listed below. The recommended location is the `ChannelPlugins` directory under Awasu's installation directory.

XmlCleaner.plugin:

[Config]
AuthorName=Tomi Junnila
AuthorEmailAddress=notlisted
PluginNotes=This plugin uses uTidylib to clean up an erroneous XML feed

' --------------------------------------------------------------------------------

[ChannelParameterDefinition-1]
Name=URL
Type=string
DefaultValue=
Description=URL to read erroneous feed from.

XmlCleaner.py:

# -*- coding: iso-8859-1 -*-
# Python 2 script. sys.argv[1] is the configuration file that holds the
# channel parameters.

import sys, win32api
import urllib
import tidy

# Set options for uTidyLib
tidyopts = dict(output_xml=1, input_xml=1, add_xml_decl=1, indent=1, tidy_mark=0, output_encoding='utf8')

# Get the feed URL from the plugin configuration:
url = win32api.GetProfileVal("ChannelParameters","URL","",sys.argv[1])

# Download the feed. No proxies argument is passed, so urllib falls back
# to the default proxy settings.
page = urllib.urlopen(url)
xml = page.read()
page.close()

# Clean up the XML and print the result:
xml = tidy.parseString(xml, **tidyopts)
print xml
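
The script above targets Python 2 and needs PyWin32 to read its parameter. As a rough sketch only, the first two steps (reading the URL parameter and downloading the feed) could look like this on Python 3 with just the standard library. It assumes the file named in sys.argv[1] is a plain INI-style file with a [ChannelParameters] section, which is what the GetProfileVal call implies. The function names read_channel_parameter and fetch_feed are hypothetical, and the uTidyLib step is left out.

```python
import configparser
import urllib.request


def read_channel_parameter(config_path, name, default=""):
    """Read one value from the [ChannelParameters] section of an INI-style file."""
    # interpolation=None stops '%' characters (common in URLs) from being
    # treated as configparser interpolation markers.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)
    # fallback covers both a missing section and a missing key.
    return parser.get("ChannelParameters", name, fallback=default)


def fetch_feed(url, timeout=30):
    """Return the raw bytes of the document at `url`."""
    # urlopen applies the default proxy handling when no opener is installed.
    with urllib.request.urlopen(url, timeout=timeout) as page:
        return page.read()
```

A script built on this sketch would call read_channel_parameter(sys.argv[1], "URL"), pass the result to fetch_feed, and hand the returned bytes to the tidy step.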

Usage

To use the XmlCleaner plugin, select File -> New channel, then "Generated by a channel plugin", and browse to the XmlCleaner.py file. Add the feed URL in the plugin's URL parameter, and you're done.