Difference between revisions of "Automatic Feed Translation"

From AwasuWiki
(New page: =Automatic Feed Translation= Using the XSLT file listed below you can automatically translate a foreign language feed into your language, utilizing Google's Translation service, whenever ...)
 
(Moved Google Translate API deprecation notice to top of page)
 
(11 intermediate revisions by 3 users not shown)
Earlier revision:

=Automatic Feed Translation=

Using the XSLT file listed below you can automatically translate a foreign language feed into your language, utilizing Google's Translation service, whenever Awasu updates the associated Channel.

<pre>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:ktc="http://www.awasu.com/forums/profile.php?mode=viewprofile&amp;u=24618" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <msxsl:script language="JScript" implements-prefix="ktc">
  <![CDATA[
  // Constants...

  // Don't edit these two constants until Google says otherwise.
  var baseURL = "http://ajax.googleapis.com/ajax/services/language/translate";
  var version = "v=1.0";

  /*
  Change this constant (feedLanguage) to the two-character language code of the feed (if known).
  This will improve Google language translation accuracy.

  Leave it blank (empty quotation marks: "") if the feed language is unknown or
  the feed contains multiple languages. Google will attempt to guess the language
  each time the translateLang() function is called.

  Supported language codes are listed here:
  http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray
  */
  var feedLanguage = "pl";

  /*
  Change this constant (yourLanguage) to your desired two-character language code.
  Supported language codes are listed here:
  http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray
  */
  var yourLanguage = "en";

  var xmlhttp = new ActiveXObject("Msxml2.XMLHTTP.4.0");
  var cache = {}; // Case-sensitive results cache.

  function translateLang(textToTranslate)
  {
    try
    {
      textToTranslate = textToTranslate.replace(/^\s+|\s+$/g, ""); // Remove leading and trailing whitespace.
      textToTranslate = textToTranslate.replace(/\s+/g, " ");      // Collapse runs of spaces, newlines and tabs into a single space.
      textToTranslate = textToTranslate.substr(0, 5000);           // Truncate to 5000 characters.

      if (cache[textToTranslate])      // If it's in the cache,
        return cache[textToTranslate]; // return it.

      if (textToTranslate)
      {
        //var fullURL = baseURL + languages + yourLanguage;

        xmlhttp.open("POST", baseURL, false);
        xmlhttp.setRequestHeader("Referer", "http://www.awasu.com/wiki/Feed_Auto_Translate");
        xmlhttp.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
        xmlhttp.send(version + "&q=" + encodeURIComponent(textToTranslate) + "&langpair=" + feedLanguage + "|" + yourLanguage);
        eval("var response = " + xmlhttp.responseText);

        var results = new String(response.responseData.translatedText);
        cache[textToTranslate] = decodeURI(results); // Add the results to the cache.
        return decodeURI(results);
      }
      else // The element's value is an empty string.
      {
        cache[textToTranslate] = "";
        return "";
      }
    }
    catch(e)
    {
      cache[textToTranslate] = e;
      return e;
    }
  }

  // Test stub; not called by the templates in this stylesheet.
  function itsBetterWithBacon(textToTranslate)
  {
    return "bacon: " + textToTranslate + " :bacon";
  }
  ]]>
  </msxsl:script>

  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="/rss/channel/title | /rss/channel/description | /rss/channel/item/title | /rss/channel/item/description | /rss/channel/item/content:encoded | /rss/channel/item/category">
    <xsl:variable name="elementName" select="name()" />
    <xsl:element name="{name(.)}"><xsl:value-of select="ktc:translateLang(normalize-space(.))" /></xsl:element>
  </xsl:template>
</xsl:stylesheet>
</pre>

Latest revision as of 20:25, 1 June 2011

Google has announced that the Google Translate API, upon which this Awasu extension is based, has been officially
deprecated as of May 26, 2011. "...the number of requests you may make per day will be limited and the API will be shut
off completely on December 1, 2011."

http://code.google.com/apis/language/translate/overview.html

We're looking for an alternative service that can provide Awasu users with the same
or similar translation functionality.

Automatic Feed Translation

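You can automatically translate a foreign-language feed into your language, using Google's Translation API (http://code.google.com/apis/ajaxlanguage/), whenever Awasu updates the associated Channel. The supported languages are listed at http://code.google.com/apis/ajaxlanguage/documentation/#SupportedLanguages.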

Download and save this file somewhere:

   AutoTranslate-v2.xsl (http://www.awasu.com/downloads/extensions/AutoTranslate/AutoTranslate-v2.xsl)

NOTE: Awasu was upgraded in v2.4.3 to use MSXML6, which contains changes that break this XSL. A fix has already been made and will be released in 2.4.4.alpha1.

Customising the XSLT file

Your language

The XSLT file contains a constant called "yourLanguage", which you can set to the two-character language code of your language (codes are listed at http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray). If your language is English, the XSLT file should work without any modification.

Feed language

The XSLT file contains a constant called "feedLanguage", which you can set to a specific two-character language code (http://code.google.com/apis/ajaxlanguage/documentation/reference.html#LangNameArray) if you know the language of the feed; this will improve Google's translation accuracy. Leaving the "feedLanguage" constant blank (empty quotation marks: "") should still work, as Google will attempt to guess the feed's language. More information about setting the feed language is provided in the "Error Reporting" section below.
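
As a rough illustration of how the two constants are used: the earlier inline revision of the stylesheet combined them into the "langpair" parameter of the request it sent to Google. The sketch below reproduces only that step in plain JavaScript. The helper name buildRequestBody is hypothetical, and AutoTranslate-v2.xsl may build its request differently.

```javascript
// Sketch only: mirrors the request body built by the earlier inline stylesheet.
// buildRequestBody is a hypothetical helper name, not part of AutoTranslate-v2.xsl.
function buildRequestBody(text, feedLanguage, yourLanguage) {
  // A blank feedLanguage leaves the source side of the pair empty ("|en"),
  // which is the case where Google has to guess the feed's language.
  var langpair = feedLanguage + "|" + yourLanguage;
  return "v=1.0&q=" + encodeURIComponent(text) + "&langpair=" + langpair;
}

console.log(buildRequestBody("hello world", "pl", "en"));
console.log(buildRequestBody("hello world", "", "en"));
```

With "pl" and "en" the pair reads "pl|en"; with a blank feed language it reads "|en".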

Error Reporting

Due to the nature of XSLT processors, it is impossible to output error information in such a way that Awasu could capture it and display it in the Channel Properties dialog box. As an alternative error reporting mechanism, any errors are logged in the Windows "Application" Event Log. Two types of errors are reported: 1) problems connecting to the Google Translation service, and 2) problems reported by the Google Translation service.

These events will have "WSH" as the "Source" (screenshot: AutoTranslateEventViewer.jpg).

Double-clicking one of the events shows the detailed error code and a description of the problem (screenshot: AutoTranslateEventProperties.jpg).
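
For readers curious how a script can write such entries: Windows Script Host exposes a LogEvent method on the WScript.Shell object, and entries written through it appear with "WSH" as their source. The sketch below shows one way an error could be logged this way. Whether AutoTranslate-v2.xsl uses exactly this call is an assumption, and the helper name reportError and the message layout are hypothetical.

```javascript
// Sketch only: reportError and the message layout are hypothetical.
// LogEvent(type, message) is the WshShell method; type 1 marks the entry as an error.
var EVENT_ERROR = 1;

function reportError(shell, code, description) {
  var message = "AutoTranslate: " + code + " - " + description;
  shell.LogEvent(EVENT_ERROR, message);
  return message;
}

// Under Windows Script Host or an msxsl:script block, pass the real shell object:
//   reportError(new ActiveXObject("WScript.Shell"), 400, "could not reliably detect source language");
// Elsewhere, a stand-in object lets the helper be exercised without touching the event log.
var fakeShell = {
  entries: [],
  LogEvent: function (type, message) {
    this.entries.push({ type: type, message: message });
    return true;
  }
};

reportError(fakeShell, 400, "could not reliably detect source language");
console.log(fakeShell.entries[0].message);
```

Passing the shell object in as a parameter keeps the helper usable outside Windows, where ActiveXObject is not available.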

The most common problem I encountered when testing this XSLT was a "400" error code from Google with the description "could not reliably detect source language". Setting the feed language to a specific two-character language code eliminated this type of error completely. If a feed item fails to translate, you can translate it manually using the Send to/User tools (http://www.awasu.com/wiki/Google_User_Tools#Language_translation_Send_to_tool).
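
The two error types described in this section can be told apart roughly as follows. This is a minimal sketch, not the stylesheet's actual logic: interpretResponse is a hypothetical helper, and the responseStatus and responseDetails field names are assumed from the JSON envelope of Google's v1 AJAX API (only responseData.translatedText is confirmed by the earlier inline stylesheet). It also uses JSON.parse, which the JScript engine of that era did not provide.

```javascript
// Sketch only: interpretResponse is a hypothetical helper. The responseStatus and
// responseDetails field names are assumptions about the v1 API's JSON envelope.
function interpretResponse(httpStatus, responseText) {
  if (httpStatus !== 200) {
    // Error type 1: the service could not be reached or the transport failed.
    return { ok: false, kind: "connection", code: httpStatus, description: "HTTP request failed" };
  }
  var response = JSON.parse(responseText);
  if (response.responseStatus !== 200 || !response.responseData) {
    // Error type 2: Google answered, but reported a problem with the request.
    return { ok: false, kind: "service", code: response.responseStatus, description: response.responseDetails };
  }
  return { ok: true, text: response.responseData.translatedText };
}

var good = interpretResponse(200, '{"responseData":{"translatedText":"Good morning"},"responseDetails":null,"responseStatus":200}');
var bad = interpretResponse(200, '{"responseData":null,"responseDetails":"could not reliably detect source language","responseStatus":400}');
console.log(good.text);
console.log(bad.code + ": " + bad.description);
```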

So you may want to keep the AutoTranslate.xsl file with the feed language blank, but make copies of it and set the specific two-character language codes of the languages that you translate most often. You might name the copies something like this: AutoTranslate_ES_to_EN.xsl (Spanish to English), AutoTranslate_PL_to_EN.xsl (Polish to English), etc.
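
If you maintain several such copies, the renaming and editing can be scripted. The sketch below is one possible approach: makeVariant is a hypothetical helper, and it assumes the stylesheet declares the constant exactly as var feedLanguage = "..."; (as the earlier inline revision did), which may not hold for AutoTranslate-v2.xsl.

```javascript
// Sketch only: makeVariant is a hypothetical helper. It assumes the stylesheet text
// contains a declaration of the form: var feedLanguage = "...";
function makeVariant(xslSource, feedLanguage, yourLanguage) {
  var pattern = /var feedLanguage = "[^"]*";/;
  if (!pattern.test(xslSource)) {
    throw new Error("feedLanguage declaration not found");
  }
  return {
    // File name follows the AutoTranslate_XX_to_YY.xsl convention suggested above.
    fileName: "AutoTranslate_" + feedLanguage.toUpperCase() + "_to_" + yourLanguage.toUpperCase() + ".xsl",
    content: xslSource.replace(pattern, 'var feedLanguage = "' + feedLanguage + '";')
  };
}

// A two-line stand-in for the stylesheet text, used only to exercise the helper.
var template = 'var feedLanguage = "";\nvar yourLanguage = "en";';
var spanish = makeVariant(template, "es", "en");
console.log(spanish.fileName); // AutoTranslate_ES_to_EN.xsl
```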