kevotheclone
Posts: 239
Joined: Mon Sep 08, 2008 7:16 pm
Location: Elk Grove, California

Post by kevotheclone » Tue Jun 02, 2009 8:30 am

Picard wrote:
But I do have no experience coding VBS.


That's OK; at one time in their lives, everyone had no programming experience. Even the people who've built great software like Google's search engine or Awasu all started with "no experience".
I'm interested in resolving this too, as I'll be adding functionality to Awasu to automatically save web pages as MHT files in the future, and I'd like it to work with different languages/character sets, but I might not have enough time to fix it myself right now.

I've tried a few different things, and I still have a few more to try, so check back in a couple of days and you should see another reply from me.

Worst case scenario, I'll post my code and highlight some things that you could easily change and point you to the documentation so you'll know what values you can change them to.

Picard
Posts: 7
Joined: Thu May 28, 2009 4:15 pm

Post by Picard » Wed Jun 03, 2009 7:39 pm

This sounds nice,
I'm really looking forward to this.

Thank You,
Picard


Post by kevotheclone » Wed Jun 10, 2009 7:52 am

Hi Picard, sorry about the long period of "Deep-Silence". :wink:

I've tried a lot of different options and I don't think that I have found a good generic solution. I think I can resolve the problem for the one web page you referenced but I don't have enough time to fully test the solution on a wider range of web pages.

It's interesting: when Internet Explorer requests the page, the HTTP response headers are:

Code: Select all

Content-Language: de-DE
Content-Type: text/html; charset=UTF-8


But when you view the HTML source code of the page, it declares a different character set:

Code: Select all

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">


Manually saving the page as an MHT file with Internet Explorer changes the "charset" attribute to "utf-8", and the saved file displays the characters correctly.

Code: Select all

<META content="text/html; charset=utf-8" http-equiv=Content-Type>


Saving the page using Microsoft's CDOSYS library retains the original value, but the saved file does not display the characters correctly.

Code: Select all

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">


Baffling. :blink:
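
To make the mismatch above concrete, here is a minimal sketch in Python (not the VBScript or AutoHotKey discussed in this thread) that extracts the charset from an HTTP Content-Type value and from the page's meta tag and compares the two. The helper names header_charset and meta_charset are made up for illustration, and the regular expressions only cover simple cases like the ones shown here.

```python
import re


def header_charset(content_type):
    """Return the charset named in a Content-Type value, lowercased, or None."""
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def meta_charset(html):
    """Return the charset from the first <meta http-equiv=Content-Type> tag, or None."""
    for tag in re.findall(r'<meta\b[^>]*>', html, re.IGNORECASE):
        if re.search(r'http-equiv\s*=\s*["\']?content-type', tag, re.IGNORECASE):
            return header_charset(tag)
    return None


# Values taken from the header and page source quoted above.
http_header = "text/html; charset=UTF-8"
page_source = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'

declared_by_server = header_charset(http_header)
declared_by_page = meta_charset(page_source)

if declared_by_server != declared_by_page:
    print("mismatch:", declared_by_server, "vs", declared_by_page)
```

A check like this only detects the disagreement; it does not say which of the two declarations matches the actual bytes of the page.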

A simple solution for the one page you mentioned would be to save the page using the CDOSYS library and then perform a find and replace on the saved MHT file, changing <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> to <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. It even seems to work by searching for "iso-8859-1" and replacing it with "utf-8", but since these values appear multiple times in the MHT file I can't guarantee that a global replacement of "iso-8859-1" with "utf-8" will work for all web pages.
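
As a rough illustration of the narrower replacement, the Python sketch below changes "iso-8859-1" only inside meta tags that mention Content-Type and leaves every other occurrence alone. The function name fix_meta_charset is hypothetical. It also assumes the HTML part of the MHT file may be stored as quoted-printable, where "=" is written as "=3D", so the pattern accepts both forms; it does not handle a soft line break that falls in the middle of the charset value.

```python
import re

# Any <meta ...> tag, regardless of case.
META_TAG = re.compile(r'<meta\b[^>]*>', re.IGNORECASE)
# "charset=" (or quoted-printable "charset=3D") followed by the old value.
OLD_CHARSET = re.compile(r'(charset\s*=(?:3D)?\s*)iso-8859-1', re.IGNORECASE)


def fix_meta_charset(text):
    """Replace iso-8859-1 with utf-8, but only inside Content-Type meta tags."""

    def fix_tag(match):
        tag = match.group(0)
        if "content-type" not in tag.lower():
            return tag  # some other meta tag, leave it untouched
        return OLD_CHARSET.sub(lambda m: m.group(1) + "utf-8", tag)

    return META_TAG.sub(fix_tag, text)


sample = (
    'Content-Type: text/html; charset="iso-8859-1"\n\n'
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    "<p>iso-8859-1</p>"
)
print(fix_meta_charset(sample))
```

In the sample, the MIME header line and the paragraph text keep their "iso-8859-1", and only the meta tag is rewritten. Whether the MIME header of the part also needs changing for a given page is not something this sketch decides.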

Since you're already familiar with SavePage.exe, maybe you should continue to use it and then, after it saves the MHT file: 1) read the MHT file, 2) perform a RegExReplace(), 3) write the updated contents back over the original MHT file. It looks like you can do all of this using the AutoHotKey language.
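
The three steps above can be sketched as follows. This is Python rather than AutoHotKey, purely as an illustration, and rewrite_file is a made-up helper name. It works on raw bytes so the file is not re-encoded along the way, and it writes to a temporary file first so the original is only replaced once the new contents are safely on disk.

```python
import os
import re
import tempfile


def rewrite_file(path, pattern, replacement):
    """Read a file, apply a regex replacement to its bytes, and overwrite it.

    Returns the number of replacements made; the file is left untouched
    when nothing matches.
    """
    # 1) read the file
    with open(path, "rb") as handle:
        original = handle.read()

    # 2) perform the regex replacement
    updated, count = re.subn(pattern, replacement, original)

    # 3) write the updated contents back over the original file
    if count:
        temp_path = path + ".tmp"
        with open(temp_path, "wb") as handle:
            handle.write(updated)
        os.replace(temp_path, path)
    return count


# Demonstration on a throwaway file standing in for a saved MHT file.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "info.mht")
    with open(sample_path, "wb") as handle:
        handle.write(b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">')
    changed = rewrite_file(sample_path, rb"(?i)iso-8859-1", b"utf-8")
    print(changed)
```

The pattern used in the demonstration is the simple global replacement; a narrower pattern can be passed in the same way.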

If you'd still like my VBScript code I could post it, although it produces the same results as SavePage.exe (which is written in VB6 and also uses the CDOSYS library).

I'll still look into this again in the future and maybe we'll cross paths again in the AutoHotKey forums (which have an RSS feed that can be easily and automatically monitored with Awasu).

AutoHotKey looks interesting; I'll have to get to know it in the not-too-distant future.

Best of luck,
kevotheclone :afro:


Post by Picard » Wed Jun 10, 2009 5:22 pm

Hi kevotheclone,
There is no need to apologize.

You finally gave me the decisive hint. :shock:

kevotheclone wrote:
It even seems to work by searching for "iso-8859-1" and replacing it with "utf-8"


I solved "my problem" by doing what you just said.
Here is the section of my script that takes care of that.
Perhaps it will make you hungry for AHK. :whistle:

Code: Select all

...
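; Read the saved MHT file into a variable
FileRead, InfomhtValue, %DestinationFolder%\info.mht
; Replace every occurrence of iso-8859-1 with utf-8
StringReplace, InfomhtValueNew, InfomhtValue, iso-8859-1, utf-8, All
; Delete the old file and write the corrected contents in its place
FileDelete, %DestinationFolder%\info.mht
FileAppend, %InfomhtValueNew%, %DestinationFolder%\info.mht
; Free the memory held by the original contents
InfomhtValue =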
...


Basically, you can do lots of stuff using AHK.
It would be nice to hear from you in the future.
My AHK-USER-Name is "Deep-Silence"

Thank you for your help again,
and best of luck to you, too.

With Best Regards, Picard.
P.S. You have a nice style of writing in threads. :coolthumb:

