AutoHotkey webinar: Deep-Dive into File Encoding

In this AutoHotkey webinar, our guest speaker Jean Lalonde (author of Quick Access Popup) took a deep dive into file encoding.

Video Hour 1: High Level

  • Why File encoding matters
  • Pros/Cons of each type of encoding
  • Tools like the File Encoding Lab to help determine a file’s encoding
  • How to set File Encoding in AutoHotkey

Video Hour 2: Coding and Q&A

Script Highlight: BarChart by Learning One

(Here is a link to BarChart scripts demonstrated during the webinar)


1) File Encoding in AutoHotkey

(Here is a link to all of Jean Lalonde’s files)
  • Edit and run the AHK script “FileRead.ahk”
  • Load “Demo-UTF-16.txt” with default encoding
  • Load “Demo-UTF-16-No_BOM.txt” with default encoding  -> problem!
  • Load “Demo-UTF-16-No_BOM.txt” with UTF-16-RAW encoding  -> OK!
  • Try the other file encodings available in AHK
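
The difference between the last two loads can be reproduced in a few lines. This is a minimal sketch, assuming AutoHotkey v1.1 (Unicode build) and that “Demo-UTF-16-No_BOM.txt” from the webinar files sits in the script's folder; the variable names are illustrative and not taken from “FileRead.ahk”.

```autohotkey
; Assumption: the demo file from the webinar download is in the script's folder.
path := A_ScriptDir "\Demo-UTF-16-No_BOM.txt"

; 1) Default encoding: with no BOM to detect, the bytes are decoded
;    with the system default ANSI code page, so the text comes out wrong.
FileEncoding                      ; no parameter = system default ANSI code page
FileRead, asDefault, %path%

; 2) UTF-16-RAW: decode as UTF-16 without requiring a BOM.
FileEncoding, UTF-16-RAW
FileRead, asUtf16, %path%
FileEncoding                      ; restore the default for the rest of the script

MsgBox, % "Default encoding:`n" asDefault "`n`nUTF-16-RAW:`n" asUtf16
```

FileEncoding changes the default for every later FileRead, FileAppend and FileOpen() call in the script, which is why the sketch resets it at the end.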

2) File Encoding in DOS (code pages)

  • Run the batch file “Type Box.bat” in a DOS console (under the “Tutorial” folder)
  • See this ASCII/ANSI file displayed with code pages 1252 (default ASCII/ANSI) and 437
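
The same effect can be shown from AutoHotkey by decoding one byte under both code pages. This is a sketch under assumptions: it needs a Unicode build of AutoHotkey v1.1, and the byte 0xC9 is picked here for illustration; it is not necessarily a byte used in the webinar's demo file.

```autohotkey
; Put the single byte 0xC9 into a one-byte buffer.
VarSetCapacity(buf, 1, 0)
NumPut(0xC9, buf, 0, "UChar")

; Decode that same byte under each code page.
as1252 := StrGet(&buf, 1, "CP1252")   ; Latin capital E with acute accent
as437  := StrGet(&buf, 1, "CP437")    ; double-line box-drawing corner

MsgBox, % "CP1252: " as1252 "`nCP437: " as437
```

The byte never changes; only the code page used to interpret it does.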

3) Load the File Encoding Lab

  • Run the AHK script “File Encoding Lab.ahk”
  • Loads “ASCII.txt” by default (detected CP1252, 7-bit chars)
  • File Encoding Lab tour
    • Binary display on the left side / Normal AHK display on the right
    • Click on the file name to see with Notepad
    • See the detected encoding and use the dropdown list to reload with another encoding
  • Load “ANSI.txt” (detected CP1252, 8-bit chars)
  • Load “Box-CP437.txt” (detected CP1252, 8-bit chars)
    • Which encoding will display a box?
  • Load “UTF-8.txt” (detected UTF-8)
    • See the BOM (“byte order mark”, or header) on the left side: the first three bytes (EF BB BF)
    • See one byte for “!”
    • See two bytes each for “é” and “É”
    • See three bytes for “用”
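
The byte counts listed above can also be checked in code. A minimal sketch, assuming a Unicode build of AutoHotkey v1.1.17 or later (for Format()); the characters are built from their code points with Chr() so the result does not depend on how the script file itself is saved.

```autohotkey
report := ""
; Code points for "!", "é", "É" and "用".
for index, codePoint in [0x21, 0xE9, 0xC9, 0x7528]
{
    char := Chr(codePoint)
    ; Without a target, StrPut returns the buffer size needed for the converted
    ; string, including the null terminator. For UTF-8 each unit is one byte.
    byteCount := StrPut(char, "UTF-8") - 1
    report .= Format("U+{:04X} {}: {} byte(s)`n", codePoint, char, byteCount)
}
MsgBox, % report
```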


4) File Encoding Lab Cheat Sheet

  • ASCII and ANSI
  • Unicode 8-16-32 bits
  • Unicode with/without BOM
  • Unicode Little or Big Endian
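
To connect the cheat sheet items, here is a sketch of how a script might look for a BOM itself. DetectBOM is a hypothetical helper written for this page, not part of the File Encoding Lab, and it assumes AutoHotkey v1.1. A file without a BOM returns "none", which says nothing about its actual encoding.

```autohotkey
; Assumption: "UTF-8.txt" from the webinar files is in the script's folder.
MsgBox, % DetectBOM(A_ScriptDir "\UTF-8.txt")

; Hypothetical helper: names the BOM found at the start of a file.
DetectBOM(path) {
    file := FileOpen(path, "r")
    if !IsObject(file)
        return "unreadable"
    file.Seek(0)                      ; FileOpen skips a detected BOM, so rewind
    VarSetCapacity(head, 4, 0)
    count := file.RawRead(head, 4)    ; number of bytes actually read
    file.Close()
    b0 := NumGet(head, 0, "UChar"), b1 := NumGet(head, 1, "UChar")
    b2 := NumGet(head, 2, "UChar"), b3 := NumGet(head, 3, "UChar")
    ; UTF-32 is tested first because its little endian BOM starts with the
    ; same two bytes as the UTF-16 little endian BOM.
    if (count >= 4 && b0 = 0xFF && b1 = 0xFE && b2 = 0 && b3 = 0)
        return "UTF-32 little endian"
    if (count >= 4 && b0 = 0 && b1 = 0 && b2 = 0xFE && b3 = 0xFF)
        return "UTF-32 big endian"
    if (count >= 3 && b0 = 0xEF && b1 = 0xBB && b2 = 0xBF)
        return "UTF-8"
    if (count >= 2 && b0 = 0xFF && b1 = 0xFE)
        return "UTF-16 little endian"
    if (count >= 2 && b0 = 0xFE && b1 = 0xFF)
        return "UTF-16 big endian"
    return "none"
}
```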

5) Real life files

  • QAP Spanish translation
  • QAP Chinese translation
  • Other examples?

Web Scraping with AutoHotkey 110- Saving Files / Images from a URL / Hyperlink

While URLDownloadToFile is a built-in command for downloading files, including binary ones (images, Word files, spreadsheets, etc.), I and others have had issues using it (in my case, I think it was because I was behind a proxy at work). In the video below I demonstrate how to use the URLDownloadToFile command as well as a function I borrowed from Maestrith (author of AutoHotkey Studio).

Saving Files / Images from a URL / Hyperlink

 

Here is the code I walk through in the first part of the above video (demonstrating the built-in functionality and an example calling the function):

#SingleInstance,Force
;*************URL download to File*************************
url:="https://i2.wp.com/www.toptenia.com/wp-content/uploads/2017/08/gal-gadot.jpg"
;*********************url Download to file**********************************
UrlDownloadToFile, % url, % "Gal_Gadot_URLDownload.jpg" ;simple built-in way to download a file given a url


Download_File_XMLHTTP(URL) ;Call the function
;********************created by Maestrith but tweaked by Joe Glines***********************************
Download_File_XMLHTTP(URL){
	SplitPath,URL,File_Name ;get file name from URL
	req:=ComObjCreate("MSXML2.XMLHTTP.6.0"),ado:=ComObjCreate("ADODB.Stream")
	req.Open("HEAD",URL),req.Send() ;requests the headers only; the response is not used below
	ado.Type:=1 ;1 = binary stream
	req.Open("GET",URL,1),req.Send() ;third parameter 1 = asynchronous request
	while(req.ReadyState!=4){ ;4 = request complete
		Sleep,50
	}
	ado.Open(),ado.Write(req.ResponseBody),ado.SaveToFile(File_Name,2),ado.Close() ;2 = overwrite an existing file
	Sleep, 100
}
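
Because the built-in command can fail (for example behind a proxy), it can help to check its result. This is a minimal sketch assuming AutoHotkey v1.1; it reuses the URL from the script above, and the file name variable is illustrative. UrlDownloadToFile sets ErrorLevel to 1 when there is a problem and 0 otherwise.

```autohotkey
#SingleInstance, Force
; Same image URL as in the script above.
url := "https://i2.wp.com/www.toptenia.com/wp-content/uploads/2017/08/gal-gadot.jpg"
SplitPath, url, fileName          ; derive the local file name from the URL

UrlDownloadToFile, %url%, %fileName%
if ErrorLevel
    MsgBox, 48, Download failed, % "UrlDownloadToFile could not fetch:`n" url
else
    MsgBox, 64, Download complete, % "Saved as " A_WorkingDir "\" fileName
```

An ErrorLevel of 0 only means a file was written. A server that answers with an error page instead of the image can still look like a success, so checking the saved file is worthwhile.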

And here is the code where I demonstrate how you can get a list of images and iterate over them in an object (calling the download function for each one):

#SingleInstance,Force
global Obj:=[] ;Creates obj holder for variables
;**************************************
pwb := WBGet()
MsgBox % pwb.document.images.length ;Show how many images there are
ComObjError(false)  ;Need to turn off so doesn't trigger error
;******example with While loop***Note a_index-1 is in first row, not each individual one*	 
While(ele:=pwb.document.links[a_index-1]){ ;store reference to element in ele While looping over elements
	if InStr(ele.href,"https://www.google.com/imgres?imgurl="){ ;if one of the images from Google.com
		obj.Push(StrSplit(uri_decode(StrSplit(ele.href,"https://www.google.com/imgres?imgurl=").2),["?","&"]).1) ;Strip out a lot of the un-wanted text; Push appends without leaving gaps
	}
}
for k, v in obj{
	Download_File_XMLHTTP(v)
	Sleep, 100
}
ComObjError(True)  ;Turn back on

;********************created by Maestrith but tweaked by Joe Glines***********************************
Download_File_XMLHTTP(URL){
	SplitPath,URL,File_Name ;get file name from URL
	req:=ComObjCreate("MSXML2.XMLHTTP.6.0"),ado:=ComObjCreate("ADODB.Stream")
	req.Open("HEAD",URL),req.Send() ;requests the headers only; the response is not used below
	ado.Type:=1 ;1 = binary stream
	req.Open("GET",URL,1),req.Send() ;third parameter 1 = asynchronous request
	while(req.ReadyState!=4){ ;4 = request complete
		Sleep,50
	}
	ado.Open(),ado.Write(req.ResponseBody),ado.SaveToFile(File_Name,2),ado.Close() ;2 = overwrite an existing file
	Sleep, 100
}


;~ http://www.autohotkey.com/board/topic/47052-basic-webpage-controls-with-javascript-com-tutorial/
;~ wb := WBGet()
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) {               ;// based on ComObjQuery docs
   static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}"   ;// IID_IWebBrowserApp
;//     , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}"   ;// IID_IHTMLWindow2
   SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
   if (ErrorLevel != "FAIL") {
      lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
      if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
         DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
         return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
      }
   }
}

Uri_Decode(str) {
	Loop
		If RegExMatch(str, "i)(?<=%)[\da-f]{1,2}", hex)
			StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
		Else Break
	Return, str
}

Uri_Encode(Uri, full = 0)
{
	oSC := ComObjCreate("ScriptControl")
	oSC.Language := "JScript"
	Script := "var Encoded = encodeURIComponent(""" . Uri . """)"
	oSC.ExecuteStatement(Script)
	encoded := oSC.Eval("Encoded")
	Return encoded
}
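
The one-line extraction inside the While loop is dense, so here it is unrolled step by step. This is a sketch assuming AutoHotkey v1.1; the href is made up in the shape the script expects (example.com is a placeholder, not a real result link), and Uri_Decode is repeated so the snippet runs on its own.

```autohotkey
; Made-up href in the shape the script above looks for.
href := "https://www.google.com/imgres?imgurl=https%3A%2F%2Fexample.com%2Fpics%2Fcat.jpg&imgrefurl=https%3A%2F%2Fexample.com%2F"

; Step 1: keep everything after the imgurl= prefix.
afterPrefix := StrSplit(href, "https://www.google.com/imgres?imgurl=").2

; Step 2: turn %3A, %2F and so on back into ":" and "/".
decoded := Uri_Decode(afterPrefix)

; Step 3: cut at the first "?" or "&" so only the image URL remains.
imageUrl := StrSplit(decoded, ["?", "&"]).1

MsgBox, % imageUrl   ; https://example.com/pics/cat.jpg

; Same function as in the script above.
Uri_Decode(str) {
	Loop
		If RegExMatch(str, "i)(?<=%)[\da-f]{1,2}", hex)
			StringReplace, str, str, `%%hex%, % Chr("0x" . hex), All
		Else Break
	Return, str
}
```

Because the split happens after decoding, any query string that belonged to the image URL itself is cut off as well.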