Connecting to Chrome with AutoHotkey
This discussion wasn’t meant to be shared. GeekDude was giving me some background on how we’re connecting to Chrome. It is a bit “advanced”, but it has some really good background info (especially for understanding what a socket versus a WebSocket is). Below are the video and my transcribed notes from the discussion.
If you’re loving this, please consider donating to GeekDude!
1:50 Why we start with the remote debugging port
1:41 Pages in the debugging environment
2:18 Other browser automation tools like Selenium realized this is a great way to connect
2:53 The debugging tools like Chrome, Selenium, and Firefox all adopted this same approach
4:00 Devtools protocol https://chromedevtools.github.io/devtools-protocol/
4:09 Can I get the protocol as JSON? If you’ve launched Chrome with --remote-debugging-port=9222, the complete protocol version it speaks is available at localhost:9222/json/protocol (remember to close all instances of Chrome before launching in debug mode).
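As a rough illustration of the step above, here is a minimal sketch in Python (not AutoHotkey, and not how Chrome.ahk does it). The helper names are made up for this example, and it assumes the /json/protocol document has a top-level "domains" list whose entries carry a "domain" name.

```python
import json
import urllib.request

DEBUG_PORT = 9222  # the port used in the notes; any free port should work


def chrome_launch_args(chrome_path, port=DEBUG_PORT):
    """Build the argument list that starts Chrome in debug mode."""
    return [chrome_path, f"--remote-debugging-port={port}"]


def protocol_url(port=DEBUG_PORT):
    """URL where a debug-mode Chrome serves its protocol description."""
    return f"http://127.0.0.1:{port}/json/protocol"


def list_domains(protocol):
    """Return the domain names from an already parsed protocol document.

    Assumption: the document has a top-level "domains" list.
    """
    return [d["domain"] for d in protocol.get("domains", [])]


def fetch_protocol(port=DEBUG_PORT):
    """Download and parse the protocol description (needs Chrome running)."""
    with urllib.request.urlopen(protocol_url(port)) as resp:
        return json.load(resp)
```

With Chrome already running in debug mode, list_domains(fetch_protocol()) would list the areas of the browser the protocol can control.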
4:30 The JSON string talks about everything you can do with the protocol
4:55 If you browse to the JSON page at http://127.0.0.1:9222/json, Chrome will show you all the debuggable pages: tabs, plugins, etc.
5:44 In the JSON, look for webSocketDebuggerUrl and pick a “page”. Connecting to that URL is what allows you to automate it.
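To make the "look for webSocketDebuggerUrl and pick a page" step concrete, here is a small Python sketch (again just an illustration, not Chrome.ahk's code). The sample data is invented and only mimics the shape of the /json listing; the IDs, titles, and helper names are hypothetical.

```python
import json

# Invented sample that mimics the shape of http://127.0.0.1:9222/json
SAMPLE_TARGETS = json.loads("""
[
  {"id": "AAAA", "type": "page", "title": "Example tab",
   "url": "https://example.com/",
   "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/AAAA"},
  {"id": "BBBB", "type": "iframe", "title": "Embedded frame",
   "url": "https://example.com/frame",
   "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/BBBB"},
  {"id": "CCCC", "type": "background_page", "title": "Some extension",
   "url": "chrome-extension://cccc/background.html"}
]
""")


def debuggable(targets, target_type="page"):
    """Keep targets of one type that actually expose a debugger URL."""
    return [t for t in targets
            if t.get("type") == target_type and t.get("webSocketDebuggerUrl")]


def pick_by_title(targets, text):
    """Return the debugger URL of the first page whose title contains text."""
    for target in debuggable(targets):
        if text.lower() in target.get("title", "").lower():
            return target["webSocketDebuggerUrl"]
    return None
```

Filtering on the type field is one way to skip the iframes and extension entries that also show up in the list.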
6:40 iFrame example with the Google Doodle URL. This will give you just the iFrame. Get the “devtoolsFrontendUrl” path, then concatenate it with your IP and port (http://127.0.0.1:9222). For example, mine for Hangouts was: http://127.0.0.1:9222/devtools/inspector.html?ws=127.0.0.1:9222/devtools/page/002FA737EFE330712C084757D33748F6
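The concatenation described above could look something like this in Python (an illustrative sketch; the function name is made up). It assumes devtoolsFrontendUrl is a path starting with /devtools/..., as in the example from the notes; if a Chrome version hands back a full URL instead, the sketch passes it through unchanged.

```python
def frontend_url(target, host="127.0.0.1", port=9222):
    """Join the debug host and port with a target's devtoolsFrontendUrl."""
    path = target["devtoolsFrontendUrl"]
    # Assumption: some versions may return an absolute URL; leave it alone.
    if path.startswith(("http://", "https://")):
        return path
    if not path.startswith("/"):
        path = "/" + path
    return f"http://{host}:{port}{path}"


# The target below reuses the Hangouts example from the notes.
hangouts = {
    "devtoolsFrontendUrl": "/devtools/inspector.html?ws=127.0.0.1:9222"
                           "/devtools/page/002FA737EFE330712C084757D33748F6"
}
hangouts_url = frontend_url(hangouts)
```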
7:23 All you see in the debugger is from that iFrame (because we opened that iFrame directly)
9:06 Other things are marked as “pages” too. Long strings are probably extensions where someone didn’t fill out their info correctly
10:00 Automate plugins like LastPass. It’s not documented yet, but you can see how to connect to it
11:00 When you create an instance of Chrome, it launches the Chrome browser and tries to get a specific debug port, and then it saves that number for that instance.
11:37 We could have used the number
11:49 When creating other instances (GetPage()), it takes that WebSocket debugger URL and passes it to the “page” class (in Chrome.ahk).
12:20 If there is a class in future versions of Chrome.ahk, he’ll probably only have the page class, because everything being done before you connect to that page is not live. With the page, you have a live connection to the browser; everything up to this point wasn’t a “live” connection. Once you have a connection to the page, it needs to be updated…
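Once that live connection exists, what travels over it are JSON messages from the DevTools protocol, each with an id, a method name, and params. Here is a minimal Python sketch of building such messages; the CommandBuilder class is made up for this example (it is not part of Chrome.ahk), while Page.navigate is a method listed in the protocol documentation.

```python
import itertools
import json


class CommandBuilder:
    """Hypothetical helper that numbers outgoing protocol commands."""

    def __init__(self):
        # Each command gets a fresh id so replies can be matched to it.
        self._ids = itertools.count(1)

    def build(self, method, params=None):
        """Return the JSON text for one protocol command."""
        return json.dumps({
            "id": next(self._ids),
            "method": method,
            "params": params or {},
        })


builder = CommandBuilder()
navigate = builder.build("Page.navigate", {"url": "https://example.com/"})
```

The string in navigate is what would be sent as a single WebSocket message to the page's webSocketDebuggerUrl.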
13:19 A socket is when you open a connection to another machine and can send data to it and get data back. It stays open, and you can continue to transfer data back and forth
13:20 A web request is where you open a connection to a machine and ask for a resource. It can wait and, when you get that resource back, you’re “done” and the connection is closed
13:40 WebSockets bridge the two. You start by sending a web request that says you want to open a WebSocket connection; rather than a regular GET/POST WinHttpRequest, this is a special kind of request. It “upgrades” that connection to a WebSocket connection. From there it is much more similar to a regular socket: you can send data back and forth.
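To show what that “special kind of request” looks like on the wire, here is a short Python sketch of the upgrade handshake as defined in RFC 6455 (an illustration only; the function names are made up, and nothing is sent over the network). The client sends an HTTP GET with Upgrade headers and a random key, and the server proves it understood by replying with a hash of that key.

```python
import base64
import hashlib
import os

# Fixed GUID from RFC 6455 that the server appends to the client's key.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def new_key():
    """Random 16-byte nonce, base64 encoded, for Sec-WebSocket-Key."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def upgrade_request(host, path, key):
    """Text of the HTTP request that asks to upgrade to a WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


def expected_accept(key):
    """Value the server should return in Sec-WebSocket-Accept."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

If the server answers with status 101 and the matching Sec-WebSocket-Accept value, the same connection stays open and switches to WebSocket messages.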
What are WebSockets
14:29 This has been difficult to do from AutoHotkey because WebSockets were designed with a lot of abstraction to make things easier for the JavaScript developer. A raw socket is much more loosey-goosey: you send some bytes, they probably get there but probably not all at the same time, you fill up the buffer, occasionally flush the buffer, etc. WebSockets handle all of this for you! You send a “message” and it gets encapsulated. The browser only exposes full messages to you.
So you don’t have to deal with text encoding or waiting for the full bytes; it all gets handled automatically. That process takes a lot of extra code, even if you ignore the Secure Sockets Layer (SSL); writing all of that encryption code in AutoHotkey would be borderline insanity. So it’s just not available.
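To give a feel for the “encapsulation” a WebSocket library does on your behalf, here is a Python sketch of wrapping one text message in a client frame per RFC 6455 (illustrative only; the function name is made up, and a real implementation would also have to parse incoming frames, handle fragmentation, pings, and closing).

```python
import os
import struct


def encode_text_frame(message, mask=None):
    """Wrap one text message in a single masked client-to-server frame."""
    payload = message.encode("utf-8")  # text frames carry UTF-8
    if mask is None:
        mask = os.urandom(4)  # clients must mask with a random 4-byte key
    header = bytearray([0x81])  # FIN bit set + opcode 0x1 (text)
    length = len(payload)
    if length < 126:
        header.append(0x80 | length)  # mask bit + 7-bit length
    elif length < 65536:
        header.append(0x80 | 126)  # 126 means a 16-bit length follows
        header += struct.pack("!H", length)
    else:
        header.append(0x80 | 127)  # 127 means a 64-bit length follows
        header += struct.pack("!Q", length)
    # XOR every payload byte with the mask key, cycling through its 4 bytes.
    masked = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return bytes(header) + mask + masked
```

Even this one-way, unencrypted piece is a fair amount of byte handling, which gives some idea of why doing the whole thing by hand in AutoHotkey is unattractive.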
15:52 That’s why when GeekDude wrote Chrome.ahk and Discord.ahk, they both just create an instance of IE in the background and use ActiveX / COM to handle the WebSocket code. This is fast but it is part of the instability. It works great for the most part, but sometimes it just breaks down.
17:13 If IE dies, are we going to need to find another way? GeekDude thinks IE might never go away; however, he heard about a WebSockets C API (the WebSocket Protocol Component API functions) for doing WebSockets. This could be our way to create the WebSocket connection.
17:55 There’s a WebSocketCreateClientHandle function. He’s not sure what it means, but it looks like a DLL compatible API call. Hopefully we can use this to ditch IE. Taking this approach will make it strange to implement Teadrinker’s solution.