
06: Chrome and AutoHotkey: Getting lists from page

Here we continue with GeekDude, working with Chrome and AutoHotkey to extract data from a webpage.  This session we focus on getting lists and leverage JSON, Chrome.Jxon_Dump, JSON.stringify, Chrome.Jxon_Load and jQuery.

The great news is that GeekDude explained how we can see the Reddit site the way it was in the video below!

By logging into https://old.reddit.com/r/AutoHotkey/, the HTML will be the same as in the video!

Chrome and AutoHotkey: Getting lists from page

If you’re loving this, please consider donating to GeekDude!

Notes: Chrome and AutoHotkey: Getting lists from page

00:05     Sometimes you want multiple items from a page.  Maybe all post titles, or all links from comments.

00:20     Looking in HTML we see there’s a lot going on.  We need to look at the structure BEFORE we get working on it.
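To make the idea of pulling several items at once more concrete, here is a minimal JavaScript sketch of the kind of code one might inject into the page. The helper name collectText and the a.title selector in the usage note are assumptions for illustration only, not taken from the video, so check the real page structure in DevTools first.

```javascript
// Hypothetical helper: gather the text of every element matching a selector
// and return it as one JSON string, which is easy to hand back to AutoHotkey.
function collectText(doc, selector) {
  const items = [];
  // querySelectorAll returns every match, not just the first one
  doc.querySelectorAll(selector).forEach(function (el) {
    items.push(el.textContent.trim());
  });
  // JSON.stringify turns the array into a single string
  return JSON.stringify(items);
}
```

In the DevTools console this could be tried as collectText(document, 'a.title'), where a.title is only a guess at the selector for post titles. On the AutoHotkey side, the JSON text that comes back could then be parsed with Chrome.Jxon_Load.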


04: Automating Chrome to Set Text & Click a button

Automating Chrome with AutoHotkey

In the fourth session with GeekDude we look at how to automate setting text in a search field and then hitting the button to submit the search.

Automating Chrome to Set Text & Click a button


If you’re loving this, please consider donating to GeekDude!

AutoHotkey script for Automating Chrome to Set Text & Click a button


#Include <Chrome>   ; Remember to put Chrome.ahk in your library folder

#SingleInstance,Force
;**************************************
page:=Chrome.GetPageByTitle("AutoHotkey Community","contains") ; Connect to the first tab whose title contains this text
If !IsObject(page){
MsgBox % "That wasn't an object / the page wasn't found"
ExitApp
}
; Static string: set the text of the search box
page.Evaluate("document.querySelector('#keywords').value ='Chrome.ahk'")
; Same JavaScript stored as plain text with the legacy = assignment
Variable =document.querySelector('#keywords').value ='Chrome.ahk'
page.Evaluate(Variable)
; Expression syntax: concatenate a variable into the JavaScript
var:="duh"
page.Evaluate("document.querySelector('#keywords').value ='" var "'")
; Click the search button (use .click() here, not .value)
page.Evaluate("document.querySelector('#search > fieldset > button').click()")

Notes for Automating Chrome to Set Text & Click a button

00:36     Go to AutoHotkey.com/boards/

00:44     Connect to tab using Chrome.GetPageByTitle(“AutoHotkey Community”) ;the default matchtype is “starts with”

01:23     Look at the page structure using right-click and Inspect.  This opens DevTools with that element selected.

01:46     It has an ID of “keywords”.  Copy the JS path, which will give you document.querySelector("#keywords")

02:26     Use the .value to set some text in that box.

03:00     page.Evaluate("document.querySelector('#keywords').value ='Chrome.ahk'")

04:01     Make sure inside the JavaScript you use the “=”, not “:=”

04:15     Some people don’t want to have to learn JavaScript.  When using Chrome, you’re going to have to learn JavaScript.

04:56     When using Chrome.ahk, we’re injecting JavaScript.  So it is best to learn it.

05:54     The button is right next to the input.  You can go back to the page and right-click the button, then hit Inspect

06:13     Test the new js path.  Instead of using .value, use .click

06:42     Test it in the Chrome developer tools console
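The two steps in these notes (set the text, then click the button) can be sketched as one small JavaScript function to try in the console before wiring it into AutoHotkey. The selectors are the ones used in the script above; the function name searchFor and the guard for missing elements are assumptions added for illustration.

```javascript
// Hypothetical helper combining the two steps from the notes:
// set the search box text, then click the button next to it.
function searchFor(doc, text) {
  const input = doc.querySelector("#keywords");
  const button = doc.querySelector("#search > fieldset > button");
  if (!input || !button) {
    return false; // selectors did not match; inspect the page again
  }
  input.value = text; // plain "=" here: this is JavaScript, not AutoHotkey
  button.click();     // click is a method, so it needs the parentheses
  return true;
}
```

In the DevTools console you would call searchFor(document, "Chrome.ahk"). If it returns false, the page structure has probably changed and the JS path needs to be copied again.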

07:18     When running an Evaluate method, it waits for the previous Evaluate to finish (so no need to sleep between them).

07:44     If you run into a problem where you think it is happening too quickly, check the forum for some solutions

08:40     Sometimes what you want to input won’t always be a static string.  If you’re trying to reference a variable, you need to use the expression syntax.  In an expression, you’re not just assigning text, you’re doing math or making function calls.

Variable =document.querySelector('#keywords').value ='Chrome.ahk'

page.Evaluate(Variable)

var:="duh"

page.Evaluate("document.querySelector('#keywords').value ='" var "'")

10:48     This works because AutoHotkey splits everything up on a given line.  First comes the name of the function, then the parentheses mark what is inside the function call, then the quoted text inside it.  It then builds the string that will be used from left to right, joining the quoted text and the variable.

12:15     AutoHotkey proceeds left to right when evaluating an expression

12:40     When you use := you’re in expression assignment mode.

13:25     With just a single = you’re in plain-text mode.  It reads everything after the = as text.

15:00     When automating a site, you don’t know what kind of buffers they have to prevent scraping / botting.

15:49     When you start automating, you might start seeing CAPTCHAs everywhere

16:04     Sites get really good at looking like a normal site to a user, but looking like an impenetrable fortress to code

16:36     If your variable contains a single quote or other special characters, JavaScript will interpret it as code instead of text.

17:13     JavaScript string escaping replaces those characters with special escape sequences

Not mentioned in Video but GeekDude wrote me after

Coco’s JSON library does actually do the escaping that we discussed when talking about putting data on the page, so you can use it to escape text before placing it inside JavaScript code. The syntax for invoking it looks like this:

variable = 123`r`n456'quote"quote

page.Evaluate("document.querySelector('#whatever').value = " Chrome.Jxon_Dump(variable))

The dump function will automatically escape anything that needs to be escaped and add quotes to anything that needs quotes.
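To see why this matters, here is a small JavaScript sketch of the same idea, using the built-in JSON.stringify in place of Chrome.Jxon_Dump. The function names buildNaive, buildEscaped and parses are hypothetical and exist only for this illustration. The test value mirrors the one above: a line break, a single quote and a double quote.

```javascript
// Naive approach: wrap the value in single quotes by hand.
function buildNaive(selector, value) {
  return "document.querySelector('" + selector + "').value = '" + value + "'";
}

// Safer approach: let JSON.stringify add the quotes and the escape sequences.
function buildEscaped(selector, value) {
  return "document.querySelector(" + JSON.stringify(selector) + ").value = " + JSON.stringify(value);
}

// Hypothetical check: does the text parse as JavaScript? (It is compiled, not run.)
function parses(code) {
  try {
    new Function(code);
    return true;
  } catch (err) {
    return false;
  }
}

const tricky = "123\r\n456'quote\"quote";
console.log(parses(buildNaive("#whatever", tricky)));   // false
console.log(parses(buildEscaped("#whatever", tricky))); // true
```

The naive version breaks because the quote and the line break end the string early, so the rest is read as code. The escaped version stays a single string literal no matter what the variable contains.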


Updated iWB2 Learner tool to grab ClassName & Highlight Frames

I use ClassName a lot when web scraping.  So much so that I updated the iWB2 Learner tool to extract it from the outerHTML and give it its own edit field (like Name and ID).

I also changed the ListView that shows Frames to be red.  I can’t tell you how much time I wasted early on when dealing with Frames.  Hopefully this will help people realize they need to take a different approach when Frames are present!

Navigating the DOM with parentNode, previousSibling, nextSibling, firstChild, and lastChild

Sometimes it is not convenient to grab the specific element you want while you’re scraping a web page.  In this tutorial I walk through how to use parentNode, previousSibling, nextSibling, firstChild and lastChild.