Web scraping (web harvesting or web data extraction) is a computer software technique of extracting information from websites. Usually, such software programs simulate human exploration of the World Wide Web by either implementing low-level Hypertext Transfer Protocol (HTTP), or embedding a fully-fledged web browser, such as Mozilla Firefox.
Web scraping is closely related to web indexing, which indexes information on the web using a bot or web crawler and is a universal technique adopted by most search engines. In contrast, web scraping focuses more on the transformation of unstructured data on the web, typically in HTML format, into structured data that can be stored and analyzed in a central local database or spreadsheet. Web scraping is also related to web automation, which simulates human browsing using computer software. Uses of web scraping include online price comparison, contact scraping, weather data monitoring, website change detection, research, web mashup and web data integration.
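To make the "unstructured HTML into structured data" idea concrete, here is a minimal sketch using only Python's standard library. The HTML snippet, the class name PriceTableParser and the field names "product" and "price" are made up for illustration; a real page would need its own parsing rules.

```python
from html.parser import HTMLParser


class PriceTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows, which contain no <td> cells
                self.rows.append(self._row)
            self._row = None


def scrape_prices(html):
    """Turn a two-column HTML table into a list of dictionaries."""
    parser = PriceTableParser()
    parser.feed(html)
    parser.close()
    return [{"product": name, "price": float(price)} for name, price in parser.rows]


# Hypothetical page fragment, standing in for a downloaded web page.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>Gadget</td><td>24.50</td></tr>
</table>
"""

records = scrape_prices(SAMPLE_HTML)
print(records)
```

The resulting list of dictionaries is the kind of structured data that could then be written to a spreadsheet or database.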
Notes from Create and connect to new Chrome profiles with AutoHotkey
00:09 There’s been a lot of confusion about Chrome profiles: what they are, and why you should concern yourself with them when using AutoHotkey.
00:23 Your Chrome profile is what keeps you logged into websites, connected to Google, etc. Most of the time you won’t need access to your entire Chrome profile. But you might want to start with a blank slate if you’re distributing your code to other people, or you might want to create a new instance of Chrome that isn’t attached to an existing Chrome tab. For any of those, you need to have a separate Chrome profile.
01:17 This is because most people don’t have the remote debugging flag on their default shortcut. If you launch Chrome with the debugging flag while Chrome is already running, it will automatically be grouped with the existing Chrome process.
02:38 So instead of spawning a new Chrome window that is listening on the debugging port, it will open a new page in the existing Chrome instance without debugging access (even though you specified debugging).
02:52 So in order to get Chrome to open a new instance, you need to use the Chrome profile.
03:51 Create a folder (name it what you want) and tell Chrome to use it via the profile flag, “--user-data-dir”, followed by your directory, e.g. “--user-data-dir=C:\temp\newProfile”.
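As a rough illustration of the launch step, the sketch below builds the command line in Python rather than AutoHotkey. The two flags, --remote-debugging-port and --user-data-dir, are real Chrome switches; the install path, the port number 9222 and the function names are assumptions for the example, not something taken from the video.

```python
import subprocess
from pathlib import Path


def build_chrome_command(chrome_exe, profile_dir, debug_port=9222):
    """Return the argument list for a Chrome instance with its own profile."""
    return [
        str(chrome_exe),
        f"--remote-debugging-port={debug_port}",  # enables remote debugging
        f"--user-data-dir={profile_dir}",         # forces a separate profile
    ]


def launch_chrome(chrome_exe, profile_dir, debug_port=9222):
    """Create the profile folder if needed, then start Chrome."""
    Path(profile_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.Popen(build_chrome_command(chrome_exe, profile_dir, debug_port))


command = build_chrome_command(
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",  # assumed install path
    r"C:\temp\newProfile",
)
print(command)
```

Calling launch_chrome with the same arguments would actually start the browser; it is left out of the example run so the sketch works on machines without Chrome installed.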
05:18 Looking in the profile folder, you can see Chrome has generated a bunch of files. Things like Cookies, browser history, etc. Everything Chrome remembers…
05:47 If you’re targeting portable Chrome, making sure you have this profile set correctly can be a big deal! If you use AutoHotkey to launch portable Chrome, it might still load the default profile. Make sure you specify the Chrome profile!
07:32 Everywhere you would have used Chrome in your script, use ChromeInst (i.e. instead of Chrome.GetPage, use ChromeInst.GetPage).
07:32 That tells Chrome to look for this new / specific instance of Chrome instead of the default version. Remember, it’s only “new” right after you make it.
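The idea of talking to one specific Chrome instance can be sketched outside AutoHotkey as well. A Chrome started with a debugging port serves a list of its open targets at the /json/list HTTP endpoint, and a script picks a page from that list. The code below is an illustrative sketch, not how Chrome.ahk is implemented; the port 9222, the function names and the sample target data are assumptions.

```python
import json
from urllib.request import urlopen


def list_targets(debug_port=9222):
    """Ask the Chrome instance on the given debugging port for its open targets."""
    with urlopen(f"http://127.0.0.1:{debug_port}/json/list", timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def first_page(targets):
    """Return the first target that is a normal page, or None if there is none."""
    for target in targets:
        if target.get("type") == "page":
            return target
    return None


# Hand-written sample in the shape of a /json/list reply (values are made up).
example_targets = [
    {
        "type": "background_page",
        "url": "chrome-extension://example/",
        "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/A",
    },
    {
        "type": "page",
        "url": "https://example.com/",
        "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/page/B",
    },
]

page = first_page(example_targets)
print(page["url"])
```

With a debugging-enabled Chrome running, first_page(list_targets(9222)) would do the same against the live instance. Because each instance listens on its own port, the port number is what ties the script to that instance rather than to the default browser.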
Here we continue with GeekDude, working with Chrome and AutoHotkey to extract data from a webpage. In this session we focus on getting lists and leveraging JSON, Chrome.Jxon_Dump, JSON.stringify, Chrome.Jxon_Load and jQuery.
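The JSON round trip behind those function names can be shown in a few lines. In this sketch Python's json.dumps stands in for the serializing side (JSON.stringify in the page) and json.loads for the parsing side (the role Chrome.Jxon_Load plays in the script). The list of titles is invented for the example.

```python
import json

# Hypothetical list of link titles, as a script might collect from a page.
titles = ["First post", "Second post", "Third post"]

# Serialize the list to a JSON string so it can cross from the page to the script.
serialized = json.dumps(titles)

# Parse the string back into a native list on the receiving side.
restored = json.loads(serialized)

print(serialized)
print(restored[0])
```

Passing the list as a single JSON string avoids having to fetch each item from the page one call at a time.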
The great news is that GeekDude explained how we can see the Reddit site the way it was in the video below!