Recently there has been a lot of buzz around a script which allows you to scrape Google’s “People also ask” into either a CSV or a visual HTML file. It looks a little something like this:
This very nice script (gquestions.py) is by nittolese, and in this post I’ll show you how to make it executable within PyCharm. I love to use PyCharm when working with Python, but to run this script within PyCharm you need to make a few changes to it, which are covered in the step-by-step guide below.
The following steps assume you have already installed Python 3 and the PyCharm IDE.
Step 1: Download the GitHub files and copy them to your project
Unzip the gquestions-master.zip file into your PyCharm project. For me this is in /Users/jack/PycharmProjects/gquestions/gquestions-master
Step 2: Install dependencies
To do this, right-click on the gquestions-master directory and open a new local terminal (Open in terminal).
This will then launch a second terminal interface. Once open, enter:
pip install -r requirements.txt
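If you want to confirm that the install worked before moving on, a short check like the one below may help. It is only a sketch: it assumes requirements.txt uses simple name-plus-version lines (for example name==1.0), and the helper names parse_requirement_names and missing_packages are hypothetical rather than part of gquestions.

```python
import re
from importlib import metadata


def parse_requirement_names(text):
    """Return the package names listed in requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before the first version specifier or extra.
        match = re.match(r"[A-Za-z0-9._-]+", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names):
    """Return the names that are not installed in the current interpreter."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

To use it, read requirements.txt from the gquestions-master directory, pass its contents to parse_requirement_names, and hand the result to missing_packages. An empty list suggests everything was installed into the interpreter PyCharm is using.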
Step 3: Test in terminal
Let’s try to find examples of English questions around “used cars”.
python gquestions.py query "used cars" en --csv
If successful, within your CSV and Driver files there will be new files for “used cars” and you can stop reading this post!
If, like me, this doesn’t work, we need to carry out a few additional steps to allow it to run.
Step 4: Download the latest version of ChromeDriver
Follow this link to ChromeDriver and download the latest stable version for your OS – mine for example is chromedriver_mac64.zip. The ChromeDriver version needs to match the major version of Chrome installed on your machine.
Step 5: Unzip ChromeDriver and copy the file path
Unzip your ChromeDriver file (chromedriver_mac64) and save it to your desktop or another memorable place. Once extracted, copy the file path – mine for example is /Users/jack/Desktop/gquestions - ChromeDriver/chromedriver.
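Before pasting the path into the script, it can be worth checking that it really points at an executable file, since a stray space or a missing execute permission is an easy mistake to make. The helper below is a hypothetical sketch, not part of gquestions:

```python
import os


def check_driver_path(path):
    """Return a list of problems found with the given ChromeDriver path."""
    problems = []
    if path != path.strip():
        problems.append("path has leading or trailing whitespace")
    if not os.path.isfile(path):
        problems.append("no file exists at this path")
    elif not os.access(path, os.X_OK):
        problems.append("file exists but is not executable")
    return problems
```

Calling check_driver_path with your own path should return an empty list if everything is in order. If it reports that the file is not executable on macOS or Linux, running chmod +x on the file usually resolves it.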
Step 6: Edit the gquestions.py file (line 69)
We now need to change the executable_path value on line 69 from the default to your own file path:
#Default executable_path value
executable_path=chrome_path
#New executable_path value from your own file path
executable_path="/Users/jack/Desktop/gquestions - ChromeDriver/chromedriver"
#Remember to use speech marks to quote a string
Step 7: Test and you’re done!
Execute a command such as python gquestions.py query "used cars" en --csv in the terminal, and this will now create new HTML and CSV files which can be used within your data analysis. The script supports a few variations:
Search “used cars” in English and export in html
python gquestions.py query "used cars" en
Search headlessly “used cars” in English and export in html
python gquestions.py query "used cars" en --headless
Search “used cars” in English and export in html and csv
python gquestions.py query "used cars" en --csv
Search “used cars” in English with a depth of 1 and export in html
python gquestions.py query "used cars" en depth 1
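If you have several keywords to research, the commands above can be assembled and run from Python rather than typed one by one. The sketch below is an assumption-laden illustration: build_command and run_keywords are hypothetical helpers, it must be run from inside the gquestions-master directory, and combining depth with the --csv or --headless flags in one call is assumed to work rather than confirmed above.

```python
import subprocess
import sys


def build_command(keyword, lang="en", depth=None, csv=False, headless=False):
    """Build the argument list for one gquestions.py run."""
    command = [sys.executable, "gquestions.py", "query", keyword, lang]
    if depth is not None:
        command += ["depth", str(depth)]
    if csv:
        command.append("--csv")
    if headless:
        command.append("--headless")
    return command


def run_keywords(keywords, **options):
    """Run gquestions.py once per keyword from inside gquestions-master."""
    for keyword in keywords:
        # check=True raises CalledProcessError if the script exits with an error.
        subprocess.run(build_command(keyword, **options), check=True)
```

For example, run_keywords(["used cars", "electric cars"], csv=True) would run the --csv command once for each keyword. Passing the arguments as a list avoids any quoting problems with keywords that contain spaces.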
What is Selenium in the context of Python?
Selenium is a library which can be used to automate web browser functions, so, within this script, we are using it to interact with Google’s search results.
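To make that a little more concrete, here is a rough sketch of the kind of thing Selenium does. It is not taken from gquestions.py: the function names and the URL parameters (q for the query, hl for the interface language) are assumptions for illustration. It also uses the Service class from Selenium 4, whereas older Selenium releases accept the executable_path argument directly.

```python
from urllib.parse import urlencode


def build_search_url(query, lang="en"):
    """Build a Google search URL (assumed format, for illustration only)."""
    return "https://www.google.com/search?" + urlencode({"q": query, "hl": lang})


def fetch_page_title(query, driver_path, lang="en"):
    """Open the search page in Chrome and return the page title."""
    # Imported here so build_search_url works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    driver = webdriver.Chrome(service=Service(executable_path=driver_path))
    try:
        driver.get(build_search_url(query, lang))
        return driver.title
    finally:
        driver.quit()  # always close the browser, even if loading fails
```

Calling fetch_page_title("used cars", "/path/to/chromedriver") with your own ChromeDriver path should briefly open a Chrome window, load the results page and return its title.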
Want to find out more? National Coding Week is coming up on 16th September. Keep an eye on our blog as we’ll be sharing useful tips daily!