Tutorial: Web Scraping with Python on the Cloud
Hi all, I'm new to Hashnode. I'll be re-posting some of our well-received tutorials here. Most of them show how to create useful scripts, apps and tools. We're building a rapid development platform for developers called WayScript and would love your feedback. It's currently in beta and free to all.
Introduction
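Setting up an automated web scraping script in the cloud with WayScript takes only a few minutes. In this tutorial, we'll build a workflow that runs daily, scrapes a stock's market cap from Yahoo Finance, and stores the value in a variable so we can send it to ourselves as a text message.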
Prerequisites
There are no prerequisites, but you might find this content helpful: Working with Python
Automating a Script to Run Daily
Most things you create on WayScript can be set to run daily using a time trigger. When setting up the time trigger, we select the time we want the script to run, then build the rest of the script beneath the trigger in the workflow tree.
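To make the idea of a daily trigger concrete, here is a minimal sketch in plain Python of how the next run time could be worked out. This is only an illustration, not how WayScript implements its time trigger: the function name next_daily_run and the 9:00 run time are made up for the example.

```python
from datetime import datetime, time, timedelta

def next_daily_run(now: datetime, run_at: time) -> datetime:
    """Return the next datetime at which a job scheduled daily at run_at should fire."""
    candidate = datetime.combine(now.date(), run_at)
    # If today's slot has already passed (or is exactly now), use tomorrow's slot.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

if __name__ == "__main__":
    run_at = time(hour=9, minute=0)  # hypothetical trigger time
    now = datetime.now()
    upcoming = next_daily_run(now, run_at)
    print("Next run:", upcoming, "in", upcoming - now)
```

On WayScript, the time trigger takes care of this for us, so none of this code is needed in the workflow itself.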
Scraping our content
We'll scrape our content in this example using the Python module. We'll drag it into our workflow and write some code that looks like this:
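import requests
from bs4 import BeautifulSoup
ticker = 'AAPL'
url = 'https://finance.yahoo.com/quote/' + ticker
# Fetch the quote page and stop early on a network or HTTP error.
res = requests.get( url, timeout=10 )
res.raise_for_status()
soup = BeautifulSoup( res.text, 'html.parser' )
# Find the market cap cell; find() returns None if the page layout has changed.
market_cap_elem = soup.find( 'td', { 'data-test' : 'MARKET_CAP-value' } )
if market_cap_elem is None:
    raise ValueError( 'Market cap element not found for ' + ticker )
market_cap = market_cap_elem.text.strip()
print( ticker, 'Market Cap', market_cap )
# Pass the value to the rest of the workflow through WayScript's variables dictionary.
variables[ 'MarketCap' ] = market_cap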
With that code, we scrape information from another website and return it to our workflow as a variable using the variables dictionary. We can then use that variable in a later step to send ourselves a text message.
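If you'd like to try the parsing step without hitting the live site, here is a small sketch that runs the same BeautifulSoup lookup against a hard-coded HTML snippet. The sample HTML, its value, and the helper name extract_market_cap are made up for illustration; the real page may be structured differently or change over time, which is why the helper returns None instead of failing when the cell is missing.

```python
from bs4 import BeautifulSoup

# Made-up sample HTML that imitates the table cell targeted by the scraping code.
SAMPLE_HTML = """
<table>
  <tr>
    <td>Market Cap</td>
    <td data-test="MARKET_CAP-value">2.95T</td>
  </tr>
</table>
"""

def extract_market_cap(html):
    """Return the market cap text from the page HTML, or None if the cell is missing."""
    soup = BeautifulSoup(html, "html.parser")
    elem = soup.find("td", {"data-test": "MARKET_CAP-value"})
    if elem is None:
        return None
    return elem.get_text(strip=True)

if __name__ == "__main__":
    print(extract_market_cap(SAMPLE_HTML))
    print(extract_market_cap("<html></html>"))
```

Keeping the lookup in its own function like this makes it easy to check the selector against saved HTML before wiring it into the daily workflow.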
Questions?
If you have any questions, feel free to reach out to me. If you create an account on WayScript, you can also message and engage with the team and founders on Discord.