Scraping Data From Websites

As digital marketers, we should be using big data to inform most of the decisions we make. Using intelligence to understand what works within your industry is absolutely crucial for content campaigns, yet it blows my mind how many businesses aren’t concentrating on it. One reason I often hear from businesses is that they don’t have the budget to invest in complex and expensive tools that can feed them reams of data. That said, you do not always need to purchase expensive tools to gather valuable intelligence. This is where data scraping comes in.

Essentially, data scraping consists of crawling through a web page and gathering nuggets of information that you can use for your analysis. Hopefully, you can begin to see how this data can be valuable. What’s more, it doesn’t require any coding knowledge. If you can follow my simple instructions, you can begin gathering information that will inform your content campaigns.

I’ve recently used this research to help me get a post published on the front page of BuzzFeed, where it was viewed over 100,000 times and channeled plenty of traffic through to my blog.

Disclaimer: One very important point that I need to stress before you read on is that scraping a website may breach its terms of service.

You should ensure that this isn’t the case before carrying out any scraping activities. For instance, Twitter completely prohibits the scraping of information on their site. Google’s Terms of Service don’t allow the sending of automated queries of any sort to their system without express permission in advance from Google.
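For readers who are comfortable with a little scripting, one optional extra check is a site's robots.txt file. This is an assumption on my part rather than something the steps in this post depend on, and robots.txt is not the same thing as the terms of service, so it doesn't replace reading them. The sketch below uses Python's standard `urllib.robotparser` against a made-up robots.txt (the rules and the example.com URLs are hypothetical) to see whether a given URL is open to crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blog_ok = is_allowed(ROBOTS_TXT, "https://example.com/blog/post-1")
search_ok = is_allowed(ROBOTS_TXT, "https://example.com/search?q=test")
print(blog_ok, search_ok)
```

To run it against a real site, `RobotFileParser` also has `set_url()` and `read()` methods that fetch the live file instead of a string.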

So be careful, kids. Mastering the basics of data scraping will open up a whole new world of possibilities for content analysis. I’d advise any content marketer (or at least a member of their team) to get clued up on this.

Before I get started with the specific examples, you’ll need to make sure that you have Microsoft Excel on your computer (everyone should have Excel!) as well as the SEO Tools plugin for Excel (available as a free download).

I’ve put together a full tutorial on using the SEO Tools plugin that you may also be interested in. Alongside this, you will want a web crawling tool like Screaming Frog’s SEO Spider or Xenu’s Link Sleuth (both have free options). Once you have these set up, you can do everything that I outline below.

Analysing big blogs and publications to find who the influential authors are can give you some really valuable data. I use this information on a regular basis to build relationships with influential writers and get my content placed on top-tier websites.

Step 1: Gather a list of the URLs from the domains you’re analysing using Screaming Frog’s SEO Spider.

Simply add the root domain into Screaming Frog’s interface and hit start (if you haven’t used this tool before, you can check out my guide). After the tool has finished gathering all the URLs (this can take a little while for big websites), simply export them all to an Excel spreadsheet.
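If you would rather tidy the exported list with a script than by hand in Excel, here is a minimal Python sketch of that step. It assumes the export has been saved as CSV with columns named `Address` and `Status Code`, and that article URLs contain `/blog/`. The column names, the path marker and the sample rows are all illustrative assumptions, so check them against your own export.

```python
import csv
import io

# Hypothetical sample standing in for the crawler's export; column names are assumed.
SAMPLE_EXPORT = """\
Address,Status Code
https://example.com/,200
https://example.com/blog/first-post,200
https://example.com/blog/old-post,404
https://example.com/about,200
"""


def article_urls(csv_file, path_marker="/blog/"):
    """Keep URLs that returned a 200 status and look like article pages."""
    reader = csv.DictReader(csv_file)
    return [
        row["Address"]
        for row in reader
        if row["Status Code"] == "200" and path_marker in row["Address"]
    ]


urls = article_urls(io.StringIO(SAMPLE_EXPORT))
print(urls)
```

With a real export you would pass an open file instead of the inline sample, for example `open("export.csv", newline="")`, where the file name is again just a placeholder.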

Step 2: Open Google Chrome, navigate to one of the article pages of the site you’re analysing and find where they mention the author’s name (normally, this is within an author bio section or near the post title). Once you’ve found this, right-click their name and choose Inspect Element (this brings up the Chrome developer console).
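What you are looking for in the developer console is the tag and class that wrap the author's name, because that is what lets you pull the same element from every article page. As a rough illustration only, the Python sketch below does this with the standard `html.parser` module. The sample markup and the `author-name` class are hypothetical, so substitute whatever class you actually see when you inspect the page.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

# Hypothetical article markup; the "author-name" class is an assumption.
SAMPLE_HTML = """
<article>
  <h1>Example Post</h1>
  <p class="byline">By <a class="author-name" href="/authors/jane-doe">Jane Doe</a></p>
</article>
"""


class AuthorExtractor(HTMLParser):
    """Collect the text inside the first element carrying a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # greater than 0 while inside the matching element
        self.parts = []     # text fragments found inside it
        self.done = False   # True once the matching element has closed

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # nested tag inside the matching element
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entered the matching element

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_author(html, css_class="author-name"):
    """Return the author text, or None if no element has the class."""
    parser = AuthorExtractor(css_class)
    parser.feed(html)
    return "".join(parser.parts).strip() or None


author = extract_author(SAMPLE_HTML)
print(author)
```

This is a simplification: it only matches on a single class name and takes the first match, which is usually enough when a site uses one consistent byline template.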