
A Starter Setup to Using SEO Crawls and Databases

3 MINUTE READ | June 4, 2016


John Greer


Most people in digital marketing now have so much data at their fingertips that working with databases is a necessity. On the SEO team, site crawls, performance numbers, and other data can be handled in Microsoft Excel, but often they go into our internal data warehouse, where much of the work can be automated.

For this example, we’re going to look at a middle ground between standalone Excel and a fully automated data warehouse: a setup that can handle larger site crawl data when you don’t have access to a team of developers.

1. Collecting a Large Amount of Data with Crawlers

One thing we have done is set up servers that allow us to crawl hundreds of thousands of URLs from very large sites. Start by running a crawler (such as Screaming Frog) and downloading all of the data on a schedule (daily, weekly, or monthly). Amazon AWS is an option here, or buy a dedicated desktop with a lot of RAM.

Once you’re done, export a simple file like a CSV.

2. Get the Data into a Database

Excel can only handle a million or so rows before it keels over. We’ll use a database to store all of the data, then pull smaller chunks from it that can still fit into Excel.

Install a database like MySQL (free) or Microsoft Access (pretty cheap). Internally, we use a different database type, but the good news is that databases pretty much work the same regardless of which one you use.

Create a table and name it “site-crawl” (or “i-luv-kittens,” whatever works for you). Before importing, adjust your crawl file by adding a column for the date of each crawl so you can spot trends over time. Then use the Access import wizard or MySQL’s import tool to upload your CSV from step 1.
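If you went the MySQL route, a minimal sketch of the table and import might look like the following. The column names (crawl_date, url, status_code, title) and the file path are illustrative assumptions, not a fixed schema; match them to whatever columns your crawler actually exports.

```sql
-- Simplified illustrative schema; the column list is an assumption,
-- trimmed down from a typical crawler export.
-- The hyphenated table name needs backticks in MySQL.
CREATE TABLE `site-crawl` (
  crawl_date  DATE,
  url         VARCHAR(2048),
  status_code INT,
  title       VARCHAR(512)
);

-- Load the CSV exported in step 1 (file path and column order are assumptions).
LOAD DATA LOCAL INFILE '/path/to/crawl-export.csv'
INTO TABLE `site-crawl`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
IGNORE 1 LINES
(crawl_date, url, status_code, title);
```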

3. Query the Database

Now you want to run some simple queries against your data. Access, like our data warehouse, has a GUI to help with this, but there will be times when you need to actually write some SQL statements. Take a little time and you can learn to write your own SQL queries; it’s easier than learning something like JavaScript or PHP, and there’s tons of help online.

Here’s an example to get you started. In MySQL or Access, you can save the query as a view/report.
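A minimal sketch, assuming the `site-crawl` table and the crawl_date and status_code columns from step 2, would count how many URLs returned a 404 on each crawl date:

```sql
-- Count 404 responses per crawl date (table and column names assumed from step 2).
SELECT crawl_date,
       COUNT(*) AS num_404s
FROM `site-crawl`
WHERE status_code = 404
GROUP BY crawl_date
ORDER BY crawl_date;
```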

And now you’ve got trended data of the site’s 404 errors.

4. Build Out the Data in a Useful Format

Last, put that data into a format that’s easier to work with and build graphs from. You could use a tool like Tableau, but in this example we’ll use Excel.

First, connect Excel to the database you queried in step 3. Start a new workbook and then click the “Data” tab. Use “From Access” for Access or the “Connections” button for MySQL.

Set up either connection; the wizards will walk you through it. Once you have that, you can either select a view you’ve set up or paste the SQL query you wrote in the previous step directly into Excel. You should get a table with all of your data, small enough that Excel won’t crash.
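If you’d rather point Excel at a saved view than paste raw SQL, one way to do that in MySQL is a sketch like the one below, reusing the assumed table and column names from earlier (the view name is also just an example):

```sql
-- Wrap the step 3 query in a view so Excel (or the Access GUI) can select it directly.
CREATE VIEW crawl_404_trend AS
SELECT crawl_date,
       COUNT(*) AS num_404s
FROM `site-crawl`
WHERE status_code = 404
GROUP BY crawl_date;
```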

Now you can set up a graph of that data that refreshes from your database any time you refresh the workbook.
