Technical SEO: crawl, diagnostics, and tools
Check out my Site Assessment Scorecard to get a sense of how I report on these metrics.
I get a high-level view of the site by conducting a comprehensive site audit. I begin with a Screaming Frog SEO Spider crawl, starting from the sitemap.xml (or sitemap-index.xml) referenced in the robots.txt file. Search engines expect to find that link there; if it's missing, important pages may never be discovered. I think of the sitemap as the field guide to your website, and robots.txt as where you store that guide, so checking both kills two birds with one stone. From this data set, identify issues such as broken links (404s), duplicate content, thin content, missing meta tags, and redirect errors.
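The discovery step above can be sketched in a few lines of Python. This is a minimal illustration, not a crawler: the function names and the sample robots.txt/sitemap bodies are my own, and it only does the two parsing steps (find the `Sitemap:` directive, then pull `<loc>` entries out of the sitemap XML).

```python
import xml.etree.ElementTree as ET

def sitemap_urls_from_robots(robots_txt: str) -> list[str]:
    """Extract Sitemap: directive values from a robots.txt body."""
    return [
        line.split(":", 1)[1].strip()
        for line in robots_txt.splitlines()
        if line.lower().startswith("sitemap:")
    ]

def urls_from_sitemap(sitemap_xml: str) -> list[str]:
    """Pull <loc> values out of a sitemap, ignoring the XML namespace."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.iter() if el.tag.endswith("loc")]
```

The same two functions work for a sitemap index, since index files also list child sitemaps in `<loc>` elements.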
Next, check site indexability and crawlability: how easily can search engines access and analyze the copy and meta content of your pages? Quick tip: verify which pages Google has indexed using a site:example_domain.com search or Google Search Console's Index Coverage report. This also gives you a sense of how many pages you have to work with.
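One quick indexability check you can automate is scanning fetched HTML for a meta robots noindex. The sketch below uses a regex for brevity (a real audit should parse the DOM and also check the X-Robots-Tag response header); the attribute order it assumes (`name` before `content`) is a simplification.

```python
import re

# Assumes name="..." appears before content="..." in the tag (common, not guaranteed).
META_ROBOTS = re.compile(
    r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)["\']',
    re.IGNORECASE,
)

def is_indexable(html: str) -> bool:
    """True unless a meta robots tag contains 'noindex'."""
    m = META_ROBOTS.search(html)
    return not (m and "noindex" in m.group(1).lower())
```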
Look for crawl errors: server errors, soft 404s, 3xx redirects, and non-indexable URLs. These are the sitemap killers: Search considers your sitemap untrustworthy if you're directing spiders to crawl retired, redirected, or weak pages. Review robots.txt and meta robots tags to see if important pages are accidentally blocked.
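Filtering a crawl export for those sitemap killers is a simple pass over the rows. The row shape below (`url`, `status`, `indexable`) is an illustrative simplification of what a Screaming Frog CSV export gives you, not its actual column names.

```python
def sitemap_killers(crawl: list[dict]) -> list[tuple[str, str]]:
    """Flag sitemap URLs that erode crawler trust: redirects, errors,
    and pages a crawler can fetch but not index."""
    flagged = []
    for row in crawl:
        if 300 <= row["status"] < 400:
            flagged.append((row["url"], "redirect"))
        elif row["status"] >= 400:
            flagged.append((row["url"], "error"))
        elif not row.get("indexable", True):
            flagged.append((row["url"], "non-indexable"))
    return flagged
```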
Analyze website speed and performance. I use Google PageSpeed Insights (Lighthouse) to check load times and page performance. I select a sample of four pages: the home page, the blog, a content-heavy page (e.g., news or articles), and a products-and-services page, or one of each.
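If you run that sample through the PageSpeed Insights API rather than the web UI, summarizing each response is straightforward. The field paths below (`lighthouseResult` → `categories`/`audits`) reflect my understanding of the v5 API response; treat them as an assumption and verify against the API docs.

```python
def summarize_psi(psi: dict) -> dict:
    """Pull the headline numbers out of a PageSpeed Insights API response
    (assumed v5 shape: lighthouseResult -> categories / audits)."""
    lh = psi["lighthouseResult"]
    return {
        "performance": round(lh["categories"]["performance"]["score"] * 100),
        "lcp": lh["audits"]["largest-contentful-paint"]["displayValue"],
        "cls": lh["audits"]["cumulative-layout-shift"]["displayValue"],
    }
```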
Then, identify issues such as render-blocking resources, large image files, inefficient scripts, or slow server response times. A good resource: open Developer Tools > More tools > Coverage and look at the unused code. This is a good place to consider tree-shaking, and to identify code that can be delayed, deferred, or deleted.
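The Coverage panel lets you export its results as JSON, and you can rank files by wasted bytes from that export. The entry shape assumed here (`url`, `text`, and a list of used-byte `ranges`) matches my reading of the export format; confirm against your own export before relying on it.

```python
def unused_pct(entry: dict) -> float:
    """Percent of a file's bytes never executed, from one DevTools
    Coverage export entry (assumed shape: url, text, used ranges)."""
    total = len(entry["text"])
    used = sum(r["end"] - r["start"] for r in entry["ranges"])
    return round(100 * (1 - used / total), 1) if total else 0.0
```

Files with a high unused percentage are the first candidates for deferring, code-splitting, or deleting.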
Examine Site Architecture and URL Structure. Ensure a clear, logical URL hierarchy. Use your folder (and sub-folders) intelligently.
Check for deep pages buried too many clicks away.
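Click depth is easy to measure once you have the internal link graph from a crawl: a breadth-first search from the homepage gives the minimum number of clicks to each page. A sketch, using a plain dict as the link graph:

```python
from collections import deque

def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """BFS over the internal link graph; depth = minimum clicks from home."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth
```

Pages deeper than about three clicks, and pages missing from the result entirely (unreachable from home), are the ones to surface with better internal linking.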
Look for canonicalization issues, such as pages canonicalized to other pages in the sitemap, and inconsistent URLs (www vs non-www, HTTP vs HTTPS, trailing slash vs no trailing slash).
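Catching those inconsistent variants in a crawl is easier if you normalize every URL to one canonical form first. The policy below (HTTPS, no www, trailing slash) is just an example; pick one policy and apply it everywhere.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_form(url: str) -> str:
    """Normalize scheme, www, and trailing slash so variants of the same
    page compare equal. Example policy: https, no www, trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")  # Python 3.9+
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit(("https", host, path, parts.query, ""))
```

Two crawled URLs whose canonical forms match but whose raw forms differ are a canonicalization problem to fix with redirects or rel=canonical.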
Look for mixed content warnings, where some elements on an HTTPS page load over plain HTTP, and make sure file content types are labeled correctly. Images are not text: look for opportunities to ease Search's workload. Remember, Search has billions of pages to crawl and index.
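A crude mixed-content scan just looks for plain-HTTP asset URLs in the markup of an HTTPS page. This regex sketch is deliberately simple (a real check should parse the DOM and cover CSS `url()` references too):

```python
import re

# Matches src="http://..." or href="http://..." attributes.
HTTP_ASSET = re.compile(r'(?:src|href)=["\'](http://[^"\']+)["\']', re.IGNORECASE)

def mixed_content(html: str) -> list[str]:
    """List asset URLs loaded over plain HTTP inside a page served over HTTPS."""
    return HTTP_ASSET.findall(html)
```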
Next, review structured data and schema markup. Validate your markup using Google's Rich Results Test or the Schema.org Validator. Look for errors or missing structured data that could cost you rich snippets; valid markup sends stronger signals about the content before indexing.
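At scale, you can pre-screen JSON-LD blocks for obviously missing keys before sending pages to the validators. The required-key set below is an illustrative minimum for an Article block, not Google's full rich-results requirements.

```python
import json

# Example policy only; consult Google's structured-data docs for the real requirements.
REQUIRED = {"@context", "@type", "headline"}

def check_jsonld(raw: str) -> list[str]:
    """Return the sorted names of required keys missing from a JSON-LD block."""
    data = json.loads(raw)
    return sorted(REQUIRED - data.keys())
```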
Evaluate internal linking: identify orphan pages with no internal links. Ensure important pages get adequate link equity. Look for excessive or irrelevant internal links. Make sure each internal link points directly at its final URL; avoid wasting Search's precious milliseconds on redirect hops.
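Orphan detection is a set difference: pages in the sitemap that no crawled page links to. A sketch, assuming you have the sitemap URL set and a link graph from a crawl:

```python
def orphan_pages(sitemap: set[str], links: dict[str, set[str]]) -> set[str]:
    """Pages listed in the sitemap that no crawled page links to."""
    linked = set().union(*links.values()) if links else set()
    return sitemap - linked
```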
Summary of technical SEO analysis, focus, and tools