If it feels like Google search is changing faster than ever, it’s not your imagination. Google reported an astonishing 4,367 “launches” in 2021, up dramatically from 350-400 in 2009. On average, that’s nearly a dozen changes per day.
Many of these more than 25,000 changes were undoubtedly very small, but some were outright cataclysmic. What can we learn from nine years of Google rankings data, and how can it help us prepare for the future?
Is the algorithm heating up?
Thanks to our MozCast research project, we have daily algorithm flux data going back to 2014. The visualization below shows nine full years of daily Google “temperatures” (with hotter days representing more movement on page one of rankings):
While 2022 was certainly hotter than 2014, the pattern of rising temperatures over time is much more complicated than that. In some cases, we can’t map temperatures directly to algorithm updates, and in others, there are known causes outside of Google’s control.
Take, for example, the WHO declaration of the global COVID-19 pandemic on March 11, 2020 (#8 in the labeled events). COVID-19 changed consumer behavior dramatically in the following months, including huge shifts in e-commerce as brick-and-mortar businesses shut down. While it’s likely that Google launched algorithm updates to respond to these changes, COVID-19 itself reshaped global behavior and search rankings along with it.
The summer of 2017 is an entirely different story, with unexplained algorithm flux that lasted for months. Truthfully, we still don’t really know what happened. One possibility is that Google’s Mobile-first index update caused large shifts in rankings as it was being tested in the year or more preceding the official launch, but at this point we can only speculate.
What were the hottest days?
While some of the hottest confirmed days on Google from 2014-2022 were named updates, such as the “core” updates, there are a few interesting exceptions …
The hottest day on record in MozCast was a major outage in August of 2022 that measured a whopping 124°F. While this corresponded with an electrical fire in an Iowa-based Google data center, Google officially said that the two events were unrelated.
The 8th and 10th hottest confirmed days over these nine years were serious bugs in the Google index that resulted in pages being dropped from the search results. Our analysis in April of 2019 measured about 4% of pages in our data set disappearing from search. Thankfully, these events were short-lived, but it goes to show that not all changes are meaningful or actionable.
The largest search penalty on record was the “Intrusive Interstitial Penalty” in January of 2017, which punished sites with aggressive popups and overlays that disrupted the user experience.
What was the biggest update?
If we’re talking about named updates, the highest-temperature (i.e., largest-impact) update was actually a penalty reversal: phase 2 of the Penguin 4.0 update in October of 2016. Phase 2 removed all previous Penguin penalties, an unprecedented (and, so far, unrepeated) move on Google’s part, and a seismic algorithmic event.
Keep in mind, this was just the impact of undoing Penguin. If we factor in the 7+ major, named Penguin updates (and possibly dozens of smaller updates and data refreshes), then Penguin is the clear winner among the thousands of changes from 2014-2022.
What’s in store for the future?
Ultimately, Google’s “weather” isn’t a natural phenomenon — it’s driven by human choices, and, occasionally, human mistakes. While we can’t predict future changes, we can try to learn from the patterns of the past and read between the lines of Google’s messaging.
As machine learning drives more of Google search (and Bing’s recent launch of ChatGPT capabilities will only accelerate this), the signals from Google will likely become less and less clear, but the themes of the next few years will probably be familiar.
Google wants content that is valuable for searchers, and reflects expertise, authority, and trust. Google wants that content delivered on sites that are fast, secure, and mobile-friendly. Google doesn’t want you to build sites purely for SEO or to clutter their (expensive) index with junk.
How any of that is measured or codified into the algorithm is a much more complicated story, and it naturally evolves as the internet evolves. The last nine years can teach us about the future and Google’s priorities, but there will no doubt be surprises. The only guarantee is that, as long as people need to find information, people, places, and things, both search engines and search engine optimization will continue to exist.
For a full list of major algorithm updates back to 2003’s “Boston” update, check out our Google algorithm update history. For daily data on Google rankings flux and SERP feature trends, visit our MozCast SERP tracking project.
This week, Shawn talks you through the ways your site structure, your sitemaps, and Google Search Console work together to help Google crawl your site, and what you can do to improve Googlebot’s efficiency.
Click on the whiteboard image above to open a high resolution version in a new tab!
Video Transcription
Howdy, Moz fans. Welcome to this week's edition of Whiteboard Friday, and I'm your host, SEO Shawn. This week I'm going to talk about how to help Google crawl your website more efficiently.
Site structure, sitemaps, & GSC
Now I'll start at a high level. I want to talk about your site structure, your sitemaps, and Google Search Console, why they're important and how they're all related together.
So site structure, let's think of a spider. As he builds his web, he makes sure to connect every string efficiently together so that he can get across to anywhere he needs to get to, to catch his prey. Well, your website needs to work in a similar fashion. You need to make sure you have a really solid structure, with interlinking between all your pages, categories, and things of that sort, to make sure that Google can easily get across your site and do it efficiently, without disruptions or blockers that would cause them to stop crawling your site.
Your sitemaps are kind of a shopping list or a to-do list, if you will, of the URLs you want to make sure that Google is crawling whenever they see your site. Now Google isn't always going to crawl those URLs, but at least you want to make sure that they see that they're there, and that's the best way to do that.
GSC and properties
Then Google Search Console, anybody that creates a website should always connect a property to their website so they can see all the information that Google is willing to share with you about your site and how it's performing.
So let's take a quick deep dive into Search Console and properties. So as I mentioned previously, you always should be creating that initial property for your site. There's a wealth of information you get out of that. Of course, natively, in the Search Console UI, there are some limitations. It's 1,000 rows of data they're able to give to you. Good, you can definitely do some filtering, regex, good stuff like that to slice and dice, but you're still limited to that 1,000 URLs in the native UI.
So something I have actually been doing for the last decade or so is creating properties at a directory level to get that same amount of information, but to a specific directory. Some good stuff that I have been able to do with that is connect to Looker Studio and be able to create great graphs and reports, filters of those directories. To me, it's a lot easier to do it that way. Of course, you could probably do it with just a single property, but this just gets us more information at a directory level, like example.com/toys.
Sitemaps
Next I want to dive into our sitemaps. So as you know, it's a laundry list of URLs you want Google to see. Typically you throw 50,000, if your site is that big, into a sitemap, drop it at the root, put it in robots.txt, go ahead and throw it in Search Console, and Google will tell you that they've successfully accepted it, crawled it, and then you can see the page indexation report and what they're giving you about that sitemap. But a problem that I've been having lately, especially at the site that I'm working at now with millions of URLs, is that Google doesn't always accept that sitemap, at least not right away. Sometimes it's taken a couple weeks for Google to even say, "Hey, all right, we'll accept this sitemap," and even longer to get any useful data out of that.
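To make that concrete, a single sitemap file follows the standard sitemaps.org format (the protocol caps each file at 50,000 URLs or 50MB uncompressed), and the robots.txt reference is one extra line. A minimal sketch, with example.com and the URLs as placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/toys/red-wagon</loc>
    <lastmod>2023-01-15</lastmod>
  </url>
  <!-- ...up to 50,000 <url> entries per file... -->
</urlset>
```

```
# robots.txt, at the site root
Sitemap: https://example.com/sitemap.xml
```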
So to help get past that issue that I've been having, I now break my sitemaps into 10,000 URL pieces. It's a lot more sitemaps, but that's what your sitemap index is for. It helps Google collect all that information bundled up nicely, and they get to it. The trade-off is Google accepts those sitemaps immediately, and within a day I'm getting useful information.
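A sitemap index is just another small XML file that lists each of those 10,000-URL pieces; you submit the index, reference it in robots.txt like a regular sitemap, and Google discovers the child sitemaps from it. A minimal sketch (file names and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemaps/toys-0001.xml</loc>
    <lastmod>2023-01-15</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemaps/toys-0002.xml</loc>
    <lastmod>2023-01-15</lastmod>
  </sitemap>
  <!-- ...one <sitemap> entry per 10,000-URL file... -->
</sitemapindex>
```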
Now I like to go even further than that, and I break up my sitemaps by directory. So each sitemap (or sitemap index, if the directory runs over 50,000 URLs) contains just the URLs in that directory. That's extremely helpful because now, when you combine that with your property at that toys directory, like we have here in our example, I'm able to see just the indexation status for those URLs by themselves. I'm no longer forced to use that root property that has a hodgepodge of data for all your URLs. Extremely helpful, especially if I'm launching a new product line and I want to make sure that Google is indexing and giving me the data for that new toy line that I have.
I always think a good practice is to make sure you ping your sitemaps. Google has an API, so you can definitely automate that process. But it's super helpful. Every time there's any kind of a change to your content (adding sites, adding URLs, removing URLs, things like that), you just want to ping Google and let them know that you have a change to your sitemap.
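One way to automate that ping is the Search Console API's sitemaps.submit method. This is a minimal sketch, assuming a service account that has been added as a user on the property; the key file, property, and sitemap URL are placeholders you'd swap for your own:

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Service-account JSON key; the account's email must be added to the GSC property.
creds = service_account.Credentials.from_service_account_file(
    "gsc-key.json", scopes=["https://www.googleapis.com/auth/webmasters"]
)
service = build("searchconsole", "v1", credentials=creds)

# Re-submit (ping) the sitemap index whenever URLs are added or removed.
service.sitemaps().submit(
    siteUrl="sc-domain:example.com",  # or "https://example.com/" for a URL-prefix property
    feedpath="https://example.com/sitemaps/toys-index.xml",
).execute()
```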
All the data
So now we've done all this great stuff. What do we get out of that? Well, you get tons of data, and I mean a ton of data. It's super useful, as mentioned, when you're trying to launch a new product line or diagnose why there's something wrong with your site. Again, we do have a 1,000 limit per property. But when you create multiple properties, you get even more data, specific to those properties, that you could export and get all the valuable information from.
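If you'd rather pull that data programmatically than click through the UI, the Search Analytics API will page well past the 1,000-row limit. A minimal sketch, reusing the authenticated searchconsole client from the sitemap-submit snippet above (property and dates are placeholders):

```python
def fetch_all_pages(service, site_url, start_date, end_date, row_limit=25000):
    """Page through Search Analytics rows until the API stops returning results."""
    rows, start_row = [], 0
    while True:
        resp = service.searchanalytics().query(
            siteUrl=site_url,
            body={
                "startDate": start_date,
                "endDate": end_date,
                "dimensions": ["page"],
                "rowLimit": row_limit,   # per-request cap
                "startRow": start_row,
            },
        ).execute()
        batch = resp.get("rows", [])
        rows.extend(batch)
        if len(batch) < row_limit:
            return rows
        start_row += row_limit

# 'service' is the searchconsole client built in the sitemap-submit snippet above.
pages = fetch_all_pages(service, "sc-domain:example.com", "2023-01-01", "2023-01-31")
print(len(pages), "pages returned for the date range")
```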
Even cooler is that Google recently rolled out their Inspection API. Super helpful, because now you can actually run a script, see what the status of those URLs is, and hopefully get some good information out of that. But again, true to Google's nature, we have a 2,000 limit for calls on the API per day per property. However, that's per property. So if you have a lot of properties, and you can have up to 50 Search Console properties per account, now you could roll 100,000 URLs into that script and get the data for a lot more URLs per day. What's super awesome is Screaming Frog has made some great changes to the tool that we all love and use every day, to where you can not only connect that API, but you can share that limit across all your properties. So now grab those 100,000 URLs, slap them in Screaming Frog, drink some coffee, kick back, and wait till the data pours out. Super helpful, super amazing. It makes my job insanely easier now because of that. Now I'm able to go through and see: Is it a Google thing, discovered or crawled and not indexed? Or are there issues with my site as to why my URLs are not showing in Google?
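Here's roughly what that Inspection API script looks like; a minimal sketch, again reusing the authenticated client from the sitemap-submit snippet, with a placeholder URL list and no retry or quota handling (the 2,000-calls-per-property-per-day limit is on you to respect):

```python
import time

SITE = "sc-domain:example.com"
urls_to_check = [
    "https://example.com/toys/red-wagon",
    "https://example.com/toys/blue-kite",
]

# 'service' is the searchconsole client built in the sitemap-submit snippet above.
for url in urls_to_check:
    resp = service.urlInspection().index().inspect(
        body={"inspectionUrl": url, "siteUrl": SITE}
    ).execute()
    status = resp["inspectionResult"]["indexStatusResult"]
    # coverageState is where "Discovered - currently not indexed" vs.
    # "Crawled - currently not indexed" vs. "Submitted and indexed" shows up.
    print(url, "|", status.get("verdict"), "|", status.get("coverageState"))
    time.sleep(0.5)  # stay gentle; the quota is per property per day
```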
Bonus: Page experience report
As an added bonus, you have the page experience report in Search Console that covers Core Web Vitals, mobile usability, and some other data points that you can get broken down at the directory level. That makes it a lot easier to diagnose and see what's going on with your site.
Hopefully you found this to be a useful Whiteboard Friday. I know these tactics have definitely helped me throughout my career in SEO, and hopefully they'll help you too. Until next time, let's keep crawling.
This week, Shawn talks you through the ways your site structure, your sitemaps, and Google Search Console work together to help Google crawl your site, and what you can do to approve Googlebot’s efficiency.
Click on the whiteboard image above to open a high resolution version in a new tab!
Video Transcription
Howdy, Moz fans. Welcome to this week's edition of Whiteboard Friday, and I'm your host, SEO Shawn. This week I'm going to talk about how do you help Google crawl your website more efficiently.
Site structure, sitemaps, & GSC
Now I'll start at a high level. I want to talk about your site structure, your sitemaps, and Google Search Console, why they're important and how they're all related together.
So site structure, let's think of a spider. As he builds his web, he makes sure to connect every string efficiently together so that he can get across to anywhere he needs to get to, to catch his prey. Well, your website needs to work in that similar fashion. You need to make sure you have a really solid structure, with interlinking between all your pages, categories and things of that sort, to make sure that Google can easily get across your site and do it efficiently without too many disruptions or blockers so they stop crawling your site.
Your sitemaps are kind of a shopping list or a to-do list, if you will, of the URLs you want to make sure that Google is crawling whenever they see your site. Now Google isn't always going to crawl those URLs, but at least you want to make sure that they see that they're there, and that's the best way to do that.
GSC and properties
Then Google Search Console, anybody that creates a website should always connect a property to their website so they can see all the information that Google is willing to share with you about your site and how it's performing.
So let's take a quick deep dive into Search Console and properties. So as I mentioned previously, you always should be creating that initial property for your site. There's a wealth of information you get out of that. Of course, natively, in the Search Console UI, there are some limitations. It's 1,000 rows of data they're able to give to you. Good, you can definitely do some filtering, regex, good stuff like that to slice and dice, but you're still limited to that 1,000 URLs in the native UI.
So something I have actually been doing for the last decade or so is creating properties at a directory level to get that same amount of information, but to a specific directory. Some good stuff that I have been able to do with that is connect to Looker Studio and be able to create great graphs and reports, filters of those directories. To me, it's a lot easier to do it that way. Of course, you could probably do it with just a single property, but this just gets us more information at a directory level, like example.com/toys.
Sitemaps
Next I want to dive into our sitemaps. So as you know, it's a laundry list of URLs you want Google to see. Typically you throw 50,000, if your site is that big, into a sitemap, drop it at the root, put it in robots.txt, go ahead and throw it in Search Console, and Google will tell you that they've successfully accepted it, crawled it, and then you can see the page indexation report and what they're giving you about that sitemap. But a problem that I've been having lately, especially at the site that I'm working at now with millions of URLs, is that Google doesn't always accept that sitemap, at least not right away. Sometimes it's taken a couple weeks for Google to even say, "Hey, all right, we'll accept this sitemap," and even longer to get any useful data out of that.
So to help get past that issue that I've been having, I now break my sitemaps into 10,000 URL pieces. It's a lot more sitemaps, but that's what your sitemap index is for. It helps Google collect all that information bundled up nicely, and they get to it. The trade-off is Google accepts those sitemaps immediately, and within a day I'm getting useful information.
Now I like to go even further than that, and I break up my sitemaps by directory. So each sitemap or sitemap index is of the URLs in that directory, if it's over 50,000 URLs. That's extremely helpful because now, when you combine that with your property at that toys directory, like we have here in our example, I'm able to see just the indexation status for those URLs by themselves. I'm no longer forced to use that root property that has a hodgepodge of data for all your URLs. Extremely helpful, especially if I'm launching a new product line and I want to make sure that Google is indexing and giving me the data for that new toy line that I have.
Always I think a good practice is make sure you ping your sitemaps. Google has an API, so you can definitely automate that process. But it's super helpful. Every time there's any kind of a change to your content, add sites, add URLs, remove URLs, things like that, you just want to ping Google and let them know that you have a change to your sitemap.
All the data
So now we've done all this great stuff. What do we get out of that? Well, you get tons of data, and I mean a ton of data. It's super useful, as mentioned, when you're trying to launch a new product line or diagnose why there's something wrong with your site. Again, we do have a 1,000 limit per property. But when you create multiple properties, you get even more data, specific to those properties, that you could export and get all the valuable information from.
Even cooler is recently Google rolled out their Inspection API. Super helpful because now you can actually run a script, see what the status is of those URLs, and hopefully some good information out of that. But again, true to Google's nature, we have a 2,000 limit for calls on the API per day per property. However, that's per property. So if you have a lot of properties, and you can have up to 50 Search Console properties per account, now you could roll 100,000 URLs into that script and get the data for a lot more URLs per day. What's super awesome is Screaming Frog has made some great changes to the tool that we all love and use every day, to where you cannot only connect that API, but you can share that limit across all your properties. So now grab those 100,000 URLs, slap them in Screaming Frog, drink some coffee, kick back and wait till the data pours out. Super helpful, super amazing. It makes my job insanely easier now because of that. Now I'm able to go through and see: Is it a Google thing, discovered or crawled and not indexed? Or are there issues with my site to why my URLs are not showing in Google?
Bonus: Page experience report
As an added bonus, you have the page experience report in Search Console that talks about Core Vitals, mobile usability, and some other data points that you could get broken down at the directory level. That makes it a lot easier to diagnose and see what's going on with your site.
Hopefully you found this to be a useful Whiteboard Friday. I know these tactics have definitely helped me throughout my career in SEO, and hopefully they'll help you too. Until next time, let's keep crawling.
This week, Shawn talks you through the ways your site structure, your sitemaps, and Google Search Console work together to help Google crawl your site, and what you can do to approve Googlebot’s efficiency.
Click on the whiteboard image above to open a high resolution version in a new tab!
Video Transcription
Howdy, Moz fans. Welcome to this week's edition of Whiteboard Friday, and I'm your host, SEO Shawn. This week I'm going to talk about how do you help Google crawl your website more efficiently.
Site structure, sitemaps, & GSC
Now I'll start at a high level. I want to talk about your site structure, your sitemaps, and Google Search Console, why they're important and how they're all related together.
So site structure, let's think of a spider. As he builds his web, he makes sure to connect every string efficiently together so that he can get across to anywhere he needs to get to, to catch his prey. Well, your website needs to work in that similar fashion. You need to make sure you have a really solid structure, with interlinking between all your pages, categories and things of that sort, to make sure that Google can easily get across your site and do it efficiently without too many disruptions or blockers so they stop crawling your site.
Your sitemaps are kind of a shopping list or a to-do list, if you will, of the URLs you want to make sure that Google is crawling whenever they see your site. Now Google isn't always going to crawl those URLs, but at least you want to make sure that they see that they're there, and that's the best way to do that.
GSC and properties
Then Google Search Console, anybody that creates a website should always connect a property to their website so they can see all the information that Google is willing to share with you about your site and how it's performing.
So let's take a quick deep dive into Search Console and properties. So as I mentioned previously, you always should be creating that initial property for your site. There's a wealth of information you get out of that. Of course, natively, in the Search Console UI, there are some limitations. It's 1,000 rows of data they're able to give to you. Good, you can definitely do some filtering, regex, good stuff like that to slice and dice, but you're still limited to that 1,000 URLs in the native UI.
So something I have actually been doing for the last decade or so is creating properties at a directory level to get that same amount of information, but to a specific directory. Some good stuff that I have been able to do with that is connect to Looker Studio and be able to create great graphs and reports, filters of those directories. To me, it's a lot easier to do it that way. Of course, you could probably do it with just a single property, but this just gets us more information at a directory level, like example.com/toys.
Sitemaps
Next I want to dive into our sitemaps. So as you know, it's a laundry list of URLs you want Google to see. Typically you throw 50,000, if your site is that big, into a sitemap, drop it at the root, put it in robots.txt, go ahead and throw it in Search Console, and Google will tell you that they've successfully accepted it, crawled it, and then you can see the page indexation report and what they're giving you about that sitemap. But a problem that I've been having lately, especially at the site that I'm working at now with millions of URLs, is that Google doesn't always accept that sitemap, at least not right away. Sometimes it's taken a couple weeks for Google to even say, "Hey, all right, we'll accept this sitemap," and even longer to get any useful data out of that.
So to help get past that issue that I've been having, I now break my sitemaps into 10,000 URL pieces. It's a lot more sitemaps, but that's what your sitemap index is for. It helps Google collect all that information bundled up nicely, and they get to it. The trade-off is Google accepts those sitemaps immediately, and within a day I'm getting useful information.
Now I like to go even further than that, and I break up my sitemaps by directory. So each sitemap or sitemap index is of the URLs in that directory, if it's over 50,000 URLs. That's extremely helpful because now, when you combine that with your property at that toys directory, like we have here in our example, I'm able to see just the indexation status for those URLs by themselves. I'm no longer forced to use that root property that has a hodgepodge of data for all your URLs. Extremely helpful, especially if I'm launching a new product line and I want to make sure that Google is indexing and giving me the data for that new toy line that I have.
Always I think a good practice is make sure you ping your sitemaps. Google has an API, so you can definitely automate that process. But it's super helpful. Every time there's any kind of a change to your content, add sites, add URLs, remove URLs, things like that, you just want to ping Google and let them know that you have a change to your sitemap.
All the data
So now we've done all this great stuff. What do we get out of that? Well, you get tons of data, and I mean a ton of data. It's super useful, as mentioned, when you're trying to launch a new product line or diagnose why there's something wrong with your site. Again, we do have a 1,000 limit per property. But when you create multiple properties, you get even more data, specific to those properties, that you could export and get all the valuable information from.
Even cooler is recently Google rolled out their Inspection API. Super helpful because now you can actually run a script, see what the status is of those URLs, and hopefully some good information out of that. But again, true to Google's nature, we have a 2,000 limit for calls on the API per day per property. However, that's per property. So if you have a lot of properties, and you can have up to 50 Search Console properties per account, now you could roll 100,000 URLs into that script and get the data for a lot more URLs per day. What's super awesome is Screaming Frog has made some great changes to the tool that we all love and use every day, to where you cannot only connect that API, but you can share that limit across all your properties. So now grab those 100,000 URLs, slap them in Screaming Frog, drink some coffee, kick back and wait till the data pours out. Super helpful, super amazing. It makes my job insanely easier now because of that. Now I'm able to go through and see: Is it a Google thing, discovered or crawled and not indexed? Or are there issues with my site to why my URLs are not showing in Google?
Bonus: Page experience report
As an added bonus, you have the page experience report in Search Console that talks about Core Vitals, mobile usability, and some other data points that you could get broken down at the directory level. That makes it a lot easier to diagnose and see what's going on with your site.
Hopefully you found this to be a useful Whiteboard Friday. I know these tactics have definitely helped me throughout my career in SEO, and hopefully they'll help you too. Until next time, let's keep crawling.
This week, Shawn talks you through the ways your site structure, your sitemaps, and Google Search Console work together to help Google crawl your site, and what you can do to approve Googlebot’s efficiency.
Click on the whiteboard image above to open a high resolution version in a new tab!
Video Transcription
Howdy, Moz fans. Welcome to this week's edition of Whiteboard Friday, and I'm your host, SEO Shawn. This week I'm going to talk about how do you help Google crawl your website more efficiently.
Site structure, sitemaps, & GSC
Now I'll start at a high level. I want to talk about your site structure, your sitemaps, and Google Search Console, why they're important and how they're all related together.
So site structure, let's think of a spider. As he builds his web, he makes sure to connect every string efficiently together so that he can get across to anywhere he needs to get to, to catch his prey. Well, your website needs to work in that similar fashion. You need to make sure you have a really solid structure, with interlinking between all your pages, categories and things of that sort, to make sure that Google can easily get across your site and do it efficiently without too many disruptions or blockers so they stop crawling your site.
Your sitemaps are kind of a shopping list or a to-do list, if you will, of the URLs you want to make sure that Google is crawling whenever they see your site. Now Google isn't always going to crawl those URLs, but at least you want to make sure that they see that they're there, and that's the best way to do that.
GSC and properties
Then Google Search Console, anybody that creates a website should always connect a property to their website so they can see all the information that Google is willing to share with you about your site and how it's performing.
So let's take a quick deep dive into Search Console and properties. So as I mentioned previously, you always should be creating that initial property for your site. There's a wealth of information you get out of that. Of course, natively, in the Search Console UI, there are some limitations. It's 1,000 rows of data they're able to give to you. Good, you can definitely do some filtering, regex, good stuff like that to slice and dice, but you're still limited to that 1,000 URLs in the native UI.
So something I have actually been doing for the last decade or so is creating properties at a directory level to get that same amount of information, but to a specific directory. Some good stuff that I have been able to do with that is connect to Looker Studio and be able to create great graphs and reports, filters of those directories. To me, it's a lot easier to do it that way. Of course, you could probably do it with just a single property, but this just gets us more information at a directory level, like example.com/toys.
Sitemaps
Next I want to dive into our sitemaps. So as you know, it's a laundry list of URLs you want Google to see. Typically you throw 50,000, if your site is that big, into a sitemap, drop it at the root, put it in robots.txt, go ahead and throw it in Search Console, and Google will tell you that they've successfully accepted it, crawled it, and then you can see the page indexation report and what they're giving you about that sitemap. But a problem that I've been having lately, especially at the site that I'm working at now with millions of URLs, is that Google doesn't always accept that sitemap, at least not right away. Sometimes it's taken a couple weeks for Google to even say, "Hey, all right, we'll accept this sitemap," and even longer to get any useful data out of that.
So to help get past that issue that I've been having, I now break my sitemaps into 10,000 URL pieces. It's a lot more sitemaps, but that's what your sitemap index is for. It helps Google collect all that information bundled up nicely, and they get to it. The trade-off is Google accepts those sitemaps immediately, and within a day I'm getting useful information.
Now I like to go even further than that, and I break up my sitemaps by directory. So each sitemap or sitemap index is of the URLs in that directory, if it's over 50,000 URLs. That's extremely helpful because now, when you combine that with your property at that toys directory, like we have here in our example, I'm able to see just the indexation status for those URLs by themselves. I'm no longer forced to use that root property that has a hodgepodge of data for all your URLs. Extremely helpful, especially if I'm launching a new product line and I want to make sure that Google is indexing and giving me the data for that new toy line that I have.
Always I think a good practice is make sure you ping your sitemaps. Google has an API, so you can definitely automate that process. But it's super helpful. Every time there's any kind of a change to your content, add sites, add URLs, remove URLs, things like that, you just want to ping Google and let them know that you have a change to your sitemap.
All the data
So now we've done all this great stuff. What do we get out of that? Well, you get tons of data, and I mean a ton of data. It's super useful, as mentioned, when you're trying to launch a new product line or diagnose why there's something wrong with your site. Again, we do have a 1,000 limit per property. But when you create multiple properties, you get even more data, specific to those properties, that you could export and get all the valuable information from.
Even cooler is recently Google rolled out their Inspection API. Super helpful because now you can actually run a script, see what the status is of those URLs, and hopefully some good information out of that. But again, true to Google's nature, we have a 2,000 limit for calls on the API per day per property. However, that's per property. So if you have a lot of properties, and you can have up to 50 Search Console properties per account, now you could roll 100,000 URLs into that script and get the data for a lot more URLs per day. What's super awesome is Screaming Frog has made some great changes to the tool that we all love and use every day, to where you cannot only connect that API, but you can share that limit across all your properties. So now grab those 100,000 URLs, slap them in Screaming Frog, drink some coffee, kick back and wait till the data pours out. Super helpful, super amazing. It makes my job insanely easier now because of that. Now I'm able to go through and see: Is it a Google thing, discovered or crawled and not indexed? Or are there issues with my site to why my URLs are not showing in Google?
Bonus: Page experience report
As an added bonus, you have the page experience report in Search Console, which covers Core Web Vitals, mobile usability, and some other data points that you can get broken down at the directory level. That makes it a lot easier to diagnose and see what's going on with your site.
Hopefully you found this to be a useful Whiteboard Friday. I know these tactics have definitely helped me throughout my career in SEO, and hopefully they'll help you too. Until next time, let's keep crawling.
The adage is that if you're not paying for the service, you are the product. Unfortunately, this rings especially true in the analytics world.
The analytics space is changing, though, and there are many alternatives — both free and paid — that take into account privacy, cookie-less tracking, GDPR compliance, core web vitals, and more. The current leader in the analytics space is Universal Analytics (UA), which has been tracking web data since 2005. But Google is going to stop tracking any new data in UA as of July 1, 2023 (Happy Canada Day?).
This is a move to get users to migrate to Google Analytics 4 (GA4), which is a whole new way of tracking and navigating. In moving from a session model to an event model, GA4 will require some knowledge and familiarity to get up and running, as quite a bit is changing.
If you haven't set up GA4 yet, or are on the fence, now is the time to take a look at what else is out there and how the landscape has changed. We've broken the alternatives up into three categories:
Web analytics
Product analytics
Data warehousing solutions
How legal is Google Analytics?
The short answer is: it depends on where your visitors are from.
Without getting too deep into legal jargon, users now have more control over their options, and although there are a few lawsuits out there, Google will likely make adjustments so that everyone using it is mostly covered, eventually.
That isn't the case right now, though: under GDPR there are requirements that Google Analytics isn't meeting. And because it's a product that's integrated into so many other products, I doubt it will ever see 100% coverage.
Even if you're planning on installing Google Analytics 4, you can run an alternative alongside it to ensure that you're getting the right data and test as you go.
Web Analytics
More than simple visitor counters, but often streamlined, web analytics providers focus on serving small businesses, bloggers, and small websites. Their metrics are also the closest to Google Analytics. Most of the web analytics tools below are easy to maintain, quick to install, and don't need self-hosting. However, some offer self-hosting in addition to their regular services.
Launched in 2018, Fathom Analytics is a cookie-less, privacy-focused Google Analytics alternative that's really simple to use. They care about user privacy and pioneered proprietary routing for EU visitors to keep that traffic compliant with EU rules, calling the route "EU Isolation". Fathom Analytics is also GDPR, PECR, COPPA, CCPA, and ePrivacy compliant.
Features of Fathom Analytics include:
Seven-day free trial
Prices start at $14 per month for up to 100,000 page views
Can include up to 50 sites
Shows live visitors and where they’re navigating
Offers uptime monitoring, event filtering, email reports, ad blocker bypassing, live visitor information, and much more from a bootstrapped company
Being privacy-focused, Fathom Analytics has a beautiful interface and offers unlimited CSV exports of your data available at any time. This allows you to conveniently connect it to other data sources. In addition, Fathom is expected to release a backup option for GA data in the near future.
Previously named Piwik, Matomo ("keep your data in your own hands") is an open-source, GDPR-compliant analytics tool with cookie-free tracking, offering a search engine and keywords section where you can connect Google Search Console data.
Additional Matomo features include:
Customizable dashboard.
Real-time data insights (pages visited, visitor summary, conversions, etc).
E-commerce analytics.
Event tracking (analyzes user interaction on apps or websites).
Measures CTR, clicks, and impressions for text and image banners as well as other page elements.
Visitor geolocation (stats can be viewed on maps by city, region, or country).
Page and site speed reports.
Page performance (number of views, bounce rates).
You can host Matomo on your own servers for free. Otherwise, their cloud pricing starts at $29/month after a 21-day free trial for 50,000 hits (up to 30 websites).
If you're already using Cloudflare's CDN solution, then there's no setup required for Cloudflare Web Analytics. You can simply authorize web analytics to start tracking inside your account. You can add up to 10 websites for free, and there's no need for a cookie banner, since they do not collect personal data.
In addition to tracking standard metrics (page views, visits, load times, bandwidth, etc.), Cloudflare has incorporated Core Web Vitals metrics. These are measured across each of your tracked websites, and Cloudflare can email you weekly updates, which is very handy.
Cloudflare installation is light on page load, since you can proxy it through your Cloudflare setup. Alternatively, you can install their lightweight JavaScript code (or "beacon", as they call it).
Although Cloudflare doesn't say they are GDPR compliant, what they do say is that they are considered an "Operator of Essential Services" under the EU Directive on Security of Network and Information Systems. You can assume that Cloudflare is tracking something along the way, given the free price tag and the other offerings where data could be shared.
Cloudflare is the closest you can get to server analytics without going directly to your server. Some of it is only available in specific areas, so you might end up having to pay for it.
Adobe Analytics is a popular analysis platform that provides tools essential for collecting relevant data concerning customer experience. Company analysts and online marketers frequently depend on it for improving customer satisfaction.
Applying Adobe Analytics to your business website can help you determine what leads to conversions. For example, did changing content or the CTA increase conversions? Did adding more visual aids increase the number of inquiries about a certain service or product?
Adobe Analytics is marketed as an enterprise analytics solution with audience insights, advertising analytics, cohort analysis, customer journey analysis, remarketing triggers, and much more. It could fit either your web analytics or your product analytics needs, depending on how your team deploys it, and it has been around for quite a while, just like Google Analytics.
Privacy-friendly web analytics tool Clicky is easy to navigate and offers metrics similar to Google Analytics. However, Clicky embodies the feel of server analytics tools while making their interface simple and fast-loading, thanks to minimal graphics.
Features of Clicky include:
GDPR compliance
Bot detection and blocking
Heatmaps
Uptime monitoring
Backlink analysis
Clicky also provides a developer API and white labeling of their solution, where you can create your own theme for better brand integration, starting at $49/month as part of their hosted service. They also offer a free tier with limited features if you're looking to try it out.
EU-hosted, privacy-based Simple Analytics offers tools to use for checking your websites daily. Simple Analytics does not collect cookies, IP addresses, or any unique identifiers. Their package provides a bypass of ad blockers, hides referral spam, and includes an iOS app. You can even embed a widget to get public web statistics.
This GA alternative offers some nifty events that are gathered by default for ease of use, including email clicks, outbound link clicks, and file downloads, to streamline tracking. All of Simple Analytics' features come together in a simple dashboard with quick "in and out" times, much quicker, in fact, than it would take to get to the correct property in Google Analytics.
Simple Analytics offers a free 14-day trial. If you decide to keep Simple Analytics, you'll pay $19/month, or $9/month if you pay a year in advance.
An open-source, cookie-free web analytics alternative to Google Analytics, Pirsch integrates seamlessly into websites, and into WordPress with a plugin. A developer-friendly analytics tool offering a flexible, impressive API, server-side integrations, and SDKs, Pirsch provides snippets for Go, PHP, and JavaScript, as well as community-provided code you can embed. Pirsch also works with Google Search Console.
You can perform any function you want from the Pirsch dashboard, like viewing statistics or adding websites.
Get started by viewing Pirsch's live demo or opting in on their 30-day free trial. Paying for Pirsch is only $5 per month if you choose annual billing. If you want to make monthly payments, the cost increases to $6 per month.
A privacy-friendly, open-source web analytics platform, Plausible is cookie-free and fully compliant with PECR, CCPA, and GDPR. Plausible is a popular alternative to Google Analytics because of its simplicity, lightweight script, and ability to reduce bounce rates by expediting site loading.
You can easily segment data into specific metrics, analyze dark traffic via Urchin Tracking Module (UTM) parameters, and track how many outbound link clicks you get.
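If UTM parameters are new to you, here's a quick illustration (with placeholder values) of the kind of tagged URL Plausible reads to attribute that dark traffic.

```python
# A quick illustration (placeholder values) of the UTM-tagged URLs Plausible reads
# to attribute "dark" traffic from newsletters, social apps, and similar sources.
from urllib.parse import urlencode

params = {
    "utm_source": "newsletter",
    "utm_medium": "email",
    "utm_campaign": "spring_sale",
}
print("https://example.com/pricing?" + urlencode(params))
# -> https://example.com/pricing?utm_source=newsletter&utm_medium=email&utm_campaign=spring_sale
```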
Plausible offers a 30-day free trial option that provides unlimited usage of its features without requiring a credit card. Once your free trial has ended, you can pay $9 per month for up to 10,000 page views. If you choose yearly billing, you get two months of free use. Unlimited data retention, Slack/email reports, and event customization are also provided with a paid subscription to Plausible.
Umami is GDPR-compliant, open-source, cookie-free, and does not track users or gather personal data across websites. By anonymizing any information collected, Umami prevents the identification of individual users. In addition, you won't need to worry about staying compliant with constantly changing privacy laws.
Features of Umami include:
Mobile-friendly so you can see stats on your phone at any time.
Since you host Umami under your own domain, it won't be blocked by ad blockers the way Google Analytics often is.
Public sharing of your stats is available using an exclusively generated URL.
Umami's lightweight tracking script loads almost instantly and won't slow down the loading of your website.
A single installation of Umami enables tracking of an unlimited number of sites. You can also track individual URLs and subdomains.
Since Umami is an open-source platform and self-hosted, it's free to use. Umami says a cloud-based version of its analytics is coming soon.
Product analytics
Although product analytics are not always GDPR-compliant, they provide powerful analytic tools that include valuable elements like segmentation and deeper levels of customization. For that reason and others, product analytics come with a higher maintenance cost. Here are several product analytics tools that are excellent Google Analytics alternatives for tracking customer behavior.
The traffic analytics tool provided by HubSpot offers exceptional analysis data for breaking down page views, new contacts, and even entrance/exit information about how long visitors remained on specific web pages. You can also install a tracking code on external sites so you can track traffic stats there.
Additional features include:
Bounce rate percentages.
Number of call-to-action views/number of CTA clicks.
Conversion rates of visitors who click on your CTA.
Access to specific URLs or country stats.
A subscription to the starter HubSpot platform is $45 per month. You can choose monthly or annual billing, which comes to $540 per year.
A professional subscription to HubSpot starts at $800 per month ($9,600 for one year). This subscription gives you multi-language content, video management and hosting, phone support, and A/B testing.
An enterprise subscription starts at $3,600 per month ($43,200 per year). You get 10,000 marketing contacts with this subscription, with additional contacts available for sale in increments.
Kissmetrics offers analytics for e-commerce and SaaS websites. Kissmetrics for SaaS gives you deep insights into the type of content and features that are fueling conversions, converting trials into conversions, and reducing churn.
Kissmetrics detects characteristics that drive conversions and retain regular buyers, so you can adjust site elements appropriately. This unique analytics tool also streamlines checkout funnels and integrates with Shopify.
Pricing for Kissmetrics SaaS is split into three tiers.
Heap Analytics works on mobile and PC devices, quickly captures nearly all behavioral parameters, and supports first- or third-party installation. You can also create individual identities for users across numerous sessions and append augmented product information to purchase/sales events.
The makers of Fullstory state that if you know how to copy and paste, you'll have no trouble setting it up. This analytics platform also offers "private-by-default" capabilities that ensure text elements are masked at their source.
Mixpanel offers a free subscription that gives access to core reports, unlimited data history, and EU or U.S. data residency. Mixpanel can be sliced and diced to fit many situations; it's flexible and moldable for businesses at many levels.
For $25 per month under their growth plan, you get all the free features, plus data modeling, group analytics, and reports detailing causal inference. Mixpanel also includes features such as:
Segmenting users based on actions
Slack integration for sharing reports, even with people who don't use Mixpanel
No limits on the number of events tracked
Team dashboards with alerts
Identification of top user paths and drop-off points
Understanding of conversion points across the funnel
And much, much more
It's used by 30% of Fortune 100 SaaS companies, and custom pricing and plans are available if you need them.
According to a report available here, Amplitude is consistently ranked #1 among product analytics platforms and top software products, with high marks for customer satisfaction. You can analyze collected data with fast self-serve analytics that do not require SQL.
The Amplitude Starter package is free and offers unlimited data destinations, users, and data sources. You also get 10 million events (streamed or unstreamed) per month.
Their paid plans build on that with customizations, including advanced behavioral analytics and custom event values.
Segment, which was acquired by Twilio in 2020, offers superior email onboarding and intuitive insights into SMS campaign and email performance, and that's just one arm of what it does. Segment uses a single API to collect data across all platforms. They also have SDKs for Android, iOS, JavaScript, and over 20 server-side languages. They have been described as a technology startup that lets organizations pull customer data from one app into another, and as of this writing they have over 300 integrations!
Segment has customers across industries including media, medical, B2B, and retail, from startup to enterprise. You can also upload your customer data into their data warehouse to keep it there, so they could fit into the data warehousing category below, too.
To get started using Segment, simply create a free account, which allows for 1,000 visitors from two sources and access to their 300 integrations. Their paid tiers start at $120/month, where you can sign up for a team account.
Rudderstack provides developers with everything they need to get started immediately with its product/behavior analytics. Features of Rudderstack include identifying anonymous mobile and web users, customizing destinations by applying real-time modifications to event payloads, and automatically populating warehouses with event and user record schemas.
The free version of Rudderstack gives you five million events per month, over 16 SDK sources, and more than 180 cloud destinations. Like Segment, you can use Rudderstack as a data warehouse for your customer data; they've built their platform warehouse-first and for developers.
The Pro version starts at $500 per month. You get the free features, plus email support, a technical account manager, and even custom integrations in the higher tiers. Request pricing for the Enterprise version of Rudderstack here.
Data warehousing solutions
It's always good to have your data backed up, and if you have a lot of history in Universal Analytics, that would be a good place to start. There are even services to connect all these solutions together seamlessly, such as Funnel. While these have additional costs, you can save all your data for many years without a problem, so you can refer to it whenever you need to.
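As one starting point for backing up that Universal Analytics history, here's a rough sketch using the Analytics Reporting API v4; the view ID, credentials file, and chosen dimensions are placeholders, and a full export would need to page through results and cover every report you care about.

```python
# A rough sketch (hypothetical view ID, credentials file, and dimensions) of pulling
# Universal Analytics history out via the Analytics Reporting API v4 before it goes
# away. A real backup would page through results and cover every report you need.
import csv
from googleapiclient.discovery import build
from google.oauth2 import service_account

SCOPES = ["https://www.googleapis.com/auth/analytics.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "ua-backup.json", scopes=SCOPES)  # placeholder key file
analytics = build("analyticsreporting", "v4", credentials=creds)

response = analytics.reports().batchGet(body={
    "reportRequests": [{
        "viewId": "123456789",  # placeholder UA view ID
        "dateRanges": [{"startDate": "2015-01-01", "endDate": "2023-06-30"}],
        "metrics": [{"expression": "ga:sessions"}, {"expression": "ga:users"}],
        "dimensions": [{"name": "ga:date"}, {"name": "ga:sourceMedium"}],
    }]
}).execute()

with open("ua_sessions_by_source.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["date", "sourceMedium", "sessions", "users"])
    for row in response["reports"][0]["data"].get("rows", []):
        writer.writerow(row["dimensions"] + row["metrics"][0]["values"])
```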
BigQuery is available with an easy connection to your GA4 data if you're leaning that way. Google lists BigQuery as an enterprise product, and there are limits to how much data you can send to BigQuery before paying. They offer real-time analytics with built-in query acceleration, and given the scale of data Google handles, we can be pretty sure they'll be able to handle yours. Google Cloud Storage offers standard, nearline, coldline, and archive storage options.
One suggestion: if you're using Google for backups, make sure you have a secondary account attached in case your primary account loses access, which I've seen happen from time to time. They provide migrations from some tools, and their pricing is based on data volume, so it's free up to 10GB. They've created a billing calculator for easy cost analysis once you get into the paid side.
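Once a GA4 property is linked to BigQuery, querying the export is straightforward. Here's a small sketch with a placeholder project and property ID, assuming the standard analytics_<property_id>.events_* naming of the GA4 export.

```python
# A small sketch of querying a GA4 BigQuery export once the link is set up. The
# project ID and property ID are placeholders; the dataset follows the standard
# analytics_<property_id>.events_* naming used by the GA4 export.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # placeholder project

sql = """
    SELECT event_date, event_name, COUNT(*) AS events
    FROM `my-analytics-project.analytics_123456789.events_*`
    WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230131'
    GROUP BY event_date, event_name
    ORDER BY events DESC
"""

for row in client.query(sql).result():
    print(row.event_date, row.event_name, row.events)
```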
AWS states that they support more compliance certifications and security measures than any other cloud provider. They allow for backup of all data types and have the built-in redundancy you would expect at Amazon's scale.
Like Google, they have a calculator to estimate a pricing model for your team, since there are a lot of other integrated services you can take advantage of as well.
Snowflake provides a data cloud and isn’t Google or Amazon, which may appeal to some. It supports data science, data lakes, and data warehousing on the three top clouds with their fully automated solution.
They're HIPAA, PCI DSS, SOC 1, and SOC 2 Type 2 compliant, as well as FedRAMP Authorized. All plans come with a 30-day free trial you can check out, plus a "pay for usage" option and a "pay for usage upfront" option.
Recap and recommendations
On July 1, 2023, Universal Analytics will stop tracking any new data, and by the end of 2023 Google will remove all UA data, so back up your data! Think about warehousing your Universal Analytics history for the long term, as well as the data you collect moving forward.
Then, make a plan for how you're going to track moving forward. Talk through the options and your thoughts with stakeholders.
Install GA4 now (or yesterday!), or try out the options above, as none of these are created equal and many have free trials.