SEO Snippets: the crawl-delay rule, subdomain or subfolder, and more



 SEO Snippets

We've seen lots of questions from webmasters over the years, and we'd like to cover these here. We're keen on giving you the answers that help you to succeed with your website on Google Search.

Have a question? Ask it in our Webmaster Help Forum.

In this series of short videos, the team and I will be answering your webmaster and SEO questions.

If you ever wanted to know more about 404 errors, how and when crawling works, about a site's URL structure, or about duplicate content, we'll have something here for you. Drop by our Help Forum.

A crawl-delay rule ignored by Googlebot? | SEO Snippets

Is the crawl delay rule ignored by Googlebot? I am getting a warning message in Search Console.

Search Console has a great tool for testing robots.txt files, which is where this warning shows up. So what does it mean, and what do you need to do?

The crawl-delay directive for robots.txt files was introduced by other search engines in the early days. The idea was that webmasters could specify how many seconds a crawler should wait between requests, to help limit the load on a web server. That's not a bad idea overall. However, it turns out that servers are quite dynamic, and sticking to a single period between requests doesn't really make sense. The value given there is a number of seconds between requests, which isn't that useful now that most servers can handle so much more traffic per second.

Instead of the crawl-delay directive, we decided to automatically adjust our crawling based on how your server reacts. If we see server errors, or we see that the server is getting slower, we'll back off on our crawling. Additionally, there's a way to give us feedback on our crawling directly in Search Console, so site owners can let us know about their preferred changes in crawling.

With that, if we see the crawl-delay directive in your robots.txt file, we'll try to let you know that it's something we don't support. And of course, if there are parts of your website that you don't want to have crawled at all, letting us know about that in the robots.txt file with a disallow rule is the way to go.
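For illustration, a minimal robots.txt along these lines (the path here is made up) keeps a section from being crawled at all, while any crawl-delay line is simply ignored by Googlebot:

```
# Illustrative robots.txt
User-agent: *
# Block crawling of this section entirely
Disallow: /internal-search/

# Ignored by Googlebot; some other crawlers may still honor it
Crawl-delay: 10
```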

Subdomain or subfolder, which is better for SEO? | SEO Snippets

  • subdomain or subfolder?
  • Which one is the most beneficial for SEO?

Google Web Search is fine with using either subdomains or subdirectories.

Making changes to a site's URL structure tends to take a bit of time to settle down in Search. So I recommend picking a setup that you can keep for longer. Some servers make it easier to set up different parts of a website as subdirectories.

That's fine for us.

This helps us with crawling since we understand that everything's on the same server and can crawl it in a similar way. Sometimes this also makes it easier for users who recognize that these sections are all a part of the same bigger website.

On other servers, using subdirectories for different sections, like a blog inside a shop site, can be trickier, and it's easier to put them on separate subdomains.

That also works for us.
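For illustration, with a hypothetical example.com, the two setups look like this:

```
# Subdirectory (subfolder) setup: sections live on the same host
https://example.com/blog/
https://example.com/shop/

# Subdomain setup: sections live on separate hosts under the same domain
https://blog.example.com/
https://shop.example.com/
```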

You'll need to verify subdomains separately in Search Console, make any changes to settings, and track overall performance per subdomain.

We do have to learn how to crawl them separately, but for the most part, that's just a formality for the first few days. So in short, use what works best for your setup and think about your longer-term plans when picking one or the other.

My site's template has multiple H1 tags | SEO Snippets

  • My site's template has multiple H1s. Is this a problem?
  • An H1 element is commonly used to mark up a heading on a page, and there's something to be said for having a single, clear topic per page, right?
  • So how critical is it to have just one of these on a page?

The answer is short and easy.

It's not a problem. With HTML5, it's common to have separate H1 elements for different parts of a page.

If you use an HTML5 template, there's a chance your pages will correctly use multiple H1 headings automatically.

That said, regardless of whether you use HTML5 or not, having multiple H1 elements on a page is fine. Semantically marking up your page's content to let search engines know how it fits together is always a good idea.
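For instance, a minimal HTML5 sketch (the content here is made up) in which several sections each carry their own H1 is perfectly valid:

```html
<!-- Illustrative HTML5 markup with multiple H1 elements -->
<body>
  <header>
    <h1>Example Site</h1>
  </header>
  <article>
    <h1>Article headline</h1>
    <p>Main article content goes here.</p>
  </article>
  <aside>
    <h1>Related posts</h1>
  </aside>
</body>
```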

If you end up using multiple headings on a page, that's fine. You can join the discussion in the Webmaster forum, plus check out the links below to get even more helpful information.

How do I regain ownership of a Search Console property? | SEO Snippets

How do I regain ownership of a Search Console property when I don't know who the owner is? 

You might wonder how to regain access to your Search Console or Webmaster Tools account if you don't remember which account was previously used, or can't get back into that account. We'll answer that question and give some general tips.

How do I regain ownership of a Search Console property when I don't know who the owner is? Chances are someone set up your website to work in Search Console, or maybe even in Webmaster Tools back when it was called that. In the meantime, nobody remembers the username or the password that was used.

So how do you get back in? Well, the answer is pretty easy. First off, there's very little per-user information in Search Console.

So you usually don't need to get back into that original account. You can just create a new account, verify ownership, and get back to work in no time. Once you've verified your site in Search Console, you'll have access to all of the data for your website.
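Verification can be done in several ways; one common option is an HTML meta tag added to the head of your home page (the token below is just a placeholder):

```html
<!-- Placed in the <head> of the home page; Search Console provides the actual
     content value (a placeholder is shown here) -->
<meta name="google-site-verification" content="YOUR-VERIFICATION-TOKEN" />
```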

You can also list the other verified owners and perhaps spot that previous account or see which verification elements you can remove.

Once there, delegation is an easy way to add others to your organization and to provide read-only access to external consultants, such as SEOs or marketing agencies.

Do fixed penalties affect SEO? | SEO Snippets

I had a penalty, and I fixed it. Will Google hold a grudge?

First off, at Google, we call them manual actions, not penalties, since they're generally applied manually by a team here, and they don't always have a negative effect on a site overall.

In Search Console, we inform sites about any manual actions their site might have. If you receive such a notification, you can take action on it: resolve the issue and submit a reconsideration request.

The webspam team processes these. And if they can confirm that the issue is fixed, they'll lift the manual action. It might take a bit of time for everything to be reprocessed, but Google's algorithms won't hold that issue against the site in the long term.

However, it's possible that a site temporarily had an unnatural advantage before. By fixing this issue, your site will return to its natural location in our search results. Additionally, things change on the web and in our search results all the time.

A site's visibility in search can change over time, even if nothing on the website changes. So with that in mind, it can be normal that a site doesn't return to exactly the same place as before the manual action. So in short, no, Google's algorithms don't hold a grudge. However, visibility in search can change over time, regardless of any manual action.

How often does Google re-index websites? | SEO Snippets

How often does Google re-index a website? It seems like it's much less often than it used to be. We add or remove pages from our site, and it's weeks before those changes are reflected in Google Search.

You, too, might be wondering how long it takes for Google to recognize bigger changes on a website, and, from there, what you can do to speed that up. For Googlebot, looking at a whole website all at once, or even within a short period of time, can cause a significant load on that website.

Googlebot tries to be polite and is limited to a certain number of pages every day. This number is automatically adjusted as we better recognize the limits of a website.

Looking at portions of a website means that we have to prioritize how we crawl. So how does this work? In general, Googlebot tries to crawl important pages more frequently to make sure that the most critical pages are covered. Often, this will be a website's home page or maybe higher-level category pages.

New content is often mentioned and linked from there, so it's a great place for us to start. We'll recrawl these pages frequently, maybe every few days, maybe even much more frequently, depending on the website.

Why does the indexed pages count vary? | SEO Snippets

Why is the number of pages indexed in Search Console different from what appears on google.com? Depending on where you look, you might see different numbers for your site's count of indexed pages. Which one is the right number, and which one should you use? The actual number of pages on a website is surprisingly hard to determine.

At first, one might assume that it's just a matter of counting through the pages: starting with the home page and following the links from there. However, on most websites, there are many, many ways to reach a specific page.

There might be different URL parameters (everything after a question mark in the URL) that lead to the same page. Sometimes upper- and lower-case URLs also lead to the same page. Or perhaps you can add a slash to the end and still get the same page. Some websites have a calendar or something similar that leads to an endless number of new invalid pages.
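For instance, all of the following URLs (hypothetical) might serve exactly the same page:

```
https://example.com/shoes
https://example.com/shoes/
https://example.com/Shoes
https://example.com/shoes?sessionid=12345
```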

Assuming that most websites have an infinite number of possible URLs, should Google just show an infinite count? That probably isn't that useful. So which numbers can you see, and where do they come from? There are three main places to get counts for the number of indexed URLs.

  • First, you can check a site: query in Google Search.
  • Second, you can use the Index Status report in Search Console.
  • Third, you can look at the index count per sitemap file.

Let's take a look at these options. In Google Search, you can just enter site, then a colon, and then your domain name.
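For example, with a hypothetical example.com, the query would be:

```
site:example.com
```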

Google will show you a sample of the pages indexed from your website, together with an approximate count of the URLs from your website. This number is generally a very, very rough approximation, based on what we've seen from your website over time.

We try to show search results as quickly as possible, so that count is optimized more for speed than for accuracy. It's useful to look at this as a very rough order of magnitude, but we don't recommend using that count as a metric.

The second and third methods require that you use Google Search Console. Search Console is a free tool you can sign up for and verify your website with. In Search Console, there is a report that shows the number of indexed URLs from your website.

Google Search Console: What should I do with old 404 errors? | SEO Snippets

What should I do with 404s in Search Console that are from ancient versions of my site? Sites evolve over time: URLs change, you add redirects, redirects get dropped over the years. Sometimes URLs are just no longer needed.

These URLs end up returning 404, so they show up in Search Console as crawl errors, but what does that mean? When an invalid URL is opened, it's the right thing for a server to return a 404 page not found error.

When doing a restructuring of your website, we recommend redirecting from old URLs to the new ones and updating the links that go to the old URLs to point to the new ones directly. However, over time, you might decide to drop those redirects, maybe because of the maintenance overhead, or just maybe you forget about them.

These URLs are now 404s in Search Console. In your server logs or analytics, check for traffic to those URLs. If there is no traffic, that's great. In Search Console, check for links to those URLs. Are there no relevant links? If you see nothing special in either the links or the traffic, having those pages return 404 is perfectly fine.

If you do see traffic to those URLs or see links pointing at those URLs, check where they're coming from and have those links point at the new URLs instead. Or, if it looks like a lot of traffic or links are going to those URLs, perhaps putting a redirect back in place would be more efficient.
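If a redirect is the right call, a single permanent (301) redirect per old URL is enough. As a sketch, assuming an Apache server and made-up paths:

```apache
# .htaccess: permanently redirect a retired URL to its replacement
# (both paths are illustrative)
Redirect 301 /old-article.html /blog/new-article/
```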

That works for a few crawl errors, but what if you have a ton of 404 errors? Search Console makes this easy. It prioritizes crawl errors for you. If the top errors in the report are all irrelevant, you can rest assured that there is nothing more important further down on the list.

Crawl errors for 404s that you don't want to have indexed don't negatively affect the rest of your site in search, so take your time to find a good approach that works for you.

Will removing “.html” from my URLs help my site? | SEO Snippets

  • Will removing .html from my URLs help my site? 
  • URLs are important for search engines like Google. 
  • Do they care which endings your URLs use though?

The answer is no.

Google uses URLs to identify pieces of content. Whether your URLs end with .html, .php, or .asp, or just have words in them, doesn't really matter to Google.

All of these URLs can show up in search in the same way. That's it. If you need to change your URLs, for example, if you move to a new content management system that doesn't allow you to use .html URLs at all, keep in mind that this change would be a restructuring of your website.

You would need to redirect the old URLs to the new ones. This kind of change can take quite a bit of time to be reprocessed, so picking a time when you're not dependent on search is a good idea. Because it can take time, we don't recommend doing this kind of change on a whim. When making URL changes, pick URLs that you're sure can last a longer period of time.
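If you do make that change, the old .html URLs need permanent redirects to the new ones. As a sketch, one common Apache mod_rewrite approach (assuming the .html files still exist on the server) looks like this:

```apache
RewriteEngine On

# Permanently redirect direct requests for /page.html to /page
RewriteCond %{THE_REQUEST} \s/([^\s?]+)\.html[\s?]
RewriteRule ^ /%1 [R=301,L]

# Internally serve page.html when /page is requested
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(.+)$ $1.html [L]
```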

Can my URLs use non-English words? | SEO Snippets

So can URLs use local, non-English words? For sites that target users outside of English-speaking regions, it's sometimes unclear if they can really use their own language for URLs, and if so, what about non-English characters? Google Search uses URLs primarily as a way to address a piece of content.

We use URLs to crawl a page, which is when Googlebot goes to check the page and to use the page's content for our search results.

As long as URLs are valid and unique, that's fine. For domain names and top-level domains, non-Latin characters are represented with Punycode encoding. This can look a little bit weird at first. For example, if you take Mueller (my last name) with the dots on the U, it would be represented slightly differently as a domain name.

For browsers, and for Google Search, both versions of the domain name are equivalent. We treat them as one and the same.

The rest of the URL can use Unicode (UTF-8) encoding for non-Latin characters. You can use either the escaped version or the Unicode version with your website.

They are also equivalent to Google. Regardless of what you place within your URLs, make it easy for folks to link to your pages. For example, avoid using spaces, commas, and other special characters in the URL. They work for Google, but they make linking a little bit harder.

Use dashes to separate words in your URL. Some prefer using underscores; that's fine, too. Dashes are usually a little bit easier to recognize. And if your site is available in multiple languages, use the appropriate language in URLs for content in that language.

So to sum it up-- yes, non-English words in URLs are fine. We recommend using them for non-English websites.
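To make the encoding concrete, here is roughly how the equivalent forms look, using a hypothetical domain and path (the encodings shown are for the letter ü):

```
# Domain name: the Unicode form and its Punycode form are treated as equivalent
müller.example  ->  xn--mller-kva.example

# URL path: the Unicode form and its UTF-8 percent-encoded form are also equivalent
https://example.com/über-uns  ->  https://example.com/%C3%BCber-uns
```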

Add a sitemap for more than 50,000 URLs | SEO Snippets

In this SEO Snippets episode, we discuss sitemap files and what to do when a site has more URLs than a single sitemap file can hold.

How do we add a sitemap for more than 50,000 URLs? Sitemap files are a great way to make your content known to Google and to other search engines. However, they're limited to 50,000 URLs per file. What do you do if you have more URLs?

You can generate more than one sitemap file per website. You can either submit these individually, for example through Search Console, or you can create a sitemap index file.

A sitemap index file is like a sitemap file for sitemaps. You can list multiple sitemap files in it. If you use a sitemap index file, you can just submit that file for your website in Search Console. 
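A sitemap index follows the same XML format defined at sitemaps.org; a minimal sketch with made-up file names could look like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Each entry points to a regular sitemap file (names are illustrative) -->
  <sitemap>
    <loc>https://www.example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-products.xml</loc>
  </sitemap>
</sitemapindex>
```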

Even if you have fewer than 50,000 URLs, you can submit multiple sitemap files. For example, you might want to do that to keep track of different sections of your website, or just, in general, to make maintenance of your sitemap files a little bit easier.

When it comes to creating sitemap files, we strongly recommend having them generated automatically by your server directly.
