Hello dear readers!
Today I’ll show you how to improve your SEO by using a robots.txt file, a small text file designed to work with search engines.
The robots.txt file implements the robots exclusion protocol: it tells web robots which pages on your site to crawl and which to skip. I’ll show you how to change your robots.txt file so that search engines love it. When a crawler is about to visit a site, it checks robots.txt for instructions before visiting any page. Let’s see how it looks.
The main skeleton of a robots.txt file.
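Here is a minimal sketch of that skeleton, assuming it is the two-line form described in the next paragraph (User-agent: * followed by Disallow: /). It feeds the lines to Python’s standard urllib.robotparser module to show how a robot would read them. The site.com addresses are placeholders.

```python
from urllib.robotparser import RobotFileParser

# The two-line skeleton: applies to every robot, blocks every path.
SKELETON = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SKELETON.splitlines())

# Placeholder URLs; with "Disallow: /" nothing on the site may be crawled.
blocked_home = not parser.can_fetch("*", "https://site.com/")
blocked_post = not parser.can_fetch("Googlebot", "https://site.com/blog/post")
print(blocked_home, blocked_post)
```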
The asterisk after “User-agent” means that the rules apply to all robots that visit the site. The slash after “Disallow” instructs the robot not to visit any page on the site. Why block anything at all? A robot has a limited amount of time (a crawl budget) for each site, and if it spends that time on unimportant pages, it may not reach the ones that matter, which can hurt how your site shows up in search. Therefore, some pages are blocked from being crawled so that the robot spends its time on your most valuable pages. With a correct robots.txt file, you can instruct search crawlers to avoid certain pages, and that is what makes the file useful for SEO. If you want to view the robots.txt file of your site or any other site, enter the site’s URL into the address bar of your browser and add /robots.txt to the end. For example: https://wplaksa.ru/robots.txt.
One of three things will happen:
1) You will find a robots.txt file.
2) You will find an empty file. For example: https://www.disney.com/robots.txt
3) You will receive a 404 or Not Found error.
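The same check can be sketched in Python. This is only an illustration: the function names robots_url and describe are made up for this example, and the status code and body are passed in by hand instead of being fetched, so the snippet runs without a network connection.

```python
from urllib.parse import urljoin

def robots_url(site_url: str) -> str:
    """Return the robots.txt address for the site that site_url belongs to."""
    # The leading slash makes urljoin replace any existing path on the URL.
    return urljoin(site_url, "/robots.txt")

def describe(status: int, body: str) -> str:
    """Map an HTTP status code and response body to one of the three outcomes."""
    if status == 404:
        return "missing"
    if status == 200 and not body.strip():
        return "empty"
    if status == 200:
        return "found"
    return "other"

print(robots_url("https://wplaksa.ru/some/page"))
print(describe(200, "User-agent: *\nDisallow:\n"))
```

To check a real site, you would download the address returned by robots_url and pass the response status and text to describe.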
Take a moment and check your site’s robots.txt file.
If you find a blank file or 404, you need to fix it.
If you find a valid file, it probably still has the default settings that were generated when you created your site. I like looking at other sites’ robots.txt files this way too. Once you’ve learned the ins and outs of robots.txt, peeking at other people’s files can be a rewarding exercise.
If you don’t have a robots.txt file, you’ll need to create one from scratch. Open a simple text editor. Write User-agent: * to address all robots. Then, on the next line, write Disallow: and leave the rest of the line empty. Since nothing follows Disallow:, web robots are free to crawl your entire site. Your file now looks like this:
User-agent: *
Disallow:
It looks simple, but this is already a working robots.txt file.
You also need to link to your XML sitemap.
Add the line Sitemap: https://site.com/sitemap.xml
For example: Sitemap: https://wplaksa.ru/sitemap.xml
Trust me, this is what a basic robots.txt file looks like.
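To see the basic file in one piece, here is a small sketch that builds it and reads it back with Python’s standard urllib.robotparser. The helper name basic_robots_txt and the site.com addresses are placeholders for illustration, and site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

def basic_robots_txt(sitemap_url: str) -> str:
    """Build the basic file: all robots, nothing disallowed, one sitemap link."""
    lines = [
        "User-agent: *",
        "Disallow:",
        "",
        f"Sitemap: {sitemap_url}",
    ]
    return "\n".join(lines) + "\n"

text = basic_robots_txt("https://site.com/sitemap.xml")

parser = RobotFileParser()
parser.parse(text.splitlines())

print(text)
# An empty Disallow means every page may be crawled.
print(parser.can_fetch("*", "https://site.com/any/page"))
# The sitemap addresses the parser found in the file.
print(parser.site_maps())
```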
Now let’s optimize our robots.txt for SEO.
The best use case for a robots.txt file is to tell search engines not to crawl the parts of your site that are not meant for the public.
If you visit the robots.txt file for the site (https://wplaksa.ru/), you will see lines that keep robots out of the admin area of the site, where the login page lives (wp-admin). It doesn’t make sense for search engine robots to waste their time crawling it.
If you have a WordPress site, you can use these deny lines.
You can use a similar directive to prevent bots from crawling specific pages or folders. After Disallow:, enter the part of the URL that comes after the domain name (the .com), placed between two forward slashes and without any spaces. To prevent bots from scanning your tmp folder, write Disallow: /tmp/
Disallowing crawling of the tmp folder
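Here is a rough sketch of how such rules could be generated and checked. The helper disallow_lines is hypothetical, and the folder names and site.com addresses are just examples. It trims stray spaces and slashes so that each folder ends up wrapped in exactly one slash on each side.

```python
from urllib.robotparser import RobotFileParser

def disallow_lines(folders):
    """Turn folder names into Disallow rules wrapped in single slashes."""
    rules = ["User-agent: *"]
    for folder in folders:
        # Remove surrounding spaces and slashes, then add one slash per side.
        name = folder.strip(" /")
        rules.append(f"Disallow: /{name}/")
    return "\n".join(rules) + "\n"

text = disallow_lines(["wp-admin", " / tmp / "])

parser = RobotFileParser()
parser.parse(text.splitlines())

print(text)
print(parser.can_fetch("*", "https://site.com/tmp/cache.html"))
print(parser.can_fetch("*", "https://site.com/blog/"))
```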
Which types of pages should you exclude from crawling?
Duplicate content. For example, if you have a printable version of a page, you can tell bots not to scan the printable version.
There is no universal rule for which pages to block. Your robots.txt file will be unique to your site, so use the rules that fit it.
There are two other directives you should be aware of: noindex and nofollow. Unlike Disallow, they do not go into robots.txt (Google stopped honoring noindex rules in robots.txt in 2019). They go into a meta tag on the page itself.
If you don’t want an individual page to be indexed, use noindex. Open the source code of the page you want to modify and paste the tag inside the <head> section.
For example like this: <meta name="robots" content="noindex">
The page will no longer appear in search results. Note that a robot has to be able to crawl the page to see the tag, so don’t block the same page with Disallow.
Let’s take a look at the nofollow directive. It works like the nofollow attribute on a link, but for the whole page: it tells web robots not to follow the links on that page.
It is added in the same place, inside the <head> section of the page.
Insert the line: <meta name="robots" content="nofollow">
If you want to add noindex and nofollow directives together, write code like this:
<meta name="robots" content="noindex, nofollow">
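If you generate your pages from code, the tag can be built and checked the same way. This is only a sketch: the names robots_meta and RobotsMetaFinder are invented for this example, and the page is a made-up snippet of HTML. It uses Python’s standard html.parser module.

```python
from html.parser import HTMLParser

def robots_meta(noindex=False, nofollow=False):
    """Build a robots meta tag for the chosen directives."""
    directives = []
    if noindex:
        directives.append("noindex")
    if nofollow:
        directives.append("nofollow")
    content = ", ".join(directives)
    return f'<meta name="robots" content="{content}">'

class RobotsMetaFinder(HTMLParser):
    """Collect the directives from robots meta tags found in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())

# A made-up page with the tag placed inside the head section.
tag = robots_meta(noindex=True, nofollow=True)
page = f"<html><head><title>Thank you</title>{tag}</head><body></body></html>"

finder = RobotsMetaFinder()
finder.feed(page)
print(tag)
print(sorted(finder.directives))
```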
Check your robots.txt file to make sure everything is working correctly, using Google’s robots.txt validator or Yandex Webmaster.
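Before uploading the file, you can also run a quick local check. The sketch below is not a replacement for the validators, because Python’s standard parser may interpret some rules differently from Google or Yandex. The helper check_rules, the sample rules and the site.com addresses are all assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

def check_rules(robots_txt, expectations, agent="*"):
    """Return the URLs whose crawl permission differs from what you expect."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url, allowed in expectations.items()
        if parser.can_fetch(agent, url) != allowed
    ]

ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /tmp/
"""

# True means "should be crawlable", False means "should be blocked".
problems = check_rules(ROBOTS_TXT, {
    "https://site.com/": True,
    "https://site.com/wp-admin/": False,
    "https://site.com/tmp/file.txt": False,
})
print(problems or "all rules behave as expected")
```

An empty list means every address behaved the way you expected. Any address that comes back points to a rule worth a second look.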
It doesn’t take much effort to set up your robots.txt file. It’s a one-time setup, and you can always make changes as needed. By properly configuring your robots.txt file, you improve your site’s SEO: search engine robots will spend their time on the content that matters, and your site will be more visible in search results.
That’s all for today. Thanks for your attention. See you in the next post. Kissed, hugged, cried!
I love your comments, so don’t be shy!