The perfect Robots.txt file for SEO.


Hello dear readers!
Today I’ll show you how to improve your SEO by using a robots.txt file, a small text file designed to work with search engines.

A robots.txt file implements the robots exclusion protocol: it tells web robots which pages on your site to crawl and which to skip. I’ll show you how to tune your robots.txt file so that search engines love it. When a crawler is about to visit a site, it checks robots.txt for instructions before visiting any page. Let’s see what it looks like.




The main skeleton of a robots.txt file.
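In its simplest form, it is just two lines:

User-agent: *
Disallow: /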
An asterisk after “User-agent” means the rules apply to every robot that visits the site. The slash after “Disallow” tells robots not to visit any pages on the site. Each robot is given only a limited amount of time to crawl a site, and if it spends that time on unimportant pages, your most valuable pages may be crawled less often, which can lower the site’s ranking. That is why some pages are blocked from crawling: the robot should spend its time on the pages that matter. A correctly written robots.txt file lets you steer search crawlers away from certain pages, and that is what makes it useful for SEO.

If you want to view the robots.txt file of your site, or of any other site, enter the site’s URL into your browser’s address bar and add /robots.txt to the end, for example https://wplaksa.ru/robots.txt.
One of three things will happen:

1) You will find a robots.txt file.
2) You will find an empty file.

For example: https://www.disney.com/robots.txt
3) You will get a 404 Not Found error.


Take a moment and check your site’s robots.txt file.
If you find a blank file or a 404, you need to fix it.
If you find a correct file, it probably still contains the default settings generated when your site was created. I also like this method for viewing other sites’ robots.txt files: once you’ve learned the ins and outs of robots.txt, peeking at other people’s files is a rewarding exercise.
If you don’t have a robots.txt file, you’ll need to create one from scratch. Open a plain text editor. Write User-agent: * to address all robots. Then enter Disallow: and put nothing after it. Since nothing follows Disallow:, web robots are free to crawl your entire site. Your file now looks like this:

User-agent: *
Disallow:
It looks simple, but this is already a working robots.txt file.
You also need to add a link to your XML sitemap. Write a Sitemap directive with the full URL of your sitemap:

Sitemap: https://site.com/sitemap.xml

For example: Sitemap: https://wplaksa.ru/sitemap.xml
Trust me, this is what a basic robots.txt file looks like:

User-agent: *
Disallow:
Sitemap: https://site.com/sitemap.xml
Now let’s optimize our robots.txt for SEO.
The best use case for a robots.txt file is to tell search engines not to crawl the parts of your site that are never shown to the public.
If you open the robots.txt file for this site (https://wplaksa.ru/), you will see lines that keep robots out of the site’s admin area (wp-admin). It makes no sense for search engine robots to waste their time crawling it.
If you have a WordPress site, you can use the same disallow lines.
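A minimal sketch of those lines, assuming the WordPress defaults (the Allow line for admin-ajax.php is what WordPress itself generates so that front-end features relying on AJAX keep working; your own file may differ):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php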
You can use a similar directive to prevent bots from crawling specific pages or folders. After Disallow:, enter the part of the URL that comes after the domain name, enclosed between two forward slashes. For example, to prevent bots from scanning your tmp folder, write:

Disallow: /tmp/
What types of pages to exclude from indexing.
Duplicate content. For example, if you have a printer-friendly version of a page, you can tell bots not to crawl it, as in the sketch below.
There is no universal list of pages to block; your robots.txt file will be unique to your site, so write rules that fit it.
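Purely as an illustration of how such rules combine (every path here is hypothetical, including a /print/ folder for printer-friendly pages; your own file will differ):

User-agent: *
Disallow: /wp-admin/
Disallow: /tmp/
Disallow: /print/
Sitemap: https://site.com/sitemap.xml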
There are two other directives you should be aware of: noindex and nofollow.
If you don’t want individual pages to be indexed, use noindex; it tells bots not to add those pages to the search index. The page will then no longer appear in search results.
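A common way to apply noindex is a meta tag in the page’s head section, using the same mechanism described below for nofollow; note that the tag is standard HTML, not part of robots.txt:

<meta name="robots" content="noindex">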
Let’s take a look at the nofollow directive. It works like the nofollow attribute on a link: it tells web robots not to follow the links on the page.
The nofollow directive is implemented slightly differently, because it is not actually part of the robots.txt file; it lives in the page itself. Open the source code of the page you want to modify and paste a meta tag between the head tags.
Insert this line:

<meta name="robots" content="nofollow">

So the head section looks like this:

<head>
  <meta name="robots" content="nofollow">
</head>
If you want to add the noindex and nofollow directives together, combine them in one tag:

<meta name="robots" content="noindex, nofollow">
Check your robots.txt file to make sure everything is working correctly, using Google’s robots.txt testing tool:
https://www.google.com/webmasters/tools/robots-testing-tool


Or using Yandex Webmaster:
https://webmaster.yandex.ru/tools/robotstxt/


Conclusion
It doesn’t take much effort to set up your robots.txt file: it’s a one-time job, and you can always make changes as needed. By configuring your robots.txt file properly, you improve your site’s SEO: search engine robots will spend their crawl budget on the content that matters, your content will be organized and displayed in the search results in the best possible way, and your site will be more visible.

That’s all for today. Thanks for your attention. Meet me in the next post. Kissed, hugged, cried!


I love your comments, so don’t be shy!
