What is crawling and how does it relate to indexing?

Crawling is the process by which search engine robots (crawlers), such as Googlebot and similar bots used by other search engines, automatically visit web pages. During crawling, the robot requests a site's pages, downloads their content, and analyzes their structure and links.

How does crawling work?

The search robot starts with known pages and follows internal and external links, gradually discovering new URLs. Along the way, it collects information about the pages' content, their technical condition, and the connections between them.

The main tasks of crawling are to discover new pages, update information about existing ones, and identify changes on the site.
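The discovery process described above can be illustrated with a minimal sketch. It is not how any particular search engine is implemented: the names crawl, LinkExtractor, and PAGES are hypothetical, the fetch function is supplied by the caller, and the pages live in an in-memory dictionary so that the example runs without network access. A real crawler would also respect robots.txt, limit its request rate, and revisit pages to pick up changes.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first discovery of URLs, starting from already known pages.

    `fetch` is a caller-supplied function (an assumption of this sketch) that
    returns the HTML of a URL as a string, or None if it cannot be downloaded.
    """
    queue = deque(seed_urls)
    seen = set(seed_urls)
    link_graph = {}  # crawled URL -> list of URLs it links to

    while queue and len(link_graph) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page unavailable: nothing to analyze
        parser = LinkExtractor()
        parser.feed(html)
        # Resolve relative links against the current URL and drop #fragments.
        outgoing = [urldefrag(urljoin(url, href)).url for href in parser.links]
        link_graph[url] = outgoing
        for target in outgoing:
            if target not in seen:  # a newly discovered URL
                seen.add(target)
                queue.append(target)
    return link_graph


# A tiny in-memory "site" standing in for real HTTP responses.
PAGES = {
    "https://example.com/": '<a href="/about">About</a> <a href="/blog/">Blog</a>',
    "https://example.com/about": '<a href="/">Home</a>',
    "https://example.com/blog/": '<a href="post-1">Post</a> <a href="/about#team">Team</a>',
    "https://example.com/blog/post-1": "<p>No links here.</p>",
}

graph = crawl(["https://example.com/"], PAGES.get)
discovered = sorted(graph)
```

Starting from a single known page, the sketch reaches all four pages of the sample site, and the returned dictionary records which page links to which. This corresponds to the "connections between pages" that a robot collects along the way.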


The relationship between crawling and indexing

Crawling and indexing are two consecutive but distinct stages in how a search engine processes a page.

Crawling is the stage of discovering and scanning a page.

Indexing is the process of adding a page to a search engine's database so that it can appear in search results.

After the robot has crawled the page, the search engine decides whether to include it in the index or not. If the page meets quality requirements, it is indexed and becomes available for searching.

Why might a page be crawled but not indexed?

Not every crawled page is indexed. Reasons may include low-quality content, duplicate content, technical problems, a noindex directive in the page's meta tags, or the page's insufficient value to users. Restrictions in robots.txt work at an earlier stage: they limit crawling itself rather than indexing.
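Both kinds of restriction can be checked with Python's standard library. The sketch below is only an illustration under stated assumptions: the user agent name ExampleBot, the robots.txt rules, and the HTML snippets are invented for the example, and RobotsMetaParser and has_noindex are hypothetical helper names. A robots.txt rule tells a compliant robot whether it may fetch a URL, while a robots meta tag with noindex asks the search engine not to index a page that it has fetched.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Returns True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Sample robots.txt rules (invented for this example).
robots = RobotFileParser()
robots.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# Crawling stage: may the robot fetch the URL at all?
may_crawl_public = robots.can_fetch("ExampleBot", "https://example.com/catalog/")
may_crawl_private = robots.can_fetch("ExampleBot", "https://example.com/private/report")

# Indexing stage: does a fetched page ask to be left out of the index?
blocked_from_index = has_noindex('<head><meta name="robots" content="noindex, follow"></head>')
open_to_index = not has_noindex("<head><title>Catalog</title></head>")
```

Here the /private/ URL is never fetched, so it is excluded at the crawling stage. The page with the noindex directive can be crawled but would be left out of the index.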

Summary

Crawling is the process of discovering and analyzing web pages, while indexing is their inclusion in the search engine's index, which makes them eligible to appear in search results. These processes are closely related but not identical: first, a robot finds and scans a page, and only then does the search engine decide whether the page will be displayed in search results.
