How Search Engines Work: Crawling, Indexing, and Ranking

As we mentioned in Chapter 1, search engines are answer machines.

Content can vary: it could be a webpage, an image, a video, a PDF, etc., but regardless of the format, content is discovered by links. Googlebot starts out by fetching a few web pages, and then follows the links on those webpages to find new URLs. By hopping along this path of links, the crawler is able to find new content and add it to its index, called Caffeine, a massive database of discovered URLs, to be retrieved later when a searcher is seeking information that the content on that URL is a good match for.

Search engines process and store the information they find in an index, a huge database of all the content they've discovered and deem good enough to serve up to searchers. When someone performs a search, search engines scour their index for highly relevant content and then order that content in the hope of solving the searcher's query. This ordering of search results by relevance is known as ranking. In general, you can assume that the higher a website is ranked, the more relevant the search engine believes that site is to the query.

It's possible to block search engine crawlers from part or all of your site, or instruct search engines to avoid storing certain pages in their index. While there can be reasons for doing this, if you want your content found by searchers, you have to first make sure it's accessible to crawlers and is indexable. Otherwise, it's as good as invisible. By the end of this chapter, you'll have the context you need to work with the search engine, rather than against it!
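To make the crawling step concrete, here is a minimal sketch of a link-following crawler in Python. It is not how Googlebot actually works internally, just an illustration of the same idea: fetch a page, extract its links, and queue the new URLs it discovers. The function and variable names (`crawl`, `seed_urls`, `max_pages`) are our own, and the `index` dict is a toy stand-in for a real search index.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collect the href values of anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_urls, max_pages=10):
    """Breadth-first crawl: fetch a page, extract links, queue new URLs."""
    index = {}                   # URL -> raw HTML (toy stand-in for an index)
    queue = deque(seed_urls)
    seen = set(seed_urls)
    while queue and len(index) < max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
        except OSError:
            continue                      # unreachable pages are skipped
        index[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index
```

The key point mirrors the text above: every new URL enters the queue only because some already-crawled page linked to it, which is why content that nothing links to is hard for a crawler to discover.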
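The most common way to block crawlers from part of a site is a robots.txt file. As a sketch, here is how a well-behaved crawler checks those rules using Python's standard-library `urllib.robotparser`; the robots.txt content and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block all crawlers from /private/, allow the rest.
robots_txt = """User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A polite crawler asks before fetching each URL.
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))    # True
print(parser.can_fetch("Googlebot", "https://example.com/private/doc"))  # False
```

Note that robots.txt only asks crawlers not to fetch a page; keeping a page out of the index itself is a separate instruction (for example, a noindex directive).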