Alex Lukyanchuk


Relevance is the degree to which the results returned by a search engine match what the user expected from the search query. The effectiveness of a search engine is determined by this parameter. While processing a request, the search robot follows an algorithm that checks how well the words on various resources meet the requirements of the query. The relevance of any page of a site is determined by the number of phrases that match those used in the search query.
How is relevance determined?

Different search engines use different thresholds for the number of keywords when determining relevance. Typically, the share of matching words must be greater than five percent for the page to be considered relevant to the query. If the share of query words on the site falls below this five-percent barrier, the resource is judged insufficiently relevant and is simply ignored. However, if the site contains far more search phrases than are needed to establish a match, it is blocked by the spam filter. Search engines open the way to the World Wide Web for users, and the growth of the Internet has given these programs a major role.
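
The threshold logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the five-percent lower bound comes from the text, and the upper bound of 15 percent for the spam filter is an invented placeholder, since the article gives no figure for it.

```python
import re

def keyword_density(text: str, query: str) -> float:
    """Share of words in `text` that also occur in `query` (0.0 to 1.0)."""
    words = re.findall(r"\w+", text.lower())
    query_words = set(re.findall(r"\w+", query.lower()))
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in query_words)
    return hits / len(words)

def classify_page(text: str, query: str,
                  lower: float = 0.05, upper: float = 0.15) -> str:
    """Label a page by keyword density.

    `lower` is the five-percent barrier from the text;
    `upper` is an assumed spam threshold, not a documented value.
    """
    density = keyword_density(text, query)
    if density <= lower:
        return "ignored"   # not above the five-percent barrier
    if density > upper:
        return "spam"      # far more query words than needed
    return "relevant"

page = "shoes shoes " + "filler " * 18   # 2 query words out of 20
print(classify_page(page, "shoes"))
```

Real search engines do not publish such thresholds, so the numbers here only make the three outcomes (ignored, relevant, filtered as spam) concrete.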
Algorithm development

After their creation, information search systems worked fine until the next stage in the development of the Internet. Then a user’s request began to return not a couple of thematic sites but thousands. It was impossible to determine quickly which of the results had value: the list contained both high-quality resources and useless ones. An algorithm for determining relevance was developed to separate the necessary from the unnecessary. Thanks to it, people could again get the desired information quickly, without sorting through a pile of unneeded documents.

Immediately after their creation, search engines determined relevance solely from the internal parameters of the site in question. These criteria were:

density of keywords on the page;
frequency of the required phrases in meta tags;
search expressions in headings;
matches with query terms in the formatting of the article.

The invention and spread of doorway pages changed everything. These special pages contain nothing but the words of popular queries. Their goal is to raise the site’s position in the search results. A user who landed on such a bait page was redirected to another site or page. To combat this fraud, it became necessary to create a system for assessing the external criteria of a site. In simplified form, this algorithm can be described by the following formula:
P = N × (B + C),

where:

P – overall relevance of the site;
N – reputation of the site: an evaluation of its external parameters, independent of the user’s request;
B – how well the site content matches the requested phrase;
C – how closely the text of links pointing to the page matches the phrase entered by the user (link ranking).

Thinking this expression through makes it easier to understand how modern search engines assess relevance. The real algorithm is much more complicated and does not fit into a single formula; the formula only names the main parameters taken into account.
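
As a small worked example, the formula P = N × (B + C) can be written as a function. The function name and the sample values on a 0-to-1 scale are assumptions made for illustration; the article does not specify units or ranges for the three inputs.

```python
def overall_relevance(n: float, b: float, c: float) -> float:
    """P = N * (B + C).

    n: site reputation (external, query-independent)
    b: match between page content and the query
    c: match between inbound link text and the query
    """
    return n * (b + c)

# Hypothetical values on a 0..1 scale.
doorway = overall_relevance(n=0.1, b=0.9, c=0.0)   # keyword-stuffed, no reputation
trusted = overall_relevance(n=0.8, b=0.5, c=0.3)   # moderate match, good reputation

print(doorway < trusted)
```

Because reputation is a multiplier, a doorway page with a near-perfect content match but almost no reputation still ends up below a reputable page with a weaker match, which is the effect the external criteria were introduced to achieve.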
Internal relevance of the site

The search engine evaluates the internal relevance of a site by counting the number of search phrases per article. In doing so, the program treats the most frequently repeated word as the key one.

If the phrase entered by the user coincides with the most repeated sequence of words on the site, the resource is considered relevant. The search engine calculates this most repeated word or phrase for each site.
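
A minimal sketch of this idea, assuming simple word tokenization and no stop-word filtering (on real text, words such as "the" would dominate and would have to be excluded). The function names are hypothetical.

```python
import re
from collections import Counter

def most_repeated_phrase(text: str, n: int = 1):
    """Return the most frequent sequence of `n` words in `text`, or None."""
    words = re.findall(r"\w+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return None
    return Counter(ngrams).most_common(1)[0][0]

def is_internally_relevant(text: str, query: str) -> bool:
    """True if the query equals the most repeated phrase of the same length."""
    query_words = re.findall(r"\w+", query.lower())
    if not query_words:
        return False
    return most_repeated_phrase(text, len(query_words)) == " ".join(query_words)

page = "Red shoes are great. Red shoes last long. Buy red shoes"
print(most_repeated_phrase(page, 2))
print(is_internally_relevant(page, "red shoes"))
```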

What matters is not only that all the words entered by the user are present on the page, but also their order. The location of phrases in the text hierarchy is taken into account as well. The greatest weight is given to words in titles: if an article is titled with the same wording as the visitor’s request, the relevance of the site for that request will be high. The other criteria that affect the importance of a keyword are listed below.

Number of synonyms of the searched word.
Position relative to the beginning of the text. The closer the keyword is to the first line, the greater its significance.
The distance between the words that make up the requested phrase. The more exactly the phrase is repeated on the site, the more preferable the resource is.
Inclusion of relevant words in tags, meta tags, headings, and page titles.

In addition, the search engine robot notes the subject of the resource and, if it coincides with the request, returns the site as a result.
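
Some of these criteria can be illustrated with a short sketch that scores a title match, the position of the first keyword, and the distance between query words. Everything here is an assumption made for illustration: the function names, the scoring formulas, and the weights are invented, and synonyms and meta tags are left out.

```python
import re

def tokenize(text: str) -> list:
    return re.findall(r"\w+", text.lower())

def position_score(words: list, query_words: set) -> float:
    """1.0 if a query word opens the text, lower the later it first appears."""
    positions = [i for i, word in enumerate(words) if word in query_words]
    if not positions:
        return 0.0
    return 1.0 - positions[0] / len(words)

def proximity_score(words: list, query_words: list) -> float:
    """Distinct query words divided by the shortest span containing them all."""
    needed = set(query_words)
    if not needed:
        return 0.0
    best = None
    for start, word in enumerate(words):
        if word not in needed:
            continue
        seen = set()
        for end in range(start, len(words)):
            if words[end] in needed:
                seen.add(words[end])
            if seen == needed:
                span = end - start + 1
                if best is None or span < best:
                    best = span
                break
    return len(needed) / best if best else 0.0

def internal_score(title: str, body: str, query: str,
                   weights=(0.5, 0.2, 0.3)) -> float:
    """Weighted sum of title match, position and proximity (assumed weights)."""
    q = tokenize(query)
    words = tokenize(body)
    title_match = 1.0 if q and tokenize(title) == q else 0.0
    w_title, w_pos, w_prox = weights
    return (w_title * title_match
            + w_pos * position_score(words, set(q))
            + w_prox * proximity_score(words, q))

print(internal_score("Red shoes", "red shoes for sale", "red shoes"))
```

The proximity score in this sketch ignores word order; only the exact title match is order-sensitive.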
External relevance of the site

The term link popularity is used to evaluate this type of relevance. The value of this criterion depends on how often the surveyed resource is cited by other sites: the credibility of a site depends on the number of links to it placed on third-party sites. Thus, a site’s popularity on the network directly affects the assessment of the quality of its content. The algorithm for assessing external relevance has retained its essence since its invention by the founders of Google, although it has undergone many modifications, and it is still in use. Guided by the number of links found to a site, the search engine calculates PageRank, a coefficient reflecting the external relevance of the resource.
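
The idea behind PageRank can be sketched with the simplified textbook iteration below. It is not the algorithm Google runs in production; the damping factor of 0.85, the fixed number of iterations, and the tiny link graph are assumptions chosen for illustration.

```python
def pagerank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on in equal shares to its targets.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical graph: "a" and "b" both cite "c", and "c" cites "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this sketch a link counts for more when it comes from a page that is itself highly ranked, so the result depends on who links to a page and not only on how many links it has.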

Yandex got its own analogue of PageRank. VIC was developed in 2001 as a measure of a site’s credibility. The abbreviation stands for weighted citation index. This value was initially public, but in 2002 it was hidden from users because of attempts to manipulate it. Now only TCI, the thematic citation index, can be viewed; it is used to order sites in the Yandex catalog.

A citation index (CI) is used by Rambler too, but there it is combined with a rating of site visits by users. The Rambler system has been enhanced with this technology since 2002.

The first search engine to include a CI in its algorithm was Aport. The variable was introduced in 1999. In this search engine, the index was compiled only on the basis of the most significant link, the one obtained from the most popular site.
How to improve the relevance of web pages

Relevance of web pages is an evaluative concept that is actively used in search engine optimization and website promotion. Problems with finding relevant information are the main reason for falling conversion rates on websites.

If a member of the target audience cannot find the necessary content (text, audio, video, images) on the pages of the site, they begin to look for an alternative source of information. In other words, the visitor leaves the site for a competitor who is able to provide high-quality, relevant content.
The reasons for a decline in relevance

Formally, the study of relevance in information retrieval began in the middle of the last century. The new field of study later received the name bibliometrics. Early studies paid particular attention to finding relevant and accurate texts that answer a specific question.

With the development and globalization of the Internet, the flow of information has increased significantly. A huge number of irrelevant web pages appeared that are tailored exclusively for indexing by search engines. The result was the notion of technical relevance. In many ways, the desire to adapt to the current requirements of Yandex and Google has led to a significant increase in the number of irrelevant sites.

Reasons for the fall in the relevance of web pages

Use of non-unique content of poor quality.
Mismatch between the information provided and the key queries used to find it.
Posting outdated, irrelevant or false information.
Problems with site optimization, including link weight and internal linking.
Exceeding the optimal density of particular words and other problems with the semantic core.
Insufficient or incorrect use of tools for website promotion.

In some cases a query may be ambiguous or have several correct answers, so the variety of results is taken into account when evaluating the usefulness of web pages. The most relevant web pages are not necessarily the most useful to the user. Likewise, appearing on the first page of the search results should not be equated with high overall relevance: temporary leading positions are often occupied by sites that are merely well optimized technically.

From the point of view of a conventional search engine, relevance to a key query is obtained by analyzing the ratio of query terms to the other words and phrases in the text. The quality of the content itself is not evaluated. A measure called maximal marginal relevance (MMR) was proposed to overcome this shortcoming. It scores each document not only by its relevance but also by how much new information it adds relative to the results already selected.
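
A minimal sketch of MMR-style re-ranking follows. At each step it picks the document that maximizes lam × (similarity to the query) minus (1 − lam) × (highest similarity to any document already selected). The use of Jaccard similarity over word sets, the parameter name `lam`, and the sample documents are assumptions made for illustration.

```python
import re

def tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))

def jaccard(a: set, b: set) -> float:
    """Overlap of two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def mmr_rank(query: str, documents: list, lam: float = 0.7) -> list:
    """Order documents by relevance to the query, penalizing redundancy."""
    q = tokens(query)
    doc_tokens = [tokens(doc) for doc in documents]
    remaining = list(range(len(documents)))
    selected = []
    while remaining:
        def score(i: int) -> float:
            relevance = jaccard(doc_tokens[i], q)
            redundancy = max(
                (jaccard(doc_tokens[i], doc_tokens[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return [documents[i] for i in selected]

docs = ["red shoes sale", "red shoes sale today", "red hats"]
print(mmr_rank("red shoes", docs, lam=0.5))
```

With `lam=1.0` the ranking reduces to plain relevance; lower values push near-duplicates of already selected documents further down the list.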
Ways to improve the relevance of web pages

The relevance of the content affects how the site is indexed and ranked. Problems with content optimization lead to a lower conversion rate. Even if the pages are optimized for the requirements of the search engine, poor content quality will eventually drive visitors away. Experts in search engine promotion offer some useful tips for increasing relevance.

A member of the target audience will leave a site that provides outdated information, however well it is optimized technically. Modern search engine algorithms take behavioral factors into account, so over time the web page will lose its technical relevance (its position at the top of the search results will drop).

Methods to improve the relevance of the site

Hiring professional web designers to develop the site. High-quality usability, an attractive design and a well-thought-out structure will make the site pleasant for the audience.
