About Search Engines - Part II
Published: Friday, August 20, 2004

Indexing the Web Content

Much like the index of a book, a search engine extracts and builds a catalog of all the words that appear on each web page, along with details such as the number of times each word appears on that page. Indexing web content is a challenging task: assuming an average of 1,000 words per web page and billions of such pages, the index has to cover trillions of word occurrences. Because indexes are used for keyword searches, they have to be kept in computer memory to provide quick access to the search results.
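
As a rough illustration of such a catalog, the following minimal sketch maps each word to the pages it appears on and how often. The function name `build_index` and the sample pages are assumptions made for this example, not part of any real search engine.

```python
from collections import Counter, defaultdict
import re

def build_index(pages):
    """Map each word to {page_id: number of occurrences on that page}."""
    index = defaultdict(dict)
    for page_id, text in pages.items():
        # Lowercase the text and split it into alphanumeric words.
        words = re.findall(r"[a-z0-9]+", text.lower())
        for word, count in Counter(words).items():
            index[word][page_id] = count
    return dict(index)

# Hypothetical sample pages, for illustration only.
pages = {
    "page1": "Search engines index the web",
    "page2": "The web is large and the index is larger",
}
index = build_index(pages)
print(index["index"])  # {'page1': 1, 'page2': 1}
print(index["the"])    # {'page1': 1, 'page2': 2}
```

A structure of this kind is commonly called an inverted index, because it goes from words to pages rather than from pages to words.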

Indexing starts with parsing the website content. The parser extracts the relevant information from a web page by excluding common words (such as a, an, the, also known as stop words), HTML tags, JavaScript code and other unwanted characters. A good parser can also eliminate content that recurs across a website's pages (such as navigation links) so that it is not counted as part of each page's content.
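
The parsing step could look roughly like the sketch below, which uses Python's standard `html.parser` module to drop tags and script bodies and then filters out stop words. The class name `TextExtractor`, the function `parse_page` and the three-word stop list are assumptions for this example; a production parser would be far more thorough.

```python
from html.parser import HTMLParser
import re

# Illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the"}

class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style element.
        if self._skip_depth == 0:
            self.chunks.append(data)

def parse_page(html):
    """Return the indexable words of a page, without tags or stop words."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    text = " ".join(extractor.chunks).lower()
    words = re.findall(r"[a-z0-9]+", text)
    return [w for w in words if w not in STOP_WORDS]

html = (
    "<html><head><title>The Index</title>"
    "<script>var a = 1;</script></head>"
    "<body><p>An index of a book!</p></body></html>"
)
print(parse_page(html))  # ['index', 'index', 'of', 'book']
```

Removing repeated navigation links would need a comparison across several pages of the same site, which this sketch does not attempt.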

Once indexing is completed, the results are stored in memory in sorted order, which helps in retrieving the information quickly. Indexes are updated periodically as new content is crawled. Some indexes help create a dictionary (lexicon) of all words that are available for searching. A lexicon also helps in correcting mistyped words by showing the corrected versions alongside the search results. Part of a search engine's success lies in how its indexes are built and used. Various algorithms are used to optimize these indexes so that relevant results are found quickly without heavy use of computing resources.
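
To illustrate why sorted order helps, and how a lexicon can suggest corrections, here is a small sketch that uses binary search (`bisect`) for lookups and `difflib.get_close_matches` for suggestions. The word list and the function names `in_lexicon` and `suggest` are assumptions for this example; real search engines use their own, more sophisticated spelling models.

```python
import bisect
import difflib

# Hypothetical lexicon, kept in sorted order so lookups can use binary search.
lexicon = sorted({"crawler", "engine", "index", "ranking", "search"})

def in_lexicon(word):
    """Binary search: O(log n) comparisons instead of scanning every word."""
    i = bisect.bisect_left(lexicon, word)
    return i < len(lexicon) and lexicon[i] == word

def suggest(word):
    """Return the word itself if known, else the closest lexicon entry or None."""
    if in_lexicon(word):
        return word
    matches = difflib.get_close_matches(word, lexicon, n=1)
    return matches[0] if matches else None

print(suggest("search"))  # search
print(suggest("serch"))   # search
print(suggest("xyzzy"))   # None
```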

Storing the Web Content

In addition to indexing the web content, the search engine also stores the individual pages in its database. Thanks to cheaper disk storage, the storage capacity of search engines is very large and often runs into terabytes of data. However, retrieving this data quickly and efficiently requires specialized distributed and scalable data storage. The amount of data that a search engine can usefully store is limited by the amount of data it can retrieve for search results. At the time of writing, Google can index and store about 3 billion web documents, far more than any other search engine.

Search Algorithms and Results

Once a user enters the search keywords, the search engine's search algorithm looks them up in the indexes. When it finds matches for the keywords, the search engine tries to present the most relevant content first. This relevance matching is achieved by various search engine algorithms and is therefore the bread and butter of a search engine's popularity. Among all the search engines on the internet, Google stands out from the rest because it can provide more relevant answers to search queries. The search algorithms used to find the most relevant results in a haystack of web content differ from one engine to another. That is why a search for the same keywords produces different results on different search engines.
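
The lookup step can be sketched as follows: find the pages that contain every keyword, then order them by a relevance score. The scoring here (total keyword occurrences) is a deliberately naive assumption chosen for the example, and the `search` function and sample index are hypothetical; real engines combine many more signals.

```python
def search(index, query):
    """Return pages containing all query terms, most occurrences first."""
    terms = query.lower().split()
    if not terms:
        return []
    # One {page: count} mapping per query term; unknown terms match nothing.
    postings = [index.get(term, {}) for term in terms]
    # Keep only the pages that contain every term.
    matching = set(postings[0])
    for posting in postings[1:]:
        matching &= set(posting)
    # Naive relevance score: total number of keyword occurrences on the page.
    scored = [(sum(p[page] for p in postings), page) for page in matching]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [page for _, page in scored]

# Hypothetical index of the form {word: {page: occurrence count}}.
index = {
    "search": {"a.html": 3, "b.html": 1},
    "engine": {"a.html": 1, "b.html": 1, "c.html": 4},
}
print(search(index, "search engine"))  # ['a.html', 'b.html']
```

Swapping in a different scoring rule changes the order of the results, which is one way to see why two engines can return different results for the same keywords.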

Advanced search engines, like Google, use a relevance ranking system, where each web page is ranked based on various factors such as:
  1. Content analysis: The content of each web page is evaluated for the keywords based on the number of occurrences, their position in the page (such as title, meta tags, headings), font size, the proximity between them, etc.

  2. Linking structure: The links from external pages or websites to this page are analyzed for keywords in the link structure. Links from a popular website also lead to a higher ranking.

  3. Page ranking: This is a relative ranking of a website based on an algorithm that is used specifically by Google. The page rank denotes the ranking of a web page based on its popularity and the quality of its links, among various other factors. The basic idea is that a page with a higher page rank is easier to find on the internet.
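
The link-based part of such a ranking can be illustrated with the simplified, publicly described form of the PageRank iteration. This is only a sketch under stated assumptions: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links and the tiny link graph are choices made for this example, and it says nothing about how Google's production ranking actually works.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Simplified PageRank over {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical three-page site: both inner pages link back to "home".
links = {"home": ["about", "blog"], "about": ["home"], "blog": ["home"]}
ranks = page_rank(links)
print(max(ranks, key=ranks.get))  # home
```

In this toy graph, "home" ends up with the highest rank because it receives links from both other pages, which matches the idea that popularity and link quality raise a page's ranking.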


The search results decide the fate of a search engine. Different search engines try to cater to different users. AskJeeves is known to be popular because it provides search results based on descriptive, question-like queries. Its engine is optimized to parse the user-friendly search query for keywords, which are then used internally to perform the search. The user feels as if the question was processed by a human behind the computer. Search engine technology is evolving every day, and new research is being carried out to support more concept-based and descriptive search queries. However, the same principle applies: "The search engine which provides the most relevant results will rule".