About Internet Search Engines
Published: Friday, August 20, 2004


The internet contains a vast collection of information spread across remote web servers in every part of the world. The difficulty of locating the right information on the internet led to the creation of search technology, known as the internet search engine. A search engine provides links to relevant information based on your query. Examples of popular internet search engines are Google, Yahoo, MSN, Lycos and Ask Jeeves. To understand the terminology and techniques used to position your web pages for higher ranking in search engines, you need a basic knowledge of how a search engine works.

Functions of Internet Search Engines

A search engine is a piece of computer software that is continually modified to take advantage of the latest technologies in order to provide improved search results. Every search engine performs the same functions of collecting, organizing, indexing and serving results, but each does so in its own way, employing algorithms and techniques that are its trade secrets. In short, the functions of a search engine can be categorized as follows:
  1. Crawling the internet for web content.

  2. Indexing the web content.

  3. Storing the website contents.

  4. Search algorithms and results.

Crawling and Spidering the Web

Crawling is the method of following links on the web to different websites and gathering the contents of those websites for storage in the search engine's databases. A crawl can start afresh (beginning with a popular website containing lots of links, such as Yahoo) or from an existing, older index of websites. The crawler (also known as a web robot or a web spider) is a software program that downloads web content (web pages, images, documents and other files) and then follows the hyperlinks within that content to download the linked content. The linked content can be on the same site or on a different website.

Crawling continues until it reaches a logical stop, such as a dead end with no external links or the set number of levels inside the website's link structure. If a website is not linked from other websites on the internet, the crawler will be unable to locate it. Therefore, if a website is new and has no links from other sites, it has to be submitted to each of the search engines for crawling.
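The link-following process described above can be sketched as a breadth-first traversal with a depth limit. This is only an illustrative sketch: the `LINK_GRAPH` dictionary, the `fetch_links` helper and the example URLs are assumptions standing in for real page downloads, and no search engine's actual crawler is being reproduced here.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
LINK_GRAPH = {
    "http://a.example/": ["http://a.example/about", "http://b.example/"],
    "http://a.example/about": ["http://a.example/team"],
    "http://a.example/team": ["http://a.example/team/history"],
    "http://a.example/team/history": [],
    "http://b.example/": [],
    "http://orphan.example/": [],  # no other page links to this site
}


def fetch_links(url):
    """Stand-in for downloading a page and extracting its hyperlinks."""
    return LINK_GRAPH.get(url, [])


def crawl(seed, max_depth):
    """Breadth-first crawl from seed, following links at most max_depth levels deep."""
    visited = {seed}
    order = []
    queue = deque([(seed, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # logical stop: the level limit has been reached
        for link in fetch_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return order


pages = crawl("http://a.example/", max_depth=2)
```

With a limit of two levels, the page three levels down (`/team/history`) is never reached, and the unlinked `orphan.example` site is never discovered at all, which is why a new site without inbound links has to be submitted by hand.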

For efficiency, the crawler crawls multiple websites at the same time, so that it can collect billions of pages as frequently as possible. News and media sites are crawled more frequently (every hour or so) by advanced search engines like Google, in order to deliver updated news and content in their search results. The crawler also avoids flooding a single website with a high volume of simultaneous requests, and instead spreads its crawling over a period of time so that the website does not crash. Usually search engines crawl only a few (three or four) levels deep from the homepage of a website. The term deep crawl denotes a crawler or spider that can index pages many levels deep. Google is an example of a deep crawler.
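One simple way to picture this spreading of requests is a per-host delay: requests to different sites may start at once, while requests to the same site are spaced apart. The sketch below is an assumption-laden illustration; the `schedule_fetches` function, the ten-second delay and the example URLs are all hypothetical, and real crawlers use their own undisclosed scheduling policies.

```python
from urllib.parse import urlsplit


def schedule_fetches(urls, delay_seconds):
    """Assign each URL a start time (in seconds) so that requests to the
    same host are at least delay_seconds apart; different hosts may overlap."""
    next_free = {}  # host -> earliest time the next request to it may start
    schedule = []
    for url in urls:
        host = urlsplit(url).netloc
        start = next_free.get(host, 0.0)
        schedule.append((start, url))
        next_free[host] = start + delay_seconds
    return sorted(schedule)


plan = schedule_fetches(
    [
        "http://news.example/a",
        "http://news.example/b",
        "http://shop.example/x",
        "http://news.example/c",
    ],
    delay_seconds=10.0,
)
for start, url in plan:
    print(f"t={start:>5.1f}s  {url}")
```

The two hosts are fetched in parallel from time zero, but the three pages on `news.example` are spread over twenty seconds instead of being requested together.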

Crawlers or web robots follow guidelines specified for them by the website owner using the robots exclusion protocol (robots.txt). The robots.txt file specifies the files or folders that the owner does not want the crawler to index in its database. Many search engine crawlers do not like unfriendly URLs, such as those generated by database driven websites. These URLs contain parameters after the question mark. Search engines dislike such URLs because the website can overwhelm the crawler by using parameters to generate thousands of new web pages with similar content for indexing. Thus, crawlers often do not treat a change in the parameters as a new URL to spider.
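The parameter handling described above can be illustrated by treating every URL as its address without the query string and keeping only the first URL seen for each address. This is the simplest possible reading of that behavior, shown as a sketch; the `strip_query` and `unique_pages` names and the example URLs are hypothetical, and actual crawlers apply more nuanced rules.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string (and without any fragment)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def unique_pages(urls):
    """Keep only the first URL seen for each address, ignoring parameters."""
    seen = set()
    kept = []
    for url in urls:
        key = strip_query(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


candidates = [
    "http://shop.example/item.php?id=1",
    "http://shop.example/item.php?id=2",
    "http://shop.example/about.html",
]
to_crawl = unique_pages(candidates)
```

Here the second `item.php` URL is skipped because it differs from the first only in its parameters, so a page that exists only behind a parameter may never be spidered.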

Search engine friendly URLs are used to compensate for this problem.
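Returning to the robots exclusion protocol mentioned earlier in this section, the sketch below shows how a well-behaved crawler might check a URL against a site's robots.txt before downloading it. It uses Python's standard `urllib.robotparser` module; the robots.txt content, the `ExampleBot` user agent name and the example URLs are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt published by a website owner.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="ExampleBot"):
    """Return True if the robots.txt rules permit this crawler to fetch url."""
    return parser.can_fetch(user_agent, url)


print(allowed("http://www.example.com/index.html"))
print(allowed("http://www.example.com/private/data.html"))
```

The homepage is permitted, while anything under the `/private/` folder is excluded for every crawler that honors the protocol.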