Okay, so you know the basic concept of a search engine. Type a word or phrase into a search box and click a button. Wait a few seconds, and references to thousands (or hundreds of thousands) of pages will appear. Then all you have to do is click through those pages to find what you want.
But what exactly is a search engine, beyond this general concept of “seek and ye shall find”?
It’s a little complicated. On the back end, a search engine is a piece of software that uses applications to collect information about web pages. The information collected is usually key words or phrases that are possible indicators of what the page as a whole contains, along with the URL of the page, the code that makes up the page, and the links into and out of the page. That information is then indexed and stored in a database.
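To make the "indexed and stored" step concrete, here is a minimal sketch in Python. It assumes the simplest possible scheme: every word on a page counts as a key word, and the "database" is an in-memory dictionary that maps each word to the URLs containing it. The function names, the sample URLs, and the page text are all made up for illustration; real search engines are far more elaborate.

```python
import re
from collections import defaultdict

def extract_keywords(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each key word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in extract_keywords(text):
            index[word].add(url)
    return index

# Hypothetical pages standing in for collected web content.
pages = {
    "http://example.com/a": "Search engines index web pages",
    "http://example.com/b": "Crawlers collect pages for the index",
}
index = build_index(pages)
print(sorted(index["crawlers"]))
```

Storing the data by word rather than by page is the design choice that matters here: answering "which pages mention this word?" becomes a single lookup instead of a scan through every page.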
On the front end, the software has a user interface where users enter a search term — a word or phrase — in an attempt to find specific information. When the user clicks a search button, an algorithm then examines the information stored in the back-end database and retrieves links to web pages that appear to match the search term the user entered.
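The front-end lookup can be sketched in the same spirit. This example assumes the back-end database is a plain dictionary mapping each key word to a set of URLs, and it assumes that a page "matches" only if it contains every word in the search term. Both are simplifications chosen for illustration, and the `search` function and sample data are hypothetical rather than any real engine's algorithm.

```python
def search(index, query):
    """Return the URLs that contain every word in the query, sorted alphabetically."""
    words = query.lower().split()
    if not words:
        return []
    # Start with the pages for the first word, then narrow down word by word.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical back-end data: key word -> URLs of pages that contain it.
sample_index = {
    "search": {"http://example.com/a", "http://example.com/c"},
    "engine": {"http://example.com/a"},
    "crawler": {"http://example.com/b"},
}
print(search(sample_index, "search engine"))
```

A real search engine also ranks the matching pages, which this sketch leaves out; it returns them in alphabetical order only so the result is predictable.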
The process of collecting information about web pages is performed by an agent called a crawler, spider, or robot. The crawler follows links from page to page, visiting every URL it can reach on the Web, and collects key words and phrases from each page, which are then added to the database that powers the search engine.
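A crawler's basic loop can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: it assumes a breadth-first walk from one starting URL, and it takes a `fetch` function as a parameter so that a small in-memory dictionary (`FAKE_WEB`, invented for this example) can stand in for real HTTP requests. The `LinkParser` and `crawl` names are also hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl from start_url; fetch(url) returns HTML or None."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be retrieved, so skip it
            continue
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links against the page URL
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

# A tiny in-memory "web" stands in for real HTTP requests (assumption for the demo).
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/">home</a> some text',
    "http://example.com/b": '<a href="/missing">gone</a>',
    "http://example.com/orphan": "no page links here",
}
pages = crawl("http://example.com/", FAKE_WEB.get)
print(sorted(pages))
```

Note that the orphan page is never collected, because a crawler of this kind can only discover pages that some other page links to. The `seen` set keeps the crawler from revisiting a URL when pages link back to each other.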
The number of sites on the Web went over 100 million some time ago and is increasing by more than 1.5 million sites each month. Crawling all of that is like your brain cataloging every single word you read, so that when you need to know something, you think of a word and every reference to it comes to mind. In a word . . . overwhelming.