90% of all web sites visited on the Internet are found through search engines.
They are essentially a mechanism for users to find the products, services and specific information they require from a staggeringly vast amount of information.
Things are by no means as simple as they may appear where search engines are involved. This guide is a framework for designing and optimising web sites to perform well in any search engine, but predominantly Google, since in our experience it accounts for up to 60% of all traffic generated through search engines.
How do they work? Search engines do not simply search all the sites available on the Internet and then present you with the most relevant list of search results; it is much more complicated than that. If search engines were to check every site in “real time”, it would take months to return any results. You know how long it takes to search something as simple as your emails for a word or phrase; the Internet is vastly larger and more complex. So the search engine companies had to find a more manageable solution to the problem, and this is where software programs called robots and spiders come into the frame.
They move around the Internet, extracting content and other items from the web sites they find, and store these details in the search company's own database; this is called crawling or indexing. Once they have collected this information, they analyse and weight it according to a wide number of different factors and attach scores to the site for various attributes, such as keywords, content, number of links and number of pages. So essentially, when you use a search engine like Google, you are searching their database, not the Internet directly. That is why it is vitally important to make sure your site has been submitted to the main feeder engines: it means you will at least have a chance of being found.
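The crawl-then-index process described above can be sketched in a few lines of Python. This is purely an illustration: the pages dictionary stands in for the live web, and real robots are vastly more sophisticated, but the core idea of following links and storing stripped content in a database is the same.

```python
import re

# A stand-in for the live web: each URL maps to its page content.
# These pages and paths are invented for illustration.
pages = {
    "/": '<a href="/about">About</a> <a href="/products">Products</a>',
    "/about": 'We design web sites. <a href="/">Home</a>',
    "/products": 'Our services. <a href="/contact">Contact</a>',
    "/contact": "Email us.",
}

def crawl(start, pages):
    """Follow every link reachable from start and index the text found."""
    index = {}                 # the search engine's own database
    to_visit = [start]
    while to_visit:
        url = to_visit.pop()
        if url in index or url not in pages:
            continue           # already indexed, or a dead link
        html = pages[url]
        # Store the page content with the markup stripped out.
        index[url] = re.sub(r"<[^>]+>", "", html).strip()
        # Queue every link on the page, exactly as a spider would.
        to_visit.extend(re.findall(r'href="([^"]+)"', html))
    return index

index = crawl("/", pages)
```

When you search, the engine consults `index`, never the pages themselves, which is why unsubmitted, unlinked sites are invisible to it.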
It is understanding the calculations the search engines use to appraise sites that is so important in achieving high rankings. If we know how to optimise sites for search engines, we can begin to see results and this means more traffic, which could equal more business.
More about search engine robots... Search engine robots, as mentioned earlier, are the tools search engines use to gather content and information regarding the different sites available on the Internet. They are sometimes called “spiders”, “bots” or “crawlers”, due to the way they move from link to link, through a site. The robots seek out web sites, checking for new sites, new pages and any changes to existing content. Once they have gathered this information, they pass it into the database for “indexing”, where it is evaluated.
There are 3 possible reasons a robot will visit your site:
- You submitted the URL to the search engine through its submission pages.
- The robot has found your site from another web site linking to you, known as an external link.
- The robot knows you exist and is checking to see if your content has changed or been updated.
Robots are the first key to search engine optimisation. If you do not understand how they move around a site and what kind of navigation system they can follow, then your content is irrelevant, because the robot won't be able to find it.
Robots, at the moment, are relatively simple pieces of programming that have evolved from simple text browsers used in the early days of web browsing, when the Internet was used as a military information resource. They are able to read most text and code but struggle with the following:
- Frames and framesets; this is why they have been largely abandoned by designers.
- Flash animations and navigation systems.
- Invalid code or coding practices.
- Text contained in images.
- Dynamically created URLs.
Assuming the robots can find your site, the first thing they will check is your “robots.txt” file. Not all sites have one; if you do, it will be in what's called your root directory (the base folder your site is stored in). It informs the robot whether it is allowed to index your site, and directs it where it can and cannot go.
Your statistics package for your web site should tell you which robots have been visiting you, how often they are coming and how many pages they are indexing per visit. The three main robots are Yahoo's, Google's and Microsoft's, respectively identified as Inktomi Slurp, Googlebot and MSNBot. In order to do well in search engines, you want them to visit you every day and do comprehensive, deep-level indexing.
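If your statistics package does not break robot visits out for you, the same tally can be made from your raw access logs, because each entry records the visitor's user-agent string. The log lines below are made up, and real log formats vary by server, but the idea holds:

```python
from collections import Counter

# Hypothetical access-log entries; real formats vary by web server,
# but the user-agent string always identifies the robot.
log_lines = [
    '66.249.66.1 "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 "GET /about HTTP/1.1" 200 "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '207.46.13.5 "GET / HTTP/1.1" 200 "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '68.142.249.9 "GET / HTTP/1.1" 200 "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
]

# Substring to look for in the user-agent, mapped to the robot's name.
BOTS = {"Googlebot": "Googlebot", "msnbot": "MSNBot", "Slurp": "Inktomi Slurp"}

def count_bot_visits(lines):
    """Tally how many pages each known robot requested."""
    visits = Counter()
    for line in lines:
        for needle, name in BOTS.items():
            if needle in line:
                visits[name] += 1
    return visits

visits = count_bot_visits(log_lines)
```

Here Googlebot fetched two pages while the other robots fetched one each; watching those counts grow over time tells you how deeply each engine is indexing you.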
Q: So, what happens once all your information has been gathered?
A: The search engines index the content!
Search engine indexing.
Once your content has been extracted, it is indexed. Each search engine has its own unique system for evaluating this information, called an algorithm: essentially a complex mathematical equation that weighs up all the factors discussed in this document, compares them to every other relevant site and allocates a point score to your site. Because of the unique nature of these algorithms, a high ranking in Google will not necessarily give the same result in Yahoo. The search engine companies are constantly tweaking and changing their algorithms so that their engine is the one offering the most relevant results for your keyword search, and you will therefore use their service again. They also constantly refine their weighting techniques to guard against the dirty tricks used by unethical optimisation companies to give artificially high rankings to irrelevant pages. It is for this reason that employing the services of a search engine optimisation firm, such as MaSha Design, who live and breathe these rules on a daily basis, is imperative. The layman cannot hope to keep up with these changes and perform at a high enough level; it is a full-time job.
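The idea of an algorithm weighing up factors and allocating a point score can be sketched as a weighted sum. Every number below is invented purely for illustration; the real engines weigh hundreds of secret, constantly changing factors.

```python
# Invented weights for the attributes mentioned above; the real
# algorithms and their weightings are closely guarded secrets.
WEIGHTS = {"keywords": 0.4, "content": 0.3, "links": 0.2, "pages": 0.1}

def score(site):
    """Combine a site's attribute scores (each 0-100) into one ranking score."""
    return sum(WEIGHTS[attr] * site.get(attr, 0) for attr in WEIGHTS)

# Two hypothetical sites competing for the same keyword search.
site_a = {"keywords": 80, "content": 60, "links": 90, "pages": 40}
site_b = {"keywords": 50, "content": 90, "links": 30, "pages": 70}

# The results page is simply the sites sorted by their scores.
ranking = sorted([("site_a", score(site_a)), ("site_b", score(site_b))],
                 key=lambda pair: pair[1], reverse=True)
```

Because each engine chooses its own weights, the same two sites can rank in a different order on Google and on Yahoo, which is exactly the point made above.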
As I write this, the search engine companies are working on their next generation of robots, with much higher levels of intelligence, which will change the face of optimisation, at least for a while. In my opinion, Google is by far the fairest, most impartial and most intelligent search engine around. It does not allow paid listings to affect its main results, and it pushes over 70% of traffic through the web sites I monitor.
Written By: Shane Quigley.