If you work in this business, I bet you get asked the same question: “What is important for a search engine, and how do we do that?”
In most other businesses that would be easy to answer, but in SEO it is not a straightforward question. Here is my attempt to answer it for people who only like bullet points and ‘whooo-har’ action plans!
What do you think?
I have tried to write down some notes to explain the different stages. It is a summary and may include some sweeping generalisations, but the idea is to identify the important topics as I see them from an SEO perspective.
What is important for a Search Engine?
A search engine’s (SE’s) objective is to present the most relevant content to its users, and it has a number of ways of achieving this.
The SEs try to find all of the web’s content and then make sense of it all.
In short – they want the content, and we need to make good content and make it available!
How it works
There are 3 distinct stages (imho!)
1) Being crawled
2) Being Indexed
3) The SERPs (search engine results pages)
“Being Crawled” – when the search engine’s web crawler comes to our site and makes copies of our pages.
The spider (the crawler) works to an allocated workload and will consume as much content as it can find easily.
The quickest ways of stopping a spider are a restrictive robots.txt file or spider traps (where the spider gets trapped, e.g. in calendars with endless pages, or in error loops!)
The SEs admit they have limited capacity, so fresh, regularly updated content will get more attention than infrequently updated content on a poor site.
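To make the robots.txt point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` name and the example.com paths are all made up for illustration. It simply shows how a well-behaved crawler decides whether it may fetch a page, and how one rule can keep it out of a calendar-style trap.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: let crawlers in, but keep them out of a
# calendar section that could generate endless pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /calendar/
"""


def build_parser(robots_txt):
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_crawlable(path, agent="ExampleBot"):
    """True if the (made-up) crawler is allowed to fetch this path."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    for path in ["/", "/products/widgets.html", "/calendar/2031/01/"]:
        status = "crawlable" if is_crawlable(path) else "blocked"
        print(path, "->", status)
```

The flip side is that a single careless `Disallow: /` line would block the whole site, which is why robots.txt is also the quickest way of stopping a spider by accident.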
“Being Indexed” – when Google (or another SE) makes a first pass over the content it has found. Based on some basic quality scores it then allocates the page to the main index or to a separate ‘supplemental index’. We can assume this depends on a combination of factors: a good server that is available, accessible content, unique content with relevant keywords, good internal linking, quality inbound links, age, status, and no reason why the site should be barred. In this process Google (or another SE) ‘tags’ our content. A page can have multiple content tags, and it is all filed away on their servers.
“The run-time index” (the SERPs) – this is what you see when you search the SE. This is where the algorithm sits.
When you run a search, the SE searches the tags it has made about each page, recalls the most relevant ‘tag’ or ‘token’ and displays a snippet, all based on its own criteria.
This is always moving and is in a state of “Everflux”. This means they may change the weighting in their algorithm at any time.
Periodically they will have a ‘data push’ to their site, with a new feature, page format change etc. But there are no longer ‘Google dances’!
There is also a new variable – “Personalisation” – where Google produces an individual set of SERPs which it thinks is the most relevant to each user.
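The ‘tag at index time, look up at run time’ idea above can be sketched as a toy inverted index. This is only an illustration and not how any real SE works: the pages, the tokenising and the ‘count the matching words’ ranking are all my own assumptions.

```python
from collections import defaultdict

# Hypothetical pages standing in for the copies a crawler has made.
PAGES = {
    "/red-shoes": "Red leather shoes for running and walking",
    "/blue-shoes": "Blue canvas shoes for summer",
    "/hats": "Wool hats for winter walking",
}


def tokenize(text):
    """Split text into lower-case tokens, dropping simple punctuation."""
    return [word.strip(".,!?").lower() for word in text.split()]


def build_index(pages):
    """Index time: file every page under each token it contains."""
    index = defaultdict(set)
    for url, text in pages.items():
        for token in tokenize(text):
            index[token].add(url)
    return index


def search(index, pages, query, snippet_length=30):
    """Run time: look up the query tokens and rank pages by matches."""
    scores = defaultdict(int)
    for token in tokenize(query):
        for url in index.get(token, ()):
            scores[url] += 1
    # Assumed ranking rule: most matching tokens first, then URL order.
    ranked = sorted(scores, key=lambda url: (-scores[url], url))
    return [(url, pages[url][:snippet_length]) for url in ranked]


if __name__ == "__main__":
    index = build_index(PAGES)
    for url, snippet in search(index, PAGES, "walking shoes"):
        print(url, "-", snippet)
```

The point to take away is that the search never touches the live site: it only looks at what was filed away at index time, which is why the indexing stage matters so much.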
What do we measure?
1) Pages indexed
We measure the number of pages in the index. This is an approximate number and is best for trends rather than absolute numbers.
2) Ranking and Visibility report
This is taken from a sample of phrases. An automated programme tests these terms on the search engines and shows us how many results we appear in.
It also shows us a visibility score: how much exposure we get. Again, the trend matters more than the absolute numbers. It builds over time, and each reading is a snapshot from that day and datacenter.
3) Traffic and other business-specific KPIs, by traffic source etc.
And this is the one that really matters.
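As a rough sketch of the visibility score from the ranking report: the weighting below (10 points for position 1 down to 1 point for position 10, nothing after that) is purely my assumption, as are the phrases, dates and positions. Real ranking tools use their own formulas. What it shows is the idea of following the trend between snapshots rather than any single number.

```python
# Hypothetical snapshots: phrase -> our best position (None = not found).
SNAPSHOTS = {
    "2031-01-01": {"red shoes": 12, "blue shoes": 8, "wool hats": None},
    "2031-02-01": {"red shoes": 9, "blue shoes": 5, "wool hats": 20},
    "2031-03-01": {"red shoes": 4, "blue shoes": 3, "wool hats": 10},
}


def position_points(position, cutoff=10):
    """Assumed weighting: position 1 scores `cutoff`, position `cutoff` scores 1."""
    if position is None or position > cutoff:
        return 0
    return cutoff + 1 - position


def visibility_score(rankings):
    """Add up the points for every phrase in one snapshot."""
    return sum(position_points(p) for p in rankings.values())


def trend(snapshots):
    """Return (date, score, change since previous snapshot) in date order."""
    rows = []
    previous = None
    for date in sorted(snapshots):
        score = visibility_score(snapshots[date])
        change = None if previous is None else score - previous
        rows.append((date, score, change))
        previous = score
    return rows


if __name__ == "__main__":
    for date, score, change in trend(SNAPSHOTS):
        print(date, score, change)
```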
Ways to achieve this.
- Have a fully accessible site
- Well structured hierarchy
- Good Internal linking
- Comply with web standards in the code
- Stable site etc
- Unique, useful and accessible content
- Regularly updated
- Status and Authority
- Quality inbound links
- Well connected on the internet and endorsed by other authorities
Things to avoid
- Blocking their webcrawlers
- Confusing their webcrawlers – infinite loops, bad navigation
- Confusing URLs and internal links
- Dynamic sites that SEs cannot read easily
- CMS includes are bad
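- Long dynamic URLs full of ?s and &s – SEs can follow them, but session IDs and endless parameter combinations create duplicate URLs and waste the crawl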
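To illustrate the URL points in the list above: several addresses that differ only in their ? and & parameters can all lead to the same page, and a crawler may treat each one as a separate URL. Here is a minimal sketch of collapsing them to one address using Python's standard `urllib.parse`. The parameter names in `NOISE_PARAMS` and the example.com URLs are my own assumptions, and every site will have its own list.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of parameters that do not change what the page shows.
NOISE_PARAMS = {"sessionid", "sid", "ref"}


def canonical_url(url):
    """Drop noise parameters and sort the rest so duplicates collapse."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NOISE_PARAMS
    ]
    kept.sort()
    # The fragment (#...) is dropped as well; it is not sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    urls = [
        "https://www.example.com/shoes?sid=abc123&colour=red",
        "https://www.example.com/shoes?colour=red&ref=home&sessionid=9",
        "https://www.example.com/shoes?colour=red",
    ]
    # All three collapse to the same address.
    print({canonical_url(u) for u in urls})
```

If the internal links on the site only ever use the one clean form, the crawler has far fewer duplicates to wade through.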
It could go on – but this is enough for now.
Any comments welcome!