
Server Response Codes – what are they & why are they important?


When you request a URL or an element within a page, your browser sends a request. The magic of the inter-web happens and the server you end up on sends back the content along with a response code, which tells your browser whether the request succeeded, failed, or should be handled another way.

To really know what your servers are doing, I would suggest that you monitor your server response codes. This is especially important during any migrations, launches or major changes. You can get this data easily inside Google Webmaster Tools or any internal tools you may have, e.g. Tealeaf. If you are doing testing you can even use browser plugins, e.g. Live HTTP Headers for Firefox, or a tool like Fiddler.

Monitoring server response codes gives you insight into your site health and indexation, shows you what is really happening, and acts as a form of advanced usability monitoring, especially if you look by user agent or at a session level.
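If you want a quick and dirty view of this from a raw access log, here is a rough sketch in Python. The combined log format and the "access.log" path are assumptions, so adjust for your own servers.

```python
# A minimal sketch of the monitoring described above: count response codes per
# user agent from a web server access log. Assumes the common/combined log
# format; the path and field positions are illustrative, not a standard.
import re
from collections import Counter

LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def status_by_agent(log_path):
    """Return a Counter keyed by (user agent, status code)."""
    counts = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            match = LOG_LINE.match(line)
            if match:
                counts[(match.group("agent"), match.group("status"))] += 1
    return counts

if __name__ == "__main__":
    for (agent, status), hits in status_by_agent("access.log").most_common(20):
        print(f"{status}  {hits:>6}  {agent[:60]}")
```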

Server response codes are NOT “error codes”

It all depends on what you wanted your site to do. I jokingly say that reading response codes is a bit like reading ‘tarot cards’, as you need to add context to when the code was given. You will see some common definitions below. So, if you meant to give a 404 to delete a page from Google, then it is a positive thing. However, if you have accidentally deleted or moved a page, then it’s bad.

So, what are the most common response codes I should know about?

The definitions are from the mid-nineties and haven’t really changed. They are all numerical and they are grouped as…

Successes are in the 200’s

These codes indicate success. As in, you requested and you received.

200 = OK

The request was fulfilled as requested. This is the ideal answer to most requests made. And if you see redirects e.g. a 301, at the end of the chain you should get a 200.

201 = Created and OK

This should follow a POST command. POST means that you send a request and the content is then generated and returned, as opposed to a GET command, which ‘gets’ the same page every time! In the old days, POST was used to get search results and GETs were ‘static’ pages. These days, you can have a GET URL but actually POST to get the content. More to come with HTML5.
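Here is a tiny illustration of the difference, using Python’s requests library; the endpoint is made up, so treat this as a sketch rather than a spec.

```python
# A hypothetical REST-style endpoint - substitute one of your own.
import requests

get_resp = requests.get("https://example.com/widgets")        # just fetch the page
post_resp = requests.post("https://example.com/widgets",      # ask the server to create something
                          json={"name": "blue widget"})

print(get_resp.status_code)    # expect 200 (OK) if the page exists
print(post_resp.status_code)   # a well-behaved API answers 201 (Created) here
```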

Redirections are in the 300’s

These codes handle redirections, whether permanent or temporary in nature.

301 = Moved permanently

The data requested has been assigned a new URL, and the change is permanent. This is the most common type of redirect for any kind of content movement, site migration or major platform upgrade (effectively an internal migration). A user may notice a different URL in their browser, but for a SEBot this means: replace the URL you requested with this new one.

302 = Temporarily moved. A temp re-direct

The data requested actually resides under a different URL temporarily. A user will not likely notice, and the instruction to a SEBot is to look at the new location, but to keep the original URL in its db.
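If you want to sanity-check a redirect chain yourself, a little sketch like this (Python with the requests library, placeholder URL) prints each hop and the final status, so you can confirm a 301 or 302 really does end on a 200.

```python
# Audit a redirect chain: print each hop's status code and where it pointed,
# then the final destination. The URL is just a placeholder.
import requests

def show_redirect_chain(url):
    response = requests.get(url, allow_redirects=True, timeout=10)
    for hop in response.history:                      # each intermediate response
        print(hop.status_code, hop.url, "->", hop.headers.get("Location"))
    print(response.status_code, response.url)         # the final destination

show_redirect_chain("http://example.com/old-page")
```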

Bad requests are in the 400s

The request had bad syntax or was inherently impossible to be fulfilled.

403 = Forbidden request

The request is for something forbidden. You may see this if you are looking around a server and the sys admin has put permissions/access rules in place, or an internal request has not been given the right level of authentication.

404 = Page not found

A commonly used term across the business, but do people know what it means? In short, the server has not found anything matching the URL requested. This could be a problem if you are not expecting this response code, or could be great if you have deleted a page and you want the search engines to remove it from their index.

You should ensure that the content of that page works for the user, ideally as a sign-post page. A common mistake when creating a “404 page”, especially if you use your app layer to resolve the request, is to treat it like a disambiguation request and 302 to a page that returns a 200 with a 404-style message. You should check the response status of your 404s today.
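A quick way to run that check, sketched in Python with the requests library (the URL is deliberately nonsense):

```python
# Check that a deliberately nonsense URL really returns a 404 status, rather
# than redirecting to a "not found" page that answers 200 (a soft 404).
# The domain is a placeholder.
import requests

response = requests.get("https://www.example.com/this-page-should-not-exist-12345",
                        allow_redirects=False, timeout=10)

if response.status_code == 404:
    print("Good: the server answers 404 for missing pages.")
elif response.status_code in (301, 302):
    print("Warning: missing pages redirect to", response.headers.get("Location"))
else:
    print("Unexpected status:", response.status_code)
```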

Server problems are in the 500’s

500 = Internal Server Error

This means that the server encountered an unexpected condition which prevented it from fulfilling the request. Normally this would be an outage. If you see these in your analysis, you probably can’t do much about these, other than to keep your sys admins/hosting company in check.

503 = Temporarily not available

In the real world, if you take your website down for maintenance or updates etc. (if you need to take it down – why not load balance?), you should return a 503. This is effectively a soft 500. This tells the SEBots to come back later. This is important, as if you give a SEBot too many outages, you will lose your rankings!
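Purely for illustration, here is a toy ‘maintenance mode’ responder using Python’s built-in http.server. In real life this would live in your web server or load balancer config, but the idea is the same: answer 503 and send a Retry-After header so the bots know when to come back.

```python
# A toy maintenance responder: every request gets a 503 plus a Retry-After
# header. Illustrative only, not production config.
from http.server import BaseHTTPRequestHandler, HTTPServer

class MaintenanceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<h1>Down for maintenance - back shortly</h1>"
        self.send_response(503)                    # Service Unavailable
        self.send_header("Retry-After", "3600")    # suggest retrying in an hour
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), MaintenanceHandler).serve_forever()
```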

There are more response codes, but in reality, every day, these are the only ones you will normally come across.

Link to a fun post about response codes shown in pictures of cats.

How to build a business case


If you work in a large company you are probably familiar with having to prioritise.  And if, like me, you work with a year-long technology roadmap, to make any change you need to build a case.

I have found the best way to build this case is to use these 4 criteria.

• Size of Opportunity
• Risk of doing / risk of NOT doing
• Level of Effort
• Time to Impact

It generally works if there is a big opportunity and the time to impact is quick. High levels of effort and risk are then normally just managed.

If there is a long time to impact, and only medium or small possible return, you are unlikely to get your idea squeezed in.

Sometimes, if the level of effort is next to nothing, the size is medium and the impact is quick, you might be lucky!

If this is a SEO business case, that is when the risk of not doing really kicks in.  Try to model what the loss is, especially if this is a hygiene change!

I reckon most situations can be modelled on these 4 criteria.

Google’s SEO Starter Guide

Those helpful folk at Google have written a really simple and easy to follow PDF to explain the basics of SEO and how to do this on a small site.  It is worth a read.  Here is a link to download their file. (549kb)

You might want to compare this to my article about common SEO mistakes.

Here are some other Google articles 

What is SEO?

One of my responsibilities as a corporate SEO is to explain what SEO is.  This is quite often to lay-men/women who have a genuine interest, and occasionally to those who have been told to have an interest!

SEO has become a mainstream topic in recent years inside big and small companies alike.  And since SEO is a broad topic and covers many strategic and tactical elements it can be a long and varied chat.

And most conversations normally start with very specific questions, straight into the bits of information they have read or heard, or department-specific topics.  I unfortunately over-use the word “depends”, or answer their question with a flurry of questions back to them to qualify their interest.

This sometimes makes SEO seem slightly impenetrable.  Which it isn’t really!

SEO is a hybrid discipline of marketing & technology.  SEOs need to be logical, mathematical, project managers, researchers, data miners, lateral thinkers, writers, politicians, people managers, problem solvers and creatives.  And to date I have not met two SEOs who have the same background or specific SEO passion.

Generally the overlying objective is “to drive qualified traffic to site”.  This then normally breaks out as 

  • Traffic targets from search engines, &
  • Transactions / Actions.

SEO is made up of many constituent parts, and each piece is needed, as they all work together.  Hopefully this diagram shows the main areas, in my opinion.

Adrian's 5 SEO categories

In addition to all this, good SEOs must have market knowledge, understand competitors and the changing search-scape, all with half an eye on the SEO crystal ball.

SEO is also the catch-all for any web-related questions, on topics from UGC to usability.  Who else is there, normally, to turn abstract ideas into pragmatic, actionable, business-sound plans?

There is no one-size-fits-all SEO strategy; all websites are unique and can have differing goals, objectives, infrastructure, strong personalities and legacy systems.  But to date, all ideas / topics / issues can fit into one of my 5 categories.

SEO is many things to many people! And some of it is about Optimising websites for Search Engines!

A SEO site audit

This is my essay plan on what I would put in a site audit.  Obviously there are extra bits that you may need depending on the type of site you are working on, but this is meant to be a kind of generic framework.

Three main categories for any analysis

  • Technical Spec of site
  • The site itself
  • External SEO

These blend into this report format / essay plan below. So, here goes…..

SE Index visibility by Engine
Look at what each of the main SEs has on your site(s) and compare this to what they should have!

  • Observations / Duplications / URLs / Session ID’s etc
  • Tech issues, glitches, errors
  • Comments between G, Y and M.
  • Use of Google’s Webmaster Central etc

Site architecture
How is the site organised, and how does it all link up?  Can you find everything if you don’t have JS enabled? This includes deep content on SERPs or through refinements (there is a rough crawl sketch after this list).

  • Structure of domains and folders
  • Static pages
  • Search results, refinements and pagination
  • URL structure – capitalisation and inconsistencies
  • Directories versus Search results navigation
  • Effectively site map pages
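And here is that rough ‘no JS’ crawl sketch, using only the Python standard library. The start URL and the page limit are placeholders and a proper audit tool will do far more, but it shows roughly what a crawler sees when it can only follow plain <a href> links.

```python
# Fetch pages with plain HTTP and follow only the <a href> links found in the
# raw HTML - roughly what a crawler without JavaScript sees.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def crawl(start_url, limit=50):
    """Return the set of same-host URLs reachable through plain HTML links."""
    seen, queue = set(), deque([start_url])
    host = urlparse(start_url).netloc
    while queue and len(seen) < limit:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
        except Exception:
            continue
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if urlparse(absolute).netloc == host:
                queue.append(absolute.split("#")[0])   # drop fragments
    return seen

print(len(crawl("https://www.example.com/")), "pages reachable without JS")
```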

Site link analysis
Quantity, Quality, Depth. And internally, can you find everything in a logical and consistent way?

  • External links, Where do they go to? Anchor text of those links
  • Internal linking, crumb trails, on-page, anchor text review, footers and headers

Keyword ranking and optimisation
What are you targeting and is the optimisation up to it?

  • Ranking report summary, current position
  • Observations: head, body and tail; general KW distribution.
  • By Search Engine
  • Compared to competitor(s) performance
  • And how the keywords fit in the site structure and linking

Onsite optimisation

  • Metas: title and description. Best practice and a summary from GWC about duplicates, by key templates etc
  • Markup audit of H1, H2. Quality CSS
  • Error handling protocols including errors and genuine site changes, adding or removing pages!

Summary of key issues with solutions

Next steps and Recommendations

- – end – -

All you need to do now is take the recommendations and turn them into your action plan.

Hopefully you will get the technical resource and/or the biz resources before your competitors gain too much ground.

What is RSS? and how can it be used?

What is RSS? Well, you may have seen this logo appear on websites, especially sites that update their content very frequently.

It is defined as ‘Really Simple Syndication’. It means sharing data. What does that mean, I still hear you ask, and what does it mean to me?

Think of it as a distributable “What’s New” on websites.

On a practical basis you can get your news headlines in your ‘RSS reader’, a bit like receiving email! This is currently the main practical use for the average internet user. Many companies share data using XML; RSS is effectively just a form of XML, but for news and consumer content.

A program known as a feed reader or aggregator can check a list of feeds on behalf of a user and display any updated articles that it finds.
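If you are curious what a reader is doing under the hood, here is a minimal sketch in Python (standard library only, with a placeholder feed URL) that pulls an RSS 2.0 feed and lists the newest items.

```python
# A minimal feed-reader sketch: fetch an RSS 2.0 feed and print the latest
# item titles and links. The feed URL is a placeholder.
import xml.etree.ElementTree as ET
from urllib.request import urlopen

def latest_items(feed_url, limit=10):
    with urlopen(feed_url, timeout=10) as response:
        tree = ET.parse(response)
    for item in tree.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="(no title)")
        link = item.findtext("link", default="")
        yield title, link

for title, link in latest_items("https://www.example.com/feed.xml"):
    print(f"{title}\n  {link}")
```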

There are two main ones that I use for work and for personal info. The first is Bloglines, which is web based and has been around for a while. This has the advantage of being able to save blog searches. The other really popular web-based one is Google Reader. This is my main one for my personal and professional reading.

I really like the BBC (British Broadcasting Corporation) here in the UK for their explanation, and they have been very good in setting up their services. They have adapted their content and it is good stuff. Take a look at their content.

Take a look at an RSS reader and have a play.

I get all my work news, news and sport direct to my ‘news inbox’ every day. It is addictive and easier than trawling through your favourites to find new content.

Did this make sense? Any comments? A.

Common causes of duplicate content

I was trying to list out all the causes of duplicate content – well, accidental duplication.

Here is the list to date

  • Inconsistent URLs and links, especially search results or inventory via different attribute routes
  • Similar products or bundles of products with similar descriptions; this can be on your own site or on resellers’ sites
  • Print friendly pages inc. white papers, pdf downloads.
  • DNS errors, i.e. no redirect from http:// to http://www etc., or http versus https!
  • Content management systems that put session IDs in URLs

These all produce errors – and effectively you are ‘pissing in your own pool’, and risk getting duplicate content penalties, even though you probably don’t realise you have done it. The good news is that all of these can be fixed fairly easily (there is a rough canonicalisation sketch below).
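Here is that canonicalisation sketch, in Python. The preferred host and the session-style parameters to strip are purely illustrative; the point is simply that one URL should survive for each piece of content.

```python
# Rough URL canonicalisation: force one scheme/host, lowercase the path,
# and strip session-style parameters. The host and parameter names are
# assumptions - adjust to your own setup.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

DROP_PARAMS = {"sessionid", "sid", "jsessionid", "phpsessid"}   # illustrative
CANONICAL_HOST = "www.example.com"                              # illustrative

def canonical_url(url):
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k.lower() not in DROP_PARAMS]
    return urlunsplit((
        "https",                                   # one scheme
        CANONICAL_HOST,                            # one host (no www vs non-www split)
        parts.path.lower().rstrip("/") or "/",     # consistent case, no trailing slash
        urlencode(sorted(query)),                  # stable parameter order
        "",                                        # drop fragments
    ))

print(canonical_url("HTTP://Example.com/Shoes/?sid=ABC123&colour=red"))
# -> https://www.example.com/shoes?colour=red
```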

As I resolve more here – I will add to this list – send me more if you have them !

Best practice guides for Meta Data & H1′s

Best Practices for Title Tag

  • Limit the length of the titles between 70 – 90 characters including spaces
  • Focus on keywords that most closely relate to the purpose of the page
  • Repeat the primary keyword at least twice
  • Use a secondary keyword within the title
  • Create custom titles for top categories and locations on the site
  • Be careful with “templated” titles – they tend to be discounted as they are easy to detect as database generated; ensure they are differentiated, even if this is just adding ‘Page 2’ or similar!

Best Practices for Meta Description Tag

  • Meta descriptions should be crafted as marketing messages, since they can significantly impact the click-through rate from search result pages; there is no point being on page one if you are not going to get the attention of the user.
  • The meta description should be no more than 155 characters including spaces
  • Avoid using templated meta descriptions – customise the descriptions for the top areas of the site; if you are on a big site, be creative with database variables and output tags.
  • Repeat the keywords used in title within the meta description
  • Use variations of the main terms in different orders and tenses

Best Practices for Meta Keywords Tag

  • Meta keywords tag should be limited to keywords most closely related to the concept of the page – Avoid using keywords that do not directly relate to the content on the page
  • Limit to about 12-15 keyword phrases (2 – 3 words each)
  • Alternatively, limit the length of the keywords field to 255 characters at the very maximum (including spaces) – If the recommendation of closely related keywords is followed, you should find this limit extremely difficult to hit
  • Separate with “, “ (comma followed by space and then next keyword)

Best Practice for Page Elements

  • Use H1 tags
    - Repeat the Top Keyword related to the page at least twice and use 2nd best keyword at least once
  • Use H2 tags
    - The H2 tag can (and should) be slightly longer than the H1 tag, make it a true sub headline
    - Focus on slightly different variations of the Top 2 keywords
    - Depending on length try working in a 3rd keyword
  • SEO Content
    - Keyword-rich content is essential to gaining high rankings; make sure you use genuinely useful content, which just happens to have the exact SEO keywords in it!
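To check your own pages against these guides, a standard-library Python sketch like the one below will pull out the title, meta description and H1s and flag anything outside the ranges suggested above. The URL is a placeholder and the limits are the ones from this post, not an official rule.

```python
# Pull the <title>, meta description and <h1>s from a page and flag lengths
# outside the ranges suggested in this post.
from html.parser import HTMLParser
from urllib.request import urlopen

class HeadAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.h1s = []
        self._in = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag in ("title", "h1"):
            self._in = tag
            if tag == "h1":
                self.h1s.append("")

    def handle_data(self, data):
        if self._in == "title":
            self.title += data
        elif self._in == "h1":
            self.h1s[-1] += data

    def handle_endtag(self, tag):
        if tag == self._in:
            self._in = None

def audit(url):
    parser = HeadAudit()
    parser.feed(urlopen(url, timeout=10).read().decode("utf-8", "replace"))
    print(f"Title ({len(parser.title)} chars): {parser.title.strip()}")
    print(f"Description ({len(parser.description)} chars): {parser.description.strip()}")
    print(f"H1 count: {len(parser.h1s)}")
    if not 70 <= len(parser.title) <= 90:
        print("  - title outside the suggested 70-90 character range")
    if len(parser.description) > 155:
        print("  - meta description over 155 characters")

audit("https://www.example.com/")
```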

Read it aloud; if you have to ask if it sounds bad, it probably is.

Have fun.

If you have any other top tips – let me know and I will add them in.

p.s. This is always work in progress !

What do Search Engines want?

What do search engines want … (written in Mar 07 and moved here in July 08) will update at some point !!

… this can depend on who you ask – but here are some general notes.

YAHOO:

  • Original and unique content of genuine value
  • Pages designed primarily for humans, with search engine considerations secondary
  • Hyperlinks intended to help people find interesting, related content, when applicable
  • Metadata (including title and description) that accurately describes the contents of a web page
  • Good web design in general

Google:

  • A site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.
  • Make sure that our TITLE and ALT tags are descriptive and accurate. Check for broken links and correct HTML.
  • A site map for our users with links that point to the important parts of our site.
  • A useful, information-rich site, with pages that clearly and accurately describe our content.
  • Use text instead of images to display important names, content, or links. The Google crawler doesn’t recognize text contained in images.

MSN

  • Make sure that each page is accessible by at least one static text link.
  • Keep the text that you want indexed outside of images. For example, if you want your company name or address to be indexed, make sure it is displayed on your page outside of a company logo.
  • Add a site map. This enables MSNBot to find all of your pages easily. Links embedded in menus, list boxes, and similar elements are not accessible to web crawlers unless they appear in your site map.
  • Keep your site hierarchy fairly flat. That is, each page should only be one to three clicks away from the home page.
  • Keep your URLs simple and static. Complicated or frequently changed URLs are difficult to use as link destinations. For example, the URL www.example.com/mypage is easier for MSNBot to crawl and for people to type than a long URL with multiple extensions. Also, a URL that doesn’t change is easier for people to remember, which makes it a more likely link destination from other sites.

Ask:

  • Sites should load quickly and be polished, easy to read and easy to navigate.
  • Sites should be well maintained and updated regularly.
  • Sites should offer thorough and accurate information that provides information that is highly relevant to a user’s search term(s).
  • Sites should offer additional links or information related to a user’s search term(s).
  • Sites should demonstrate credibility by providing author and source citations and contact information.

You can get all that directly from their help sections or deduction!

E-commerce sites
There are a few additional considerations for transactional sites or sites with secure areas.
These additional guidelines for e-commerce sites are listed below:

  • Sites should provide secure transactions (preferably by SSL/SET)
  • Sites should disclose policies for customer privacy, returns, exchanges and other customer concerns
  • Sites should offer many types of the product being sought, relevant brands and/or an appropriate range of products
  • Sites should provide adequate product information
  • Sites should offer customer service by phone, preferably 24 hours a day

How do we give search engines what they want:
Use the words users would type to find our pages, and make sure that our site actually includes those words within it. Understand your customer and use their language. Isn’t this marketing 101?

Dynamic pages (i.e., the URL contains a “?” character) – not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few. Seems obvious – CMS systems and cookies are the first culprits of ruining this !

Use a text browser such as Lynx to examine the site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of our site in a text browser, then search engine spiders may have trouble crawling your site.

Allow search bots to crawl the sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of our sites, as bots may not be able to eliminate URLs that look different but actually point to the same page. Accessibility is the single quickest way to fail in search engines!

Make sure our web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.
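You can test this yourself. Here is a quick sketch with Python’s requests library (placeholder URL) that repeats a request with If-Modified-Since and looks for a 304.

```python
# Confirm the server honours If-Modified-Since: repeat a request with the
# Last-Modified value from the first response and look for a 304.
import requests

url = "https://www.example.com/some-page"
first = requests.get(url, timeout=10)
last_modified = first.headers.get("Last-Modified")

if last_modified:
    second = requests.get(url, headers={"If-Modified-Since": last_modified},
                          timeout=10)
    print("Conditional request returned", second.status_code,
          "(304 means the header is being honoured)")
else:
    print("No Last-Modified header sent - conditional requests will not help here.")
```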

The robots.txt file on the web server tells crawlers which directories can or cannot be crawled. Make sure it’s current so that we don’t accidentally block the Googlebot crawler. We can test our robots.txt file to make sure we’re using it correctly with the robots.txt analysis tool available in Google Sitemaps or webmaster central (many others available).
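Python’s standard library even ships a robots.txt parser, so a spot check can be as simple as this sketch (the domain and paths are placeholders).

```python
# Is a given URL crawlable for a given user agent, according to robots.txt?
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("https://www.example.com/robots.txt")
robots.read()

for path in ("/", "/checkout/", "/search?q=widgets"):
    allowed = robots.can_fetch("Googlebot", "https://www.example.com" + path)
    print(("allow " if allowed else "BLOCK "), path)
```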

Make sure that the content management system can export our content in a way so that search engine spiders can crawl our sites.

Specifically:

  • Avoid hidden text or hidden links.
  • Don’t employ cloaking or sneaky redirects.
  • Don’t send automated queries to Google.
  • Don’t load pages with irrelevant words.
  • Don’t create multiple pages, subdomains, or domains with substantially duplicate content.
  • Don’t create pages that install viruses, trojans, or other badware.
  • Avoid “doorway” pages created just for search engines, or other “cookie cutter” approaches such as affiliate programmes with little or no original content.
  • If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.

How to make a site search engine friendly:

  • Give visitors the information they’re looking for
  • Provide high-quality content on pages, especially the homepage. This is the single most important thing to do. If your pages contain useful information, their content will attract many visitors and entice webmasters to link to your site naturally. In creating a helpful, information-rich site, write pages that clearly and accurately describe your content. Utilise keyword research findings by using keywords on the page.

Links help our crawlers find our site and can give your site greater visibility in search results. When returning results for a search, Google combines PageRank (their view of a page’s importance) with sophisticated text-matching techniques to display pages that are both important and relevant to each search. Google counts the number of votes a page receives as part of its PageRank assessment, interpreting a link from page A to page B as a vote by page A for page B. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.”

Keep in mind that Google’s algorithms can distinguish natural links from unnatural links. Natural links to your site develop as part of the dynamic nature of the web when other sites find your content valuable and think it would be helpful for their visitors. Unnatural links to your site are placed there specifically to make your site look more popular to search engines. Some of these types of links (such as link schemes and doorway pages) are covered in Google’s webmaster guidelines.

Only natural links are useful for the indexing and ranking of our sites.

Make your site easily accessible
Build our sites with a logical link structure. Every page should be reachable from at least one static text link.

Use a text browser, such as Lynx, to examine your site. Most spiders see your site much as Lynx would. If features such as JavaScript, cookies, session IDs, frames, DHTML, or Macromedia Flash keep you from seeing your entire site in a text browser, then spiders may have trouble crawling it.

Consider creating static copies of dynamic pages. Although the Google index includes dynamic pages, they comprise a small portion of our index. If you suspect that your dynamically generated pages (such as URLs containing question marks) are causing problems for our crawler, you might create static copies of these pages. If you create static copies, don’t forget to add your dynamic pages to your robots.txt file to prevent us from treating them as duplicates.

Things to Avoid

Don’t fill your page with lists of keywords, attempt to “cloak” pages, or put up “crawler only” pages. If your site contains pages, links, or text that you don’t intend visitors to see, Google considers those links and pages deceptive and may ignore your site.

Don’t use images to display important names, content, or links. Google’s crawler doesn’t recognize text contained in graphics. Use ALT tags.

Don’t create multiple copies of a page under different URLs.

And a new thing – think about how you present your search results. Ideally these should not be treated as many, many pages. If they add value then that is OK. But don’t think that multi-criteria search results making hundreds of thousands of very similar pages is a great thing.

Think about what adds value and is easy for your users and then SEO should be taken care of.

Who is responsible for SEO in your company?

I say everyone !

SEO what is it all about and how does it work?

For anyone in this business, I bet you get asked the same question……. “what is important for a search engine and how do we do that?”

In most other businesses that would be an easy question to answer, but with SEO it is not straightforward. Here is my attempt to answer it for people who only like bullet points and ‘whooo-har’ action plans!

What do you think?

I have tried to write some notes down to explain the different stages. It is a summary and may include some sweeping generalisations, but the idea is to identify the important topics as I see them from an SEO perspective.

What is important for a Search Engine?

The SEs’ objective is to present the most relevant content to their users. They have a number of ways of achieving this.

They try to find all of the webs content and then make sense of it all.

In short – they want the content, and we need to make good content and make it available!

How it works

There are 3 distinct stages (imho!)

1) Being crawled

2) Being Indexed

3) The SERPs (search engine results pages)

“Being Crawled” – when the search engine’s web crawler comes to our site and makes copies of our pages.

This is based on an allocated workload for the spider, which will consume as much content as it can find easily.

The quickest way of stopping a spider is the robots.txt file, or producing spider traps (where the spider gets trapped, e.g. calendars or error loops!).

The SEs confess they have a limited capacity, and fresh content updated regularly will get more attention than infrequently updated content on a poor site.

“Being Indexed” – when Google (or another SE) makes a first pass over the content it has found. Based on some basic quality scores it will then allocate the page into the main index or a separate ‘supplemental index’. This is based on a number of factors, which we can assume are a combination of: a good server being available, accessible content, unique content with relevant keywords, good internal linking, quality inbound links, age, status, and no reason why the site should be barred. In this process Google (or another SE) ‘tags’ our content. A page can have multiple content tags, and it is filed away on their servers.

“the Run time index” – this is what you see when you search the SE. This is where the algorithm sits.

When you run a search, it searches the tags it has made about each page, recalls the most relevant ‘tag’ or ‘token’ and displays a snippet, based on its own criteria.
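As a toy illustration only (real engines are vastly more sophisticated), here is the ‘tag at index time, recall at query time’ idea in a few lines of Python.

```python
# A tiny inverted index mapping each word to the pages containing it. Real
# engines add weighting, link data, personalisation and much more.
from collections import defaultdict

pages = {                                    # pretend crawled content
    "/red-shoes": "red leather shoes for sale",
    "/blue-shoes": "blue canvas shoes and trainers",
    "/about-us": "about our family shoe shop",
}

index = defaultdict(set)                     # word -> set of URLs ("tags")
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

def search(query):
    """Return URLs containing every query word, with a crude snippet."""
    words = query.lower().split()
    hits = set.intersection(*(index.get(w, set()) for w in words)) if words else set()
    return [(url, pages[url][:40] + "...") for url in hits]

print(search("blue shoes"))   # -> [('/blue-shoes', 'blue canvas shoes and trainers...')]
```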

This is always moving and is in a state of “Ever flux”. This means they may change the weighting in their algorithm at any time.

Periodically they will have a ‘data push’, a new feature, a page-format change etc. But there are no longer ‘Google dances’!

There is a new variable, as mentioned: ‘personalisation’, where Google produces an individual set of SERPs which it thinks are the most relevant to each user.

What do we measure?

1) Pages indexed

We measure the number of pages in the index. This is an approximate number and is best for trends rather than absolute numbers.

2) Ranking and Visibility report

This is taken on a sample of phrases. An automated programme tests these terms on the Search Engines. It shows us how many results we return.

It also shows us a visibility score: how much exposure do we get? This is another theme, and the trend is more important than the absolute numbers. It builds over time and is a snapshot from that day and datacentre.

3) Traffic and other business specific KPI by traffic source etc etc.

And this is one that really matters.

Ways to achieve this.

  • Have a fully accessible site
  • Well structured hierarchy
  • Good Internal linking
  • Comply with webstandards on code
  • Stable site etc
  • Content
  • Unique useful content, and accessible
  • Regularly updated
  • Status and Authority
  • Quality inbound links
  • Well connected on the internet and endorsed by other authorities

Things to avoid

  • Blocking their webcrawlers
  • Avoid confusing their webcrawlers – infinite loops, bad navigation
  • Confusing URL and internal links
  • Duplication
  • Instability

Common misconceptions

  • Dynamic sites cannot be read by SE’s
  • CMS includes are bad
  • SE’s can’t read ? and &’s

It could go on, but this is enough for now.

Any comments welcome !