Canonical links for osCommerce beta

Ok, I did it (well, sort of). Here is a beta (or rather, a proof of concept) of the new tag that tells the search engines Google, Yahoo! and Live which URL they should use for the current page. Matt Cutts (software engineer at Google) explains the new canonical tag:

The syntax is pretty simple: An ugly url such as http://www.example.com/page.html?sid=asdf314159265 can specify in the HEAD part of the document the following:

<link rel="canonical" href="http://example.com/page.html"/>

That tells search engines that the preferred location of this url (the “canonical” location, in search engine speak) is http://example.com/page.html instead of http://www.example.com/page.html?sid=asdf314159265 .

What can it do for my webshop?
If search engines find the same content under different URLs, the page will be marked as “duplicate content”. This is a common problem that you want to avoid, since you want all your pages to be indexed properly. However, several factors can lead to more than one URL for the same page.

In osCommerce the URLs contain the category ID, the product ID and a session ID. Every new visit(or) gets a unique session. Here lies the problem: the unique session ID creates a different URL for the same page! This can create duplicate content.

You want to tell the search engines that the product page is found here
/catalog/product_info.php?cPath=44&products_id=172
and not here
/catalog/product_info.php?cPath=44&products_id=172&osCsid=7eeskreq4qo639cg38 (where osCsid is unique for every new visit(or))
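
To make the goal concrete, here is a minimal sketch of that normalisation in PHP. The function name strip_session_id is hypothetical and not part of osCommerce. It assumes the session parameter is called osCsid, as in the URLs above, and it drops only that parameter while leaving the rest of the query string untouched.

```php
<?php
// Hypothetical helper (not part of osCommerce): remove the osCsid
// session parameter from a request URI and keep everything else.
function strip_session_id($uri)
{
    // Split the URI into path and query string at the first "?".
    $parts = explode('?', $uri, 2);
    $path  = $parts[0];
    if (!isset($parts[1])) {
        return $path; // no query string, nothing to strip
    }

    $kept = array();
    foreach (explode('&', $parts[1]) as $pair) {
        // Skip the session parameter, keep every other pair as it is.
        if ($pair !== '' && strpos($pair, 'osCsid=') !== 0) {
            $kept[] = $pair;
        }
    }

    // Re-attach the remaining parameters, if any are left.
    return $kept ? $path . '?' . implode('&', $kept) : $path;
}

echo strip_session_id('/catalog/product_info.php?cPath=44&products_id=172&osCsid=7eeskreq4qo639cg38');
// prints /catalog/product_info.php?cPath=44&products_id=172
```

Both URLs above come out as the same string, which is the one you want the search engines to index.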

How to implement the canonical tag in your webshop
I must warn you, this is just a quick and dirty test. The contribution with optimized code and (probably) more compatibility will follow very soon.

You only have to adjust two files: index.php and product_info.php.

  1. Open both files and add the following code before require('includes/application_top.php');

    $string = $_SERVER['REQUEST_URI'];
    $search = '/[?&]osCsid.*/'; // the session ID parameter and everything after it
    $replace = '';
  2. Add the following code within your <head> section to generate the correct URL:

    <link rel="canonical" href="<?php echo 'http://www.yourdomain.com' . htmlspecialchars( preg_replace( $search, $replace, $string ) ); ?>" />
  3. Don’t forget to replace yourdomain.com with your actual domain name.
  4. Done!

As I said, this is just a quick and dirty test. It works with and without search engine friendly URLs. Although I tested it on three different osCommerce setups, it might not work on your installation. Please wait for the stable contribution.
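
If you prefer a single reusable piece, the steps above could be wrapped roughly as in the sketch below. The function name canonical_link_tag and its two parameters are hypothetical and not part of osCommerce. It uses preg_replace to cut the URL at the osCsid parameter, trims a trailing slash from the base URL with rtrim, and escapes the result before it goes into the href attribute.

```php
<?php
// Hypothetical helper: build the canonical <link> tag for a request.
// $baseUrl is your shop's scheme and host (an assumption, hard-coded by you);
// $requestUri is normally $_SERVER['REQUEST_URI'].
function canonical_link_tag($baseUrl, $requestUri)
{
    // Cut the URL at the osCsid parameter, dropping it and what follows.
    $path = preg_replace('/[?&]osCsid.*/', '', $requestUri);

    // rtrim only removes slashes that are really there, so a base URL
    // without a trailing slash is left intact.
    $url = rtrim($baseUrl, '/') . $path;

    // Escape for safe use inside an HTML attribute (& becomes &amp;).
    return '<link rel="canonical" href="' . htmlspecialchars($url, ENT_QUOTES) . '" />';
}

echo canonical_link_tag(
    'http://www.yourdomain.com/',
    '/catalog/product_info.php?cPath=44&products_id=172&osCsid=7eeskreq4qo639cg38'
);
```

Because rtrim strips a slash only when one is present, the same call works whether or not your configured server URL ends in a slash.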

12 comments:

  1.  

    [...] I launched my quick and dirty proof of concept for the implementation of the “canonical urls” link tag. Now I gathered some more information, tidied up the code and released it as a [...]

     
  2. Tony, 28. February 2009, 5:52

    Hi
    Do not copy and paste from the above as you may get errors with formatting of the single quotes.
    Tony

     
  3. Graeme Belle, 28. February 2009, 6:53

    gave the osc contrib a test, I’m on a xxxx.com.au. Source shows no change and link shows as missing “u” Can’t see why but is probably the reason it fails on my site.

    I like the idea so will keep watch.

    graeme

    Admin: Yes, WordPress mangles the quotes. Sorry about that. It also looks like your server doesn’t put a trailing slash after your domain name. In that case, replace the $domain line with this one:

    $domain = "http://www.yourdomain.com.au";

    This replaces the server check with your hard-coded domain name, without the trailing slash. That should fix it.

     
  4. Graeme Belle, 28. February 2009, 6:55

    from source

    forgot to add
    graeme

     
  5. mark brindle, 12. March 2009, 2:13

    worked ok for me, however pages that use ‘sort’ and ‘page’ get listed as their own canonical links. wondering if these should be filtered or not, as well as the osCsid? i currently use a method to flag those ones as ‘noindex, follow’ instead – still waiting for google to stop reporting duplicate title tags on those to see if my work has fixed the problem or not.
    What’s your feeling on canonical when used with seo extras that make pages like:
    product_name-p-222.html ? your add-on works fine for these – however since the same page is still accessible as -p-222.html, which gives the same canonical (ie -p-222.html), this doesn’t help! I guess i need to fix the seo add-on to stop them being found and indexed as -p-222.html in the first place…

    Admin: Oh dear. You’ve got a good point there. The additional “sort” and “page” parameters don’t get excluded by the canonical tag. However, these should be unique pages, so I don’t think it will do any harm. It’s not pretty though… I will think about this and hopefully fix it soon.

    The extra p-222.html is an extra parameter which carries the ID of a product. It is nothing harmful and certainly something you do not want to remove from the URL. The explanation is quite simple: you can have products with the same name in your store. Without the extra parameter the shop would not know which product to show, since the unique ID would be missing.

     
  6. kbking, 28. March 2009, 2:39

    Thanks!

    Works a treat!

     
  7. mike, 13. April 2009, 17:02

    Hello,

    I downloaded and installed. In the canonical URL in the head of index.php and product_info.php, the “m” in .com is cut off. It has something to do with removing the trailing slash. Anyone else have this problem?
    Thanks

     
  8. mike, 14. April 2009, 16:43

    Sorry, I didn’t see Graeme Belle’s comment above. Works great after hard coding the URL.

     
  9. marc, 9. May 2009, 16:43

    hello,

    For mike: I had this problem too. I modified this line:
    $domain = substr((($request_type == 'SSL') ? HTTPS_SERVER : HTTP_SERVER), 0); // gets the base URL minus the trailing slash

    I removed the minus.

     
  10. luke, 26. June 2009, 5:09

    what about situation like :-

    http://www.yourdomain.com.au
    yourdomain.com.au

    does the above code take care of this type of canonical problems?

     
  11. Stephan Miller, 18. March 2010, 16:38

    For the www and non-www domain issue, just use a redirect in the .htaccess file. I have a question about the canonical URLs though. My installation of osc has at least 3 URLs for each product: one with just the product_id, one with the category path and one with the manufacturer’s ID. Which would you choose as the canonical URL out of these three?

     
  12. Thomas, 4. May 2010, 9:41

    Hello,

    I see there is no answer on the last comment – I don’t know if this discussion is still active or not. I can imagine you want to move on after a certain point :)

    I’m having the following issue after some changes in oscommerce: our shop exists in 3 languages. Product info pages are accessible through the following URLs:

    /product-name-dutch-p-1.html
    /product-name-english-p-1.html?language=dutch
    /product-name-french-p-1.html?language=french

    Will this contribution take care of this problem? Obviously I want the canonical URL to be the URL in the correct language, and the URLs in the 2 other languages, with the language variable appended, to redirect to the first one.

    Thanks for your answer

     
