Google has added page load time information to Google Webmaster Tools, under "Labs>Site performance", and introduced webmasters to the Page Speed tool, which gives detailed information and guidelines for improving page performance. The official Google blog article How fast is your site explains the site performance metric. Instructions for installing and using Page Speed are at Using Page Speed, and there is an excellent collection of guidelines at Web Performance Best Practices.
Another new feature of Google search results, in "Show options>Standard results", is ‘Images from the page’, which shows a selection of thumbnail images embedded in a page.
Google Webmaster Tools added two new features: Fetch as Googlebot, which shows how Googlebot sees a page, and Malware details, which shows the malware code found in infected websites. Both features are in the new Labs tab of Google Webmaster Tools.
Google presents the new features in the article Fetch as Googlebot and malware details on the Official Google Webmaster Central Blog, and the malware details feature is discussed in detail in the article Show me the malware on the Google Online Security Blog.
Google announced a proposal for crawling AJAX-based websites in the article A proposal for making AJAX crawlable on the Official Google Webmaster Central Blog. Under the proposal, one way of making an AJAX-based website crawlable is for the site to change the way it marks links that carry fragment identifiers, and to use a server-side headless browser to generate the HTML code that corresponds to the final state of the page in the browser.
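To make the idea of specially marked fragment identifiers more concrete, here is a minimal sketch of the URL mapping. It assumes the `#!` marker and the `_escaped_fragment_` query parameter from Google's AJAX crawling scheme, neither of which is spelled out in the paragraph above. The function name `crawler_url` and the example.com URLs are made up for illustration, and the escaping is a simplification.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def crawler_url(pretty_url: str) -> str:
    """Map a '#!' AJAX URL to the URL a crawler would request instead.

    Assumption: fragments starting with '!' mark crawlable AJAX states, and
    the state is passed to the server in an '_escaped_fragment_' parameter.
    """
    parts = urlsplit(pretty_url)
    if not parts.fragment.startswith("!"):
        # Ordinary fragment (or none): nothing to rewrite.
        return pretty_url
    # Percent-encode the state, keeping '=' readable (simplified escaping).
    state = quote(parts.fragment[1:], safe="=")
    extra = "_escaped_fragment_=" + state
    query = parts.query + "&" + extra if parts.query else extra
    # Drop the fragment: the state now travels in the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(crawler_url("http://example.com/stocks.html#!GOOG"))
```

The server would answer the rewritten URL with the HTML snapshot produced by the headless browser, while visitors keep using the `#!` URL.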
Google recently (August 2009) announced on the Official Google Webmaster Central Blog that Google Image Search now supports RDFa mark-up indicating the type of license for an image embedded in an HTML page (it applies only to images owned by the site using the mark-up). The mark-up wraps around the image to make clear which reuse license applies to which image. The example given in the blog article is
<div about="image.jpg">
<img src="image.jpg" />
<a rel="license" href="http://creativecommons.org/example">Creative Commons Attribution example</a>
</div>
The RDFa attributes for labelling the type of usage rights for reuse are the about attribute of the div element enclosing the image and the hyperlink describing the type of license, and the rel="license" attribute of the hyperlink.
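As a rough illustration of how the two attributes work together, the sketch below reads the mark-up with Python's standard html.parser and pairs each license link with the about value of the div that encloses it. It is a simplified reader written for this example, not a full RDFa parser and not how Google processes pages. The class name `LicenseFinder` is hypothetical.

```python
from html.parser import HTMLParser

MARKUP = """
<div about="image.jpg">
<img src="image.jpg" />
<a rel="license" href="http://creativecommons.org/example">Creative Commons Attribution example</a>
</div>
"""


class LicenseFinder(HTMLParser):
    """Collect {image: license URL} pairs from div[about] / a[rel=license]."""

    def __init__(self):
        super().__init__()
        self.subjects = []   # 'about' value of each open div (None if absent)
        self.licenses = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            self.subjects.append(attrs.get("about"))
        elif tag == "a" and "license" in (attrs.get("rel") or "").split():
            # The license applies to the nearest enclosing div with 'about'.
            subject = next((s for s in reversed(self.subjects) if s), None)
            if subject and attrs.get("href"):
                self.licenses[subject] = attrs["href"]

    def handle_endtag(self, tag):
        if tag == "div" and self.subjects:
            self.subjects.pop()


def find_licenses(html: str) -> dict:
    finder = LicenseFinder()
    finder.feed(html)
    return finder.licenses


if __name__ == "__main__":
    print(find_licenses(MARKUP))
```

A license link outside any div with an about attribute is ignored by this sketch, which mirrors the point that the wrapper is what ties a license to a particular image.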
Mystery solved: the home page URL www.asymptoticdesign.co.uk now appears with title and snippet in Google search results for a search using the site: operator. I was right, I think: it was a matter of time. Search engines take time to update, in search results, data that took a long time to build up. Time and a bit of waiting are important tools in the art of SEO.
I have a theory that the hugely complex databases of search engines, like Google and Yahoo, are structured in layers, with the deeper layers having more inertia in search results.
Blogs have a nifty way to add semantic structure with categories and tags. I was wondering how to label the category for my blog articles about blogs, about the SEO and web programming aspects of blogs, and I thought of inventing a witty word, “blogosophy”, a word based, like the word “philosophy”, on the Greek word for wisdom, σοφία (sophia). I did a Google search for it, thinking that soon only my website would turn up for this new word, and surprise… the word “blogosophy” already exists; it is a word as established as the blogosphere…
Blogs are in theory very search-engine-friendly. They are the perfect format for adding new content often, and search engines like that. Blogs also have many built-in features and settings that make them easy for search engines to crawl and index.
Blogs have search-engine-visibility settings. For example, the WordPress software adds or removes a noindex,nofollow meta tag in the HTML head element of blog pages according to this visibility setting. A noindex,nofollow meta tag makes a page invisible in search results: it prevents search engines from adding the page to search results. For that to work, however, the page has to be accessible to search engines, not blocked in the robots.txt file. If a URL is blocked in the robots.txt file, search engines do not access it, but the URL can still appear in search results (only the URL, no title or snippet) if search engines collected that URL from links or XML sitemaps.
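The interaction between the meta tag and robots.txt is easy to get wrong, so here is a small sketch of the reasoning in code. It detects a robots meta tag with Python's standard html.parser and then applies the rule described above: a blocked page is never fetched, so its noindex tag is never seen. The function and status names are invented for this example, and real search engines are of course more nuanced.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def search_result_status(html, blocked_in_robots_txt, url_known_from_links):
    """Return 'url_only', 'not_listed' or 'indexable' (hypothetical labels)."""
    if blocked_in_robots_txt:
        # The crawler never downloads the page, so any meta tag is unseen.
        return "url_only" if url_known_from_links else "not_listed"
    parser = RobotsMetaParser()
    parser.feed(html)
    return "not_listed" if "noindex" in parser.directives else "indexable"


PAGE = ('<html><head><meta name="robots" content="noindex,nofollow" />'
        '</head><body>Post</body></html>')

if __name__ == "__main__":
    print(search_result_status(PAGE, False, True))  # tag is read and obeyed
    print(search_result_status(PAGE, True, True))   # tag is never read
```

The second call is the trap: blocking the page in robots.txt hides the noindex tag from the crawler, so the bare URL can still be listed.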
Another feature important to search engines is the URL structure. Blog software provides settings for that; for example, WordPress gives the option, in Settings>Permalinks, to have a user-friendly, search-engine-friendly URL structure, like http://sitename.com/blog/2009/08/03/post-title-excerpt/
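To show how such a URL is assembled, here is a rough sketch that expands a day-and-name permalink structure. It assumes the WordPress structure tags %year%, %monthnum%, %day% and %postname%. The `slugify` and `permalink` helpers are my own simplification, not WordPress code, and the post title is invented.

```python
import re
from datetime import date


def slugify(title: str) -> str:
    """Lower-case the title and join its words with hyphens (simplified)."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def permalink(base: str, structure: str, published: date, title: str) -> str:
    """Expand a permalink structure into a full post URL."""
    path = (
        structure
        .replace("%year%", f"{published.year:04d}")
        .replace("%monthnum%", f"{published.month:02d}")
        .replace("%day%", f"{published.day:02d}")
        .replace("%postname%", slugify(title))
    )
    return base.rstrip("/") + path


if __name__ == "__main__":
    print(permalink(
        "http://sitename.com/blog",
        "/%year%/%monthnum%/%day%/%postname%/",
        date(2009, 8, 3),
        "Post title excerpt",
    ))
```

The date and the words of the title end up in the path, which is what makes the URL readable for both visitors and search engines.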
The URLs of the RSS feeds that the blog software generates automatically, which include recent posts, can be submitted as XML sitemaps to search engines like Google, Yahoo, MSN and Ask, and added in Sitemap lines to the robots.txt file, to quickly let search engines know about the new URLs of the blog and help with indexing in search results.
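As an illustration, the sketch below builds a robots.txt with one Sitemap line per feed and reads the lines back with Python's standard urllib.robotparser (the `site_maps()` method needs Python 3.8 or later). The feed URLs are assumptions based on the example site name used above, and the helper names are made up.

```python
from urllib.robotparser import RobotFileParser

# Assumed feed URLs for the example blog; adjust to the real feed addresses.
FEEDS = [
    "http://sitename.com/blog/feed/",
    "http://sitename.com/blog/feed/atom/",
]


def build_robots_txt(feed_urls):
    """Allow everything and advertise each feed in a Sitemap line."""
    lines = ["User-agent: *", "Disallow:", ""]
    lines += [f"Sitemap: {url}" for url in feed_urls]
    return "\n".join(lines) + "\n"


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs a robots.txt parser finds (None if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps()


if __name__ == "__main__":
    text = build_robots_txt(FEEDS)
    print(text)
    print(declared_sitemaps(text))
```

Sitemap lines sit outside the User-agent groups, so they apply to every crawler that reads the file.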
It is very important to have indexable URLs that are easy for search engines to find, but it is just as important to stop search engines from crawling and indexing duplicate-content URLs, and to avoid presenting search engines with a quasi-infinite space of URLs. For example, the search feature and category navigation generate URLs with content similar to that of the blog pages, and it is good practice to block these URLs in the robots.txt file.
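A quick way to check such blocking rules before publishing them is to run them through a robots.txt parser. The sketch below uses Python's standard urllib.robotparser. The /blog/search/ and /blog/category/ paths are hypothetical examples (a real blog may use different paths or query strings for search), and the parser's matching is simpler than what individual search engines implement.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block search results and category listings only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/search/
Disallow: /blog/category/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Tell whether a generic crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    for url in (
        "http://sitename.com/blog/2009/08/03/post-title-excerpt/",
        "http://sitename.com/blog/category/seo/",
        "http://sitename.com/blog/search/seo/",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```

Post URLs stay crawlable while the listing pages that repeat their content are kept out of the crawl.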