Tracking OERs: Web search

This page is one of several describing technical approaches to tracking the use of OERs.

The idea behind using web search as a tracking mechanism is that one may discover where on the open web a resource is available by putting a unique phrase, tag or identifier into the text of the resource and then searching Google or another index for that phrase. The key phrase should be unique to the resource, project or programme, depending on the level of aggregation required for the tracking. An extreme form is to use the whole text of the resource as the search phrase.
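
As a minimal sketch of this idea, the Python below derives a tag at the chosen level of aggregation and builds an exact-phrase search URL for it. The function names, the "oertrack-" prefix, the hashing scheme and the default search URL are all assumptions made for illustration; any phrase that is unlikely to occur by chance would serve equally well.

```python
import hashlib
from urllib.parse import urlencode


def make_tracking_tag(programme, project="", resource=""):
    """Derive a short tag from whichever levels are supplied.

    Hypothetical scheme: supplying only the programme gives one tag shared
    by every resource in it; adding project and resource narrows the tag.
    """
    key = "/".join(part for part in (programme, project, resource) if part)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return "oertrack-" + digest


def exact_phrase_query_url(phrase, base="https://www.google.com/search"):
    """Build a search URL with the phrase in double quotes (exact match).

    The base URL is an assumption; substitute whichever index is used.
    """
    quoted = '"' + phrase + '"'
    return base + "?" + urlencode({"q": quoted})


programme_tag = make_tracking_tag("ukoer")
resource_tag = make_tracking_tag("ukoer", "biology", "photosynthesis-notes")
print(exact_phrase_query_url(resource_tag))
```

The tag would be placed in the text of each resource before release; the same function then regenerates it whenever a search is run, so no lookup table of tags needs to be kept.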


 * Web search can track how many places on the open web are hosting a resource, regardless of how many people access the resource on third party sites.


 * This technique cannot be used to track copies located on sites that do not allow web indexing, whether because they are private intranets, sit behind authentication gateways or otherwise block web crawling.


 * If the licence under which the OERs are published allows modification, then the text or phrase used for searching may be removed when the resource is copied. Some so-called "badgeware" Open Source Software licences allow sharing on the condition that some text or embedded image is retained.


 * The unique text used for the search may be a tailored URL (e.g. an HTTP-URI identifier for the resource, handled by a [URL redirect] service hosted by the OER provider) or a link to an image (perhaps a [web bug]). This would have several consequences: searching for a link URL on Google gives precise results for a short piece of text; the prominence on Google of the resource being linked to would increase; and one could correlate the results of the web search with the results of the web bug to see where a resource was hosted but not used.


 * Whatever text is used for the search, if it is shorter than the full text of the resource, there is a chance that it will be included in some other resource, thus giving false sightings. This may happen semi-accidentally, e.g. someone posting a query saying "I found the following in my course notes '......' What's that about?".
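
Two of the points above lend themselves to a short sketch: correlating search results with web bug records, and screening out false sightings. The Python below assumes that search hits and web bug referrers are already available as lists of URLs, and that the text of a candidate page has already been fetched; the function names and the sentence-matching heuristic are illustrative assumptions, not an established method.

```python
import re
from urllib.parse import urlsplit


def hosts(urls):
    """Return the set of host names found in a list of URLs."""
    found = set()
    for url in urls:
        host = urlsplit(url).hostname  # None when the URL has no host part
        if host:
            found.add(host)
    return found


def correlate(search_hits, web_bug_referrers):
    """Compare where a resource was found with where it was viewed."""
    found = hosts(search_hits)
    used = hosts(web_bug_referrers)
    return {
        "hosted_and_used": found & used,
        "hosted_not_used": found - used,   # indexed copy, no web bug requests
        "used_not_indexed": used - found,  # e.g. sites that block crawling
    }


def _normalise(text):
    return re.sub(r"\s+", " ", text).strip().lower()


def share_of_resource_present(page_text, resource_text):
    """Fraction of the resource's sentences that appear in the page.

    A low value suggests the page only quotes the key phrase (a false
    sighting) rather than hosting a copy of the resource.
    """
    page = _normalise(page_text)
    parts = re.split(r"(?<=[.!?])\s+", resource_text)
    sentences = [_normalise(s) for s in parts if s.strip()]
    if not sentences:
        return 0.0
    return sum(s in page for s in sentences) / len(sentences)
```

What counts as a "low" share is a judgment call that depends on how much modification the licence permits; a heavily adapted copy may legitimately retain only a small part of the original text.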