From CETISwiki
CETIS has been funded by JISC to do some additional technical work relevant to the UKOER programme. The work will cover three topics: deposit via RSS feeds, aggregation of OERs, and tracking & analysis of OER use.
There is a need for services hosting OERs to provide a mechanism for depositors to upload multiple resources with minimal human intervention per resource. One possible way to meet this requirement, already identified by some projects, is “feed deposit”. This approach is inspired by the way in which metadata and content are loaded onto user devices and applications in podcasting. In short, RSS and Atom feeds are capable, in principle, of delivering the metadata required for deposit into a repository, and can in addition either provide a pointer to the content or embed the content itself in the feed. There are a number of issues with this approach that would need to be overcome.
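As a rough illustration of the idea, the sketch below reads one item from an RSS 2.0 feed and pulls out the metadata and the pointer to the content. The feed, its URLs and identifiers, and the choice of Dublin Core elements for creator and rights are all invented for this example; they are assumptions, not a description of what any project or repository actually publishes. `parse_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed, invented purely for illustration.
SAMPLE_FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example OER feed</title>
    <link>http://example.org/oer</link>
    <description>Hypothetical feed of open educational resources</description>
    <item>
      <title>Introduction to Statistics</title>
      <link>http://example.org/oer/stats-101</link>
      <guid isPermaLink="false">oer:stats-101</guid>
      <description>Lecture notes for a first course in statistics.</description>
      <dc:creator>A. N. Author</dc:creator>
      <dc:rights>http://creativecommons.org/licenses/by/2.0/uk/</dc:rights>
      <enclosure url="http://example.org/files/stats-101.pdf"
                 length="102400" type="application/pdf"/>
    </item>
  </channel>
</rss>"""

# Namespace prefix for Dublin Core elements in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"


def parse_items(feed_xml):
    """Return one dict of deposit metadata per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    records = []
    for item in root.iter("item"):
        # The enclosure, if present, is the pointer to the content itself.
        enclosure = item.find("enclosure")
        records.append({
            "guid": item.findtext("guid"),
            "title": item.findtext("title"),
            "description": item.findtext("description"),
            "link": item.findtext("link"),
            "creator": item.findtext(DC + "creator"),
            "licence": item.findtext(DC + "rights"),
            "content_url": enclosure.get("url") if enclosure is not None else None,
        })
    return records


for record in parse_items(SAMPLE_FEED):
    print(record["guid"], record["content_url"])
```

Fields that a feed omits come back as `None`, so a repository would still need to decide what to do with incomplete records.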
If you are interested in this work please sign up to our JISCmail list. http://www.jiscmail.ac.uk/OER-FEED-DEPOSIT
In this work we will:
- Identify projects, initiatives, services, etc. that are engaged in relevant work (if that's you, please get in touch).
- Identify and validate the issues that would arise with respect to feed deposit, starting with those outlined in the Jorum paper linked to above.
- Identify current approaches used to address these issues, and identify where consensus may be readily achieved.
Projects, initiatives and services with an interest in this work
List of interested parties.
- Jenny Gray, The Open University
- Leading technical developer for OpenLearn [1]
- RSS sample [2]
- OAI sample [3]
Also worth considering what the following people do:
- iTunesU (would be nice for people who provide feeds for iTunesU if they didn't have to do much extra work)
- BBC podcasts (as an example of a podcast provider big enough to drive practice)
Issues Identified with regard to Feed Deposit
The following have been identified as potential issues. This list starts with issues identified in the Jorum paper; please add any others you may have.
- Item identification: How can a unique identifier be assigned to an item within a feed and how can this be linked with the corresponding resource in the repository? The feed reader component of the repository must know which items have been previously processed so that duplicate submissions are prevented.
- Item updates: How can the feed indicate that an item has been updated? Does there need to be an agreed string within the feed that a repository implementer could search for, e.g. “Jorum:update”? Could resource checksums or signatures provide a solution? Drawbacks – how can a term be agreed among potentially numerous repository implementers? Checksums and resource signatures require a single physical file; what if the item in the feed points to an entire website with numerous pages, links, images etc.?
- Item deletions: How can a feed indicate that an item has been deleted? Does simple omission from a feed indicate an item's deletion? Drawbacks – simple omission from a feed cannot be used to mark a deletion, because a feed should not contain the entire contents of a repository. It should only contain a finite number of updates.
- Missing items: If a feed contains a finite number of items, there is the possibility that a subsequent request of the feed will not contain every update since the last time the feed was requested. For example, a feed reader could read the BBC news headlines on a Monday and then again on Tuesday, but it is not guaranteed to receive every news headline that happened between Monday and Tuesday. The BBC news feed generator may be configured to send only the 50 most recent headlines; if there were 60 headlines between Monday and Tuesday, 10 would be “lost”. This would be a problem for using a feed for “harvesting” content from a remote repository, as the feed would never again contain the “lost” updates and so the corresponding resources would never appear in the subscribing repository.
- Polling period: How often should the feed document be requested from the feed generator, i.e. how frequently should the subscribing repository check for updates? Depending on the frequency of additions, updates and deletions at the remote service, polling may need to be very frequent in an attempt to limit “missing items” (see above).
- Feed formats: Which feed formats should be supported?
- Metadata formats: Which metadata formats should be supported within feeds, e.g. Dublin Core, LOM etc.? It is highly likely that different feed generators will supply different metadata formats.
- Repository Required Metadata Profile: It is common for a repository to implement a minimum metadata profile which each resource submitted to the repository should meet, e.g. it must have a title, description, keywords, licence etc. How can this be managed when harvesting is an automatic process, so that no user is submitting the item and has the opportunity to add metadata if necessary?
- Licensing Content: How can items within a feed indicate that they are bound by a specific licence? Should this use a standard such as RDFa, or non-standard strings such as “Licensed under Creative Commons 2.0 Attribution”? Does the repository support the licence specified in the metadata? More fundamentally, there is a presumption that deposit does not need to be followed up by human intervention to assign an appropriate licence to give authority for the subsequent release of the content for others to use.
- Links or resource download: The feed will contain links to resources. Should the repository simply store the link, or should the link be “followed” and the content downloaded and stored in the repository? This is of course first and foremost a policy question, but it has technical and operational implications.
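Several of the issues above (item identification, item updates, missing items and the required metadata profile) can be made more concrete with a small sketch of the bookkeeping a feed-reading repository might do. It is only one possible approach under stated assumptions: each item is a plain dictionary with a stable `guid`, updates are detected by hashing the item's metadata rather than the resource itself, and a possible gap is flagged when a poll shares no items with anything seen before. The names `fingerprint`, `poll` and `missing_required` and the field names are hypothetical, not taken from any existing repository software.

```python
import hashlib

# Fields assumed (for this sketch only) to matter when deciding whether an item changed.
TRACKED_FIELDS = ("title", "description", "link", "licence", "content_url")


def fingerprint(item):
    """Hash the tracked metadata fields of an item (not the resource it points to)."""
    parts = [item.get(field) or "" for field in TRACKED_FIELDS]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def poll(items, seen):
    """Classify the items of one feed request against earlier requests.

    `seen` maps item identifier -> fingerprint and is updated in place,
    so it must be stored between polls.
    """
    # If we have history but nothing in this feed overlaps with it, updates
    # may have scrolled out of the feed between polls.
    overlap = any(item["guid"] in seen for item in items)
    result = {
        "new": [],
        "updated": [],
        "unchanged": [],
        "possible_gap": bool(seen) and bool(items) and not overlap,
    }
    for item in items:
        guid = item["guid"]
        digest = fingerprint(item)
        if guid not in seen:
            result["new"].append(guid)
        elif seen[guid] != digest:
            result["updated"].append(guid)
        else:
            result["unchanged"].append(guid)
        seen[guid] = digest
    return result


def missing_required(item, required):
    """List the required metadata fields that are absent or empty in an item."""
    return [field for field in required if not item.get(field)]


seen = {}
poll([{"guid": "a", "title": "A"}, {"guid": "b", "title": "B"}], seen)
report = poll([{"guid": "a", "title": "A"},
               {"guid": "b", "title": "B revised"},
               {"guid": "c", "title": "C"}], seen)
print(report)
```

This sketch deliberately does nothing about deletions: as noted above, an item's absence from a feed cannot safely be read as a deletion. The gap check is also only a heuristic, since a partial overlap does not prove that nothing was missed.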
Current Approaches
This section will give examples of current practice as well as example RSS feeds.
Contact Us
Lisa J Rogers
Phil Barker