Talk:Assessment item banks: an academic perspective

Thank you to everyone who offered comments on this paper during the review period. You are still welcome to comment on these papers or on the issues they (or the comments themselves) raise, but we will no longer be raising them with the author or further revising the reports. Please sign your comments (you can do this automatically by appending --~~~~ or by clicking on the signature button above the edit box).

If you would like to discuss any of the issues raised in these papers, you may also post to either or both of the JISC CETIS Assessment and Metadata and Digital Repository JISCMail lists. --Rowin 13:12, 29 March 2007 (BST) and --Philb 09:54, 4 June 2007 (BST)

Negative marking without negative marks
Questionmark Perception does have a way to stop the final mark for a question becoming negative when you use negative marking. You can define a floor for a question in the question editor; this is typically set to zero, which means that the question score cannot go below zero. This is usable in all question types. John Kleeman, Product Manager, Questionmark.
 * Thank you for this clarification; we'll revise the paper to make this clear. --Rowin 13:12, 29 March 2007 (BST)

Title of the paper
the title doesn't seem to match the content :)
 * the easiest change would be to "an academic's perspective" - but I'm not sure that that captures what this paper is about. --Johnr 17:05, 26 March 2007 (BST)
 * The paper is from the perspective of an academic... we're not very happy with the title either and are open to further suggestions. --Rowin 14:01, 29 March 2007 (BST)

Significance of interoperability
the paper presupposes that available technology and standards should be used simply because they exist. In particular, the paper suggests that interoperability is the most significant consideration when setting up an item bank. This may be true for software engineers working at a distance from real assessment environments, but it is not the case for those grappling with practical set-up problems, whether as a single institution or as a group of institutions working together. --User:Blogs 11.01, 27 March 2007 (BST)
 * We strongly disagree with this comment. The paper does not say that interoperability standards are crucial for setting up an item bank, but for inter-institution question sharing, in which interoperability is indeed a significant issue. This comment seems to be relevant only to the single perspective of closed question banks, whereas the focus of this paper is on assessment for learning, and on storing and sharing items to support that. It should also be noted that this paper is written from the point of view of one 'grappling with practical set up problems' who is an academic with considerable experience of developing and using such resources with real learners. --Rowin 14:01, 29 March 2007 (BST)

I stand by my comments and would point you to the sentence "Interoperability is the most important single factor in the implementation and use of an inter-institutional question sharing scheme" to back up my initial point. I too come from a practical set-up perspective, with experience of delivering items as part of exams that affect REAL learners (I am not sure what you mean here, but my students are definitely alive also). I would say without hesitation that whilst interoperability is a consideration, it does not affect most people grappling with the practical set-up of an item bank. The questions I get from others interested in setting up item banks are about the process of developing high quality items, not software interoperability. --User:Blogs 15.46, 2 April 2007 (BST)

Summative assessment
The paper addresses some of the differences between formative and summative tests, but perhaps does not stress enough that the real benefit institutions are looking for from new technology in the context of summative examining is support for robust item and assessment handling, specifically transparent and defensible quality assurance processes. --User:Blogs 11.01, 27 March 2007 (BST)
 * Quality Assurance is out of scope for this work, as are related issues such as security and authentication. Although they are obviously of considerable importance for the integrity and delivery of assessment content, particularly in the context of summative assessment, they relate to the management systems and processes around the use of an item bank and not to the item bank itself. --Rowin 14:01, 29 March 2007 (BST)

Quality assurance metadata is central to item banking, and whilst some workflows are better handled outside of an item banking tool, I disagree over QA information, which is item specific and often extremely detailed. To separate it from the item makes for very poor audit trailing. Moreover, I am concerned by the comments above, which seem to indicate that discussing what item banking means is over and done with - surely it is important to debate this and to have the opportunity to update our definitions. --User:Blogs 15.58, 2 April 2007 (BST)

QTI
QTI is a technology-driven specification that does not hold water in high stakes assessment environments, where item quality and assessment best practice will always be at the top of the agenda. Particularly when many institutions are still running paper-based examinations (and with cohort sizes beyond PC cluster maximums), it becomes even less of an issue. --User:Blogs 11.01, 27 March 2007 (BST)
 * Again, we disagree very strongly with this comment. Contributors to the specification include UCLES/Cambridge Assessment, ETS, QuestionMark and OUNL, all organisations which are particularly prominent in the area of summative assessment.  Quality assurance and assessment best practice are orthogonal to QTI.  It should be noted that even where assessments are delivered on paper, some organisations are looking to QTI as a future-proof storage format for those items, particularly in high stakes contexts where preservation of assessment material is a major issue.  QTI is not 'technology driven', just technology enabled.  --Rowin 14:01, 29 March 2007 (BST)
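For readers following this exchange who have not seen the specification itself, a minimal sketch of what a single QTI 2.1 multiple-choice item looks like as stored XML may help ground the discussion of QTI as a storage format. The identifiers, prompt, and choices below are invented placeholders, not content from the paper:

```xml
<!-- Hypothetical example item: all identifiers and text are placeholders -->
<assessmentItem xmlns="http://www.imsglobal.org/xsd/imsqti_v2p1"
                identifier="exampleItem001"
                title="Example multiple-choice item"
                adaptive="false" timeDependent="false">
  <!-- Declares the correct response so any conformant engine can mark it -->
  <responseDeclaration identifier="RESPONSE" cardinality="single"
                       baseType="identifier">
    <correctResponse>
      <value>choiceA</value>
    </correctResponse>
  </responseDeclaration>
  <outcomeDeclaration identifier="SCORE" cardinality="single" baseType="float"/>
  <itemBody>
    <choiceInteraction responseIdentifier="RESPONSE" shuffle="true" maxChoices="1">
      <prompt>Which of the following is the correct answer? (placeholder prompt)</prompt>
      <simpleChoice identifier="choiceA">A placeholder correct answer</simpleChoice>
      <simpleChoice identifier="choiceB">A placeholder distracter</simpleChoice>
      <simpleChoice identifier="choiceC">Another placeholder distracter</simpleChoice>
    </choiceInteraction>
  </itemBody>
  <!-- Standard response-processing template: award the score if the response matches -->
  <responseProcessing
      template="http://www.imsglobal.org/question/qti_v2p1/rptemplates/match_correct"/>
</assessmentItem>
```

Because the marking rule and correct answer travel inside the item itself, an item exported from one QTI-conformant system can in principle be imported and delivered by another; this is the sense in which items stored this way are 'future-proofed' even where current delivery is on paper.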

By way of response, I would say that for QTI to become more widely accepted it should follow an evidence-based pattern of development, where item formats that are statistically proven to be effective are included on that basis. The boundaries of QTI seem to have expanded on the basis of new possibilities rather than best-practice options. The sector would benefit vastly from an evidence-driven QTI specification, which would then drive quality enhancement in the right direction. We are in danger of creating a whole new generation of true/false item banks. --User:Blogs 15.54, 2 April 2007 (BST)

Glossary
I think, in particular, that this paper needs a glossary for the different assessment types and how they relate to each other. (I know these terms may be obvious, but to a wider community they are not) --Johnr 17:05, 26 March 2007 (BST)
 * This is an excellent point - we'll add some initial definitions to the list of suggested terms below and welcome comments on these. --Rowin 14:01, 29 March 2007 (BST)

Suggested terms:
 * distracter
 * adaptive assessment
 * personalisation (with respect to assessment)
 * high stakes assessment
 * formative assessment
 * summative assessment
 * diagnostic assessment
 * examination (vs assessment)
 * cool URI

Interoperability
paragraph 2; sentences 1 and 2 (The quality ... summative use.) seem to be completely at odds with sentence 3 (It is important ... contexts), and no resolution or comparison is offered. Sentence 4 (It is also ...) and following should perhaps be a separate paragraph. --Johnr 17:05, 26 March 2007 (BST)
 * We were a bit puzzled by what was meant by this. Would revising the first sentence to read 'the quality of such questions CONTRIBUTED from different authors...' help clarify what was meant?  Sentence 4 onwards should not be separated as the paragraph is about the need for a subject expert as editor.  --Rowin 14:01, 29 March 2007 (BST)

I'll try to explain my confusion

"The quality of such questions from different authors is, however, unlikely to be consistent. For example, authors are unlikely to include high quality formative feedback for questions that they are preparing for summative use. It is important, however, that questions in an item bank are consistent in their provision of feedback and marks, so that they can be used effectively in different ways and contexts. It is also important that the metadata of each record is accurate and complete, so that searching produces consistent results. This means that questions to be shared will need editing and cataloguing by a subject expert, who can find or supply any missing data or question information."

Is the quality in the first sentence the quality of the question (i.e. with respect to (wrt) the topic being asked about) or the quality of the item (as an interoperable packaged question)? The example seems to suggest that it is wrt the interoperability of the item, as it addresses differences in the completeness of an item based on its original/intended use.

The third sentence then states that it is important that questions consistently provide feedback and marks.

Is this third sentence talking about the variability in item quality or in question quality?

Presumably item quality (as feedback and marks are a function of an item) - so, is it saying that:
 * in the midst of the variability between items (stated in sentence 1) any given item needs to provide consistent outputs (feedback and marks) in different contexts? (this thing will always perform in the same manner)
 * is it trying to say that, however variable items are, they need to be consistent with other items in the areas of feedback and marks? (in this area this thing needs to perform in the same way as all the other things) - this would seem to contradict the initial premise that questions/items will be inconsistent.

From the example, it would also imply that we should always separate out items for formative use from those for summative use as they will intrinsically have different levels of completeness wrt feedback?

I'd suggested separating out the descriptive metadata bit as it was referring to the process of creating descriptions of a question, and the use of those descriptions for retrieving questions/items, rather than to the functionality of an item/question (which was how I took the other sentences).

I hope that is clearer; I'm just not sure what sort of consistency is being suggested, but I may just be missing something :) --Johnr 15:14, 29 March 2007 (BST)