CETISFIS Sept10 Conformance Notes

''Notes from the discussion group on conformance at the September 24th CETIS Meeting, "The Future of Interoperability Standards - Technical Approaches". They are based on hand-written notes by Adam Cooper that were taken at the meeting and slightly amplified on typing-up. See the page history for change logs.''

Why Conformance
We started by rehearsing some of the motivation for conformance, specifically testing with a view to arriving at assertions of conformity, i.e. product assurance. We also noted that not all testing is for the purposes of assurance. In summary, we identified three key benefits:
 * Conformance testing reduces overall cost to suppliers and paying for testing can represent an overall saving. So long as customer expectations are set appropriately, the testing does not have to be 100% reliable.
 * Following on from the above, and taking account of the positive attitude of consumer groups to greater reliability, testing represents a viable business model for standards-related organisations, subject to cost and market-size considerations. Testing/certification is already a significant part of the business of national standards bodies (e.g. BSI in the UK), although not in ICT for learning, education and training (LET). Vendors often find a certification fee easier to justify than a membership fee.
 * Conformance testing helps to build a market.

The Current State of Affairs
The motivation for conformance testing is not something we have only just discovered as a community. It has been recognised since the earliest stages of applying ICT to LET. So far, however, all attempts have been in some way troubled.

Even for a relatively simple specification that many people essentially understand, it is problematical to deliver reliable conformance testing. For example, IMS Content Packaging, and the more market-oriented IMS Common Cartridge profile of it, are still the subject of debate over interpretation of the finer points when it comes to writing down the Schematron rules for the Common Cartridge testing engine.
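To make the "finer points" concrete: one classic Content Packaging rule is that every item's identifierref must point to a declared resource. The sketch below hand-rolls that single check in Python, purely to illustrate the kind of rule a Schematron assertion encodes; it is not taken from, and is far simpler than, the actual Common Cartridge testing engine, and it deliberately ignores XML namespaces.

```python
# Illustrative sketch only: checks one Content Packaging rule (every
# <item identifierref="..."> must point to a declared
# <resource identifier="...">). Real manifests are namespaced and the
# real testing engine checks far more than this.
import xml.etree.ElementTree as ET

def check_identifierrefs(manifest_xml: str) -> list[str]:
    """Return the dangling identifierref values (empty list = pass)."""
    root = ET.fromstring(manifest_xml)
    # Collect declared resource identifiers (endswith() tolerates
    # namespace prefixes in a crude way, good enough for the sketch).
    resources = {el.get("identifier")
                 for el in root.iter() if el.tag.endswith("resource")}
    dangling = []
    for el in root.iter():
        if el.tag.endswith("item"):
            ref = el.get("identifierref")
            if ref is not None and ref not in resources:
                dangling.append(ref)
    return dangling

manifest = """<manifest>
  <organizations><organization>
    <item identifierref="res1"/>
    <item identifierref="res2"/>
  </organization></organizations>
  <resources><resource identifier="res1"/></resources>
</manifest>"""
print(check_identifierrefs(manifest))  # res2 is never declared
```

Even for a rule this small, edge cases (items without any identifierref, namespace handling, sub-manifests) are exactly where differences of interpretation arise.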

For more complex specifications, especially those with behavioural requirements, the problem of full-coverage testing is essentially intractable: it is not feasible in an acceptable time.

It was observed that different communities in the UK appeared to have quite different attitudes to conformance. BECTA's work with content standards led to confrontational stances from some suppliers. In contrast, the ePortfolio community has been more inclined to collaborative approaches. We speculated about the reasons but had no real evidence.

For an established market, politics and rivalry pose a challenge to standards development, and bringing conformance testing into the frame will inevitably increase this as the stakes are raised. This is a significant problem for relatively small markets (e.g. where national differences create niches).

Conformance, Testing and Community During Specification Development
In this section, we were talking about the creation of conformance statements. These are useful to implementers irrespective of a conformance testing regime, as they are clear indications of what their software must do (and also must not, should and may do).

We agreed that an agile and incremental approach to specification development is desirable which includes iteration over:
 * the model (data, messaging)
 * conformance statements
 * application development
 * development and use of a test engine/service (easily available for self-test on web if possible)
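The "test engine/service for self-test" in the iteration above can be pictured as conformance statements turned into executable checks. The sketch below invents a minimal shape for this: the statement identifiers (CS-01, CS-02), the checks and the artifact format are all hypothetical, chosen only to show how statements, checks and a self-test report could iterate together.

```python
# Hypothetical sketch of a self-test engine: each conformance statement
# becomes a named check function returning an error message or None.
# Identifiers, checks and the artifact dict are invented for illustration.
from typing import Callable, Optional

Check = Callable[[dict], Optional[str]]

def must_have_title(artifact: dict) -> Optional[str]:
    return None if artifact.get("title") else "missing required title"

def version_is_supported(artifact: dict) -> Optional[str]:
    return None if artifact.get("version") in {"1.0", "1.1"} else \
        "unsupported version"

# The registry grows as new conformance statements are written,
# which is what makes the spec/checks/implementation loop iterative.
CHECKS: dict = {
    "CS-01 must have title": must_have_title,
    "CS-02 version supported": version_is_supported,
}

def self_test(artifact: dict) -> dict:
    """Map each conformance statement to its failure message (None = pass)."""
    return {name: check(artifact) for name, check in CHECKS.items()}

report = self_test({"title": "Demo", "version": "2.0"})
for name, err in report.items():
    print(name, "PASS" if err is None else f"FAIL ({err})")
```

Exposing something like this behind a web form is the "easily available for self-test on web" step; the point is that the check registry and the written conformance statements evolve in lock-step.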

There is a parallel here with the use of "agile methodologies" and test-driven development; are there lessons to learn?

We noted that "plugfests", "codebashes" etc focussed on finding differences of interpretation and possible ambiguities are of greater value in the long run than if they have a focus on demonstrating interoperability (i.e. what works).

Since a small group will inevitably find fewer differences of interpretation than a larger group, and since fresh eyes will discover where tacit assumptions have been made by a core group of specification writers, there is a potential benefit in involving a wider community in the later stages of specification development. That is, once the specification is essentially defined in scope and approach, lowering the financial and time/effort bar to participation in a community focussed on finding "bugs" could be beneficial.

The issue of how best to go about writing and organising conformance statements (and, by extension, how application profiles are created) seems unclear, but is also relevant to questions of implementability (see the other FIS session). Formal languages for conformance in a specification are problematical, both because they require elaborate effort, hence cost, and because they are not accessible to implementers. They are not a practical option for LET.

Idea: Draw together a comparison of different approaches to conformance statements from outside the LET domain and assess their relevance/value.

Conformance Testing as a Moving Target?
In an ideal world we would like a specification/standard to be delivered along with a reliable test, but this has been found to be problematical in practice in LET. Actual use of the specification throws up ambiguities and omissions as implementers progressively move from the core/easy parts to more elaborate implementations, and as more people try to apply the specification to their own ideas.

Consequently, we need to set expectations in the marketplace that offers of conformance testing are not absolute but will be continually improved. The starting level of test reliability still needs to be good enough to offer acceptable levels of assurance.

Idea: Establish a conformance regime that harnesses a community of implementers prepared to collaborate on a shared interpretation of issues that arise and to make these available swiftly in a pre-certification self-test. To be successful, probably needs a leader to bang heads together if necessary.

Humans and Conformance
Most of our discussion was around automated testing, but we realised that this was possibly a false assumption. We didn't explore this much, but it was pointed out that for some things human testing is essential. For example, content for interactive whiteboards must actually be usable; the interactions must be sensible and accessible.

Reference Implementation
The idea was raised that a reference implementation might be a way of circumventing some of the issues with conformance testing, spec ambiguities/interpretation and the difficulty of writing good conformance statements.

During the meeting it was pointed out that "reference implementation" is used in various ways. E.g. it might be one of:
 * independent implementations of a spec that are found to interoperate, thus proving some level of assurance that a common interpretation is possible
 * a single implementation that is taken as the de-facto manifestation of the spec/standard (e.g. the Amazon EC2 management API as implemented effectively takes precedence over what any document says)
 * an implementation that attempts to rigorously implement the specification as written. Anyone who can interoperate with this, or who uses it as library code, may claim "conformance".

The latter is clearly a very strong statement. Such an implementation would, to be realistic for an open standard, have to be Open Source (so the source code can be inspected) with an impartial and trustworthy committer and a broad body of submitters who are committed to fixing any errors and omissions. At present it does not seem likely that this is a realistic proposal.

Are we Barking up the Wrong Tree?
Dan Rehak commented that in civil engineering, the focus is frequently on conformance of the methods used to design and construct a structure; the structure itself is not (generally?) tested. Attempts to find ways of proving software implementations correct have so far been elusive, but maybe we could think a little more about ways to combine better-written specifications (better conformance statements etc.) with human-based evaluation of the software.

(Post-group speculation, Adam) How long would it take a pedantic person, with a deep knowledge of a specification and programming language X and a reputation based on reliable judgment, to conduct a code review of an implementation in X? How does this compare with the investment required for a machine-based test of equivalent reliability?

Summary of Ideas to Take Forward
These are ideas from the discussions that we should consider doing something about.

Idea (relevant to CETIS and the LET standardisation community generally): Draw together a comparison of different approaches to conformance statements from outside the LET domain and assess their relevance/value.

Idea (relevant to spec/std creating organisations): Establish a conformance regime that harnesses a community of implementers prepared to collaborate on a shared interpretation of issues that arise and to make these available swiftly in a pre-certification self-test.