Tuesday, December 27, 2011

Overloading in define.xml - should it be allowed or forbidden?

Did you like the example from the previous post where metadata information from the study design is copied into (or imported into) the define.xml?
Then realize that some people currently working on define.xml 2.0 want to forbid this.
Their argument is that define.xml "is another animal".
They state that it should be forbidden to use any elements and attributes in define.xml files that are not explicitly mentioned in the specification, even though define.xml is an extension of the CDISC ODM standard.
So they want to forbid the use of "MeasurementUnitRef", "MeasurementUnitDef", "Question", "Alias", and everything else that is not explicitly mentioned in the define.xml specification.
In our interpretation of "extension" however (and this comes from the ODM specification itself), an extension means that one adds stuff, not that one removes stuff.
The ODM specification states:

Requirements for Vendor extensions to the ODM model are:
  1. The vendor must supply a XML Schema fully describing their extended ODM format.
  2. Extended ODM files should reference the proper extension Schema.
  3. The extension may add new XML elements and attributes, but may not render any standard ODM elements or attributes obsolete. Vendor extensions cannot be used for information that is normally expressed using other ODM elements.
  4. All new element and attribute names must use distinct XML namespaces to insure that there are no naming conflicts with other vendor extensions.
  5. Removing all vendor extensions from an extended ODM file must result in a meaningful and accurate standard ODM file.
  6. Vendors should be able to produce ODM files free of any vendor extensions upon request.
Applications that use extended ODM files must also accept standard ODM files.

Requirements 1 and 2 are fulfilled by the define.xml standard. Requirement 3, in my opinion, is not: the "Name" attribute in ODM is meant for a "free text short description", whereas in define.xml it is abused (?) for the SDTM variable name (enumerated). Instead, "def:Label" is used for what should essentially go into "Name". But that's a minor issue which I hope will be repaired in the next version of define.xml.
Requirement 4 is fulfilled: the additional elements and attributes are in a separate namespace.
The next requirement (requirement 5) is more problematic. Does a define.xml file without the additional (define-specific) attributes and elements still make sense? Partially, I think.
But if the define.xml standard forbids us to use regular ODM elements such as "MeasurementUnit" (reference and definition) or "Question", isn't that data loss, and doesn't this violate requirement 5?
Requirement 6 is not a problem: it is easy to remove extension elements and attributes, e.g. using a simple XSLT stylesheet.
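
To illustrate how little is needed for requirement 6, here is a minimal sketch in Python (instead of XSLT) that removes every element and attribute that is not in the ODM namespace. The namespace URI passed as default, the function names, and the small sample document are assumptions made for this illustration only; the sample is not a valid or complete define.xml file.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the ODM 1.3 namespace; pass another URI if it does not.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"
XML_NS = "http://www.w3.org/XML/1998/namespace"  # xml:lang and friends


def namespace_of(name):
    """Return the namespace URI of a '{uri}local' name, or None if unqualified."""
    return name[1:].split("}", 1)[0] if name.startswith("{") else None


def strip_extensions(elem, odm_ns=ODM_NS):
    """Remove, in place, all elements and attributes outside the ODM namespace."""
    # Unqualified attributes belong to their (ODM) element and are kept.
    for attr in list(elem.attrib):
        ns = namespace_of(attr)
        if ns is not None and ns not in (odm_ns, XML_NS):
            del elem.attrib[attr]
    # Extension elements are dropped together with everything below them.
    for child in list(elem):
        if namespace_of(child.tag) != odm_ns:
            elem.remove(child)
        else:
            strip_extensions(child, odm_ns)
    return elem


# Hypothetical, heavily shortened sample for illustration only.
SAMPLE = """<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     xmlns:def="http://www.cdisc.org/ns/def/v2.0">
  <ItemDef OID="IT.VSORRES" Name="VSORRES" DataType="float" def:Label="Result">
    <Question><TranslatedText xml:lang="en">Result</TranslatedText></Question>
    <MeasurementUnitRef MeasurementUnitOID="MU.KG"/>
    <def:ValueListRef ValueListOID="VL.VSORRES"/>
  </ItemDef>
</ODM>"""

root = strip_extensions(ET.fromstring(SAMPLE))
item_def = root[0]
# Local names of what is left under ItemDef after stripping.
remaining = [child.tag.split("}", 1)[1] for child in item_def]
print(remaining)
```

Note that the regular ODM elements "Question" and "MeasurementUnitRef" survive the stripping, because they are standard ODM and not extensions. Whether what is left is still "meaningful and accurate" is exactly the question of requirement 5.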

What about the requirement "Applications that use extended ODM files must also accept standard ODM files"?
That is highly problematic! This would require applications such as those used by the FDA to accept standard ODM files, such as define.xml files that contain the snippet from the previous post.
But some of the define.xml team would like to see that elements such as "MeasurementUnit" and "Question" are marked by receiving applications as "non-standard"! Isn't that in full contradiction and conflict with the above rule?

What I want to say is that if one decides to define define.xml as an ODM extension, then the result should obey the rules for ODM extensions (as given above).
If one wants to define define.xml as a "restriction" of ODM, one should not use the extension mechanism, but write a completely new XML Schema, not based on ODM, as ODM does not foresee a mechanism for restrictions.

The next post will probably be about using ODM-XML for submitting SDTM, SEND and ADaM data to the FDA.
