- no more 8-, 40-, and 200-character limitations
- limitations on test codes disappear (this, for example, prevented us from using a LOINC code for LBTESTCD, as LOINC codes start with a digit, like 1234-5)
- supplemental qualifiers can remain in their parent domain/dataset. All that needs to be done is to flag them as such in the define.xml file (e.g. using Role="SUPPLEMENTAL QUALIFIER")
- no more splitting of information over different fields (the COVAL1, COVAL2, ... disaster) or even over different datasets (the "banning" of values longer than 200 characters to supplemental qualifier datasets)
- a perfect fit with define.xml 1.0 and 2.0: validating SDTM/SEND/ADaM datasets against the define.xml now really becomes a piece of cake. Both the metadata (define.xml) and the data (the new format) use the same underlying format - both are extended ODM.
- we can now really achieve end-to-end, as we have a single format to transport information from study design to submission. Of course the contents will differ, but at least we no longer need to switch between formats (and technologies) along the way
- XML is the format of choice for exchanging information in the modern world. This also means that an enormous number of software programs and libraries are available for working with XML
- real vendor neutrality: as ODM is an open standard (SAS-XPT was only semi-open and very hard to implement in software), anyone with some basic XML knowledge can now develop great software with great features that works with SDTM/SEND/ADaM datasets. In the more than 20 years of SAS-XPT for SDTM, I haven't seen a single successful third-party software program using it.
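As a small illustration of how approachable the XML route is: the sketch below reads a record from an ODM-style ClinicalData fragment using nothing but Python's standard library. The element and attribute names (ItemGroupData, ItemData, ItemOID, Value) follow ODM 1.3 conventions; the exact schema of the new dataset format may differ in detail, so treat this as a sketch, not as the format specification.

```python
# Minimal sketch: parse an ODM 1.3-style ClinicalData fragment with the
# Python standard library only. The OIDs below are invented examples.
import xml.etree.ElementTree as ET

xml_text = """
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData>
    <ItemGroupData ItemGroupOID="IG.DM">
      <ItemData ItemOID="IT.USUBJID" Value="CDISC01.001"/>
      <ItemData ItemOID="IT.AGE" Value="63"/>
    </ItemGroupData>
  </ClinicalData>
</ODM>
"""

ns = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}
root = ET.fromstring(xml_text)

rows = []
for record in root.iterfind(".//odm:ItemGroupData", ns):
    # Each ItemGroupData becomes one "row": a dict of OID -> value
    rows.append({item.get("ItemOID"): item.get("Value")
                 for item in record.iterfind("odm:ItemData", ns)})
print(rows)
```

With SAS-XPT, the equivalent would require implementing a binary format from a decades-old specification; here a dozen lines of standard-library code suffice.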
- tools to transform existing SAS-XPT datasets into the new format
- tools to transform files in the new format to the old SAS-XPT format (but who would like to do so?)
- tools or scripts for loading the datasets in popular statistical software packages
- a viewer for inspecting datasets in the new format
The latter was just a viewer for SAS-XPT files; it was not "SDTM-savvy" and did not even understand what SDTM is about or how it works.
The picture below shows a few of the first features that have been implemented so far:
- simple SDTM/SEND/ADaM validation such as uniqueness of the USUBJID in the DM dataset
- a check whether each subject appearing in a dataset is really present in the DM dataset
- validation of whether all required/expected fields really have a value
- validation of dates: is the date a real existing date (2013-03-32 is not), does RFENDTC really come after RFSTDTC?
- calculation of age from BRTHDTC (when present) and RFSTDTC and checking against the value given in AGE
- display of the "date of first study medication exposure" and "date of last study medication exposure", as retrieved from the EX dataset, in the DM dataset. The latter means that we can now remove RFXSTDTC and RFXENDTC from the DM domain - they should never have been there, as they are just copied from EX
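The date and age checks in the list above are straightforward to express in code. The sketch below uses only Python's standard library and assumes complete ISO 8601 dates; real SDTM --DTC values may be partial or include a time component, which the actual viewer would have to handle as well.

```python
# Sketch of two of the checks above: date validity / ordering, and
# age derived from BRTHDTC and RFSTDTC. Variable names mirror the SDTM
# variables; the viewer's real logic may differ in detail.
from datetime import date

def parse_iso_date(value):
    """Return a date, or None if the value is not a real calendar date."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None

assert parse_iso_date("2013-03-32") is None  # 32 March does not exist

rfstdtc = parse_iso_date("2013-03-01")
rfendtc = parse_iso_date("2013-04-15")
assert rfendtc >= rfstdtc  # RFENDTC must not precede RFSTDTC

def age_at(brthdtc, refdtc):
    """Completed years between the birth date and the reference date."""
    years = refdtc.year - brthdtc.year
    if (refdtc.month, refdtc.day) < (brthdtc.month, brthdtc.day):
        years -= 1  # birthday not yet reached in the reference year
    return years

brthdtc = parse_iso_date("1950-06-20")
derived_age = age_at(brthdtc, rfstdtc)
print(derived_age)  # 62 - compare this against the value in AGE
```

The derived age can then simply be compared with the AGE value in the DM record, flagging a mismatch.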
The columns containing these are colored somewhat differently (that information is retrieved from the define.xml). For ease of use, the USUBJID column has been shifted.
Other features have been implemented that are better demonstrated in a movie (soon to come, stay tuned): one-click "jumping" from a record to the corresponding record in the DM dataset (and back), one-click jumping from a comment record in CO to its parent record in another dataset, and few-click jumping from a RELREC record to its parent records.
Of course the software also allows sorting and filtering. For example, one can first load the DM dataset, filter for all subjects above a certain age, and then load other datasets for those subjects only. This feature will probably make reviewers' lives much easier.
Another small feature I implemented is highlighting of values (--STRESN) that fall outside the reference range (defined by --STNRLO and --STNRHI) in all findings datasets.
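The out-of-range highlight comes down to a simple comparison per record. This sketch is my own illustration of the rule, not the viewer's actual code; when a result or either range limit is missing, nothing is flagged:

```python
# Flag a standardized numeric result (--STRESN) that lies outside the
# reference range [--STNRLO, --STNRHI]. Missing values mean "no flag".
def out_of_range(stresn, stnrlo, stnrhi):
    if stresn is None or stnrlo is None or stnrhi is None:
        return False
    return stresn < stnrlo or stresn > stnrhi

print(out_of_range(7.2, 3.5, 5.6))  # True  -> highlight this cell
print(out_of_range(4.8, 3.5, 5.6))  # False -> render normally
```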
Now you will probably ask what this viewer software will cost. The answer is: nothing. It will become available for free as open source, with a license similar to that of OpenCDISC. So reviewers at the FDA will be able to use it for free from day 1, and users at sponsor companies will have the same tool available as the one the FDA reviewers are using. Even more importantly: as the tool will be open source, everyone can extend it and add great new features, for example for analysis, visualization, etc.
Stay tuned for more information and the public release announcement!