The PSI website has moved to a new address: http://www.psidev.info/

This site is no longer being updated




Proteomics Standards Initiative

Molecular Interaction XML Format Documentation

Version 2.5

Released December 5, 2005

Last maintenance update: June 1, 2006, to version 2.5.3

 

Table of Contents

  1. Introduction
  2. Purpose of the PSI MI XML format
  3. Purpose of this document
  4. Directory structure
  5. Release schedule
  6. Changes from PSI MI 1.0 to 2.5
  7. Maintenance releases
  8. Detailed Documentation
  9. Use of external controlled vocabularies
  10. List of planned features
  11. How to comment
  12. Available data
  13. Tools
  14. Data submission
  15. Further information and relevant links

Introduction

The Proteomics Standards Initiative (PSI) aims to define community standards for data representation in proteomics to facilitate data comparison, exchange and verification. For detailed information on all PSI activities, please see http://psidev.sf.net.

This document describes the molecular interaction data exchange format. PSI is following a leveled approach to building this specification. This document describes level 2.5. For documentation of the previous level 1.0, please see http://psidev.sourceforge.net/mi/xml/doc/user/.

Level 2.0 beta was never officially released. This version, 2.5, is the released successor of 1.0.

PSI MI was designed by a consortium of molecular interaction data providers from both academia and industry, including BIND, DIP, IntAct, MINT, MIPS, GlaxoSmithKline, CellZome, Hybrigenics, the Universities of Bielefeld, Bordeaux, and Cambridge, and others.

Purpose of the PSI MI XML format

The PSI MI format is a data exchange format for molecular interactions. It is not a proposed database structure.

Purpose of this document

The purpose of this document is to describe the general structure of the PSI MI XML specification in a more user-friendly manner than the specification does itself. For the detailed and most up-to-date description please see the auto-generated documentation. This documentation will also provide additional information, e.g. sample data.
The XML schema is located at http://psidev.sourceforge.net/mi/rel25/src/MIF25.xsd

Directory structure

This document is in the root directory of the PSI MI 2.5 release. The subdirectories are:

doc/      Auto-generated documentation of the PSI MI XML schema
src/      Source code for schema and related software
data/     Controlled vocabularies
tools/    Data management tools

Release schedule

  • Level 2.5 was released in early December 2005. It is now supported by most PSI partners.
  • Level 1.0 support is planned to continue until at least summer 2006.

Changes from PSI MI 1.0 to 2.5

Changes in the PSI MI XML format and controlled vocabularies from version 1.0 to 2.5 are documented in changes1to25.html.

 

Maintenance releases

  • 2.5.3:
    Minor updates as a result of the PSI spring meeting in San Francisco, April 2006:
    • updated entrySet@minorVersion to 3
    • bioSourceType@taxId now mandatory. This was inadvertently made non-mandatory with 2.5.2.
    • featureType@id now mandatory. This was inadvertently made non-mandatory with 2.5.2.
    • Optional attribute parameter/uncertainty added.
    • Added participant/parameterList and participant/attributeList to allow more complex modelling of participants.
    • Deleted XML constraints on entry level. They were not working due to syntax errors, and few XML validators can check them. This validation level will be performed by the PSI XML validator in the future.
  • 2.5.2:
    There were some inconsistencies in the naming of complex types in 2.5.1. These have been fixed. This has no impact on the XML data files. The only impact for users is that working with code generators becomes easier. Concrete changes:
    • renamed complex type interactorType to interactorElementType
    • renamed complex type interactionType to interactionElementType
    • renamed complex type featureType to featureElementType
    • featureType@id has been moved into the complex type featureElementType and typed as xs:int
    • bioSourceType@ncbiTaxId has been moved into the complex type bioSourceType and typed as xs:int
    • updated entrySet@minorVersion to 2
  • 2.5.1:
    At the PSI meeting in Geneva, September 2005, it was discussed that participant/experimentalFormList/experimentalForm should allow a position to be assigned, e.g. to describe an N-terminal protein modification in an experiment. It was decided to implement this using the existing featureType. The controlled vocabulary was updated accordingly, but the corresponding change in the XML schema was not implemented. This required the maintenance release 2.5.1, with the following changes:
    • Added entrySet@minorVersion, fixed to 1
    • Deleted participant/experimentalFormList/experimentalForm
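
Because each maintenance release updates entrySet@minorVersion, a consumer can read the release of a file from its root element. The following is a minimal sketch, not part of the specification: the attribute names level and version, the namespace URI, and the sample document are assumptions made for illustration; only minorVersion is named in the release notes above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed document. The namespace URI and the
# "level"/"version" attributes are assumptions; "minorVersion" is the
# attribute described in the maintenance release notes.
SAMPLE = '<entrySet xmlns="net:sf:psidev:mi" level="2" version="5" minorVersion="3"/>'


def release_of(xml_text):
    """Return (level, version, minorVersion) of the root element as ints.

    A missing attribute is reported as None.
    """
    root = ET.fromstring(xml_text)

    def as_int(name):
        value = root.get(name)
        return int(value) if value is not None else None

    return as_int("level"), as_int("version"), as_int("minorVersion")


print(release_of(SAMPLE))
```

A file written before release 2.5.1 would have no minorVersion attribute, which this sketch reports as None in the third position.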

Detailed Documentation

See http://psidev.sourceforge.net/mi/rel25/doc/

Use of external controlled vocabularies

Where possible, external controlled vocabularies are referenced from PSI MI. External controlled vocabularies are used in two forms:

  • Open controlled vocabularies: We think that no existing controlled vocabulary provides all necessary terms for the given attribute in the PSI MI format. In this case, it is up to the data provider to choose a controlled vocabulary, or to provide a free text string if no appropriate controlled vocabulary exists.
  • Closed controlled vocabularies: We think that there is a controlled vocabulary which appropriately covers all necessary terms for the given attribute. In this case, only terms from the defined vocabulary should be used.
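
The difference between the two forms can be sketched as a simple acceptance check. This is an illustration only: the function name check_term and the two-term vocabulary are hypothetical, and a real check would load the full vocabulary from its source.

```python
def check_term(value, vocabulary, closed):
    """Decide whether `value` is acceptable for a vocabulary-backed attribute.

    closed vocabulary: only terms from the defined vocabulary are accepted.
    open vocabulary:   any non-empty string is accepted, because the data
                       provider may choose another vocabulary or free text.
    """
    if closed:
        return value in vocabulary
    return bool(value.strip())


# Illustrative subset only, not the real list of experimental roles.
EXPERIMENTAL_ROLES = {"bait", "prey"}

print(check_term("bait", EXPERIMENTAL_ROLES, closed=True))
print(check_term("decoy", EXPERIMENTAL_ROLES, closed=True))
print(check_term("decoy", EXPERIMENTAL_ROLES, closed=False))
```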

Format change

We now support only the new OBO format. The previous DAG-Edit format is no longer supported.

 

Data

The closed controlled vocabularies referenced by PSI MI are listed in the table below. All vocabularies are contained in a single file in OBO flat file format: psi-mi25.obo. They can be browsed at http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=MI. The correctness of references to external controlled vocabularies is currently not enforced by the PSI MI schema. It is the responsibility of the data provider to ensure that only existing terms at an up-to-date data source are referenced.
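
Since the schema does not enforce these references, a data provider may want to check identifiers against the vocabulary file before export. The following minimal sketch reads [Term] stanzas from OBO flat file text. It assumes the usual "tag: value" line layout of OBO files and does not handle trailing "!" comments, escapes, or other details of the format; the sample text is a hand-written excerpt using two identifiers from the table in this section, not a copy of psi-mi25.obo.

```python
def parse_obo_terms(text):
    """Parse [Term] stanzas into a list of dicts mapping tag -> list of values."""
    terms, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ":" in line and not line.startswith("!"):
            # Split on the first colon only, so "id: MI:0001" keeps its value.
            tag, _, value = line.partition(":")
            current.setdefault(tag.strip(), []).append(value.strip())
    return terms


SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: MI:0001
name: interaction detection method

[Term]
id: MI:0002
name: participant identification method

[Typedef]
id: part_of
"""

known_ids = {t["id"][0] for t in parse_obo_terms(SAMPLE_OBO)}
print("MI:0001" in known_ids)
```

Values are kept as lists because tags such as synonym can occur several times within one stanza.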

PSI MI XML schema elements and OBO major terms

PSI MI XML level 2.5 data element | term name | PSI-MI identifier
experimentType/participantIdentificationMethod | participant identification method | MI:0002
experimentType/interactionDetectionMethod | interaction detection method | MI:0001
interactionElementType/interactionType | interaction type | MI:0190
interactionElementType/participantList/participant/biologicalRole | biological role (example: enzyme) | MI:0500
interactionElementType/participantList/participant/experimentalPreparationList/experimentalPreparation | experimental preparation | MI:0346
interactionElementType/experimentalRoleList/experimentalRole | experimental role (example: bait) | MI:0495
featureType/featureDetectionMethod | feature detection method | MI:0003
featureType/featureType | feature type | MI:0116
'featureType/featureRangeList/featureRange/baseLocationType/startStatus/' and '../endStatus/' | feature range status | MI:0333
interactorType/interactorType | interactor type | MI:0313
xrefType/*/dbAc | database citation | MI:0444
xrefType/*/refTypeAc | refType | MI:0353
namesType/alias/typeAc | alias type | MI:0300
attributeListType/attribute/nameAc | attribute name | MI:0590

 

Obsolete terms

In the previous DAG format, which was split into *.dag and *.def files, obsolete MI terms were reported as children of the node “obsolete (MI:0431)”. The newer OBO format has a special class “obsolete”, to which all obsolete PSI MI terms are assigned.
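
When loading the vocabulary, obsolete terms usually need to be kept apart from current ones. OBO flat files commonly flag such terms with the tag "is_obsolete: true"; the sketch below assumes psi-mi25.obo follows that convention. Term records are plain dicts here, and the identifier MI:9999 is hypothetical.

```python
def split_obsolete(terms):
    """Split term records into (current, obsolete) lists.

    A term counts as obsolete when its "is_obsolete" tag equals "true".
    """
    current = [t for t in terms if t.get("is_obsolete") != "true"]
    obsolete = [t for t in terms if t.get("is_obsolete") == "true"]
    return current, obsolete


TERMS = [
    {"id": "MI:0001", "name": "interaction detection method"},
    # Hypothetical retired term, for illustration only.
    {"id": "MI:9999", "name": "hypothetical retired term", "is_obsolete": "true"},
]

current, obsolete = split_obsolete(TERMS)
print([t["id"] for t in current], [t["id"] for t in obsolete])
```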

 

Mapping from OBO to MIF25 format

We recommend the following mapping from the file psi-mi25.obo to PSI MI 2.5 XML files:

 

OBO format element | PSI MI 2.5 XML file element
id | cvType/xref/primaryRef/id
name | cvType/names/fullName
exact_synonym | cvType/names/shortLabel
synonym | cvType/names/alias
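
The recommended mapping can be sketched as a small conversion function. This is an illustration under assumptions, not a reference implementation: the function name cv_element, the db attribute value "psi-mi", the child element order, and the synonym values in the sample term are all hypothetical, and other attributes that the schema may require on primaryRef are omitted.

```python
import xml.etree.ElementTree as ET


def cv_element(tag, term):
    """Build a cvType-style element from an OBO term record.

    Mapping used: id -> xref/primaryRef/@id, name -> names/fullName,
    exact_synonym -> names/shortLabel, synonym -> names/alias.
    """
    cv = ET.Element(tag)
    names = ET.SubElement(cv, "names")
    ET.SubElement(names, "shortLabel").text = term["exact_synonym"]
    ET.SubElement(names, "fullName").text = term["name"]
    for synonym in term.get("synonym", []):
        ET.SubElement(names, "alias").text = synonym
    xref = ET.SubElement(cv, "xref")
    # "db" is an assumed attribute; only "id" comes from the mapping above.
    ET.SubElement(xref, "primaryRef", {"db": "psi-mi", "id": term["id"]})
    return cv


# The id and name come from the table of major terms; the synonyms are made up.
TERM = {
    "id": "MI:0001",
    "name": "interaction detection method",
    "exact_synonym": "interaction detect",
    "synonym": ["detection method"],
}

element = cv_element("interactionDetectionMethod", TERM)
print(ET.tostring(element, encoding="unicode"))
```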

 

Mapping controlled vocabularies between PSI 1.0 and 2.5

The major change from PSI 1.0 to 2.5 requires a remapping of controlled vocabularies.

Proposed mappings from PSI 1.0 to 2.5 CVs are described in cv-1to25mapping.doc.

The reverse mapping is described in cv-25to1mapping.txt. This file is presented in plain text format to facilitate parsing.

List of planned features

Because we are following a leveled approach, we are interested in knowing what the community wishes to be included in the next level.

The latest list of features to discuss/include in the future can be found here:
http://sourceforge.net/tracker/?atid=511101&group_id=65472&func=browse

How to comment

If you would like to comment on this document or on the PSI MI XML specification, please send a mail to:
psidev-mi-dev@lists.sourceforge.net 

Available data

Tools

Data submission

The following databases currently accept submissions of PSI MI formatted interaction data:

Further information and relevant links

Databases involved:

Companies involved:

Related Efforts:


Henning Hermjakob, hhe@ebi.ac.uk, 22-MAY-2006