This is an archive project page.

Welcome to the URI SOFTWARE SYSTEM Web Page 

From this page you can download all the files that are part of the URI SOFTWARE SYSTEM Project.

(To download a file, right-click its name and choose "Save Target As...")

ABSTRACT

Background: The RCSB (Research Collaboratory for Structural Bioinformatics) Protein Data Bank (PDB, www.rcsb.org) is a worldwide repository for 3-D biological macromolecular structure data. A reengineered beta site released in 2004 (pdbbeta.rcsb.org) features improved primary data in new mmCIF and XML formats, the result of the Data Uniformity Project. The XML files were obtained using a two-step process: converting PDB to mmCIF, and then converting mmCIF to XML. The conversion software (pdb2cif and mmCIF_loader) is limited to the UNIX platform and is not fully automated. To our knowledge, no direct PDB-to-XML converter is available.

Many biologists still work with files in the old PDB format, stored in their own collections on local computers, and there is still a need for access support. These files are plain text organized in 80-character lines, with each field restricted to a fixed range of character positions defined in the PDB standard. They are very long (many over 100 pages) and use abbreviated name tags. Although this format is convenient for computer applications, scientists find it time-consuming to search for information.
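As a rough illustration of the fixed-column layout described above, the following Python sketch slices one ATOM record by character position and wraps the fields in XML. It is not part of the URI software (the converter itself is a Perl script). The function names `parse_atom_record` and `atom_to_xml` and the XML element names are hypothetical and are not taken from URI_DTD.dtd. The column ranges follow the published PDB ATOM record layout, and the sample coordinates are made up for the example.

```python
import xml.etree.ElementTree as ET

# Build an 80-character sample ATOM line from its fixed-width fields,
# so the column widths are explicit. The values are illustrative only.
SAMPLE_LINE = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{'':1}{'GLU':>3} {'A':1}{1:>4}{'':4}"
    f"{11.104:8.3f}{6.134:8.3f}{-6.504:8.3f}{1.0:6.2f}{0.0:6.2f}"
    f"{'':10}{'N':>2}{'':2}"
)


def parse_atom_record(line):
    """Split one ATOM/HETATM line into named fields by column position."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    line = line.ljust(80)  # tolerate lines with trailing blanks stripped
    return {
        "serial": int(line[6:11]),            # columns 7-11
        "name": line[12:16].strip(),          # columns 13-16
        "res_name": line[17:20].strip(),      # columns 18-20
        "chain_id": line[21].strip(),         # column 22
        "res_seq": int(line[22:26]),          # columns 23-26
        "x": float(line[30:38]),              # columns 31-38
        "y": float(line[38:46]),              # columns 39-46
        "z": float(line[46:54]),              # columns 47-54
        "occupancy": float(line[54:60]),      # columns 55-60
        "temp_factor": float(line[60:66]),    # columns 61-66
        "element": line[76:78].strip(),       # columns 77-78
    }


def atom_to_xml(fields):
    """Wrap parsed fields in XML; element names here are hypothetical."""
    atom = ET.Element("atom")
    for key, value in fields.items():
        ET.SubElement(atom, key).text = str(value)
    return ET.tostring(atom, encoding="unicode")


if __name__ == "__main__":
    record = parse_atom_record(SAMPLE_LINE)
    print(record["res_name"], record["x"], record["y"], record["z"])
    print(atom_to_xml(record))
```

Slicing by position rather than splitting on whitespace matters here, because adjacent PDB fields may touch with no blank between them.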

Results: The prototype Universal Research Interchange Format (URI) Software System that we propose allows scientists to convert locally stored PDB files to XML format and then access protein information through user-friendly query interfaces. Our URI PDB-XML Converter component takes a PDB file and a DTD file (describing the XML model) as input and outputs an XML file. It also has built-in capabilities for producing one-letter sequences and calculating phi and psi angles, and it includes this information in the XML files alongside the PDB data. Our system features an extensible design that accommodates additional or modified queries and can accommodate a different DTD. The software is documented and freely available.
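The two derived quantities mentioned above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the converter's Perl code. The function names are hypothetical, unknown residue names are mapped to "X" as an assumed convention, and the angles use the usual backbone definitions: phi from C(i-1), N(i), CA(i), C(i) and psi from N(i), CA(i), C(i), N(i+1).

```python
import math

# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def one_letter_sequence(residue_names, unknown="X"):
    """Translate three-letter residue names into a one-letter string."""
    return "".join(THREE_TO_ONE.get(n.upper(), unknown) for n in residue_names)


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)  # unit vector along the central bond
    # Remove the components of b0 and b2 that lie along the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi(c_prev, n, ca, c):
    """Backbone phi angle of residue i from C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi angle of residue i from N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)


if __name__ == "__main__":
    print(one_letter_sequence(["GLU", "VAL", "LYS"]))  # EVK
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Phi is undefined for the first residue of a chain and psi for the last, so a caller would need to skip those positions or record them as missing.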

Conclusions: The URI PDB-XML Converter is, to our knowledge, the first tool providing direct PDB to XML conversion. It is extensible, and easily modifiable to handle changes in format and XML schema. The URI software is designed to handle the addition of queries as needed and can be easily integrated with other software supporting a web interface, compiled or ad hoc queries, and a mapping from the XML file to a DBMS.

 

File Name: File Size: Description:

                    GENERAL PROJECT DOCUMENTATION:

 Paper-2005-08-28-BIOMED.doc 1047 KB  Project Manuscript (version dated 8/28/2005)

                    PDB-XML CONVERTER - RELATED FILES

 URI_DTD.dtd  40 KB  DTD (Document Type Definition) for our URI XML format.
 URI_PDB2XML.pl 284 KB  URI PDB to XML Converter
 pdb_dtd_table.pm 52 KB  Converter Perl module defining the correspondence between the PDB record names and the DTD element names.
 URI-PDB-XML-Converter-Manual.doc 43 KB   User Manual for the URI PDB to XML Converter
 pdb1mcp.ent 293 KB  Full PDB file for testing the one-letter sequence function of the converter
 pdb12e8.ent 626 KB  Full PDB file for testing the one-letter sequence function of the converter
 pdb1mcp.xml 2778 KB  XML file produced by converting the pdb1mcp.ent file
 pdb12e8.xml 5146 KB  XML file produced by converting the pdb12e8.ent file

                    PDB DATA QUERY - RELATED FILES

 1mcp_dtd_inter.xml 68 KB  Manually created XML file based on the initial DTD design ( URI_DTD_0.dtd - 46 KB). This was created before the design and implementation of the PDB to XML Converter program was finished, and it was used for the design of the URI-PDB
 GUI.zip 1108 KB  Archive containing a total of 38 files (the GUI queries using XML, XSL, HTML), structured in 20 folders (parent folder named “Demo”). See the main document's Future Work section for more details.
 NewDatabaseDesign-07-07-04.zip 699 KB  Archive containing the database and database-related files (694 KB). Just as in the case of the XML queries mentioned above, the database is only a prototype: it reflects an older DTD design and contains an incomplete set of data from the 1MCP PDB file, the same data that was manually entered into 1mcp_dtd_inter.xml. See the main document's Future Work section for more details.

 

We plan to submit the manuscript describing our URI Software System project for publication. Should the article be accepted, we will update this page with a link to the published article. This web page will continue to serve as the homepage for all the project's files.