PDB Parser Output

TEST CASES FOR PDBPARSER2.0.PL. SCREEN CAPTURES SHOWING PARTIAL OUTPUT OF EACH TEST CASE:
("..." marks omitted output that is not significant for the test)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% TEST for handling DUPLICATE ELEMENT NAMES %%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
<< program should print the duplicate element names and the line numbers where they were found in the DTD file >>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

C:\users\cat\bioProject\bioworkarea\code>perl pdbparser2.0.pl
****************************************************************************************
*** PDBParser ***
*** Version: 2.0-1 ***
*** ***
*** PDBParser will convert a Bioinformatics's PDB file format to xml format, based ***
*** on a supplied DTD. ***
*** ***
*** Project: URI(tm) Universal Research Interchange Format ***
*** ***
*** Legal: Copyright (C) 2004, URI, Bioinformatics, CSC592 ***
*** ***
****************************************************************************************
The name of a Bioinformatics's PDB input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics PDB (ent): pdb12e8.ent pdb1mcp.ent
Please specify [pdb12e8.ent|pdb1mcp.ent](pdb12e8.ent):

The name of a Bioinformatics's DTD input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics DTD (dtd): DTD_URI.dtd DTD_URI_2_duplicate.dtd
Bioinformatics DTD (dtd): URI_DTD-04-15-04.dtd URI_DTD_err1_fixed.dtd
Bioinformatics DTD (dtd): URI_DTD_org.dtd
Please specify [](DTD_URI.dtd):DTD_URI_2_duplicate.dtd

******************************************
**** General DTD Information ****
******************************************
************************************
**** DTD File Declared Definitions:
************************************
Element Count: 261
Attributes Count: 128
Entity Count: 0
******************************************************************
**** Error: Errors were detected while reading the DTD file ****
******************************************************************
Duplicate Element (48): concaten (#PCDATA)
Duplicate Element (108): authors (#PCDATA)
Duplicate Element (109): title (#PCDATA)
Duplicate Element (135): authors (#PCDATA)
Duplicate Element (136): title (#PCDATA)
Duplicate Element (137): editors (#PCDATA)
Duplicate Element (138): to_be_pulished (#PCDATA)
Duplicate Element (139): journal_abbrev (#PCDATA)
Duplicate Element (140): journal_vol (#PCDATA)
Duplicate Element (141): first_page (#PCDATA)
Duplicate Element (142): year (#PCDATA)
Duplicate Element (143): publishers (#PCDATA)
Duplicate Element (144): journal_id_ASTM (#PCDATA)
Duplicate Element (145): country (#PCDATA)
... ... ...
******************************************
**** End of General DTD Information ****
******************************************
PDB Parser has exited due to DTD errors.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% TEST for handling ORPHAN ELEMENTS %%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
<< program should print the orphan element names and the line numbers where they were found in the DTD file >>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
... ... ...
Please specify [pdb12e8.ent|pdb1mcp.ent](pdb12e8.ent):

The name of a Bioinformatics's DTD input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics DTD (dtd): DTD_URI.dtd DTD_URI_2_duplicate.dtd
Bioinformatics DTD (dtd): URI_DTD-04-15-04.dtd URI_DTD_err1_fixed.dtd
Bioinformatics DTD (dtd): URI_DTD_org.dtd
Please specify [](DTD_URI.dtd):URI_DTD_org.dtd
... ... ...
*********************
Element: hetnams
Attributes: non-polyer_seqs_type CDATA #FIXED "HETNAM"
Category: LIST
Child Elements: hetname*
*********************

******************************************
**** General DTD Information ****
******************************************
************************************
**** DTD File Declared Definitions:
************************************
Element Count: 483
Attributes Count: 274
Entity Count: 0
***************************
**** DTD Tree Definitions:
***************************
Root Element: URI_protein
Proclaimed Elements: 251
Associated Elements: 249
Orphaned Elements: 234
Associated Attributes: 222
Orphaned Attributes: 52
********************************************************************
**** Error: Errors were detected while building the DTD tree. ****
********************************************************************
Proclaimed Child (705): Parent Element "hetnams" proclaimed a child Element "hetname" but none exist in the Element Declaration list.
******************************************
**** End of General DTD Information ****
******************************************
PDB Parser has exited due to DTD errors.
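
The two DTD checks seen in the captures up to this point (a duplicate element declaration, and a parent proclaiming a child that is never declared) could be sketched in Python roughly as follows. This is not the code of pdbparser2.0.pl, whose Perl source is not shown here. The function name scan_dtd, the regular expressions, and the tuple layout are all assumptions made for illustration, and the sketch reports the line of the later declaration as the duplicate, which may or may not match the parser's own convention.

```python
import re

# Hypothetical patterns: a DTD element declaration and an XML-style name.
# DOTALL lets one declaration span several physical lines.
ELEMENT_RE = re.compile(r"<!ELEMENT\s+([\w.:-]+)\s+(.*?)>", re.DOTALL)
NAME_RE = re.compile(r"[A-Za-z_][\w.:-]*")
KEYWORDS = {"EMPTY", "ANY", "PCDATA"}


def scan_dtd(text):
    """Return (declared, duplicates, missing) for the DTD source in `text`.

    declared:   {element name: (line number, content model)}
    duplicates: [(line number, element name, content model)]
    missing:    [(parent line number, parent name, undeclared child name)]
    """
    declared = {}
    duplicates = []
    for match in ELEMENT_RE.finditer(text):
        name = match.group(1)
        content = " ".join(match.group(2).split())  # collapse line breaks
        line = text.count("\n", 0, match.start()) + 1
        if name in declared:
            duplicates.append((line, name, content))
        else:
            declared[name] = (line, content)

    missing = []
    for parent, (line, content) in declared.items():
        for child in NAME_RE.findall(content):
            if child not in KEYWORDS and child not in declared:
                missing.append((line, parent, child))
    return declared, duplicates, missing


if __name__ == "__main__":
    # Made-up three-line DTD, not one of the test files listed above.
    sample = (
        "<!ELEMENT hetnams (hetname*)>\n"
        "<!ELEMENT title (#PCDATA)>\n"
        "<!ELEMENT title (#PCDATA)>\n"
    )
    _, dups, miss = scan_dtd(sample)
    for line, name, content in dups:
        print(f"Duplicate Element ({line}): {name} {content}")
    for line, parent, child in miss:
        print(f"Proclaimed Child ({line}): {parent} -> {child}")
```

Collecting every declaration first and checking children afterwards means a child declared later in the file than its parent is not falsely reported as missing.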
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%% TEST for handling ELEMENT ORPHANS, OR %%%%%%%%%%
%%%%%%% ELEMENTS DECLARED OVER MULTIPLE LF-CR LINES %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
<< program should print the orphan element names and the line numbers where they were found in the DTD file >>
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
... ... ...
Please specify [pdb12e8.ent|pdb1mcp.ent](pdb12e8.ent):

The name of a Bioinformatics's DTD input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics DTD (dtd): DTD_URI.dtd DTD_URI_2_duplicate.dtd
Bioinformatics DTD (dtd): URI_DTD-04-15-04.dtd URI_DTD_err1_fixed.dtd
Bioinformatics DTD (dtd): URI_DTD_org.dtd
Please specify [](DTD_URI.dtd):URI_DTD_err1_fixed.dtd
... ... ...
*********************
Element: salt_atom_serial_num
Attributes: None
Category: #PCDATA
*********************

******************************************
**** General DTD Information ****
******************************************
************************************
**** DTD File Declared Definitions:
************************************
Element Count: 483
Attributes Count: 274
Entity Count: 0
***************************
**** DTD Tree Definitions:
***************************
Root Element: URI_protein
Proclaimed Elements: 462
Associated Elements: 461
Orphaned Elements: 22
Associated Attributes: 271
Orphaned Attributes: 3
****************************************************************************
**** Error: The DTD tree was built without all declared definitions. ****
**** ****
**** Element orphans exist when it or one of its ancestors are ****
**** not proclaimed by a parent Element. Note: The number of ****
**** Proclaimed and Associated Elements to determine the number ****
**** of disassociated trees branches possibly causing multiple ****
**** orphans. Attribute orphans exist when its Element is not ****
**** declared or a parent Element does not proclaim the ****
**** Attribute's Element as its child Element. ****
****************************************************************************
****************************
**** Unassociated Elements:
****************************
Orphan Element (136): "journal_id_ASTM (#PCDATA)"
Orphan Element (137): "country (#PCDATA)"
Orphan Element (138): "journal_id_ISSN (#PCDATA)"
Orphan Element (139): "journal_id_ISBN (#PCDATA)"
Orphan Element (140): "ccdc_pdb_code (#PCDATA)"
Orphan Element (603): "db_struct_ref_db_accession (#PCDATA)"
Orphan Element (604): "db_struct_ref_db_code (#PCDATA)"
Orphan Element (606): "db_struct_ref_auth_align_begin (#PCDATA)"
Orphan Element (607): "db_struct_ref_auth_insertion_begin (#PCDATA)"
Orphan Element (609): "db_struct_ref_auth_align_end (#PCDATA)"
Orphan Element (610): "db_struct_ref_auth_insertion_end (#PCDATA)"
Orphan Element (613): "struct_refs_seq_difs (struct_ref_seq_dif*)"
Orphan Element (616): "struct_ref_seq_dif (struct_ref_seq_pdb_id*, struct_ref_seq_aminoacid_id*, struct_ref_seq_pdb_strand_id*, struct_ref_seq_db_name*, struct_ref_seq_db_accession*, struct_ref_seq_details*)"
Orphan Element (620): "struct_ref_seq_pdb_id (#PCDATA)"
Orphan Element (621): "struct_ref_seq_aminoacid_id (#PCDATA)"
Orphan Element (622): "struct_ref_seq_pdb_strand_id (#PCDATA)"
Orphan Element (623): "struct_ref_seq_db_name (#PCDATA)"
Orphan Element (624): "struct_ref_seq_db_accession (#PCDATA)"
Orphan Element (625): "struct_ref_seq_details (#PCDATA)"
Orphan Element (1008): "atom_auth_comp_id (#PCDATA)"
Orphan Element (1009): "atom_auth_asym_id (#PCDATA)"
Orphan Element (1010): "atom_auth_atom_id (#PCDATA)"
*****************************************************
**** Attribute's Element or ancestor not proclaimed:
*****************************************************
Orphan Attribute (614): (struct_refs_seq_difs) "pdb_id CDATA #IMPLIED"
Orphan Attribute (617): (struct_ref_seq_dif) "struct_ref_seq_dif_id CDATA #REQUIRED"
Orphan Attribute (618): (struct_ref_seq_dif) "seq_num CDATA #IMPLIED"
******************************************
**** End of General DTD Information ****
******************************************
PDB Parser has exited due to DTD errors.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%% TEST for successfully handling valid DTDs %%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

****************************************************************************************
*** PDBParser ***
*** Version: 2.0-1 ***
*** ***
*** PDBParser will convert a Bioinformatics's PDB file format to xml format, based ***
*** on a supplied DTD. ***
*** ***
*** Project: URI(tm) Universal Research Interchange Format ***
*** ***
*** Legal: Copyright (C) 2004, URI, Bioinformatics, CSC592 ***
*** ***
****************************************************************************************
The name of a Bioinformatics's PDB input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics PDB (ent): pdb1mcp.ent
Please specify [pdb1mcp.ent](pdb1mcp.ent):

The name of a Bioinformatics's DTD input file will be needed.
Here is a list of input files in the current directory.
Bioinformatics DTD (dtd): URI_DTD-04-03-04.dtd URI_DTD-04-15-04.dtd
Bioinformatics DTD (dtd): URI_DTD_partially _fixed.dtd
Please specify [](URI_DTD-04-03-04.dtd):URI_DTD-04-15-04.dtd

*********************
Element: URI_protein
Attributes: pdb_id CDATA #REQUIRED
Category: LIST
Child Elements: attributes? annotation? seq_data? sites? crystal_cell?
Child Elements: orig_matrices? model_atoms? connections? bookkeep_informat
*********************
*********************
Element: attributes
Attributes: pdb_id CDATA #IMPLIED
Category: LIST
Child Elements: header* deletion* titles* error_warn* compounds* sources*
Child Elements: keywords*
*********************
*********************
Element: header
Attributes: classification CDATA #IMPLIED
Attributes: deposition_date CDATA #IMPLIED
Attributes: pdb_id CDATA #IMPLIED
Category: EMPTY
*********************
... ... ...
*********************
Element: num_connections
Attributes: None
Category: #PCDATA
*********************

******************************************
**** General DTD Information ****
******************************************
************************************
**** DTD File Declared Definitions:
************************************
Element Count: 491
Attributes Count: 275
Entity Count: 0
***************************
**** DTD Tree Definitions:
***************************
Root Element: URI_protein
Proclaimed Elements: 491
Associated Elements: 491
Orphaned Elements: 0
Associated Attributes: 275
Orphaned Attributes: 0
******************************************
**** End of General DTD Information ****
******************************************

New converted L Sequence:
DIVMTQSQKFMSTSVGDRVSITCKASQNVGTAVAWYQQKPGQSPKLMIYSASNRYTGVPDRFTGSGSGTDFTLTISNMQSEDLADYFCQQYSSYPLTFGA
GTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSATDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKT
STSPIVKSFNRNEC
New converted H Sequence:
EVQLQQSGAEVVRSGASVKLSCTASGFNIKDYYIHWVKQRPEKGLEWIGWIDPEIGDTEYVPKFQGKATMTADTSSNTAYLQLSSLTSEDTAVYYCNAGH
DYDRGRFPYWGQGTLVTVSAAKTTPPSVYPLAPGSAAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVPSSTWPSETV
TCNVAHPASSTKVDKKIVPRD
New converted M Sequence:
DIVMTQSQKFMSTSVGDRVSITCKASQNVGTAVAWYQQKPGQSPKLMIYSASNRYTGVPDRFTGSGSGTDFTLTISNMQSEDLADYFCQQYSSYPLTFGA
GTKLELKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSATDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKT
STSPIVKSFNRNEC
New converted P Sequence:
EVQLQQSGAEVVRSGASVKLSCTASGFNIKDYYIHWVKQRPEKGLEWIGWIDPEIGDTEYVPKFQGKATMTADTSSNTAYLQLSSLTSEDTAVYYCNAGH
DYDRGRFPYWGQGTLVTVSAAKTTPPSVYPLAPGSAAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVPSSTWPSETV
TCNVAHPASSTKVDKKIVPRD
PDB Parser has completed with Success.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
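
The "Orphaned Elements" counts in the DTD Tree Definitions blocks can be read as a reachability question: an element is an orphan when neither it nor any of its ancestors is proclaimed by an element that hangs off the root. A minimal Python sketch of that idea follows. It assumes the DTD has already been reduced to a mapping from each declared element to the children it proclaims. The function find_orphans and the sample mapping are hypothetical and are not taken from pdbparser2.0.pl.

```python
from collections import deque


def find_orphans(children, root):
    """Return the declared elements that cannot be reached from `root`.

    children: {declared element name: list of child names it proclaims}
    Child names that are not declared themselves are ignored here.
    """
    reachable = {root}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for child in children.get(parent, ()):
            if child in children and child not in reachable:
                reachable.add(child)
                queue.append(child)
    return sorted(set(children) - reachable)


if __name__ == "__main__":
    # Tiny made-up tree reusing a few element names from the captures.
    tree = {
        "URI_protein": ["attributes"],
        "attributes": ["header"],
        "header": [],
        "struct_ref_seq_dif": ["struct_ref_seq_pdb_id"],
        "struct_ref_seq_pdb_id": [],
    }
    print(find_orphans(tree, "URI_protein"))
    # prints ['struct_ref_seq_dif', 'struct_ref_seq_pdb_id']
```

One unproclaimed parent orphans its entire subtree, which would be consistent with a single missing link producing many orphan lines in the failing captures.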