Comparing and Reconciling Functional Annotations


When curating the ERGO™ database, it is often necessary to review the automated functional assignments that are routinely performed when new genomes are added to ERGO™. The purpose is twofold: first, to identify open reading frames for which the automated process was unable to assign a function; and second, to examine the automated assignments in the context of selected user models or external annotations and determine whether each functional assignment is reasonable.

Reconcile Differences

Open the Statistics page for the organism of interest. Click the link at the bottom of the page labeled “Compare/Reconcile assignment differences”. This opens the Reconcile User Model page.

The page displays a table of User / Function Assignment differences, which lists the number of open reading frames for which a user model's function and the assigned function differ, along with Pathway assertion differences. From this table, select the user models that may require reconciliation.

In the example page shown below (Bacillus subtilis, BS), there are many user models to choose from because this organism has been curated extensively through both manual curation and external annotation. For most organisms new to ERGO™, the list will be considerably shorter and will likely consist of COG, GenBank, and Pfam.

Reconcile Functions

It is important to note that the comparison between user models is based entirely on syntax. If two functional assignments differ at all in syntax, the difference will appear in this list. For example, the annotations “Pyridoxine biosynthesis enzyme” and “pyridoxine biosynthesis protein” are clearly not in conflict, but because the syntax differs, this tool treats them as different annotations.
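The effect of a purely syntactic comparison can be pictured with a short Python sketch. It assumes that plain string equality is a fair stand-in for the comparison the tool performs, which this page does not spell out; the function name `differs_by_syntax` is hypothetical and not part of ERGO™.

```python
def differs_by_syntax(first: str, second: str) -> bool:
    """Return True when two annotation strings are not character-for-character identical.

    Assumption: plain string equality stands in for the tool's comparison.
    """
    return first != second


# The pair from the text: biologically compatible, yet reported as a difference.
print(differs_by_syntax("Pyridoxine biosynthesis enzyme",
                        "pyridoxine biosynthesis protein"))  # True

# Only identical strings count as the same annotation.
print(differs_by_syntax("Pyridoxine biosynthesis enzyme",
                        "Pyridoxine biosynthesis enzyme"))  # False
```

Under this assumption, a listed difference says only that the strings differ. Whether the annotations actually disagree is left to the curator.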

Select the user models you wish to compare, then click “Reconcile functions” at the bottom of the page. This loads the page displaying the differences found between the selected user models and the current master function. In the example page below, the user models COG, GenBank, and Pfam were selected.

Function Differences

Annotate Unassigned ORFs

To annotate open reading frames that the automated assignment could not annotate, scroll down the list of differences and identify those open reading frames for which the functional assignment in the right-most column is empty.

Note that the header in this column will indicate that the assignment belongs to the current user (for example, “guest’s function assignment”). In fact, the assignment in this column is the master function that was assigned either by automatic assignment or by a master annotator.

For each open reading frame with an empty function assignment, open the Protein Page and use the available internal and external tools to elucidate a possible function. There is a formal process for annotating functions; see Annotating an Open Reading Frame. Continue until every open reading frame has a non-blank annotation.
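The selection step can be sketched in Python for a curator who has transcribed the differences list into a script. This is only an illustration: this page does not describe an export from ERGO™, and the field names `orf` and `master_function`, the identifiers, and the function names in the sample rows are all made up.

```python
def unassigned_orfs(rows):
    """Return the ORF identifiers whose master function is missing or blank."""
    return [
        row["orf"]
        for row in rows
        if not (row.get("master_function") or "").strip()
    ]


# Made-up rows standing in for the differences list (right-most column only).
rows = [
    {"orf": "ORF_0001", "master_function": "example function"},
    {"orf": "ORF_0002", "master_function": ""},
    {"orf": "ORF_0003", "master_function": "   "},
    {"orf": "ORF_0004", "master_function": None},
]

print(unassigned_orfs(rows))  # ['ORF_0002', 'ORF_0003', 'ORF_0004']
```

Treating whitespace-only and missing values as empty is a design choice in this sketch, so that every ORF without a usable annotation ends up on the worklist.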

Review Automated Assignments

Occasionally, the automatic assignment for an open reading frame may not be reasonable given the available information. This can result from the propagation of poor or incorrect annotations. Therefore, automatically assigned functions may require review.

To examine automatic assignments, select the desired user models on the Reconcile User Model page and click “Reconcile functions”. This loads the page displaying the differences found between the selected user models and the current master function. For each open reading frame, verify that the master function is not in conflict with the assignments of the user models.

The following example illustrates a difference between user models and the master function where there is no conflict:

No Conflict

The following example illustrates a case in which a possible conflict might arise:

Possible Conflict

Understanding Conflicts

It is important to note that an observed conflict does not mean that the automatic assignment is incorrect. It only means that more investigation is required.

If a potential conflict is apparent, open the link to the Protein Page and use the tools and the similarity table to understand how the automatic assignment was obtained. In most conflict situations, the automatic assignment is based on homology to similar open reading frames in ERGO™.

It is the responsibility of the curator to examine all the evidence for the assignment and determine whether the assignment is reasonable. If the evidence shows that the automatically assigned function is unreasonable, then, to prevent further error propagation, it may be necessary to re-annotate all open reading frames in the similarity table that have been assigned the same unreasonable function.

However, great care must be taken in determining which open reading frames in the similarity table have also been unreasonably annotated. It may be necessary to consult Smith-Waterman P-scores and/or the corresponding Protein Pages.
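As a rough illustration of building a review worklist, the Python sketch below picks out the similarity-table entries that carry the function judged unreasonable. It assumes the table has been transcribed by hand into dictionaries. The field names `orf` and `function`, the identifiers, and the function names are hypothetical. P-scores are deliberately left out, because this page does not define how they should be interpreted.

```python
def reannotation_candidates(similarity_rows, suspect_function):
    """List ORFs in a similarity table that carry the suspect function.

    The match is an exact string comparison, and the result is a worklist
    for manual review, not a list of confirmed errors.
    """
    return [
        row["orf"]
        for row in similarity_rows
        if row["function"] == suspect_function
    ]


# Made-up similarity table entries.
similarity_rows = [
    {"orf": "ORF_A", "function": "suspect function"},
    {"orf": "ORF_B", "function": "another function"},
    {"orf": "ORF_C", "function": "suspect function"},
]

print(reannotation_candidates(similarity_rows, "suspect function"))  # ['ORF_A', 'ORF_C']
```

Each candidate still needs to be checked individually before any re-annotation.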

Topic revision: r1 - 24 Jul 2009 - 22:48:40 - TravisHarrison
 