New Computational Model Predicts Gene Function

By LabMedica International staff writers
Posted on 04 Mar 2010
Scientists have created a new computational model that can be used to predict the function of uncharacterized plant genes with unprecedented speed and accuracy. The network, dubbed AraNet, connects over 19,600 genes to each other through more than one million links and can increase the discovery rate of new genes associated with a given trait tenfold. It is a major advance for fundamental plant biology and agricultural research.

In spite of the immense progress in the functional characterization of plant genomes, over 30% of the 30,000 Arabidopsis genes have not yet been functionally characterized. For another third, there is little evidence regarding their role in the plant. "In essence, AraNet is based on the simple idea that genes that physically reside in the same neighborhood, or turn on in concert with one another, are probably associated with similar traits," explained corresponding author Dr. Sue Rhee at the Carnegie Institution for Science's (Washington, DC, USA) department of plant biology. "We call it guilt by association. Based on over 50 million scientific observations, AraNet contains over one million linkages among the 19,600 genes in the tiny, experimental mustard plant Arabidopsis thaliana. We made a map of the associations and demonstrated that we can use the network to propose that uncharacterized genes are linked to specific traits based on the strength of their associations with genes already known to be linked to those traits."

The network allows for two main types of testable hypotheses. The first uses a set of genes known to be involved in a biological process, such as stress responses, as "bait" to find new genes ("prey") involved in the same process. The bait genes are linked to each other based on over 24 different types of experiments or computations. If they are linked to each other much more frequently or strongly than expected by chance, one can hypothesize that other genes that are also linked to the bait genes have a high probability of being involved in the same process. The second type of testable hypothesis predicts functions for uncharacterized genes. There are 4,479 uncharacterized genes in AraNet that have links to genes that have been characterized, so a significant portion of all the unknowns now get a new hint as to their function.
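The two kinds of hypotheses can be illustrated with a minimal Python sketch. It is not AraNet's actual scoring method, which the article does not describe in detail. The gene names, link weights, annotations, and the simple "sum of link weights" score are all hypothetical, chosen only to show how guilt by association can rank candidate genes and suggest functions.

```python
from collections import defaultdict

# Hypothetical toy network: (gene_a, gene_b, link_weight).
# Names and weights are invented for illustration, not taken from AraNet.
TOY_LINKS = [
    ("bait1", "bait2", 3.0),
    ("bait1", "geneX", 2.5),
    ("bait2", "geneX", 2.0),
    ("bait2", "geneY", 0.5),
    ("geneY", "geneZ", 4.0),
]

# Hypothetical annotations for the genes that are already characterized.
TOY_ANNOTATIONS = {
    "bait1": ["stress response"],
    "bait2": ["stress response"],
    "geneZ": ["root development"],
}


def build_network(links):
    """Return an undirected adjacency map: gene -> {neighbor: weight}."""
    network = defaultdict(dict)
    for gene_a, gene_b, weight in links:
        network[gene_a][gene_b] = weight
        network[gene_b][gene_a] = weight
    return network


def rank_prey(network, bait):
    """Hypothesis type 1: rank non-bait genes by summed link weight to bait genes."""
    bait = set(bait)
    scores = defaultdict(float)
    for bait_gene in bait:
        for neighbor, weight in network.get(bait_gene, {}).items():
            if neighbor not in bait:
                scores[neighbor] += weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def predict_function(network, gene, annotations):
    """Hypothesis type 2: score candidate functions from a gene's annotated neighbors."""
    scores = defaultdict(float)
    for neighbor, weight in network.get(gene, {}).items():
        for term in annotations.get(neighbor, ()):
            scores[term] += weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


network = build_network(TOY_LINKS)
ranking = rank_prey(network, ["bait1", "bait2"])
predictions = predict_function(network, "geneY", TOY_ANNOTATIONS)
print(ranking)
print(predictions)
```

In this toy example, geneX ranks first as prey because it is strongly linked to both bait genes, while geneZ is not ranked at all because it has no direct link to any bait gene. A real analysis would also need to check that the bait genes are more tightly connected to each other than expected by chance before trusting the ranking.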

The scientists tested the accuracy of AraNet with computational validation tests and laboratory experiments on genes that the network predicted to be related. The researchers selected three uncharacterized genes. Two of them exhibited the phenotypes that AraNet predicted. One is a gene that regulates drought sensitivity, now named Drought-sensitive 1 (Drs1). The other, called Lateral root stimulator 1 (Lrs1), regulates lateral root development. The researchers found that the network is much better at forecasting correct associations than previous small-scale networks of Arabidopsis genes.

"Plants, animals, and other organisms share a surprising number of the same or similar genes, particularly those that arose early in evolution and were retained as organisms differentiated over time," commented lead and corresponding author Insuk Lee at Yonsei University (Seoul, South Korea). "AraNet not only contains information from plant genes, it also incorporates data from other organisms. We wanted to know how much of the system's accuracy was a result of plant data versus nonplant-derived data. We found that although the plant linkages provided most of the predictive power, the nonplant linkages were a significant contributor."

"AraNet has the potential to help realize the promise of genomics in plant engineering and personalized medicine," remarked Dr. Rhee. "A main bottleneck has been the huge portion of genes with unknown function, even in model organisms that have been studied intensively. We need innovative ways of discovering gene function and AraNet is a perfect example of such innovation. Food security is no longer taken for granted in the fast-paced milieu of the changing climate and globalized economy of the 21st century."

The investigators published their findings on January 31, 2010, in the advance online issue of the journal Nature Biotechnology.

"Innovations in the basic understanding of plants and effective application of that knowledge in the field are essential to meet this challenge. Numerous genome-scale projects are underway for several plant species. However, new strategies to identify candidate genes for specific plant traits systematically by leveraging these high-throughput, genome-scale experimental data are lagging. AraNet integrates all such data and provides a rational, statistical assessment of the likelihood of genes functioning in particular traits, thereby helping scientists to design experiments to discover gene function. AraNet will become an essential component of next-generation plant research," concluded Dr. Rhee.

Related Links:
Carnegie Institution for Science


