IPC Public talk:Community Portal

Innovation in Land Plants and Comparative Genomics breakout group meeting notes

April 9th 2008

Participants:

  • Eric Lyons (elyons@nature.berkeley.edu)
  • Ann Blakey
  • Ken Clark
  • Shiran Pasternak
  • R Geeta
  • Steve Mount
  • Amy Litt
  • Manpreet Katari
  • Eva Huala
  • Lukas Muller
  • Todd Vision
  • Marcie McClure
  • JD Liu
  • Heidi Ledford
  • Jane Geisler-Lee
  • Luis Servin
  • Basil Nikolau

Participants introduced themselves and stated their research interests:

  • Plant Genome Evolution
  • Polymerase/Reverse Transcriptases Evolution (gene family evolution)
  • Transposon identification and evolution
  • Gene and genome duplication, and downstream phenotypic diversity
  • Solanaceous genome consortium -- comparative genomics
  • TAIR -- Genotype->phenotype
  • Use of comparative genomics in genome annotation
  • Genome visualization
  • Data integration for gene network building
  • Comparative genomics for seed development -- taking At data and mapping to non-model systems
  • Plant systematics and evodevo -- flower and fruit evolution
  • Dynamic collaborations across disparate disciplines
  • RNA splicing -- core and ancillary signals, development of splicing prediction algorithms. Unique evolution of plant splicing
  • Evolution of morphology -- leaf evolution, underlying mechanism, use of comparative genomics in answering these questions.
  • maizesequence.org CSH -- high-throughput analysis of genetic data
  • gramene -- comparative mapping projects
  • Secondary metabolism evolution
  • EST databases and datasets
  • Plant phylogenetics
  • Identification of non-protein coding RNAs
  • Plant metabolism and metabolic diversity in the biosphere
  • Comparative genetics
  • Evolution of disease resistance
  • Meiosis, recombination, genome rearrangements, 3D chromatin visualization
  • Genomics->phenotypes for differential yield improvement of crop plants
  • Journalistic interests in topic.

This sampling of participants' research interests clearly represents a diverse group -- is there a common theme? Specifically, although the research interests differ, can we come up with a common set of tools that will provide the most benefit to a wide range of them? This type of computational tool development probably falls under the domain of Foundation Tools.

Initial discussion centered on the following questions:

  • What datasets/databases are available?
    • Genome sequences, EST databases, expression datasets, protein-protein networks, metabolic pathways, phenotype data (from mutational analysis to disease resistance to insect feeding preferences)
  • What biological questions shall we try to answer?
    • Example: Consequence of evolutionary pressures on phenotype.
  • Are there natural partitions of GC topics: metabolism, gene expression, etc.?
  • How broad or narrow a GC shall we focus on?
    • L. Stein's examples yesterday: DNA barcoding, transcriptome analysis, etc.

GC options: Reminder -- think big! Genotype to phenotype, identify modules of function, understanding evolution as a continuum

  • Understanding genome structure (bag of molecular parts versus internal structure with functional implications)
  • What makes a plant genome a plant genome and a plant a plant. Identify those plant components that are unique, and link genotype to phenotype.
  • When the genome changes, how does the transcriptome change, how does the methylome change, how do pathways/networks change?
  • Evolution of plant genome regulation (expression, structure, methylation)
  • Evolution of plant secondary metabolism.
  • Evolution of vasculature (an example of diversity in plant evolution, e.g. monocot versus eudicot leaf development, PIN proteins)
  • Evolution of the seed
  • What is the nature of the static structure of a genome, and how does it relate to the form and function of the organism as a whole (genometype to phenotype)?
  • Association of organisms and symbiotic relationships
  • Ability to make functional, developmental and evolutionary predictions! Example of predictions:

    • Number, types, and presence of genes in incomplete genomes
    • How will plant development change when grown under conditions X?
    • How will change Y affect crop yield?
    • Secondary metabolic networks

Workflow example: I'm a biologist with my favorite gene/gene family in a non-model organism -- what is known in related organisms for which we have more data? Find and compare syntenic regions or orthologous/paralogous genes. Bootstrap to expression data, to gene regulatory networks, to protein interaction networks, to subcellular macro-structures (e.g. proteasome), to secondary metabolic pathways. Generate hypotheses/predictions, then test!

Summary of central themes common to the proposed GCs. Overall, there was agreement that we need to use the comparative approach in order to understand evolution. At the core of this approach (for our questions) is comparative genomics. However, genomic data and annotations must be linked to additional functional information in order to link genometype and genotype to phenotype. This includes, but is not limited to, expression data, regulatory networks, signaling pathways, protein interaction networks, and secondary metabolism pathways.

When dealing with genomic data, we need a system that can accommodate any genome from any organism without the need to have a single "reference" genome. Additionally, we need to be able to accommodate genomes in various states of completion, including EST databases without a genomic sequence scaffold. We will also probably want a system that can track multiple versions of any given genome, as genomic sequence and annotations will change over time. When such updates happen to the underlying data, it is important to be able to map previous work onto new datasets.
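As a rough illustration only (nothing like this was specified in the session), the storage idea above could be sketched in Python as follows. All class names, method names, and status labels ("complete", "draft", "est_only") are hypothetical, and mapping earlier work onto a new version is reduced here to a simple lookup of old feature IDs against new ones.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class GenomeVersion:
    organism: str
    version: int
    status: str  # hypothetical labels: "complete", "draft", or "est_only"


@dataclass
class GenomeStore:
    # Keyed by (organism, version); no entry is privileged as a "reference".
    _entries: Dict[Tuple[str, int], GenomeVersion] = field(default_factory=dict)
    # Feature ID mappings between two versions of the same organism.
    _id_maps: Dict[Tuple[str, int, int], Dict[str, str]] = field(default_factory=dict)

    def add(self, genome: GenomeVersion) -> None:
        self._entries[(genome.organism, genome.version)] = genome

    def versions(self, organism: str) -> List[int]:
        return sorted(v for (org, v) in self._entries if org == organism)

    def latest(self, organism: str) -> Optional[GenomeVersion]:
        versions = self.versions(organism)
        return self._entries[(organism, versions[-1])] if versions else None

    def add_id_map(self, organism: str, old: int, new: int,
                   mapping: Dict[str, str]) -> None:
        self._id_maps[(organism, old, new)] = dict(mapping)

    def carry_forward(self, organism: str, old: int, new: int,
                      feature_ids: Iterable[str]) -> Dict[str, Optional[str]]:
        """Map old feature IDs onto a newer version; unmapped IDs become None."""
        mapping = self._id_maps.get((organism, old, new), {})
        return {fid: mapping.get(fid) for fid in feature_ids}
```

In this sketch, an EST-only dataset is simply another entry with a different status, and features that cannot be carried forward are returned as None so that they can be flagged for manual review rather than silently dropped.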

By having a system that integrates and allows us to query "omic" information, we can identify and possibly prioritize new datasets that will greatly help fill holes in our knowledge. For example:

  • Which genomes should be sequenced next to help cover phylogenetic depth, understand crop domestication and improvement, and sample unique metabolic and developmental pathways?
  • For which organisms do we need additional expression data?
  • For which organisms should we characterize secondary metabolites?

Briefly touched upon was the need for computational solutions that reduce the complexity of interacting with the underlying data and their relationships, and of performing comparative analyses on them. In other words, we want a system that is easy to use, has all the information we want (when available), but doesn't overwhelm us with complexity.


Summary of requirements

  • Need for a system that
    • Can generically store any genome or EST data set from any organism without the need for a specific reference genome. This includes other "plant" genomes -- plastids, mitochondria, viruses, etc.
    • Comparison tools to find and evaluate gene homology and genomic synteny (where applicable)
    • Expression data (through orthologous/paralogous groups of genes)
    • Ability to store phylogenetic information at both organismal and gene levels
    • Pathways/metabolic networks/protein-protein interaction networks
    • Need to track provenance of data and annotations
    • Need to track phenotypes: standard mutations, disease resistance, complex phenotypes (insect/pest feeding preferences)
    • Ability to add information back into the system (annotation, expression data, network connections), and track authors of data
    • Comparative physiology (a leaf versus a petiole, thorn versus spine) -- mapping between organisms
    • Predictive power -- given what we know, what do we know that we don't know, but might be able to predict. E.g. power to predict protein networks, secondary metabolism
    • Must easily be able to compare and traverse various "ome" levels -- genome to transcriptome to proteome to methylome to metabolome, interactome, etc.
    • Future proof: Appreciation for the next generation of sequencing technology -- many more genomes on the way. Genome assembly and annotation.
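To make two of the requirements above more concrete (tracking provenance and authorship of contributed data, and moving between "ome" levels), here is a minimal Python sketch. It is an assumption for illustration, not a design agreed on in the session; the `Annotation` and `AnnotationLog` names and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Annotation:
    feature_id: str  # gene or other feature the record describes
    ome_level: str   # e.g. "genome", "transcriptome", "proteome"
    value: str       # the annotation itself
    author: str      # who contributed it
    source: str      # where it came from (dataset, publication, curator)


class AnnotationLog:
    """Append-only log: submissions are never overwritten, so provenance is kept."""

    def __init__(self) -> None:
        self._records: List[Annotation] = []

    def submit(self, annotation: Annotation) -> None:
        self._records.append(annotation)

    def history(self, feature_id: str) -> List[Annotation]:
        """All records for a feature, oldest first."""
        return [r for r in self._records if r.feature_id == feature_id]

    def levels(self, feature_id: str) -> List[str]:
        """Distinct "ome" levels with data for a feature, in first-seen order."""
        seen: List[str] = []
        for record in self.history(feature_id):
            if record.ome_level not in seen:
                seen.append(record.ome_level)
        return seen
```

Keeping the log append-only is one simple way to preserve who added what and from where; a real system would also need timestamps, links to genome versions, and access control, none of which is modeled here.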

Additions post-session

We should try to look at evolution as a continuum, with population and trait biologists at one end looking at adaptation, variation in phenotype/genotype, underlying causes of phenotypic plasticity, resilience, speciation, etc., and then Evo/Devo types at the other end looking at the evolution of the plant body plan, organs, biochemistry, etc. (what I call innovations) throughout time without really thinking about the process of evolution as much as the consequences of evolution.  -J. Banks