MIRI

The aim of this part of the project is to explore gene order dynamics by inferring gene rearrangement scenarios between different species of symbionts, and by attempting to relate these scenarios to (1) other features of the symbionts' genomes, as well as their environment and their relation with the host, and (2) random scenarios generated under given models of randomness.

Gene order dynamics is modelled mathematically by distances between permutations of sets or multisets of gene identifiers, computed under a single operation or a set of operations. The principal operation is the reversal of a sequence of gene identifiers; the others include displacement, deletion and duplication.
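
As a small illustration of the reversal operation, the Python sketch below encodes a gene order as a tuple of signed integers (the sign standing for gene orientation), applies one reversal, and counts breakpoints with respect to the identity order. The encoding and the function names are assumptions made for this example; they are not taken from the project's own methods.

```python
def reversal(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip each gene's sign."""
    flipped = tuple(-g for g in reversed(perm[i:j + 1]))
    return perm[:i] + flipped + perm[j + 1:]


def breakpoints(perm):
    """Count adjacent pairs that are not consecutive, relative to the identity.

    The permutation is framed by 0 and n + 1 so that both ends are checked.
    """
    n = len(perm)
    framed = (0,) + tuple(perm) + (n + 1,)
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


# Hypothetical gene order: genes 2 and 3 are inverted as a block.
order = (1, -3, -2, 4)
restored = reversal(order, 1, 2)   # undo the inversion in one reversal
```

Here a single reversal restores the identity order, and the breakpoint count drops from 2 to 0. Breakpoints only give a bound on the number of reversals needed; they are not the reversal distance itself.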

Numerous algorithms exist for computing a rearrangement scenario between different gene orders. As far as we know, only three, including one by participants in this project, explore all possible optimal scenarios, and only for the reversal distance between permutations. There can be thousands or even millions of such scenarios, which would seem to make them useless for any biological interpretation. There is, however, structure in this huge set, and it has so far been explored very little, mainly because of the high complexity of the problem. How much biological information this structure holds has, to the best of our knowledge, never been analysed.
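
To give a feeling for how quickly the set of optimal scenarios grows, the following Python sketch counts all shortest reversal scenarios that sort a signed gene order into the identity. It is a brute-force breadth-first search over every signed permutation, written only for illustration: it is not one of the algorithms referred to above, the function names are hypothetical, and it is practical only for a handful of genes.

```python
from collections import deque


def signed_reversals(perm):
    """Yield every gene order reachable from perm by one signed reversal."""
    n = len(perm)
    for i in range(n):
        for j in range(i, n):
            flipped = tuple(-g for g in reversed(perm[i:j + 1]))
            yield perm[:i] + flipped + perm[j + 1:]


def count_optimal_scenarios(start):
    """Return (reversal distance, number of optimal scenarios) to the identity."""
    start = tuple(start)
    target = tuple(range(1, len(start) + 1))
    dist = {start: 0}    # shortest number of reversals from start
    ways = {start: 1}    # number of shortest scenarios reaching each order
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur == target:
            # All orders one step closer were already processed, so the
            # count for the target is complete at this point.
            return dist[cur], ways[cur]
        for nxt in signed_reversals(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                ways[nxt] = ways[cur]
                queue.append(nxt)
            elif dist[nxt] == dist[cur] + 1:
                ways[nxt] += ways[cur]
    return None


distance, scenarios = count_optimal_scenarios((2, 1))
```

Even for the two-gene order (2, 1), several distinct optimal scenarios already exist, which hints at why the full set becomes so large for real genomes.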

Evolution is in general studied by comparative approaches. In the case of networks, many criteria could be used. At one extreme is full network alignment, which leads to hard computational problems. At the other is the comparison of general network properties that ignore topology altogether and identify the networks to be compared with bags of proteins (or of enzymes and metabolites for metabolism). In between lie various structural indices such as clustering coefficient, betweenness centrality, average path length, diameter, concentration of subgraphs, etc. Other measures have also been used, such as the scope of compounds or its inverse problem. Which method is best for comparing two networks remains an open question, in particular when the purpose is to understand evolution and to elucidate the relation between genotype (the genetic constitution of an individual, including here the constituents of its biochemical networks) and phenotype (any observed characteristic of an organism).
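
Two of the simpler criteria above can be sketched in a few lines of Python: structural indices computed on an undirected network, and a topology-free comparison of two "bags" of labels. The adjacency-dictionary representation, the toy data and the choice of the Jaccard index for comparing bags are assumptions made for this illustration only.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:                      # breadth-first search from src
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def jaccard(bag_a, bag_b):
    """Topology-free similarity between two sets of node labels."""
    union = bag_a | bag_b
    return len(bag_a & bag_b) / len(union) if union else 1.0


# Toy undirected network: a triangle A-B-C with a pendant node D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
clustering = average_clustering(toy)
path_length = average_path_length(toy)
shared = jaccard({"e1", "e2", "e3"}, {"e2", "e3", "e4"})
```

Two networks can agree on such summary values while differing substantially in their labels or wiring, which is one reason the choice of comparison method is not settled.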

Whether we consider metabolic or protein-protein interaction (PPI) networks, we are dealing with node-labelled and sometimes edge-labelled graphs (or bipartite graphs or hypergraphs in the case of metabolism), so topology alone is not a fine enough measure when the main objective is to study evolution. The information carried by the labels, which are related to the genomic text, is also important. Whether this relation is independent of its context or, on the contrary, depends on it, that is, on the genes that lie nearby along the genome or in the cellular space, is another open issue.

At the sequence level

Co-cladogenesis (cladogenesis refers to the evolutionary process whereby one species splits into two or more species) and, more generally, the co-evolution of symbiont and host will be examined at the sequence level and, more speculatively, at the network level.

Among the questions concerning co-cladogenesis that will be addressed at the sequence level are: how often are symbionts horizontally transferred among branches of the host phylogenetic tree, how long do symbionts persist inside a given host after invading it, and finally, what processes underlie this dynamic gain/loss equilibrium?

Current methods for answering these questions are based on a phylogenetic approach in which an evolutionary tree is built for symbiont and host separately, and the two trees are then compared to determine whether they are identical (the two have indeed evolved together) or not. In the latter case, various events could have happened: the host has speciated (given rise to two new species) but not the symbiont (the "miss the boat" scenario); the symbiont has speciated but not the host, or has duplicated with one of the two copies going on to infect another host (the "host-switching" scenario); the symbiont has gone extinct; or both host and symbiont have speciated. Other events may further complicate the co-evolutionary history: a host may live with various symbionts ("multi-infection"), symbionts may exchange genetic material, further scrambling the evolutionary signal in either direction, etc. These latter events are, so far as we know, not taken into account by any existing method.
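
The first step of this approach, checking whether the two trees are identical, can be sketched in Python as follows. The sketch assumes rooted trees written as nested tuples of leaf names and exactly one symbiont per host; the names `clades`, `congruent` and `host_of` are hypothetical. It only detects incongruence and does not infer which of the events listed above took place.

```python
def clades(tree):
    """Return the set of clades (frozensets of leaf names) of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a leaf is a species name
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)                   # record the clade under this node
        return below

    leaves(tree)
    return found


def congruent(host_tree, symbiont_tree, host_of):
    """True if the symbiont tree mirrors the host tree under the association."""
    mapped = {
        frozenset(host_of[s] for s in clade)
        for clade in clades(symbiont_tree)
    }
    return mapped == clades(host_tree)


# Toy data: three hosts, each carrying one symbiont (assumed association).
hosts = (("H1", "H2"), "H3")
host_of = {"s1": "H1", "s2": "H2", "s3": "H3"}
same_history = congruent(hosts, (("s1", "s2"), "s3"), host_of)
different_history = congruent(hosts, (("s1", "s3"), "s2"), host_of)
```

In the second comparison the symbiont of H3 groups with that of H1, so the clade sets differ and the trees are reported as incongruent; deciding whether this reflects host switching, extinction or a reconstruction error requires the further analysis discussed in the text.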

Because of possible errors at various levels, for instance a failure to detect the presence of a symbiont or a mistake in the phylogenetic reconstruction, a difference between the two trees does not necessarily mean that host and symbiont evolved differently. Although some methods consider this possibility, the random models used for co-evolution scenarios could be improved. Here again, combinatorial approaches may serve as important guides.

At the network level

Rough as the measures for comparing two networks mentioned in the previous section may be for now, some could provide another window onto the co-evolution of symbionts and hosts. This is a far more open part of the project, as fewer data are available from which to build a sufficiently accurate picture of how the interface between symbiont and host networks evolves. Any new result in this area would therefore be of interest.

Mathematical Investigation of "Relations Intimes"


Copyright © 2005 | All Rights Reserved