Genomes Containing Duplicates Are Hard to Compare

Chauve, Cedric; Fertin, Guillaume; Rizzi, Romeo; Vialette, Stéphane

doi:10.1007/11758525_105

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3992)

Included in the following conference series:

International Conference on Computational Science

Abstract

In this paper, we are interested in the algorithmic complexity of computing (dis)similarity measures between two genomes when they contain duplicated genes. In that case, there are usually two main ways to compute a given (dis)similarity measure M between two genomes G₁ and G₂. The first model, which we will call the matching model, consists in computing a one-to-one correspondence between genes of G₁ and genes of G₂, in such a way that M is optimized in the resulting permutation. The second model, called the exemplar model, consists in keeping in G₁ (resp. G₂) exactly one copy of each gene, thus deleting all the other copies, in such a way that M is optimized in the resulting permutation. We present here different results concerning the algorithmic complexity of computing three different similarity measures (number of common intervals, MAD number and SAD number) in those two models, basically showing that the problem becomes NP-complete for each of them as soon as genomes contain duplicates. In the case of MAD and SAD, we actually prove that, under both models, both problems are APX-hard.
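To make the exemplar model concrete, here is a minimal brute-force sketch in Python. It is not the authors' method, and it rests on several assumptions that the abstract does not state: genes are unsigned, both genomes contain the same gene families, and a common interval is taken to be a set of at least two genes that occurs contiguously in both permutations (the paper's exact convention may differ). The function names are hypothetical. The search enumerates every way of keeping one copy per gene, so its running time is exponential in the number of duplicated genes, which is consistent with the hardness results above.

```python
from itertools import product


def count_common_intervals(p, q):
    """Count gene sets of size >= 2 that are contiguous in both permutations."""
    pos = {gene: i for i, gene in enumerate(q)}
    n = len(p)
    count = 0
    for i in range(n):
        lo = hi = pos[p[i]]
        for j in range(i + 1, n):
            lo = min(lo, pos[p[j]])
            hi = max(hi, pos[p[j]])
            # p[i..j] is contiguous in q exactly when its positions in q
            # span a window of the same length.
            if hi - lo == j - i:
                count += 1
    return count


def exemplarizations(genome):
    """Yield every sequence that keeps exactly one copy of each gene."""
    occurrences = {}
    for i, gene in enumerate(genome):
        occurrences.setdefault(gene, []).append(i)
    # Pick one occurrence index per gene family, then restore genome order.
    for choice in product(*occurrences.values()):
        yield tuple(genome[i] for i in sorted(choice))


def best_exemplar_common_intervals(g1, g2):
    """Maximum number of common intervals over all pairs of exemplarizations."""
    return max(
        count_common_intervals(p, q)
        for p in exemplarizations(g1)
        for q in exemplarizations(g2)
    )


if __name__ == "__main__":
    g1 = ["a", "b", "c", "a", "d"]  # gene "a" is duplicated
    g2 = ["a", "b", "c", "d"]
    print(best_exemplar_common_intervals(g1, g2))
```

In this toy input, keeping the first copy of "a" in g1 yields the same order as g2, whereas keeping the second copy yields (b, c, a, d), which shares fewer intervals with g2.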

Work partially supported by the French-Italian Galileo Project PAI 08484VH and the French-Québec 60th CPCFQ.

Author information

Authors and Affiliations

LaCIM, CGL, Département d’Informatique, Université du Québec à Montréal, CP 8888, Succ. Centre-Ville, H3C 3P8, Montréal, QC, Canada
Cedric Chauve
Laboratoire d’Informatique de Nantes-Atlantique (LINA), FRE CNRS 2729 Université de Nantes, 2 rue de la Houssinière, 44322, Nantes Cedex 3, France
Guillaume Fertin
Dipartimento di Matematica e Informatica, Università di Udine, Italy
Romeo Rizzi
Laboratoire de Recherche en Informatique (LRI), UMR CNRS 8623, Faculté des Sciences d’Orsay, Université Paris-Sud, 91405, Orsay, France
Stéphane Vialette

Editor information

Editors and Affiliations

Advanced Computing and Emerging Technologies Centre, The School of Systems Engineering, University of Reading, RG6 6AY, Reading, United Kingdom
Vassil N. Alexandrov
Department of Mathematics and Computer Science, University of Amsterdam, Kruislaan 403, 1098, SJ Amsterdam, The Netherlands
Geert Dick van Albada
Faculty of Sciences, Section of Computational Science, University of Amsterdam, Kruislaan 403, 1098, SJ Amsterdam, The Netherlands
Peter M. A. Sloot
Computer Science Department, University of Tennessee, TN 37996-3450, Knoxville, USA
Jack Dongarra

Cite this paper

Chauve, C., Fertin, G., Rizzi, R., Vialette, S. (2006). Genomes Containing Duplicates Are Hard to Compare. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758525_105

DOI: https://doi.org/10.1007/11758525_105
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34381-3
Online ISBN: 978-3-540-34382-0
eBook Packages: Computer Science, Computer Science (R0)
