Main

The 125-megabase (Mb) genome of T. melanosporum is the largest sequenced fungal genome to date6, but no evidence for whole-genome duplication or large scale dispersed segmental duplications was observed (Supplementary Table 1 and Supplementary Information section 2). The approximately fourfold larger size of the truffle genome compared with other sequenced ascomycetes is accounted for by multi-copy transposable elements (TE) which constitute about 58% of the assembled genome (Fig. 1, Supplementary Figs 5, 6 and 8, Supplementary Information section 3). Estimated insertion times suggest a major wave of retrotransposition at <5 million years ago (Supplementary Fig. 7). TEs are not uniformly spread across the genome, but are clustered in gene-poor or gene-lacking regions (Fig. 1 and Supplementary Fig. 8). The expansion of regions between blocks of protein-coding genes results from an increased density of TEs. The proliferation of TEs within the truffle genome may result from its low effective population size7 during postglaciation migrations8 (Supplementary Information section 2.5).

Figure 1: Genomic landscape of T. melanosporum.

a, The area chart quantifies the distribution of transposable elements (TE) and protein-coding genes (Gene models) along supercontig 5. The y axis represents the percentage of base pairs corresponding to TE (red), genes (blue), and other regions and gaps (white) in 10,000-bp sliding windows. b, Heat maps display the distribution of selected elements, including simple sequence repeats (SSR), gene models, all TE, long terminal repeat retrotransposons (class I LTR), long interspersed elements (class I LINE), terminal inverted repeats (class II TIR), and unknown TE classes (TE no cat.). Abundance of TE, protein-coding genes and other sequences is represented by a colour scale from 0 (white) to ≥9 occurrences (black) per 10 kbp window.
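The per-window percentages plotted in panel a can be illustrated with a short calculation. This is a minimal sketch, not the authors' pipeline: the function name `window_coverage`, the half-open 0-based interval convention, and the use of non-overlapping 10,000-bp windows (rather than true sliding windows with an unspecified step) are all assumptions made for illustration.

```python
def window_coverage(intervals, contig_length, window=10_000):
    """Return the percentage of bases covered by `intervals` in each window.

    `intervals` are half-open (start, end) pairs in 0-based coordinates
    (an assumed convention). Overlapping intervals are merged first so
    that no base is counted twice.
    """
    # Merge overlapping or adjacent intervals.
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    n_windows = (contig_length + window - 1) // window
    covered = [0] * n_windows
    for start, end in merged:
        end = min(end, contig_length)
        pos = start
        # Split each merged interval at window boundaries.
        while pos < end:
            idx = pos // window
            window_end = min((idx + 1) * window, end)
            covered[idx] += window_end - pos
            pos = window_end

    percents = []
    for idx, bases in enumerate(covered):
        # The last window may be shorter than `window`.
        size = min(window, contig_length - idx * window)
        percents.append(100.0 * bases / size)
    return percents


# Toy example: two overlapping TE annotations on a 25-kb contig.
te_percent = window_coverage([(0, 5_000), (4_000, 12_000)], 25_000)
```

Running the same function on gene-model intervals would give the second series of the area chart; the remainder in each window corresponds to other regions and gaps.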

The predicted proteome is in the lower range of sequenced filamentous fungi6, as only 7,496 protein-coding genes were identified (Supplementary Information section 4). They are mainly located in TE-poor regions and the gene density is heterogeneous when compared with that of other ascomycetes (Fig. 1, Supplementary Figs 8 and 9). Among the predicted proteins, only 3,970, 5,596 and 5,644 showed significant sequence similarity to proteins from Saccharomyces cerevisiae, Neurospora crassa and Aspergillus niger, respectively (Supplementary Fig. 10). This agrees with the predicted ancient separation (>450 Myr ago) of the Pezizomycetes from the other ancestral fungal lineages (Supplementary Fig. 4)9. Of the 5,650 T. melanosporum genes that have an orthologue, very few show conservation of neighbouring orthologues (synteny) in at least one of the other species (Supplementary Fig. 11, Supplementary Information section 5.2). The T. melanosporum genome shows a structural organization strikingly different from other sequenced ascomycetes; the largest syntenic region (with Coccidioides immitis) only contains 99 genes with 39 orthologues (Supplementary Fig. 12). TE proliferation probably facilitated genome rearrangements. Some regions of meso-synteny were however detected, suggesting that T. melanosporum could be used for assessing the genome organization of ancestral ascomycete clades.

Expression of most predicted genes was detected in free-living mycelium, ectomycorrhizal (ECM) root tips and/or fruiting body by custom oligoarrays, expressed-sequence-tag pyrosequencing and Illumina RNA-Seq (Supplementary Information sections 2.4 and 8, Supplementary Table 2, Supplementary Fig. 26). Only a low proportion of transcripts (7.6%) is differentially expressed (fold-ratio ≥4.0, P < 0.05) in either ectomycorrhiza or fruiting body by comparison to free-living mycelium (Table 1, Supplementary Table 4). Only 61 transcripts unique to ectomycorrhiza, fruiting body or free-living mycelium were detected (Supplementary Table 5). A few transcripts coding for a H-type lectin, an arabinogalactan protein, a LysM-domain containing protein, major facilitator superfamily (MFS) transporters, laccase/tyrosinase, a lipase and polysaccharide-degrading enzymes are strikingly enriched (>1,000-fold) in symbiotic tissues (Table 1). They may play a role in adhesion to host cells, detoxication of plant defence metabolites, nutrient exchange, and colonization of root apoplast through the deconstruction of cell walls.

Table 1 The most highly upregulated transcripts in T. melanosporum/Corylus avellana ECM root tips

A process that is crucial to the success of ECM interactions is the mutualistic exchange of nutrients between the microsymbiont and its host plant. A comparison with other fungi revealed that the total number of predicted transporters is lower in T. melanosporum (381 members) than in L. bicolor (491 members) and in saprotrophic and pathogenic ascomycetes (481–781 members) (Supplementary Table 26). However, 64 predicted membrane transporters showed upregulated expression in truffle ectomycorrhizas, suggesting increased fluxes of carbohydrates, oligopeptides, amino acids and polyamines at the symbiotic interface (Supplementary Table 27). PFAM classification of fungal genes induced in symbiotic tissues of either L. bicolor or T. melanosporum ECM root tips revealed strikingly divergent fungal symbiotic proteomes (Supplementary Fig. 15). However, the PFAM categories corresponding to the MFS transporters (PFAM00083), aquaporin-related major intrinsic proteins (PFAM00230) and amino acid permeases (PFAM00324) were among the most strongly overrepresented in genes that were transcriptionally upregulated in both L. bicolor and T. melanosporum ectomycorrhizas.

Orthologous genes (that is, reciprocal best hits, BLASTP e-value ≤10⁻⁵) significantly induced in the symbiosis represent only 1.5% and 4.1% of the ectomycorrhiza-upregulated genes in T. melanosporum and L. bicolor, respectively. Most of these rare transcripts code for membrane transporters involved in sugar, amino acid or sulphate uptake (Table 2). This transcriptome trait appears to be a hallmark of the mycorrhizal symbiosis. The resulting increased nutrient flux probably explains the beneficial effect of the symbionts on the growth of their host seedlings (Supplementary Information section 1 and Supplementary Fig. 3). Other overrepresented PFAM categories displayed different patterns in the two symbionts. None of the effector-like small secreted MiSSP proteins specifically expressed in L. bicolor ectomycorrhizas4 were detected among ectomycorrhiza-regulated T. melanosporum transcripts.
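The reciprocal-best-hit criterion used to define orthologues can be sketched as follows. This is an illustrative outline only: the tuple layout (query, subject, e-value, bit score), the tie-break on bit score, and the function names are assumptions, and the identifiers in the example are invented.

```python
def best_hits(hits, max_evalue=1e-5):
    """Map each query to its best subject among hits passing the e-value cutoff.

    `hits` is an iterable of (query, subject, evalue, bitscore) tuples.
    The best hit has the lowest e-value; ties are broken by the highest
    bit score (an assumed tie-break rule).
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue
        key = (evalue, -bitscore)
        if query not in best or key < best[query][0]:
            best[query] = (key, subject)
    return {query: subject for query, (_, subject) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-5):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b, max_evalue)
    best_ba = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical BLASTP results between two proteomes.
tm_vs_lb = [
    ("tm1", "lb1", 1e-50, 200.0),
    ("tm1", "lb2", 1e-20, 90.0),
    ("tm2", "lb2", 1e-30, 120.0),
    ("tm3", "lb3", 1e-3, 30.0),   # fails the e-value cutoff
]
lb_vs_tm = [
    ("lb1", "tm1", 1e-48, 195.0),
    ("lb2", "tm1", 1e-25, 100.0),
    ("lb2", "tm2", 1e-22, 95.0),
]
orthologues = reciprocal_best_hits(tm_vs_lb, lb_vs_tm)
```

In this toy input only the tm1/lb1 pair qualifies: tm2 finds lb2 as its best hit, but lb2 points back to tm1, so the relationship is not reciprocal.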

Table 2 Orthologous symbiosis upregulated genes of L. bicolor and T. melanosporum

One of the most striking characteristics of the T. melanosporum genome is the almost complete absence of highly similar gene pairs. Of the predicted 7,496 protein-coding genes, only seven pairs share >90% amino-acid identity in their coding sequence, whereas 30 pairs share >80% identity (Supplementary Information section 5.3, Fig. 2). This feature was also observed in the ascomycetous saprotroph N. crassa10. In striking contrast to the ECM L. bicolor4, multigene families in T. melanosporum are limited in number and comprise only 19% of the predicted proteome; most families have only two members (Supplementary Fig. 13). The rate of gene family gain is much lower than the rate of gene loss and among the 11,234 gene families found in ascomycetes, 5,695 appear to be missing in T. melanosporum (Supplementary Information section 5.4, Fig. 2). This compact gene coding space may reflect the genome organization of an ascomycete common ancestor, as the Pezizomycetes clade is the earliest diverging lineage within the Pezizomycotina (Supplementary Fig. 4). By comparison to other ascomycetes, gene families predicted to encode metabolite transporters (for example, amino acid and sugar permeases) and secondary metabolism enzymes (such as polyketide synthases and cytochrome P450s) are much smaller. Only 465 genes encoded by expanding gene families of L. bicolor are also found in the T. melanosporum genome (BLASTP, e-value ≤10⁻⁵) and 154 orthologues are shared between expanding gene families of both symbionts. None of them is differentially expressed in ectomycorrhizas. Differences in gene family expansion, in particular dynamic repertoires of genes encoding symbiosis-regulated effector-like proteins and sugar-cleaving enzymes (see below), are probably responsible for different symbiotic traits between T. melanosporum and L. bicolor, such as altered host specificity. The compact genome of T. melanosporum might be a product of selection for specialization; this is because genome expansion, as observed in L. bicolor, is probably driven by selection on the symbiont to exploit a diversity of encountered substrates provided by multiple potential hosts and by their diverse soils4,5.

Figure 2: Genome redundancy in the truffle genome.

a, The percentage of amino-acid identity of the top-scoring self-matches for protein-coding genes in T. melanosporum, Saccharomyces cerevisiae, Aspergillus nidulans, Neurospora crassa, Magnaporthe grisea, and Botrytis cinerea. For each fungus, the protein-coding regions for each gene were compared with those of every other gene in the same genome using BLASTX. b, The figure represents the total number of protein families in each species or node. The numerals on branches show numbers of expanded (left, red), unchanged (middle, black) or contracted (right, blue) protein families along lineages by comparison to the putative pan-proteome.

The volatiles released by truffles are attractive to rodents and truffle flies11, which disperse their spores, but also to humans, who consider this elusive mushroom a delicacy. T. melanosporum is the first sequenced fungus producing highly flavoured hypogeous fruiting bodies (Supplementary Information section 6.4, Fig. 3). Genomic signatures of the long-standing (>2,000-year-old) reputation of the black truffle as a gastronomic delicacy are its extremely low allergenic potential (Supplementary Fig. 18), coupled with the lack of key mycotoxin biosynthetic enzymes (Supplementary Information section 6.2, Supplementary Table 14), and the preferential overexpression of various flavour-related enzymes in the fruiting body (Supplementary Figs 19–21). Among the latter are specific subsets of sulphur assimilation and S-amino acid interconversion enzymes. These include cystathionine lyases known to promote the side-formation of methyl sulphide volatiles abundant in truffles12, as well as various enzymes involved in amino acid degradation through the Ehrlich pathway, which give rise to known truffle volatiles and flavours, for example, 2-methyl-1-butanal (Fig. 3, Supplementary Information section 6.4, Supplementary Figs 20 and 21). Also notable, given the subterranean habitat of this fungus, is the presence of various putative light-sensing components (Supplementary Information section 6.6), which might be involved in light avoidance mechanisms and/or in the control of seasonal developmental variations, especially those related to fruiting body formation and sexual reproduction.

Figure 3: Outline of sulphur metabolism in T. melanosporum fruiting body.

Numbers identify enzymes and gene models as specified in Supplementary Table 15. Reactions (arrows) catalysed by enzymes whose mRNAs are upregulated in fruiting bodies are shown in red; mRNAs downregulated by at least twofold in fruiting bodies, or whose expression levels differ by less than twofold compared to free-living mycelia and ectomycorrhizas are represented by green and black arrows, respectively. Off-pathway cystathionine-γ-lyase (no. 22)- and cystathionine β-lyase (no. 20)-supported reactions, and spontaneous (non-enzymatic) breakdown reactions are indicated by grey and dashed arrows, respectively. APS, adenosine phosphosulphate; PAPS, phosphoadenosine phosphosulphate; PAP, phosphoadenosine phosphate; Trx, thioredoxin; SAM, S-adenosylmethionine; SAH, S-adenosylhomocysteine; DMS, dimethylsulphide; DMDS, dimethyl-disulphide; DMTS, dimethyl-trisulphide.

The analysis of genes implicated in the mating process, including pheromone response, meiosis and fruiting body development showed that most sex-related components identified in other ascomycetes are also present in T. melanosporum (Supplementary Table 11). Sexual reproduction in ascomycete filamentous fungi is partly controlled by two different mating-type (MAT) genes that establish sexual compatibility13: one MAT gene codes for a protein with an α-box domain, whereas the other encodes a high mobility group (HMG) DNA binding protein (Supplementary Information section 6.5). It was widely believed that T. melanosporum was a homothallic or even an exclusively selfing species14. The sequenced Mel28 strain contains the HMG locus, and the opposite linked MATα locus was identified in another natural isolate (Supplementary Fig. 22), confirming recent hints that T. melanosporum is heterothallic and thus an obligate outcrossing species15. This result has major implications for truffle cultivation, which will be improved by the use of host plants harbouring truffle strains of opposite mating types. In most ascomycetes, the genomic regions flanking the MAT locus show an extended conservation13, but there is no synteny of the MAT loci between T. melanosporum and other sequenced fungi (Supplementary Fig. 23).

To determine whether T. melanosporum sugar-cleaving capabilities resemble those of other fungi, we have undertaken a comparison of the glycoside hydrolase (GH) and polysaccharide lyase (PL) repertoires16 of 18 completely sequenced fungi (Fig. 4). As expected for a symbiotic fungus living in the root apoplast, T. melanosporum has a relatively small number of GH-encoding genes (91 members; Supplementary Tables 23 and 24); much fewer than phytopathogens (for example, Magnaporthe grisea, Fusarium graminearum) and saprotrophs (for example, N. crassa, Podospora anserina). The T. melanosporum GH repertoire bears some similarity with that of L. bicolor4, especially a reduced spectrum of enzymes targeting the plant cell wall compared to saprobes, culminating in both fungi with the absence of cellulases from families GH6 and GH7. There are however significant differences in the spectrum of enzymes present in these two symbiotic fungi. For instance, T. melanosporum has hemicellulases from families GH10 and GH43, whereas L. bicolor has none. Similarly, T. melanosporum has a family GH45 cellulase that is absent from the L. bicolor genome. Other differences include different strategies to cleave pectin: whereas L. bicolor utilizes six hydrolytic GH28 pectinases, T. melanosporum has only two, but these are complemented by three pectin lyases and a pectin methylesterase that are missing in L. bicolor. Both fungi have a set of proteins, few in number, bearing cellulose-binding domains, but differences appear here too: the single cellulose-binding CBM1 motif of L. bicolor is appended to a GH5 endoglucanase, whereas T. melanosporum has two CBM1 motifs attached to a GH61 enzyme and to a protein of unknown function. GH61 enzymes have been reported to display weak cellulolytic activity17.

Figure 4: Double clustering of the carbohydrate-cleaving families from representative fungal genomes.

Top tree: the fungi named are Aspergillus nidulans (A_nidu), Aspergillus niger (A_nige), Aspergillus oryzae (A_oryz), Cryptococcus neoformans (C_neof), Gibberella zeae (G_zeae), Hypocrea jecorina (H_jeco), Laccaria bicolor (L_bico), Magnaporthe grisea (M_gris), Malassezia globosa (M_glob), Nectria haematococca (N_haem), Neurospora crassa (N_cras), Penicillium chrysogenum (Pe_chr), Phanerochaete chrysosporium (Ph_chr), Podospora anserina (P_anse), Postia placenta (P_plac), Saccharomyces cerevisiae (S_cere), Schizosaccharomyces pombe (S_pomb), and Tuber melanosporum (T_mela). Left tree: the enzyme families are represented by their class (GH, glycoside hydrolase; PL, polysaccharide lyase) and family number according to the carbohydrate-active enzyme database16. Right side: known substrate of CAZy families (most common forms in brackets): BPG, bacterial peptidoglycan; BEPS, bacterial exopolysaccharides; CW, cell wall; ESR, energy storage and recovery; FCW, fungal cell wall; PCW, plant cell wall; PG, protein glycosylation; U, undetermined; a-gluc, α-glucans (including starch/glycogen); b-glyc, β-glycans; b-1,3-gluc, β-1,3-glucan; cell, cellulose; chit, chitin/chitosan; dext, dextran; hemi, hemicelluloses; inul, inulin; N-glyc, N-glycans; N-/O-glyc, N- / O-glycans; pect, pectin; sucr, sucrose; and tre, trehalose. Abundance of the different enzymes within a family is represented by a colour scale from 0 (black) to ≥20 occurrences (red) per species.

An extended comparison with other sequenced fungi (Fig. 4) shows that T. melanosporum clusters neither with L. bicolor nor with saprotrophic ascomycetes, most probably because of its limited overall number of GHs and PLs that make it closer to yeasts and fungi that do not interact with plants, but rather with animals (Cryptococcus neoformans, Malassezia globosa). Differences between the enzyme repertoires of T. melanosporum and L. bicolor suggest differences in the mode of interaction of the two symbionts with their respective host plants. A striking difference is the presence of an invertase gene in T. melanosporum, whereas L. bicolor has none and is therefore completely dependent on its host for its provision of glucose5. In contrast, T. melanosporum could access and hydrolyse the plant-derived sucrose. This would suggest that although both fungi develop symbiotic relationships with plants, T. melanosporum is probably less dependent than L. bicolor. The overall pattern of induction of genes coding for enzymes acting on polysaccharides is similar in both L. bicolor and T. melanosporum symbiotic transcriptomes, although a larger number of carbohydrate-cleaving enzyme transcripts are upregulated for some families—for example, GH16 (β-1,6-glucanases), GH18 (chitinases) and GT20 (α,α-trehalose-phosphate synthase) in L. bicolor (Supplementary Table 25 and Supplementary Fig. 24). Intriguingly, a GH5 cellulase and a GH28 pectinase are among the rare transcripts that are highly upregulated in both L. bicolor and T. melanosporum ectomycorrhizas, suggesting that they play a key role in the symbiosis. On the other hand, the β-glucan synthesis-associated protein present in both ectomycorrhizas is involved in fungal cell wall remodelling16 and may play a role in the alteration of cell wall surface during infection to conceal the hyphae from the host.

The ability to establish ECM symbioses is a widespread characteristic of various ascomycetes and basidiomycetes3. The truffle genome reveals features of an ancestral fungal lineage that diverged from other lineages >450 Myr ago9. Despite their similar symbiotic structures and similar beneficial effects on plant growth, the ascomycete T. melanosporum and the basidiomycete L. bicolor encode strikingly different proteomes—compact with very few multigene families, versus large with many expanded multigene families—and symbiosis-regulated genes. Effector-like proteins, such as the L. bicolor ECM-induced SSP MiSSP7 (ref. 4), are not expressed in T. melanosporum ectomycorrhizas. On the basis of our results, the ECM symbiosis appears as an ancient innovation that developed several times during the course of Mycota evolution using different ‘molecular toolkits’18. Sequencing of the T. melanosporum genome has provided unprecedented insights into the molecular bases of symbiosis, sex and fruiting in a most popular representative of the only lifestyle not yet addressed by Ascomycota genomics19. This sequencing will be a major step in moving truffle research into the realm of ecosystem science, and a deeper understanding of the genome of the Périgord black truffle is expected to have substantial social and cultural impact.

Methods Summary

A whole-genome shotgun strategy was adopted for sequencing and assembling the T. melanosporum genome (Supplementary Information section 2). All genomic DNA was obtained from the homokaryotic haploid strain Mel28. All data were generated by paired-end sequencing of cloned 3 kb and 10 kb inserts using Sanger technology. The pool of data available for the assembly consisted of 1,262,177 reads, with 1,250 Mb of sequence. The data were assembled using the ARACHNE assembler. The 4,464 contigs (N50 = 62 kb) were assembled into 398 supercontigs (N50 = 638 kb) corresponding to 124.946 Mb of sequence. The main genome scaffolds were sequenced to a depth of about 10×. Assemblies and annotations are available at INRA (http://mycor.nancy.inra.fr/IMGC/TuberGenome/) and Genoscope (http://www.genoscope.cns.fr/tuber).
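The N50 statistic and the sequencing depth quoted above follow from simple arithmetic. The sketch below uses the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together hold at least half of the assembled bases) and estimates mean depth as total read bases divided by assembly size. The function names and the toy contig lengths are illustrative assumptions; only the 1,250 Mb and 124.946 Mb figures come from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop once the longest sequences hold at least half of all bases.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def mean_depth(total_read_bases, assembly_bases):
    """Approximate mean sequencing depth as read bases per assembled base."""
    return total_read_bases / assembly_bases


# Toy contig lengths (in kb) to show how N50 is read off.
toy_n50 = n50([80, 70, 50, 40, 30, 20])

# Figures from the text: 1,250 Mb of reads over a 124.946-Mb assembly.
depth = mean_depth(1_250e6, 124.946e6)
```

With the figures reported here, the estimated depth is very close to tenfold, which is consistent with the stated depth of the main genome scaffolds.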

The GAZE pipeline selected a best representative gene model for each locus on the basis of expressed-sequence-tag support and similarity to known proteins from other organisms, and predicted 7,496 protein-coding gene models (Supplementary Information section 4). All predicted genes were annotated using Gene Ontology and KEGG pathways. Protein domains were predicted using InterProScan. Gene families were built from proteins using Tribe-MCL.

Single dye labelling of cDNAs, hybridization procedures, data acquisition, background correction and normalization of custom-exon expression arrays were performed at the NimbleGen facilities following their standard protocol. A Student's t-test with false discovery rate correction was applied to the data using the ARRAYSTAR software (DNASTAR). Transcripts with a significant P value (<0.05) and ≥4-fold change in transcript level were considered differentially expressed in ECM root tips or fruiting body.
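The selection rule in the preceding paragraph can be sketched in a few lines. This is not the ARRAYSTAR implementation: the text does not name the false discovery rate procedure, so the Benjamini-Hochberg adjustment used below is an assumption, as are the function names, the record layout, the treatment of the fold change as symmetric (up or down), and the requirement that expression levels be positive.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(records, min_fold=4.0, alpha=0.05):
    """Select transcript ids passing both the fold-change and p-value cutoffs.

    `records` holds (transcript_id, tissue_level, mycelium_level, p_value)
    tuples with positive, already normalized expression levels (assumed).
    """
    adjusted = benjamini_hochberg([record[3] for record in records])
    selected = []
    for (transcript_id, tissue, mycelium, _), q in zip(records, adjusted):
        # Symmetric fold change: counts both up- and downregulation.
        fold = max(tissue, mycelium) / min(tissue, mycelium)
        if fold >= min_fold and q < alpha:
            selected.append(transcript_id)
    return selected


# Hypothetical transcripts: (id, ECM level, mycelium level, raw p-value).
example = [
    ("t1", 800.0, 100.0, 0.01),
    ("t2", 300.0, 100.0, 0.04),
    ("t3", 50.0, 400.0, 0.03),
    ("t4", 900.0, 100.0, 0.20),
]
hits = differentially_expressed(example)
```

In this toy input only `t1` is retained: `t2` changes less than fourfold, while `t3` and `t4` change at least fourfold but do not remain below 0.05 after adjustment.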