Abstract

Pollen allergies have long been a major pandemic health problem for humans. However, the evolutionary events and biological function of pollen allergens in plants remain largely unknown. Here, we report the genome-wide prediction of pollen allergens and their biological function in the dicotyledonous model plant Arabidopsis (Arabidopsis thaliana) and the monocotyledonous model plant rice (Oryza sativa). In total, 145 and 107 pollen allergens were predicted from rice and Arabidopsis, respectively. These pollen allergens are putatively involved in stress responses and metabolic processes such as cell wall metabolism during pollen development. Interestingly, these putative pollen allergen genes were derived from large gene families and became diversified during evolution. Sequence analysis across 25 plant species from green alga to angiosperms suggests that about 40% of putative pollen allergenic proteins existed in both lower and higher plants, while other allergens emerged during evolution. Although a high proportion of gene duplication has been observed among allergen-coding genes, our data show that these genes might have undergone purifying selection during evolution. We also observed that epitopes of an allergen might have a biological function, as revealed by comprehensive analysis of two known allergens, expansin and profilin. This implies a crucial role of conserved amino acid residues in both in planta biological function and allergenicity. Finally, a model explaining how pollen allergens were generated and maintained in plants is proposed. Prediction and systematic analysis of pollen allergens in model plants suggest that pollen allergens evolved by gene duplication followed by functional specification. This study provides insight into the phylogenetic and evolutionary scenario of pollen allergens that will be helpful for future characterization and epitope screening of pollen allergens.

During the past four decades, allergic diseases have become a pandemic health problem. In general, pollen allergens are considered a major risk factor for both seasonal allergic rhinitis and asthma, and studies have shown that more than 50% of patients with perennial allergic rhinitis are sensitized to pollen allergens. The sensitization rate to pollen is up to 30%, and the number of people affected by pollen allergy is increasing worldwide (D’Amato et al., 2007; Pawankar et al., 2013). Unfortunately, pollen allergens are difficult to avoid because of the extremely small size and high prevalence of pollen, and this may contribute to pollen-food and pollen-fruit syndromes through cross-reactivity (Vieths et al., 2002).

Pollen from trees, grasses, and weeds has been found to elicit allergic reactions in atopic individuals (Emberlin, 2009). To date, 11 groups of grass pollen allergens with the ability to elicit a specific IgE response in atopic individuals have been identified (Hrabina et al., 2008), whereas other studies have focused mainly on pollen allergens from weeds and trees (Gadermaier et al., 2004; Mothes and Valenta, 2004). Those studies also suggested that these pollen allergens belong to only a few protein families, such as expansins, profilins, and calcium-binding proteins. Profilins are conserved in plants and act as pan-allergens capable of inducing allergic reactions in various species (Valenta et al., 1992; Radauer and Breiteneder, 2006). Biologically, many pollen allergenic proteins are thought to play important physiological roles in pollen, especially in the pollination process (Songnuan, 2013).

Pollen is the microgametophyte of seed plants that produces the male gametes (sperm cells) for subsequent sexual reproduction. The pollen protoplasm is surrounded by a specialized cell wall, the pollen wall, in which the inner pollen wall (also called the intine) is typically a thin multilayer composed of cellulose and pectin. In contrast, the exine refers to the very resistant outer wall that provides robust protection of the pollen grain from disintegration (Shi et al., 2015). Allergenic proteins are usually located within the pollen protoplast and readily released during the rehydration process (Grote, 1999). For example, birch (Betula spp.) pollen allergens Bet v 1 and Bet v 2 (profilin) are located within the pollen cytoplasm in the anhydrous state, in close proximity to ribosome-rich areas. Upon rehydration, birch pollen allergens are released within minutes from apertures and subsequently found on the entire pollen surface (Grote et al., 1993).

Over the past few decades, increasing information about allergens, together with the advancement of bioinformatics tools, has enabled scientists to predict and compare allergens from different sources (FAO/WHO, 2003; Stadler and Stadler, 2003; Saha and Raghava, 2006; Soeria-Atmadja et al., 2006; Wang et al., 2013c). These advances provided the prerequisites for comparative and molecular evolution analyses of pollen allergens. Radauer and Breiteneder (2007) first introduced the evolutionary scope of the origin of plant allergens and proposed two scenarios for allergen evolution. One was that allergenicity could be an intrinsic property of the ancestral members of certain protein families that is still present in present-day allergens, and the other was that allergenicity emerged randomly in certain proteins and was inherited by their descendants. Recently, an analysis of the evolution of major allergen gene families in peanut (Arachis hypogaea) revealed lineage-specific expansion and loss of allergenic genes (Ratnaparkhe et al., 2014). However, little information on the origin and evolution of pollen allergens has been reported.

In this study, we performed genome-wide analysis of potential pollen allergens in two well-studied model plants, the dicot Arabidopsis (Arabidopsis thaliana) and the monocot rice (Oryza sativa ssp. japonica), as well as their homologs in 25 species ranging from basal green alga to angiosperms. While some pollen allergens seemed to be derived from the duplication and diversification of large gene families from lower to higher plants, other allergens seemed to be recently evolved. Importantly, these genes seemed to have undergone purifying selection during evolution, implying that allergenic motifs are associated with the biological function of the allergens. A model is also proposed to explain how plants produced and maintained pollen allergens. This phylogenetic and evolutionary insight into pollen allergens will be useful in future characterization and epitope screening of pollen allergens and in the medical prevention of pollen allergy.

RESULTS

Prediction and Classification of Pollen Allergens in Arabidopsis and Rice

To identify putative pollen allergens from Arabidopsis and rice, we analyzed 186 and 261 candidate allergenic proteins by comparing the proteomic data of mature pollen from rice and Arabidopsis, respectively (Holmes-Davis et al., 2005; Noir et al., 2005; Dai et al., 2006; Sheoran et al., 2006). Next, using the combination of two methods for allergen prediction (PREAL [Wang et al., 2013c] and a sequence-based approach [FAO/WHO, 2003]), a total of 20 and 31 candidate proteins were identified as allergen proteins from rice and Arabidopsis, respectively. Furthermore, by analyzing transcriptomic data from mature pollen, 140 rice proteins and 94 Arabidopsis proteins were identified as putative allergens (Qin et al., 2009; Wei et al., 2010; Fig. 1, A and B). Together, we obtained 145 and 107 putative pollen allergens from rice and Arabidopsis, respectively (Fig. 1, A and B; Table I). Among the 145 rice candidates, five proteins were present only in the proteomic data, 15 in both the proteomic and transcriptomic data, and the remaining 125 only in the transcriptomic data. Similarly, of the 107 putative pollen allergens in Arabidopsis, 13 proteins were identified only from the proteomic data, 18 from both the proteomic and transcriptomic data, and the remaining 76 only from the transcriptomic data. The observation that most putative allergens were predicted from transcriptomic data sets can be explained by the relatively low sensitivity of proteomic analysis.
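The counts reported above can be cross-checked with simple set arithmetic. The following is a minimal sketch, not part of the original analysis: the class name `EvidenceCounts` and its fields are hypothetical, and the only inputs are the proteome-only, shared, and transcriptome-only counts stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceCounts:
    """Putative allergen counts by supporting data set (hypothetical helper)."""
    proteome_only: int       # predicted only from proteomic data
    both: int                # predicted from proteomic and transcriptomic data
    transcriptome_only: int  # predicted only from transcriptomic data

    @property
    def proteome_total(self) -> int:
        return self.proteome_only + self.both

    @property
    def transcriptome_total(self) -> int:
        return self.transcriptome_only + self.both

    @property
    def union_total(self) -> int:
        # Inclusion-exclusion: |P or T| = |P| + |T| - |P and T|
        return self.proteome_total + self.transcriptome_total - self.both


# Counts taken from the paragraph above.
rice = EvidenceCounts(proteome_only=5, both=15, transcriptome_only=125)
arabidopsis = EvidenceCounts(proteome_only=13, both=18, transcriptome_only=76)

for name, counts in (("rice", rice), ("Arabidopsis", arabidopsis)):
    print(name, counts.proteome_total, counts.transcriptome_total, counts.union_total)
```

Under these inputs, the per-source totals and the combined totals agree with the figures given in the text (20, 140, and 145 for rice; 31, 94, and 107 for Arabidopsis).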

Figure 1.

Genome-wide identification and expression pattern analysis of pollen allergen genes in rice and Arabidopsis. A and B, Prediction of potential pollen allergens from both proteome and transcriptome data. Totals of 145 and 107 putative pollen allergens were predicted in rice and Arabidopsis, respectively. Among these, the 15 rice and 16 Arabidopsis potential pollen allergens that have already been described are shown. C, Expression patterns of 143 putative pollen allergens (two putative allergens in rice, LOC_Os06g45180 and LOC_Os03g01630, have no matched Affymetrix probe identifier) in 18 tissues/developmental stages of rice (a, anther An1; b, anther Mei1; c, anther M1; d, anther M2; e, anther M3; f, anther P1; g, anther P2; h, anther P3; i, inflorescence P1; j, inflorescence P2; k, inflorescence P3; l, inflorescence P4; m, inflorescence P5; n, inflorescence P6; o, seed; p, root; q, shoot; and r, mature leaf). Red labels represent putative allergens expressed specifically in pollen tissue, while green cluster labels represent ubiquitously expressed putative allergens. D, Expression patterns of 107 putative pollen allergens in 13 tissues/developmental stages of Arabidopsis (a′, uninucleate microspore; b′, bicellular pollen; c′, tricellular pollen; d′, mature pollen; e′ to h′, flower stages 9, 10/11, 12, and 15; i′, flower; j′, seed; k′, root; l′, vegetative shoot apex; and m′, leaf). Red labels represent putative allergens expressed specifically in pollen tissue, while green cluster labels represent ubiquitously expressed putative allergens.

Table I.
Gene information and family classification of putative pollen allergens
| Gene Identifier | Gene Family | InterPro | Specific Expression^a | Transcriptome | Proteome |
|---|---|---|---|---|---|
| Arabidopsis | | | | | |
| AT1G23800 | Aldehyde dehydrogenase | IPR015590 | | T | |
| AT4G25780 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT3G09590 | Allergen V5/Tpx-1 related | IPR001283 | | T | |
| AT1G01310 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT1G66400 | EF hand | IPR002048 | | | P |
| AT5G17480 | EF hand | IPR002048 | S | T | |
| AT3G03430 | EF hand | IPR002048 | S | T | |
| AT4G03290 | EF hand | IPR002048 | S | T | |
| AT1G77840 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| AT2G36530 | Enolase | IPR000941 | | | P |
| AT5G57320 | Gelsolin | IPR007122 | S | T | |
| AT2G41740 | Gelsolin | IPR007122 | | | P |
| AT4G27120 | Gene family candidate, 0016872 | / | | T | |
| AT2G05620 | Gene family candidate, 0020947 | / | | T | |
| AT5G18310 | Gene family candidate, 0025808 | / | | T | |
| AT3G14040 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT3G07850 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT5G48140 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G60590 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT1G02790 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT1G48100 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT3G07820 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07840 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07830 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G55120 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G62660 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT3G52600 | Glycoside hydrolase, family 32 | IPR001362 | S | T | |
| AT2G36190 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G12240 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G09080 | Heat shock protein70 | IPR013126 | S | | P |
| AT3G09440 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02490 | Heat shock protein70 | IPR013126 | | | P |
| AT5G09590 | Heat shock protein70 | IPR013126 | | | P |
| AT5G42020 | Heat shock protein70 | IPR013126 | | | P |
| AT5G28540 | Heat shock protein70 | IPR013126 | | | P |
| AT4G37910 | Heat shock protein70 | IPR013126 | | T | |
| AT3G12580 | Heat shock protein70 | IPR013126 | | T | |
| AT1G11660 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02500 | Heat shock protein70 | IPR013126 | | | P |
| AT5G03030 | Heat shock protein DnaJ, N terminal | IPR001623 | | T | |
| AT4G24190 | Heat shock protein90 | IPR001404 | | T | |
| AT5G56000 | Heat shock protein90 | IPR001404 | | T | |
| AT5G03380 | Heavy metal transport/detoxification protein | IPR006121 | | T | |
| AT3G15020 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G47520 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT1G53240 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G10920 | Manganese and iron superoxide dismutase | IPR001189 | | | P |
| AT4G08580 | Microfibrillar-associated1, C terminal | IPR009730 | | T | |
| AT3G63140 | NAD-dependent epimerase/dehydratase | IPR001509 | | T | |
| AT3G04500 | Nucleotide-binding, α-β plait | IPR012677 | | T | |
| AT1G14420 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT5G15110 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT3G24670 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G07010 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G01270 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT2G02720 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT1G69940 | Pectinesterase, catalytic | IPR000070 | S | | P |
| AT5G07430 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G07420 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT3G06830 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G43060 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G09850 | Peptidase C1A, papain | IPR013128 | | T | |
| AT4G39090 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G47128 | Peptidase C1A, papain | IPR013128 | | T | |
| AT3G63470 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G52000 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G05850 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G12480 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G45010 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G10410 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT4G00230 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT3G14067 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G26330 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G34870 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | T | |
| AT2G16600 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT4G38740 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT2G21130 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT3G60570 | Pollen allergen, N terminal | IPR007117 | S | T | |
| AT2G39700 | Pollen allergen, N terminal | IPR007117 | | T | |
| AT1G29140 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| AT4G08685 | Pollen Ole e 1 allergen and extensin | IPR006041 | | T | |
| AT2G19760 | Profilin, plant | IPR005455 | | | P |
| AT4G29340 | Profilin, plant | IPR005455 | S | | P |
| AT2G19770 | Profilin, plant | IPR005455 | S | | P |
| AT3G44590 | Ribosomal protein 60S | IPR001813 | | T | |
| AT1G01100 | Ribosomal protein 60S | IPR001813 | | T | |
| AT4G00810 | Ribosomal protein 60S | IPR001813 | | T | |
| AT2G27710 | Ribosomal protein 60S | IPR001813 | | | P |
| AT1G74000 | Strictosidine synthase | IPR004141 | S | T | |
| AT2G28190 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| AT1G08830 | Superoxide dismutase, copper/zinc binding | IPR001424 | | | P |
| AT1G75040 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G75050 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G45145 | Thioredoxin fold | IPR012335 | | T | |
| AT3G08710 | Thioredoxin fold | IPR012335 | | T | |
| AT5G42980 | Thioredoxin fold | IPR012335 | | T | |
| AT1G21750 | Thioredoxin like | IPR017936 | | | P |
| AT2G47470 | Thioredoxin like | IPR017936 | | | P |
| AT1G77510 | Thioredoxin like | IPR017936 | | | P |
| AT5G19510 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | | P |
| AT1G68185 | Ubiquitin | IPR000626 | | T | |
| AT1G80040 | Ubiquitin system component Cue | IPR003892 | | T | |
| AT2G39900 | Zinc finger, LIM type | IPR001781 | | T | |
| AT3G55770 | Zinc finger, LIM type | IPR001781 | | T | |
| AT5G64920 | Zinc finger, RING type | IPR001841 | | T | |
| AT3G54680 | / | / | | T | |
| AT1G30750 | / | / | | T | |
| Rice | | | | | |
| LOC_Os03g18850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os12g36850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os07g10570 | Bifunctional inhibitor/plant lipid transfer protein/seed storage | IPR016140 | | T | |
| LOC_Os01g55690 | Cupin1 | IPR006045 | S | T | |
| LOC_Os10g26060 | Cupin1 | IPR006045 | | T | |
| LOC_Os06g36010 | Cupredoxin | IPR008972 | S | T | |
| LOC_Os01g08410 | Cyclin like | IPR005814 | | T | |
| LOC_Os05g27730 | DNA-binding WRKY | IPR003657 | | T | |
| LOC_Os08g44660 | EF hand | IPR002048 | S | T | |
| LOC_Os09g24580 | EF hand | IPR002048 | | T | |
| LOC_Os06g48350 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os09g15770 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os10g08550 | Enolase | IPR000941 | | T | P |
| LOC_Os03g14450 | Enolase | IPR000941 | | T | |
| LOC_Os09g20820 | Enolase | IPR000941 | | T | |
| LOC_Os04g25150 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os04g25160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os04g25190 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45200 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g44470 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os06g45160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45180 | Expansin, cellulose-binding-like domain | IPR007117 | | T | P |
| LOC_Os10g40090 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01610 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os12g36040 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01640 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os01g60770 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os02g51040 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os03g01630 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | P |
| LOC_Os01g47780 | FAS1 domain | IPR000782 | | T | |
| LOC_Os01g57570 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os05g42190 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os04g51440 | Gelsolin | IPR007122 | S | T | |
| LOC_Os03g24220 | Gelsolin | IPR007122 | | T | |
| LOC_Os06g44890 | Gelsolin | IPR007122 | | T | |
| LOC_Os01g46080 | Gene family candidate, 0005331 | IPR001087 | | T | |
| LOC_Os05g32190 | Gene family candidate, 0014938 | / | | T | |
| LOC_Os09g12620 | Gene family candidate, 0015182 | / | S | T | |
| LOC_Os06g21120 | Gene family candidate, 0015338 | / | | T | |
| LOC_Os08g09180 | Gene family candidate, 0018686 | / | | T | |
| LOC_Os03g17140 | Gene family candidate, 0020281 | / | | T | |
| LOC_Os02g38490 | Gene family candidate, 0021210 | / | | T | |
| LOC_Os08g13980 | Glycoside hydrolase, family 16 | IPR013320 | | T | |
| LOC_Os06g39060 | Glycoside hydrolase, family 17 | IPR000490 | S | T | |
| LOC_Os05g41610 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os01g71810 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os09g36280 | Glycoside hydrolase, family 17 | IPR000490 | | T | P |
| LOC_Os04g41680 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g39330 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os08g41100 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g10300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os01g33300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os06g35320 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os08g23790 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os06g40890 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os05g46510 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os03g61800 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os02g01590 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os04g45290 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os11g08440 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16920 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16880 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os11g47760 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g16860 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g02410 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g53420 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g02260 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g60620 | Heat shock protein70 | IPR001023 | | T | P |
| LOC_Os09g31486 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os05g38530 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os04g01740 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os08g39140 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g30412 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g29840 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os01g51140 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os08g04390 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os10g25420 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os03g25040 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os05g49880 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g46070 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os08g33720 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g61380 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os03g56280 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os05g25850 | Manganese and iron superoxide dismutase | IPR001189 | | T | P |
| LOC_Os06g43710 | Man-6-P receptor, binding | IPR009011 | | T | |
| LOC_Os07g06590 | MD-2-related lipid recognition | IPR003172 | | T | |
| LOC_Os06g27760 | Met sulfoxide reductase B | IPR002579 | | T | |
| LOC_Os09g28060 | MORN motif | IPR003409 | | T | |
| LOC_Os06g05660 | Nucleosome assembly protein | IPR002164 | | T | |
| LOC_Os06g05209 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os02g12300 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os06g38510 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os11g45730 | Pectinesterase, catalytic | IPR012334 | S | T | |
| LOC_Os02g43010 | Peptidase C13, legumain | IPR001096 | | T | |
| LOC_Os04g24600 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os02g27030 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os05g01810 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os01g73980 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os11g14900 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os03g09190 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| LOC_Os04g47150 | Peptidase S8, subtilisin related | IPR015500 | S | T | P |
| LOC_Os02g44590 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| LOC_Os06g49480 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os09g39780 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g01270 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os02g02890 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g40010 | Plant lipid transfer protein | IPR000528 | | T | |
| LOC_Os09g39950 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | | P |
| LOC_Os06g36240 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| LOC_Os10g17660 | Profilin, plant | IPR002097 | S | T | |
| LOC_Os06g05880 | Profilin, plant | IPR002097 | | T | |
| LOC_Os02g55710 | Proteasome maturation factor UMP1 | IPR008012 | | T | |
| LOC_Os04g57810 | Protein of unknown function DUF689 | IPR007785 | | T | |
| LOC_Os01g72490 | Protein of unknown function DUF702 | IPR007818 | | T | |
| LOC_Os02g05630 | Protein phosphatase 2C, N terminal | IPR015655 | | T | |
| LOC_Os01g16430 | Proteinase inhibitor I25, cystatin | IPR000010 | | T | |
| LOC_Os01g25540 | Rapid alkalinization factor | IPR008801 | | T | |
| LOC_Os06g48780 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g13080 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os05g37330 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os08g02340 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g09510 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os02g32760 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os03g22810 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | P |
| LOC_Os07g46990 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os03g11960 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os11g36340 | Targeting for Xklp2 | IPR009675 | | T | |
| LOC_Os01g24090 | Tetratricopeptide region | IPR013026 | | T | |
| LOC_Os04g44830 | Thioredoxin, core | IPR013766 | | T | |
| LOC_Os05g06430 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os09g27830 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os01g23740 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os06g42000 | Thioredoxin-like fold | IPR012336 | | T | |
| LOC_Os10g25290 | Tify | IPR010399 | | T | |
| LOC_Os07g46750 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | |
| LOC_Os07g42300 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | P |
| LOC_Os08g37310 | Uncharacterized protein family UPF0029, N terminal | IPR001498 | | T | |
| LOC_Os05g50490 | X8 domain | IPR012946 | | T | |
| LOC_Os09g36820 | Zinc finger, Mcm10/DnaG type | IPR015408 | | T | |
| LOC_Os01g72480 | Zinc finger, RING type | IPR001841 | | T | |
| LOC_Os05g30940 | / | / | S | T | |
| LOC_Os05g29740 | / | / | S | T | |
| LOC_Os09g31080 | / | / | | T | |
| LOC_Os08g25080 | / | / | | T | |
| LOC_Os04g57440 | / | / | | T | |
 LOC_Os11g14900Peptidase C1A, papainIPR013128T
 LOC_Os03g09190Peptidase S10, Ser carboxypeptidaseIPR001563T
 LOC_Os04g47150Peptidase S8, subtilisin relatedIPR015500STP
 LOC_Os02g44590Peptidase S8, subtilisin relatedIPR015500T
 LOC_Os06g49480Peptidyl-prolyl cis-trans-isomeraseIPR002130T
 LOC_Os09g39780Peptidyl-prolyl cis-trans-isomeraseIPR002130P
 LOC_Os05g01270Peptidyl-prolyl cis-trans-isomeraseIPR002130T
 LOC_Os02g02890Peptidyl-prolyl cis-trans-isomeraseIPR002130P
 LOC_Os05g40010Plant lipid transfer proteinIPR000528T
 LOC_Os09g39950Pollen Ole e 1 allergen and extensinIPR006041SP
 LOC_Os06g36240Pollen Ole e 1 allergen and extensinIPR006041ST
 LOC_Os10g17660Profilin, plantIPR002097ST
 LOC_Os06g05880Profilin, plantIPR002097T
 LOC_Os02g55710Proteasome maturation factor UMP1IPR008012T
 LOC_Os04g57810Protein of unknown function DUF689IPR007785T
 LOC_Os01g72490Protein of unknown function DUF702IPR007818T
 LOC_Os02g05630Protein phosphatase 2C, N terminalIPR015655T
 LOC_Os01g16430Proteinase inhibitor I25, cystatinIPR000010T
 LOC_Os01g25540Rapid alkalinization factorIPR008801T
 LOC_Os06g48780Ribosomal protein 60SIPR001813T
 LOC_Os01g13080Ribosomal protein 60SIPR001813T
 LOC_Os05g37330Ribosomal protein 60SIPR001813T
 LOC_Os08g02340Ribosomal protein 60SIPR001813T
 LOC_Os01g09510Ribosomal protein 60SIPR001813T
 LOC_Os02g32760Ribosomal protein 60SIPR001813T
 LOC_Os03g22810Superoxide dismutase, copper/zinc bindingIPR001424TP
 LOC_Os07g46990Superoxide dismutase, copper/zinc bindingIPR001424T
 LOC_Os03g11960Superoxide dismutase, copper/zinc bindingIPR001424T
 LOC_Os11g36340Targeting for Xklp2IPR009675T
 LOC_Os01g24090Tetratricopeptide regionIPR013026T
 LOC_Os04g44830Thioredoxin, coreIPR013766T
 LOC_Os05g06430Thioredoxin likeIPR013766T
 LOC_Os09g27830Thioredoxin likeIPR013766T
 LOC_Os01g23740Thioredoxin likeIPR013766T
 LOC_Os06g42000Thioredoxin-like foldIPR012336T
 LOC_Os10g25290TifyIPR010399T
 LOC_Os07g46750Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchangeIPR014038T
 LOC_Os07g42300Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchangeIPR014038TP
 LOC_Os08g37310Uncharacterized protein family UPF0029, N terminalIPR001498T
 LOC_Os05g50490X8 domainIPR012946T
 LOC_Os09g36820Zinc finger, Mcm10/DnaG typeIPR015408T
 LOC_Os01g72480Zinc finger, RING typeIPR001841T
 LOC_Os05g30940//ST
 LOC_Os05g29740//ST
 LOC_Os09g31080//T
 LOC_Os08g25080//T
 LOC_Os04g57440//T
(a) Genes that belong to the pollen specifically expressed gene cluster are marked S.

Table I.
Gene information and family classification of putative pollen allergens
| Gene Identifier | Gene Family | InterPro | Specific Expression (a) | Transcriptome | Proteome |
|---|---|---|---|---|---|
| Arabidopsis | | | | | |
| AT1G23800 | Aldehyde dehydrogenase | IPR015590 | | T | |
| AT4G25780 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT3G09590 | Allergen V5/Tpx-1 related | IPR001283 | | T | |
| AT1G01310 | Allergen V5/Tpx-1 related | IPR001283 | S | T | |
| AT1G66400 | EF hand | IPR002048 | | | P |
| AT5G17480 | EF hand | IPR002048 | S | T | |
| AT3G03430 | EF hand | IPR002048 | S | T | |
| AT4G03290 | EF hand | IPR002048 | S | T | |
| AT1G77840 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| AT2G36530 | Enolase | IPR000941 | | | P |
| AT5G57320 | Gelsolin | IPR007122 | S | T | |
| AT2G41740 | Gelsolin | IPR007122 | | | P |
| AT4G27120 | Gene family candidate, 0016872 | / | | T | |
| AT2G05620 | Gene family candidate, 0020947 | / | | T | |
| AT5G18310 | Gene family candidate, 0025808 | / | | T | |
| AT3G14040 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT3G07850 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT5G48140 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G60590 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT1G02790 | Glycoside hydrolase, family 28 | IPR000743 | S | | P |
| AT1G48100 | Glycoside hydrolase, family 28 | IPR000743 | | T | |
| AT3G07820 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07840 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT3G07830 | Glycoside hydrolase, family 28 | IPR000743 | S | T | |
| AT1G55120 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G62660 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT3G52600 | Glycoside hydrolase, family 32 | IPR001362 | S | T | |
| AT2G36190 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G12240 | Glycoside hydrolase, family 32 | IPR001362 | | T | |
| AT1G09080 | Heat shock protein70 | IPR013126 | S | | P |
| AT3G09440 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02490 | Heat shock protein70 | IPR013126 | | | P |
| AT5G09590 | Heat shock protein70 | IPR013126 | | | P |
| AT5G42020 | Heat shock protein70 | IPR013126 | | | P |
| AT5G28540 | Heat shock protein70 | IPR013126 | | | P |
| AT4G37910 | Heat shock protein70 | IPR013126 | | T | |
| AT3G12580 | Heat shock protein70 | IPR013126 | | T | |
| AT1G11660 | Heat shock protein70 | IPR013126 | | | P |
| AT5G02500 | Heat shock protein70 | IPR013126 | | | P |
| AT5G03030 | Heat shock protein DnaJ, N terminal | IPR001623 | | T | |
| AT4G24190 | Heat shock protein90 | IPR001404 | | T | |
| AT5G56000 | Heat shock protein90 | IPR001404 | | T | |
| AT5G03380 | Heavy metal transport/detoxification protein | IPR006121 | | T | |
| AT3G15020 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G47520 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT1G53240 | Malate dehydrogenase, type 1 | IPR010097 | | | P |
| AT3G10920 | Manganese and iron superoxide dismutase | IPR001189 | | | P |
| AT4G08580 | Microfibrillar-associated1, C terminal | IPR009730 | | T | |
| AT3G63140 | NAD-dependent epimerase/dehydratase | IPR001509 | | T | |
| AT3G04500 | Nucleotide-binding, α-β plait | IPR012677 | | T | |
| AT1G14420 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT5G15110 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT3G24670 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G07010 | Pectate lyase/Amb allergen | IPR002022 | | T | |
| AT3G01270 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT2G02720 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| AT1G69940 | Pectinesterase, catalytic | IPR000070 | S | | P |
| AT5G07430 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G07420 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT3G06830 | Pectinesterase, catalytic | IPR000070 | S | T | |
| AT5G43060 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G09850 | Peptidase C1A, papain | IPR013128 | | T | |
| AT4G39090 | Peptidase C1A, papain | IPR013128 | | T | |
| AT1G47128 | Peptidase C1A, papain | IPR013128 | | T | |
| AT3G63470 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G52000 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G05850 | Peptidase S10, Ser carboxypeptidase | IPR001563 | S | T | |
| AT2G12480 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G45010 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT3G10410 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| AT4G00230 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT3G14067 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G26330 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| AT4G34870 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | T | |
| AT2G16600 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT4G38740 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT2G21130 | Peptidyl-prolyl cis-trans-isomerase, cyclophilin type | IPR002130 | | | P |
| AT3G60570 | Pollen allergen, N terminal | IPR007117 | S | T | |
| AT2G39700 | Pollen allergen, N terminal | IPR007117 | | T | |
| AT1G29140 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| AT4G08685 | Pollen Ole e 1 allergen and extensin | IPR006041 | | T | |
| AT2G19760 | Profilin, plant | IPR005455 | | | P |
| AT4G29340 | Profilin, plant | IPR005455 | S | | P |
| AT2G19770 | Profilin, plant | IPR005455 | S | | P |
| AT3G44590 | Ribosomal protein 60S | IPR001813 | | T | |
| AT1G01100 | Ribosomal protein 60S | IPR001813 | | T | |
| AT4G00810 | Ribosomal protein 60S | IPR001813 | | T | |
| AT2G27710 | Ribosomal protein 60S | IPR001813 | | | P |
| AT1G74000 | Strictosidine synthase | IPR004141 | S | T | |
| AT2G28190 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| AT1G08830 | Superoxide dismutase, copper/zinc binding | IPR001424 | | | P |
| AT1G75040 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G75050 | Thaumatin, pathogenesis related | IPR001938 | | T | |
| AT1G45145 | Thioredoxin fold | IPR012335 | | T | |
| AT3G08710 | Thioredoxin fold | IPR012335 | | T | |
| AT5G42980 | Thioredoxin fold | IPR012335 | | T | |
| AT1G21750 | Thioredoxin like | IPR017936 | | | P |
| AT2G47470 | Thioredoxin like | IPR017936 | | | P |
| AT1G77510 | Thioredoxin like | IPR017936 | | | P |
| AT5G19510 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | | P |
| AT1G68185 | Ubiquitin | IPR000626 | | T | |
| AT1G80040 | Ubiquitin system component Cue | IPR003892 | | T | |
| AT2G39900 | Zinc finger, LIM type | IPR001781 | | T | |
| AT3G55770 | Zinc finger, LIM type | IPR001781 | | T | |
| AT5G64920 | Zinc finger, RING type | IPR001841 | | T | |
| AT3G54680 | / | / | | T | |
| AT1G30750 | / | / | | T | |
| Rice | | | | | |
| LOC_Os03g18850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os12g36850 | Bet v I allergen | IPR000916 | | T | |
| LOC_Os07g10570 | Bifunctional inhibitor/plant lipid transfer protein/seed storage | IPR016140 | | T | |
| LOC_Os01g55690 | Cupin1 | IPR006045 | S | T | |
| LOC_Os10g26060 | Cupin1 | IPR006045 | | T | |
| LOC_Os06g36010 | Cupredoxin | IPR008972 | S | T | |
| LOC_Os01g08410 | Cyclin like | IPR005814 | | T | |
| LOC_Os05g27730 | DNA-binding WRKY | IPR003657 | | T | |
| LOC_Os08g44660 | EF hand | IPR002048 | S | T | |
| LOC_Os09g24580 | EF hand | IPR002048 | | T | |
| LOC_Os06g48350 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os09g15770 | eIF4-γ/eIF5/eIF2-ε | IPR003307 | | T | |
| LOC_Os10g08550 | Enolase | IPR000941 | | T | P |
| LOC_Os03g14450 | Enolase | IPR000941 | | T | |
| LOC_Os09g20820 | Enolase | IPR000941 | | T | |
| LOC_Os04g25150 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os04g25160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os04g25190 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45200 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g44470 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | P |
| LOC_Os06g45160 | Expansin, cellulose-binding-like domain | IPR007117 | S | T | |
| LOC_Os06g45180 | Expansin, cellulose-binding-like domain | IPR007117 | | T | P |
| LOC_Os10g40090 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01610 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os12g36040 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | |
| LOC_Os03g01640 | Expansin/pollen allergen, DPBB domain | IPR007112 | S | T | P |
| LOC_Os01g60770 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os02g51040 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | |
| LOC_Os03g01630 | Expansin/pollen allergen, DPBB domain | IPR007112 | | T | P |
| LOC_Os01g47780 | FAS1 domain | IPR000782 | | T | |
| LOC_Os01g57570 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os05g42190 | Flavodoxin/nitric oxide synthase | IPR008254 | | T | |
| LOC_Os04g51440 | Gelsolin | IPR007122 | S | T | |
| LOC_Os03g24220 | Gelsolin | IPR007122 | | T | |
| LOC_Os06g44890 | Gelsolin | IPR007122 | | T | |
| LOC_Os01g46080 | Gene family candidate, 0005331 | IPR001087 | | T | |
| LOC_Os05g32190 | Gene family candidate, 0014938 | / | | T | |
| LOC_Os09g12620 | Gene family candidate, 0015182 | / | S | T | |
| LOC_Os06g21120 | Gene family candidate, 0015338 | / | | T | |
| LOC_Os08g09180 | Gene family candidate, 0018686 | / | | T | |
| LOC_Os03g17140 | Gene family candidate, 0020281 | / | | T | |
| LOC_Os02g38490 | Gene family candidate, 0021210 | / | | T | |
| LOC_Os08g13980 | Glycoside hydrolase, family 16 | IPR013320 | | T | |
| LOC_Os06g39060 | Glycoside hydrolase, family 17 | IPR000490 | S | T | |
| LOC_Os05g41610 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os01g71810 | Glycoside hydrolase, family 17 | IPR000490 | | T | |
| LOC_Os09g36280 | Glycoside hydrolase, family 17 | IPR000490 | | T | P |
| LOC_Os04g41680 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g39330 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os08g41100 | Glycoside hydrolase, family 19, catalytic | IPR000726 | | T | |
| LOC_Os02g10300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os01g33300 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os06g35320 | Glycoside hydrolase, family 28 | IPR012334 | S | T | |
| LOC_Os08g23790 | Glycoside hydrolase, family 28 | IPR012334 | S | T | P |
| LOC_Os06g40890 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os05g46510 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os03g61800 | Glycoside hydrolase, family 28 | IPR012334 | | T | |
| LOC_Os02g01590 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os04g45290 | Glycoside hydrolase, family 32 | IPR013148 | | T | |
| LOC_Os11g08440 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16920 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g16880 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os11g47760 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g16860 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g02410 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os02g53420 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os03g02260 | Heat shock protein70 | IPR001023 | | | P |
| LOC_Os03g60620 | Heat shock protein70 | IPR001023 | | T | P |
| LOC_Os09g31486 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os05g38530 | Heat shock protein70 | IPR001023 | | T | |
| LOC_Os04g01740 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os08g39140 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g30412 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os09g29840 | Heat shock protein90 | IPR001404 | | T | |
| LOC_Os01g51140 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os08g04390 | Helix-loop-helix DNA binding | IPR011598 | | T | |
| LOC_Os10g25420 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os03g25040 | Lipase, GDSL | IPR001087 | | T | |
| LOC_Os05g49880 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g46070 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os08g33720 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os01g61380 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os03g56280 | Malate dehydrogenase, type 1 | IPR010097 | | T | |
| LOC_Os05g25850 | Manganese and iron superoxide dismutase | IPR001189 | | T | P |
| LOC_Os06g43710 | Man-6-P receptor, binding | IPR009011 | | T | |
| LOC_Os07g06590 | MD-2-related lipid recognition | IPR003172 | | T | |
| LOC_Os06g27760 | Met sulfoxide reductase B | IPR002579 | | T | |
| LOC_Os09g28060 | MORN motif | IPR003409 | | T | |
| LOC_Os06g05660 | Nucleosome assembly protein | IPR002164 | | T | |
| LOC_Os06g05209 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os02g12300 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os06g38510 | Pectate lyase/Amb allergen | IPR002022 | S | T | |
| LOC_Os11g45730 | Pectinesterase, catalytic | IPR012334 | S | T | |
| LOC_Os02g43010 | Peptidase C13, legumain | IPR001096 | | T | |
| LOC_Os04g24600 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os02g27030 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os05g01810 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os01g73980 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os11g14900 | Peptidase C1A, papain | IPR013128 | | T | |
| LOC_Os03g09190 | Peptidase S10, Ser carboxypeptidase | IPR001563 | | T | |
| LOC_Os04g47150 | Peptidase S8, subtilisin related | IPR015500 | S | T | P |
| LOC_Os02g44590 | Peptidase S8, subtilisin related | IPR015500 | | T | |
| LOC_Os06g49480 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os09g39780 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g01270 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | T | |
| LOC_Os02g02890 | Peptidyl-prolyl cis-trans-isomerase | IPR002130 | | | P |
| LOC_Os05g40010 | Plant lipid transfer protein | IPR000528 | | T | |
| LOC_Os09g39950 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | | P |
| LOC_Os06g36240 | Pollen Ole e 1 allergen and extensin | IPR006041 | S | T | |
| LOC_Os10g17660 | Profilin, plant | IPR002097 | S | T | |
| LOC_Os06g05880 | Profilin, plant | IPR002097 | | T | |
| LOC_Os02g55710 | Proteasome maturation factor UMP1 | IPR008012 | | T | |
| LOC_Os04g57810 | Protein of unknown function DUF689 | IPR007785 | | T | |
| LOC_Os01g72490 | Protein of unknown function DUF702 | IPR007818 | | T | |
| LOC_Os02g05630 | Protein phosphatase 2C, N terminal | IPR015655 | | T | |
| LOC_Os01g16430 | Proteinase inhibitor I25, cystatin | IPR000010 | | T | |
| LOC_Os01g25540 | Rapid alkalinization factor | IPR008801 | | T | |
| LOC_Os06g48780 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g13080 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os05g37330 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os08g02340 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os01g09510 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os02g32760 | Ribosomal protein 60S | IPR001813 | | T | |
| LOC_Os03g22810 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | P |
| LOC_Os07g46990 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os03g11960 | Superoxide dismutase, copper/zinc binding | IPR001424 | | T | |
| LOC_Os11g36340 | Targeting for Xklp2 | IPR009675 | | T | |
| LOC_Os01g24090 | Tetratricopeptide region | IPR013026 | | T | |
| LOC_Os04g44830 | Thioredoxin, core | IPR013766 | | T | |
| LOC_Os05g06430 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os09g27830 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os01g23740 | Thioredoxin like | IPR013766 | | T | |
| LOC_Os06g42000 | Thioredoxin-like fold | IPR012336 | | T | |
| LOC_Os10g25290 | Tify | IPR010399 | | T | |
| LOC_Os07g46750 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | |
| LOC_Os07g42300 | Translation elongation factor EF1B, β- and δ-chains, guanine nucleotide exchange | IPR014038 | | T | P |
| LOC_Os08g37310 | Uncharacterized protein family UPF0029, N terminal | IPR001498 | | T | |
| LOC_Os05g50490 | X8 domain | IPR012946 | | T | |
| LOC_Os09g36820 | Zinc finger, Mcm10/DnaG type | IPR015408 | | T | |
| LOC_Os01g72480 | Zinc finger, RING type | IPR001841 | | T | |
| LOC_Os05g30940 | / | / | S | T | |
| LOC_Os05g29740 | / | / | S | T | |
| LOC_Os09g31080 | / | / | | T | |
| LOC_Os08g25080 | / | / | | T | |
| LOC_Os04g57440 | / | / | | T | |
(a) Genes that belong to the pollen specifically expressed gene cluster are marked S.

The AllFam database of allergen families (Radauer and Breiteneder, 2006) contains over 2,500 protein families present in seed plants, of which about 59 plant protein families include inhalation allergens. In our analysis, we identified 252 putative pollen allergens (145 in rice and 107 in Arabidopsis) that were classified into 81 protein families, including most of the known allergenic pollen protein families present in the AllFam database (Radauer et al., 2008; Table I; Supplemental Fig. S1). Our predictions recovered 10 of the 13 known allergens from Arabidopsis and rice; the three exceptions were Ara t GLP, Ara t 3, and Ory s 23 (Supplemental Table S1). This recovery rate supports the reliability of our prediction method. The absence of the three known allergens is possibly due to their low expression levels in pollen or the absence of corresponding probes on the microarrays (Qin et al., 2009; Wei et al., 2010).

Expression Analysis and Functional Prediction of Candidate Pollen Allergens

To understand the biological functions of the putative pollen allergens identified in Arabidopsis and rice, we performed Gene Ontology analysis and observed that they have key housekeeping functions, such as metabolic and cellular activities, stress response, and cellular component formation (Fig. 2A). For instance, Bet v 1 proteins, which belong to the PR10 family, are associated with stress responses; profilins regulate actin polymerization by sequestering or releasing actin monomers during pollen growth; and polcalcins are involved in calcium signaling that helps guide pollen tube growth (Supplemental Table S1). To further characterize the functions of these putative pollen allergens, we performed in silico expression analysis. Our previous clustering analysis demonstrated that these candidates were present in pollen proteomic and transcriptomic data but displayed distinct temporal expression patterns. In Figure 1, C and D, genes in the green cluster (60 of 143 genes in rice and 33 of 107 genes in Arabidopsis) displayed ubiquitous expression and were associated mainly with stress responses, oxygen species metabolism, and glycolysis. Genes in the red cluster (31 of 143 genes in rice and 26 of 107 genes in Arabidopsis) were highly expressed specifically in tricellular and mature pollen and were largely related to cell wall metabolism and organization (Figs. 1, C and D, and 2B). The main allergens specifically expressed in pollen include polygalacturonases, pectate lyases, and expansins, which participate in carbohydrate metabolism and pollen tube wall formation during germination (Barral et al., 2005). These analyses imply that pollen-specific allergens are functionally restricted to cell wall metabolic activities in pollen, whereas the ubiquitously expressed putative allergens are associated mainly with stress responses (Supplemental Fig. S2).
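The cluster sizes quoted above can be turned into proportions with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the labels "ubiquitous" and "pollen-specific" are illustrative names for the green and red clusters, and the denominators are taken exactly as quoted in the text.

```python
# Cluster sizes as quoted in the text: (genes in cluster, total genes considered).
# The cluster labels are illustrative names for the green and red clusters.
clusters = {
    ("rice", "ubiquitous"): (60, 143),
    ("rice", "pollen-specific"): (31, 143),
    ("Arabidopsis", "ubiquitous"): (33, 107),
    ("Arabidopsis", "pollen-specific"): (26, 107),
}


def cluster_share(in_cluster: int, total: int) -> float:
    """Return the percentage of genes that fall in a cluster."""
    return 100.0 * in_cluster / total


# Percentage of genes per (species, cluster), rounded to one decimal place.
shares = {key: round(cluster_share(*counts), 1) for key, counts in clusters.items()}

for (species, cluster), pct in shares.items():
    print(f"{species:12s} {cluster:16s} {pct:5.1f}%")
```

Under these numbers, the ubiquitously expressed cluster is the larger of the two in both species.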

Gene Ontology (GO) enrichment analysis of putative pollen allergens in rice and Arabidopsis. A, GO analysis of putative allergen genes in rice and Arabidopsis. B, Significant biological process GO terms of ubiquitously expressed putative pollen allergens and pollen-specific putative allergens. The significance of each GO term was evaluated by –log10 (P value). Enrichment against all genes in the same GO term and percentage of query genes in each GO term are shown in parentheses.
Figure 2.

Phylogenetic Analysis of Putative Pollen Allergens among 25 Plant Species

To understand the evolutionary events that gave rise to pollen allergens in plants, we identified the closest homologs (present in protein families) of these putative pollen allergens from rice and Arabidopsis in 25 sequenced plant species ranging from lower plants (green alga) to higher plants (angiosperms; Fig. 3). During angiosperm evolution, multiple rounds of polyploidy occurred (Bowers et al., 2003; Adams and Wendel, 2005); therefore, we proposed that pollen allergens might have expanded via gene duplication. In our analysis, a total of 1,797 and 1,302 close homologs of pollen allergens in rice and Arabidopsis were identified from the genomes of the 25 plant species, and in most families, the number of homologs increased from green alga to angiosperms. Notably, some putative allergenic protein families displayed multiple sequences with high similarity in one species. For example, two rice expansins had 12 close sequence homologs in Fragaria vesca with little variability (Musidlowska-Persson et al., 2007).

Taxonomic distribution of putative pollen allergen homologs. The phylogenetic tree shows homologous genes of putative pollen allergens of each protein family identified in rice and Arabidopsis and 25 other plant species (25 species from green alga to higher plant). The numbers of homologous genes found in other species are shown in the matrix. The numbers of recognized plant allergens in databases are shown under the names for each family.
Figure 3.

Among the 48 putative pollen allergenic protein families, some, such as the HEAT SHOCK PROTEIN70 (Hsp70), profilin, and thioredoxin-like families, seemed to have an ancient origin, as evidenced by the presence of homologs in lower plants. Hsp70 was shown to be expressed in maize (Zea mays) and tomato (Solanum lycopersicum) pollen grains; although these proteins have not been functionally characterized, it is plausible that they are associated with protecting cellular structures from stress (Gruehn et al., 2003; Supplemental Table S1). One maize pollen-expressed profilin, designated ZmPRO4, was shown to have the poly-L-Pro-binding function that is required for the modulation of actin cytoskeletal dynamics in pollen (Gibbon et al., 1998). Thioredoxin-like proteins expressed in Arabidopsis pollen grains have been reported to be required for osmotic stress tolerance and male sporogenesis as well as male-female interaction (Lakhssassi et al., 2012). Because these gene families are conserved across multiple species, it is likely that these genes share an ancient common ancestor and that their functions may have been retained in plants (Fig. 3; Supplemental Table S2).

In contrast, putative allergens in 33 other families had homologs only in either monocots or dicots, suggesting that these genes arose later in higher plants. Plant lipid transfer proteins are small, abundant lipid-binding proteins that are able to exchange lipids between membranes. The pollen allergenic lipid transfer proteins such as Ara t 3, Zea m 14, and Tri a 14 may function in transferring lipids and fatty acids through cell membranes (Thoma et al., 1993; Arondel et al., 2000; Pastorello et al., 2000; Wang et al., 2005; Sander et al., 2011). One monocot-specific lipid transfer protein, OsC6, expressed mainly in tapetal cells, was shown to bind lipidic molecules and affect the pollen wall and fertility (Zhang et al., 2010). Several members encoding expansins containing the cellulose-binding-like domain (IPR007117) were observed only in monocots (Fig. 3; Supplemental Fig. S3). The expansin family regulates cell wall expansion, and pollen-expressed β-expansins aid in pollen tube growth and penetration (Supplemental Table S2). The cellulose-binding-like expansin homologs seemed to share a recent monocot common ancestor and to have high sequence conservation between species; however, these genes may have further evolved grass-specific functions compared with the DPBB expansins that are found in both monocots and dicots (Fig. 3). Likewise, the pollen-expressed polygalacturonase, a major allergen in some grass and cypress species, has homologs only in higher plants. Allergenic polygalacturonases from the Japanese cypress Chamaecyparis obtusa (Mori et al., 1999) and timothy grass (Phleum pratense; Suck et al., 2000) play roles in pollen maturation and pollen tube growth (Supplemental Table S2). Another allergen found only in higher plants, the pollen Ole e 1 allergen (Jimenez-Lopez et al., 2012), accumulates in pollen tube cell walls and may have a role in pollen germination and pollen tube growth (Supplemental Table S2). Interestingly, Arabidopsis pectinesterase, another pollen allergen (Mahler et al., 2001), has homologs only in dicots, and these differ in sequence from their rice counterparts. Pectinesterase from olive (Olea europaea) was reported to affect cell wall stability during pollen germination and pollen tube growth through the deesterification of pectin into pectate and methanol (Salamanca et al., 2010; Esteve et al., 2012; Jimenez-Lopez et al., 2012). Altogether, our observations on the putative allergens of these 33 families suggest that they may have evolved in parallel in either monocots or dicots with diversified biological functions.

Evolutionary Events in Generating and Maintaining Pollen Allergens

Gene duplication events that produce functionally redundant genes have been considered a main driver underlying gene evolution (Nei, 1969; Lynch and Conery, 2000; Cui et al., 2015). Therefore, we asked whether sequence variation within these duplicated genes affects the allergenicity of proteins. Pollen allergens seemed to be produced by gene duplication events. The proportion of duplicated genes (including tandem repeats and block repeats) among pollen-expressed genes was about 40% in Arabidopsis and 30% in rice. However, the percentage of duplicated genes among putative pollen allergens was markedly higher: 60% in Arabidopsis and 49% in rice (Fig. 4, A and B). In genetics, Ka/Ks represents the ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site, and the value of Ka/Ks can be used as an indicator of selective pressure on a protein-coding gene. A gene with Ka/Ks > 1 is usually regarded as having evolved under positive selection, while Ka/Ks < 1 is usually regarded as an indicator of purifying selection (Hurst, 2002). Although many putative pollen allergen genes seemed to be produced by duplication, the Ka/Ks values of these genes were low, reflecting a low proportion of nonsynonymous substitutions and suggesting that these pollen allergenic proteins evolved under purifying selection (Fig. 4, C and D). In rice, the Ka/Ks rate of allergen genes was around 0.25, which is similar to that of Arabidopsis (below 0.2), indicating that pollen allergens generated from duplication events have been maintained by purifying selection.
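
The interpretation of Ka/Ks described above can be written as a short Python sketch. It only illustrates the thresholding convention cited in the text (Hurst, 2002); it does not estimate Ka or Ks from alignments, and the function names and the example substitution rates are hypothetical rather than taken from this study.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return Ka/Ks; the ratio is undefined when Ks is not positive."""
    if ks <= 0:
        raise ValueError("Ks must be > 0 for Ka/Ks to be defined")
    return ka / ks


def selection_regime(ratio: float) -> str:
    """Classify a Ka/Ks ratio using the usual convention around 1."""
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


# Hypothetical rates for one gene pair (illustrative values only).
ratio = ka_ks_ratio(ka=0.05, ks=0.25)
print(round(ratio, 2), selection_regime(ratio))
```

In practice, values close to 1 are hard to interpret without a statistical test, so a hard cut-off such as the one above is best read as a first-pass summary.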

Percentage of gene duplication and Ka/Ks rate of potential pollen allergens. A, Percentage of gene duplication events found in background (pollen-expressed genes), putative pollen allergens, and pollen-specific allergens. Different duplication types are shown as different colors (blue = block, purple = tandem and block, and red = tandem). The number of genes detected and the total number of genes are shown on top of each column. The significance of differences was calculated by a hypergeometric distribution (**, P < 10⁻⁴ and ***, P < 10⁻⁵). B, Box plot of Ka/Ks rates of putative pollen allergens and pollen-expressed genes in rice and Arabidopsis. Four related species were selected to calculate Ka/Ks rates for rice (Oryza sativa ssp. indica 9311, Oryza brachyantha, Oryza glaberrima, and Peiai 64S [PA64S]) and Arabidopsis (Arabidopsis lyrata, Brassica rapa, Capsella rubella, and Thellungiella parvula).
Figure 4.

Profilins Represent the Ancient Allergenic Families

To further investigate the evolution of pollen allergens and the relationship between their allergenicity and biological function, two major allergenic families, profilins and expansins, were analyzed in more detail. Profilin is an actin-binding protein involved in the dynamic turnover and restructuring of the actin cytoskeleton. Plant profilins share many of the same biochemical properties and are structurally similar to nonplant profilins (Thorn et al., 1997). Profilin is a common pan-allergen in plants and is present in many plant organs, thereby leading to various routes of exposure depending on the plant species (Valenta et al., 1992). As shown in the phylogenetic tree, profilins in six monocots were present in one clade, and allergenic profilins that have the same route of exposure tended to be in another clade. For example, pollen profilins were seen in the grass family, fruit profilins in family Rosaceae, and seed profilins in family Leguminosae (Fig. 5A). LOC_Os10g17660 and LOC_Os10g17680 are tandem duplicated genes in rice, and both were highly expressed in late anther developmental stages, while tandem duplicated gene pairs (AtProfilin1/AtProfilin5 and AtProfilin2/AtProfilin4) showed totally different expression patterns in Arabidopsis. AtProfilin1 and AtProfilin2 were expressed in many tissues, while AtProfilin4 and AtProfilin5 were expressed specifically and highly in pollen (Fig. 5B). AtProfilin4 and AtProfilin5 redundantly regulate polarized pollen tube growth (Liu et al., 2015). Pollen-specific proteins such as AtProfilin4 and AtProfilin5 are therefore more likely to be pollen allergens. The sequence and structure of Ara h 5 in peanut have been studied extensively, and eight surface-exposed epitopes were identified (Radauer et al., 2006; Cabanos et al., 2010) and were mapped in Figure 5C for comparison. These epitopes included some crucial amino acid residues required for biological function and structural roles in profilin; for example, epitope 1 includes two poly-L-Pro-binding residues (Thorn et al., 1997; Fig. 5C; Supplemental Fig. S5). Most known allergenic profilins, such as Zea m 12, Mal d 4, and Api g 4, displayed almost no variation in epitope sequence, while profilins in rice and lower plants exhibited more variation (Fig. 5C; Supplemental Fig. S4). These results indicate that the allergenicity of the profilin family possibly changed during evolution. Furthermore, variations at epitope positions were associated with structural differences between Ara h 5 and AtProfilin1 (Fig. 5D). Variations in epitopes also were found among members of the profilin family of Arabidopsis.

Phylogenetic tree, expression patterns, and sequence alignment of the profilin family. A, Unrooted neighbor-joining tree generated from sequence alignments of profilins in rice and Arabidopsis and major profilin allergens in other species. Ama r, Amaranthus retroflexus; Amb a, Ambrosia artemisiifolia; Art v, Artemisia vulgaris; Bet v, Betula verrucosa; Mer a, Mercurialis annua; Ric c, Ricinus communis; Par j, Parietaria judaica; Pro j, Prosopis juliflora; Sal k, Salsola kali; Ana c, Ananas comosus; Cyn d, Cynodon dactylon; Lil l, Lilium longiflorum; Phl p, Phleum pratense; Tri a, Triticum aestivum; Zea m, Zea mays; Cit s, Citrus sinensis; Fra a, Fragaria ananassa; Mal d, Malus domestica; Pyr c, Pyrus communis; Pru p, Prunus persica; Man i, Mangifera indica; Gly m, Glycine max; Ara h, Arachis hypogaea; Sin a, Sinapis alba; Hev b, Hevea brasiliensis. B, Expression patterns of profilins in rice and Arabidopsis. Developmental stages and tissues are described in Figure 1. C, Partial sequence alignment (amino acids 1–56) of known pollen allergens in the profilin family and profilins in various species. The secondary structure of Ara h 5 is shown in the first line, and seven putative surface-exposed epitopes are marked by black boxes (#1–#7; Radauer et al., 2006; Cabanos et al., 2010). Within these seven surface-exposed epitopes, amino acids in common with Ara h 5 of allergenic profilins are colored in yellow. Amino acids different from Ara h 5 common epitopes are colored in blue. Crucial residues of known biological function and structural role are marked by red stars (Thorn et al., 1997). Ara_h, Arachis hypogaea; Mal_d, Malus domestica; Api_g, Apium graveolens; Zea_m, Zea mays; Os, Oryza sativa; CR, Chlamydomonas reinhardtii; VC, Volvox carteri; MRCC, Micromonas sp. RCC. D, Three-dimensional (3D) models of Ara h 5 (left) and AtProfilin1 (right), with putative surface epitopes of Ara h 5 and the corresponding position of AtProfilin1 shown. Different epitopes (#1–#7) were mapped on the surface in different colors.
Figure 5.

Allergenicity Evolved with the Functional Specification of Expansins in Grass

Expansins are proteins that promote cell wall loosening and extension (Cosgrove, 2000). In pollen, expansins may facilitate cell wall deposition in pollen grains and are involved in pollen germination (Choi et al., 2006). Even though expansins have numerous family members present in both dicots and monocots, only members of the EXPB-I (for β-expansin I) clade of β-expansins in grasses are allergenic. In grasses, the EXPB-I clade was separated into two groups (conservative EXPB-I and divergent EXPB-I) by the sigma whole-genome duplication, while known allergenic β-expansins gathered in subbranches of divergent EXPB-I (Tang et al., 2010). The divergent EXPB-I might have evolved to act on highly substituted xylans, the interstitial material of primary walls in grasses (Sampedro et al., 2015). Phylogenetic analysis showed that expansins in rice clustered into two main branches (conservative EXPB-I and divergent EXPB-I), and all expansins in Arabidopsis belonged to the conservative EXPB-I clade (Fig. 6A). Ory s 1 allergens, which include OsEXPB1, OsEXPB10, and OsEXPB13, were highly expressed in late developmental stages of anther (microspore/pollen) and inflorescence development (Xu et al., 1995; Hirano et al., 2013; Fig. 6B).

Phylogenetic tree, expression patterns, and sequence alignment of the expansin family. A, Unrooted neighbor-joining tree generated from sequence alignments of some known allergenic β-expansins in plants and β-expansins in rice and Arabidopsis. Known allergens are colored red. OsEXPB1, OsEXPB10, and OsEXPB13 are known pollen allergens and are shown as Ory s 1. Rice β-expansins were separated into two clades, the conservative EXPB-I (yellow) and the divergent EXPB-I (red and green). A short-range translocation event separated the divergent EXPB-I into two clades: a pollen-expressed clade and a vegetative-expressed clade. All known allergens clustered together with the pollen-expressed divergent EXPB-I. Cyn d, Cynodon dactylon; Dac g, Dactylis glomerata; Hol l, Holcus lanatus; Lol p, Lolium perenne; Phl p, Phleum pratense; Poa p, Poa pratensis; Zea m, Zea mays. B, Expression patterns of β-expansins in rice and Arabidopsis. Developmental stages and tissues are described in Figure 1. C, Partial sequence alignment (amino acids 1–47) of known pollen allergens in the β-expansin family and β-expansins in rice and Arabidopsis. The secondary structure of Zea m 1 is shown in the first line. The known epitopes are marked by black squares, and functional binding sites are marked by red stars. Zea_m, Zea mays; Cyn_d, Cynodon dactylon; Dac_g, Dactylis glomerata; Os, Oryza sativa; SM, Selaginella moellendorffii; PP, Physcomitrella patens.
Figure 6.

Sequence alignment of β-expansins demonstrated that the identified epitopes of allergenic expansins differed from those of their nonallergenic expansin orthologs present in lower plants (Selaginella moellendorffii and Physcomitrella patens), dicots, and monocots (Fig. 6C; Supplemental Fig. S5). These epitopes also included important residues: the epitope SITE-A identified by Esch and Klapper (1989) contained a short binding pocket, and SITE-D identified by Hiller et al. (1997) covered part of the long conserved binding surface with the motif TWYG (Yennawar et al., 2006). Sequence variation among these expansins may lead to diverse functions and allergenicity of each expansin. Rice Ory s 1 is homologous to the maize allergen Zea m 1 and two other pollen allergens, Lol p 1 and Phl p 1 from ryegrass (Lolium perenne) and timothy grass, respectively (Petersen et al., 1995; Cosgrove et al., 1997; Yennawar et al., 2006). Zea m 1 was suggested to be involved in cell wall loosening of the stigma and style, aiding in pollen tube invasion of maternal tissue (Cosgrove et al., 1997). Likewise, Zea m 1 and its isoforms also were shown to have a dose effect in inducing cell wall expansion in wheat (Triticum aestivum) pollen and nonreproductive cells (Li et al., 2003). Furthermore, mutated Zea m 1 isoforms caused delayed pollen growth and the accumulation of large aggregates, possibly as a consequence of aberrant cell wall expansion (Valdivia et al., 2009). Both Zea m 1 and Ory s 1 were present in the divergent EXPB-I group and showed high expression in pollen (Hirano et al., 2013), suggesting that these pollen allergens may have evolved from a common ancestor and have a conserved biological function. In support of this, Zea m 1, Ory s 1, and Phl p 1 isoforms share a conserved functional binding site (Fig. 6; Petersen et al., 1995; Yennawar et al., 2006).

DISCUSSION

Allergy caused by pollen grains is one of the most intractable problems in allergy research. Large numbers of pollen allergens have been characterized, but little is known about their evolution and taxonomic distribution patterns. To address these questions, we performed genome-wide allergen prediction using transcriptomic and proteomic data sets from the model monocot rice and the model dicot Arabidopsis and performed phylogenetic analysis of pollen allergens. The taxonomic distribution of putative pollen allergens was investigated using phylogenetic analysis, which showed distinct distribution patterns for some of these allergens. Both the expression pattern and the taxonomic distribution of these putative pollen allergens in model plants are likely to be useful for predicting potential allergens in other plant species, especially those species without complete genome sequences. The sequence variation of allergen proteins among species, especially between lower and higher plants, indicated that allergenicity might have changed in the course of plant evolution.

In many pollen allergens, such as allergenic expansins and profilins, epitopes usually include important functional amino acid residues. We observed low Ka/Ks values and higher gene duplication ratios in putative pollen allergens, which also indicated a relationship between allergenicity and the evolution of protein functions. Therefore, we suggest that allergenicity might be a by-product of gene duplication and functional specification.

Conserved epitope sequences in allergens have been proposed to result in desensitization in humans after long-term exposure (Radauer et al., 2012). Gene duplication promotes neofunctionalization by variation of protein sequence, thereby creating opportunities for new allergens to form or for the allergenicity of existing allergens to change. We observed significantly higher gene duplication rates of putative pollen allergens in both rice and Arabidopsis. Allergenicity emerged from gene duplication events in some cases. For example, the EXPB-I clade of the expansin family was separated into two groups by gene duplication: a divergent group containing allergenic β-expansins and a conservative group (Sampedro et al., 2015). The lack of divergent EXPB-I genes in eudicots or in the recently sequenced genomes of banana (Musa spp.), date palm (Phoenix dactylifera), and oil palm (Elaeis guineensis) also supports a recent split (Tang et al., 2010). In rice, divergent and conservative EXPB-I groups were inferred to have evolved from the sigma whole-genome duplication in grasses, and changes in tissue expression of divergent EXPB-I permitted pollen-specific β-expansins (OsEXPB1, OsEXPB10, OsEXPB13, and OsEXPB9). OsEXPB1, OsEXPB10, and OsEXPB13 were produced by tandem duplication events, and OsEXPB9 was produced by the rho whole-genome duplication (Tang et al., 2010). In addition, features of the expansin family illustrate how gene duplication led to functional specification and allergenicity. Divergent EXPB-I proteins may have evolved to act on a preferred substrate, highly substituted xylans in grasses (Sampedro et al., 2015). Unfortunately, these changes also generated the specific epitopes recognized by immunoglobulins from individuals allergic to group 1 grass pollen allergens (Flicker et al., 2006).

Allergens have stringent structural and epitope requirements (Burks et al., 1999); however, variation within the epitope may create new allergens or disrupt the allergenicity. One good example is the peanut allergen Ara h 3 gene family, which arose by segmental and tandem duplications and evolved in a conservative manner (Ratnaparkhe et al., 2014). Low Ka/Ks rates of putative pollen allergens in rice and Arabidopsis indicate that these allergens might have experienced purifying selection (Fig. 4, C and D). The limited ratio of nonsynonymous mutations implied that these allergens might have evolved to have unique functions in pollen. The molecular function of a protein requires a stable structure, and so do existing allergens. Our data suggest that epitopes might be located in conserved functional sites of putative allergenic proteins, as we observed a limited ratio of nonsynonymous mutation in putative pollen allergens. As mentioned previously, pollen allergens tended to be involved in cell wall (pollen wall) metabolic processes and stress responses (Supplemental Table S2), suggesting that they underwent strict purifying selection, through pollen competition or other stresses, to maintain these functions. This may also explain why putative pollen allergens showed both higher gene duplication rates and lower Ka/Ks values. Allergenic β-expansins are a good example, as they influence the outcome of pollen competition by affecting pollen tube growth (Valdivia et al., 2007).

CONCLUSION

In summary, this work predicted 145 and 107 pollen allergens from rice and Arabidopsis, respectively, and these pollen allergens are associated with stress responses and metabolic events during pollen development. Interestingly, sequence analysis across 25 plant species from lower plants to higher plants suggests that some pollen allergens belong to large gene families shaped by gene duplication, purifying selection, and functional diversification during evolution. During this process, two selection processes were evident: the fixation of duplication (maintaining the allergenicity) and the fixation of allergen-determining residues (retaining allergenic epitopes). Stress, pollen competition, and functional selection (like cell wall metabolic processes) could be involved in the fixation processes (Fig. 7). Our analysis of putative pollen allergens from model plants should help in predicting pollen allergens in other species and may inform future medical treatment of pollen allergy. Our model of pollen allergen evolution could provide insight into the mechanisms underlying how allergenicity evolved and help in the identification of epitopes.

Model of the origination and evolution of pollen allergen genes in plants. Conserved allergens may lead to the induction of immunological tolerance, while duplicated genes may either diverge in protein sequence to generate new allergens or maintain the original allergenicity. During this process, two selection processes are likely: fixation of the duplication (copies maintained) or fixation of allergen-determining mutations (retaining of allergenic epitopes).
Figure 7.

MATERIALS AND METHODS

Identification of Allergenic Genes

Gene sequences for the prediction of allergens present in mature pollen grains of Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) were collected from the literature (Holmes-Davis et al., 2005; Noir et al., 2005; Dai et al., 2006; Sheoran et al., 2006). Gene expression data sets of pollen in Arabidopsis and rice (Qin et al., 2009; Wei et al., 2010; accessions GSM692545, GSM692546, GSM69254, GSM433634, GSM433635, GSM433636, and GSM433637) were downloaded from the Gene Expression Omnibus at the National Center for Biotechnology Information (Barrett and Edgar, 2006). Only genes found in proteome data and called present (MAS5.0 AP call) in more than half of the replicates of microarray analysis were chosen as candidate genes for allergen prediction (Pepper et al., 2007).
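
The candidate selection rule above (detected in the proteome and called present in more than half of the microarray replicates) can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in this study: the function names, the gene identifiers, and the representation of MAS5.0 detection calls as the strings "P" (present) and "A" (absent) are assumptions.

```python
def present_in_majority(calls: list[str]) -> bool:
    """True if the detection call is 'P' in more than half of the replicates."""
    return sum(call == "P" for call in calls) > len(calls) / 2


def select_candidates(
    proteome_genes: set[str], detection_calls: dict[str, list[str]]
) -> set[str]:
    """Keep genes found in the proteome AND present in most array replicates."""
    return {
        gene
        for gene, calls in detection_calls.items()
        if gene in proteome_genes and calls and present_in_majority(calls)
    }


# Hypothetical identifiers and calls, for illustration only.
proteome = {"geneA", "geneB", "geneD"}
calls = {
    "geneA": ["P", "P", "A"],  # present in 2 of 3 replicates: kept
    "geneB": ["P", "A", "A"],  # present in 1 of 3 replicates: dropped
    "geneC": ["P", "P", "P"],  # not in the proteome: dropped
    "geneD": ["P", "A"],       # exactly half is not "more than half": dropped
}
print(sorted(select_candidates(proteome, calls)))
```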

All gene identifiers, transcript identifiers, and the corresponding descriptions were collected from the Arabidopsis Information Resource database (http://www.arabidopsis.org/; Lamesch et al., 2012), the Rice Genome Annotation Project database (http://rice.plantbiology.msu.edu/; Kawahara et al., 2013), or the Rice Annotation Project database (http://rapdb.dna.affrc.go.jp/; Sakai et al., 2013). Information on known pollen allergens was obtained from the Allergome database (http://www.allergome.org/; Mari et al., 2006) and the World Health Organization/IUIS Allergen Nomenclature official database (http://www.allergen.org/). Protein sequence data in FASTA format were downloaded from the Universal Protein Resource database release 2014_03 (http://www.uniprot.org/; UniProt Consortium, 2014).

Sequence-based and maximum relevance minimum redundancy feature selection methods were used to detect potential allergens in mature pollen using two published prediction tools, proAP (Wang et al., 2013b) and PREAL (Wang et al., 2013c), on our server. The sequence-based approach was proposed by the Food and Agriculture Organization of the United Nations and the World Health Organization (FAO/WHO, 2003), and the number of exact matches in a stretch of consecutive identical amino acids was set to more than eight (rule 1). Proteins predicted by both methods were retained as potential pollen allergens to ensure accuracy (Wang et al., 2013c). The prediction results and information on putative pollen allergens are shown in Supplemental Data Set S1.
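
The idea behind the sequence-based rule and the consensus step can be illustrated with a short Python sketch. It is not the proAP or PREAL implementation: the function name, the toy sequences, and the protein identifiers are hypothetical, and the minimum stretch length is left as a parameter because the exact cut-off (and whether it is inclusive) should be taken from the rule applied in the analysis.

```python
def shares_identical_stretch(query: str, allergen: str, min_len: int) -> bool:
    """True if the two sequences share an identical run of >= min_len residues.

    Any shared run of at least min_len residues contains a shared run of
    exactly min_len residues, so comparing min_len-mers is sufficient.
    """
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    if len(query) < min_len or len(allergen) < min_len:
        return False
    allergen_kmers = {
        allergen[i : i + min_len] for i in range(len(allergen) - min_len + 1)
    }
    return any(
        query[i : i + min_len] in allergen_kmers
        for i in range(len(query) - min_len + 1)
    )


# Toy sequences sharing the 9-residue run "AYIAKQRQI" (illustrative only).
query_seq = "MKTAYIAKQRQISFVK"
known_allergen = "GGGAYIAKQRQIGGG"
print(shares_identical_stretch(query_seq, known_allergen, min_len=8))

# Consensus step: keep only proteins flagged by both (hypothetical) result sets.
flagged_by_sequence_rule = {"protein1", "protein2", "protein3"}
flagged_by_feature_model = {"protein2", "protein3", "protein4"}
consensus = flagged_by_sequence_rule & flagged_by_feature_model
print(sorted(consensus))
```

Requiring agreement between the two predictors trades sensitivity for specificity, which matches the stated aim of retaining only proteins predicted by both methods.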

Gene Expression Profile Analysis

Expression data for the genes corresponding to potential allergens in Arabidopsis and rice were downloaded from the Bio-Analytic Resource for Plant Biology (http://bar.utoronto.ca/; Toufighi et al., 2005) and the Rice Oligonucleotide Array Database (http://www.ricearray.org/; Cao et al., 2012), respectively. Information about the expression data, such as growth stage, tissues, and samples, is listed in Supplemental Data Set S2. To reduce batch effects, ComBat (Johnson et al., 2007), an R package, was used to adjust expression data from different experiments. To examine the expression patterns and specificity of target genes, the data were clustered with Genesis release 1.7.6 (Sturn et al., 2002).

MapMan and GO Analysis

The PLAZA database version 2.5 (http://bioinformatics.psb.ugent.be/plaza/; Van Bel et al., 2012) and the PANTHER classification system (http://pantherdb.org/; Mi et al., 2013) were used to perform GO classification and enrichment analysis. To investigate the metabolic processes involved, MapMan was used to produce a metabolic overview of potential allergens (Thimm et al., 2004), and significance was assessed with a hypergeometric test.
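For reference, the hypergeometric enrichment test has the form P(X ≥ k) = Σ_{i=k}^{min(K,n)} C(K,i)·C(N−K,n−i) / C(N,n), where N is the number of background genes, K the number of background genes in a category, n the number of putative allergen genes, and k the number of those falling in the category. The Python sketch below evaluates this sum directly; the function name and the example counts are assumptions for illustration and do not reproduce values from this study.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass from k up to the largest attainable overlap.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 10 background genes, 4 in the category,
# 3 genes of interest, 2 of which fall in the category.
p_value = hypergeom_enrichment_p(N=10, K=4, n=3, k=2)
```

When many categories are tested at once, a multiple-testing correction would normally be applied to the resulting P values; the text does not state which correction, if any, was used.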

Protein Family and Taxonomic Distribution Analysis

Genes were classified into protein families using the Pfam protein families database version 27.0 (http://pfam.xfam.org/; Finn et al., 2014) and the Plant Gene Family Database (http://green.dna.affrc.go.jp/PGF-DB/). Homologs, including in-paralogs (i.e. BLAST hits to genes in the same species that have higher bit scores than the best hit from any other species), were obtained from 25 plants (including Arabidopsis and rice) by BLAST searches at the PLAZA database version 2.5 (E value threshold of 1e-05).
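The in-paralog definition given above can be expressed as a small filter over BLAST hits. The Python sketch below is an illustration under stated assumptions rather than the PLAZA implementation: the `Hit` record, its field names, and the species codes are hypothetical, and when a query has no hit in another species every same-species hit passing the E value threshold is treated as an in-paralog.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    subject_species: str
    bit_score: float
    evalue: float


def in_paralogs(hits: Iterable[Hit], query: str, query_species: str,
                max_evalue: float = 1e-5) -> List[str]:
    """Same-species hits scoring above the best hit from any other species."""
    kept = [h for h in hits
            if h.query == query and h.subject != query and h.evalue <= max_evalue]
    # Best bit score among hits from other species (minus infinity if there are none).
    best_foreign = max((h.bit_score for h in kept
                        if h.subject_species != query_species),
                       default=float("-inf"))
    return sorted(h.subject for h in kept
                  if h.subject_species == query_species
                  and h.bit_score > best_foreign)


# Hypothetical hits for one query gene from species "ath".
toy_hits = [
    Hit("g1", "g2", "ath", 300.0, 1e-80),  # same species, above best foreign hit
    Hit("g1", "g3", "ath", 150.0, 1e-30),  # same species, below best foreign hit
    Hit("g1", "x1", "osa", 200.0, 1e-50),  # best hit from another species
    Hit("g1", "g4", "ath", 500.0, 1e-3),   # fails the E value threshold
]
paralogs = in_paralogs(toy_hits, "g1", "ath")
```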

Construction of the Phylogenetic Tree, Sequence Analysis, and 3D Modeling

The Clustal Omega (Sievers and Higgins, 2014) server at the European Bioinformatics Institute (http://www.ebi.ac.uk/Tools/msa/clustalo/) was used to align protein sequences downloaded from the UniProt database. Sequence alignments are shown together with known secondary structure information from the Protein Data Bank (Berman et al., 2000), rendered using the Web-based tool Easy Sequencing in PostScript (ESPript; Robert and Gouet, 2014). Unrooted phylogenetic trees were reconstructed with MEGA6 (Tamura et al., 2013) using the neighbor-joining and maximum likelihood methods. The 3D structures of Ara h 5 and AtProfilin1 were obtained from the Protein Data Bank under accession numbers 4ESP (Wang et al., 2013d) and 1A0K (Thorn et al., 1997), respectively, and the 3D models were visualized with UCSF Chimera (Pettersen et al., 2004).

Gene Duplication Analysis and Genome-Level Ka/Ks Estimation

Gene duplication data, including tandem and block duplications, were obtained from the PLAZA database version 2.5. These duplication events were identified from collinearity information using i-ADHoRe version 3.0 (Proost et al., 2012). To estimate the selective pressure acting on genes, four close relatives each of Arabidopsis and rice were chosen for calculating Ka/Ks ratios. Homologous gene pairs between Arabidopsis and Arabidopsis lyrata, Brassica rapa, Capsella rubella, or Thellungiella parvula were identified as best BLASTP hits at the PLAZA database version 2.5. ParaAT 1.0 (Zhang et al., 2012) and Clustal Omega were used for multiple sequence alignment, and Ka/Ks ratios were then calculated with KaKs_Calculator 2.0 (Wang et al., 2010) using the γ-MYN method (Wang et al., 2009). For homologous gene pairs in rice, Ka/Ks data calculated by the γ-MYN method for Oryza brachyantha, Oryza sativa ssp. indica 9311, Oryza glaberrima, and Peiai 64S (PA64S) were downloaded from RGKbase (Wang et al., 2013a). Gene pairs with a number of nonsynonymous substitutions per nonsynonymous site (Ka) < 0.5, a number of synonymous substitutions per synonymous site (Ks) < 5, and Ka/Ks < 2 were retained for comparison (Supplemental Data Set S3).
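The retention thresholds stated above (nonsynonymous rate < 0.5, synonymous rate < 5, and Ka/Ks < 2) amount to a simple filter over per-pair estimates. The Python sketch below shows one way to apply them; the `PairEstimate` record, the gene names, and the numerical values are hypothetical, and skipping pairs with a synonymous rate of zero (for which the ratio is undefined) is an assumption not stated in the text.

```python
from typing import Iterable, List, NamedTuple


class PairEstimate(NamedTuple):
    gene_a: str
    gene_b: str
    ka: float  # nonsynonymous substitutions per nonsynonymous site
    ks: float  # synonymous substitutions per synonymous site


def retain_pairs(pairs: Iterable[PairEstimate], max_ka: float = 0.5,
                 max_ks: float = 5.0, max_ratio: float = 2.0) -> List[PairEstimate]:
    """Keep pairs with Ka < max_ka, Ks < max_ks, and Ka/Ks < max_ratio."""
    kept = []
    for pair in pairs:
        if pair.ks <= 0:
            continue  # Ka/Ks is undefined without synonymous substitutions
        if pair.ka < max_ka and pair.ks < max_ks and pair.ka / pair.ks < max_ratio:
            kept.append(pair)
    return kept


# Hypothetical estimates, invented for illustration only.
toy_pairs = [
    PairEstimate("a1", "b1", 0.05, 0.40),  # passes all three thresholds
    PairEstimate("a2", "b2", 0.60, 1.00),  # Ka too high
    PairEstimate("a3", "b3", 0.30, 0.10),  # Ka/Ks = 3, too high
    PairEstimate("a4", "b4", 0.10, 6.00),  # Ks too high (likely saturated)
    PairEstimate("a5", "b5", 0.01, 0.00),  # ratio undefined
]
kept_pairs = retain_pairs(toy_pairs)
```

Among retained pairs, a Ka/Ks ratio below 1 is conventionally read as evidence of purifying selection (Hurst, 2002).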

Accession Numbers

Accession numbers for the genes in this article are as follows: OsEXPB1a (LOC_Os03g01610), OsEXPB1b (LOC_Os03g01650), OsEXPB2 (LOC_Os10g40710), OsEXPB3 (LOC_Os10g40720), OsEXPB4 (LOC_Os10g40730), OsEXPB5 (LOC_Os04g46650), OsEXPB6 (LOC_Os10g40700), OsEXPB7 (LOC_Os03g01270), OsEXPB8 (LOC_Os03g01260), OsEXPB9 (LOC_Os10g40090), OsEXPB10 (LOC_Os03g01640), OsEXPB11 (LOC_Os02g44108), OsEXPB12 (LOC_Os03g44290), OsEXPB13 (LOC_Os03g01630), OsEXPB14 (LOC_Os02g44106), OsEXPB15 (LOC_Os04g46630), OsEXPB16 (LOC_Os02g42650), OsEXPB17 (LOC_Os04g44780), OsEXPB18 (LOC_Os05g15690), AtEXPB1 (AT2G20750), AtEXPB2 (AT1G65680), AtEXPB3 (AT4G28250), AtEXPB4 (AT2G45110), AtEXPB5 (AT3G60570), AtProfilin1 (AT2G19760), AtProfilin2 (AT4G29350), AtProfilin3 (AT5G56600), AtProfilin4 (AT4G29340), and AtProfilin5 (AT2G19770).

Supplemental Data

The following supplemental materials are available.

Glossary

     
  • Ka/Ks: ratio of the number of nonsynonymous substitutions per nonsynonymous site to the number of synonymous substitutions per synonymous site

  • 3D: three-dimensional

  • GO: Gene Ontology

LITERATURE CITED

Adams KL, Wendel JF (2005) Polyploidy and genome evolution in plants. Curr Opin Plant Biol 8: 135–141

Arondel VV, Vergnolle C, Cantrel C, Kader J (2000) Lipid transfer proteins are encoded by a small multigene family in Arabidopsis thaliana. Plant Sci 157: 1–12

Barral P, Suárez C, Batanero E, Alfonso C, Alché JdeD, Rodríguez-García MI, Villalba M, Rivas G, Rodríguez R (2005) An olive pollen protein with allergenic activity, Ole e 10, defines a novel family of carbohydrate-binding modules and is potentially implicated in pollen germination. Biochem J 390: 77–84

Barrett T, Edgar R (2006) Gene Expression Omnibus: microarray data storage, submission, retrieval, and analysis. Methods Enzymol 411: 352–369

Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242

Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433–438

Burks AW, King N, Bannon GA (1999) Modification of a major peanut allergen leads to loss of IgE binding. Int Arch Allergy Immunol 118: 313–314

Cabanos C, Tandang-Silvas MR, Odijk V, Brostedt P, Tanaka A, Utsumi S, Maruyama N (2010) Expression, purification, cross-reactivity and homology modeling of peanut profilin. Protein Expr Purif 73: 36–45

Cao P, Jung KH, Choi D, Hwang D, Zhu J, Ronald PC (2012) The Rice Oligonucleotide Array Database: an atlas of rice gene expression. Rice (N Y) 5: 17

Choi D, Cho H, Lee Y (2006) Expansins: expanding importance in plant growth and development. Physiol Plant 126: 511–518

Cosgrove DJ, Bedinger P, Durachko DM (1997) Group I allergens of grass pollen as cell wall-loosening agents. Proc Natl Acad Sci USA 94: 6559–6564

Cosgrove DJ (2000) Loosening of plant cell walls by expansins. Nature 407: 321–326

Cui X, Lv Y, Chen ML, Nikoloski Z, Twell D, Zhang DB (2015) Young genes out of the male: an insight from evolutionary age analysis of the pollen transcriptome. Mol Plant 8: 935–945

Dai S, Li L, Chen T, Chong K, Xue Y, Wang T (2006) Proteomic analyses of Oryza sativa mature pollen reveal novel proteins associated with pollen germination and tube growth. Proteomics 6: 2504–2529

D’Amato G, Cecchi L, Bonini S, Nunes C, Annesi-Maesano I, Behrendt H, Liccardi G, Popov T, van Cauwenberge P (2007) Allergenic pollen and pollen allergy in Europe. Allergy 62: 976–990

Emberlin J (2009) Grass, tree, and weed pollen. In AB Kay, AP Kaplan, J Bousquet, PG Holt, eds, Allergy and Allergic Diseases, Ed 2. Wiley-Blackwell, Oxford, pp 942–962

Esch RE, Klapper DG (1989) Identification and localization of allergenic determinants on grass group I antigens using monoclonal antibodies. J Immunol 142: 179–184

Esteve C, Montealegre C, Marina ML, Garcia MC (2012) Analysis of olive allergens. Talanta 92: 1–14

FAO/WHO (2003) Report of a Joint FAO/WHO Expert Consultation on Allergenicity of Foods Derived from Biotechnology. FAO/WHO, Rome

Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. (2014) Pfam: the protein families database. Nucleic Acids Res 42: D222–D230

Flicker S, Steinberger P, Ball T, Krauth MT, Verdino P, Valent P, Almo S, Valenta R (2006) Spatial clustering of the IgE epitopes on the major timothy grass pollen allergen Phl p 1: importance for allergenic activity. J Allergy Clin Immunol 117: 1336–1343

Gadermaier G, Dedic A, Obermeyer G, Frank S, Himly M, Ferreira F (2004) Biology of weed pollen allergens. Curr Allergy Asthma Rep 4: 391–400

Gibbon BC, Zonia LE, Kovar DR, Hussey PJ, Staiger CJ (1998) Pollen profilin function depends on interaction with proline-rich motifs. Plant Cell 10: 981–993

Grote M (1999) In situ localization of pollen allergens by immunogold electron microscopy: allergens at unexpected sites. Int Arch Allergy Immunol 118: 1–6

Grote M, Vrtala S, Valenta R (1993) Monitoring of two allergens, Bet v I and profilin, in dry and rehydrated birch pollen by immunogold electron microscopy and immunoblotting. J Histochem Cytochem 41: 745–750

Gruehn S, Suphioglu C, O’Hehir RE, Volkmann D (2003) Molecular cloning and characterization of hazel pollen protein (70 kD) as a luminal binding protein (BiP): a novel cross-reactive plant allergen. Int Arch Allergy Immunol 131: 91–100

Hiller KM, Esch RE, Klapper DG (1997) Mapping of an allergenically important determinant of grass group I allergens. J Allergy Clin Immunol 100: 335–340

Hirano K, Hino S, Oshima K, Okajima T, Nadano D, Urisu A, Takaiwa F, Matsuda T (2013) Allergenic potential of rice-pollen proteins: expression, immuno-cross reactivity and IgE-binding. J Biochem 154: 195–205

Holmes-Davis R, Tanaka CK, Vensel WH, Hurkman WJ, McCormick S (2005) Proteome mapping of mature pollen of Arabidopsis thaliana. Proteomics 5: 4864–4884

Hrabina, Peltre G, Van Ree R, Moingeon (2008) Grass pollen allergens. Clin Exp Allergy Rev 3: 7–11

Hurst LD (2002) The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet 18: 486

Jimenez-Lopez JC, Kotchoni SO, Rodriguez-Garcia MI, Alche JD (2012) Structure and functional features of olive pollen pectin methylesterase using homology modeling and molecular docking methods. J Mol Model 18: 4965–4984

Johnson WE, Li C, Rabinovic A (2007) Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8: 118–127

Kawahara Y, de la Bastide M, Hamilton JP, Kanamori H, McCombie WR, Ouyang S, Schwartz DC, Tanaka T, Wu J, Zhou S, et al. (2013) Improvement of the Oryza sativa Nipponbare reference genome using next generation sequence and optical map data. Rice (N Y) 6: 4

Lakhssassi N, Doblas VG, Rosado A, Esteban del Valle A, Pose D, Jimenez AJ, Castillo AG, Valpuesta V, Borsani O, Botella MA (2012) The Arabidopsis tetratricopeptide thioredoxin-like gene family is required for osmotic stress tolerance and male sporogenesis. Plant Physiol 158: 1252–1266

Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, et al. (2012) The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res 40: D1202–D1210

Li LC, Bedinger PA, Volk C, Jones AD, Cosgrove DJ (2003) Purification and characterization of four β-expansins (Zea m 1 isoforms) from maize pollen. Plant Physiol 132: 2073–2085

Liu X, Qu X, Jiang Y, Chang M, Zhang R, Wu Y, Fu Y, Huang S (2015) Profilin regulates apical actin polymerization to control polarized pollen tube growth. Mol Plant 8: 1694–1709

Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155

Mahler V, Fischer S, Heiss S, Duchêne M, Kraft D, Valenta R (2001) cDNA cloning and characterization of a cross-reactive birch pollen allergen: identification as a pectin esterase. Int Arch Allergy Immunol 124: 64–66

Mari A, Scala E, Palazzo P, Ridolfi S, Zennaro D, Carabella G (2006) Bioinformatics applied to allergy: allergen databases, from collecting sequence information to data integration. The Allergome platform as a model. Cell Immunol 244: 97–100

Mi H, Muruganujan A, Thomas PD (2013) PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res 41: D377–D386

Mori T, Yokoyama M, Komiyama N, Okano M, Kino K (1999) Purification, identification, and cDNA cloning of Cha o 2, the second major allergen of Japanese cypress pollen. Biochem Biophys Res Commun 263: 166–171

Mothes N, Valenta R (2004) Biology of tree pollen allergens. Curr Allergy Asthma Rep 4: 384–390

Musidlowska-Persson A, Alm R, Emanuelsson C (2007) Cloning and sequencing of the Bet v 1-homologous allergen Fra a 1 in strawberry (Fragaria ananassa) shows the presence of an intron and little variability in amino acid sequence. Mol Immunol 44: 1245–1252

Nei M (1969) Gene duplication and nucleotide substitution in evolution. Nature 221: 40–42

Noir S, Bräutigam A, Colby T, Schmidt J, Panstruga R (2005) A reference map of the Arabidopsis thaliana mature pollen proteome. Biochem Biophys Res Commun 337: 1257–1266

Pastorello EA, Farioli L, Pravettoni V, Ispano M, Scibola E, Trambaioli C, Giuffrida MG, Ansaloni R, Godovac-Zimmermann J, Conti A, et al. (2000) The maize major allergen, which is responsible for food-induced allergic reactions, is a lipid transfer protein. J Allergy Clin Immunol 106: 744–751

Pawankar R, Canonica GW, Holgate ST, Lockey RF, Blaiss M (2013) The WAO White Book on Allergy (Update 2013). Wisconsin World Allergy Organization, Milwaukee, WI

Pepper SD, Saunders EK, Edwards LE, Wilson CL, Miller CJ (2007) The utility of MAS5 expression summary and detection call algorithms. BMC Bioinformatics 8: 273

Petersen A, Schramm G, Bufe A, Schlaak M, Becker WM (1995) Structural investigations of the major allergen Phl p I on the complementary DNA and protein levels. J Allergy Clin Immunol 95: 987–994

Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE (2004) UCSF Chimera: a visualization system for exploratory research and analysis. J Comput Chem 25: 1605–1612

Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K (2012) i-ADHoRe 3.0: fast and sensitive detection of genomic homology in extremely large data sets. Nucleic Acids Res 40: e11

Qin Y, Leydon AR, Manziello A, Pandey R, Mount D, Denic S, Vasic B, Johnson MA, Palanivelu R (2009) Penetration of the stigma and style elicits a novel transcriptome in pollen tubes, pointing to genes critical for growth in a pistil. PLoS Genet 5: e1000621

Radauer C, Breiteneder H (2006) Pollen allergens are restricted to few protein families and show distinct patterns of species distribution. J Allergy Clin Immunol 117: 141–147

Radauer C, Breiteneder H (2007) Evolutionary biology of plant food allergens. J Allergy Clin Immunol 120: 518–525

Radauer C, Bublin M, Wagner S, Mari A, Breiteneder H (2008) Allergens are distributed into few protein families and possess a restricted number of biochemical functions. J Allergy Clin Immunol 121: 847–852.e7

Radauer C, Guhslc E, Bublin M, Breiteneder H (2012) Pollen allergens differ from nonallergenic pollen proteins by their lower extent of evolutionary conservation. World Allergy Organ J (Suppl 2) 5: S23

Radauer C, Willerroider M, Fuchs H, Hoffmann-Sommergruber K, Thalhamer J, Ferreira F, Scheiner O, Breiteneder H (2006) Cross-reactive and species-specific immunoglobulin E epitopes of plant profilins: an experimental and structure-based analysis. Clin Exp Allergy 36: 920–929

Ratnaparkhe MB, Lee TH, Tan X, Wang X, Li J, Kim C, Rainville LK, Lemke C, Compton RO, Robertson J, et al. (2014) Comparative and evolutionary analysis of major peanut allergen gene families. Genome Biol Evol 6: 2468–2488

Robert X, Gouet P (2014) Deciphering key features in protein structures with the new ENDscript server. Nucleic Acids Res 42: W320–W324

Saha S, Raghava GPS (2006) AlgPred: prediction of allergenic proteins and mapping of IgE epitopes. Nucleic Acids Res 34: W202–W209

Sakai H, Lee SS, Tanaka T, Numa H, Kim J, Kawahara Y, Wakimoto H, Yang CC, Iwamoto M, Abe T, et al. (2013) Rice Annotation Project Database (RAP-DB): an integrative and interactive database for rice genomics. Plant Cell Physiol 54: e6

Salamanca G, Rodriguez R, Quiralte J, Moreno C, Pascual CY, Barber D, Villalba M (2010) Pectin methylesterases of pollen tissue, a major allergen in olive tree. FEBS J 277: 2729–2739

Sampedro J, Guttman M, Li LC, Cosgrove DJ (2015) Evolutionary divergence of β-expansin structure and function in grasses parallels emergence of distinctive primary cell wall traits. Plant J 81: 108–120

Sander I, Rozynek P, Rihs HP, van Kampen V, Chew FT, Lee WS, Kotschy-Lang N, Merget R, Bruning T, Raulf-Heimsoth M (2011) Multiple wheat flour allergens and cross-reactive carbohydrate determinants bind IgE in baker's asthma. Allergy 66: 1208–1215

Sheoran IS, Sproule KA, Olson DJH, Ross ARS, Sawhney VK (2006) Proteome profile and functional classification of proteins in Arabidopsis thaliana (Landsberg erecta) mature pollen. Sex Plant Reprod 19: 185–196

Shi J, Cui M, Yang L, Kim YJ, Zhang D (2015) Genetic and biochemical mechanisms of pollen wall development. Trends Plant Sci 20: 741–753

Sievers F, Higgins DG (2014) Clustal Omega, accurate alignment of very large numbers of sequences. Methods Mol Biol 1079: 105–116

Soeria-Atmadja D, Lundell T, Gustafsson MG, Hammerling U (2006) Computational detection of allergenic proteins attains a new level of accuracy with in silico variable-length peptide extraction and machine learning. Nucleic Acids Res 34: 3779–3793

Songnuan W (2013) Wind-pollination and the roles of pollen allergenic proteins. Asian Pac J Allergy Immunol 31: 261–270

Stadler MB, Stadler BM (2003) Allergenicity prediction by protein sequence. FASEB J 17: 1141–1143

Sturn A, Quackenbush J, Trajanoski Z (2002) Genesis: cluster analysis of microarray data. Bioinformatics 18: 207–208

Suck R, Petersen A, Hagen S, Cromwell O, Becker WM, Fiebig H (2000) Complementary DNA cloning and expression of a newly recognized high molecular mass allergen Phl p 13 from timothy grass pollen (Phleum pratense). Clin Exp Allergy 30: 324–332

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30: 2725–2729

Tang H, Bowers JE, Wang X, Paterson AH (2010) Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proc Natl Acad Sci USA 107: 472–477

Thimm O, Bläsing O, Gibon Y, Nagel A, Meyer S, Krüger P, Selbig J, Müller LA, Rhee SY, Stitt M (2004) MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes. Plant J 37: 914–939

Thoma S, Kaneko Y, Somerville C (1993) A non-specific lipid transfer protein from Arabidopsis is a cell wall protein. Plant J 3: 427–436

Thorn KS, Christensen HEM, Shigeta R, Huddler D, Shalaby L, Lindberg U, Chua NH, Schutt CE (1997) The crystal structure of a major allergen from plants. Structure 5: 19–32

Toufighi K, Brady SM, Austin R, Ly E, Provart NJ (2005) The Botany Array Resource: e-northerns, expression angling, and promoter analyses. Plant J 43: 153–163

UniProt Consortium (2014) UniProt: a hub for protein information. Nucleic Acids Res 43: D204–D212

Valdivia ER, Stephenson AG, Durachko DM, Cosgrove DJ (2009) Class B beta-expansins are needed for pollen separation and stigma penetration. Sex Plant Reprod 22: 141–152

Valdivia ER, Wu Y, Li LC, Cosgrove DJ, Stephenson AG (2007) A group-1 grass pollen allergen influences the outcome of pollen competition in maize. PLoS ONE 2: e154

Valenta R, Duchene M, Ebner C, Valent P, Sillaber C, Deviller P, Ferreira F, Tejkl M, Edelmann H, Kraft D, et al. (1992) Profilins constitute a novel family of functional plant pan-allergens. J Exp Med 175: 377–385

Van Bel M, Proost S, Wischnitzki E, Movahedi S, Scheerlinck C, Van de Peer Y, Vandepoele K (2012) Dissecting plant genomes with the PLAZA comparative genomics platform. Plant Physiol 158: 590–600

Vieths S, Scheurer S, Ballmer-Weber B (2002) Current understanding of cross-reactivity of food allergens and pollen. Ann N Y Acad Sci 964: 47–68

Wang D, Xia Y, Li X, Hou L, Yu J (2013a) The Rice Genome Knowledgebase (RGKbase): an annotation database for rice comparative genomics and evolutionary biology. Nucleic Acids Res 41: D1199–D1205

Wang D, Zhang Y, Zhang Z, Zhu J, Yu J (2010) KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinformatics 8: 77–80

Wang DP, Wan HL, Zhang S, Yu J (2009) γ-MYN: a new algorithm for estimating Ka and Ks with consideration of variable substitution rates. Biol Direct 4: 20

Wang J, Yu Y, Zhao Y, Zhang D, Li J (2013b) Evaluation and integration of existing methods for computational prediction of allergens. BMC Bioinformatics (Suppl 4) 14: S1

Wang J, Zhang D, Li J (2013c) PREAL: prediction of allergenic protein by maximum relevance minimum redundancy (mRMR) feature selection. BMC Syst Biol (Suppl 5) 7: S9

Wang Y, Fu TJ, Howard A, Kothary MH, McHugh TH, Zhang Y (2013d) Crystal structure of peanut (Arachis hypogaea) allergen Ara h 5. J Agric Food Chem 61: 1573–1578

Wang Z, Xie W, Chi F, Li C (2005) Identification of non-specific lipid transfer protein-1 as a calmodulin-binding protein in Arabidopsis. FEBS Lett 579: 1683–1687

Wei LQ, Xu WY, Deng ZY, Su Z, Xue Y, Wang T (2010) Genome-scale analysis and comparison of gene expression profiles in developing and germinated pollen in Oryza sativa. BMC Genomics 11: 338

Xu H, Theerakulpisut P, Goulding N, Suphioglu C, Singh MB, Bhalla PL (1995) Cloning, expression and immunological characterization of Ory s 1, the major allergen of rice pollen. Gene 164: 255–259

Yennawar NH, Li LC, Dudzinski DM, Tabuchi A, Cosgrove DJ (2006) Crystal structure and activities of EXPB1 (Zea m 1), a beta-expansin and group-1 pollen allergen from maize. Proc Natl Acad Sci USA 103: 14664–14671

Zhang D, Liang W, Yin C, Zong J, Gu F, Zhang D (2010) OsC6, encoding a lipid transfer protein, is required for postmeiotic anther development in rice. Plant Physiol 154: 149–162

Zhang Z, Xiao J, Wu J, Zhang H, Liu G, Wang X, Dai L (2012) ParaAT: a parallel tool for constructing multiple protein-coding DNA alignments. Biochem Biophys Res Commun 419: 779–781

Author notes

1 This work was supported by The National Key Technologies Research and Development Program of China (2016YFD0100804); the National Natural Science Foundation of China (grant nos. 31370026, 31570312, J1210047, and 31110103915); the China Innovative Research Team, Ministry of Education, and the Programme of Introducing Talents of Discipline to Universities (111 Project, grant no. B14016); the Chun-Tsung Program of Shanghai Jiao Tong University; the Innovative Research Team in University of the Ministry of Education of China; the Innovative Research Team in University of the Ministry of Science and Technology of China; the School of Agriculture, Food, and Wine, University of Adelaide (start-up grant to D.Z.); and the Australian Research Council (grant no. FT130100525 to I.S.).

2 These authors contributed equally to the article.

* Address correspondence to zhangdb@sjtu.edu.cn.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Dabing Zhang (zhangdb@sjtu.edu.cn).

M.C., J.X., R.K., and D.D. carried out data analysis; J.X. and D.Z. conceived the study, supervised the work, and analyzed the data; J.S., D.Z., and I.S. participated in project discussions and wrote the article.

[OPEN] Articles can be viewed without a subscription.

© The Author(s) 2016. Published by Oxford University Press on behalf of American Society of Plant Biologists. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
