High-Throughput Sequencing Identified Multiple Fig Viruses and Viroids Associated with Fig Mosaic Disease in Iraq
Abstract
Mosaic is the most common viral disease affecting fig plants. Although the Fig mosaic virus is the leading cause of mosaic disease, other viruses are also involved. High-throughput sequencing was used to assess viral infections in fig plants with mosaic. The genomic DNA and total RNAseq of mosaic-symptomatic fig leaves were sequenced using the Illumina platform. The analysis revealed the presence of fig badnavirus 1 (FBV-1), grapevine badnavirus 1 (GBV-1), citrus exocortis viroid (CEVd), and apple dimple fruit viroid (ADFVd). The FBV-1 and GBV-1 sequences were 7,140 bp and 7,239 bp long, respectively. The two genomes encode one open reading frame containing five major protein domains. The viroids, CEVd and ADFVd, were 397 bp and 305 bp long. Phylogenetic analyses revealed a close relationship between FBV-1 and Iranian isolates of the same species, while GBV-1 was closely related to Russian grapevine badnavirus isolates (Tem64, Blu17, KDH48, and Pal9). CEVd was closely related to other Iraqi isolates, while ADFVd was strongly related to a Spanish isolate. A registered endogenous pararetrovirus, caulimovirus-Fca1, with a size of 7,556 bp, was found in the RNA transcripts with a low expression level. This integrant was also detected in the genomes of the two lines ‘Horaishi’ (a female line) and ‘Caprifig 6085’ (a male line). Phylogenetic analyses revealed that caulimovirus-Fca1 was distinct from two other clades of different endogenous virus genera.
Humans have planted the fig (Ficus carica) tree since the dawn of civilization due to its nutritional benefits (Ciarmiello et al., 2015). Fig mosaic disease (FMD) does not receive enough attention from farmers despite its severe adverse effects on fig trees (Walia et al., 2009). Fifteen viruses from different taxonomic groups of the families Caulimoviridae, Closteroviridae, Flexiviridae, Partitiviridae, and Tymoviridae and three viroids have been detected in mosaic-affected fig trees associated with FMD, probably influencing the severity of symptoms and damage (Elbeaino et al., 2010; Martelli, 2009). The main registered viruses were fig leaf mottle-associated virus 1 (FLMaV-1), fig leaf mottle-associated virus 2 (FLMaV-2) (Caglar et al., 2011), fig latent virus 1 (FLV-1), fig mild mottle-associated virus (FMMaV), Arkansas fig closterovirus 1 and 2 (AFCV-1 and -2), and fig cryptic virus (FCV) (Al-Kaeath et al., 2021; Elçi et al., 2012). The viroids infecting fig plants were apple dimple fruit viroid (ADFVd), hop stunt viroid (Elbeaino et al., 2012), and citrus exocortis viroid (CEVd) (Yakoubi et al., 2007). The FMD symptoms are mainly attributed to the Fig mosaic virus (FMV), genus Emaravirus, family Fimoviridae (Preising et al., 2021). Fig trees in many growing regions worldwide are infected with multiple viruses. Fig trees in Syria and Tunisia were infected with six and seven viruses, respectively, FMV being the most common (El-Air et al., 2015; Elbeaino et al., 2012). In Tunisia, FLV-1 was identified in symptomatic and asymptomatic trees in all examined regions. Fig trees in Iran had three viruses: FLV-1 (dominant), FLMaV-1, and FMV (Shahmirzaie et al., 2012). Several viruses were identified in fig trees in Lebanon, Egypt, and western Saudi Arabia, most commonly FLMaV-1 and FMV (Aldhebiani et al., 2015; Elbeaino et al., 2010; Elbeshehy and Elbeaino, 2011).
Studies conducted in the USA, Iran, and Iraq reported fig badnavirus 1 (FBV-1) associated with FMV in mosaic-affected fig trees (Alishiri et al., 2016; Jamous et al., 2020; Laney et al., 2012; Preising et al., 2021; Zagier et al., 2021). Badnaviruses are widely distributed among fruit and ornamental plants in tropical and temperate regions of Africa, Asia, Australia, Europe, and South and North America. Their genomes comprise a single molecule of non-covalently closed circular double-stranded DNA, ranging from 7 to 9 kbp (Bhat et al., 2016). Significant advances in genome sequencing techniques have allowed the discovery of endogenous pararetroviruses (EPRVs) in various plants. Most of these viruses belong to the Caulimoviridae family and its genera, such as Caulimovirus, Badnavirus, Petuvirus, and Florendovirus (Alisawi, 2019). High-throughput sequencing (HTS) techniques have contributed greatly to genomic research by providing extensive information on host genome sequences. Even without prior suspicion of a particular pathogen, HTS can detect pathogenic elements in an unbiased manner (Adams et al., 2009). Unlike other techniques focusing on a single suspected virus, HTS provides a comprehensive list of viruses involved in complex infections (Jones et al., 2017). HTS with bioinformatics is useful in identifying unexpected pathogenic viruses and providing accurate statistics on their copy number and genome proportions (Goodrich et al., 2016; Huggett et al., 2015). Mixed viral infections decrease plant health and yield (Valverde et al., 2007; Wintermantel et al., 2008). Such infections have caught virologists' attention because of their economic importance and their highly complicated aetiologies (Naidu et al., 2014, 2015). The severity of the negative effects depends on the interaction between the infecting viruses and the host. Unrelated viruses often interact synergistically, facilitating each other's action.
The beneficiary viruses accumulate more in host plants, resulting in more severe symptoms than individual viruses (Syller, 2012; Tatineni et al., 2014). Several synergistic interactions have been described in tobacco plants, the best described involving Potato virus Y (PVY) and Potato virus X (PVX) (Rochow and Ross, 1955; Vance, 1991). Synergistic interactions likely result from the suppression, by viral proteins, of a host defense mechanism based on RNA silencing (Carrington et al., 2001; Ratcliff et al., 1999). However, some intriguing questions remain unanswered. The consequences of viral interactions on host defenses, particularly coinfection with two unrelated viruses, are more serious than the impacts of individual viruses (Mandadi and Scholthof, 2012; Tatineni et al., 2014). Syller (2012) states that the relationship between related plant viruses is usually antagonistic (competitive). Molecular mechanisms underlying these interactions are less understood than those of synergistic interactions, and studies aiming to elucidate their molecular basis remain insufficient. Cross-protection, also known as homologous interference or super-infection exclusion, is a relatively well-known antagonistic interaction (Bergua et al., 2014; Folimonova, 2012, 2013; Gutiérrez et al., 2012; Julve et al., 2013). An infection with a primary/protecting virus can prevent or interfere with the subsequent infection by a homologous secondary/challenge virus (DaPalma et al., 2010; Gal-On and Shiboleth, 2006; González-Jara et al., 2009; Ziebell and Carr, 2010). This study aimed to understand the etiology of FMD in Iraq. It was conducted to identify the major viruses affecting local fig trees in a mixed infection associated with FMD and to determine their presence as pathogenic and integrated elements using the Illumina platform and bioinformatics techniques.
Materials and Methods
Virus survey and plant material
Mosaic-symptomatic leaves (Fig. 1) were collected from fig trees for total nucleic acid extraction. The infected leaves were cut into 0.5 × 0.5 cm squares, immersed in five volumes of RNAlater in a single Eppendorf tube, and sent to the DNA Link company in the Republic of Korea. The total nucleic acids were extracted following the manufacturer's instructions. For DNA extraction, the young infected leaves were treated with cetyl-trimethylammonium bromide (CTAB), as described by Doyle and Doyle (1990), with minor modifications. These modifications included a 45-minute incubation in preheated CTAB buffer and the subsequent spinning down of the precipitated DNA. Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions.
High-throughput sequencing
The libraries were prepared at the DNA Link company, Republic of Korea, using the TruSeq DNA and TruSeq total RNA library preparation kits (Illumina, San Diego, CA, USA). DNA was sequenced on the NovaSeq 6000 platform (Illumina) with 2 × 150 bp reads and the WGS (PCR Free550) application, following the manufacturer's guidelines. The quality of the RNA sample was assessed using a 2100 Expert Bioanalyzer manufactured by Agilent Technologies (Santa Clara, CA, USA). Subsequently, the RNA sample was sequenced on the NovaSeq 6000 with 2 × 101 bp paired-end reads to obtain the total RNA sequence. The raw DNA and RNA reads were processed with Trimmomatic v 0.40 and then with the BBDuk tool in Geneious Prime to trim low-quality reads and produce clean, high-quality reads. Genome sequencing coverage was determined by multiplying the number of FASTQ reads by the read length (151 bp) and dividing the product by the genome size of Ficus carica (366.34 Mb) (Bao et al., 2023).
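The coverage calculation described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name is hypothetical, and the read count is the raw DNA read total reported in the Results (139,317,226), so the value it prints is only an estimate under that assumption.

```python
def sequencing_depth(n_reads: int, read_length_bp: int, genome_size_bp: float) -> float:
    """Estimated depth = (number of reads x read length) / genome size."""
    return n_reads * read_length_bp / genome_size_bp


# Values taken from the text; using the raw (untrimmed) read count is an assumption.
DNA_READS = 139_317_226       # whole DNA reads reported in the Results
READ_LENGTH_BP = 151          # read length stated in the Methods
FIG_GENOME_BP = 366.34e6      # Ficus carica genome size (Bao et al., 2023)

depth = sequencing_depth(DNA_READS, READ_LENGTH_BP, FIG_GENOME_BP)
print(f"Estimated sequencing depth: {depth:.1f}x")
```

With these inputs the estimate comes to roughly 57-fold coverage of the fig genome.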
Map to reference
Geneious and Geneious RNA mappers (medium-low sensitivity) were used to map DNA and RNAseq data to the reference sequences, with up to five iterations. RNA raw reads were mapped with Geneious Prime v. 11 to the predicted sequences of fig viruses, and the consensus sequence was extracted. A representative sequence (76,145,671 nt) was created by concatenating all plant virus sequences (5,040 elements) and mapped to the whole RNA reads. RepeatExplorer was also used to map the extracted EPRVs against whole DNA reads. The outcomes were presented in a report containing the number of assembled reads and the total number of reads used. The DNA data were used to calculate copy numbers (number of assembled reads × read length/reference sequence length) and genome proportions (number of assembled reads/number of total HTS reads × 100) of the EPRVs (Mustafa et al., 2018). A blastn search for the EPRV was conducted over two chromosome-scale genome sequences, for ‘Horaishi’ (a female line) and ‘Caprifig 6085’ (a male line), to find virus hits on the chromosomes of the fig lines.
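The two formulas above can be written out as a short sketch. The function names are hypothetical; the example inputs are the FBV-1 and GBV-1 DNA read counts and consensus lengths reported in the Results, and the total read count is assumed to be the raw DNA read total.

```python
def copy_number(assembled_reads: int, read_length_bp: int, reference_length_bp: int) -> float:
    """Copy number = assembled reads x read length / reference sequence length."""
    return assembled_reads * read_length_bp / reference_length_bp


def genome_proportion(assembled_reads: int, total_reads: int) -> float:
    """Genome proportion (%) = assembled reads / total HTS reads x 100."""
    return assembled_reads / total_reads * 100


TOTAL_DNA_READS = 139_317_226  # assumption: raw DNA read total from the Results
READ_LENGTH_BP = 151

# (assembled DNA reads, consensus length in bp), as reported in the Results
elements = {"FBV-1": (5_439, 7_140), "GBV-1": (5_390, 7_239)}

for name, (reads, length) in elements.items():
    cn = copy_number(reads, READ_LENGTH_BP, length)
    gp = genome_proportion(reads, TOTAL_DNA_READS)
    print(f"{name}: copy number ~{cn:.0f}, genome proportion ~{gp:.4f}%")
```

Under these assumptions the sketch gives copy numbers of about 115 and 112, in line with the values reported for the two badnaviruses.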
Phylogenetic analysis
The maximum likelihood method with 1,000 bootstrap replicates was employed for phylogenetic analysis. Alignment and subsequent optimization were conducted manually using Geneious Prime v. 11 (http://www.geneious.com) (Kearse et al., 2012). Next, ClustalW alignment was implemented to extract sequences of all aligned lengths. The program MrBayes 3.2.6 (Huelsenbeck and Ronquist, 2001) was used for Bayesian inference of phylogeny. Fifteen badnavirus sequences were used in the phylogeny study of FBV-1 and FBV-2. Eighteen CEVd and 15 ADFVd sequences were used to construct the phylogenetic tree of each viroid. Further, 24 EPRVs were used to build the phylogenetic tree of fig pararetroviruses.
Results
HTS analyses
The Illumina platform generated total RNAseq data comprising 39,933,818 short reads of 101 bases (SRR25146190) and whole DNA reads comprising 139,317,226 reads of 151 bases (SRR27800017). BBDuk removed a total of 3,498 and 19,038 reads from the DNA and RNA data, respectively, leaving only clean, high-quality reads. All RNA and DNA reads were paired and mapped against suspected virus and viroid genomes using Geneious software. In addition to confirming the presence of FMV, whose sequence was reproduced, 1,193 RNA reads and 5,439 DNA reads were assembled separately to the FBV-1 sequence and produced a consensus sequence of 7,140 bp (Fig. 2). The genome proportion was 0.003%, and the copy number was 115. The genome has three open reading frames (ORFs): two short (432 and 408 nucleotides) and one long (5,514 nucleotides). The nucleotide identity with the reference virus genome FBV-1 isolate Iraq (MW522617) was 99.8%, while the amino acid identities for the three ORFs were 99.3%, 100%, and 99.8%, respectively. The sequence displayed an open reading frame containing five major protein domains (DUF1319, Smc, RT_LTR, RNase_HI_RT_Ty3, and RT_RNaseH_2). The complete virus sequence, named FBV-1 Hilla, was deposited in GenBank under the accession number OR619560. Furthermore, 439 RNA and 5,390 DNA reads were mapped separately to the GBV-1 sequence, producing a 7,239 bp consensus sequence. The genome proportion was 0.003%, and the copy number was 112. The genome has four ORFs: two short ORFs at the start (ORF1 and ORF2, 432 and 408 nt, respectively), one short ORF at the end (ORF4, 450 nt), and one long ORF in the middle (ORF3, 5,601 nt). The nucleotide identity with the reference virus genome GBV-1 isolate Blu17 (OP087316) was 99.8%, while the amino acid identities were 100% for each of ORFs 1, 2, and 4 and 99.4% for ORF3.
The ORF3 in GBV-1 matched that of GBV-1 isolate Blu17 in nucleotide and amino acid length, with 5,601 nt and 1,866 residues, respectively. However, the ORF4 was longer than that of GBV-1 isolate Blu17 with 189 nt. Further, the nucleotide identity with FBV-2 isolate Kashagar (MW842908) was 97.7%, while the amino acid identities for ORFs 1, 2, and 4 were 99.3%, 99.3%, and 100.0%, respectively, and that for ORF3 was 99.1%. In the FBV-2 isolates Korla, Kashagar, and Atushi, the ORF3 was shorter than in GBV-1, with 4,440 nt encoding 1,479 residues. The sequence displayed an open reading frame containing five major protein domains (DUF1319, SbcC, RT_LTR, RVT_1RNase_HI_RT_Ty3, and ZnF_C2HC). The complete virus sequence, GBV-1Iraq, was deposited in GenBank under the accession number OR619561 (Fig. 3). CEVd was identified in 5,468 RNA reads. Its 397 bp sequence was deposited in GenBank under the accession number OR024670, and the isolate was named Najaf. ADFVd was also found in 741 RNA reads; the 305 bp-long isolate was named Kufa and deposited in GenBank under the accession number OR024671. The phylogenetic analysis indicated a close relationship between FBV-1 Hilla and the Iraqi and Iranian isolates of the same species, with 99.9% identity, suggesting a common ancestor (Fig. 4A). GBV-1Iraq was registered in Iraq for the first time and was phylogenetically closely related to Russian isolates of grapevine badna FI virus isolate Tem64, isolate Blu17, isolate KDH48, and isolate Pal9, with 99.9% identity (Fig. 4B). CEVd was closely related to other Iraqi isolates recently discovered in Iraq and Tikrit (Fig. 5A). Interestingly, ADFVd was found for the first time in Iraq and was strongly related to the Spanish isolate OP807948 (Fig. 5B).
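The relation between the ORF3 nucleotide lengths and the residue counts reported above can be checked with a short sketch. It assumes, as the figures suggest but the text does not state, that each reported ORF length includes the terminal stop codon; the function name is hypothetical.

```python
def orf_residues(orf_length_nt: int) -> int:
    """Number of encoded residues, assuming the ORF length includes one stop codon."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return orf_length_nt // 3 - 1  # subtract the stop codon (assumption)


# ORF3 lengths (nt) reported in the text
print("GBV-1 ORF3:", orf_residues(5_601), "residues")
print("FBV-2 ORF3:", orf_residues(4_440), "residues")
```

Under this assumption, 5,601 nt corresponds to 1,866 residues and 4,440 nt to 1,479 residues, matching the values given for GBV-1 and the FBV-2 isolates.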
EPRVs
The RepeatExplorer pipeline identified 225 clusters of different repetitive elements, but only cluster 51 contained an EPRV. The extracted sequence of cluster 51 was subjected to six runs of mapping, which produced a 7,556 nt-long sequence of the integrated virus, named Caulimovirus-FCa1 following Repbase dataset regulation. The sequence encodes five protein domains: RNase-H, RT, RT-LTR, RVT-1, and MP. The genome proportion was 0.05%, and the copy number was 1,408 in the DNA reads (Fig. 6). However, the endogenous element was poorly represented in the RNA transcripts, as only 128 reads were assembled. The phylogenetic analysis revealed that Caulimovirus-FCa1 was distinct from other caulimovirus clusters and was placed among the Florendovirus, Wendovirus, and Caulimovirus clades (Fig. 7). Moreover, Caulimovirus-FCa1 was also detected through a blastn search in the genomes of the two lines ‘Horaishi’ (a female line) and ‘Caprifig 6085’ (a male line). In Caprifig 6085, there were 144 hits in multiple chromosome positions: 27 hits in chr.1, 23 hits in chr.2, one hit in chr.3, 9 hits in chr.4, 12 hits in chr.5, 5 hits in chr.6, 27 hits in chr.7, 3 hits in chr.8, 5 hits in chr.9, 17 hits in chr.10, 6 hits in chr.11, and 9 hits in chr.12. The hit lengths varied from 156 to 7,401 bp, and the pairwise identity ranged from 67% to 94.9%. Horaishi had 95 hits in multiple chromosome positions: 30 hits in chr.1, 8 hits in chr.2, 3 hits in chr.3, 17 hits in chr.4, one hit in chr.5, 3 hits in chr.6, 4 hits in chr.8, 10 hits in chr.9, 10 hits in chr.10, 7 hits in chr.11, and 2 hits in chr.12. The hit lengths ranged from 152 to 7,401 bp, and the pairwise identity varied from 67% to 96.1%. The assembled reads of the detected viruses and viroids in the total DNA and RNA reads indicated nearly equal numbers of the two badnaviruses in DNA, with a higher presence of FBV-1 in RNA.
CEVd was more abundant in RNA than ADFVd, whereas Caulimovirus-FCa1 was abundant in DNA but nearly silent in RNA (Fig. 8).
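The per-chromosome blastn hit counts reported above can be tallied with a short sketch to confirm the totals and to list the chromosomes without hits. The dictionary and function names are hypothetical, and the chromosome number of 13 is taken from the Discussion rather than from the blastn output itself.

```python
# Per-chromosome blastn hit counts for Caulimovirus-FCa1, as reported in the text.
CAPRIFIG_6085 = {1: 27, 2: 23, 3: 1, 4: 9, 5: 12, 6: 5,
                 7: 27, 8: 3, 9: 5, 10: 17, 11: 6, 12: 9}
HORAISHI = {1: 30, 2: 8, 3: 3, 4: 17, 5: 1, 6: 3,
            8: 4, 9: 10, 10: 10, 11: 7, 12: 2}

N_CHROMOSOMES = 13  # assumption: chromosome number cited in the Discussion


def summarize(hits: dict[int, int]) -> tuple[int, list[int]]:
    """Return the total number of hits and the chromosomes with no reported hit."""
    total = sum(hits.values())
    missing = sorted(set(range(1, N_CHROMOSOMES + 1)) - set(hits))
    return total, missing


for line, hits in (("Caprifig 6085", CAPRIFIG_6085), ("Horaishi", HORAISHI)):
    total, missing = summarize(hits)
    print(f"{line}: {total} hits on {len(hits)} chromosomes; no hits on chr. {missing}")
```

The tallies reproduce the reported totals of 144 and 95 hits, on 12 and 11 chromosomes, respectively.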
Discussion
In recent years, the incidence of FMD in Iraq has increased, but little research has been conducted on its causal agents. Earlier studies on FMV focused on its symptoms, since it was the only virus suspected of infecting figs (Mohammed et al., 2019). However, along with FMV, other viruses and virus-like agents are suspected of contributing to the etiology of FMD, including FBV-1, with which FMV is often found in mixed infections (Jamous et al., 2020; Preising et al., 2021; Zagier et al., 2021). According to studies conducted in Iran and the USA, FBV-1 reduces fig tree vigor (Alishiri et al., 2016; Laney et al., 2012). Isolates from Iran have been divided into two groups and four subgroups based on their geographic origin, while those from America have been divided into three groups (Alishiri et al., 2018). So far, the complete genomes of eight FBV-1 isolates from Iran and the USA and five GBV-1 isolates from Russia have been deposited in the GenBank database. This study is the second report on FBV-1 and the first on GBV-1 in fig plants in Iraq. The two badnaviruses were found in DNA with nearly equal numbers of assembled reads, genome proportions, and copy numbers, although the transcription level of FBV-1 was higher than that of GBV-1. Higher plants such as figs are also infected by viroids, which are small, circular, non-coding RNAs (246–401 nt). Viroid-derived small RNAs (sRNAs) are generated by host Dicer-like enzymes, which are involved in RNA silencing pathways, and accumulate in plant tissues as a sign of ongoing infection (Chiumenti et al., 2014). The presence of CEVd in infected figs reveals an additional agent that may contribute to the severity of FMD. CEVd has recently been registered in multiple hosts, such as lettuce (ON993891) and onion (OR589765), from different regions in Iraq.
Interestingly, this study registered ADFVd for the first time in Iraq, despite its low copy number, highlighting the hidden role of this viroid in fruit infections that may seriously threaten quality and yield. Further, ADFVd was detected in apples bearing mild symptoms of dapple apple. Japanese isolates differed from those found in China and Italy figs. In graft-inoculation experiments, the symptoms varied among cultivars but were virtually identical to those reported in Italy. The symptoms caused by ADFVd were similar to those of apple fruit crinkle viroid and apple scar skin viroid, suggesting that these viroids cannot be distinguished based on symptoms alone (Kasai et al., 2017). In Iraq, more work should be conducted on other fruit trees, such as apple, pear, and pomegranate, that display typical symptoms of suspected ADFVd infection. Moreover, novel EPRVs have recently been identified in various plant species. The exact function of these elements remains unclear, and more work is required to draw a complete picture. Petunia vein-clearing virus, Banana streak virus, and Tobacco vein-clearing virus are examples of integrants that can arise from host genomes to become infective factors (Richert-Pöggeler et al., 2003). Like Caulimovirus-FCa1, which displayed weak transcription, other pararetroviruses are also expressed at low levels, such as those found in eggplants (Khaffajah et al., 2022). The transcription of caulimoviruses and florendoviruses in petunia genomes has been reported by Alisawi (2019). Those integrants are expressed differently depending on the host effect and specificity (Geering et al., 2014; Hansen et al., 2005). Interestingly, based on the blastn search, Caulimovirus-FCa1 was found fragmented over fig chromosomes with various sequence lengths and identities. The integrant was present in 12 out of 13 chromosomes in Caprifig 6085 and 11 out of 13 chromosomes in Horaishi (Falistocco, 2009).
It is noteworthy that Caprifig 6085 is considered the wild and ancestral form of Horaishi, the female line (Valdeyron and Lloyd, 1979), which supports the view that an ancient integration occurred before speciation events, as reported in previous studies of other plant genomes (Chen et al., 2014). This research topic must be extended in such hosts to investigate integrant interaction, episomal virus expression, and host specificity. Probably dating back 1.6 billion years (Richert-Pöggeler et al., 2021), pararetroviruses, taxon Caulimoviridae, represent the transition from an RNA to a DNA world and are typical of retroelements equipped with reverse transcriptases. In contrast to nuclear DNA replication, viral DNA synthesis occurs in the cytoplasm rather than in the nucleus after viral genome transcription. Pararetrovirus evolution led to horizontal transmission and episomal replication through RNA recombination between ancestral genomic retroelements and exogenous RNA viruses (Richert-Pöggeler et al., 2021). Since most EPRVs are partial or contain rearranged sequences or inactivating mutations, they are transcriptionally or translationally inactive. EPRV clusters are often formed when several virus copies are integrated simultaneously in tandem or consorted (Richert-Pöggeler et al., 2003). These integrated sequences are occasionally transcriptionally active, producing RNAs that function as precursors for extrachromosomal viral DNA and leading to systemic infection and vertical transmission (Gayral et al., 2008; Hohn et al., 2008). Viral promoters can activate transcription within the integrated element, and plant promoters may activate transcription surrounding EPRV sequences (Kuriyama et al., 2020; Lockhart et al., 2000). On the other hand, EPRV-derived RNAs can also induce RNA interference and gene silencing mechanisms by generating small interfering RNAs (Bertsch et al., 2009; Ricciuti et al., 2021).
In fig trees, Caulimovirus-FCa1 was the first EPRV reported; it probably shares a role with pathogenic viruses in strengthening the infection. More efforts are required to clarify such interaction. Crops worldwide suffer from huge economic losses due to viral diseases, and their management is a big challenge for growers and researchers. Mixed infection occurs when more than one virus is present in a single plant, causing varying symptoms simultaneously. Understanding the etiology of a disease when multiple viruses are present remains challenging. The latent nature of viral diseases and the low severity of symptoms make most of them go unnoticed. Disease symptoms can be more severe when these viruses cause infection in conjunction with other viruses at a particular time. To successfully manage viral diseases, particularly mixed infections, detecting and identifying the plant viruses causing the disease is imperative. Several approaches, such as next-generation sequencing/HTS, have been developed (Singhal et al., 2021) to detect mixed viral infections. HTS has become indispensable for analyzing plant virus diversity, as it is a powerful tool for identifying mixed infections; it allows the identification of all viruses present in plant samples without previous sequence information (Adams et al., 2009; Al Rwahnih et al., 2009; Donaire et al., 2009; Kreuze et al., 2009; Villamor et al., 2019). In mixed infections, several types of synergistic or antagonistic interactions occur between and among the viruses, and these can exacerbate the disease, producing more severe symptoms than single infections (Singhal et al., 2021). Synergistic interactions are usually described as mixed infections of viruses resulting in more severe symptoms (Syller, 2012). Tatineni et al. (2022) examined the effects of infection with different combinations of four viruses in wheat. Symptoms were observed, and viral RNA and coat protein levels were measured, to assess the outcome of infection.
Some virus combinations exhibited stronger symptoms without increasing virus titers. Field-grown crops exhibit complex antagonistic and synergistic interactions between viruses. From staple crop studies to investigations of important cash crops to the intricate synergistic effects on the tripartite interactions between viruses, plants, and vectors, these studies present various perspectives on current research on mixed infections of plant viruses in nature. A deep understanding of the mechanisms of mixed infections is crucial for developing effective and stable control strategies for viral pathogenesis and evolution (Xu et al., 2022). This study conducted the first genome search in Iraqi fruit trees to summarize most of the causal agents found in diseased tissues. Additional work is needed to clarify their interactions. Notably, this study gives an indication of the impact of each virus within a mixed infection based on its abundance in infected tissues. More research applying whole-genome sequencing and bioinformatics techniques is needed to evaluate the role of each virus and viroid and the antagonistic and synergistic interactions among them.
This study revealed the presence of two badnaviruses and two viroids associated with FMD in Iraq using whole-genome sequencing and bioinformatics techniques. A novel EPRV was also detected in the examined fig genome, RNA transcripts, and also the related ancestor lines. This finding confirms these integrants’ role in genome biology and activity alongside pathogenic viruses and viroids, which remain poorly understood. Phylogenetic analysis confirmed the relationships between the agents examined and their common ancestry.
Notes
Conflicts of Interest
No potential conflict of interest relevant to this article was reported.