Friday, March 24, 2023

Phylogenomic analysis of Wolbachia genomes from the Darwin Tree of Life biodiversity genomics project


The Darwin Tree of Life (DToL) project aims to sequence all described terrestrial and aquatic eukaryotic species found in Britain and Ireland. Reference genome sequences are generated from single individuals for each target species. In addition to the target genome, sequenced samples often contain genetic material from microbiomes, endosymbionts, parasites, and other cobionts. Wolbachia endosymbiotic bacteria are found in a diversity of terrestrial arthropods and nematodes, with supergroups A and B the most common in insects. We identified and assembled 110 complete Wolbachia genomes from 93 host species spanning 92 families by filtering data from 368 insect species generated by the DToL project. From 15 infected species, we assembled more than one Wolbachia genome, including cases where individuals carried simultaneous supergroup A and B infections. Different insect orders had distinct patterns of infection, with Lepidopteran hosts mostly infected with supergroup B, while infections in Diptera and Hymenoptera were dominated by A-type Wolbachia. Other than these large-scale order-level associations, host and Wolbachia phylogenies revealed no (or very limited) cophylogeny. This points to the occurrence of frequent host switching events, including between insect orders, in the evolutionary history of the Wolbachia pandemic. While supergroup A and B genomes had distinct GC% and GC skew, and B genomes had a larger core gene set and tended to be longer, it was the abundance of copies of bacteriophage WO that was a strong determinant of Wolbachia genome size. Mining raw genome data generated for reference genome assemblies is a robust way of identifying and analysing cobiont genomes and giving greater ecological context for their hosts.


The natural world is a complex web of interactions between living species. These interactions can be mutualistic, commensal, pathogenic, parasitic, predatory, or inconsequential, but each individual lives alongside a rich diversity of cobionts. Most eukaryotes associate intimately with a specific microbiota and are commonly infected by a range of microbial and other pathogens. For some microbial associates, the distinction between mutualism and pathogenicity or parasitism is fuzzy. For example, Wolbachia (Proteobacteria; Alphaproteobacteria; Rickettsiales; Anaplasmataceae; Wolbachieae) are found living intracellularly in a range of terrestrial arthropods and nematodes. No free-living Wolbachia are known: the association is essential for their survival. In contrast, infection with Wolbachia can be beneficial to hosts but is not usually essential.

Wolbachia were first identified as mosquito endobacteria that were maternally transmitted through the oocyte and that induced a range of reproductive manipulations on their hosts [1,2]. The most common manipulation by Wolbachia is to induce cytoplasmic incompatibility (CI). Under CI, infected females are able to mate productively with all males, but uninfected females are only able to mate with uninfected males (as mating with CI-inducing Wolbachia-infected males results in zygotic death). This asymmetry in fitness can drive the spread of the CI-inducing Wolbachia. Other reproductive manipulations include feminisation of genetic males [3], male killing [4], and induction of parthenogenesis in females [5]. All these manipulations promote the transmission of infected oocytes to the next host generation and thus boost the spread of Wolbachia. In most species that can be infected, populations are a mix of infected and infection-free individuals, and hosts can evolve to resist infection [6,7]. While Wolbachia are often described as reproductive parasites, association with Wolbachia can sometimes have beneficial effects, providing nutritional supplementation to phloem-feeding Hemiptera [8] and enhancing host immunity to viruses and Plasmodium parasites [9]. Indeed, the host immunity-boosting phenotype may explain the initial spread of Wolbachia in previously uninfected populations. In nematodes, elimination of Wolbachia induces host sterility, and antibiotic treatment is an effective addition to pharmacological treatment of human-infecting, Wolbachia-positive filarial nematodes [10].

Wolbachia infection of terrestrial arthropods is very common, with nearly half of all insect species predicted to be infected [11]. Wolbachia can be classified using molecular phylogenetic analyses into a series of supergroups [12,13]. Supergroups C, D, and J are found only in filarial nematodes; supergroups E and F are found in both nematodes and insects; and supergroups A, B, and S (and others for which full genome data are not available) are found only in arthropods. Supergroups A and B are the most common Wolbachia found in terrestrial insects.

Analysis of Wolbachia biology has been expanded by the determination of genome sequences for many isolates. The genome sequences for Wolbachia from over 90 host species are publicly available, and mining of host genomic raw sequence data identified a large number of additional partial genomes [14,15]. This understanding, that cobiont genomes can be assembled from the “contamination” present in the data generated for a target host, has been especially useful for the unculturable Wolbachia. We now have the opportunity to survey for the presence of Wolbachia genomes at an unprecedented scale, as the Darwin Tree of Life (DToL) project aims to sequence all described terrestrial and aquatic eukaryotic species found in Britain and Ireland [16]. This project is using high-accuracy long-read and chromatin conformation long-range sequencing to generate and release publicly available chromosomal genome assemblies, meeting exacting standards of contiguity and completeness, for thousands of protists, fungi, plants, and animals. Several hundred terrestrial arthropod assemblies are already available (https://portal.darwintreeoflife.org). The DToL project sequences genomes from individual, wild-caught specimens of target species, and thus will also generate data for the cobiome present in each specimen at the time of sampling. For many smaller-bodied insects, the whole organism is extracted. Where Wolbachia disseminates widely within an organism, it is inevitable that cobiont genomes will be sequenced alongside the host genome.

Using k-mer classification tools, it is possible to efficiently and correctly separate cobiont data from that of the host and to deliver clean host assemblies [17–19]. The cobiont data are then available for independent assembly and analysis. Here, we present a survey of the first 368 terrestrial arthropod genome datasets produced in DToL for the presence of Wolbachia and assemble over 100 new Wolbachia genomes. We use these to explore patterns and processes in bacterial genome evolution and coevolution of Wolbachia with its hosts and with its own bacteriophage parasites. Lepidopteran hosts were mostly infected with supergroup B, while infections in Diptera and Hymenoptera were mainly caused by A-type Wolbachia. However, host and Wolbachia phylogenies revealed no (or very limited) cophylogeny. We show that while B genomes tended to be longer than supergroup A genomes, genome size in Wolbachia is correlated with the level of integration of its double-stranded DNA bacteriophage WO.
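The idea behind k-mer classification can be illustrated with a toy sketch. It is not the method of the tools cited in [17–19]: the function names, the 50% threshold, and the use of a plain in-memory set of reference k-mers are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_read(read, cobiont_kmers, k, min_fraction=0.5):
    """Label a read by the fraction of its k-mers found in a cobiont reference.

    min_fraction is a hypothetical threshold, not a published parameter.
    """
    read_kmers = kmers(read, k)
    if not read_kmers:
        return "unclassified"  # read shorter than k
    shared = len(read_kmers & cobiont_kmers) / len(read_kmers)
    return "cobiont" if shared >= min_fraction else "host"


# Toy reference standing in for a set of Wolbachia k-mers.
reference = kmers("ACGTACGTGG", 4)
print(classify_read("ACGTACGT", reference, 4))
print(classify_read("TTTTTTTT", reference, 4))
```

Production classifiers use much longer k-mers, compact indexes, and taxonomic databases; the sketch only shows why long, accurate reads that share many exact k-mers with a reference can be binned confidently.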


Screening a diverse set of insect genome data for Wolbachia infections

We screened raw genomic sequence data and primary assemblies for 368 insect species (204 Lepidoptera, 61 Diptera, 52 Hymenoptera, 24 Coleoptera, 9 Hemiptera, 5 Trichoptera, 4 Orthoptera, 3 Ephemeroptera, 3 Plecoptera, 2 Odonata, and 1 Neuroptera) generated by DToL for the presence of Wolbachia (S1 Table) using the small subunit ribosomal RNA (SSU rRNA) as a marker gene. Wolbachia SSU sequences were detected in 111 (30%) of the species. This level of infection is not reflective of total incidence, the proportion of host species susceptible to infection, as only one individual was analysed for each taxon screened. Wolbachia prevalence, the proportion of infected individuals in a population, and infection intensity vary between species and between populations within a species [20,21]. Therefore, the true incidence of infection across the insect biota surveyed by DToL is likely much higher. Nevertheless, the measured incidence of infection is similar to previous survey-based estimates (approximately 22%) [22,23] but, as expected, is lower than estimates deploying mathematical models to account for sampling bias (40% to 50%) [11,24]. Infection incidence was lower in Coleoptera (4/24, 17%) compared to Lepidoptera (55/204, 27%), Diptera (21/61, 34%), and Hymenoptera (23/52, 44%) (Fig 1A).
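The per-order percentages can be recomputed from the reported counts. The sketch below also attaches a Wilson score interval to each proportion to show how much uncertainty small samples such as Coleoptera carry; the interval is an illustrative addition and is not an analysis reported in the text.

```python
import math

# Wolbachia-positive species / screened species, as reported in the text.
COUNTS = {
    "Coleoptera": (4, 24),
    "Lepidoptera": (55, 204),
    "Diptera": (21, 61),
    "Hymenoptera": (23, 52),
    "All orders": (111, 368),
}


def incidence_percent(positive, total):
    """Observed incidence as a percentage."""
    return 100.0 * positive / total


def wilson_interval(positive, total, z=1.96):
    """Wilson score interval (as proportions) for a binomial proportion."""
    p = positive / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


for order, (pos, n) in COUNTS.items():
    low, high = wilson_interval(pos, n)
    print(f"{order}: {incidence_percent(pos, n):.0f}% "
          f"(95% interval {100 * low:.0f}% to {100 * high:.0f}%)")
```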


Fig 1. Prevalence and relative abundance of Wolbachia in DToL insect genomes.

(A, B) Prevalence of Wolbachia in insect hosts, split by taxonomic order (A) and by sex (B). The cladogram of insect ordinal relationships is based on Misof and colleagues [28]. Orders with more than 10 analysed species are shown in bold. Silhouettes are from PhyloPic (http://phylopic.org/). Sex of insects was classified as F (female), M (male), or U (unknown, where not recorded on collection). The data underlying this Figure can be found in S1 Data. (C, D) The estimated number of Wolbachia genomes per copy of the host nuclear genome split by taxonomic order (C) and by sex (D). The data underlying this Figure can be found in S1 Data.


Although maternal inheritance requires that Wolbachia are predominantly localised in the germline, tropism to somatic cell types has been shown to be highly regulated during host development [25,26]. We did not observe a bias in infection level by analysed tissue type (S1 Fig), or by sex, with an equal prevalence of infection in samples identified as female (39/138, 28%) and male (45/153, 29%) (Fig 1B). While the DToL project aims to sequence eukaryotes from across Britain and Ireland, 82% of the samples screened were sampled from the Wytham Woods Ecological Observatory, Oxfordshire (https://www.wythamwoods.ox.ac.uk/) [27]. No correlation between sampling location and infection level was detected, with 29% of all samples collected in Wytham Woods being Wolbachia positive, reflective of the overall incidence level (S2 Fig).

The DToL species were sequenced using the PacBio Sequel II HiFi highly accurate long-read platform, producing consensus raw reads of 10 to 20 kb with base-level accuracy of >99% (approximately Q30 to Q40). These long, accurate reads are ideal for assembly, particularly for bacterial genomes where the information content per base is higher than in repeat-rich eukaryotes. The average sequence length of HiFi reads identified as being derived from Wolbachia was 12 kb, indistinguishable from host HiFi reads. We separated and assembled all Wolbachia reads in each positive sample and screened these assemblies to identify complete genomes. We generated 110 complete genomes, from 93 species, of which 77 were circular (S2 Table). The average completeness of these genomes, assessed using BUSCO, was 99.3%, with a mean duplication level of 0.37%. The mean genome size of the new genomes was 1.47 Mb, which is significantly larger than the average genome size of public Wolbachia genomes (1.32 Mb; Wilcoxon rank sum test, p-value = 4.576 × 10⁻⁹) (S3 Fig). This is likely because it is possible to assemble across repeated loci (such as integrated Wolbachia phage) with the long, accurate HiFi reads. The mean number of contigs generated for the 33 genomes that could not be circularised was 2.12 (ranging from 1 to 6).
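The Q values quoted for HiFi reads follow the standard Phred definition, Q = −10 log10(p), where p is the per-base error probability. A minimal sketch of the conversion (the function names are illustrative):

```python
import math


def phred_to_error(q):
    """Per-base error probability for a Phred quality score."""
    return 10 ** (-q / 10)


def error_to_phred(p):
    """Phred quality score for a per-base error probability."""
    return -10 * math.log10(p)


for q in (20, 30, 40):
    p = phred_to_error(q)
    print(f"Q{q}: one error in {round(1 / p):,} bases, accuracy {100 * (1 - p):.2f}%")
```

On this scale Q30 corresponds to one error in 1,000 bases and Q40 to one error in 10,000.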

The dataset includes the first complete circular Wolbachia genomes assembled from two insect orders, Odonata (dragonflies and damselflies) and Orthoptera (grasshoppers and crickets). Both Odonata species surveyed harboured Wolbachia (Fig 1A). The largest circular Wolbachia genome generated, 2.19 Mb, was isolated from the blue-tailed damselfly. This is the longest complete Wolbachia genome yet reported (S3 Fig). Although in most samples infection by only a single Wolbachia strain was detected, 15 of 93 specimens (16%) were infected with at least two Wolbachia strains. Within Phalera bucephala (Lepidoptera) and Lasioglossum morio (Hymenoptera), three genomes were assembled, while all other coinfections involved two strains.

Having chromosomally complete insect host genomes, as well as complete Wolbachia genomes, allows for the estimation of the relative number of Wolbachia genomes per host genome. Wolbachia proliferation seems to be tightly controlled, and a relative abundance below ten Wolbachia genomes per host nuclear genome was observed in most infected hosts. Particularly high abundances were observed in Thymelicus sylvestris and Athalia cordata (48 and 47 Wolbachia per host genome, respectively) (S2 Table) (Fig 1C). The mean relative abundance in different taxonomic orders lay between 3 and 12, except for the two grasshoppers (Orthoptera), Chorthippus brunneus and Chorthippus parallelus, which have 33 and 20 Wolbachia genome copies per host genome, respectively (Fig 1C). No significant difference in relative Wolbachia abundance was observed between host sexes (Fig 1D), with both females and males having a mean between nine and ten copies.
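This section does not state how relative abundance was calculated. One plausible approach, assumed here purely for illustration, is the ratio of mean sequencing depth over the Wolbachia assembly to mean depth over the host nuclear assembly. All names and numbers in the sketch are hypothetical.

```python
def mean_depth(total_mapped_bases, assembly_length):
    """Mean sequencing depth: mapped bases divided by assembly span."""
    return total_mapped_bases / assembly_length


def relative_abundance(wol_bases, wol_length, host_bases, host_length):
    """Wolbachia genome copies per copy of the host nuclear genome.

    Assumes depth is proportional to genome copy number in the sample.
    """
    return mean_depth(wol_bases, wol_length) / mean_depth(host_bases, host_length)


# Hypothetical sample: a 1.5 Mb Wolbachia genome at 300x depth and a
# 400 Mb host genome at 30x depth.
print(relative_abundance(1.5e6 * 300, 1.5e6, 4e8 * 30, 4e8))
```

For a diploid host, depth over the assembled haploid reference counts both homologues, so whether the result is read as copies per haploid or per diploid genome depends on the convention chosen.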

Wolbachia phylogeny suggests frequent host switching events

We selected 93 high-contiguity and high-completeness Wolbachia genomes from the public INSDC databases, including genomes from Wolbachia infecting Nematoda (13 genomes), Arachnida (4), Isopoda (1), and several orders of Hexapoda (75) (S3 Table). Adding the 110 newly assembled genomes yielded a dataset of 203 high-quality assemblies. We annotated all protein-coding genes in these genomes using Prodigal [29] and clustered the predicted protein sets into orthologous groups using OrthoFinder2 [30]. The resulting 634 near-single-copy genes were used to infer a phylogeny of Wolbachia (Figs 2A and S4). From this phylogeny, we assigned each genome to the previously defined Wolbachia supergroups [12,13]. All newly assembled Wolbachia genomes belonged to either supergroup A or B. While Lepidoptera were predominantly infected with supergroup B Wolbachia (42/53, 80%), Wolbachia supergroup A was most frequent in all other insect orders (46/57, 81%). It has been previously observed that supergroup B is the most common Wolbachia type in Lepidoptera [22,31–33]. Of the 15 species where coinfections occurred, Endotricha flammealis, Phalera bucephala, Philonthus cognatus, Protocalliphora azurea, and Sphaerophoria taeniata were coinfected with strains from both A and B supergroups, and the other ten coinfections were of distinct strains within the same supergroup (S2 Table).


Fig 2. Wolbachia DToL genomes expand known phylogeny.

(A) Circular phylogeny of supergroup A and B Wolbachia, visualised with the root placed between the A and B supergroups and the remaining supergroups (C, D, E, F, J, S; nodes collapsed as grey wedge), highlighting newly sequenced genomes (black tip labels) and genomes from public databases (white). (B) Incongruence between host topology (left) and supergroup A and B Wolbachia topology (right) is shown as a tanglegram. An overview of the supergroups infecting various insect orders is given in a table (inset, bottom right). A red box is drawn to point to a host switching event; see panel C. (C) Example of a host switching event, where the Wolbachia of the hoverfly Eupeodes latifasciatus has high nucleotide sequence identity and genome colinearity to four Wolbachia genomes assembled from Lepidoptera.


Wolbachia generally do not show strict cophylogeny with their hosts [7,21]. This pattern was also observed when comparing host and Wolbachia phylogenies for the supergroup A and B genomes (Fig 2B). Closely related insect species may be infected by dissimilar Wolbachia strains, and, conversely, closely related Wolbachia can infect a diverse set of insects. For example, the Wolbachia strains infecting the hoverfly Eupeodes latifasciatus and four Lepidoptera (Pararge aegeria, Celastrina argiolus, Hylaea fasciaria, and Watsonalla binaria) (Fig 2C) share over 99% nucleotide identity. Although horizontal transmission seems to have been a dominant pattern in the evolutionary history of Wolbachia, the propensity of Lepidoptera to be infected by Wolbachia type B underlines the importance of distribution by cospeciation. Because most of our new samples came from a single site (Wytham Woods Genomic Observatory), we were also able to explore the horizontal transfer of Wolbachia between hosts in a local context. Wytham Woods–derived Wolbachia were no more likely to be related than any other Wolbachia subset (S5 Fig).

Intrinsic properties of Wolbachia distinguish supergroups

The completeness of the new genomes and, especially, the circular assemblies achieved for 77 of them permit analyses of genome properties that are not possible with fragmented and partial genomes. All circularised genomes, including those from public databases, were rotated to start at the presumed origin of replication. The average pairwise whole-genome nucleotide identity between all Wolbachia genomes ranged between 77.3% and 100.0%, with at least 92.8% and 93.5% identity within supergroups A and B, respectively (Fig 3A). The number of breakpoints interrupting pairwise whole-genome alignments was counted, normalised for the total alignable length, and compared to the average nucleotide identity (ANI) of the compared genomes (Fig 3A). A significant correlation was observed between nucleotide divergence and the number of breakpoints in supergroups A (0.90, p < 2.2 × 10⁻¹⁶, Spearman correlation) and B (0.69, p < 2.2 × 10⁻¹⁶, Spearman correlation) (Fig 3A). This broad range of nucleotide diversity, even within a supergroup, is indicative of the low level of conserved synteny within supergroups and the extent of rearrangements occurring.
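The comparison described above can be sketched in a few lines: breakpoint counts are normalised by aligned length, and a Spearman rank correlation is computed against divergence (100 minus ANI). The sketch is self-contained and uses toy values; the per-megabase normalisation unit and all numbers are assumptions for illustration, not values from the study.

```python
def breakpoints_per_mb(n_breakpoints, aligned_bp):
    """Breakpoint count normalised to one megabase of aligned sequence."""
    return n_breakpoints / (aligned_bp / 1e6)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy genome pairs: (ANI %, breakpoints, aligned bp). Values are invented.
pairs = [(99.5, 3, 1.2e6), (97.0, 20, 1.1e6), (95.0, 41, 1.0e6), (93.5, 38, 0.9e6)]
divergence = [100 - ani for ani, _, _ in pairs]
density = [breakpoints_per_mb(b, length) for _, b, length in pairs]
print(round(spearman(divergence, density), 2))
```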


Fig 3. Comparative genomics of Wolbachia.

(A) Whole-genome average nucleotide identity (ANI) plotted against the number of breakpoints in comparisons within A supergroup genomes, within B, between A and B, and between other supergroup Wolbachia. The data underlying this Figure can be found in S1 Data. (B) Index of skewness compared to GC content for all circularised Wolbachia genomes. The data underlying this Figure can be found in S1 Data.


Stable bacterial genomes accumulate more guanines than cytosines on the leading strand in the direction of replication. This phenomenon, GC skew, arises due to differential mutation pressures on leading versus lagging strands. Genomes that have undergone frequent rearrangement are expected to have lower overall GC skew, which can be summarised across the genome as a single metric, SkewI [34]. Genomes from supergroups A and B had distinct GC contents (Fig 3B), with supergroup A having a higher mean GC (35.2%, standard deviation 0.15%) compared to B (34.0%, standard deviation 0.16%) (two-sample t test, p-value < 2.2 × 10⁻¹⁶). Genomes from other supergroups had distinct GC content, often very different from A and B genomes, but as so few examples have been sequenced, general patterns are not discernible. In both A and B supergroups, SkewI values were relatively low, but genomes from Wolbachia from nematode hosts (C, D, J) had higher SkewI values (Fig 3B). A high degree of GC skew was previously reported in supergroup C Wolbachia strains infecting filarial nematodes [35], and these genomes also have low rearrangement levels and high gene-level synteny. In supergroups A and B, the low level of skew is associated with high levels of chromosomal rearrangement (Fig 3A).
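GC content and GC skew are simple base-composition statistics, with skew conventionally defined as (G − C)/(G + C). The sketch below computes both, plus skew in non-overlapping windows along a sequence. It does not implement the SkewI metric of [34], which summarises the windowed signal in its own way; the function names and window scheme are illustrative.

```python
def gc_content(seq):
    """Fraction of A/C/G/T bases that are G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    total = g + c + seq.count("A") + seq.count("T")
    return (g + c) / total if total else 0.0


def gc_skew(seq):
    """GC skew, (G - C) / (G + C), on the given strand."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windowed_skew(seq, window):
    """GC skew in consecutive non-overlapping windows."""
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


# Toy sequence whose skew flips sign halfway along, as it would at a
# replication origin or terminus in a genome with a strong, stable skew.
print(windowed_skew("GGGGCCCC", 4))
```

In a genome with few rearrangements the windowed skew keeps one sign along each replication arm; frequent rearrangement scrambles the signal, which is the pattern the paragraph above associates with supergroups A and B.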

Conservation and diversity in gene content of Wolbachia

Wolbachia, because they are sheltered within the cells of their hosts, may be relatively isolated from other bacteria and thus have somewhat closed pan-genomes. One route to acquisition and sharing of new genes is through the Wolbachia phage (WO phage), which, alongside the essential phage particle structural genes, carries a cargo of genes that have been implicated in host manipulation. We reannotated all 203 Wolbachia genomes with the same standard gene-finding toolkit, Prodigal, to normalise annotations. While this may have lost careful manual revision in previously determined gene sets, it avoids issues of data incompatibility. Gene number correlated with genome size, and the average gene number in the newly assembled set of supergroup A and B Wolbachia was larger than in A and B genomes from the public databases (S6 Fig). Comparing all genomes, the mean number of predicted genes was larger in supergroup B (1,467) compared to A (1,385).

We used OrthoFinder with default settings to define clusters of orthologous proteins across all Wolbachia genomes. Each genome contained between 0 and 184 novel, strain-specific genes (average 19). These novel genes were shorter than genes overall (average gene length overall was 875 nucleotides or approximately 290 amino acids, whereas novel genes averaged 434 nucleotides or approximately 145 amino acids). As expected, supergroups that were not well represented often contained more strain-specific genes. For example, wCfeT from supergroup E (which infects cat fleas, Ctenocephalides felis) uniquely encoded genes for pantothenate (panC-panG-panD-panB) [36] and thiamine (thiG-thiC) biosynthesis. However, of the ten genomes with the most strain-specific genes, seven belonged to either supergroup A or B. These novel genes were not preferentially associated with WO phage regions (S7 Fig), but the majority (78%) had annotations that associated them with transposon and mobile element function. This suggests that much of the novelty is associated with mobile elements other than WO phage, but we note that the expansion in gene number may be due to mobile element-driven pseudogenisation. Other than clusters with one or two members, the most frequently observed cluster sizes were 203 ± 2. These clusters contained the single-copy (and near-single-copy) orthologs deployed in phylogenetic analyses (Fig 4A). Overall, the majority of the proteins encoded in the Wolbachia genomes were members of orthology clusters that were present in at least 95% of all strains.
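Given a table of which genomes contain a member of each orthogroup, strain-specific and widely conserved groups can be separated with a presence threshold. The sketch assumes a simple dictionary from orthogroup name to the set of genomes containing it; this data structure and the names are hypothetical and do not reflect OrthoFinder's output format.

```python
def classify_orthogroups(presence, n_genomes, core_fraction=0.95):
    """Split orthogroups into strain-specific, core, and accessory sets.

    presence maps an orthogroup name to the set of genomes that carry it.
    """
    strain_specific, core, accessory = set(), set(), set()
    for name, genomes in presence.items():
        if len(genomes) == 1:
            strain_specific.add(name)
        elif len(genomes) / n_genomes >= core_fraction:
            core.add(name)
        else:
            accessory.add(name)
    return strain_specific, core, accessory


# Toy example with ten genomes named g0..g9.
all_genomes = {f"g{i}" for i in range(10)}
presence = {
    "OG_core": set(all_genomes),          # present in every genome
    "OG_patchy": all_genomes - {"g0"},    # present in 9 of 10 (90%)
    "OG_novel": {"g3"},                   # found in a single genome
}
specific, core, accessory = classify_orthogroups(presence, len(all_genomes))
print(sorted(specific), sorted(core), sorted(accessory))
```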


Fig 4. Exploration of Wolbachia protein-coding gene variety.

(A) Histogram of protein family size per supergroup. The data underlying this Figure can be found in S1 Data. (B) Rarefaction analysis of pan- and core proteomes of supergroups A and B, based on 500,000 random addition-order permutations of co-occurring orthogroups excluding novel genes. The data underlying this Figure can be found in S1 Data. (C) Synteny of the biotin cluster reveals conserved gene order and punctuated pattern of species presence (inset, species with biotin cluster present are highlighted with red circles).


The abundant sampling of supergroup A and B genomes allowed us to address and compare the sizes of the core- and pan-proteomes of these groups. The larger genome and proteome size found in supergroup B was reflected in a larger core proteome (Fig 4B), but supergroup A had a larger pan-proteome (Fig 4B). While the core proteomes differed, very few of the protein families that were part of each supergroup’s core proteome were unique to that supergroup. One supergroup-restricted set of protein families was found to comprise the operon for arginine transport (ArtM, ArtQ, and ArtP and the repressor of arginine degradation ArgR) [37], which was uniquely detected and conserved in supergroup A (present in 83/103 or 80% of all Wolbachia A genomes). Although the periplasmic arginine-specific binding protein (ArtI or ArtJ) was not detected, the presence of this ATP-binding cassette-type (ABC) transporter suggests that these Wolbachia are acquiring arginine from their hosts.
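A rarefaction analysis of pan- and core proteomes can be sketched as follows: genomes are added in random order, and after each addition the size of the union (pan) and the intersection (core) of their orthogroup sets is recorded, then averaged over many permutations. The sketch uses a toy input and far fewer permutations than the 500,000 used for Fig 4B; the function name and input structure are assumptions.

```python
import random


def rarefaction(genome_to_orthogroups, n_permutations=100, seed=0):
    """Mean pan and core sizes as genomes are added in random order.

    genome_to_orthogroups maps a genome name to its set of orthogroups.
    Returns (pan_curve, core_curve); element i covers i + 1 genomes.
    """
    rng = random.Random(seed)
    genomes = list(genome_to_orthogroups)
    n = len(genomes)
    pan_sum, core_sum = [0] * n, [0] * n
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        pan, core = set(), None
        for i, genome in enumerate(genomes):
            groups = genome_to_orthogroups[genome]
            pan |= groups                                   # union grows
            core = set(groups) if core is None else core & groups  # intersection shrinks
            pan_sum[i] += len(pan)
            core_sum[i] += len(core)
    return ([s / n_permutations for s in pan_sum],
            [s / n_permutations for s in core_sum])


toy = {"wA1": {1, 2, 3}, "wA2": {1, 2, 4}, "wA3": {1, 5}}
pan_curve, core_curve = rarefaction(toy)
print(pan_curve[-1], core_curve[-1])
```

A pan curve that keeps rising as genomes are added indicates an open pan-proteome, while the plateau of the core curve estimates the shared gene set.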

The operon producing biotin (vitamin B7) [38] was detected in seven of the 110 new genomes, all belonging to supergroup A (Fig 4C). One derived from Icerya purchasi (Hemiptera), and six were from Hymenoptera (two from Lasioglossum malachurum, which carried two strains, and single strains from three Andrena and a Nomada species). The biotin synthesis cluster has been described previously from a limited but diverse set of supergroups, including two A genomes from additional Nomada bee hosts. This distribution suggests possible ecological linkage [39], as Andrena bees are kleptoparasitised by Nomada cuckoo bees and phylogenetic analyses of both the biotin gene clusters and the Wolbachia core proteomes show close relationships between these clusters of genomes (S8 Fig). The gene cluster is strongly conserved in physical organisation of all six necessary genes (bioA-D,F,H). In the genomic region immediately surrounding the operon, we identified recombinase and transposase genes, as well as ankyrin repeat containing genes and toxin–antitoxin CI Cin gene pairs. In three genomes (from Andrena dorsata, Nomada fabriciana, and one of the L. malachurum strains), the operon was independently disrupted by transposases. The region containing the biotin operon thus has the hallmarks of a “virulence island” that may be mobile between genomes and may have accrued additional genes (ankyrin, Cin) that hitchhike with the biotin operon.

WO prophage insertions expand genome size

Wolbachia can itself be infected by double-stranded DNA temperate bacteriophages, WO phage, which can integrate in the genome of its host as a prophage. Four modules are necessary for construction and function of phage particles during the lytic stage: head, baseplate, tail, and fibre. Inserted and pseudogenised WO phage can be identified and discriminated based on the presence and completeness of these components. Regions of a Wolbachia genome flanked by WO phage modules are likely to form parts that are transduced by the phage during infection of new cells, “cargo” loci that form the eukaryotic association module (EAM) [40,41]. All the Wolbachia genomes were screened for prophage regions using essential module genes from previously annotated WO insertions (S4 Table). Prophage regions were deemed putatively complete when all four modules were observed with at least 80% of genes of each module present. An abundance of putative intact and pseudogenised WO phage was identified. For example, the supergroup B Wolbachia from Ischnura elegans (the blue-tailed damselfly; the largest Wolbachia genome assembled) contained three putative intact prophage and nine WO phage fragments (Fig 5A) summing to 0.8 Mb of the genome.
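The completeness rule stated above (all four modules present, each with at least 80% of its genes) can be expressed as a short check. The gene names below are placeholders, not the module gene lists of S4 Table, and the function is an illustrative sketch rather than the screening pipeline itself.

```python
MODULES = ("head", "baseplate", "tail", "fibre")


def is_putatively_complete(found, expected, min_fraction=0.8):
    """Apply the 80%-per-module rule to one candidate prophage region.

    found and expected map a module name to a set of gene names.
    """
    for module in MODULES:
        reference = expected[module]
        hits = found.get(module, set()) & reference
        if not reference or len(hits) / len(reference) < min_fraction:
            return False
    return True


# Placeholder reference: five hypothetical genes per module.
expected = {m: {f"{m}_gene{i}" for i in range(1, 6)} for m in MODULES}

# Region A carries four of five genes in every module (80% each).
region_a = {m: {f"{m}_gene{i}" for i in range(1, 5)} for m in MODULES}
# Region B is the same but has lost two tail genes (3 of 5, 60%).
region_b = dict(region_a, tail={"tail_gene1", "tail_gene2", "tail_gene3"})

print(is_putatively_complete(region_a, expected))
print(is_putatively_complete(region_b, expected))
```

Regions that fail the check would correspond to the fragmented or pseudogenised prophage described in the text.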


Fig 5. WO prophage in Wolbachia.

(A) Annotation of the WO prophage integrated in the genome of the Wolbachia strain infecting Ischnura elegans. (B) Wolbachia genome size is strongly correlated with integrated prophage span in supergroups with WO phage association. Phylogenetic generalised least squares (PGLS) analyses were performed to assess the correlation between prophage length and genome size in a phylogenetically aware manner. The data underlying this Figure can be found in S1 Data.


The fraction of total prophage region in each genome ranged from 0% to 38%. Nematode-associated Wolbachia typically are not infected by WO phage [42], and no prophage regions were detected in genomes of supergroups C, D, J, and nematode-infecting F (Fig 5B). A significant correlation was found between genome size and WO prophage span in supergroups A and B (Fig 5B). This association was robust to correction for phylogenetic relatedness of the genomes (model fit increased to 0.84 and 0.87, respectively, with p-values <10⁻¹⁶).

Toxins are often associated with mobile elements

We identified several potential cargo genes within intact and fragmented prophage. These included transposases and integrases associated with mobile elements, and other loci previously associated with eukaryotic manipulation, such as CI loci and ankyrin repeat containing genes, as expected from the EAM model [40,41].

Wolbachia produces a set of toxins [43] that can have dramatic effects on their hosts, such as CI. The CI phenotype is caused by two adjacent genes, CifA and CifB, which function as a toxin–antitoxin pair [44,45]. Phylogenetic analysis classified most Wolbachia Cif gene pairs into four types (I to IV) [46]. A fifth type (V) is much more variable in structure. The toxin component can have nuclease activity (in which case the gene pair is also known as CinA-CinB), deubiquitinase activity (CidA-CidB), or both (CndA-CndB) [47]. All type II, III, and IV pairs have nuclease domains, while all type I pairs have deubiquitinase domains and most also have nuclease domains [46]. Three hundred and five full-length and likely functional Cif pairs were detected in 140 of the 181 (77%) supergroup A and B genomes. One Cif pair was detected in most genomes, but many had several, with seven copies in the Wolbachia strain infecting the holly tortrix moth (Rhopobota naevana). Many of the gene pairs contained a deubiquitinase domain (type I, Cid; 87) or belonged to type V (90), while the other three types occurred in approximately equal proportions (II: 39, III: 44, IV: 34) (S9 and S10 Figs). Many pairs (213/305; 70%) were located in the predicted EAM of the prophage.

Loci encoding additional toxins such as RelE/RelB and latrotoxin were identified in multiple Wolbachia genomes, often in prophage regions (175/586 [30%] and 130/256 [51%] genes, respectively) (summarised in S5 Table). The Tc pore-forming toxin complex, which consists of two genes, TcA (S11 Fig) and TcB-C (S12 Fig), was detected in a limited number of A and B supergroup genomes and also showed a predisposition to occur within prophage (42/69 [61%] and 19/35 [54%], respectively). Additional toxin-encoding loci had restricted presence in different supergroups and were not associated with prophage regions. ParD/ParE (S13 and S14 Figs) only occurred in supergroups A, B, and E, and FIC (S15 Fig) only occurred in supergroups A, E, F, and S. The type IV toxin–antitoxin gene pair AbiEii/AbiGii-AbiEi, which protects against the spread of phage infection [48], was only detected in two genomes in supergroup E. It is noteworthy that these two genomes had very low levels of prophage-derived DNA (4.3% of their genome span).


Isolation of cobiont genomes, and especially Wolbachia genomes, from shotgun high-throughput sequencing data has been established for many years [49]. In the field of prokaryotic and eukaryotic microbial metagenomics, metagenome-assembled genomes (MAGs) are likely to be the only way to access many unculturable microbial genomes, even if the species they derive from are hyperabundant [50,51]. The abundance of raw sequencing data in the International Nucleotide Sequence Database Collaboration (INSDC) databases has been an attractive prospecting ground for microbial associates of eukaryotic target species. To date, most raw data available for such searches have been short reads from Illumina and other platforms. These reads are too short to partition efficiently into bins corresponding to putative distinct genomes. Preliminary assembly of such datasets is more likely to be able to separate cobionts from target genomes. These approaches have been applied to hunt for Wolbachia, with a recent tour de force producing nearly 1,200 Wolbachia MAGs from publicly available data [14]. However, these MAGs suffer from the expected issues of low completeness (due to low effective coverage), fragmentation (due to coverage and sequence repeat issues), undetected contamination, and inability to distinguish coinfecting strains. Moreover, the biased nature of public data meant that these derived from only 37 different host species.

We generated 110 Wolbachia assemblies from 368 terrestrial arthropod HiFi datasets, and 77 of these were fully circular genome assemblies. The genomes were uniformly of high completeness (S3 Fig). Because of the high intrinsic base quality of HiFi reads (Q30 to Q40; from one error in 1,000 to one error in 10,000), we were able to distinguish insertions of Wolbachia DNA into the host genome from true components of the Wolbachia genome and to independently assemble even closely related strains with confidence. As we were screening raw data from a biodiversity genomics programme that aims to sample a wide phylogenetic diversity of hosts, the new Wolbachia genomes presented here more than double the number of different host species from which Wolbachia genomes have been assembled. The assembled genomes include the first complete representatives isolated from Odonata (damselflies) and Orthoptera (crickets). In 16 additional datasets, we identified likely Wolbachia content but were not able to produce credible genome assemblies (see S1 Data and S2 Table). This was usually because the Wolbachia sequence was present at very low effective coverage (roughly 3-fold), but in some samples, no credible assembly was generated despite high coverage. These datasets may contain multiple recombining strains or large insertions in the host genome and deserve further exploration.
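The relationship between the quoted quality scores and error rates follows from the standard Phred definition, Q = −10·log10(p). The short sketch below is only an illustration of that arithmetic; the function names and the 15,000-base read length are assumptions chosen for the example, not values taken from the study.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q into a per-base error probability.

    Standard definition: Q = -10 * log10(p), so p = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def expected_errors(read_length: int, q: float) -> float:
    """Expected number of erroneous bases in a read of uniform quality Q."""
    return read_length * phred_to_error_probability(q)


if __name__ == "__main__":
    for q in (30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} bases")
    # Hypothetical 15 kb read, used here only to illustrate the scale.
    print(f"Expected errors in a 15,000-base read at Q30: {expected_errors(15_000, 30):.1f}")
```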

The distribution of Wolbachia in insect hosts is a function of the balance between retention by cospeciation (vertical transmission of Wolbachia to daughters of the host species), acquisition by horizontal transmission (where strains move between host species), and events of loss. Transmission among insect host species was the dominant pattern underpinning Wolbachia distribution. We note that previous work has suggested that horizontal transmission rather than cospeciation may even explain the presence of closely related Wolbachia infecting closely related taxa. For example, genomic divergence between closely related Wolbachia in sister Drosophila species was too low to be the product of independent evolution since the last common ancestor of the flies [52,53]. However, we identified two features of the distribution, one local and one general, which are of note. Lepidoptera were more likely to be infected with supergroup B Wolbachia than A, and Hymenoptera, Diptera, and Coleoptera were more likely to be infected with supergroup A strains. Multilocus sequence typing (MLST) has previously shown that supergroup B is the most common Wolbachia type in Lepidoptera [22,31–33]. This suggests some nonexclusive specialisation of Wolbachia on their hosts, which may be driven by the interaction of Wolbachia and host genetics and/or a distinct set of ecological transmission routes in each insect group. Many of our genomes derived from insects that were collected at one site, the Wytham Woods Genomic Observatory (S2 Fig), but this subset was no more closely related than other genomes from widely separated sites (S5 Fig). It is likely that the mobility of hosts, including by seasonal migration, means that sampling from one geographical site is a valid approximation of more global sampling.
Close ecological association between host species may promote sharing of Wolbachia isolates and localised genetic exchange, for example, within predator–prey systems. The close similarity of Wolbachia genomes from Andrena solitary bees and their Nomada cuckoo bee kleptoparasites (Fig 4C, inset) and the shared presence of the biotin synthesis operon (Fig 4C) may be a case of transmission within an ecological network. The presence of the biotin operon in Wolbachia of insects that mostly or exclusively feed on low-protein plant fluids (nectar or phloem) suggests that Wolbachia may be offering nutritional support to their hosts [54] and thus that this cluster of genomes may have been positively selected for their mutualist tendencies.

Wolbachia can promote the reproductive success of their female hosts [1,2], and thus their own Darwinian fitness, by reproductive manipulations such as CI. The loci underpinning CI are a diverse set of toxin–antitoxin gene pairs. Our survey of Wolbachia identified many additional CI gene pairs, mainly of the type I (Cid) kind and largely associated with WO phage. Many genomes had more than one toxin–antitoxin pair, and some individual hosts were infected with multiple Wolbachia strains carrying different CI gene pairs. These CI genes likely mediate conflict between Wolbachia strains, and the ecosystem of toxin deposition and rescue in individual zygotes must be complex [46,55,56]. Interestingly, we identified CI gene pairs next to 5 of the 14 biotin synthesis operons, suggesting that the mobile elements that transduce this presumably mutualist physiology are also engaged in CI conflict.

One striking feature of the genomes assembled from the HiFi reads was that their average span was roughly 10% greater than the average size of previously assembled Wolbachia genomes. As we also observed a correlation between the content of WO phage in the genome and genome size (Fig 5B), we speculate that the lower average size of earlier assemblies may be because the presence of near-identical segments of phage and other mobile elements led to collapse of repeats and artificial underestimation of true genome size. This underestimation of genome size may also have biased understanding of WO phage diversity and of the diversity of genes that can be transduced by the phage. WO phage carry genes necessary for production of phage particles and cargo genes that have been hypothesised to form an EAM [40,41]. The increased genome size and increased resolution of WO phage copies may also mean increased gene content and diversity and an increased set of common EAM loci. We estimated the pan-proteome of A and B supergroup strains and found that supergroup A had a larger pan-proteome but a smaller core proteome than supergroup B. Coupled with the observation of host-association bias between these supergroups, and other major genomic features such as GC proportion, this suggests that these divergent groups have followed very distinct evolutionary trajectories, despite evidence for transduction of loci between supergroups, and perhaps have evolved distinct physiologies and host-manipulation or host-cooperation strategies. We note that the ANI between A and B supergroup strains, and between strains from all supergroups, is relatively low (within-supergroup identity >93%, between-supergroup identity <88%).
This pattern of significant phylogenetic separation between supergroups suggests, as others have noted, that these supergroups have the features expected of bacterial species [37].

The DToL project [16] is one of a growing constellation of biodiversity genomics projects worldwide that, under the banner of the Earth BioGenome Project [57], intend to “sequence life for the future of life” (https://www.earthbiogenome.org). These projects, based around ecological, regional, or taxonomic lists of target species, will lay the foundations for biological research, bioindustry, and conservation for the next decades. While their focus is to generate reference genomes for eukaryotic species, these projects will also yield critical resources for the study of the microbial cobionts (mutualists, pathogens, parasites, and commensals) that live on and in eukaryotic organisms. Our understanding of Wolbachia and other common endosymbionts will thrive on a rich harvest of cobiont genomes from the tens to hundreds of thousands of host genomes that will be generated in the next decade. The assembly of 110 high-quality Wolbachia genomes shows the power of the long-read data now being generated and of the analytic approach that allowed these low-complexity metagenomes to be effectively separated into their constituent parts. Analysis of these genomes revealed that supergroups differ in their propensity to infect different insect orders, while simultaneously pointing to multiple host switching events during the course of the Wolbachia pandemic. Moreover, we observed that genome size in Wolbachia is correlated with the abundance of copies of bacteriophage WO.


Detection and assembly of Wolbachia genomes from DToL species data

DToL raw data are generated from whole or partial single specimens and thus contain sequence from any cobionts in or on the specimen at the time of sampling. We screened data for 368 insect genomes generated by the DToL project [16] for the presence of the intracellular endosymbiont Wolbachia (S1 Table) using a marker gene scan approach that searched for the SSU rRNA locus. The prokaryotic 16S rRNA alignment from RFAM (RF00177) [58] was transformed into a HMMER profile, and the profile was used to screen contigs with nhmmscan [59]. We defined a positive match as having an e-value <10^−150 or an aligned length of >1,000 nucleotides. Putative positive regions were extracted from the sequences and compared to the SILVA SSU database (version 138.1) [60] using sina [61]. Matches were filtered to retain only those with >90% identity. Taxonomic classification of each positive was determined via a consensus rule of 80% of the top 20 best hits, using both the NCBI [62] and SILVA [63] taxonomies.
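The screening thresholds and the consensus rule described above can be sketched in a few lines. This is an illustrative reading of the text, not the pipeline used in the study: the function names and the (taxon, identity) hit layout are hypothetical, and applying the identity filter before taking the top 20 hits is an assumption, since the order of those two steps is not stated.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

E_VALUE_MAX = 1e-150        # positive match if e-value is below this ...
MIN_ALIGNED_LENGTH = 1000   # ... or the aligned length exceeds this
MIN_IDENTITY_PCT = 90.0     # SILVA matches must exceed this identity


def is_positive_match(e_value: float, aligned_length: int) -> bool:
    """Apply the 'e-value OR aligned length' rule for a 16S profile hit."""
    return e_value < E_VALUE_MAX or aligned_length > MIN_ALIGNED_LENGTH


def consensus_taxon(
    hits: Sequence[Tuple[str, float]],
    top_n: int = 20,
    min_fraction: float = 0.8,
) -> Optional[str]:
    """Return the taxon supported by >= min_fraction of the top_n best hits.

    `hits` is assumed to be sorted best-first as (taxon, percent identity).
    Returns None when no taxon reaches the consensus threshold.
    """
    kept = [taxon for taxon, identity in hits if identity > MIN_IDENTITY_PCT][:top_n]
    if not kept:
        return None
    taxon, count = Counter(kept).most_common(1)[0]
    return taxon if count / len(kept) >= min_fraction else None
```

For example, 17 Wolbachia hits among the top 20 (85%) would yield a Wolbachia call under this sketch, whereas 15 of 20 (75%) would leave the region unclassified.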

For Wolbachia-positive samples, all PacBio HiFi reads were analysed using kraken2 [64] with a custom database consisting of a genome from a species closely related to the host, all RefSeq genomes of Anaplasmataceae, and reference genomes of additionally detected cobionts downloaded using NCBI datasets and masked using dustmasker [65]. Horizontal transfer of fragments of endosymbiont and organellar DNA to the nuclear genome is a common phenomenon. To avoid inadvertently classifying nuclear Wolbachia insertions (NUWTs) as deriving from an independent bacterial replicon, Wolbachia reads identified by kraken2 were mapped to the insect genome assembly, and only contigs fully covered by these reads were retained. The Wolbachia reads were also independently reassembled using several assembly tools: flye (version 2.9) (flye --pacbio-hifi {reads} -o {dir} -t {threads} --asm-coverage 50 --genome-size 1.6m --scaffold) [66], hifiasm (version 0.14) (hifiasm -o {prefix} -t {threads} {reads} -D 10 -l 1 -s 0.999) [67], and hifiasm-meta (version 0.1-r022) (hifiasm_meta -o {prefix} -t {threads} {reads} -l 1) [68]. The multiple assemblies generated for each sample were ranked based on their completeness using BUSCO version 5.2.2 [69] and the Rickettsiales_odb10 dataset, alignment to reference genomes using nucmer (version 4.0.0) [70], evenness of coverage, and circularity. The best (most complete, single-contig circular preferred) assembly per sample was selected. For samples where 10X Genomics Chromium data were available, polishing was carried out using FreeBayes-called variants [71] from 10X short reads aligned with LongRanger. The host origin, span, and completeness of all Wolbachia detected are presented in S2 Table.
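As a rough illustration of the "most complete, single-contig circular preferred" selection, the sketch below ranks candidate assemblies with a simple sort key. It is deliberately simplified: the class and field names are hypothetical, and the alignment-to-reference and evenness-of-coverage criteria mentioned above are left out because the text does not say how they were weighed.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AssemblySummary:
    assembler: str             # e.g. "flye", "hifiasm", "hifiasm-meta"
    busco_complete_pct: float  # BUSCO completeness (percent)
    n_contigs: int
    circular: bool


def rank_key(a: AssemblySummary) -> Tuple[float, bool, int]:
    """Smaller keys are better: completeness first, then single circular contig,
    then fewer contigs. This ordering is an assumption for illustration."""
    single_circular = a.circular and a.n_contigs == 1
    return (-a.busco_complete_pct, not single_circular, a.n_contigs)


def pick_best(assemblies: Iterable[AssemblySummary]) -> AssemblySummary:
    """Return the preferred assembly for one sample."""
    return min(assemblies, key=rank_key)


if __name__ == "__main__":
    candidates = [
        AssemblySummary("flye", 99.2, 1, True),
        AssemblySummary("hifiasm", 99.2, 3, False),
        AssemblySummary("hifiasm-meta", 97.0, 1, True),
    ]
    print(pick_best(candidates).assembler)
```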

Collation of Wolbachia genome dataset, gene prediction, and orthology inference

All available Wolbachia genomes were downloaded from NCBI GenBank on 01/02/2022 and supplemented with assemblies generated from short-read insect datasets by Scholz and colleagues [14]. This dataset contained replicate genomes for very closely related Wolbachia from the same host, and many fragmented and partial assemblies. Only the most contiguous assembly per host species was retained. These genomes were renamed using the schema “R_Xyz_GenSpec_§”, where Xyz is the first three letters of the insect order of the host, GenSpec is an abbreviation derived from the generic and specific epithets of the host, and § indicates the supergroup. Retained assemblies were assessed for the presence of contamination by performing a contig analysis with kraken2 using a database of only circular Wolbachia genomes. A list of all removed contigs can be found in S3 Table. Additionally, we only included database-sourced Wolbachia genomes with at least 90% BUSCO completeness [69] and at most 3% duplication with the Rickettsiales_odb10 dataset (S3 Table). The exception to this filtering was the inclusion of genomes belonging to the most divergent supergroup S.
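The naming schema and the BUSCO inclusion filter can be expressed compactly. In this sketch the function names are hypothetical, and the "first three letters of genus plus first three letters of species" abbreviation is an assumption inferred from names used elsewhere in the text (for example R_Dip_DroSim_A and R_Hym_NasVit_A); the exact abbreviation rule is not spelled out.

```python
def wolbachia_genome_name(
    host_order: str, host_genus: str, host_species: str, supergroup: str
) -> str:
    """Build a name following the 'R_Xyz_GenSpec_§' schema.

    Assumes GenSpec = first three letters of the genus + first three of the
    specific epithet, each capitalised.
    """
    order_part = host_order[:3].capitalize()
    gen_spec = host_genus[:3].capitalize() + host_species[:3].capitalize()
    return f"R_{order_part}_{gen_spec}_{supergroup}"


def passes_busco_filter(
    complete_pct: float, duplicated_pct: float, supergroup: str
) -> bool:
    """Keep database genomes with >= 90% completeness and <= 3% duplication.

    Supergroup S genomes are exempt from the filter, as described in the text.
    """
    if supergroup == "S":
        return True
    return complete_pct >= 90.0 and duplicated_pct <= 3.0


if __name__ == "__main__":
    print(wolbachia_genome_name("Diptera", "Drosophila", "simulans", "A"))
```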

All of the publicly available and newly assembled genomes were annotated using Prodigal (version 2.6.3) [29]. Protein families were inferred using OrthoFinder (version 2.4.0) [30]. We identified 624 protein families that were single-copy in more than 95% of all Wolbachia genomes. These were individually aligned using mafft in automatic mode (version 7.490) [72]. Individual maximum likelihood gene trees were calculated using iqtree (version 2.1.4) (iqtree -s {alignment} -nt {threads}) [73], and coalescence of these gene trees was determined using ASTRAL (version 5.7.4) [74]. The individual alignments were trimmed using trimAl (version 1.4) [75] and concatenated to form a supermatrix. This was used to infer a maximum likelihood phylogeny with iqtree (version 2.1.4) using 1,000 ultrafast bootstrap approximation iterations (iqtree -s {supermatrix} -m LG+G4 -bb 1000 -nt {threads}) [73]. The insect topology was subsampled from Chesters [76]. Incongruence in topology between the insect host and Wolbachia phylogenies was determined with ggtree in R [77].
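Two steps in this paragraph lend themselves to a small sketch: choosing protein families that are single-copy in more than 95% of genomes, and concatenating per-family alignments into a supermatrix. The data layouts below (a nested dictionary of copy counts and a list of genome-to-sequence dictionaries) are assumptions for illustration and do not reflect OrthoFinder's actual output files; padding absent genomes with gaps is likewise an assumed convention.

```python
from typing import Dict, List, Sequence


def single_copy_families(
    counts: Dict[str, Dict[str, int]], n_genomes: int, min_fraction: float = 0.95
) -> List[str]:
    """Return families present as exactly one copy in > min_fraction of genomes.

    `counts` maps family id -> {genome id: number of gene copies}.
    """
    selected = []
    for family, per_genome in counts.items():
        n_single = sum(1 for copies in per_genome.values() if copies == 1)
        if n_single / n_genomes > min_fraction:
            selected.append(family)
    return sorted(selected)


def concatenate_alignments(
    alignments: Sequence[Dict[str, str]], genomes: Sequence[str]
) -> Dict[str, str]:
    """Concatenate per-family alignments into one supermatrix per genome.

    Genomes missing from a family alignment are padded with gap characters
    so that every row of the supermatrix has the same length.
    """
    parts: Dict[str, List[str]] = {g: [] for g in genomes}
    for aln in alignments:
        width = len(next(iter(aln.values())))
        for g in genomes:
            parts[g].append(aln.get(g, "-" * width))
    return {g: "".join(chunks) for g, chunks in parts.items()}
```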

Intrinsic genomic properties

All circular genomes were rotated to start with HemE (OG0000716) on the positive strand, as this gene is located next to the origin of replication [78]. All pairwise alignments were calculated using nucmer (version 4.0.0) [70], and breakpoints were inferred and adjusted for the aligned coverage. Whole-genome average nucleotide identity was calculated using FastANI (version 1.33) [79]. GC and GC skew index values were calculated for all genomes using SkewIT [34].
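Rotating a circular genome so that it begins at a chosen gene on the positive strand is a simple string operation, sketched below together with a plain GC fraction. The function names and the 0-based, half-open coordinate convention are assumptions for this example; the GC calculation is the ordinary base-composition fraction, not the SkewIT skew index used in the study.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (A/C/G/T only; other symbols unchanged)."""
    return seq.translate(_COMPLEMENT)[::-1]


def rotate_to_gene(seq: str, gene_start: int, gene_end: int, strand: str) -> str:
    """Rotate a circular sequence so it starts at the gene on the '+' strand.

    gene_start/gene_end are 0-based, half-open forward-strand coordinates.
    If the gene lies on the '-' strand, the whole genome is reverse-complemented
    first so that the gene reads forward from position 0.
    """
    if strand == "+":
        return seq[gene_start:] + seq[:gene_start]
    if strand == "-":
        rc = reverse_complement(seq)
        new_start = len(seq) - gene_end  # gene start in reverse-complement coordinates
        return rc[new_start:] + rc[:new_start]
    raise ValueError("strand must be '+' or '-'")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous bases."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    total = gc + s.count("A") + s.count("T")
    return gc / total if total else 0.0
```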

Gene content analysis

To functionally annotate predicted genes, both Prokka (version 1.14.6) [80] and InterProScan (version 5.54–87.0) [81] were run. The synteny plot of the biotin locus was created using gggenes [82]. All six genes that make up the biotin locus (BioA-D, BioF, BioH) were individually aligned with mafft in automatic mode (version 7.490) [72] and transformed into a concatenated nucleotide alignment. A phylogenetic tree was constructed using the model GTR+F+G4 in iqtree (version 2.1.4) [73]. Genes responsible for CI were identified by a BLAST search [83] using the following genes as queries: CidA: WP_010962721.1, WP_182158704.1, WP_012673228.1, WP_006014162.1, CAQ54402.1, NZ_MUIX01000001.1_1324, OAM06111.1; CifB: WP_010962722.1, WP_182158703.1, WP_012673227.1, WP_006014164.1, CAQ54403.1, NZ_MUIX01000001.1_1323, OAM06112.1. Moreover, additional CifB type V genes were added as reference genes (Diachasma_alloeum_pair1, Diploeciton_nevermanni_pair5, wBor_pair2, wStri_pair1, wStri_pair2 and wTri-2_pair1). Only pairs of identified neighbouring genes (e-value 1 × 10^−30, coverage 80% to 120%) were retained. Both CifA and CifB were aligned using mafft in automatic mode (version 7.490) [72], followed by maximum likelihood estimation using iqtree (version 2.1.4) (iqtree -s {alignment} -nt {threads} -bb 10000).
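The retention rule for CI gene pairs can be sketched as a filter over BLAST hits. Everything structural here is an assumption for illustration: the `Hit` record, treating the stated e-value as an upper bound, and defining "neighbouring" as adjacent gene indices on the same contig are not specified in the text.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Hit(NamedTuple):
    contig: str
    gene_index: int      # position of the gene along the contig (0, 1, 2, ...)
    query_type: str      # "CifA" or "CifB"
    e_value: float
    coverage_pct: float  # alignment length relative to the query, in percent


def passes_thresholds(hit: Hit) -> bool:
    """E-value at most 1e-30 and coverage between 80% and 120% (assumed reading)."""
    return hit.e_value <= 1e-30 and 80.0 <= hit.coverage_pct <= 120.0


def neighbouring_pairs(hits: Sequence[Hit]) -> List[Tuple[Hit, Hit]]:
    """Return (CifA, CifB) hits that sit next to each other on the same contig."""
    good = [h for h in hits if passes_thresholds(h)]
    b_by_position = {
        (h.contig, h.gene_index): h for h in good if h.query_type == "CifB"
    }
    pairs = []
    for a in (h for h in good if h.query_type == "CifA"):
        for offset in (-1, 1):
            b = b_by_position.get((a.contig, a.gene_index + offset))
            if b is not None:
                pairs.append((a, b))
    return pairs
```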

WO prophage analysis

A list of known prophage sequences was generated based on annotated regions described in the literature [41,44,84] (S4 Table) for a set of genomes (R_Dip_DroSim_A, R_Hym_NasVit_A, R_Dip_DroAna_A, R_Dip_HaeIrr_A, R_Hym_CerSol_A, and R_Hym_WiePum_A) and linked to their respective gene families. Each Wolbachia genome was screened for continuous stretches of linked prophage genes with at most 5 other genes in between, and these were annotated as prophage regions if they contained at least one gene from one of the four core phage modules (head, baseplate, tail, and fibre). This permitted detection of novel prophage-associated genes. Regions that contained at least 5 of 6 head, 7 of 8 baseplate, 5 of 6 fibre, and 5 of 6 tail module genes were deemed putatively complete. Genomic maps of prophage integration were created with circos [85]. Phylogenetic generalised least squares analyses were carried out to assess the correlation between prophage length and genome size using the ape R package [86], using a Brownian model of evolution and the phylogenetic tree in Fig 2A. R squared values were calculated using the package rr2 [87].
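The region-calling rule described above (prophage-linked genes chained together when separated by at most 5 other genes, kept only if a core module gene is present, and called putatively complete above per-module thresholds) can be sketched as follows. The gene representation, a (family id, module) tuple per gene in genome order, and the counting of distinct families per module are assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# (prophage-linked family id or None, core module name or None), in genome order
Gene = Tuple[Optional[str], Optional[str]]

CORE_MODULES = frozenset({"head", "baseplate", "tail", "fibre"})
MIN_FOR_COMPLETE = {"head": 5, "baseplate": 7, "fibre": 5, "tail": 5}


def find_prophage_regions(genes: Sequence[Gene], max_gap: int = 5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) gene-index ranges of candidate prophage regions."""
    regions: List[Tuple[int, int]] = []
    start = last = None
    for i, (family, _module) in enumerate(genes):
        if family is None:
            continue  # not a prophage-linked gene
        if start is None:
            start = last = i
        elif i - last - 1 <= max_gap:
            last = i  # extend the current stretch across the gap
        else:
            regions.append((start, last))
            start = last = i
    if start is not None:
        regions.append((start, last))
    # Keep only stretches containing at least one core-module gene.
    return [
        (s, e) for s, e in regions
        if any(genes[k][1] in CORE_MODULES for k in range(s, e + 1))
    ]


def is_putatively_complete(genes: Sequence[Gene], region: Tuple[int, int]) -> bool:
    """Check the per-module minimum counts of distinct gene families in a region."""
    seen: Dict[str, Set[str]] = {m: set() for m in MIN_FOR_COMPLETE}
    for family, module in genes[region[0]: region[1] + 1]:
        if family is not None and module in seen:
            seen[module].add(family)
    return all(len(seen[m]) >= n for m, n in MIN_FOR_COMPLETE.items())
```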

Supporting information

S3 Fig. Contiguity and genome size distribution of Wolbachia.

(A, B) Contiguity and genome size distribution of Wolbachia genomes assembled in this study (black) vs. reference genomes from other projects available in NCBI (grey). The data underlying this Figure can be found in S1 Data. (C) Genome size distribution of Wolbachia. Supergroups A (above) and B (below), in this study (black) and reference genomes from other projects available in NCBI (grey), were compared by Wilcoxon rank sum test. The data underlying this Figure can be found in S1 Data.



S8 Fig. Comparison of the phylogenies of biotin synthesis clusters and the Wolbachia strains that contain them.

Comparison between phylogenies of Wolbachia genomes containing the biotin locus, based on the tree in Fig 2A (left) and a phylogeny inferred from the nucleotide sequences of the six genes constituting the biotin synthesis operon (BioA-D, BioF, BioH) (right). Internal nodes with bootstrap support greater than 80 are highlighted with black circles.




  1. Yen JH, Barr AR. New Hypothesis of the Cause of Cytoplasmic Incompatibility in Culex pipiens L. Nature. 1971 Aug;232(5313):657–658. pmid:4937405
  2. Yen JH, Barr AR. The etiological agent of cytoplasmic incompatibility in Culex pipiens. Journal of Invertebrate Pathology. 1973 Sep;22(2):242–50. pmid:4206296
  3. Bordenstein SR, O’Hara FP, Werren JH. Wolbachia-induced incompatibility precedes other hybrid incompatibilities in Nasonia. Nature. 2001 Feb 8;409(6821):707–10.
  4. Hurst GDD, Jiggins FM, Hinrich Graf von der Schulenburg J, Bertrand D, West SA, Goriacheva II, et al. Male–killing Wolbachia in two species of insect. Proc R Soc Lond B. 1999 Apr 7;266(1420):735–40.
  5. Stouthamer R, Breeuwer JAJ, Luck RF, Werren JH. Molecular identification of microorganisms associated with parthenogenesis. Nature. 1993 Jan;361(6407):66–8. pmid:7538198
  6. Hornett EA, Charlat S, Duplouy AMR, Davies N, Roderick GK, Wedell N, et al. Evolution of Male-Killer Suppression in a Natural Population. Keller L, editor. PLoS Biol. 2006 Aug 22;4(9):e283.
  7. Werren JH, Baldo L, Clark ME. Wolbachia: master manipulators of invertebrate biology. Nat Rev Microbiol. 2008 Oct;6(10):741–51. pmid:18794912
  8. Nikoh N, Hosokawa T, Moriyama M, Oshima K, Hattori M, Fukatsu T. Evolutionary origin of insect–Wolbachia nutritional mutualism. Proc Natl Acad Sci USA. 2014 Jul 15;111(28):10257–62.
  9. Pan X, Pike A, Joshi D, Bian G, McFadden MJ, Lu P, et al. The bacterium Wolbachia exploits host innate immunity to establish a symbiotic relationship with the dengue vector mosquito Aedes aegypti. ISME J. 2018 Jan;12(1):277–88. pmid:29099491
  10. Hoerauf A, Mand S, Adjei O, Fleischer B, Büttner DW. Depletion of wolbachia endobacteria in Onchocerca volvulus by doxycycline and microfilaridermia after ivermectin treatment. Lancet. 2001 May;357(9266):1415–6. pmid:11356444
  11. Zug R, Hammerstein P. Still a Host of Hosts for Wolbachia: Analysis of Recent Data Suggests That 40% of Terrestrial Arthropod Species Are Infected. Cordaux R, editor. PLoS ONE. 2012 Jun 7;7(6):e38544. pmid:22685581
  12. Zhou W, Rousset F, O’Neill S. Phylogeny and PCR–based classification of Wolbachia strains using wsp gene sequences. Proc R Soc Lond B. 1998 Mar 22;265(1395):509–15. pmid:9569669
  13. Glowska E, Dragun-Damian A, Dabert M, Gerth M. New Wolbachia supergroups detected in quill mites (Acari: Syringophilidae). Infect Genet Evol. 2015 Mar;30:140–6. pmid:25541519
  14. Scholz M, Albanese D, Tuohy K, Donati C, Segata N, Rota-Stabelli O. Large scale genome reconstructions illuminate Wolbachia evolution. Nat Commun. 2020 Dec;11(1):5235. pmid:33067437
  15. Pascar J, Chandler CH. A bioinformatics approach to identifying Wolbachia infections in arthropods. PeerJ. 2018 Sep 3;6:e5486. pmid:30202647
  16. The Darwin Tree of Life Project Consortium. Sequence locally, think globally: The Darwin Tree of Life Project. Proc Natl Acad Sci U S A. 2022 Jan 25;119(4):e2115642118. pmid:35042805
  17. Challis R, Richards E, Rajan J, Cochrane G, Blaxter M. BlobToolKit–Interactive Quality Assessment of Genome Assemblies. G3 (Bethesda). 2020 Apr 1;10(4):1361–74. pmid:32071071
  18. Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, et al. Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ. 2015 Oct 8;3:e1319. pmid:26500826
  19. Regan T, Barnett MW, Laetsch DR, Bush SJ, Wragg D, Budge GE, et al. Characterisation of the British honey bee metagenome. Nat Commun. 2018 Dec;9(1):4995. pmid:30478343
  20. Hilgenboecker K, Hammerstein P, Schlattmann P, Telschow A, Werren JH. How many species are infected with Wolbachia?–a statistical analysis of current data: Wolbachia infection rates. FEMS Microbiol Lett. 2008 Apr;281(2):215–20.
  21. Ahmed MZ, Breinholt JW, Kawahara AY. Evidence for common horizontal transmission of Wolbachia among butterflies and moths. BMC Evol Biol. 2016 Dec;16(1):118. pmid:27233666
  22. West SA, Cook JM, Werren JH, Godfray HCJ. Wolbachia in two insect host–parasitoid communities. Mol Ecol. 1998 Nov;7(11):1457–65.
  23. Duron O, Bouchon D, Boutin S, Bellamy L, Zhou L, Engelstädter J, et al. The diversity of reproductive parasites among arthropods: Wolbachia do not walk alone. BMC Biol. 2008 Dec;6(1):27.
  24. Weinert LA, Araujo-Jnr EV, Ahmed MZ, Welch JJ. The incidence of bacterial endosymbionts in terrestrial arthropods. Proc R Soc B. 2015 May 22;282(1807):20150249. pmid:25904667
  25. Strunov A, Schmidt K, Kapun M, Miller WJ. Restriction of Wolbachia Bacteria in Early Embryogenesis of Neotropical Drosophila Species via Endoplasmic Reticulum-Mediated Autophagy. Lemaitre B, editor. mBio. 2022 Apr 26;13(2):e03863–21.
  26. Kamath AD, Deehan MA, Frydman HM. Polar cell fate stimulates Wolbachia intracellular growth. Development. 2018 Jan 1;dev.158097.
  27. Savill P, Perrins C, Kirby K, Fisher N. Wytham woods: Oxford’s ecological laboratory. 1. publ. in paperback. Oxford: Oxford Univ. Press; 2011. p. 263.
  28. Misof B, Liu S, Meusemann K, Peters RS, Donath A, Mayer C, et al. Phylogenomics resolves the timing and pattern of insect evolution. Science. 2014 Nov 7;346(6210):763–7. pmid:25378627
  29. Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010 Dec;11(1):119. pmid:20211023
  30. Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019 Dec;20(1):238. pmid:31727128
  31. Russell JA, Goldman-Huertas B, Moreau CS, Baldo L, Stahlhut JK, Werren JH, et al. Specialization and geographic isolation among Wolbachia symbionts from ants and lycaenid butterflies. Evolution. 2009 Mar;63(3):624–40.
  32. Werren JH, Windsor DM. Wolbachia infection frequencies in insects: evidence of a global equilibrium? Proc R Soc Lond B. 2000 Jul 7;267(1450):1277–85.
  33. Tagami Y, Miura K. Distribution and prevalence of Wolbachia in Japanese populations of Lepidoptera: Wolbachia in Japanese Lepidoptera. Insect Mol Biol. 2004 Jul 20;13(4):359–64.
  34. Lu J, Salzberg SL. SkewIT: The Skew Index Test for large-scale GC Skew analysis of bacterial genomes. Rzhetsky A, editor. PLoS Comput Biol. 2020 Dec 4;16(12):e1008439. pmid:33275607
  35. Comandatore F, Cordaux R, Bandi C, Blaxter M, Darby A, Makepeace BL, et al. Supergroup C Wolbachia, mutualist symbionts of filarial nematodes, have a distinct genome structure. Open Biol. 2015 Dec;5(12):150099. pmid:26631376
  36. Mahmood S, Nováková E, Martinů J, Sychra O, Hypša V. Extremely reduced supergroup F Wolbachia: transition to obligate insect symbionts [Internet]. Evol Biol. 2021 Oct [cited 2022 Aug 31]. Available from: http://biorxiv.org/lookup/doi/10.1101/2021.10.15.464041.
  37. Ellegaard KM, Klasson L, Näslund K, Bourtzis K, Andersson SGE. Comparative Genomics of Wolbachia and the Bacterial Species Concept. Matic I, editor. PLoS Genet. 2013 Apr 4;9(4):e1003381. pmid:23593012
  38. Gerth M, Bleidorn C. Comparative genomics provides a timeframe for Wolbachia evolution and exposes a recent biotin synthesis operon transfer. Nat Microbiol. 2017 Mar;2(3):16241.
  39. Gerth M, Röthe J, Bleidorn C. Tracing horizontal Wolbachia movements among bees (Anthophila): a combined approach using multilocus sequence typing data and host phylogeny. Mol Ecol. 2013 Dec;22(24):6149–62. pmid:24118435
  40. Bordenstein SR, Bordenstein SR. Eukaryotic association module in phage WO genomes from Wolbachia. Nat Commun. 2016 Dec;7(1):13155. pmid:27727237
  41. Bordenstein SR, Bordenstein SR. Widespread phages of endosymbionts: Phage WO genomics and the proposed taxonomic classification of Symbioviridae. Matic I, editor. PLoS Genet. 2022 Jun 6;18(6):e1010227. pmid:35666732
  42. Gavotte L, Henri H, Stouthamer R, Charif D, Charlat S, Bouletreau M, et al. A Survey of the Bacteriophage WO in the Endosymbiotic Bacteria Wolbachia. Mol Biol Evol. 2006 Nov 13;24(2):427–35. pmid:17095536
  43. Massey JH, Newton ILG. Diversity and function of arthropod endosymbiont toxins. Trends Microbiol. 2022 Feb;30(2):185–98. pmid:34253453
  44. LePage DP, Metcalf JA, Bordenstein SR, On J, Perlmutter JI, Shropshire JD, et al. Prophage WO genes recapitulate and enhance Wolbachia-induced cytoplasmic incompatibility. Nature. 2017 Mar;543(7644):243–7. pmid:28241146
  45. Beckmann JF, Ronau JA, Hochstrasser M. A Wolbachia deubiquitylating enzyme induces cytoplasmic incompatibility. Nat Microbiol. 2017 May;2(5):17007. pmid:28248294
  46. Martinez J, Klasson L, Welch JJ, Jiggins FM. Life and Death of Selfish Genes: Comparative Genomics Reveals the Dynamic Evolution of Cytoplasmic Incompatibility. Larracuente A, editor. Mol Biol Evol. 2021 Jan 4;38(1):2–15.
  47. Beckmann JF, Bonneau M, Chen H, Hochstrasser M, Poinsot D, Merçot H, et al. The Toxin–Antidote Model of Cytoplasmic Incompatibility: Genetics and Evolutionary Implications. Trends Genet. 2019 Mar;35(3):175–85. pmid:30685209
  48. Dy RL, Przybilski R, Semeijn K, Salmond GPC, Fineran PC. A widespread bacteriophage abortive infection system functions through a Type IV toxin–antitoxin mechanism. Nucleic Acids Res. 2014 Apr;42(7):4590–605. pmid:24465005
  49. Kumar S, Blaxter ML. Simultaneous genome sequencing of symbionts and their hosts. Symbiosis. 2011;55(3):119–126. pmid:22448083
  50. Almeida A, Nayfach S, Boland M, Strozzi F, Beracochea M, Shi ZJ, et al. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat Biotechnol. 2021 Jan;39(1):105–14. pmid:32690973
  51. Parks DH, Rinke C, Chuvochina M, Chaumeil PA, Woodcroft BJ, Evans PN, et al. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat Microbiol. 2017 Nov;2(11):1533–42. pmid:28894102
  52. Conner WR, Blaxter ML, Anfora G, Ometto L, Rota-Stabelli O, Turelli M. Genome comparisons indicate recent transfer of wRi-like Wolbachia between sister species Drosophila suzukii and D. subpulchrella. Ecol Evol. 2017;7(22):9391–9404.
  53. Turelli M, Cooper BS, Richardson KM, Ginsberg PS, Peckenpaugh B, Antelope CX, et al. Rapid Global Spread of wRi-like Wolbachia across Multiple Drosophila. Curr Biol. 2018 Mar;28(6):963–971.e8. pmid:29526588
  54. Ju JF, Bing XL, Zhao DS, Guo Y, Xi Z, Hoffmann AA, et al. Wolbachia supplement biotin and riboflavin to enhance reproduction in planthoppers. ISME J. 2020 Mar;14(3):676–87. pmid:31767943
  55. Lindsey ARI, Rice DW, Bordenstein SR, Brooks AW, Bordenstein SR, Newton ILG. Evolutionary Genetics of Cytoplasmic Incompatibility Genes cifA and cifB in Prophage WO of Wolbachia. Sloan D, editor. Genome Biol Evol. 2018 Feb 1;10(2):434–451.
  56. Shropshire JD, Leigh B, Bordenstein SR. Symbiont-mediated cytoplasmic incompatibility: What have we learned in 50 years? eLife. 2020 Sep 25;9:e61989. pmid:32975515
  57. Lewin HA, Richards S, Lieberman Aiden E, Allende ML, Archibald JM, Bálint M, et al. The Earth BioGenome Project 2020: Starting the clock. Proc Natl Acad Sci U S A. 2022 Jan 25;119(4):e2115635118. pmid:35042800
  58. Kalvari I, Nawrocki EP, Ontiveros-Palacios N, Argasinska J, Lamkiewicz K, Marz M, et al. Rfam 14: expanded coverage of metagenomic, viral and microRNA families. Nucleic Acids Res. 2021 Jan 8;49(D1):D192–200. pmid:33211869
  59. Eddy SR. Accelerated Profile HMM Searches. PLoS Comput Biol. 2011 Oct 20;7(10):e1002195. pmid:22039361
  60. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013 Jan 1;41(D1):D590–6. pmid:23193283
  61. Pruesse E, Peplies J, Glöckner FO. SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics. 2012 Jul 15;28(14):1823–9. pmid:22556368
  62. Schoch CL, Ciufo S, Domrachev M, Hotton CL, Kannan S, Khovanskaya R, et al. NCBI Taxonomy: a comprehensive update on curation, resources and tools. Database. 2020 Jan 1;2020:baaa062. pmid:32761142
  63. Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, et al. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Res. 2014 Jan;42(D1):D643–8. pmid:24293649
  64. Wood DE, Lu J, Langmead B. Improved metagenomic analysis with Kraken 2. Genome Biol. 2019 Nov 28;20(1):257. pmid:31779668
  65. Morgulis A, Gertz EM, Schäffer AA, Agarwala R. A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J Comput Biol. 2006 Jun;13(5):1028–40. pmid:16796549
  66. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019 May;37(5):540–6. pmid:30936562
  67. Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods. 2021 Feb;18(2):170–5. pmid:33526886
  68. Feng X, Cheng H, Portik D, Li H. Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nat Methods. 2022 Jun;19(6):671–4. pmid:35534630
  69. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes. Mol Biol Evol. 2021 Oct 1;38(10):4647–54. pmid:34320186
  70. Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, Zimin A. MUMmer4: A fast and versatile genome alignment system. PLoS Comput Biol. 2018 Jan 26;14(1):e1005944. pmid:29373581
  71. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing [Internet]. arXiv. 2012 Jul [cited 2022 Jun 13]. Report No.: arXiv:1207.3907. Available from: http://arxiv.org/abs/1207.3907.
  72. Katoh K, Standley DM. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol Biol Evol. 2013 Apr 1;30(4):772–80. pmid:23329690
  73. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol Biol Evol. 2020 May 1;37(5):1530–4. pmid:32011700
  74. Mirarab S, Reaz R, Bayzid MdS, Zimmermann T, Swenson MS, Warnow T. ASTRAL: genome-scale coalescent-based species tree estimation. Bioinformatics. 2014 Sep 1;30(17):i541–8.
  75. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009 Aug 1;25(15):1972–3. pmid:19505945
  76. Chesters D. Construction of a Species-Level Tree of Life for the Insects and Utility in Taxonomic Profiling. Syst Biol. 2016 Oct 26;syw099.
  77. Yu G, Lam TTY, Zhu H, Guan Y. Two Methods for Mapping and Visualizing Associated Data on Phylogeny Using Ggtree. Battistuzzi FU, editor. Mol Biol Evol. 2018 Dec 1;35(12):3041–3043.
  78. 78.
    Ioannidis P, Hotopp JCD, Sapountzis P, Siozios S, Tsiamis G, Bordenstein SR, et al. New standards for choosing the origin of DNA replication in Wolbachia and intently associated micro organism. BMC Genomics. 2007 Dec;8(1):182. pmid:17584494
  79. 79.
    Jain C, Rodriguez-R LM, Phillippy AM, Konstantinidis KT, Aluru S. Excessive throughput ANI evaluation of 90K prokaryotic genomes reveals clear species boundaries. Nat Commun. 2018 Nov 30;9(1):5114. pmid:30504855
  80. 80.
    Seemann T. Prokka: fast prokaryotic genome annotation. Bioinformatics. 2014 Jul 15;30(14):2068–9. pmid:24642063
  81. 81.
    Quevillon E, Silventoinen V, Pillai S, Harte N, Mulder N, Apweiler R, et al. InterProScan: protein domains identifier. Nucleic Acids Res. 2005 Jul 1;33(Net Server):W116–20. pmid:15980438
  82. 82.
    Wilkins D. gggenes [Internet]. Obtainable from: https://github.com/wilkox/gggenes.
  83. 83.
    Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Primary native alignment search instrument. J Mol Biol. 1990 Oct;215(3):403–10. pmid:2231712
  84. 84.
    Miao Y heng, Xiao J hua, Huang D wei. Distribution and Evolution of the Bacteriophage WO and Its Antagonism With Wolbachia. Entrance Microbiol. 2020 Nov 13;11:595629. pmid:33281793
  85. 85.
    Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, et al. Circos: an data aesthetic for comparative genomics. Genome Res. 2009 Sep;19(9):1639–45. pmid:19541911
  86. 86.
    Paradis E, Schliep Ok. ape 5.0: an setting for contemporary phylogenetics and evolutionary analyses in R. Schwartz R, editor. Bioinformatics. 2019 Feb 1;35(3):526–528.
  87. 87.
    Ives AR. R2s for Correlated Knowledge: Phylogenetic Fashions, LMMs, and GLMMs. Harmon L, editor. Syst Biol. 2019 Mar 1;68(2):234–251. pmid:30239975
