Cystic hydatid disease, an infection with the metacestode of the dog tapeworm Echinococcus granulosus, is a global public health problem that affects humans and ungulates. The life cycle of E. granulosus involves dogs and other carnivores as definitive hosts and livestock as intermediate hosts.
Eggs shed in the feces of definitive hosts are ingested by intermediate hosts, in which they develop into the metacestode stage and establish hydatid cysts. The infection in livestock is normally asymptomatic and is detected during postmortem inspection at slaughterhouses, yet it causes economic loss through condemnation of infected organs.
E. granulosus is a complex of distinct strains with different host affinities. To date, ten different genotypes have been described by molecular genetic analysis, and the genotypic variation closely follows the biological and phenotypic characteristics of the parasite.
It has been proposed that E. granulosus genotypes should be split into different species: E. granulosus sensu stricto (genotypes G1-G3), E. equinus (genotype G4), E. ortleppi (genotype G5), E. canadensis (genotypes G6-G10), and E. felidis (lion strain).
The strain variation of E. granulosus reflects differences in life cycle patterns and host range; therefore, knowledge of the genetic diversity and population genetics of this parasite is of enormous public health importance.
Mitochondrial DNA has proved useful in distinguishing all the genotypes of E. granulosus and serves as an important genetic marker for analyzing the population genetic structure of this parasite because it is haploid, non-recombining, multicopy, rapidly evolving, and maternally inherited.
In Western India, four different genotypes (G1, G2, G3, and G5) were reported in different intermediate hosts such as cattle, buffalo, pigs, and sheep. In West Bengal, the G1, G2, and G3 genotypes were found to infect livestock.
Apart from the study by Singh et al. (2012) from Ludhiana (North India), which found the G1 and G3 genotypes in ten isolates of E. granulosus, an extensive survey of the parasite's genotypes based on a large number of isolates and covering large endemic areas is lacking.
The purpose of the present study was to genotype North Indian animal isolates of E. granulosus by sequencing the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene. The results were then compared with nucleotide sequences of this parasite from other geographical regions to analyze its genetic variability and population genetics.
Materials and Methods
Sample Collection
Hydatid cysts were collected from four different geographical regions in North India during the period 2009-2012. Cyst samples were obtained from 74 freshly slaughtered, infected sheep at slaughterhouses located in Chandigarh (n=66), Shimla, Himachal Pradesh (n=3), and Srinagar, Jammu and Kashmir (n=5).
Seven hydatid cysts were kindly provided by Guru Angad Dev University, Ludhiana, Punjab, North India, making a total of 81 analyzed isolates. The cysts were removed from the carcasses and transported under refrigerated conditions to the Department of Parasitology, PGIMER, for further processing. Whole hydatid cysts were separated and washed with distilled water, with each cyst from an individual animal being considered as an isolate.
The cyst samples were washed thrice in PBS to remove ethanol, and genomic DNA was extracted from each sample using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.
For molecular identification, PCR amplification of the cox1 gene was performed using primers and PCR conditions as previously described, with minor alterations. Briefly, amplification was performed in a 50 µl final volume containing 2 µl DNA, 0.2 mM premixed dNTP solution, 10 pmol of each primer, 1x PCR buffer, and 1 U of Taq DNA polymerase.
The amplification program included an initial denaturation step of 95°C for 5 min, followed by 38 cycles of denaturation (95°C for 50 s), annealing (57°C for 50 s), and extension (72°C for 1 min), and a final extension at 72°C for 10 min. After agarose gel electrophoresis (1.5%), PCR products were purified and sequenced.
Sequences of E. granulosus sensu stricto populations deposited in GenBank from India and from other South Asian, East Asian, European, Middle Eastern, African, and South American countries (for example, China and Nepal) were retrieved from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov) and compared with the sequences of the isolates used in the present study.
Nucleotide sequence analysis was performed with BLAST sequence algorithms, and sequences were aligned using ClustalW. The genetic distance was calculated using Kimura two-parameter distance estimations, and samples were clustered using PhyML as part of SeaView v. 4.2.4.
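The Kimura two-parameter distance mentioned above can be illustrated with a minimal sketch. It is not the SeaView or PhyML implementation; the function name k2p_distance and the toy sequences are hypothetical and serve only to show how transitions and transversions enter the estimate.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the
    proportions of transitional and transversional differences.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example (hypothetical sequences): one transition in ten sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

Because transitions and transversions are weighted separately, a single transition and a single transversion over the same number of sites give slightly different distances.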
Gene Genealogies
Haplotypes were identified and their networks were constructed under the parsimony criterion using the TCS version 1.2 package. The network estimation was run at a 95% probability limit. Haplotype network analysis is useful for intraspecific data because it reveals multiple connections between haplotypes and indicates possible missing mutational connections.
Population Genetic Analysis
For population genetic analysis, the sequences were grouped into seven populations: South Asia, East Asia, the Middle East, Europe, Africa, South America, and Australia. Population diversity indices such as the number of segregating sites (S), haplotype number (H), haplotype diversity, nucleotide diversity, and the mean number of pairwise nucleotide differences within populations (K) were estimated using DnaSP 4.5 software. The neutrality indices Tajima's D and Fu's Fs were calculated for each population with the population genetics package Arlequin 3.1.
Pairwise genetic differentiation was estimated for all populations by computing Wright's F-statistics (Fst) and gene flow (Nm). In addition, the mean number of pairwise nucleotide differences (Kxy), nucleotide substitutions per site (Dxy), and net nucleotide substitutions per site (Da) between populations were calculated with DnaSP.
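As an illustration of the within-population indices listed above, the following minimal sketch computes S, H, haplotype diversity, K, and nucleotide diversity from a set of aligned sequences using the standard textbook definitions. It is not DnaSP's code; the function name diversity_indices and the toy alignment are hypothetical, and gaps or ambiguous bases are not handled.

```python
from collections import Counter
from itertools import combinations


def diversity_indices(seqs):
    """Basic diversity indices for equal-length aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: alignment columns that show more than one nucleotide
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # H and haplotype diversity: Hd = n/(n-1) * (1 - sum of squared frequencies)
    counts = Counter(seqs)
    hap_div = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # K: mean number of nucleotide differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    return {
        "S": segregating,
        "H": len(counts),
        "Hd": hap_div,
        "K": k,
        "pi": k / length,  # nucleotide diversity: K per site
    }


# Toy alignment (hypothetical): four sequences, three haplotypes.
result = diversity_indices(["AAAA", "AAAA", "AAAT", "AATT"])
```

In this toy alignment many haplotypes differ by a single base, so haplotype diversity is high while the per-site nucleotide diversity stays modest.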
Amplification of the cox1 gene with the JB3/JB4.5 primers yielded a PCR product of 446 bp. The nucleotide sequences of all 81 isolates from North India were aligned with the reference sequence of each genotype within E. granulosus retrieved from GenBank.
A total of three genotypes of E. granulosus were found: buffalo strain (G3 genotype, n=58), sheep strain (G1 genotype, n=22), and Tasmanian sheep strain (G2 genotype, n=1). The sequences of the haplotypes found in this study were deposited in GenBank with accession numbers JX854022-34 and KC422644-45.
Sequences of these isolates, along with those retrieved from GenBank, were used to construct a phylogenetic tree (Fig. 1). A total of 73 sequence variants (named Hap 1-Hap 73) were grouped into two main clades. Clade I comprises the G1 genotype and its microvariants, and Clade II comprises the G3 genotype and its microvariants. Sequence variant Hap 49 served as a connecting link between these two clades.
Gene/Allele Genealogy
The genealogical relationships among the cox1 sequences estimated with the TCS package revealed two lineages. The first lineage clustered South Asian (12.62% n=13), Middle Eastern (45.26% n=43), European (39.21% n=28), South American (54.9% n=28), East Asian (49.05% n=26), and African (42.10% n=4) populations, and the second lineage clustered South Asian (56.3% n=58), Middle Eastern (11.5% n=11), European (11.76% n=6), South American (9.80% n=5), East Asian (1.88% n=1), African (5.2% n=1), and Australian (20% n=1) populations.
Thus, the haplotypes in both lineages shared a broad geographical distribution. The haplotypes in the first lineage were reported predominantly in Middle Eastern, European, and South American populations, whereas the haplotype in the second lineage was prevalent in the South Asian population.
A total of 73 haplotypes were found in 376 sequences: 20 in South Asia, 14 in East Asia, 17 in Europe, 25 in the Middle East, 10 in Africa, 13 in South America, and 5 in Australia. Across the 341 bp of aligned sequence analyzed, only nucleotide substitutions were detected; no insertions or deletions were found. Eleven point mutations were noted, and 23 sites were parsimony informative.
Population genetic indices were calculated using the nucleotide data of the cox1 gene from India and its neighbouring countries. The haplotype diversity (Hd) for all 376 sequences was 0.803 +/- 0.016 SD. The average number of nucleotide differences, K, was 1.82761, and nucleotide diversity (π) was 0.00536 +/- 0.00023.
The haplotype and nucleotide diversity indices were highest in the Australian population, followed by the African population, and lowest in the South Asian population. The neutrality indices calculated by Tajima's D and Fu's Fs tests were negative in all populations.
The D value was significantly negative in the South Asian, European, and South American populations, whereas the Fs value was significantly negative in all populations except the African and Australian ones.
The inter-population nucleotide differences (Kxy) and the mean number of nucleotide substitutions per site between populations (Dxy) varied from 1.36 and 0.00399 (East Asia and South America) to 2.8 and 0.00821 (Africa and Australia), respectively. Pairwise genetic differentiation (Fst) between these populations varied from -0.00206 with an infinite Nm value (between Europe and the Middle East) to 0.37828 with Nm=0.82176 (between South Asia and South America).
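The per-site values reported here are numerically consistent with dividing the corresponding mean pairwise differences by the 341 bp alignment length. The short check below makes that relationship explicit. It is an arithmetic sketch only; that DnaSP derived the reported values exactly this way is an assumption, and the helper name per_site is hypothetical.

```python
ALIGNMENT_LENGTH = 341  # bp analyzed, as stated in the text


def per_site(mean_pairwise_differences, length=ALIGNMENT_LENGTH):
    """Convert a mean number of pairwise differences to a per-site value."""
    return mean_pairwise_differences / length


print(round(per_site(1.82761), 5))  # K -> nucleotide diversity: 0.00536
print(round(per_site(1.36), 5))     # Kxy -> Dxy, East Asia vs South America: 0.00399
print(round(per_site(2.8), 5))      # Kxy -> Dxy, Africa vs Australia: 0.00821
```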
The Fst value between Europe and the Middle East and the Gst value between Europe and South America were negative, indicating no differentiation between these populations. When the South Asian population was compared with other populations, Fst ranged from 0.10273 to 0.37828, with Nm ranging from 0.82176 to 4.36727, indicating that these populations are differentiated with low gene flow.
The Middle Eastern population, in comparison with the other populations, shows very low genetic differentiation (Gst 0.00209-0.10137, Fst -0.002060-0.24707) with very high gene flow (Nm 1.52370-infinite). The Middle Eastern and European populations show a negative Fst value with an infinite Nm value, indicating that they behave as one population with a very high degree of gene flow. Further, except between Europe and the Middle East, South Asia and Australia, and South America and Australia, all other population pairs showed significant pairwise genetic differentiation.
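The link between the Fst and Nm values reported above can be sketched as follows. The text does not state the formula used. The sketch assumes the island-model relation for a haploid, maternally inherited marker, Nm = (1 - Fst) / (2 Fst), which approximately reproduces the reported pairs. The function name gene_flow_from_fst is hypothetical.

```python
import math


def gene_flow_from_fst(fst):
    """Estimate Nm from Fst, assuming Nm = (1 - Fst) / (2 * Fst).

    Zero or negative Fst estimates are treated as no detectable
    differentiation, so Nm is returned as infinite.
    """
    if fst <= 0:
        return math.inf
    return (1 - fst) / (2 * fst)


# Fst values reported in the text, with the Nm values they are paired with.
for fst, reported_nm in [(0.37828, 0.82176), (0.10273, 4.36727),
                         (0.24707, 1.52370), (-0.00206, math.inf)]:
    print(fst, gene_flow_from_fst(fst), reported_nm)
```

Under this assumption the computed values agree with the reported Nm values to about three decimal places. The small residual differences are expected because the published Fst values are rounded.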
Pednekar et al. reported four genotypes of E. granulosus, namely the sheep strain (G1), Tasmanian sheep strain (G2), Indian buffalo strain (G3), and cattle strain (G5), in farm animals in Maharashtra and adjoining areas of Western India.
The prevailing genotype was G3 (63%), which was present in all species of farm animals, followed by G5 (19.56%), G1 (13%), and G2 (4.34%). In Ludhiana (North India), only two genotypes, the buffalo strain (G3) and the common sheep strain (G1), were found to infect farm animals.
In the present study, three genotypes of E. granulosus were found to infect farm animals: buffalo strain (G3 genotype), sheep strain (G1 genotype), and Tasmanian sheep strain (G2 genotype). In concordance with earlier studies of farm animals (cattle, buffalo, pigs, and sheep) in India, the buffalo strain (71.8%) was found to be the prevailing genotype.
The second most common genotype was the sheep strain, found in 27.16% of isolates. The G2 genotype was found in only one isolate, from Srinagar, Kashmir (North India), which was similar to the finding in Eastern India.
Further, in contrast to the results of the present study, G1 was reported as the dominant genotype in other countries, for example, 95.74% in China, 87.5% in Iran, 77.4% in Southern Brazil (with 11.11% of the G3 genotype), 71.59% in Italy (with 27.8% prevalence of the G3 genotype), 66% in Turkey, and 55.8% in Pakistan (with 44.11% prevalence of G3).
These data suggest that while traveling from the Middle East to Europe, South America, and South Asia, the prevalence of the G3 genotype starts increasing, and this genotype emerges as the predominant genotype in South Asia. In East Asia, the G1 genotype once again emerges as the prevailing genotype.
To date, very few studies have explored in depth the population genetic structure of E. granulosus. These studies have shown that the cox1 gene is a promising candidate for revealing the population genetics of E. granulosus. In the present study, E. granulosus sensu stricto populations from broad geographical areas were analyzed to examine the parasite's genetic diversity.
For this purpose, only sequences of the E. granulosus sensu stricto complex were retrieved from the GenBank/EMBL/DDBJ international databases, because data for the other genotypes are scarce.
Despite the broad geographical range, the inter-population comparisons (Kxy, Dxy, Gst, and Fst) support a low to moderate level of genetic differentiation between these populations. The populations of Europe (EU), the Middle East (ME), and South America (SAM) showed low divergence and shared the most common haplotypes.
The European and Middle Eastern populations are very closely related to each other, as suggested by a very low and non-significant Fst value. Gene flow (Nm) between them was also found to be very high. The South Asian (SA) population is the most differentiated, with very low gene flow with the other populations. This result could be related to the presence of G3 as the prevailing genotype in this population.
Despite high haplotype diversity, the low nucleotide diversity values suggest little difference between haplotypes. This is also demonstrated by the haplotype network, which shows mostly single nucleotide differences between most haplotypes (Figure 2).
The combination of high haplotype diversity and low nucleotide diversity, as observed in the present study, can be a signature of rapid population expansion from a small effective population size. A number of statistical tests have been developed to test the selective neutrality of nucleotide variability and are used to detect such population growth.
These tests are based on the distribution of pairwise differences between nucleotide sequences within populations. In this study, we used two tests that are commonly used to identify population expansion and differ slightly in their approach. Tajima's D test is based on the comparison of the allelic frequency of segregating nucleotide sites.
A positive value of this test indicates a bias towards intermediate-frequency alleles, whereas a negative value indicates an excess of rare alleles, the latter being a signature of recent population expansion. Fu's Fs test is based on the distribution of alleles or haplotypes, and here too, negative values can indicate an excess number of alleles, as would be expected from recent population expansion or from genetic hitchhiking.
In this study, Tajima's D was negative for all populations; however, only three populations, South Asia, Europe, and South America, differed significantly from neutrality. Fu's Fs test gave significantly negative values for all populations except Africa and Australia, where the values were negative but not significant.
The overall negative values of both neutrality tests indicate an excess of rare mutations in the populations, which can suggest recent population expansion. Further analysis including additional neutral DNA markers could provide a more complete view of the population genetic structure.
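To make the sign convention of Tajima's D concrete, the following minimal sketch computes D from the number of sequences (n), the number of segregating sites (S), and the mean number of pairwise differences (K) using Tajima's (1989) formula. It is not Arlequin's implementation and does not assess significance; the function name tajimas_d and the example inputs are hypothetical and are not data from this study.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s, and mean
    pairwise differences k."""
    if n < 4 or s < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Numerator: pairwise-difference estimate minus S-based estimate of theta
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical example: few pairwise differences relative to the number
# of segregating sites (many rare variants) gives a negative D.
d_negative = tajimas_d(n=10, s=16, k=3.888889)
# More pairwise differences for the same S (intermediate-frequency
# variants) gives a positive D.
d_positive = tajimas_d(n=10, s=16, k=8.0)
```

The statistic is negative when K is smaller than S divided by the harmonic number a1, which is the pattern produced by an excess of rare alleles.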
The interpretation of demographic expansion correlates well with the widely observed patterns of sheep domestication, which started around 12,500 B.C. Genetic and archaeological evidence suggests that the domestication of sheep occurred first in Southwest Asia (Middle East) and then spread successfully into Europe, Africa, and the rest of Asia.
Initially, 70 sheep were brought to Australia from the Cape of Good Hope in 1788, and the next shipment was of 30 sheep from Calcutta and Ireland in 1793. The results of the present study have suggested that the parasite, along with its intermediate host, was introduced into Europe and Africa from the Middle East and then to South America, Australia, and other parts of Asia.
Recently, a similar hypothesis regarding the dispersion of the parasite was proposed for European, South American, and Middle Eastern populations.
In the present study, the haplotype network showed that all the haplotypes of E. granulosus sensu stricto appear to have descended from a common ancestral haplotype (Hap1) of the G1 genotype, which is widely distributed in different geographical regions.
Interestingly, the nucleotide sequence of this haplotype (Hap1) was 100% identical to previously described prevalent haplotypes in Europe (EG1: JF513058), China and Peru (G01: AB491414), and Iran and Jordan (EG01). The haplotype Hap11, which was found as the second most dominant haplotype in the Middle East, China, Europe, and South America, is prevalent in South Asia.
In dendrogram analysis, Hap 49 appears to be a connecting link between the G1 and G3 genotypes. Similar findings were obtained in the haplotype network, where the G3 genotype and its microvariants appeared to originate from the EU11 (Hap 49) haplotype.
In conclusion, the present study reveals high genetic diversity within populations of E. granulosus but relatively low to moderate genetic differentiation among the populations. Low genetic differentiation has also been reported in different populations of Taenia solium.
The observed patterns of genetic diversity within and between the populations are likely caused by population expansion after the introduction of the founder haplotype. Finally, we suggest that it is important to link molecular epidemiology with evolutionary biology so that population genetic and phylogenetic analyses can add considerable value to the characterization of strains and species of pathogens.