Development of PCRSeqTyping—a novel molecular assay for typing of Streptococcus pneumoniae

Background Precise serotyping of pneumococci is essential for vaccine development and for a better understanding of pathogenicity and trends in drug resistance. Currently used conventional and molecular methods of serotyping are expensive and time-consuming, with limited coverage of serotypes. An accurate and rapid serotyping method with complete coverage of serotypes is an urgent necessity. This study describes the development and application of a novel technology that addresses this need. Methods Polymerase chain reaction (PCR) was performed, targeting a 1061 bp cpsB region, and the amplicon was subjected to sequencing. The sequence data was analyzed using the National Center for Biotechnology Information database. For homologous strains, a second round of PCR, sequencing, and data analysis was performed targeting 10 group-specific genes located in the capsular polysaccharide region. Ninety-one pneumococcal reference strains were analyzed with PCRSeqTyping and compared with the Quellung reaction using the Pneumotest Kit (SSI, Denmark). Results A 100% correlation of PCRSeqTyping results was observed with Pneumotest results. Fifty-nine reference strains were uniquely identified in the first step of PCRSeqTyping. The remaining 32 homologous strains out of 91 were also uniquely identified in the second step. Conclusion This study describes a PCRSeqTyping assay that is accurate and rapid, with high reproducibility. This assay is amenable to clinical testing and does not require culturing of the samples. It is a significant improvement over other methods because it covers all pneumococcal serotypes, and it has the potential for use in diagnostic laboratories and surveillance studies.


Background
Streptococcus pneumoniae, found in the upper respiratory tract of healthy children and adults, causes a range of infections including meningitis, septicemia, pneumonia, sinusitis, and otitis media. Children aged <2 years and adults aged ≥65 years are particularly susceptible [1]. According to the Morbidity and Mortality Weekly Report of April 26, 2013 [2], an estimated 14.5 million cases of serious pneumococcal disease (including pneumonia, meningitis, and sepsis) occur each year in children aged <5 years worldwide, resulting in approximately 500,000 deaths, mostly in low- and middle-income developing countries.
The high morbidity and mortality caused by pneumococci are not clearly understood. The pathogenicity of pneumococci has been linked to various virulence factors such as the capsule, the cell wall and its component polysaccharides, pneumolysin, PspA, complement factor H-binding component, autolysin, neuraminidase, peptide permeases, hydrogen peroxide, and IgA1 protease [3][4][5]. Capsular polysaccharide (CPS) is the primary virulence factor, and is also used to categorize S. pneumoniae into more than 90 different serotypes [6][7][8]. The capsule is important for the survival of bacteria at the infection site, as it provides resistance to phagocytosis [9].
Pneumococcal CPS is generally synthesized by the Wzx/Wzy-dependent pathway, except for types 3 and 37, which are produced by the synthase pathway [10,11]. Most genes required for synthesis of the capsule are within the capsule polysaccharide synthesis (cps) operon, which ranges from 10 kb (serotype 3) to 30 kb (serotype 38). The cps operon is flanked by dexB at the 5′ end and aliA at the 3′ end; neither of these genes participates in capsule synthesis. The 5′ end of the cps locus starts with the regulatory and processing genes wzg, wzh, wzd, and wze (also known as cpsABCD), which are conserved with high sequence identity in all serotypes, followed by the central region consisting of serotype-specific genes [12,13].
Pneumococcal serotyping is necessary for epidemiological and vaccine impact studies. It also aids in understanding the pathogenicity of the organism and in closely monitoring the emergence of non-vaccine strains, replacement serotypes, and new serovars [14,15]. Widespread use of pneumococcal vaccines has led to replacement with serotypes that are not included in the vaccines. Continuous monitoring of serotypes is therefore essential for epidemiological surveillance and long-term vaccine impact studies [16][17][18][19][20].
Several phenotypic and genotypic methods are currently used to identify pneumococcal group and type. The phenotypic serotyping methods of capsular swelling reaction, latex agglutination, and coagglutination tests are costly, require skilled personnel, and cannot detect all serotypes. Genotypic typing methods that assess genome variation include sequential multiplex polymerase chain reaction (PCR), sequential real-time PCR, restriction fragment length polymorphism (RFLP), microarray, sequetyping, and matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) analysis. In addition to general applicability and a high discriminatory power, these genotypic assays are economical, detect pneumococci directly from the clinical specimen, and detect emerging serovars, replacement strains, and vaccine escape recombinants [21]. However, many of these methods are multistep and intricate, and do not discriminate all serotypes [22][23][24][25][26].
It is crucial to develop a robust, simple method with complete serotype coverage for serotype detection and pneumococcal serogroup/serotype surveillance [27]. Herein, the authors describe an innovative serotyping approach that relies on sequencing of assembly genes located in the capsular operon to identify all pneumococcal serotypes.

Reference strains
Ninety-one reference serotype strains of S. pneumoniae were obtained from Statens Serum Institut (SSI), Copenhagen, Denmark (Table 1).

Media and culture conditions
Strains were stored in skim milk, tryptone, glucose, and glycerol (STGG) medium at −80°C. They were cultured on 5% sheep blood agar (Chromogen, Hyderabad) for 18-24 h at 37°C with 5% CO2. The isolates were characterized as S. pneumoniae by colony morphology, alpha hemolysis, bile solubility, and optochin susceptibility.

Serotyping
Quellung reaction was performed using Pneumotest kit and type-specific antisera (SSI, Denmark), as recommended by the manufacturer.

PCRSeqTyping
PCRSeqTyping assay was performed in two steps.
Step I involved PCR amplification and sequencing of the cpsB gene from genomic DNA. Based on the cpsB sequence data, the 91 serotypes were divided into a non-homologous group (Group I, 59 serotypes) and a homologous group (Group II, 32 serotypes). The homologous group was further subdivided into 10 subgroups based on sequence homology. Step II involved PCR and sequencing of each homology group using group-specific primers in order to identify the unique serotypes.

Nucleic acid extraction
Genomic DNA was extracted from bacterial strains using QIAamp DNA mini kit (Qiagen, Germany), as per the manufacturer's protocol.

Homology group assignment and PCRSeqTyping Homology groups
Amplifiable serotypes that shared identical intervening sequences (e.g. serotypes 2 and 41A, or 7B and 40) were assigned to 10 groups based on their homology, as determined by in silico analysis of the cpsB region. Individual primer sets were designed for each subgroup. Sequetyping data obtained in Step I were used to assign the homologous strains to subgroups (Fig. 1). Serotypes were considered homologous when the highest bit score was shared between two or more serotypes (i.e. the same amount of nucleotide variation between query and database sequences), and were then assigned to one of the 10 groups (Table 3). For homologous strains, a second round of PCR was performed using the group-specific primers specified in Table 3. The PCR products were sequenced, and the nucleotide sequence data were used to assign the serotype.
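The Step I decision rule described above (a unique top bit score gives a direct serotype call, while a shared top bit score marks the strain as homologous and routes it to Step II) can be illustrated with a minimal Python sketch. The `Hit` class, the `classify_step1` function, and the bit-score values are hypothetical and are not part of the published protocol; the sketch assumes that database hits have already been parsed into (serotype, bit score) pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One database hit for a cpsB query (hypothetical container)."""
    serotype: str
    bit_score: float


def classify_step1(hits):
    """Return ("unique", [serotype]) when a single serotype holds the top
    bit score, or ("homologous", [serotypes...]) when the top score is tied."""
    if not hits:
        raise ValueError("no database hits for this query")
    top = max(h.bit_score for h in hits)
    # Serotypes sharing the highest bit score cannot be separated by cpsB alone.
    tied = sorted({h.serotype for h in hits if h.bit_score == top})
    status = "unique" if len(tied) == 1 else "homologous"
    return status, tied


# Illustrative scores only; real values come from the database search.
example_hits = [Hit("2", 1950.0), Hit("41A", 1950.0), Hit("1", 1700.0)]
print(classify_step1(example_hits))
```

In this sketch a "homologous" result would trigger the second round of PCR with the primer set of the corresponding homology group; the mapping from tied serotypes to a group and its primers (Table 3) is not reproduced here.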

Results
PCRSeqTyping results for reference strains
The 91 pneumococcal serotype reference strains (sourced from SSI) were tested with the PCRSeqTyping protocol. All 91 strains were amplified using the modified method. In Step I of amplification and sequencing, 59 strains of the non-homologous group (Group I) were correctly assigned to their respective serotype. There were 32 strains (Group II) identified along with their homologous type. The homologous types were correctly assigned to their respective type in Step II by performing a second round of amplification using group-specific primers and sequencing. Quellung reaction performed using the Pneumotest kit (SSI), in parallel with PCRSeqTyping, showed 100% concordant results (Table 1).
The results were further evaluated by blinded testing of PCRSeqTyping, in which samples were assigned codes and evaluated in random order. No discrepancies were found between the serotypes assigned by Quellung reaction and by PCRSeqTyping for any of the reference strains.
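The concordance figure reported above is the percentage of strains for which both methods assign the same serotype. A minimal sketch of that comparison is given below; the `concordance` function and the sample codes are hypothetical, and the sketch assumes each method's calls are held in a dictionary keyed by blinded sample code.

```python
def concordance(reference_calls, test_calls):
    """Compare two {sample_code: serotype} mappings.

    Returns (percent_concordant, sorted list of discordant sample codes).
    """
    if not reference_calls:
        raise ValueError("no samples to compare")
    if reference_calls.keys() != test_calls.keys():
        raise ValueError("both methods must cover the same sample codes")
    discordant = sorted(
        code for code in reference_calls
        if reference_calls[code] != test_calls[code]
    )
    agree = len(reference_calls) - len(discordant)
    return 100.0 * agree / len(reference_calls), discordant


# Hypothetical blinded codes and calls, for illustration only.
quellung = {"S01": "1", "S02": "6B", "S03": "19A", "S04": "14"}
pcrseq = {"S01": "1", "S02": "6B", "S03": "19A", "S04": "14"}
print(concordance(quellung, pcrseq))
```

Returning the discordant sample codes alongside the percentage makes it straightforward to re-test any mismatched strain by both methods.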

PCRSeqTyping results for clinical isolates
Twenty-eight pneumococcal isolates tested in the study were from children aged <5 years with invasive pneumococcal disease. The predominant serotypes were 1, 6B, 19A, 19F, 14, and 7F (Table 2). PCRSeqTyping results and serotyping results by Quellung reaction were in concordance, without any discrepancies. Of the 28 isolates, 25 were assigned to their serotype in the first step of PCRSeqTyping. The three isolates belonging to the homologous group were subsequently identified in the second step of PCRSeqTyping.

Discussion
There is a renewed interest in pneumococcal capsular typing techniques, as a result of an increased complexity in the management of pneumococcal disease and the widespread use of pneumococcal vaccines [8]. The ability to differentiate pneumococcal strains efficiently is essential to track the emerging serovars, and for epidemiological investigations.
The limitations of the Quellung serotyping method and of many DNA-based typing protocols (PCR, restriction fragment length polymorphism, hybridization assays, microarrays, and sequencing) for S. pneumoniae are well known. Different PCR strategies, namely multiplex PCR, sequential PCR, serotype-specific PCR, and real-time multiplex PCR [25][28][29][30][31][32][33][34][35][36], targeting serotype-specific regions of cps could detect only 22 serotypes uniquely, and 48 serotypes along with their homologous types [37,38]. Although these methods cover a limited number of serotypes, PCR is a widely used technique, which avoids the use of serological reagents and requires specific expertise to conduct. Methods using multiple restriction enzymes and long cps fragments [39,40] for PCR make the amplification difficult and inconsistent. Another protocol based on sequencing of the regulatory region of cps [30,31] shows poor resolution with cross-reactivity of serotypes. An approach targeting serotype-specific glycosyl transferase genes [6] was only tested for serogroup 6 and serotype 19F. The cross-reactivity of serotypes, the requirement for a larger number of primers, and poor resolution limit the wide usage of these methods.
With the characterization of the cps locus of 92 serotypes [13], Leung et al. [26] developed a sequetyping protocol using a single primer pair, which binds in all pneumococcal serotypes. Recently, several research groups [27][41][42][43] have published their results using the sequetyping assay. Limitations of the sequetyping protocol were as follows: (i) only 84 serotypes out of 92 were predicted to be amplified by in silico analysis; (ii) cross-reacting serotypes (30/84) belonging to homologous groups could not be uniquely identified; and (iii) considering the central 732 bp region of the cpsB amplicon which could be sequenced, only 46 of 54 serotypes could be sequetyped.
In the first step of this study's modified approach, successful amplification of all 91 serotypes was achieved with the addition of a new reverse primer to amplify serotypes 25A, 25F, and 38 specifically. Additionally, the XT-5 polymerase used in the PCR amplification reactions contains Taq DNA polymerase and Pfu enzyme. This enzyme blend utilizes the powerful 5′-3′ polymerase activity of Taq DNA polymerase and the 3′-5′ exonuclease-mediated proof-reading activity of PR polymerase, resulting in high-fidelity PCR products [44]. A PCR annealing temperature of 50°C and an extension time of 1 min were found to be optimal for amplification of the cpsB gene of all 91 strains.
The serotypes were grouped into homologous (32) and non-homologous (59) based on cpsB sequence. Nonhomologous types were identified uniquely. The 32 homologous strains were further subdivided into 10 groups (HG 1-10) based on their sequence similarity. Homology group-specific primers were designed and evaluated for their ability to differentiate between strains. HG primers were designed to be able to assign the serotype accurately with second step of PCR and sequencing.
The limitation of using the 732 bp region of the cpsB amplicon in the sequetyping assay, which resulted in prediction of 46 of 54 serotypes, was overcome with the use of the Long Seq module. With this modification, approximately 1.0 kb of quality reads was obtained in a single sequencing reaction. This provided good-quality reads up to the end of the PCR template, allowing identification of cross-reacting serotypes (15B/15C, 7F/7A, 18B/18C, 9L/9N, 17F/33C, 12A/46, 6C/6D) that have a single SNP in the cpsB region.
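Because these serotype pairs are separated by a single SNP, the read must cover the variant position for the pair to be resolved. The following minimal sketch shows how differing positions can be listed for two sequences that are already aligned and of equal length; the `snp_positions` function and the toy sequences are hypothetical and do not represent actual cpsB sequence.

```python
def snp_positions(seq_a, seq_b):
    """List (1-based position, base in seq_a, base in seq_b) for every
    mismatch between two pre-aligned, equal-length nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b
    ]


# Toy sequences, not real cpsB data: one substitution at position 7.
type_x = "ATGGCTAAC"
type_y = "ATGGCTGAC"
print(snp_positions(type_x, type_y))  # [(7, 'A', 'G')]
```

A read truncated before the reported position would be identical for both types, which is consistent with the observation that the shorter 732 bp window left some pairs unresolved.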
A 100% concordance of serotype results between PCRSeqTyping and Quellung testing was seen for the 28 clinical isolates. Moving forward, the study will be extended to serotyping a larger number of clinical isolates and clinical samples. The limitation of the protocol will be in quantification and serotype identification in multiple carriage; however, studies are underway to address these issues. For multiple carriage, the PCR amplicon obtained in the first step will be subcloned into a T/A cloning vector and the individual clones will be sequenced for assigning the specific serotype. As the corresponding cpsB gene sequences of the recently discovered serotypes 6E, 6F, 6G, 6H, 11E, 20A, 20B, and 23B1 [45][46][47] were unavailable at the time of the study design, they will be included in future studies.
In the study's center, the typing cost with the Pneumotest Kit (SSI, Denmark) was US$35/isolate, while the PCRSeqTyping cost was US$10 for Group I (non-homologous strains) and US$15 for Group II (homologous strains). With the easy availability of outsourced sequencing services, the accurate and reliable PCRSeqTyping test can be adopted in a regular microbiology laboratory, even one without a sequencing facility.
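As a rough illustration of what these per-isolate figures imply, the sketch below applies them to the 91 reference strains (59 in Group I, 32 in Group II). It assumes that the quoted costs apply uniformly to every isolate and that the US$15 Group II figure is the total for both steps; neither assumption is stated in the study, and the function names are hypothetical.

```python
# Per-isolate costs in US$, as quoted for the study's center.
QUELLUNG_COST = 35
PCRSEQ_GROUP1_COST = 10   # non-homologous strains (Step I only)
PCRSEQ_GROUP2_COST = 15   # homologous strains (assumed total for both steps)


def pcrseq_panel_cost(n_group1, n_group2):
    """Total PCRSeqTyping cost for a panel split by homology group."""
    return n_group1 * PCRSEQ_GROUP1_COST + n_group2 * PCRSEQ_GROUP2_COST


def quellung_panel_cost(n_isolates):
    """Total Pneumotest (Quellung) cost for the same panel."""
    return n_isolates * QUELLUNG_COST


pcrseq_total = pcrseq_panel_cost(59, 32)       # 59*10 + 32*15 = 1070
quellung_total = quellung_panel_cost(59 + 32)  # 91*35 = 3185
print(pcrseq_total, quellung_total, quellung_total - pcrseq_total)
```

Under these assumptions, the 91-strain panel would cost US$1,070 by PCRSeqTyping against US$3,185 by Quellung typing.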
This modified typing method has several advantages over other reported methods. It involves techniques with a workflow that many microbiology laboratories can easily implement. The high-throughput PCRSeqTyping method features good discriminatory power, reproducibility, and portability, making it suitable for epidemiological studies. The assay has the flexibility to incorporate additional primers for the characterization of emerging serotypes. An added advantage of this method is that raw data from experiments can be reanalyzed when new entries are added to the serotyping database.

Conclusion
PCRSeqTyping assay is a cost-effective alternative to currently available phenotypic and molecular typing methods. The method is simple to perform, robust, and economical. It can identify all 91 serotypes specifically and uniquely.