Detection of human virulence signatures in H5N1

Journal of Virological Methods 154 (2008) 200–205

Short communication

Nicole Waybright, Ellen Petrangelo, Peggy Lowary, Joseph Bogan, Niveen Mulholland ∗

Midwest Research Institute, 1330 Piccard Drive, Rockville, MD 20850, United States

Article history: Received 30 May 2008; received in revised form 9 September 2008; accepted 11 September 2008; available online 30 October 2008.

Keywords: Pandemic; Influenza; H5N1; RT-PCR; Pyrosequencing

Abstract

A method for detecting the emergence of potential pandemic-causing influenza strains has been developed. The system first uses real-time RT-PCR to detect H5, the highly pathogenic avian influenza subtype most likely to cause a pandemic. Pyrosequencing is then employed to scan for codon changes encoding amino acids known to distinguish human influenza signatures from avian influenza signatures. The pyrosequencing assays were developed to screen at the nucleotide level for 52 amino acid changes defined as avian- or human-specific. A library has been built to screen the sequence data generated and to identify whether the strain in question is a potential threat. This method can be used to screen samples for influenza and to determine whether the detected virus contains mutations that may make it more infective or virulent to humans, which could help to thwart a pandemic outbreak. © 2008 Elsevier B.V. All rights reserved.

1. Introduction

Two primary threats face the United States with regard to highly pathogenic avian influenza (HPAI): (1) arrival of the current H5N1 virus and (2) antigenic variation yielding an H5N1 strain more virulent and/or infective to humans. The former would primarily represent a threat to the poultry industry and could be economically catastrophic. The latter would present a threat to human health and could result in a pandemic because of the lack of immunity to this subtype of influenza.

The H5N1 influenza genome consists of 8 gene segments totaling ∼13,500 nucleotides. Although the virulence of avian influenza virus is polygenic, sequencing the entire genome is not necessary for screening an emerging threat. The virus has caused a high fatality rate among infected humans (>60%); however, only a small number of individuals have been infected (385 at the time of this writing) (WHO, 2008). This is attributed to the current form of the virus not being well adapted to infect humans, resulting in very limited human-to-human transmission.

The changes in the influenza genome that would yield a virus more infective and/or virulent to humans can come about either by infidelity of the viral RNA polymerase or by genetic reassortment (Webby and Webster, 2001). Poor viral RNA polymerase fidelity results in the incorporation of point mutations in the viral genome. These types of mutations are responsible for changes in seasonal influenza strains. Reassortment of gene segments can occur when a host cell is infected with different influenza virus strains.

The pyrosequencing-based analysis described here has been designed to detect the codons encoding amino acids that may affect human virulence and infectivity. Chen et al. (2006) generated entropy plots comparing amino acid sequences of available human and avian influenza strains. Each of the 11 proteins encoded by the 8 gene segments was compared in the study. After analysis and validation, 52 amino acids were identified as being species-specific sites. Additional support for the robustness of the entropy plot-based analysis is provided by the identification of the well-characterized amino acid PB2-627, which is known to affect replication of influenza A in many mammals (Subbarao et al., 1993).

The surveillance mechanism, SAFE (sequencing for avian flu epidemic), detects currently circulating H5N1 strains as well as emerging H5N1 strains by combining real-time reverse transcription-PCR (RRT-PCR) and pyrosequencing (Fig. 1). The system first employs RRT-PCR to detect all influenza A virus subtypes simultaneously [by targeting the invariant matrix gene (M)] and to specifically detect the H5N1 subtype. This step serves as the Go/No Go step for further analysis. If H5N1 is detected, additional regions of interest are amplified and sequenced to determine whether critical human virulence signatures are present. Pyrosequencing provides real-time sequence data for target regions in the form of a pyrogram. The sequence data can be screened against a library, which allows for a simple mutation/no mutation read-out.

∗ Corresponding author. Tel.: +1 240 632 0888; fax: +1 240 632 0599. E-mail address: [email protected] (N. Mulholland). 0166-0934/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.jviromet.2008.09.013


Fig. 1. SAFE surveillance concept. Samples of interest are analyzed by RRT-PCR for presence of H5. If H5 is detected, samples are further analyzed by pyrosequencing at 52 codons to detect mutations encoding human-specific amino acids.
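The Go/No Go step in Fig. 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RrtPcrResult` type, the `triage` function, the returned labels and the 40-cycle positivity cutoff are hypothetical choices made here, not part of the published system.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed positivity cutoff: a reaction with no Ct, or a Ct above this value,
# is treated as "not detected".
CT_CUTOFF = 40.0


@dataclass
class RrtPcrResult:
    """Ct values from one multiplexed H5/M reaction (None = not detected)."""
    h5_ct: Optional[float]
    m_ct: Optional[float]


def is_detected(ct: Optional[float], cutoff: float = CT_CUTOFF) -> bool:
    """A target counts as detected when a Ct was reported at or below the cutoff."""
    return ct is not None and ct <= cutoff


def triage(result: RrtPcrResult) -> str:
    """Decide the next step for a sample (labels are hypothetical)."""
    if is_detected(result.h5_ct):
        # H5 found: continue to pyrosequencing of the signature codons.
        return "GO: pyrosequence signature codons"
    if is_detected(result.m_ct):
        # Influenza A is present, but it is not the H5 subtype.
        return "NO GO: influenza A detected, not H5"
    return "NO GO: influenza A not detected"
```

In this sketch the decision depends only on the H5 signal; the M signal is used to report whether any influenza A was seen when H5 is absent.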

2. Methods

2.1. Influenza A RNA and environmental samples

H5N1 (A/HK/213/03), H7N3 (A/Rt/NJ/65/85) and H9N2 (A/Ty/WI/66) RNAs were generously provided by Dr. Robert Webster at St. Jude Children’s Hospital. Seasonal influenza strains, H1N1 (A/PR/8/34) and H3N2 (A, X:31, A Aichi/68), were obtained from Charles River Laboratories. Avian thoracic swabs used for background testing were kindly supplied by Drs. Hohenhaus and Huang at the Maryland Department of Agriculture.

2.2. Primer/probe design

Primers and probes for RRT-PCR were designed using Primer Express 3.0 (Applied Biosystems Inc.). Primers used for pyrosequencing were designed using PSQ Assay Design (Biotage Inc.). All primers and probes used are listed in Supplemental Table 1.

2.3. Real-time RT-PCR

TaqMan One-Step RT-PCR Master Mix reagent (Applied Biosystems Inc.) was used for all real-time RT-PCR detection. RT-PCR reaction volumes were 50 µl and consisted of 10 µl sample, 30 µl master mix (MM; 25 µl Universal MM, 1.25 µl MultiScribe + inhibitor, and 3.75 µl H2O), and 10 µl primer/probe mix. Real-time RT-PCR was performed on the ABI 7900 with the following cycling conditions: 50 °C for 30:00; 95 °C for 10:00; then 45 cycles of 95 °C for 00:15 and 60 °C for 1:00.

2.4. Pyrosequencing

One-Step RT-PCR mix (Qiagen) was used to amplify target using the primer pairs described in Supplemental Table 2. Amplification conditions were: 50 °C for 30:00; 95 °C for 15:00; 45 cycles of 95 °C for 00:45, 55 °C for 00:45 and 72 °C for 1:00; then 72 °C for 10:00. Amplicon was purified using sepharose beads and sequenced using the primers listed in Supplemental Table 2.

2.5. Background RNA extraction

Avian thoracic swab samples were extracted using a QIAamp Viral RNA Mini Kit (Qiagen) as indicated by the manufacturer.

3. Results

3.1. Primer/probe and assay design

Primers and probes for real-time RT-PCR were designed against the hemagglutinin (HA) and the matrix (M) gene segments of strain A/common magpie/Hong Kong/2125/2006 (H5N1). After initial design and testing, all sequences available through the Influenza Virus Resource (NCBI) (Bao et al., 2008) were aligned, and positions within the primer/probe sequences were analyzed to identify those that varied in more than 25% of the available sequences. H5N1 sequences were used for designing the HA primer/probe set, and all influenza A sequences were used for designing the M primer/probe set. Final primer and probe sequences were challenged in silico against low pathogenic avian influenza (LPAI) H5 sequences. Only primer and probe sets that would not amplify or detect LPAI H5 were selected for further analysis. Primer and probe sequences were re-synthesized to include mixed bases at sites where alternate nucleotides were found >25% of the time. Primer/probe concentrations were optimized using a standard optimization matrix with the newly designed primer/probe sets and H5N1 RNA. The design process was repeated for the H7 and H9 influenza A subtypes. All primers and probes used are listed in Supplemental Table 1. Each of the HA-specific probes is labeled with FAM at the 5′-end and has MGB attached at the 3′-end. The M probe is labeled with VIC at the 5′-end and has MGB attached at the 3′-end. Each of the HA-specific primer/probe sets has been tested in multiplex real-time RT-PCR reactions with the M-specific primer/probe set.

3.2. RT-PCR limit of detection

The limit of detection (LOD) for the H5 RRT-PCR assay was determined by using a 10-fold dilution series of purified viral RNA. Each dilution in the series was tested in triplicate. The lowest concentration yielding 3/3 positive indications was deemed the broad-range LOD. RNA quantification by spectrophotometry includes RNA purified from the cells used to propagate the virus. Because this results in artificially high concentrations, hemagglutination titers of allantoic fluid are given as an alternate method for target quantification. The H5N1 RNA quantity based on the HA titer of the source allantoic fluid was calculated as follows. The H5N1 RNA used for LOD determination was purified from 400 µl of allantoic fluid with a hemagglutination titer of 20 HA units/µl.

Fig. 2. Limit of detection of RRT-PCR assays for H5, H7 and H9. RNA purified from allantoic fluid of infected chicken eggs was used to determine the lower limits of detection in the developed RRT-PCR assays. RNA is expressed as total RNA and dilutions of the original HA titer (see text). Each RRT-PCR assay is multiplexed to detect an HA subtype and the invariant M gene site. Ct values are plotted against total RNA and dilutions of the RNA yielding at least two out of three detections (≤40 Ct). Amplification plots with raw data are shown.
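The LOD rule used for Fig. 2 (the lowest input amount at which all replicates are positive) can be written out as a short sketch. The function name, the 40-cycle cutoff as a default and the example Ct values below are all assumptions made for illustration; the numbers are invented and are not data from this study.

```python
from typing import Dict, List, Optional

CT_CUTOFF = 40.0  # assumed positivity cutoff


def limit_of_detection(
    series: Dict[float, List[Optional[float]]],
    cutoff: float = CT_CUTOFF,
) -> Optional[float]:
    """Return the lowest input amount at which every replicate is positive.

    `series` maps input amount (any consistent unit) to the replicate Ct
    values, with None meaning "not detected". Returns None when no amount
    gives all-positive replicates.
    """
    fully_detected = [
        amount
        for amount, cts in series.items()
        if cts and all(ct is not None and ct <= cutoff for ct in cts)
    ]
    return min(fully_detected) if fully_detected else None


# Invented 10-fold dilution series, tested in triplicate (amounts in pg).
example_series = {
    3.0: [30.1, 30.4, 30.2],
    0.3: [33.9, 34.2, 34.0],
    0.03: [37.5, None, 38.1],   # only 2/3 positive, so not the LOD
    0.003: [None, None, None],
}
```

With the invented series above, `limit_of_detection(example_series)` picks 0.3 pg, because the 0.03 pg dilution was positive in only two of three replicates.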

The total purified RNA was resuspended in 100 µl and was defined as RNA representing 80 HA units/µl (400 µl × 20 HA units/µl = 8000 HA units in 100 µl). Based on this definition, the H7 RNA represented 20 HA units/µl and the H9 RNA represented 40 HA units/µl. The LOD determination for each assay is shown graphically in Fig. 2, which also gives the RNA dilution for use when the LOD is considered in terms of HA titer. Based on spectrophotometric analysis of RNA concentration, the LOD for the H5 assay is 0.6 pg; the LOD for the H7 assay is 0.06 ng; and the LOD for the H9 assay is 1.8 pg.

3.3. RT-PCR sensitivity and specificity

Sensitivity, or the probability of detecting target when target is truly present, was determined by challenging the assays with ‘clean’ and ‘dirty’ matrices spiked with appropriate target. Specificity, or the probability of not detecting target when target is truly absent, was determined by challenging the assays with ‘clean’ and ‘dirty’ matrices without added target. Contingency tables for each developed assay are presented in Table 1. Cohen’s Kappa values indicate excellent agreement between the assay results and the known status of the samples for each of the assays. The false positive rate was determined using the following equation: 100% × [1 − (true negative/known negative)], where true negatives are unspiked samples not detected in RRT-PCR and known negatives are unspiked samples. Similarly, the false negative rate was determined using the following equation: 100% × [1 − (true positive/known positive)], where true positives are spiked samples detected in the RRT-PCR assay and known positives are spiked samples. Forty replicates of each experimental scenario were tested to allow a ∼10% failure rate to be determined with >95% confidence. The average Ct values and the false positive and negative rates are presented in Table 2. All three multiplexed RT-PCR assays resulted in 0% false positive rates in both ‘clean’ and ‘dirty’ matrices.
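As a worked check of these statistics, the sketch below computes the false negative rate, false positive rate and Cohen's kappa from a 2 × 2 table, using the H5 and M counts reported in Table 1. The class and function names are hypothetical. The false negative rate is taken as the fraction of spiked samples that were missed and the false positive rate as the fraction of unspiked samples that were detected, which is the convention consistent with the Table 2 footnotes.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    tp: int  # spiked samples detected
    fn: int  # spiked samples not detected
    fp: int  # unspiked samples detected
    tn: int  # unspiked samples not detected


def false_negative_rate(t: Contingency) -> float:
    """100% x [1 - (true positive / known positive)]."""
    return 100.0 * (1 - t.tp / (t.tp + t.fn))


def false_positive_rate(t: Contingency) -> float:
    """100% x [1 - (true negative / known negative)]."""
    return 100.0 * (1 - t.tn / (t.tn + t.fp))


def cohens_kappa(t: Contingency) -> float:
    """Agreement between assay call and true status, corrected for chance."""
    n = t.tp + t.fn + t.fp + t.tn
    observed = (t.tp + t.tn) / n
    # Chance agreement from the row and column totals.
    expected = (
        (t.tp + t.fp) * (t.tp + t.fn) + (t.fn + t.tn) * (t.fp + t.tn)
    ) / n**2
    return (observed - expected) / (1 - expected)


# Counts as reported in Table 1.
h5 = Contingency(tp=79, fn=1, fp=0, tn=118)
m = Contingency(tp=677, fn=19, fp=0, tn=480)
```

Rounded to four decimals, `cohens_kappa(h5)` is 0.9895 and `cohens_kappa(m)` is 0.9668, matching the values printed with Table 1. The pooled H5 false negative rate is 1.25% (1 of 80 spiked samples), which corresponds to 2.5% in the clean matrix and 0% in the dirty matrix.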

Table 1. 2 × 2 contingency tables.

                True positive   True negative   Total
H5 RRT-PCR+     79              0               79
H5 RRT-PCR−     1               118             119
Total           80              118             198
κ = 0.9895

                True positive   True negative   Total
H7 RRT-PCR+     79              0               79
H7 RRT-PCR−     1               118             119
Total           80              118             198
κ = 0.9895

                True positive   True negative   Total
H9 RRT-PCR+     77              0               77
H9 RRT-PCR−     3               118             121
Total           80              118             198
κ = 0.9625

                True positive   True negative   Total
M RRT-PCR+      677             0               677
M RRT-PCR−      19              480             499
Total           696             480             1176
κ = 0.9668

Table 2. Average Ct values, false positive and false negative rates of RRT-PCR assays.

Sample               H5 (average Ct; n = 40)   M (average Ct; n = 40)
H5 spiked (clean)    33.11                     31.75
H5 spiked (dirty)    32.61                     31.36
Unspiked (clean)     ND                        ND
Unspiked (dirty)     ND                        ND
H5 PC (a)            33.15                     32.23

Sample               H7 (average Ct; n = 40)   M (average Ct; n = 40)
H7 spiked (clean)    29.69                     31.37
H7 spiked (dirty)    30.44                     31.07
Unspiked (clean)     ND                        ND
Unspiked (dirty)     ND                        ND
H7 PC (b)            30.41                     31.82

Sample               H9 (average Ct; n = 40)   M (average Ct; n = 40)
H9 spiked (clean)    33.24                     33.06
H9 spiked (dirty)    33.84                     32.92
Unspiked (clean)     ND                        ND
Unspiked (dirty)     ND                        ND
H9 PC (c)            32.82                     32.81

(a) False positive rate (clean) = 0%; false positive rate (dirty) = 0%; false negative rate (clean) = 2.5%; false negative rate (dirty) = 0%.
(b) False positive rate (clean) = 0%; false positive rate (dirty) = 0%; false negative rate (clean) = 0%; false negative rate (dirty) = 2.5%.
(c) False positive rate (clean) = 0%; false positive rate (dirty) = 0%; false negative rate (clean) = 0%; false negative rate (dirty) = 7.5%.

All three assays resulted in <10% false negative rates in both ‘clean’ and ‘dirty’ matrices. These data show the high level of accuracy obtained with the assays.

Specificity of the RT-PCR detection assays was further established by challenging each assay with four different HA subtypes in ‘clean’ and ‘dirty’ matrices. ‘Clean’ matrices are mock extractions of water and ‘dirty’ matrices are extractions of chicken throat swabs. The influenza A H-subtypes selected were the three potentially pandemic-causing subtypes (H5, H7 and H9) and two seasonal subtypes (H1 and H3). The reactions were performed in replicates of 40 because this was determined to be the minimum number required to detect a ∼10% failure rate with a confidence level of 95%.
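The source does not state how the figure of 40 replicates was derived. One common way to reason about replicate numbers, shown here purely as an assumption, is the zero-failure binomial bound: if the true failure rate were p, the chance of seeing no failures in n independent replicates is (1 − p)^n, and n is chosen so that this chance falls below 1 − confidence. The function names are hypothetical.

```python
import math


def min_replicates(failure_rate: float, confidence: float) -> int:
    """Smallest n for which a failure rate this high would produce at least
    one failure with the given confidence (zero-failure binomial bound)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


def max_failure_rate(n: int, confidence: float) -> float:
    """Largest failure rate still compatible, at the given confidence,
    with observing zero failures in n independent replicates."""
    return 1 - (1 - confidence) ** (1 / n)
```

Under this particular argument, a 10% failure rate at 95% confidence needs 29 replicates, and 40 failure-free replicates bound the failure rate at roughly 7%. The study's own calculation may have used a different method, so these numbers are only a plausibility check on the order of magnitude.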

The H5, H7 and H9 assays specifically detected their respective subtypes, while the M assay detected each subtype tested 100% of the time (Table 3). The H7/M and the H9/M assays failed to detect the M target from H5N1 RNA only when the RNA was very dilute. The matrix type, ‘clean’ or ‘dirty’, had no effect on the specificity results. These data show that the RRT-PCR assays are highly specific, detecting influenza A in each case while discriminating between subtypes.

3.4. Pyrosequencing

The concept behind the screening tool is to detect an emerging threat. Pyrosequencing was employed to detect amino acid changes.

Table 3. Specificity analysis of RRT-PCR assays. Numbers are Ct values; n = 40.

Assay H5/M

Input RNA:
H1N1 RNA: 500 pg, 50 pg, 5 pg
H3N2 RNA: 500 pg, 50 pg, 5 pg
H5N1 RNA: 3 pg, 0.3 pg, 0.03 pg
H7 RNA: 6 ng, 0.6 ng, 0.06 ng
H9 RNA: 45 pg, 4.5 pg, 0.45 pg
H5 PC 3 pg, NTC, NTC
H7 PC 6 ng, NTC, NTC
H9 PC 1.8 ng, NTC, NTC













18.00 22.32 27.71 18.11 22.21 26.89 – – – 20.56 24.17 27.14 24.18 28.19 32.02 – – –

17.04 21.25 27.56 17.48 21.24 26.04 – – – 19.93 23.72 27.43 23.81 27.75 31.23 – – –


23.28 27.22 31.10 23.55 28.31 31.91 37.19 42.37 43.08 – – – 30.11 34.27 38.32

21.96 26.14 30.19 22.43 26.76 30.35 35.47 40.61 ND – – – 29.21 32.88 37.03


23.29 27.45 30.8 23.44 27.63 31.71 35.85 39.32 ND 29.44 33.57 36.42 – – –

22.03 26.11 29.91 22.49 26.64 30.16 34.58 38.11 40.72 28.64 32.39 35.63 – – –

27.02 ND ND

– – –

– – –

– – –

– – –

NTC, no template control; ND, not detected; PC, positive control. a Target-Swab type.

29.76 ND ND



These changes are detected at the nucleotide level to distinguish human influenzas from avian influenzas. Chen et al. (2006) generated position-specific entropy profiles by comparing amino acid sequences of 95 avian and 306 human influenza strains. The analysis yielded 52 amino acids with entropy values less than −0.4, which were thus defined as conserved within human and within avian viruses while differing between the two groups.

Pyrosequencing employs enzymatic reactions, catalyzed by ATP sulfurylase and luciferase, to monitor the inorganic pyrophosphate released during nucleotide incorporation. Real-time sequence data are provided for target regions in the form of a pyrogram. Detection software analyzes the peaks of the pyrogram to determine the exact sequence. The sequence read is then screened against a library by IdentiFire software to determine its origin.

Pyrosequencing assays were designed against the 52 amino acid sites described by Chen et al. (2006) to distinguish human from avian influenzas. Because some of the amino acids were close to each other, they could be incorporated into a single sequence read. Within the parameters of the assay design tested, the longest read with consistently accurate results was 30 nucleotides. Therefore, codons had to encode amino acids within 11 positions of each other to be incorporated into a single sequence read.
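To make the idea of screening at the nucleotide level concrete, the sketch below translates the codon read at one signature site and reports whether it encodes the avian or the human residue. It uses PB2-627 as the example, where glutamic acid (E) is the residue widely reported for avian viruses and lysine (K) for human-adapted viruses. The `SIGNATURES` dictionary, the function names and the returned labels are assumptions for illustration and do not reproduce the contents of the authors' library; the sketch also assumes the read is given in coding-strand orientation.

```python
# Standard genetic code, built from the conventional TCAG ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Illustrative signature entry (hypothetical structure).
SIGNATURES = {
    "PB2-627": {"Avian": "E", "Human": "K"},
}


def translate_codon(codon: str) -> str:
    """Translate one codon; accepts RNA (U) or DNA (T) letters."""
    return CODON_TABLE[codon.upper().replace("U", "T")]


def classify_site(site: str, codon: str) -> str:
    """Label a codon read at a signature site as avian, human or neither."""
    residue = translate_codon(codon)
    for species, expected in SIGNATURES[site].items():
        if residue == expected:
            return f"{species} {site}"
    return f"Unclassified {site}"
```

For example, `classify_site("PB2-627", "AAG")` returns `"Human PB2-627"`, while `"GAG"` at the same site returns `"Avian PB2-627"`. A real assay would also need to handle reads sequenced from the complementary strand and sites grouped into one read.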

Table 4. Accuracy of pyrosequencing assays (% of sequencing reactions with an incorrect call).

#    Target            Clean matrix   Dirty matrix
1    HA-237            0              7.5
2    HA-391            0              0
3    MP1-115/121       0              2.5
4    MP1-137           2.5            7.5
5    MP2-11            0              7.5
6    MP2-20            2.5            5
7    MP2-57            0              0
8    MP2-86            5              2.5
9    NP-16             7.5            0
10   NP-33             10             12.5
11   NP-61             0              2.5
12   NP-100            0              5
13   NP-109            0              0
14   NP-214            5              0
15   NP-283            2.5            0
16   NP-293            0              0
17   NP-305/313        5              2.5
18   NP-357            0              2.5
19   NP-372            2.5            5
20   NP-422            0              12.5
21   NP-442            0              0
22   NP-455            0              2.5
23   NS1-227/NS2-70    0              0
24   NS2-107           0              0
25   PA-28             0              0
26   PA-55/57          0              0
27   PA-225            0              2.5
28   PA-268            0              2.5
29   PA-356            0              0
30   PA-382            0              0
31   PA-404/409        0              0
32   PA-552            0              0
33   PB1-327           0              0
34   PB1-336           0              0
35   PB1F2-73/76       0              2.5
36   PB1F2-79          0              0
37   PB1F2-82/87       7.5            10
38   PB2-44            0              2.5
39   PB2-199           7.5            7.5
40   PB2-271           2.5            5
41   PB2-475           0              7.5
42   PB2-588           0              2.5
43   PB2-613           0              2.5
44   PB2-627           0              0
45   PB2-674           0              0
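Given per-reaction miscall rates such as those in Table 4, the benefit of running a target in triplicate can be estimated with a simple majority vote. The sketch below is an illustration only: the function names are hypothetical, and the error estimate assumes that the three reads fail independently. It is also a conservative upper bound, because two wrong reads that disagree with each other would not form a wrong majority.

```python
from collections import Counter
from typing import List, Optional


def majority_call(calls: List[str]) -> Optional[str]:
    """Return the call made by more than half of the replicates, else None."""
    call, count = Counter(calls).most_common(1)[0]
    return call if count > len(calls) / 2 else None


def triplicate_error_bound(p: float) -> float:
    """Probability that at least 2 of 3 independent reads are wrong,
    given a per-read miscall probability p."""
    return 3 * p**2 * (1 - p) + p**3
```

With the highest miscall rate in Table 4 (12.5%), `triplicate_error_bound(0.125)` is about 0.043, so under these assumptions a triplicate majority vote would be wrong at most about 4.3% of the time rather than 12.5%.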

The PCR and sequencing primers used in each assay are shown in Supplemental Table 2. A library was built to include the target sequences that encode the human and avian amino acids for each of the 52 sites. Each potential sequence read was labeled as “Species Gene AA”. For example, the sequence encoding amino acid number 237 in the HA gene is represented in the library as “Human HA-237” or “Avian HA-237”.

3.5. Accuracy

Each of the 45 pyrosequencing reactions was tested for accuracy by analyzing H5N1 RNA spiked into extracts from ‘clean’ and ‘dirty’ matrices. As with the RRT-PCR assays, each pyrosequencing reaction was tested on 40 ‘clean’ and 40 ‘dirty’ samples. The percentages of sequence reads that resulted in an incorrect sequence determination are presented in Table 4. The overall accuracy was high, with 33/45 reactions resulting in no wrong sequencing calls in the ‘clean’ matrix. Only one assay (NP-33) resulted in 10% erroneous sequencing calls in the ‘clean’ matrix. Incorrect sequence calls were only slightly higher in the ‘dirty’ matrix, with the highest being 12.5%. Running each target in triplicate would reduce the risk of miscalling target sequences when using this system.

4. Discussion

A surveillance method for detecting mutations in the influenza genome that would yield a virus more infective and virulent to humans is described. The amino acid sites targeted for surveillance were selected based on thorough sequence analysis and entropy plots comparing human and avian influenzas (Chen et al., 2006). These analyses resulted in the definition of 52 amino acid sites throughout the influenza genome as either human or avian signatures. The implication of the differences at these amino acid sites between human and avian influenzas is that they are critical in establishing a strain capable of infecting and maintaining viability in the human population. Subsequent reports have described additional species-specific residues (Finkelstein et al., 2007; Miotto et al., 2008); further assay development would be required to incorporate detection at these sites.

The surveillance method is useful both for monitoring HPAI in animals and, most importantly, for detecting the emergence of mutations or reassortants that threaten the human population. The library generated for use with the Biotage IdentiFire interface produces a data output designed for the non-technical end user. Any sequence matched with a human signature in the library will immediately alert the end user to a mutated or altered influenza strain. The materials cost of analyzing one sample by the RRT-PCR method described is roughly $7, and pyrosequencing of all 52 sites is ∼$140. Conventional RT-PCR would cost $9 per sample, and capillary sequencing of the influenza genome ∼$269. The pyrosequencing method, therefore, is economical and has the added benefit of requiring less labor time because of the method of data reporting used by the IdentiFire software.

Establishing the Influenza Genome Sequencing Project greatly facilitated the analysis comparing human and avian influenza sequences (Bao et al., 2008; Ghedin et al., 2005). Additional entries into the database will allow continual analyses and greater understanding of the changes required for conversion of an avian to a human influenza strain. While informative and valuable, these in silico studies cannot predict the number of amino acid changes required for avian to human conversion. Representation of the assayed amino acids in past pandemic strains is not 100%. The 1918 H1N1 pandemic strain, which killed >40 million people, contains 16 of the 52 human-specific amino acids (31%). The H2N2 strain from 1957 contains 43 (83%) and the 1968 H3N2 pandemic strain contains 45 human-specific amino acids (87%). Structural modeling and biochemical experiments must be conducted to determine the role of these amino acids in the pathogenesis of influenza viruses in humans.

Acknowledgments

Drs. Robert Webster and Richard Webby (St. Jude Children’s Research Hospital) kindly provided influenza RNA. The Maryland Department of Agriculture provided chicken swabs. Dr. John Lednicky (MRI) kindly provided influenza RNA and critical evaluation of the manuscript. Ms. Karin Bauer (MRI) provided statistical support. The authors would also like to thank Dr. Ted Hadfield for critical review of the manuscript. This work was funded in part by the Department of Homeland Security.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.jviromet.2008.09.013.


References

Bao, Y., Bolotov, P., Dernovoy, D., Kiryutin, B., Zaslavsky, L., Tatusova, T., Ostell, J., Lipman, D., 2008. The influenza virus resource at the National Center for Biotechnology Information. J. Virol. 82, 596–601.

Chen, G.-W., Chang, S.-C., Mok, C.-K., Lo, Y.-L., Kung, Y.-N., Huang, J.-H., Shih, Y.-H., Wang, J.-Y., Chiang, C., Chen, C.-J., Shih, S.-R., 2006. Genomic signatures of human versus avian influenza A viruses. Emerg. Infect. Dis. 12, 1353–1360.

Finkelstein, D.B., Mukatira, S., Mehta, P.K., Obenauer, J.C., Su, X., Webster, R.G., Naeve, C.W., 2007. Persistent host markers in pandemic and H5N1 influenza viruses. J. Virol. 81, 10292–10299.

Ghedin, E., Sengamalay, N.A., Shumway, M., Zaborsky, J., Feldblyum, T., Subbu, V., Spiro, D.J., Sitz, J., Koo, H., Bolotov, P., Dernovoy, D., Tatusova, T., Bao, Y., St George, K., Taylor, J., Lipman, D.J., Fraser, C.M., Taubenberger, J.K., Salzberg, S.L., 2005. Large-scale sequencing of human influenza reveals the dynamic nature of viral genome evolution. Nature 437, 1162–1166.

Miotto, O., Heiny, A., Tan, T.W., August, J.T., Brusic, V., 2008. Identification of human-to-human transmissibility factors in PB2 proteins of influenza A by large-scale mutual information analysis. BMC Bioinformatics 9 (Suppl. 1), S18.

Subbarao, E.K., London, W., Murphy, B.R., 1993. A single amino acid in the PB2 gene of influenza A virus is a determinant of host range. J. Virol. 67, 1761–1764.

Webby, R.J., Webster, R.G., 2001. Emergence of influenza A viruses. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 356, 1817–1828.

WHO, 2008. Cumulative Number of Confirmed Human Cases of Avian Influenza A/(H5N1) Reported to WHO. World Health Organization.