Gamete Biology
Dairy and Swine Research and Development Center,3 Agriculture and Agri-Food Canada, Sherbrooke, Quebec, Canada J1M 1Z3
The Semex Alliance, Inc.,4 Saint-Hyacinthe, Quebec, Canada J2S 7B8
ABSTRACT
Spermatozoa are terminally differentiated cells produced during the complex process of spermatogenesis. Although the role of their residual RNA content is still being debated, this transcriptome may represent a fingerprint of spermatogenesis quality. In the present study, we undertook differential transcript profiling of spermatozoa from fertile bulls with extreme nonreturn rates (NRRs): a low-fertile group, and a high-fertile group. Using the suppression-subtractive hybridization technique in combination with macroarray analysis, we also identified novel genes. Both extreme NRR index groups retained redundant identity, such as ribosomal and mitochondrial sequences, at a statistically significant level. An elevated number of 12S, 18S, and Large Chain R rRNA gene copies were found in low-fertile bulls and validated in spermatozoa by quantitative RT-PCR for a small cohort of bulls with known fertility index. Whereas the high-NRR library exhibited a large proportion (29%) of transcripts associated with known functions (e.g., metabolism, signal transduction, translation, glycosylation, and protein degradation), only 10% of the low-NRR sequences did. This difference is also conveyed by two other categories: 17% Bovine Genome and 48% unknown in the high-NRR library, compared with 3% and 80%, respectively, in the low-NRR library. Some of the unknown transcripts are similar to expressed sequence tags detected in the male reproductive organ of certain plants and retain homology to a putative human protein. Whereas the individual transcriptome profiles may be useful in fertility assessment, these findings also suggest cross-species conservation, could contribute to a better understanding of spermatogenesis, and provide new insights regarding idiopathic infertility.
gamete biology, sperm, spermatogenesis
Spermatozoa transport not only paternal DNA but also protein-coding and noncoding RNAs to the oocyte. Since the early study by Pessot et al. [1], several other studies have reported the detection of specific transcripts in mammalian spermatozoa [2]. The majority of such transcripts have been linked to sperm development; however, with the advent of high-throughput methods such as microarray technology, the known functional diversity of spermatozoal RNA is expanding.
Spermatozoa, which are devoid of transcriptional and translational activity, are terminally differentiated cells produced in the complex process of spermatogenesis. Spermatogenic cells produce an impressive amount of polyadenylated RNA, accounting for as much as 30% of total RNA [2]. Many researchers have stated that spermatogenic cells produce too much polyadenylated RNA for their own needs, especially given the reduced translation efficiency of these cells [3]. Another peculiarity of spermatogenesis is that much of the RNA produced in spermatogenic cells is stable. Some RNAs transcribed in spermatocytes are preserved until late spermiogenesis [4].
Various authors have used RNA profiles to qualify a function or role of sperm in early embryogenesis. The mRNA distributions for specific transcripts between low- and high-motile sperm isolated from the same sample have been compared [5]. The levels of different transcripts coding for molecules involved in either nuclear condensation or sperm function are associated with the motility status of the sperm cells, although viability was similar in high- and low-motile fractions. Recent studies suggest that spermatozoal RNA might have a role in postfertilization events. Ostermeier et al. [6] reported that some mRNAs detected in sperm cells and early embryos are absent in nonfertilized oocytes. Furthermore, microRNAs in spermatozoa [7] could also be involved in embryo development. Because of its origins and the peculiarities of spermatogenesis, spermatozoal RNA can be regarded not only as remnants that escape cytoplasmic extrusion during the last steps of spermiogenesis but also as a fingerprint of spermatogenesis quality and fertility outcome. An interesting clinical application of transcript profiling has been proposed that consists of assessing fertility potential using sperm samples as noninvasive biopsy specimens. For example, abnormal KLHL10 gene (two missense mutations) was detected in spermatozoal RNA and associated with impaired fertility. Spermatozoal RNA profiles of normal-fertile men have been established [8, 9] and will play an important role in the assessment of idiopathic cases of infertility [10]. Transcript analysis of spermatozoa content is now considered to be a noninvasive approach for analyzing genetic defects of human spermatogenesis [11].
Genetic selection for milk and meat production in the cattle industry has been linked to a decrease in fertility. In general, the selection of sires for reproduction is based solely on visual semen examination and sperm counts. Other laboratory assays that can be used in the evaluation of spermatozoal cellular attributes include the sperm chromatin structure assay, induction of capacitation/acrosome reaction, zona-free hamster egg penetration assay, and evaluation of sperm membrane integrity [12–15]. The main field reference used for genetic evaluation of male fertility potential, however, is the nonreturn rate (NRR), an assessment that cannot be predicted by conventional parameters [16]. Bulls differ in terms of their reproductive performance as evaluated on the basis of NRR. This measure is defined as the percentage of cows that were inseminated and not reinseminated within a specified interval, typically 56 days (i.e., 56-day NRR). The NRR is the result of conception [17] and, therefore, is called the field fertility measure. Although conventional tests of semen quality are widely used to assess fertility status, they are still considered to be inconsistent predictors of field reproductive efficiency unless several sophisticated protocols are performed [18, 19]. Subfertility is still unpredictable and, thus, has significant costs for the dairy industry. The cattle artificial insemination industry is focusing on the development of accurate methods capable of predicting field fertility from frozen semen.
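The NRR definition given above can be illustrated with a minimal Python sketch. The function name and the insemination counts are hypothetical and serve only to make the arithmetic explicit; they do not reproduce the evaluation procedure used by the Canadian Dairy Network.

```python
def nonreturn_rate(inseminated: int, reinseminated_within_interval: int) -> float:
    """Percentage of inseminated cows NOT reinseminated within the interval
    (e.g., 56 days for the 56-day NRR)."""
    if inseminated <= 0:
        raise ValueError("inseminated must be a positive count")
    if not 0 <= reinseminated_within_interval <= inseminated:
        raise ValueError("reinseminated count must lie between 0 and inseminated")
    not_returned = inseminated - reinseminated_within_interval
    return 100.0 * not_returned / inseminated


# Hypothetical counts: 1000 first inseminations, 290 cows returned within 56 days.
print(nonreturn_rate(1000, 290))
```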
The objective of the present study was to evaluate the differences in RNA profiles among spermatozoa from bulls with known fertility indexes, with the aim of identifying potential mRNAs that could serve as fertility markers. The technique of suppression-subtractive hybridization (SSH) was used to enrich libraries with transcripts that are differentially expressed between high- and low-fertile bulls based on the NRR indexes.
All procedures were reviewed and approved by the Institutional Animal Care and Use Committee of the Dairy and Swine Research and Development Center of Agriculture and Agri-Food Canada (document 160, October 2002).
Cryopreserved semen from 10 bulls of known NRR sired by the same bull was used to construct the cDNA libraries. Semen from the most and least fertile sons (5 of 180 in each extreme tail of the NRR evaluation) of a Canadian elite bull-sire was assembled to form a high-performing pool (high NRR, 71%) and a low-performing pool (low NRR, 65%). The NRR is computed by the Canadian Dairy Network on a semiannual basis. Consideration was also given to routine laboratory measurements (ejaculate volume, sperm concentration, and motility rate) to ensure that lots with similar characteristics were used. Differential transcript abundance was validated on these 10 family-sire bulls as well as for 40 genetically unrelated bulls selected on the basis of their NRR performance (20 high and 20 low). This group is referred to hereafter as the OTHER group of sires.
Semen Preparation and Processing
Frozen semen was thawed in water at 30°C for 1 min. Sperm cells were isolated from the extender, cryoprotectant, and other cell types by centrifugation on discontinuous density (40%–70%) gradients (Percoll; Sigma-Aldrich) in TL-SPERM (100 mM NaCl, 3.11 mM KCl, 0.3 mM NaH2PO4, 10 mM Hepes, 3 ml of sodium lactate, 2 mM CaCl2 · 2H2O, 4 mM MgCl2 · 6H2O, and 25 mM NaHCO3). For the preparation of the RNA samples used in the SSH libraries, 12 straws per bull were processed. The contents of two straws per bull were layered per gradient, and a total of six Percoll gradients were processed per bull. The gradient and spermatozoa were centrifuged at 700 x g for 45 min at room temperature.
Leukocyte-Enriched and Semen Preparations Examined by Flow Cytometry
Bovine lymphocyte cells were isolated from blood samples from three different cows and used to validate the purification of motile spermatozoa. Approximately 15 ml of blood were drawn from the caudal vein into 10-ml, K3EDTA-vacuum tubes (Becton, Dickinson and Company) and transported on ice to the laboratory. Within 1 h, the blood sample was centrifuged for 40 min at 350 x g at 15°C. The buffy-coat fraction (leukocyte-enriched) was pipetted, and cells were washed with PBS. The number of viable cells was determined by trypan blue exclusion using the hemocytometer. The semen and buffy-coat cell mixture was deposited on top of a Percoll gradient before centrifugation, which was conducted as described above. Monoclonal antibodies were used to detect the bovine T-cell marker CD2 and the T-lymphocyte subpopulation markers CD4 and CD8: anti-CD2 (BAQ95A), anti-CD4 (IL-A11), anti-CD8 (CACT80C), anti-B cells (LCT-2A), and anti-γδ T cells (GB21A). All antibodies were purchased from VMRD. The protocol used is as described previously [20] but with minor modifications. Isolated leukocytes were resuspended in PBS-azide-BSA, and 1 000 000 cells were then transferred to 12- x 75-mm polypropylene tubes (two for each primary antibody). The assays included controls (no primary antibody). Ice-cold PBS containing 0.5% BSA was used to dilute the antibodies and to wash the plates. All cell-labeling steps were done on ice, and all centrifugations were performed at 10°C. Fluorescein isothiocyanate-conjugated goat anti-mouse immunoglobulin G2a/2b and phycoerythrin-conjugated goat anti-mouse immunoglobulin G1 (BD Pharmingen) were used as secondary antibodies for fluorescent staining. Single-labeling was performed by incubating the cells with 50 µl of the specific antibody at 2.5 µg/ml for 20 min, followed by two washes. The cells were then labeled with 50 µl of the secondary antibody at 2.5 µg/ml, followed by two washes. Double-labeling was performed as follows to characterize CD4 and CD8 cell populations: Cell preparations were simultaneously incubated with 25 µl of anti-CD4 at 5 µg/ml and with 25 µl of anti-CD8 at 5 µg/ml for 20 min and washed twice. Then, 25 µl of each secondary antibody at 5 µg/ml were added for the labeling of cells. The cells were resuspended in PBS supplemented with 2% paraformaldehyde and analyzed on a Coulter Epics XL-MCL flow cytometer using Expo 32 software (Beckman Coulter). Lymphocytes were gated by using forward and side light-scattering, and data were collected for 25 000 events. Background fluorescence was determined by labeling cells with secondary antibodies only. Gates of each leukocyte type were customized to achieve the lowest percentage of nonspecific fluorescence and the highest percentage of specific fluorescence. Leukocyte types were identified by flow cytometry based on their forward and side scatter parameters.
After the centrifugation, sperm cells were washed in TL-SPERM, and the sperm pellets were processed for RNA extraction using TRIzol reagent (Invitrogen). After resuspension of the cells in TRIzol reagent, the cells were incubated at 60°C for 30 min as previously described [21], with vortexing every 10 min for lysis. The procedure was then carried out according to the manufacturer's instructions, except that ethanol precipitation was performed in the presence of linear acrylamide (Ambion/Applied Biosystems). A DNase I (Ambion) treatment was performed on every sample, and RNA was precipitated with ammonium acetate and ethanol in the presence of linear acrylamide. Quantification of RNA was done with the RiboGreen reagent (Molecular Probes) and simultaneously in the NanoDrop ultraviolet/visible spectrophotometer (NanoDrop Technologies). A fraction of each RNA sample (5 ng/µl) was tested by multiplex PCR using the multiplex Master Mix Kit (Qiagen). The PCR program was as follows: initial denaturation at 95°C for 15 min; 35 cycles of denaturation at 94°C for 30 sec, primer annealing at 55°C for 30 sec, and extension at 72°C for 1 min; and final extension at 72°C for 10 min. The primers were designed for the deleted in azoospermia-like (DAZL; GenBank accession no. NM_001351) and protamine 1 (PRM1; GenBank accession no. NM_174156.2) genes (Supplemental Table 1, and all other supplemental data, is available at www.biolreprod.org) and encompassed one intron; therefore, they distinguished cDNA from the genomic sequence.
Suppression-Subtractive Hybridization
A total of 140 ng of spermatozoal RNA from both high and low pools was first amplified with the BD Super SMART cDNA Synthesis Kit (BD Biosciences) according to the manufacturer's instructions. The two pools of the amplified cDNA, hereafter called FN (high-fertile and nonsubtracted) and LN (low-fertile nonsubtracted), were then used to enrich the differential representation of high- and low-fertile cDNAs using the forward and reverse SSH procedures, respectively, with the PCR-Select cDNA Subtraction Kit (BD Biosciences) according to the manufacturer's instructions. The cDNA pool prepared from the five high-fertile bulls (FN) was used as the tester following the SSH procedures and was subtracted with the "driver," where the driver was the cDNA pool prepared from the five low-fertile bulls (LN), generating the FS (fertile subtracted) cDNA pool. A fraction of the FS PCR products was cloned and made up the NRR-SSHF library, whereas the remaining PCR product was kept for the macroarray experiments and real-time PCR validation. The reverse SSH procedure was carried out using the LN cDNA pool as the tester, and the subtraction was performed using the FN cDNA pool as the driver, resulting in the LS PCR products (low-fertile subtracted). A fraction of the LS was cloned and made up the library called the NRR-SSHL. The PCR products were cloned into PCRII vector (Invitrogen) and then transformed into Max Efficiency DH5α competent cells (Invitrogen). For the respective libraries, the 1920 clones were selected as follows: A bacterial colony was collected and resuspended in 1 ml of LB broth that included 100 mg/ml of ampicillin and was grown with agitation (225 rpm) for 14 h at 37°C. The PCR amplifications were carried out using 2 µl of bacterial suspension diluted in 20 µl of MilliQ water (Ambion) with the Taq DNA Polymerase (UBI or Qiagen) according to the respective manufacturers' instructions. The remaining bacterial suspension was kept in 30% glycerol at –80°C. To identify clones containing single inserts, 10 µl from each PCR reaction were electrophoresed on a 2% agarose gel.
Macroarray Analysis and Sequence Identification
The PCR products of the selected clones were analyzed by PCR amplification and visualized on agarose gels. Next, these PCR amplified inserts (ranging from 300 to 700 nucleotides) were alkaline denatured and spotted onto 10 nylon Hybond-N membranes (Amersham Biosciences) using the 96-pin VP402 replicator (V&P Scientific). Empty cloning vectors were also spotted as negative controls. The spotted DNAs were fixed on the nylon filter by ultraviolet cross-link and analyzed for differential gene representation using the PCR-Select Differential Screening Kit (BD Biosciences). Hybridizations were carried out in duplicate using FS and LS as the [α-33P]dCTP radiolabeled probes. After incubation, the filters were washed twice with 0.2x SSC (1x SSC: 0.15 M sodium chloride and 0.015 M sodium citrate) and 0.1% SDS at 65°C and then exposed to a phosphor screen (Amersham Biosciences) for detection. To evaluate the quality of the subtraction, another set of filters was also hybridized with radiolabeled probes generated from FN and LN (nonsubtracted) amplified cDNAs. Hybridized membranes were scanned using the Storm data acquisition system (Amersham Biosciences). Quantification was performed using the ImageQuant TL image-analysis software (Amersham Biosciences), and the data were transferred to a spreadsheet program for ratio analysis. Signal intensities were averaged from the duplicate hybridizations. The differential expression of each clone was estimated by calculating the ratio of the signal with the homologous subtracted probe to that of the signal with the heterologous subtracted probe. The clones for which the ratio was >5, a condition more stringent than in the company recommendations (i.e., >3), were sequenced. The raw sequences obtained from the sequencing apparatus were screened for vector contamination, and trimmed sequences were incorporated into the MS-SQL database. Only trimmed sequences were used for similarity searches. The putative identity of the sequences was determined using a DNA (BLASTN) and protein (BLASTX) sequence search and a matching program developed at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov). Version 2.0.6 of a standalone program called BLASTALL (NCBI) [22], hereafter referred to as BLAST, was used to compare all the DNA sequences on a local server. To build the local database, the following databases were downloaded (13 July 2006) from the NCBI: NR, NT, SwissProt, Bovine Genome version 3.1, Mammals, Ref Sequences RNA, and ESTdb.
Default settings were used for all BLAST parameters except for the BLAST cut-off E-value, which was set at 1 x 10⁻³. The results were updated in August 2007 by removing a set of sequences that had been withdrawn from submission at the NCBI and by adding sequences with high similarities. Sequences were also compared using BLAST against the sequenced clones to evaluate the redundancy of database content. Default settings were used for all BLAST parameters except for the BLAST cut-off E-value, which was set at 1 x 10⁻¹⁰.
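The clone-selection rule used in the macroarray analysis (signals averaged over duplicate hybridizations, then the homologous-to-heterologous probe ratio compared with a cutoff of 5) can be sketched in Python as follows. This is an illustration only: the original ratio analysis was done in a spreadsheet program, and the clone names, signal values, and function names below are hypothetical. For a clone from the forward library, the homologous probe would be FS and the heterologous probe LS; for a clone from the reverse library, the roles are swapped.

```python
from statistics import mean

RATIO_CUTOFF = 5.0  # cutoff stated in the text; the kit recommendation is >3


def differential_ratio(homologous, heterologous):
    """Ratio of the mean homologous-probe signal to the mean
    heterologous-probe signal across duplicate hybridizations."""
    het = mean(heterologous)
    if het <= 0:
        raise ValueError("heterologous signal must be positive")
    return mean(homologous) / het


def select_clones(signals, cutoff=RATIO_CUTOFF):
    """Return the clones whose ratio exceeds the cutoff.

    `signals` maps a clone name to (homologous duplicates, heterologous duplicates).
    """
    return [
        clone
        for clone, (hom, het) in signals.items()
        if differential_ratio(hom, het) > cutoff
    ]


# Hypothetical background-corrected intensities (arbitrary units).
example = {
    "clone_A": ([1200.0, 1000.0], [100.0, 120.0]),  # ratio 10, retained
    "clone_B": ([400.0, 440.0], [100.0, 110.0]),    # ratio 4, rejected
}
print(select_clones(example))
```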
RT-PCR of Selected Genes of Interest on the SSH cDNA Pools
Selected clones obtained following the SSH treatment and selected by macroarray analysis in either library were used to validate their differential expression using sequence-specific primers (Supplemental Table 2). Aliquots of the cDNA samples prepared using the BD Super SMART cDNA Synthesis kit, as described above, were tested. The PCR was performed as follows on cDNA samples collected before and after the subtraction steps: 95°C for 5 min; 30 cycles of 95°C for 30 sec, 55°C for 30 sec, and 72°C for 1 min; and then 72°C for 7 min.
To confirm the presence of the selected transcripts in the sperm population, RT-PCR was performed using amplified spermatozoal RNA (aRNA) and testicular RNA. Fresh ejaculated bovine semen collected with an artificial vagina was obtained from the Centre d'Insémination Artificielle du Québec. The extracts were prepared from either fresh ejaculate or cryopreserved semen doses following a discontinuous Percoll gradient centrifugation and TRIzol RNA extraction (see above). Selected clones were tested on several tissues, including heart, liver, gut, muscle, ovary, spleen, kidney, and testis, obtained from a slaughterhouse (Abattoir Giroux). Aliquots of 1 g of tissues were conserved in RNAlater (Ambion) during transfer to the laboratory and at –20°C until the RNA extraction (TRIzol). Whereas three rounds of amplification of spermatozoal RNA were performed with a SMART-T7 mRNA Amplification Kit (BD Biosciences), according to the manufacturer's instructions, total RNA samples from the other tissues were used without mRNA amplification. An aliquot of 2 µg of total RNA for each tissue was treated with DNase I and used for RT assay (Superscript II system; Invitrogen). The aRNA was reverse transcribed with Superscript II reverse transcriptase using an oligo(dT)12–18 as primer (Ambion). A total of 100 ng (2-µl aliquot of the RT sample) was used for end-point PCR, whereas a total of 10 ng equivalent of aRNA (2-µl aliquot of a one-tenth diluted RT) was quantified by real-time RT-PCR (see next section), using the same primers and PCR conditions as described above.
To confirm the absence of somatic and other nonspermatic cells (e.g., epithelial cells) in the RNA preparation of each semen preparation, PCR reactions were performed on each cDNA sample. One microgram of purified mRNA amplified with the SMART-T7 technology was used in the RT assays. The reactions were performed using the Superscript II in the presence of oligo(dT)12–18 according to the supplier's instructions. The common leukocyte antigen (CD45) and epithelial E-cadherin (CDH1) gene markers (Supplemental Table 1) were used in the PCR reaction containing 0.2 µM of the forward and reverse primers and Taq DNA Polymerase (Bishop Canada). The PCR was performed with the program described above.
DNA Standard Curves for Quantitative Two-Step RT-PCR
Query genes (selected clones), normalization marker for RT efficiency (EGFP), and putative housekeeping genes (MYC and PRM2) were amplified using primers specific to the desired application (end-point or quantitative two-step RT-PCR [qRT-PCR]) (Supplemental Table 2). Primers were designed using PrimerExpress 3.0 software (Applied Biosystems) on a consensus sequence generated from the multiple sequencings (data not shown). Standard curves for all genes were composed of five points covering 10 log orders of magnitude, which corresponds to a detection range of 20 cycle threshold (Ct) value. They were prepared using a serial dilution of a purified PCR product, obtained from the amplification of the respective clone of the cDNA library or bovine testis cDNA (e.g., MYC and PRM2). Slopes of the standard curves were between –3.3 and –3.8, reflecting amplification kinetics ranging from 80% to 100%. The means of the slopes of each calibration curve within each plate (triplicate) and across the three assays (three plates) were less than 1.8%. Linear regression values for the respective calibration curves were greater than 0.99. With the exception of the MYC gene (annealing at 59°C), end-point PCR reactions were performed as follows: initial denaturation at 94°C for 2 min, followed by 35 cycles of denaturation at 94°C for 30 sec, annealing at 65°C for 30 sec, and extension at 72°C for 45 sec, with a final extension step at 72°C for 2 min. The single-band amplicons were visualized on a 1.5% agarose gel stained with ethidium bromide and purified using the QIAquick PCR purification kit (Qiagen). The identity of the amplicon was confirmed by sequencing, performed using BigDye Terminator Chemistry (version 3.1; Applied Biosystems) and a model 9700 thermal cycler (Applied Biosystems).
The PCR program for all sequencing reactions included initial denaturation at 96°C for 1 min, followed by 25 cycles of denaturation at 96°C for 10 sec, primer annealing at 50°C for 5 sec, and extension at 60°C for 4 min. The sequencing products were purified by ethanol/EDTA precipitation, resuspended in a formamide solution (Applied Biosystems), and analyzed with the ABI 3100-Avant capillary sequencer (Applied Biosystems). Gene homology was confirmed via the BLAST network service of the NCBI using the GenBank database (http://www.ncbi.nlm.nih.gov/). For each reference gene, the products of the PCR reactions were cleaned with a PCR purification kit (Qiagen). Serial dilutions of the purified amplicons were made in nuclease-free water (Ambion), aliquoted, and used once to avoid freeze-thaw and to ensure reproducible results.
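To show how the slope of a standard curve relates to amplification efficiency, the following Python sketch fits Ct against the base-10 log of the starting amount and applies the standard relationship E = 10^(–1/slope), reporting efficiency as (E – 1) x 100. The dilution series is hypothetical, and the function names are illustrative; they are not part of the instrument software used in this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def efficiency_percent(slope):
    """Percent efficiency from the slope: (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


def copies_from_ct(ct, slope, intercept):
    """Interpolate the starting copy number of a sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical five-point dilution series: log10(copies) and measured Ct.
log10_copies = [2.0, 3.0, 4.0, 5.0, 6.0]
ct_values = [33.2, 29.8, 26.4, 23.0, 19.6]

slope, intercept = fit_line(log10_copies, ct_values)
print(f"slope = {slope:.2f}, efficiency = {efficiency_percent(slope):.1f}%")
print(f"copies at Ct 26.4 = {copies_from_ct(26.4, slope, intercept):.0f}")
```

With this relationship, a slope of about –3.3 corresponds to an efficiency close to 100%, and a slope of –3.8 to roughly 83%, which matches the range reported for the standard curves.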
Real-Time PCR using cDNA (RT) and PCR (FS, FN, LS, and LN) Products
To quantify the PCR products (FS, FN, LS, and LN), appropriate dilutions were performed (1:1000 for testing mitochondrial candidates: L_05_B10, L_05_C06, and F_04_B07; 1:20 for testing other candidates). Quantitative PCR (qPCR) assays were performed on RT prepared with aRNA analyzed on Bioanalyzer 2100 (Agilent Technologies). The RT reactions contained 1 pg of EGFP mRNA per sample as described previously [23]. One master mix spiked with EGFP mRNA was evenly distributed throughout the samples containing an equivalent of 1 µg of spermatozoal aRNA per RT reaction of 20 µl. Two RT reactions per individual were performed using the Superscript II system and then pooled. The RT samples were diluted 1:10 in nuclease-free water or as otherwise specified and aliquoted; they were used once to avoid a freeze-thaw effect on cDNA quality and qPCR performance. All results in the present study were produced using aliquots prepared from the same RT (pool) for each bull. The RT lot and storage cannot account for the differences in gene detection.
Quantitative RT-PCR was performed in triplicate using an Applied Biosystems 7500 Fast Real-Time PCR System. For quantification of EGFP, PRM2, MYC, F_03_B08, F_05_B09, F_05_B12, and L_04_G12 genes, hydrolysis probe technology (i.e., TaqMan probes) (Supplemental Table 2) was used as the detection system. The PCR mixture contained 5 µl of 2x TaqMan Universal Master Mix (Applied Biosystems), 0.3 µM of forward and reverse primers, 0.2 µM of probe, and 2 µl of diluted cDNA. All primers and probes used for qRT-PCR assays were purchased from Applied Biosystems. Nuclease-free water was added to give a final volume of 10 µl/well, and SYBR Green chemistry was used to quantify the other genes. All genes tested for each bull were quantified using aliquots prepared from the same RT sample (two pooled RTs) to avoid variations attributable to differences between RT assays. The absence of primer-dimers for both EGFP and candidate genes was validated using the SYBR Green I technique [24]. The PCR reactions contained 5 µl of 2x Power SYBR Green Master Mix (Applied Biosystems), 0.3 µM of forward and reverse primers, and 0.25 U of AmpErase UNG (Applied Biosystems). All runs (i.e., plates) included a triplicate of the standard curve and from three to six negative controls (absence of target DNA) that were partitioned at the beginning and end of the plate to detect potential contamination during preparation of the plate. Thermal cycling conditions were as follows: 50°C for 2 min (incubation for the AmpErase UNG) and then the first denaturation step at 95°C for 10 min, followed by 50 cycles of 95°C for 15 sec and 60°C for 1 min. Melting-curve analysis was performed for SYBR Green amplifications by plotting fluorescence intensity in a graphic model. To validate the specificity of the qRT-PCR assays, a single melting temperature peak representing a single amplicon (SYBR Green amplifications) was confirmed. 
Both products (single band) of the SYBR Green and the TaqMan assays were detected on a 2.5% agarose gel stained with SYBR gold nucleic acid stain (Invitrogen) and sequence confirmed (data not shown).
Quantitative RT-PCR data were analyzed using the threshold setup recommended by Applied Biosystems (Absolute Quant Standard Curve; Applied Biosystems 7900HT Fast Real-Time PCR System; available as a pdf from Applied Biosystems, www.appliedbiosystems.com, "Support, General Literature" [cms_042176.pdf]) before being transferred to a Microsoft Excel spreadsheet. The slope of the calibration curve was calculated from the plot of the base-10 log of the initial target amount versus the corresponding Ct. The PCR efficiency (E = 10^(–1/slope)) was determined from the slope. All the PCR assays showed 80–100% efficiency. Basic analyses (mean Ct and SEM) and preliminary statistical analyses were performed using the Analysis ToolPak in Excel, whereas the significance of the number of gene copies (qRT-PCR measurements) among groups and the ANOVA were evaluated using Statistical Analysis System (Release 9.1, 2002; SAS Institute). Data recorded as the number of copies were normalized using a log transformation. Student t-tests were performed to compare mean values of LOW versus HIGH using the Satterthwaite correction for unequal variance when applicable. Mean values, reconverted to number of copies, are presented.
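The comparison of LOW versus HIGH groups can be illustrated with a short Python sketch of an unequal-variance t statistic with Satterthwaite degrees of freedom, computed on log-transformed copy numbers. The original analysis was run in SAS; the copy numbers below are hypothetical, the base of the log transformation (base 10 here) is an assumption, and the P value is omitted because it requires a t-distribution function outside the Python standard library.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Unequal-variance t statistic and Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def back_transformed_mean(log_values):
    """Mean of log10 values reconverted to number of copies."""
    return 10 ** mean(log_values)


# Hypothetical gene copy numbers per bull.
low_copies = [1200.0, 950.0, 2100.0, 1600.0]
high_copies = [400.0, 520.0, 310.0, 450.0]

log_low = [math.log10(c) for c in low_copies]
log_high = [math.log10(c) for c in high_copies]

t_stat, df = welch_t(log_low, log_high)
print(f"t = {t_stat:.2f}, Satterthwaite df = {df:.2f}")
print(f"LOW mean = {back_transformed_mean(log_low):.0f} copies")
print(f"HIGH mean = {back_transformed_mean(log_high):.0f} copies")
```

Note that a mean reconverted from the log scale is a geometric mean, so it will generally be lower than the arithmetic mean of the raw copy numbers.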
The aim of the present study was to identify transcripts differentially expressed in bovine spermatozoa from bulls of known fertility. We also wanted to reveal weakly expressed but potentially important transcripts using SSH. This powerful normalization method was used to identify novel genes by enriching transcripts that were differentially expressed [25] in the two populations of bulls: high fertile (high-NRR index, 71.6 ± 0.5 [mean ± SD]), and low fertile (low-NRR index, 63.4 ± 2.7 [mean ± SD]). Semen samples were treated as described in Materials and Methods. An average of 25 ng of total RNA was recovered from each two-straw set (10⁶ cells/straw were recovered following Percoll gradient purification) for the respective bulls (data not shown). Six extractions totaling 12 straws were performed for each bull. A fraction of each RNA extraction equivalent to 5 ng (1/5) was tested by multiplex PCR (see Materials and Methods). The absence of genomic DNA was confirmed for each RNA set before pooling. The sensitivity of multiplex PCR allowed detection of as little as 1 ng of genomic DNA, given the detection of the 550- and 234-bp amplicons corresponding to the genomic fragment of the DAZL and PRM1 genes, respectively (Fig. 1).
To ensure that the spermatozoal fractions were free of hematopoietic and epithelial cells, different validation tests were performed on the sperm preparation: microscopic examination, flow cytometry, and molecular validations (see Materials and Methods). To test the Percoll gradient tightness (i.e., the impermeability of the different density fractions to somatic cells), buffy-coat cells were added to the semen preparation as described in Materials and Methods. The semen and buffy-coat cell mixture was deposited on top of the gradient before centrifugation. Three types of samples were analyzed by flow cytometry: pellet-containing spermatozoal fractions obtained after Percoll gradient centrifugation from sperm samples (two straws), either alone or with the addition of approximately 5 x 10⁶ leukocytes, and a leukocyte sample as control. The semen of two different bulls was tested in duplicate, and the experiment was reproduced once. Immunofluorescence was observed only for samples that were not subjected to gradient centrifugation. The purity of the spermatozoal preparations was verified both by flow cytometry and by visual examination (data not shown). Visual examination (both trypan blue and eosin-nigrosin exclusion) of the interface fraction of Percoll gradients (40%–70%) revealed not only the presence of dead or nonmotile spermatozoa but also the location of the leukocytes, which are excluded from the dense, 70% fraction of the Percoll gradient (data not shown). Furthermore, as described in Materials and Methods, all the RNA extractions were tested by RT-PCR for both the CDH1 (E-cadherin) and CD45 (tyrosine phosphatase, a common antigen of leukocytes) genes, which are markers of epithelial and leukocyte cells, respectively [5]. The absence of contaminants in the purified spermatozoal fraction was confirmed by end-point RT-PCR and qRT-PCR (Supplemental Figure 1).
Sequences from Nonsubtracted Pools and Comparison with Subtractive Libraries
As recommended in the SSH protocol, the subtraction procedure was first assessed by agarose gel electrophoresis. The PCR products generated from the SSH procedure presented a different profile from the original cDNA sample. In the forward subtraction experiment, transcripts preferentially expressed in the cDNA sample prepared from the high-fertile bulls (Fig. 2, lane FS) presented a profile on gel electrophoresis different from that of the corresponding nonsubtracted cDNA sample (Fig. 2, lane FN). Similar results were obtained with the cDNA sample from the low-fertile bulls (Fig. 2, lane LS vs. lane LN) and with the control sample provided with the kit (Fig. 2, lane CS vs. lane CN). After cloning, the range of the insert size of randomly picked clones was 300–600 bp (data not shown), which is consistent with the average size generated by the RsaI restriction digestion step in the SSH protocol. The clones were selected for further analysis based on the macroarray results (Fig. 3, see Materials and Methods).
To verify the differential expression pattern in the subtracted population, 100 clones were randomly selected from both FN and LN nonsubtracted cDNA samples. These clones were sequenced and subjected to a BLAST homology search. The results for the FN and LN identities are presented in Figure 4, C and D, respectively.
For the FN sample prepared using spermatozoa from bulls with a high-NRR index (Fig. 4C), half the randomly selected clones showed homology to contigs of the Bovine Genome. Within the Unknown category (16%), 11 of these polyadenylated cDNAs could not be further identified by a BLAST search against public databases. This suggests that spermatozoa contain unique and not-yet-characterized transcripts. A large portion of cDNA clones (21%) was represented in the mitochondrial-related section, therefore suggesting a mitochondrial origin.
The analysis of the clones from the LN pool presented a somewhat different picture (Fig. 4D). In this case, sequences of mitochondrial origin represented 51% of the clones. Sequences of the other major group (19%) fell into the Unknown category. One unusual feature was that most of the unknown sequences showed homology to pine sequences, which we will discuss further, because they were well represented in the low-NRR library.
The general comparison based on gene identity (NCBI BLAST) (Fig. 4, C and D) of the sequences obtained from both original cDNA pools (i.e., FN and LN pools) revealed some similarity. Specifically, seven sequences of the nonsubtracted samples made up 30% and 29% of the FN and LN cDNA samples, respectively. This similar redundancy was not observed when the results for the two NRR-SSHF and NRR-SSHL libraries were compared. Only four GenInfo Identifiers (GIs; a GI is the identification number of a nucleotide sequence in the NCBI database collection) were found in both the NRR-SSHF and NRR-SSHL libraries, representing 2.7% and 5%, respectively, of the sequence identity (data not shown). In general, these results point to the advantage of subjecting cDNA pools to subtraction before gene identification.
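The library comparison described above amounts to intersecting two lists of sequence identifiers and expressing the shared entries as a fraction of each library. The following is a minimal sketch of that bookkeeping, not the procedure actually used in the study; the function name and the identifiers are hypothetical placeholders.

```python
def shared_fraction(library_a, library_b):
    """Return identifiers present in both libraries and the fraction of
    clones in each library that carry one of those shared identifiers.

    Each library is a list with one identifier per sequenced clone, so
    redundant sequences appear several times.
    """
    shared = set(library_a) & set(library_b)
    frac_a = sum(1 for gi in library_a if gi in shared) / len(library_a)
    frac_b = sum(1 for gi in library_b if gi in shared) / len(library_b)
    return shared, frac_a, frac_b


# Hypothetical identifiers (placeholders, not real GI numbers)
sshf = ["gi_1", "gi_1", "gi_2", "gi_3", "gi_4"]
sshl = ["gi_4", "gi_5", "gi_5", "gi_6"]

shared, frac_f, frac_l = shared_fraction(sshf, sshl)
print(sorted(shared), frac_f, frac_l)
```

Counting clones rather than unique identifiers keeps the redundancy of each library in the denominator, which is how percentages of a library are reported in the text.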
High-Fertile Subtracted Library: NRR-SSHF
The NRR-SSHF library represents a group of candidate transcripts obtained through the SSH procedure. This treatment was used to enrich the FN cDNA pool prepared with spermatozoa from five high-fertile sons of an elite bull-sire with mostly overrepresented transcripts relative to the five sons presenting the lowest fertility index. This NRR-SSHF library included 1774 uncharacterized and randomly picked subtracted clones. The analysis of the difference in transcript levels, carried out by the differential screening method (i.e., macroarray [see Materials and Methods]) using probes prepared with the respective cDNA pools, allowed us to confirm which transcripts were overexpressed among these clones. A typical example of macroarray analysis is presented in Figure 3. These results represent one of the two hybridization replicates, which comprised one set of four membranes spotted with a subset of 672 clones from the NRR-SSHF library and hybridized with the respective probes: FS (Fig. 3A), LS (Fig. 3B), FN (Fig. 3C), and LN (Fig. 3D). Each membrane was hybridized once, and the corresponding duplicate of both sets exhibited a similar expression pattern (data not shown). Analysis of the hybridization results suggested a difference in the population of cDNAs that constitute the tester (FN) (Fig. 3C) and the driver (LN) (Fig. 3D), the cDNA pools of sperm transcripts prepared from the high- and low-fertile bulls, respectively. This divergence was found to be greater following the SSH procedure: The majority of NRR-SSHF clones that hybridized with the FS probe (Fig. 3A) did not hybridize with the LS probe (Fig. 3B). For each corresponding clone spotted on two membranes, a ratio expressing the difference in the intensity obtained with the FS/LS probes was calculated. This ratio was then used to select the clones to be sequenced and identified by a BLAST homology search. 
A total of 519 clones were selected for sequencing based on the cut-off ratio arbitrarily set at five (see Materials and Methods). A large number of these clones (n = 118) presented an on/off signal, producing a hybridization signal with the FS but not with LS, and vice versa.
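The selection rule for the forward library can be illustrated with a short sketch. It assumes background-corrected spot intensities for the FS and LS probes; the function name, the `background` parameter, and the example values are hypothetical, and only the cut-off of five comes from the text.

```python
def select_forward_clones(signals, cutoff=5.0, background=0.0):
    """Select clones from FS/LS spot intensities.

    signals: dict mapping clone id -> (fs_intensity, ls_intensity).
    A clone is kept when FS/LS >= cutoff, or labeled "on/off" when it
    gives a signal with FS but none above background with LS.
    """
    selected = {}
    for clone_id, (fs, ls) in signals.items():
        if fs <= background:
            continue  # no tester signal: not a forward-library candidate
        if ls <= background:
            selected[clone_id] = "on/off"  # ratio undefined, FS only
        elif fs / ls >= cutoff:
            selected[clone_id] = "ratio"
    return selected


# Hypothetical intensities (arbitrary units)
signals = {
    "clone_A": (1200.0, 100.0),  # ratio 12
    "clone_B": (800.0, 0.0),     # FS only
    "clone_C": (300.0, 200.0),   # ratio 1.5
    "clone_D": (0.0, 500.0),     # LS only
}
picked = select_forward_clones(signals)
print(picked)
```

Handling the on/off case separately avoids a division by zero and keeps those clones identifiable. The reverse library would use the same rule with the two probes swapped.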
Gene identification of the selected clones was carried out by conducting a homology search of the sequence against public databases using the BLAST algorithm (see Materials and Methods). The best hit for each clone was manually chosen through analysis of the BLAST results for each database interrogated. To be the best hit, a BLAST result had to have the highest score with the lowest E-value that covered the longer sequence length of the insert of the clone. In the absence of mammalian orthologous assignment, these uncharacterized sequences were linked to either the Bovine Genome (genomic only) or the Unknown category. All transcripts retained homology to a bovine contig. Sequences grouped in the Unknown category were linked to this group, however, because the best hit retained no highly similar mammalian sequence.
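Although the best hit was chosen manually in this study, the stated criteria (highest score, lowest E-value, longest coverage of the insert) can be expressed as a sort key. The sketch below is one possible automated reading of those criteria. The order in which ties are broken is an assumption, and the `Hit` record and example values are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Hit(NamedTuple):
    subject: str      # database entry matched by the clone
    bit_score: float  # BLAST score
    evalue: float     # BLAST expectation value
    align_len: int    # length of the insert covered by the alignment


def best_hit(hits: Sequence[Hit]) -> Optional[Hit]:
    """Pick the hit with the highest score, then the lowest E-value,
    then the longest aligned length (assumed tie-break order)."""
    if not hits:
        return None
    return min(hits, key=lambda h: (-h.bit_score, h.evalue, -h.align_len))


# Hypothetical BLAST results for one clone
hits = [
    Hit("contig_X", 250.0, 1e-60, 310),
    Hit("contig_Y", 250.0, 1e-60, 420),
    Hit("est_Z", 90.0, 1e-15, 500),
]
chosen = best_hit(hits)
print(chosen.subject)
```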
Figure 4A shows the distribution of the NRR-SSHF clones identified as differentially expressed through macroarray analysis and categorized based on the BLAST results. Table 1 shows a selection of BLAST results. Four main categories were identified: Mitochondrial, Ribosomal, Bovine Genome, and Unknown. The most redundant sequences fall into three categories: Unknown, Mitochondrial, and Ribosomal. The other categories represented in Figure 4A correspond to Bovine Genome, Metabolism, Mouse Genome, Chaperone, Electron Transport, Oxidative Stress, Protein Biosynthesis, and Signal Transduction. These sequences represent 10% of the clones identified. A total of 116 clones shared homology with mitochondrial sequences, which made up 9% of the NRR-SSHF library. The Bovine Genome category encompassed 54 clones (18%). An interesting feature of this category is that most of these clones were unique in the library. In fact, for most of the clones, the only reliable identity was attributed to a contig number (whole-genome shotgun sequence) derived from the Bovine Genome assembly. The Unknown category was represented by 144 clones (48%) with the prevalence of one sequence. These clones are homologous to cDNA sequences mostly associated with plants (K. Sato, personal communication). The closest mammalian hit was within the trace files of the NCBI depository (GI: AAFC03046011). A further BLAST homology search of these sequences, however, did not result in identification.
Low-Fertile Subtracted Library: NRR-SSHL
The NRR-SSHL library represented differentially expressed transcripts in spermatozoa from five sons of the same Canadian elite bull that exhibited the lowest fertility potential. This library consisted of 1248 clones. Of these, 459 clones were selected for sequencing based on macroarray analysis, given the 5-fold gene expression level threshold (see Materials and Methods). Single-pass sequencing and identification by a homology search against public databases yielded 315 clones with high scores; their distribution is presented in Figure 4B. Many of the clones from this NRR-SSHL library showed homology to sequences from a cDNA library constructed with male strobilus RNA prepared from pine trees. A search of the protein sequence collection of the NCBI database showed that these pine sequences received the highest score for a putative human protein predicted bioinformatically. Further investigation revealed that this Unknown gene shows some homology to a putative human-secreted phosphoprotein, ALLW1950 (GenBank accession no. AAQ89160.1). Clones that are homologous to this protein represent 37% of the total identifications within the NRR-SSHL library. In the 100 sequenced clones from each of the corresponding nonsubtracted libraries, this sequence was absent from the FN pool; however, it showed similarity with 14 of the 100 clones picked from the LN cDNA (data not shown). This indicates that the SSH procedure is efficient for the enrichment of differentially represented transcripts. Quantitative results related to the influence of some individuals and their gene abundance on the pool will be discussed below. Some categories that were associated with clones in the NRR-SSHF library are still represented. For example, sequences with strong homology to a mitochondrial sequence (7%) (Fig. 4B) were also identified in the NRR-SSHF library (9%) (Fig. 4A). Table 2 shows a selection of BLAST results.
In brief, the findings from the identity homology search suggest that the two SSH libraries are made up of different transcripts.
Validation of High-Fertile and Low-Fertile Subtracted Clones
We also assessed the level of specificity of each library by checking whether clones identified in one library were also present in the reverse-SSH library. A panel of clones was selected and tested by PCR using gene-specific primers. Supplemental Table 2 presents a set of designed primers. The four cDNA pools representing the subtracted and nonsubtracted cDNA prepared from high-fertile bulls (FS and FN) and the subtracted and nonsubtracted cDNA prepared from low-fertile bulls (LS and LN) were used to determine whether the presence of these sequences confirmed the differential expression analysis. End-point PCR detection using standard amplification PCR reaction (i.e., Advantage Polymerase Mixes or Qiagen Taq polymerase systems) did not always provide a relevant picture of the subtracting effect of the SSH technology unless a robust PCR quantification was performed (data not shown). Because some transcripts were present in the low-abundance group and qPCR was more robust and sensitive, a discrepancy existed between the qPCR and the standard PCR results in some instances. This aspect is discussed below. Using the appropriate primer design (Supplemental Table 2), qPCR detection of the selected clones was assayed for the FS, FN, LS, and LN pools. All clones tested gave a quantitative profile that was in accordance with the libraries in which they were identified (data not shown). A subset is presented in Figure 5 showing both qPCR measurement (Ct value; i.e., number of cycles required for significant detection) and PCR products of the qPCR assays. The standard curves were omitted in these assays, and no absolute (number of gene copy) quantification could be obtained. A precise but relative quantification, however, could be obtained from the qPCR assays. To obtain three positive qPCR detections of very-low-abundance genes, up to six replicates were required. Because of the selected threshold, some samples were detected as "absent."
Nonetheless, we were able to obtain a triplicate with significant detection given the low SEM (Fig. 5). Whereas the appreciation of gene abundance by end-point PCR did not always match the qPCR results, the qPCR measurement was correlated with the macroarray analysis (data not shown). For the NRR-SSHF library, two distinct patterns were obtained, differentiated by the presence of products in the FS (subtracted) and FN (nonsubtracted) samples. In the first case, qPCR-detected products were limited to the FS sample. This category was represented by the pattern observed for four clones: F_03_B04, F_04_C06, F_04_F12, and F_04_D08 (see clone F_03_B04 in Fig. 5). The effect of subtraction could also be visualized in the second pattern, in which end-point PCR products were present in all cDNA samples but qPCR results confirmed that FS > LS (clones F_03_B08 and F_04_B07 are representative of this category) or that LS > FS (e.g., clones L_05_B10 and L_05_C06). A third pattern was observed based on presence/absence as detected in one or the other subtracted libraries. Representative of the on/off pattern are clones F_05_B12, L_01_B12, and L_01_G08.
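When no standard curve is available, a relative comparison between two pools can be derived from the difference in mean Ct values. The sketch below illustrates that general principle under the assumption of perfect doubling per cycle (amplification efficiency of 2); it is not the calculation reported in the study, and the Ct values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean of replicate Ct values."""
    return stdev(values) / sqrt(len(values))


def relative_abundance(ct_sample, ct_reference, efficiency=2.0):
    """Fold abundance of the sample relative to the reference.

    A lower Ct means an earlier detection and therefore more template.
    efficiency=2.0 assumes the product doubles at every cycle.
    """
    return efficiency ** (mean(ct_reference) - mean(ct_sample))


# Hypothetical triplicate Ct values for one clone in two pools
fs_ct = [23.8, 24.0, 24.2]
ls_ct = [26.9, 27.0, 27.1]

fold = relative_abundance(fs_ct, ls_ct)  # about 2 ** 3 = 8
print(round(fold, 2), round(sem(fs_ct), 3))
```

A difference of three cycles corresponds to roughly an eightfold difference in template under this assumption. Real assays with lower efficiency would give a smaller fold change for the same Ct difference.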
To determine whether some candidate genes could be associated with the fertility index of a larger cohort, we used 40 bulls with different family origins, including 20 bulls in each of the high- and low-NRR index groups (see Materials and Methods). Quantitative measurements by qRT-PCR were performed on each bull, including the 10 bulls belonging to the Sire-Family used in constructing the libraries (see Materials and Methods). Results for 11 genes selected from both libraries are presented in Table 3, ordered by frequency (i.e., number of clones within the libraries with a sequence similarity as detected by BLAST). These included highly frequent clones and singletons. Also reported are the qRT-PCR results of two potential housekeeping genes, MYC and PRM2. It is customary to normalize qRT-PCR results using endogenous control genes to reduce individual variations. No internal calibrator was found, as shown by the results for the MYC (called "absent" in several qRT-PCR assays) and PRM2 (significant difference between high- and low-fertile bulls, P = 0.0019) genes. An additional calibrator was used for the qRT-PCR study to monitor the efficiency of the reverse transcription step, which is a potential source of variation. The exogenous EGFP transcripts were evenly distributed in all RT reactions (see Materials and Methods). The amount of EGFP cDNA was quantified in each RT sample as described previously [23]. The statistical analysis showed that although no difference was observed between high- and low-fertile groups for both the Sire-Family (P = 0.6416) and OTHER (P = 0.4454) groups, a significant difference was observed between the Sire-Family and OTHER groups (P = 0.0352). This difference was also confirmed (ANOVA; see Materials and Methods) for the candidate L_05_B10 clone, suggesting that a comparison could not be performed by including all bulls in the analysis, only by comparing fertility status within the same group of bulls (data not shown).
We observed that in some cases, the presence of DNA polymorphisms had impaired qRT-PCR assays. Specifically, DNA polymorphisms were found in PRM2, L_03_B08, and L_04_G12, explaining the inconsistencies observed between qRT-PCR results (the TaqMan probe was not specific when a mismatch occurred in the sequence) and the presence of an amplified fragment as detected by gel electrophoresis (data not shown). This situation is also reflected in the qRT-PCR results reported for the Large Chain R rRNA candidate. Whereas two designs were established for the same target gene, using the sequence information for two different fragments of the gene, L_07_B02 and L_05_B10, the qRT-PCR results were inconsistent for the Sire-Family (P = 0.5065 vs. P = 0.0041, respectively; Table 3). Further analysis showed that DNA polymorphisms could explain this variation (data not shown), a phenomenon discussed extensively in another study [26]. Given the finding that DNA polymorphisms are associated with oligozoospermia (i.e., a functional mutation) [11], transcript profiling using oligomers specific to polymorphic regions of genes of importance for fertility represents a very promising avenue of research.
The differential gene expression pattern observed in the four cDNA pools (FS, FN, LS, and LN) was confirmed for the F_03_B04 clone by qRT-PCR (Table 3), in which the absolute gene copies were evaluated using appropriate standard curves (see Materials and Methods). Quantitative RT-PCR detection of the F_03_B04 sequence on the RT sample of each bull aRNA permitted assessment of the subtracting efficiency of the PCR-Select protocol (i.e., SSH method). Although only one bull in each Sire-Family group gave detectable transcript amounts, the absolute quantity of gene copies in the bull spermatozoal aRNA was 5.8-fold more abundant in the high-fertile bull than in the low-fertile group (Table 3). The effect of subtraction as detected in the cDNA pools (FS > FN and FS > LS) did not give significant results when qRT-PCR was performed on each individual in the Sire-Family group, which was made up of the animals used to construct the libraries. The clone F_03_B08 is representative of this category. Although not significant within the Sire-Family, mainly because of the wide variation among individuals (data not shown), significant differences of the means were confirmed within the OTHER group made up of 40 unrelated bulls (P = 0.0127) (Table 3). We confirmed the on/off pattern observed in an analysis of the four cDNA pools for three clones: F_05_B12, L_01_B12, and L_01_G08 (Table 3). Although these on/off genes appeared to be interesting based on the assumption that their presence/absence could have an important impact on fertility, only L_01_G08 was significant (P = 0.0279).
One major observation can be made based on the quantitative measurements performed. The abundance of the transcripts (detected cDNA copies) was correlated with the number of clones found to have similar sequences within the corresponding library (e.g., see Frequency, Table 3). The gene sequence was not found in all bulls. For example, the L_01_B12 sequence was detected in only one low-fertile bull in the Sire-Family, whereas the sequence was absent in all high-fertile bulls. This might explain why this clone was detected by the SSH approach. No significant difference, however, was observed within the OTHER group (P = 0.5530). It is noteworthy that most of the significant genes were relatively abundant, were part of the NRR-SSHL library, and were mitochondrial, whereas low-abundance genes were rarely significant. This aspect of transcript abundance will be discussed below.
To determine whether the unknown genes found to be significant under field fertility criteria (NRR index) could be identified in other tissues and whether different expression patterns could be observed among candidates, we used both end-point PCR and qPCR on multiple tissues (heart, liver, gut, muscle, ovary, spleen, kidney, and testis [see Materials and Methods]). Gene abundance varied widely among the tissues, ranging from absence of detection (e.g., sequence F_03_B04 in liver) to a wide expression across all tissues examined. A major difference was observed for the sequence of the clone L_05_B10 (putative gene: Large Chain R, mitochondrial rRNA), which was the most abundant transcript of the panel (P < 0.05), as indicated by its lower Ct value. Both F_03_B04 and L_01_G08 (Unknown gene) were very weakly expressed.
All the transcripts isolated from the sons of an elite bull sire, which were identified through the SSH procedure and confirmed by macroarray analysis, were also present in spermatozoa from other bull samples. Collectively, the subtracted libraries represented differentially expressed transcripts in both pools of spermatozoa based on fertility potential.
It was recently shown that spermatozoa carry not only paternal genomic DNA to the oocyte but also a complex repertoire of RNA, ensuring continuity of the paternal genome [6, 27]. Although the function of this RNA is a matter of debate, the identification of RNA molecules in mature spermatozoa may help to provide a better understanding of spermatogenesis and fertilization. Transcriptome profiling is a technique that can be used to identify messengers that are potential markers of fertility on the basis of their presence/absence in mature spermatozoa. We produced two libraries using the PCR-Select protocol. This powerful normalization method permits the identification of novel genes by enriching transcripts that are differentially expressed [25]; it also permits the detection of low-abundance transcripts as an integral part of gene discovery programs. Although the microarray technique has proven accuracy in global gene screening, detection is limited to the genes represented on the array, which makes this approach not very useful for studying a different species or a specific tissue, such as spermatozoa, that is not yet well characterized. For example, the most recent bovine genome assembly (the third) was released last year, whereas the mouse genome information has 37-fold coverage (http://www.ncbi.nlm.nih.gov/genome/guide/mouse/). It becomes obvious that commercial DNA chips could limit the Bos taurus transcriptome study given the particularity of male germinal cells. A second and equally important aspect is sensitivity. The microarray approach is not highly sensitive for cross-species hybridization studies and, therefore, constrains the possibility of novel gene discovery. Evans et al. [28] conducted a study that explored the similarities and differences in genes discovered using SSH compared to the Affymetrix GeneChip microarray platform. Those authors observed that 70%–75% of the genes on the GeneChip were detectable and were also identified with SSH. 
From 20% to 25% of the genes identified through SSH were not present on the DNA chip, however, and 10% of these were considered to be novel (not present in public databases). The microarray approach was also found to be reliable only for detecting medium- to high-abundance transcripts; its performance was inconsistent for the detection of low-abundance transcripts. Overall, an appreciable number of transcripts identified using SSH methodology could not be identified using GeneChip microarrays alone. Cao et al. [29] also described a bias in sensitivity with microarray probing as compared to SSH. These studies demonstrate the importance of combining the use of SSH and microarray analysis, because they are complementary experimental approaches [5].
In the present study, we used the SSH approach to identify novel differentially expressed sequences, and we confirmed that this method can be effective for deriving the different profiles of nonsubtracted and subtracted genes (Fig. 4). We also found the technique to be very sensitive. For example, the clone F_05_B12, the frequency of which was very low (one clone) in the NRR-SSHF library, was also poorly represented in the aRNA of the high-fertile group, as detected by real-time PCR (Table 3). The SSH technique allowed us to register a gene that is only present as one copy per 10 ng equivalent of amplified mRNA, which explains why this low-abundance transcript made up less than 1% of the sequenced clones. This raises the question of whether this gene is present on a commercial DNA chip and, if so, whether it has been detected in spermatogenic cells. The F_05_B12 sequence, which is similar to the B. taurus LIM/homeobox gene (LHX2; GenBank accession no. XM_001251420), is also represented in the Mouse Genome 430 version 2.0 array (Affymetrix GeneChip 430vs2; identity no. 1418317_at) but is absent from the Murine Genome Array U74ABCv2 (GeneChip U74v2), the most widely used tool for the study of spermatogenesis (mouse germ cell reference samples [30] and unpublished results; Gene Expression Omnibus [GEO] accession no. GSE2736). The mouse ortholog has also been detected by microarray in other tissues (e.g., greater expression in ovary than in testis [31]; GEO accession no. GDS2223). We also found that some of the genes were not testis-specific but also detected in other tissues (Fig. 6).
The abundance of the gene and the divergence between species may influence the detection limit of an expressed gene. Indeed, because of cross-species divergence (e.g., seven mismatches detected between bovine and mouse for the LHX2; data not shown) combined with the extremely low copy number, detection may have been impaired by the use of cross-species microarrays. No bovine microarrays were available at the time we launched the present study. Therefore, the SSH approach was a good choice, and the results attest to its usefulness. We verified the similarity between the clones (both NRR-SSHF and NRR-SSHL combined) and the sequences present on the Affymetrix GeneChip 430v2 microarray. This repertoire of sequences was BLAST analyzed (local server [see Materials and Methods]), and 179 of the 813 sequenced clones gave "hit" calls, based on a threshold score of >50 and an E-value of <0.1. Because this threshold is not considered to be stringent, no positive signal could be expected on the array for the remaining 634 clones.
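The "hit" call described above is a simple filter on the score and E-value of each alignment against the array sequences. The sketch below shows that filter with the thresholds stated in the text (score >50, E-value <0.1); the function name and the example alignments are hypothetical, and the parsing of actual BLAST output is left out.

```python
def has_array_match(hits, min_score=50.0, max_evalue=0.1):
    """Return True when at least one alignment passes both thresholds.

    hits: list of (score, evalue) tuples for one clone.
    """
    return any(score > min_score and evalue < max_evalue
               for score, evalue in hits)


# Hypothetical alignments of four clones against array sequences
clone_hits = {
    "clone_1": [(120.0, 1e-25)],
    "clone_2": [(42.0, 0.5), (51.0, 0.05)],
    "clone_3": [(60.0, 2.0)],  # score passes, E-value does not
    "clone_4": [],             # no alignment reported
}

matched = [clone for clone, hits in clone_hits.items()
           if has_array_match(hits)]
unmatched = len(clone_hits) - len(matched)
print(matched, unmatched)
```

Applied to the full set of clones, the same counting gives the split reported in the text (179 clones with a hit and 634 without, out of 813).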
We used the SSH protocol to enrich the libraries with genes that are differentially expressed in either spermatozoa from bulls of high fertility or spermatozoa from bulls of low fertility, and vice-versa for the reverse-subtracted library. The transcriptome profile reflecting high- and low-NRR status in bull semen is therefore represented by the respective libraries. The fertile bulls used in the present study were all half-sib, minimizing the wide differences attributable to individuals. Because the transcription profile of fertile men with equal semen quality is highly similar [8, 21], the transcriptome divergence identified by the SSH technique and corresponding to a difference of 8.2 days in the NRR is notable.
The gene representation obtained after screening the two subtractive libraries suggests that substantial differences exist in the transcript representation of spermatozoa from bulls with different levels of fertility. A weak similarity was observed between the two subtractive libraries, with these similar clones corresponding to sequences of mitochondrial origin. For example, the mitochondrial-related category accounts for the most entries in the high-NRR subtracted library (9%), but it represents only 7% of the clones identified in the NRR-SSHL library. The role and importance of mitochondrial-associated metabolism in fertility has been debated over the last few years [32, 33]. Studies regarding the pathophysiology of human sperm function have indicated that oxidative stress generated by mitochondrial activity in human sperm may be a potential cause of reduced human male fertility [26, 34]. In addition, a correlation between sperm motility and mitochondrial activity has been reported in studies with equine [35], human [36], and bovine spermatozoa [37]. Nonetheless, a correlation with NRRs could not be established in bull spermatozoa [38], possibly because mtDNA is maternally inherited and selective pressure would not affect the male lineage [39]. On the other hand, the present findings, validated on a relatively small cohort of unrelated individuals, provide evidence that the mitochondrial transcript abundance is associated with field fertility (clones L_05_B10, L_05_C09, and F_04_B07) (Table 3). These genes are abundant, with up to 1 000 000 copies present in 10 ng equivalent. If the microarray approach had been used in the present study, only F_04_B07 would have been found to correlate with the NRR. It appears that of the three significant sequences, only F_04_B07 shows similarity to one of the references on the Affymetrix GeneChip 430vs2 (identity X00686_5_at), whereas none is present on the Affymetrix GeneChip U74vs2.
The SSH investigation allowed us to identify highly significant (P < 0.05) differences between the two bull populations with extreme field fertility indexes (based on NRRs). Transcript abundance was associated with the NRR not only within half-sibs but also within 40 genetically unrelated bulls. The SSH allowed us to identify potential candidates that can serve as markers of good/poor fertility at the molecular level.
Another difference between the two subtracted libraries relates to the heterogeneity of the transcripts identified. For the high-NRR subtracted library, 80% of the clones could not be associated with a known mammalian transcript/protein, only with the trace files of bovine genomic sequences in the NCBI repository (GI: AAFC03046011). Another intriguing clone with unknown identity was frequently found in the low-NRR library. This nonmammalian sequence showed similarity to sequences from the male strobili of pine trees. Because strobili are the male reproductive organs, this finding is both intriguing and interesting. Searches for similar proteins (BLASTX against public databases) were undertaken; most showed some homology to a putative human protein called ALLW1950. This protein has been identified in the human genome through a bioinformatics approach [40], but its function is not known. The gene is present on the Agilent-011521 Human 1A Microarray platform. The protein ALLW1950 has been found to be expressed in malignant human testicular germ cell tumor (GEO accession no. GDS1742) [41]. In other words, a sizeable number of clones, reflecting the abundance of the gene as detected by qRT-PCR (Fig. 5 and Table 3), show some similarity to this putative protein. Furthermore, qRT-PCR products were detected in spermatozoal RNA (Table 3) and in other tissues (Fig. 6). Our findings confirm the presence of the gene in bovine spermatozoal mRNA fractions (Table 3). The transcript is 100-fold more expressed in one of the five low-fertile bulls, thus increasing the variance within this group and decreasing statistical significance. This finding suggests that the transcript is more abundant in the spermatozoa of low-fertile bulls and could be related to the Sire-Family.
Whereas some clones in the high-NRR subtracted library show homology to proteins already identified in spermatozoa or spermatogenic cells (Table 1), those clones are absent from the low-NRR subtracted library (Table 2). The transcripts associated with sperm proteins could be remnants of spermatogenesis that were excluded during the final steps of that process. Such transcripts could be useful as markers of the success of spermatogenesis. For example, TCP1 (clone F_07_B10) is a member of the cytosolic chaperonin-containing TCP1 complex, which is involved in the organization of the cytoskeleton [42]. Members of this complex have been detected at high levels in developing sperm cells [43] and in mature bovine spermatozoa [44]. Glutathione peroxidases (GPX) are proteins that help to protect cells from oxidative damage caused by reactive oxygen species. These molecules are very important in spermatic functions as well as in embryo development [45], so it is not surprising that two proteins of this family have been detected in spermatozoa: GPX4 [46] and GPX5 [47]. The sequence of the F_03_F11 clone shares similarity with another GPX, GPX2, which is specific to the gastrointestinal tract, but it also shows similarity, albeit to a lesser degree, to the B. taurus GPX4 gene. Further investigation is required to determine whether this is an isoform product of the gene. The clone F_03_B08 presents similarities with the B. taurus X-inactivation center region (Table 1), specifically the repetitive region, which belongs to the LINE/L1 class/family, adjacent to the XIST gene. The XIST transcript does not encode a protein but, rather, is a functional transcript. This noncoding RNA is involved in dosage compensation of genes on the X chromosome. The XIST transcript has been detected by RT-PCR in mouse spermatogenic cells up to the pachytene stage [48]; to our knowledge, the present study is the first to detect this transcript in mature spermatozoa.
The function of the transcript remains elusive, however, because it has been shown that in mice, meiotic sex chromosome inactivation still occurs in germ cells of males when the XIST gene has been disrupted and that meiotic sex chromosome inactivation might be independent of XIST/TSIX transcripts in spermatogenic cells [49]. Because it has been suggested that mice embryos may inherit a preinactivated, paternal X chromosome [50], this RNA might be transmitted to the embryo by the father.
Most of the clones from the NRR-SSHL library showed homology to proteins not identified in mammalian spermatozoa. Some of these proteins, however, might have important roles in spermatogenesis/sperm function. This is particularly true for two unknown proteins: expansins, and male strobili (late-male-developmental). Expansins are proteins that play a key role in cell wall extensibility during plant growth and development [51]. They act as primary cell wall-loosening agents by weakening noncovalent bonds between cellulose microfibrils and cross-linking glycans [52], without disrupting the cell wall. It has been suggested that expansins could play a role in plant reproductive development [53]. For example, these proteins could play a role in mammalian sperm cells during epididymal maturation, when membranous and intracellular rearrangements permit the incorporation of molecules from the surrounding environment [54]. The male strobili developmental sequences infer embryo orientation. In angiosperm embryogenesis, a number of genes play roles in the control of meristematic activity and differentiation [55, 56]. Many of these are putative transcription factor genes that profoundly influence embryo development by inducing [57, 58] and/or repressing [59, 60] a potentially diverse complement of genes. The presence of this transcript in low-fertile bulls is intriguing and warrants further investigation. Because the spermatozoon is a cell depleted of most of its cytoplasmic content, such a protein could become important in its proper functioning, and as is the case with expansins, the malfunctioning of this protein might not be detected in current examinations of semen.
The most challenging aspect of the quantitative study (qRT-PCR) concerned transcript abundance. To further validate the candidates as potential gene markers, we first selected genes with no known identity, because they were more intriguing. These candidates, however, also corresponded to the low-frequency clones and low-abundance gene copies. Compared with the highly abundant mitochondrial clones, differences at the detection limit of the qRT-PCR method are particularly difficult to identify statistically. Because dilution of the RT is mandatory [23] and this affects the template as well, and because of the limited amount of RNA available, it was not possible to validate candidates with very low abundance. Indeed, the limited number of semen straws available (bulls with a low fertility index are no longer kept for the artificial insemination program, and limited semen samples are conserved for historical purposes) also restricted the amount of total RNA available. Although with RNA amplification technology the amount of RNA can be increased by 1000-fold, starting with 5–25 ng and reaching 1–10 µg of aRNA after three amplification rounds, it appears that this procedure also produces amplified transcripts with a slight reduction in average molecular mass, as observed using the BioAnalyzer apparatus (data not shown). The success of a validation study of this type depends on the amount of RNA used, the quality of the RNA, the efficiency of the reverse transcriptase step, the sensitivity of the real-time PCR design, the normalization procedure, and the presence of DNA polymorphisms. Three of these parameters posed challenges: the quality of the RNA, the normalization procedure, and the presence of DNA polymorphisms. Whereas the RNA preparations from the OTHER bulls were fresh, the RNA samples for the Sire-Family were 2 yr old, and less material was available.
In addition, although the same amplification procedure was performed in parallel, the differences between the Sire-Family and the OTHER group in terms of the efficiency of the RT step were significant (P = 0.0352). This difference cannot be attributed to the average molecular mass or to quantity, because accurate measurements were performed (i.e., using BioAnalyzer and NanoDrop apparatus [see Materials and Methods]). This difference also cannot be attributed to the presence of RT inhibitors, because RT samples were always diluted to prevent inhibition. We developed a robust qRT-PCR test using EGFP as an exogenous marker to monitor RT efficiency, and we demonstrated that dilution of the RT samples is mandatory for accurate qRT-PCR measurements [23]. Thus, RNA quality may be part of the explanation.
The second drawback is the normalization procedure, which is required to reduce experimental variation among individuals. Although much effort was devoted to finding a calibration marker (housekeeping-like), none of the tested genes was suitable based on evaluations using BestKeeper [61] (data not shown). The genes ACTB, GAPDH, and PRM1 (data not shown) were not acceptable, nor were PRM2 and MYC (Table 3). Therefore, the only normalization that could be applied was the one attenuating variation in RT efficiency (evaluated using the EGFP marker). Spermatozoa are unique cells, so a need exists to identify housekeeping-like genes, if any exist, for this cell type.
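As an illustration only, normalization against an exogenous spike-in such as EGFP can be sketched as a cycle-threshold (Ct) correction. The sketch below is not the procedure used in the present study: the function names, the choice of a reference EGFP Ct, and the assumed amplification efficiency of 2 (perfect doubling per cycle) are all hypothetical.

```python
def corrected_ct(target_ct, egfp_ct, reference_egfp_ct):
    """Shift a target Ct by the sample's deviation in spike-in (EGFP) Ct.

    Assumption: a sample whose EGFP Ct is later than the reference had a
    less efficient RT step, so its target Ct is moved earlier by the same
    number of cycles.
    """
    return target_ct - (egfp_ct - reference_egfp_ct)


def relative_abundance(ct_sample, ct_calibrator, efficiency=2.0):
    """Abundance of a sample relative to a calibrator.

    Assumption: `efficiency` is the per-cycle amplification factor
    (2.0 means perfect doubling); a lower Ct means more template.
    """
    return efficiency ** (ct_calibrator - ct_sample)


if __name__ == "__main__":
    # Hypothetical Ct values: sample B lags one cycle behind sample A for
    # both the target and EGFP, consistent with a less efficient RT step.
    ct_a = corrected_ct(target_ct=30.0, egfp_ct=20.0, reference_egfp_ct=20.0)
    ct_b = corrected_ct(target_ct=31.0, egfp_ct=21.0, reference_egfp_ct=20.0)
    print(relative_abundance(ct_b, ct_a))
```

With these made-up values, the apparent twofold difference between the two samples disappears after correction, which is the kind of RT-related variation that a spike-in is meant to attenuate. A spike-in cannot correct for differences in the amount or integrity of the input RNA itself, which is why a housekeeping-like gene would still be desirable.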
The last, but not least, challenge relates to the presence of DNA polymorphisms. We found an inconsistency between the qRT-PCR (TaqMan) results and the amount of PCR product detected on gel. For example, we detected DNA polymorphisms in PRM2, F_03_B08 (unknown, XIST related), and L_04_G12 (unknown, ALLW1950) that impaired qPCR detection (TaqMan probe). Indeed, whereas the fragments were, in fact, amplified, as detected on gel electrophoresis, the presence of DNA polymorphisms impaired the probe annealing (confirmed by sequencing; data not shown). Therefore, unless the sequence information on the target gene is known for all individuals involved in a given study, the gene expression level can be wrongly interpreted.
Another problem encountered involved the design of primers for PCR validation. This problem arose because the only information available for some of the clones was limited to the sequences that we obtained, supplemented by a bovine contig homologous sequence. Therefore, the full transcript sequence (or more information at least) is required to design acceptable primer sets for most of the clones detected. Nevertheless, the majority of clones for which we designed effective primers were detected in spermatozoal and/or testicular RNA samples (Fig. 6). Some clones were detected only in the testicular RNA samples. In these cases, the transcript may be more abundant during earlier stages of spermatogenesis and be present at very low levels in mature spermatozoa. Two NRR-SSHL clones (L_01_G08 and L_02_F08) could not be detected by end-point PCR in either the testis or the spermatozoal RNA samples (not shown) unless qRT-PCR was performed. This suggests that either the abundance of the messengers was too low to be detected by RT-PCR or that these mRNAs were not present in all sperm samples, a possibility consistent with our qRT-PCR results for individual bull aRNA samples.
Further investigation using different sire families is necessary to test the latter hypothesis, because such transcripts may be true markers of good/poor fertility. A genetic association study using this candidate gene approach would be necessary to confirm the potential of a marker for fertility. Several other clones would be interesting subjects to study, especially those for which a bovine genomic contig (i.e., Unknown candidate) (Fig. 4, A and B) is the only information available. Validation of this type will be possible when more information is obtained regarding the transcripts through gene characterization (exon-intron positions, effective promoter, splice variants, and presence of DNA polymorphisms).
Another potential application of this RNA, as already suggested [8, 10], is in the fertilization/embryo development steps. In view of these potential functions, spermatozoal RNA may belong to (but is not limited to) the following categories: 1) RNAs that are essential for spermatogenesis but are no longer useful (i.e., relics that have no impact on fertilization/embryo development); 2) RNAs that are necessary for fertilization/embryo development; 3) RNAs that should not have been retained, because they could have a detrimental effect on fertilization/embryo development; and 4) abnormal forms of a transcript (e.g., splicing defect, presence of functional mutations, etc.).
Spermatozoal RNAs corresponding to the first category should be present in either of the SSH libraries described in the present paper, because they would not have an impact on fertilization/embryo development. In fact, most of them may represent remnants of spermatogenesis. However, we also found that the PRM2 transcript was associated with low-fertile bulls (P = 0.0019) (Table 3). This gene might fall into the third category. Although we did not do extensive research concerning the impact of the specific DNA polymorphisms detected in the present study, Yatsenko et al. [11] suggested that functional mutations could be associated with fertility and detected in spermatozoa. Whether the identified RNAs fall into any of these categories is difficult to establish, but we have demonstrated that some transcripts are highly correlated with field fertility in bulls with a high fertility index compared to those with a subfertile status.
To our knowledge, the present study is the first to compare gene expression among spermatozoa from individuals with different field fertility potential. Gene expression in spermatozoa from fertile human males has been examined by microarray [8] and macroarray [62] analyses. The comparison of transcripts present in a pool of 19 ejaculates with transcripts from one ejaculate showed that the RNA profiles for these men were essentially the same: Only four expressed sequence tags out of a total of 2780 differed between the two groups [8]. The arrays used in these studies, however, were commercial and not designed to assess male fertility. Although further research concerning the identification of the genes we detected is required, our results suggest that differences can be found between samples with different fertility potential. Strong correlations between the field fertility index and the presence of rRNA genes (18S, 12S, and Large Chain R) were confirmed using the most sensitive molecular detection assay (qRT-PCR). Whereas some other identified genes were statistically marginal in the present study, possibly because of the limited number of bulls and material (too few semen straws were available), they could be better tracked within a structured design with different family-related groups studied over two or three generations. Given the sustained need for consensus about relevant molecular and functional tests [63], mitochondria might become markers at the molecular and functional levels that can be used to better identify bulls with desired fertility potential. Because conventional semen analyses are poor predictors of fertility, these markers could be used, in the long term, in cases where the current visual examinations are insufficient for distinguishing samples of different reproductive quality.
ACKNOWLEDGMENTS
We would like to thank Mr. Steve Methot (Statistician, Agriculture and Agri-Food Canada) for the data analysis. Our thanks also go to Dr. Guylain Boissonneault of the Faculty of Medicine (human reproduction) at Sherbrooke University and to Dr. Claude Robert of the Animal Reproduction Department at Laval University for the critical review and suggestions for the manuscript. We thank Semex Alliance, Inc., a Canadian-owned artificial insemination company, for providing semen samples.
FOOTNOTES
1 Supported by Agriculture and Agri-Food Canada and by a grant from the Dairy Cattle Genetics Research and Development (DairyGen) Council of the Canadian Dairy Network.
2 Correspondence: N. Bissonnette, Dairy and Swine Research and Development Centre, Agriculture and Agri-Food Canada, 2000 College Road, Sherbrooke, Quebec, Canada J1M 1Z3. FAX: 819 564 5507; e-mail: bissonnettenath{at}agr.gc.ca
Received: 28 November 2006.
First decision: 20 January 2007.
Accepted: 2 November 2007.
REFERENCES