
Metabolic Pathways, Functions, and Microbial Structures in Wastewater Treatment Bioreactors Revealed Using High-Throughput Sequencing
ABSTRACT: The objective of this study was to explore microbial community
structures, functional profiles, and metabolic pathways in a lab-scale
and a full-scale wastewater treatment bioreactor. To this end,
over 12 gigabases of metagenomic sequence data and 600,000 paired-end
sequences of the bacterial 16S rRNA gene were generated on the Illumina
HiSeq 2000 platform, using DNA extracted from activated sludge in the
two bioreactors. Three kinds of sequences (16S rRNA gene amplicons, 16S
rRNA gene sequences obtained from metagenomic sequencing, and predicted
proteins) were used to conduct taxonomic assignments. In particular, the
relative abundances of ammonia-oxidizing archaea (AOA) and
ammonia-oxidizing bacteria (AOB) were analyzed. Compared with quantitative
real-time PCR (qPCR), metagenomic sequencing was demonstrated to be a
better approach to quantify AOA and AOB in activated sludge samples.
It was found that AOB were more abundant than AOA in both reactors.
Furthermore, the analysis of the metabolic profiles indicated that the
overall patterns of metabolic pathways in the two reactors were quite
similar (73.3% of functions shared). However, for some pathways (such
as carbohydrate metabolism and membrane transport), the two reactors
differed in the number of pathway-specific genes.
INTRODUCTION
Activated sludge systems are regarded as a highly efficient and
widely used biological wastewater treatment process and are an
important approach for treating both municipal and industrial
wastewater. Microorganisms in the activated sludge are the main
contributors to the degradation and removal of pollutants. Many
methods have been developed to investigate and characterize the
microbes in activated sludge. Cultivation-based methods were
traditionally used to isolate and investigate the microorganisms in
activated sludge. However, due to the high selectivity of the culture
medium, most microorganisms cannot be successfully cultivated or
isolated using these methods. Thus cultivation-based methods may
introduce a large bias when used to explore microbial community
structure and function. About two decades ago, molecular methods were
introduced to analyze microbial communities in a culture-independent
way. These methods greatly increased our understanding of the microbial
communities in activated sludge. However, these molecular methods are
limited by PCR bias or low throughput.
The recently developed next-generation high-throughput sequencing
technologies introduced several new approaches to analyze microbial
communities. Sequencing a huge number of 16S rRNA gene PCR amplicons
has allowed exploration of the diversities and abundances of various
microbial populations in activated sludge, ocean water, and other
systems at extremely high resolution. However, this approach still
cannot eliminate PCR bias. In contrast, high-throughput metagenomic
sequencing, i.e., direct sequencing of the genomic DNA extracted from
environmental samples, provides a comprehensive approach without
PCR bias and can be used to explore both the taxonomic and the functional
diversity of a microbial community simultaneously. It has been used to
analyze microbial communities in human gut, lake water, and food. So
far, both 454 pyrosequencing and Illumina sequencing have been applied
for metagenomic sequencing. Illumina sequencing can achieve much higher
throughput and is thus more cost-effective for samples with high
microbial diversity.
454 pyrosequencing was used to analyze activated sludge previously.
However, due to the low throughput, the information in the activated
sludge was not resolved adequately. Recently, a metagenome of activated
sludge from a full-scale wastewater treatment plant (WWTP) generated
by Illumina sequencing was also reported. The results showed that high
diversities of microbial species and functional genes exist in
activated sludge.
In the present study, we applied two genotyping methodologies
(sequencing of bacterial 16S rRNA gene PCR amplicons, and metagenomic
sequencing) on the Illumina HiSeq 2000 platform to analyze the microbial
community structures and functional profiles in a lab-scale and a
full-scale wastewater treatment reactor. The two genotyping approaches were
compared for their applicability to reveal the taxonomic complexity of
activated sludge. The functions and metabolic pathways of the two
reactors were compared. In addition, metagenomic sequencing and qPCR
were used to quantify ammonia-oxidizing archaea (AOA) and
ammonia-oxidizing bacteria (AOB) in activated sludge samples. The overall
results of our study demonstrated that high-throughput
sequencing is a powerful tool for comprehensive characterization of the
taxonomic and functional profiles of the microbial community in
activated sludge.
Table 1. Illumina Sequencing Data Analyzed in This Study

category  item                                           R                  SL batch 1        SL batch 2          SL batch 3
reads     16S rRNA gene PCR amplicon reads (100 bp) (a)  150,000            150,000           --                  --
reads     metagenomic reads (100 bp)                     16,350,599 × 2 (d) 16,663,946 × 2    15,000,000 × 2 (e)  15,000,000 × 2
tags      tags (180 bp) (b)                              13,996,197         14,363,752        11,048,559          11,057,977
tags      percentage of 16S rRNA gene in tags            0.097%             0.167%            0.072%              0.066%
contigs   assembled contigs (>500 bp)                    54,114             67,759            71,708              71,760
contigs   percentage of reads used for assembly (%) (c)  54.7               16.8              28.2                28.2
contigs   genes predicted from the contigs               104,496            106,885           126,676             126,596

(a) The actual read numbers of R and SL were 213,720 and 152,448, respectively.
In order to facilitate the comparison between the two samples, the numbers of
reads were normalized to the same value (sequencing depth).
(b) Tags are the sequences obtained by overlapping the paired-end metagenomic reads.
(c) The percentage of reads used for assembly is counted only for contigs, not
for tags.
(d) × 2 means paired-end reads.
(e) For batch 2 and batch 3, we used the same sequencing depth for comparison.
MATERIALS AND METHODS
Sample Collection and DNA Extraction. In the present study, DNA was
extracted using the FastDNA SPIN Kit for Soil (Qbiogene, Carlsbad, CA, USA)
from activated sludge samples collected from a lab-scale reactor and a
full-scale WWTP at Stanley, Hong Kong, referred to as R and SL,
respectively. The biomass concentrations in both reactors were
about 2000 mg/L. For each sample, 2 mL of activated sludge was used for
DNA extraction. The DNA concentrations of samples R and SL, quantified
with a NanoDrop spectrophotometer (NanoDrop Technologies, Wilmington, DE,
USA), were 167.4 ng/μL and 87.7 ng/μL, respectively.
The most obvious differences between the two reactors were the concentrations
of ammonium nitrogen and organic matter. The lab-scale reactor, with a
working volume of 2.6 L, was operated in continuous mode to investigate the
nitrification process. The influent was made with deionized water (67%)
and seawater (33%). NH4Cl was added to the influent to give an ammonia
nitrogen concentration of 200 to 400 mg/L. In addition, 20 mg/L of KH2PO4
was added to the influent to provide sufficient phosphorus for the
growth of microorganisms. The reactor in the WWTP applied combined
biofilm and activated sludge systems. The flowchart of Stanley WWTP is
shown in Figure S1. The activated sludge sample analyzed in the present
study was taken from the aerobic zone. The ammonium concentration in
the influent of reactor SL was about 12 to 22 mg NH4+-N/L. The organic
matter was about 196 to 481 mg/L (measured as chemical oxygen demand).
The nitrification rates in reactors R and SL were about 0.14
kg(NH4+-N)/kg(Biomass)/day and 0.02 kg(NH4+-N)/kg(Biomass)/day,
respectively. Figure S2
shows the performance of the lab-scale reactor at the sampling point
for the present study. Table S1 summarizes the general parameters of
Stanley WWTP, which includes a nitrification process.
DNA Library Construction, PCR and Sequencing. For the metagenomic
sequencing, libraries with an insert size of 180 bp were constructed
according to the manufacturer's instructions (Illumina) for the two
samples.
For the 16S rRNA gene PCR amplicon sequencing, the extracted DNA was
amplified with the 16S rRNA gene V6 region primer set 985F (5'-CAA CGC
GAA GAA CCT TAC C-3') and 1046R (5'-CGA CAG CCA TGC ANC ACC T-3').
Six-nucleotide barcodes were added to the 5' end of 985F to allow
multiplexing. The
PCR was conducted in a 30 μL mixture containing 0.2 μL of Ex Taq
(TaKaRa, Dalian, China), 3 μL of 10× Ex Taq Buffer, 3 μL of dNTP mixture,
0.2 μM of each primer, and 20 to 50 ng of DNA under the following
conditions: 94 °C for 2 min; 25 cycles of 94 °C for 30 s, 57 °C for 30 s,
and 72 °C for 30 s; and a final extension at 72 °C for 5 min. PCR products
were purified using the PCRquick-spin PCR Product Purification Kit (iNtRON
Biotechnology, Korea). After purification, the PCR products were
quantified with a NanoDrop spectrophotometer; the PCR product
concentrations of R and SL were 46.2 ng/μL and 40.3 ng/μL, respectively.
Both the metagenomic sequencing and the 16S rRNA gene PCR amplicon
sequencing were conducted on the Illumina HiSeq 2000 platform by
applying the 101 bp paired-end (PE) strategy. A base-calling pipeline
(Sequencing Control Software, Illumina) was used to process the raw
fluorescent images and call sequences. As shown in Table 1, one batch
of metagenomic sequencing data was generated for Sample R, and three
batches of metagenomic sequencing data were generated for Sample SL.
All the metagenomic sequencing data have been deposited in NCBI's
Short Read Archive (Accession Number: SRA060680).
Validation of Repeatability of DNA Extraction, PCR and Sequencing. To
investigate the repeatability of the DNA extraction, PCR and sequencing
used in the present study, three DNA samples (A, B, and C) obtained
from the activated sludge of Stanley WWTP were sequenced on the Illumina
platform as described above. The DNA of samples A and B was extracted in
duplicate from the same activated sludge, while C was a PCR duplicate
of sample A using the same DNA extract.
Quality Filtering of the Raw
Reads. For all the data sets, reads containing one or more uncalled
bases and reads containing bases with a quality score lower than 30 were
removed. All the PE reads of 16S rRNA gene PCR amplicons were overlapped
to reduce sequencing errors. The operational taxonomic unit (OTU)
analysis of the 16S rRNA gene sequences was conducted using uclust
embedded in QIIME. For the metagenomic reads, both overlapping and
assembly were conducted. The parameters adopted for overlapping were
as follows: an overlap region of at least 20 nt was required,
and at most two mismatches were allowed. The sequences generated by
overlapping the PE metagenomic reads are hereinafter referred to
as "tags". Reads that could not be overlapped
were not included in the analysis of tags.
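The filtering and overlapping rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names (`passes_quality`, `merge_pair`) are hypothetical, the forward read's bases are kept inside the overlap, and the longest acceptable overlap is preferred, a choice the text does not specify.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def passes_quality(seq, quals, min_q=30):
    """Keep a read only if it has no uncalled base and no base below min_q."""
    return "N" not in seq and all(q >= min_q for q in quals)


def merge_pair(fwd, rev, min_overlap=20, max_mismatch=2):
    """Overlap a read pair into one tag, or return None if no overlap fits.

    Assumption: the longest overlap with at most max_mismatch mismatches wins,
    and the forward read's bases are kept in the overlap region.
    """
    rev_rc = reverse_complement(rev)
    longest = min(len(fwd), len(rev_rc))
    for olen in range(longest, min_overlap - 1, -1):
        tail, head = fwd[-olen:], rev_rc[:olen]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches <= max_mismatch:
            return fwd + rev_rc[olen:]
    return None
```

With 101 bp reads and a 180 bp insert, the expected overlap is roughly 22 nt, which is why a 20 nt minimum is a workable threshold here.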
Illumina Paired-End Reads De Novo Assembly. After quality filtering,
clean reads of the metagenomic data set were assembled into contigs
using the SOAPdenovo assembler. A range of values (27 to 61) for the
parameter K (k-mer size) was investigated, and it was found that the
optimal values were 35 and 33 for samples R and SL, respectively. An
optimal value of K was one that yielded the maximum contig number and the
maximum value of N50, which is the length of the smallest contig in the
set that contains the fewest contigs whose combined length represents
at least 50% of the assembly. Only contigs longer than 500 nt were
used for further analysis.
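The N50 definition above translates directly into code. The following sketch is illustrative only; the function names and the example contig lengths are hypothetical and do not come from the study's assemblies.

```python
def n50(contig_lengths):
    """Length of the smallest contig in the smallest set of longest contigs
    whose combined length is at least 50% of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def keep_long_contigs(contig_lengths, min_length=500):
    """Keep only contigs strictly longer than min_length (500 nt in the text)."""
    return [length for length in contig_lengths if length > min_length]


# Hypothetical contig lengths, for illustration only.
example = [600, 700, 800, 100, 400]
print(n50(keep_long_contigs(example)))
```

Comparing this value, together with the contig count, across assemblies built with different K values is one way to reproduce the selection criterion described in the text.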
SOAPaligner in the SOAP package was used to align all metagenomic
sequencing reads to the contigs. The number of reads
successfully aligned to the contigs divided by the total number of
reads is referred to as the "percentage of reads used for assembly" in the
present study.
Identification of 16S rRNA Genes. To identify 16S rRNA genes, all
assembled tags and contigs were aligned with sequences in the RDP
database (Release 10.26) using BLASTN. Tags and contigs with an E-value
<10^-5 and alignment length >50 nt were extracted as 16S rRNA gene
sequences and used for further analysis. The DNA sequences identified
from the tags are referred to as "16S rRNA gene tags" in the present
study.
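The E-value and alignment-length cutoffs above can be applied to BLASTN results with a short filter. The sketch below assumes the hits were written in BLAST's standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score); the text does not state which output format was used, and `filter_hits` is a hypothetical helper name.

```python
def filter_hits(lines, max_evalue=1e-5, min_length=50):
    """Return the IDs of queries with at least one hit passing both cutoffs.

    Assumes tab-separated BLAST tabular rows: column 4 is the alignment
    length and column 11 is the E-value (1-based numbering).
    """
    keep = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        length = int(fields[3])
        evalue = float(fields[10])
        if evalue < max_evalue and length > min_length:
            keep.add(fields[0])
    return keep
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, which keeps memory use flat for millions of tags.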
Gene Prediction. In the present study, MetaGene was used to find open
reading frames (ORFs) in the contigs of each sample. Only the
predicted ORFs longer than 100 nt were translated into protein sequences
using NCBI genetic code 11. The resulting protein sequences are
referred to as "predicted proteins" in the present study.
Taxonomic Assignment. The 16S rRNA gene PCR amplicon reads and the 16S
rRNA gene tags were aligned with the Silva database using the BLASTN
tool. The results were then imported into the program MEGAN to conduct
the taxonomic assignment with the lowest common ancestor algorithm
using the program default parameters (i.e., min score 50, top percent
10, win score 0.0, and min complexity 0.3). We specified that MEGAN use
the synonyms file (silva2ncbi.map) downloaded from the MEGAN Web site
(http://ab.inf.uni-tuebingen.de/software/megan/) to map Silva
accession numbers to NCBI taxa.
The predicted proteins were aligned with protein sequences in the NCBI-
NR database using the BLASTP tool. The data generated by BLASTP were
also imported into MEGAN for taxonomic assignment as described above.
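The lowest common ancestor idea behind these assignments can be illustrated with a simplified sketch. It is not MEGAN's implementation: only the min score and top percent filters are modeled (win score and min complexity are ignored), the taxonomy is a toy child-to-parent dictionary, and all names are hypothetical.

```python
def lineage(taxon, parent):
    """Return the path from the root down to taxon, given child -> parent links."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path[::-1]


def lowest_common_ancestor(taxa, parent):
    """Deepest node shared by the lineages of all given taxa."""
    common = None
    for ranks in zip(*(lineage(t, parent) for t in taxa)):
        if all(r == ranks[0] for r in ranks):
            common = ranks[0]
        else:
            break
    return common


def assign_read(hits, parent, min_score=50.0, top_percent=10.0):
    """Assign one read from its (taxon, bit score) hits.

    Hits below min_score are dropped; of the rest, only those scoring within
    top_percent of the best hit contribute to the common ancestor.
    """
    hits = [(taxon, score) for taxon, score in hits if score >= min_score]
    if not hits:
        return None
    best = max(score for _, score in hits)
    cutoff = best * (1 - top_percent / 100)
    return lowest_common_ancestor([t for t, s in hits if s >= cutoff], parent)


# Toy taxonomy for illustration (intermediate ranks omitted).
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Nitrospirae": "Bacteria",
    "Nitrosomonas": "Proteobacteria",
    "Nitrosospira": "Proteobacteria",
    "Nitrospira": "Nitrospirae",
}
```

A read whose best hits fall in two genera of the same phylum is therefore placed at the phylum, which is why ambiguous short reads tend to accumulate at higher taxonomic ranks.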
AOA and AOB Abundance Analysis. Following a previously published
methodology, two approaches were applied to quantify the abundances of
AOA and AOB using 16S rRNA genes. Method 1: BLASTN was used to align
all the reads to four complete crenarchaeal AOA 16S rRNA genes
(uncultured Crenarchaeote 54d9 [GI:42557759], Candidatus
Nitrosopumilus maritimus [GI:71668096], uncultured Crenarchaeote 4B7
[GI:14548123], Candidatus Nitrosocaldus yellowstonii
[GI:166164468]), and six AOB 16S rRNA genes (Nitrosomonas europaea
[GI:30248031], Nitrosomonas communis [GI:1592803], Nitrosomonas
oligotropha [GI:11545282], Nitrosomonas multiformis [GI:82701135],
Nitrosospira sp. B6 [GI:1236775], and Nitrosococcus oceani
[GI:77163561]). Then, reads were assigned as AOA-like or AOB-like at
cutoffs of 97% similarity and 90 bp alignment length, following the above
reference.
Method 2: The reads that matched (allowing one or two mismatches) the
probes specific for the AOA (Cren1.1b_537) and AOB (CTO189, CTO654, Nsm156,
Nso1225, and Nsp436) 16S rRNA genes were counted as AOA-like and
AOB-like sequences, respectively. This in silico search was conducted using
BLASTN.
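The probe search in Method 2 amounts to looking for a short oligonucleotide in each read while tolerating a small number of mismatches. The study ran this search with BLASTN; the sketch below substitutes a plain sliding-window comparison on both strands to make the matching rule explicit. Function names are hypothetical, and the usage line borrows the amoA-1F primer sequence quoted in the qPCR methods purely as an example query.

```python
# IUPAC nucleotide codes, so degenerate probe positions (e.g., S, K) can match.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def reverse_complement(seq):
    """Reverse complement of a read made of A, C, G, T, N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def matches_at(read, start, probe, max_mismatch):
    """True if probe fits read at start with at most max_mismatch mismatches."""
    mismatches = 0
    for offset, code in enumerate(probe):
        if read[start + offset] not in IUPAC[code]:
            mismatches += 1
            if mismatches > max_mismatch:
                return False
    return True


def read_matches_probe(read, probe, max_mismatch=1):
    """Search both strands of the read for the probe."""
    for strand in (read, reverse_complement(read)):
        for start in range(len(strand) - len(probe) + 1):
            if matches_at(strand, start, probe, max_mismatch):
                return True
    return False


def count_probe_hits(reads, probe, max_mismatch=1):
    """Number of reads that contain the probe on either strand."""
    return sum(read_matches_probe(r, probe, max_mismatch) for r in reads)


# Example query: the amoA-1F sequence given in the qPCR section.
print(count_probe_hits(["TTTGGGGTTTCTACTGGTGGTAAA"], "GGGGTTTCTACTGGTGGT", 1))
```

An exhaustive scan like this is slower than BLASTN on millions of reads but has no seed-length requirement, so short probes with mismatches near their ends are not missed.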
Similarly, potential amoA gene sequences were also quantified by the
methods described above. The references were the amoA genes of AOA
(Nitrosopumilus maritimus SCM1 (GI:71668108), Candidatus
Nitrosocaldus yellowstonii (GI:166164469)), and AOB (Nitrosomonas
europaea (GI:3252912), Nitrosococcus oceanus (GI:2104719),
Nitrosomonas communis (GI:11545288), Nitrosomonas oligotropha
(GI:11545302), Nitrosolobus multiformis (GI:1085094), and Nitrosospira
sp. B6 (GI:18996142)). The cutoffs applied for amoA genes were 80%
identity and 70 bp alignment length, because the amoA gene is shorter
and more diverse than the 16S rRNA genes of AOA and AOB. The probes used
for searching were Arch-amoAF/Arch-amoAR for AOA and amoA-1F/amoA-2R
for AOB.
Gene Functional Classification. The contigs were annotated using the
MG-RAST server with the SEED Subsystem database. The maximum E-value cutoff
and the minimum alignment length cutoff were set at 10^-5 and 50 bp,
respectively. Most of the genes were successfully classified into the
hierarchical metabolic categories.
Metabolic Pathway Analysis. BLASTP was used to align the protein
sequences translated from ORFs to the NCBI-NR database with an E-value
<10^-5. By importing the BLASTP results into MEGAN, the sequences in
each sample were mapped to KEGG pathways. The KEGGviewer module in
MEGAN was used to analyze pathways. In the present study, we mainly
focused on nitrogen metabolic pathways, which are important processes
in wastewater treatment.
Quantitative Real-Time PCR. Archaeal and bacterial amoA gene copy
numbers were quantified using an iCycler IQ System (Bio-Rad, Hercules,
CA) in triplicate with primer sets Arch-amoAF (5'-STA ATG GTC TGG CTT
AGA CG-3')/Arch-amoAR (5'-GCG GCC ATC CAT CTG TAT GT-3') and amoA-1F
(5'-GGG GTT TCT ACT GGT GGT-3')/amoA-2R (5'-CCC CTC KGS AAA GCC TTC
TTC-3'), respectively. PCR amplification was performed in a total volume
of 30 μL containing 15 μL of iQ SYBR Green Supermix (Bio-Rad, Hercules,
CA), 5 μL of DNA template and 0.3 μM of each primer. The thermocycling
steps for qPCR amplification were as follows: 95 °C for 7 min, followed
by 40 cycles at 95 °C for 60 s, 56 °C for 60 s, and 72 °C for 60 s. The
PCR products were visualized and checked by agarose (1%) gel
electrophoresis in the presence of suitable size markers.
RESULTS AND DISCUSSION
Validation of Repeatability of DNA Extraction, PCR and Sequencing.
Three samples (A, B, and C) were sequenced to investigate the
repeatability of DNA extraction, PCR amplification, and sequencing. In
order to keep the same sequencing depth, 82,519 sequences (16S V6
region) were randomly selected from each of the three samples and
combined (247,557 sequences in total) for OTU analysis. As
shown in Figure 1-I, most of the sequences (over 97%) were classified
into OTUs shared by all three samples. Only very few sequences (less
than 0.5%) were classified into OTUs shared by two samples or found
in only one sample, which was probably caused by sequencing errors or
other random factors introduced during DNA extraction and PCR.
Figure 1-II shows that the percentages of the major OTUs obtained from the
three parallel samples were quite consistent, with low standard
deviations (0.09 to 0.85). Both Figure 1-I and 1-II indicate that the
repeatability of the DNA extraction, PCR and Illumina sequencing applied
in this study was acceptable.
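The repeatability check above rests on two steps: drawing the same number of sequences from every sample, then counting how many sequences fall into OTUs seen in three, two, or one sample. A minimal sketch follows; it assumes each sequence has already been given an OTU label (the study used uclust in QIIME for that step), and the function names and toy labels are hypothetical.

```python
import random
from collections import Counter


def subsample(otu_labels, depth, seed=0):
    """Randomly draw `depth` sequences (without replacement) from one sample."""
    rng = random.Random(seed)
    return rng.sample(otu_labels, depth)


def sequences_by_sharing(samples):
    """Count sequences by how many samples share their OTU.

    samples: dict mapping sample name -> list of OTU labels, one per sequence.
    Returns a Counter: number of samples containing the OTU -> sequence count.
    """
    presence = Counter()
    for labels in samples.values():
        for otu in set(labels):
            presence[otu] += 1
    totals = Counter()
    for labels in samples.values():
        for otu in labels:
            totals[presence[otu]] += 1
    return totals


# Toy example with three replicates of four sequences each.
toy = {
    "A": ["o1", "o1", "o2", "o3"],
    "B": ["o1", "o2", "o2", "o4"],
    "C": ["o1", "o1", "o1", "o2"],
}
print(sequences_by_sharing(toy))
```

Dividing the count for OTUs shared by all three samples by the total number of sequences gives the kind of percentage reported for Figure 1-I.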
Sequence Assembly Overview. In this study, 600,000 16S rRNA gene reads
and more than 120 million metagenomic reads were obtained for the
taxonomic and functional analyses.

Figure 1. Repeatability of DNA extraction, PCR and sequencing for activated
sludge samples. I: Number of sequences in shared and unique OTUs. A, B, and
C represent three samples sent for Illumina sequencing. Samples A and
B were DNA extraction duplicates from the same activated sludge, while sample
C was the PCR duplicate of sample A. For each sample, 82,519 sequences
were extracted to conduct OTU analysis at a 3% distance cutoff. The number in
the red circle represents the number of sequences that are present in the
core OTUs shared by the three samples. The numbers in the green circles
represent the sequences that are present in the OTUs shared by two samples.
The numbers in the blue circles represent sequences that are present in the
OTUs observed in only one sample. II: The percentages of the top-ten abundant
OTUs in the three samples. The numbers on the bars indicate the standard
deviations of the percentages of the corresponding OTUs in the three samples.
About 50 to 70 thousand contigs longer than 500 bp were assembled from the
metagenomic reads of R and SL (Table 1). The percentage
of reads that could be assembled into contigs (54.7% for Sample R and
16.8 to 28.2% for Sample SL) and the number of contigs (54,114 for Sample
R and 67,759 to 71,760 for Sample SL) indicated that the assembly results
in this study were much better than those of a previous study on activated
sludge conducted using 454 pyrosequencing, which obtained only 117 contigs
with 0.3% of sequences used in the assembly. A recent paper reported
that, for a sample taken from an enhanced biological phosphorus removal
WWTP, the percentage of reads that could be assembled into contigs (>300
bp) was 16%, which is comparable to Sample SL in the present study.
The percentage of sequences (54.7%) assembled into contigs in Sample
R was higher than that of SL and even higher than in a recent study on
human gut microbes using the same approach, which assembled 42.7% of
Illumina reads. This can be attributed to the low microbial diversity
in Sample R. According to Table 1, 0.066% to 0.167% of the assembled
tags were potential 16S rRNA gene sequences, which
is generally in agreement with a previous study analyzing microbes
in a biogas reactor using a metagenomic approach.
Taxonomic Complexity of the Microbial Diversity. As described in the
Materials and Methods section, three kinds of sequences (16S rRNA
amplicons, 16S rRNA gene tags, and predicted proteins) obtained by
different approaches were used to conduct taxonomic assignments for
bacteria. However, it was found that even at the phylum level the
results of these methods were not consistent with each other (Figure
2). For Sample SL, the results of 16S rRNA gene PCR amplicons had a
good linear relationship with the results of 16S rRNA gene tags.
However, in other cases (the subplots in Figure 2), no good linear
relationships were observed. Based on current knowledge and given these
methodological limitations, it is difficult to determine accurately the
proportion of each kind of bacteria in activated sludge systems because
of their extremely high diversity. Albertsen et al. compared results from
metagenomic sequencing and FISH analyses, which indicated that some
phyla (such as Actinobacteria, Chloroflexi, TM7, and Bacteroidetes)
varied greatly between the results of these two methods. The
method based on 16S rRNA gene amplification suffers from PCR bias, which
may lead to underestimation or overestimation of species abundance or
diversity. Besides 16S rRNA genes, some studies used predicted
proteins to analyze bacterial abundance, but the results were
biased because (1) the assembly process generates nonredundant contigs;
(2) the bacterial protein databases are far from complete; and (3)
species with many strain variations are not well represented
in the resulting contigs. Theoretically, the results of 16S rRNA gene
tags are more reliable owing to the lack of PCR bias and a relatively
complete 16S rRNA sequence database. Furthermore, it can
be expected that the reliability will improve as
sequencing read length increases. In addition, the
severity of the PCR bias may not be the same in different systems. If
PCR bias is not severe, the result of 16S rRNA gene amplicons will
be close to the result of the 16S rRNA gene tags, which is probably the
reason why the 16S rRNA gene PCR amplicons correlated well with the
results of 16S rRNA gene tags in Sample SL in Figure 2.
We have also used 454 pyrosequencing for the two bioreactors in our
previous studies. For the steadily operated full-scale reactor (Sample
SL), it was confirmed that the most dominant phylum was Proteobacteria,
which accounted for about 40 to 50% of the total bacteria. This finding is
generally consistent with activated sludge from WWTPs in other studies.
However, the percentages of other phyla differed between the
results of metagenomic sequencing (the present study) and
pyrosequencing. Similar differences were also observed between FISH
and metagenomic sequencing. For the lab-scale reactor (Sample R), large
amounts of Nitrospirae were observed (about 30% of the sequences,
according to results based on gene tags; Figure 2), which was quite
different from the previous study (about 7%), when the reactor
was operated in the partial nitrification state. This indicates that in
autotrophic nitrification reactors the abundance of Nitrospirae may be
comparable with (or even exceed) the abundance of Proteobacteria.
To investigate and compare the microbial complexity in the two reactors
of this study, rarefaction analysis was conducted on the data sets of
16S rRNA gene tags and 16S rRNA gene PCR amplicon sequences using MEGAN
by repeatedly sampling subsets of sequences. Then, at each sampling
depth, the number of nodes (genus level) obtained from the subsets
of sequences was counted. The comparative rarefaction analysis results
of 16S rRNA gene tags and 16S rRNA amplicon sequences (Figure S3) showed
that the bacterial diversity in Sample SL was much higher than in Sample
R. The number of nodes at genus level obtained using 16S rRNA gene
tags was about twice that obtained using 16S rRNA PCR amplicon
sequences (Figure S3). This was probably due to two reasons:
(1) the primer pair for the 16S rRNA gene V6 region is not conserved enough
to amplify products from all bacteria in the activated sludge; and (2)
a 16S rRNA gene tag is about two times longer than a 16S rRNA gene PCR
amplicon sequence in this study and could be assigned to a genus with
more confidence.
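The rarefaction procedure described above, repeated random subsampling followed by counting distinct genera at each depth, can be sketched as follows. This illustrates the general technique rather than MEGAN's implementation; the function name, the number of repeats, and the toy genus labels are assumptions.

```python
import random


def rarefaction_curve(genus_labels, depths, repeats=10, seed=0):
    """Mean number of distinct genera observed at each subsampling depth.

    genus_labels: one genus label per assigned sequence.
    depths: subsample sizes; each must not exceed len(genus_labels).
    """
    rng = random.Random(seed)
    curve = []
    for depth in depths:
        counts = [len(set(rng.sample(genus_labels, depth))) for _ in range(repeats)]
        curve.append((depth, sum(counts) / repeats))
    return curve


# Toy community: three genera with uneven abundances.
labels = ["g1"] * 50 + ["g2"] * 30 + ["g3"] * 20
for depth, genera in rarefaction_curve(labels, [1, 10, 50, 100]):
    print(depth, genera)
```

A curve that is still rising at the full sequencing depth, as would be expected for a diverse sample such as SL, signals that further sequencing would reveal additional genera.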
Figure 2. Abundances of bacterial phyla in the two samples determined by three
different data sets. The subplots show the correlation between different data
sets. "Genes", "16S Tags", and "16S Amplicons" represent genes predicted from
ORFs, 16S rRNA gene tags identified from metagenomic sequencing, and 16S rRNA
gene PCR amplicon sequencing reads, respectively.
Table S2 lists the taxonomic assignment results at the order level for the
two samples, based on the taxonomic assignment of 16S
rRNA gene tags. The dominant orders in Samples R and SL were Nitrospirae
and Actinomycetales, accounting for about 40% and 20% of the
sequences, respectively. Among the ten most abundant orders of each
sample, five orders were shared by the two samples:
Flavobacteriales, Rhizobiales, Enterobacteriales, Actinomycetales, and
Planctomycetales. This indicates that these bacteria are adaptable to
both reactors.
Abundances of AOA and AOB. In this study, the abundances of potential
AOA and AOB were quantified based on the amoA gene and the 16S rRNA gene
using both metagenomic (alignment and probe search) and qPCR approaches
(Figure 3). Due to nonspecific amplification and/or the formation of
primer dimers, the quantification of the AOA amoA gene in Sample SL was not
successful. The same failure was also encountered with
other full-scale wastewater treatment bioreactors in our previous
study. This may be attributed to the much higher microbial
diversity and the PCR inhibitors in these full-scale reactors. The
results obtained from metagenomic data (Figure 3-I) for both the amoA gene
and the 16S rRNA gene suggested that AOB were much more abundant than AOA
in both samples. The amoA gene copy numbers of AOA and AOB
were 1 and 2.5, respectively, and most species of AOA and AOB have only
one copy of the 16S rRNA gene. Therefore, even when these copy numbers were
taken into account, AOB still dominated over AOA in the two reactors.
These results were consistent with our previous report and others'
results for activated sludge systems. Although the quantification of the AOA
amoA gene in Sample SL failed, the qPCR results in Figure 3-II confirmed
that the AOB amoA gene dominated over the AOA amoA gene in Sample R. However,
in other environments, such as soil, seawater, and sediments, AOA are
usually more abundant than AOB. The fact that AOB exceeded AOA in the
activated sludge systems was probably due to the high ammonia
concentration and high ammonia loading in the wastewater treatment
bioreactors. As reported by Martens-Habbena et al., the specific
affinity of AOA for ammonia is higher than that of AOB. AOA are usually
abundant in environments with low ammonia concentrations.
Based on the above results and others' reports, it can be inferred
that AOB may be more competitive than AOA in activated sludge systems
and function as the main contributors to nitrification. Figure 3 also
indicates that the absolute abundances of the AOB amoA sequences and 16S
rRNA sequences were not consistent. This could be caused by a number
of factors, including the different copy numbers of the amoA gene and the 16S
rRNA gene in various species of AOB and the different lengths of the AOB
amoA gene and 16S rRNA gene. However, both the amoA gene and the 16S rRNA gene
demonstrated that AOB were more abundant than AOA in the two reactors.
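The copy-number argument in this paragraph can be made explicit with a small calculation: dividing each read count by the gene copies per cell gives a rough cell-equivalent abundance, and the AOB/AOA ratio is then compared. The copy numbers (2.5 for AOB, 1 for AOA) come from the text; the function name and the read counts in the usage line are hypothetical and are not values from Figure 3.

```python
def copy_corrected_ratio(aob_reads, aoa_reads, aob_copies=2.5, aoa_copies=1.0):
    """AOB/AOA ratio after dividing amoA read counts by copies per cell.

    Returns infinity when no AOA reads are detected.
    """
    aob_cells = aob_reads / aob_copies
    aoa_cells = aoa_reads / aoa_copies
    if aoa_cells == 0:
        return float("inf")
    return aob_cells / aoa_cells


# Hypothetical read counts, for illustration only.
print(copy_corrected_ratio(aob_reads=500, aoa_reads=20))
```

A ratio above 1 after correction means that the higher amoA copy number of AOB alone cannot explain their dominance in the read counts.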
Gene Functional Classification and Metabolic Pathway. All the contigs
of the two samples were annotated by MG-RAST against the SEED Subsystem
database. Using a maximum E-value cutoff of 10^-5 and a minimum
alignment length cutoff of 50 bp, about 87.6% of the ORFs on contigs of
Sample R and 78.7% of the ORFs on contigs of Sample SL were successfully
assigned to hierarchical categories. This was higher than the
percentage of sequences successfully assigned (40%) in a previous study
conducted 4 years ago using 454 pyrosequencing. Possible reasons
for the higher assignment percentages include more up-to-date
databases, different annotation methodologies, and different kinds
of sequences used for annotation. For example, reads were used for
annotation in the previous study, while genes predicted from
assembled contigs were used for annotation in the present study.
Of the functions annotated from Samples R and SL based on SEED
subsystems, 73.3% were shared by the two samples, indicating that similar
metabolic pathways were present in the two bioreactors. Figure 4 illustrates
the abundance differences for the most common categories of functional genes
in the two samples. Genes involved in carbohydrate metabolism were
less abundant in Sample R than in Sample SL. This may be due to the low
concentration of organic matter (TOC < 0.5 mg/L) in the lab-scale
nitrification reactor. In addition, the genes assigned to the "Fatty Acids,
Lipids, and Isoprenoids" and "Metabolism of Aromatic Compounds" categories
had significantly higher abundances in Sample SL than in Sample R,
indicating that the chemical composition of wastewater in the municipal
WWTP was much more complicated than that in the lab-scale reactor. It
was also found that Sample SL harbored slightly more genes related to
nitrogen metabolism as a whole, while the total number of nitrification
genes in Sample R was higher than that in Sample SL (Figure 5, from
ammonia to nitrate). Another difference between the two samples was in
genes associated with membrane transport. The fact that the
nitrification reactor contained about 1% salinity could be a reason for
the high abundance of membrane transport genes in Sample R. Furthermore,
the finding of few photosynthesis-related genes and a low abundance of genes
involved in phosphorus metabolism (compared with nitrogen metabolism)
is similar to results reported by Sanapareddy et al.
Figure 3. Abundances of AOA and AOB in Sample R and Sample SL. I: AOA and AOB
sequences identified from metagenomic sequencing data. A: The number of reads
that aligned to AOA/AOB 16S rRNA reference sequences with a similarity cutoff
of 97% and an alignment length cutoff of 90 bp. B: Number of reads that matched
AOA/AOB 16S rRNA specific probes, allowing up to one mismatch. C: The number
of reads that aligned to AOA/AOB amoA gene reference sequences with an identity
cutoff of 80% and an alignment length cutoff of 70 bp. D: Number of reads that
matched AOA/AOB amoA gene specific probes, allowing up to two mismatches.
II: Abundances of AOA and AOB amoA gene copy numbers quantified by qPCR. The
quantification of the AOA amoA gene in sample SL failed due to serious smearing
and nonspecific PCR amplification.

Figure 4. Functional categories of Sample R and Sample SL according to the
SEED Subsystems database. The data in this figure were generated from the
ORFs predicted from the contigs of the two samples. The percentage refers to
the percentage of the total ORFs assigned to each category.

Figure 5. The nitrification and denitrification KEGG pathway. The red and blue
numbers represent the abundances of the corresponding genes predicted from
ORFs in Sample R and Sample SL, respectively. The blue numbers before and
after "/" indicate the minimum and maximum values in the repeated sequencing
batches. The numbers in the boxes are the EC numbers of the enzymes,
whose definitions are listed in Table S3.
In addition to the metabolic categories, the gene projections of the
two samples on KEGG pathways are shown in Figure S4, which gives
an integrated picture of the potential metabolic pathways
present in the two activated sludge systems. As can be seen in Figure
S4, there was a complex assemblage of metabolic pathways, which is
consistent with the high biodiversity and complexity of activated sludge
systems. Overall, most of the pathways were common to both reactors
(e.g., fatty acid metabolism, urea cycle, and citrate cycle), which
is consistent with the high percentage (73.3%) of functions shared by
the two reactors. The main differences between the reactors were the
abundances of genes involved in some specific pathways. It is also
clear that quite a few pathways in the full-scale reactor were not
detected in the lab-scale reactor (such as steroid biosynthesis, dioxin
degradation, and xylene degradation), consistent with the higher
microbial diversity in Sample SL than in Sample R. The nitrogen
metabolism pathways, which contain the important nitrification and
denitrification processes in wastewater treatment, were a focus of the
study.
Figure 5 shows the metabolic pathways of nitrification and
denitrification. The enzyme commission (EC) numbers in Figure 5 are
defined in Table S3. Figure 5 and Table S3 show that most enzymes
existed in both reactors. One exception is enzyme 1.7.7.2
(ferredoxin-nitrate reductase), which is an enzyme catalyzing nitrite
oxidation to nitrate. The main difference between the two samples is the
abundance of each gene. As shown in Table S3, the numbers of
genes coding for enzymes 1.13.12.- (ammonia monooxygenase) and 1.7.3.4
(hydroxylamine oxidase) in Sample R were about 3 to 4 times higher than
those in Sample SL, indicating the high nitrification capability of the
nitrification reactor. This is consistent with the higher influent
ammonium concentration and higher nitrification rate in R than in SL. The
genes coding for enzymes involved in denitrification (such as 1.7.99.4,
1.7.2.1, and 1.7.99.7) were more abundant in Sample SL than in Sample R.
Although almost no denitrification was observed in the lab-scale
nitrification reactor (Figure S1), there were still many genes
related to denitrification in Sample R. It seems that these
denitrification genes were not expressed in the lab-scale nitrification
reactor because of the low concentration of organic matter and the high
concentration of dissolved oxygen.
Although it is still at a very early stage, high-throughput sequencing
has provided a powerful tool to explore complicated ecosystems such as
activated sludge. In this study, we compared different kinds of sequences
(16S rRNA gene amplicons, 16S rRNA gene tags, and predicted proteins)
for taxonomic assignments and found that 16S rRNA gene tags had less
bias for investigating the microbial community. Also, metagenomic
sequencing was demonstrated to be a better approach to quantify AOA
and AOB in activated sludge samples than qPCR, which often leads to
nonspecific PCR amplification and even unsuccessful PCR. Based on both
metagenomic sequencing and qPCR results, it was concluded that AOB were
more abundant than AOA in the lab-scale nitrification reactor and the
full-scale municipal wastewater treatment reactor. Furthermore, the
analysis of the metabolic profiles indicated that the overall patterns
of metabolic pathways were quite similar (73.3% of functions shared)
in the two different reactors. It is important to note that the metabolic
pathways analyzed in the present study are only potential ones because
the current analyses were based on DNA instead of RNA. Further studies
based on RNA are warranted to explore the active functions in different
activated sludge systems.
