Environmental Enteric Dysfunction Includes a Broad Spectrum of Inflammatory Responses and Epithelial Repair Processes

Background & Aims: Environmental enteric dysfunction (EED), a chronic diffuse inflammation of the small intestine, is associated with stunting in children in the developing world. The pathobiology of EED is poorly understood because of the lack of a method to elucidate the host response. This study tested a novel microarray method to overcome limitations of RNA sequencing to interrogate the host transcriptome in feces in Malawian children with EED.
Methods: In 259 children, EED was measured by lactulose permeability (%L). After isolating low copy numbers of host messenger RNA, the transcriptome was reliably and reproducibly profiled, and validated by polymerase chain reaction. Messenger RNA copy number then was correlated with %L and differential expression in EED. The transcripts identified were mapped to biological pathways and processes. The children studied had a range of %L values, consistent with a spectrum of EED from none to severe.
Results: We identified 12 transcripts associated with the severity of EED, including chemokines that stimulate T-cell proliferation, Fc fragments of multiple immunoglobulin families, interferon-induced proteins, activators of neutrophils and B cells, and mediators that dampen cellular responses to hormones. EED-associated transcripts mapped to pathways related to cell adhesion, and responses to a broad spectrum of viral, bacterial, and parasitic microbes. Several mucins, regulatory factors, and protein kinases associated with the maintenance of the mucous layer were expressed less in children with EED than in normal children.
Conclusions: EED represents the activation of diverse elements of the immune system and is associated with widespread intestinal barrier disruption. Differentially expressed transcripts, appropriately enumerated, should be explored as potential biomarkers.

Stunting, defined as a height-for-age z score (HAZ) of less than -2, affects 26% of all children younger than the age of 5 years worldwide. 1,2 Stunting is associated with reduced neurocognitive capability, diminished immunocompetence, 20% of disability-adjusted life years lost in this age group, and more than 2.1 million deaths annually. 2 Optimal gut health encompasses effective dietary nutrient absorption and a mucosal immune response that confines microbes to the lumen without inducing chronic tissue inflammation. Environmental enteric dysfunction (EED) is an asymptomatic, diffuse villous atrophy of the small bowel associated with chronic mucosal T-cell infiltration and reduced paracellular integrity. 3 EED is highly prevalent, often without gastrointestinal symptoms, in poor children in the developing world. 4,5 EED typically is assessed with a dual sugar permeability test, whereby mannitol (molecular weight, 182 daltons) and lactulose (molecular weight, 342 daltons) are ingested under controlled conditions and quantified in the urine. Neither sugar is degraded in the upper gastrointestinal tract or systemically metabolized after absorption, and both are excreted rapidly in the urine. 6 Lactulose is a disaccharide, which can be absorbed only by passively crossing disrupted cell junctions, and thus the amount of this sugar in the urine reflects small-bowel permeability. 6,7 Mannitol, a monosaccharide, is absorbed across cell membranes and between cell junctions and is included to normalize lactulose uptake and excretion to the mucosal surface area, and to control for variations in gastric emptying time. Both the ratio of urinary lactulose to mannitol and the fraction of lactulose that is excreted in the urine (lactulose permeability [%L]) have been used to assess gut health. The dual sugar assay, although imperfect, is a theoretically sound test of gut health. 7
Much has been learned about health and disease in the past decade by using agnostic surveys of the human transcriptome, 8 including, in recent years, reliance on RNA deep sequencing to profile transcriptional response to injury. Unfortunately, these methods have required RNA samples larger than 1 mg that have been processed to remove inhibitors of nucleic acid hybridization and nonhuman RNA. This requirement has limited host transcriptome analyses of feces from individual subjects.
This report details the development and application to a human cohort of a novel RNA selective isolation procedure from human feces, coupled with high-density, whole human transcriptome microarray technology to interrogate samples from 259 rural Malawian children with varying states of EED.

Study Design
This was a prospective cohort observational study of rural African children at high risk for EED. The primary outcomes were the correlation between %L and expression levels of protein coding genes, based on data that %L correlates with linear growth in this population. 9,10 Secondary outcomes were associations with Kyoto Encyclopedia of Genes and Genomes (KEGG) and canonical pathways in EED.

Eligible Subjects
The study was conducted in rural Malawi, where populations practice subsistence farming (corn and beans), and reside in mud and thatch homes. Water is collected from boreholes and wells; electricity is unavailable. Inclusion criteria consisted of subjects between 12 and 61 months of age who reside in 1 of 6 rural communities under research surveillance, and included 810 children in total. [11][12][13] This included a spectrum of children with EED, from no EED to severe EED. Children were excluded if they had a chronic disability or disease, severe acute malnutrition, or were receiving therapy for tuberculosis. All subjects were interviewed and examined by a physician and found to be free of pathologic conditions. Weight, length, and mid-upper-arm circumferences were measured by trained and monitored staff to determine nutritional status.

Dual Sugar Absorption Testing
Dual sugar permeability testing was conducted in a supervised setting, and complete consumption of the sugars and collection of all urine during the subsequent 6 hours was verified. 12 Children consumed no food or drink for 8 hours before drinking 20 mL of water into which 1 g of mannitol and 5 g of lactulose were dissolved. This was administered immediately after children voided. A dual sugar permeability test was considered successfully completed when all urine was collected for at least 4 hours after ingestion of the sugars, without spillage of dosing sugars or urine. Urine volumes were measured using a graduated cylinder, and a 2-mL aliquot was flash-frozen and shipped to the Baylor College of Medicine (Houston, TX) where urinary lactulose was measured using high-pressure liquid chromatography. 14,15 EED severity was assigned using population data from a larger clinical study such that the children with %L less than 0.2 were designated as not having EED, and those with %L greater than 0.2 and less than 0.7 were designated as having intermediate EED, and those with %L greater than 0.7 were designated as having severe EED. 9 The transformation log 2 (%L*100) was used to determine linear correlations between %L and microarray data.
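The severity cut-offs and the log transformation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and the handling of values exactly equal to 0.2 or 0.7 is an assumption, because the text gives only strict inequalities.

```python
import math

NO_EED_MAX = 0.2      # %L below this value: no EED (from the text)
SEVERE_EED_MIN = 0.7  # %L above this value: severe EED (from the text)


def classify_eed(pct_l):
    """Assign an EED category from lactulose permeability (%L)."""
    if pct_l < NO_EED_MAX:
        return "none"
    if pct_l > SEVERE_EED_MIN:
        return "severe"
    # Assumption: values exactly at 0.2 or 0.7 are treated as intermediate;
    # the text does not say how boundary values were handled.
    return "intermediate"


def transform_pct_l(pct_l):
    """Return log2(%L * 100), the transformation used for linear correlations."""
    return math.log2(pct_l * 100)
```

For example, a child with a %L of 0.45 would be placed in the intermediate group, and a %L of 0.32 maps to a transformed value of 5.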

Stool Collection
Fresh stools were collected before the dual sugar absorption testing using a small, clean, nonabsorbent, plastic diaper. The stools were transferred immediately to cryovials and flash-frozen in liquid nitrogen. Samples were transferred to a -80°C freezer and transported to Washington University (St. Louis, MO), where they then were processed and analyzed for the human fecal transcriptome as outlined in Figure 1 and detailed later.

Samples Chosen for Transcriptome Analyses
We chose 259 children for whole-transcriptome analysis on the basis of a mannitol excretion greater than 3%, a total urine volume greater than 15 mL, and a broad distribution of urinary %L values, including normal children. Mannitol excretion was used as a test validation criterion because very small amounts of mannitol absorption indicate very rapid intestinal transit times, which distort the validity of %L as a measure of gut integrity.

Enriching Fecal Samples for Exfoliated Enterocytes by Differential Centrifugation
Fecal samples were enriched for human cells by differential centrifugation before RNA extraction. Approximately 300-500 mg of frozen stool was transferred to a 15-mL conical tube with 10-15 zirconium/silica beads (2.3 mm) and 3 mL of Hank's balanced salt solution (Gibco/Life Technologies, Grand Island, NY) with 0.05% Tween-20 (Sigma, St Louis, MO). The samples were vortexed gently for 5 minutes to suspend aggregates. The buffer volume was increased to 10 mL and incubated at 4°C on a rotator for 10 minutes, followed by centrifugation at 1000 rpm (500g) for 10 minutes. The supernatant was removed and the pellet was resuspended in 10 mL of Hank's balanced salt solution/Tween-20 buffer and incubated and centrifuged as before.

Assaying the Human Fecal Transcriptome on a High-Density Microarray
Human-enriched RNA extracted from differential centrifugation pellets with a minimum of 15 glyceraldehyde 3-phosphate dehydrogenase copies/ng was used for microarray assays. At least 100 ng of DNase-free fecal RNA was amplified with the Ambion WT-plus kit (Ambion/Life Technologies, Grand Island, NY) and hybridized to the GeneChip Human Transcriptome Array 2.0 from Affymetrix (Santa Clara, CA) following the manufacturer's protocols. 24 In total, 263 arrays were analyzed, consisting of samples from 259 different individuals and 4 technical replicates.

Processing Microarray Signals Into Robust Multi-Array Average, Iterative Rank Order Normalization, and Factor Analyses For Robust Microarray Summarization Data Sets
Raw off-scanner microarray intensity data were normalized by 3 standard methods. The 3 methods differ in their assumptions of data distribution and in the method used for background processing, signal normalization, and summarization.
First, robust multi-array average (RMA), the default method, was performed in Affymetrix Expression Console, and involves 3 steps: background correction, quantile normalization, and median-polish summarization. 25 RMA output includes signal intensity values as well as probe-set level detection P values, which are used to filter out individual transcripts with noisy, low-intensity signals.
Figure 1. Fecal samples were collected fresh from subjects, immediately flash-frozen in liquid nitrogen, and transported to the laboratory. In the laboratory the cells were suspended in buffer with inert beads and centrifuged at 500g. The resulting pellet was kept, resuspended in lysis buffer, and used for total nucleic acids extraction. DNase was added to the nucleic acids mixture, and RNA was separated from the suspension using a bead-based affinity method. The RNA then was amplified and hybridized to a chip containing 25mers covering the entire human genome. The signals corresponding to luminescence for each 25mer were aggregated into genes, and normalized using 3 standard methods. Those transcripts that showed significant correlation with %L, a marker of EED, and differential expression with subsets of increased and normal %L were identified. All transcripts then were used to determine pathway expression for all canonical and KEGG pathways. Transcripts that were correlated with %L, differentially expressed between children with no EED and severe EED, and present in pathways also associated with EED were considered to be of biological significance for EED. HTA, Human Transcriptome Array 2.0 (Affymetrix).
Second, iterative rank order normalization (IRON) was performed using libaffy version 2.1.5 (http://gene.moffitt.org/libaffy), and consists of RMA background correction, probe-level IRON, the Tukey bi-weight summarization, and a final probe-set or transcript-level IRON. 26 IRON normalizes through a gradually adjusted subset of invariant features (probe, probe-set, or transcript/gene) in a pair-wise fashion: each individual chip is normalized against the reference median chip, the one with the smallest root-mean-square deviation in the data set. IRON output includes only signal intensity values, and detection calls rely upon the RMA method.
Third, factor analyses for robust microarray summarization (FARMS) was performed using the R package FARMS, and does not correct for background but does normalize to quantiles. 27 Because of a memory allocation issue inherent in the FARMS software, we ran FARMS 10 times for each of 3 randomly grouped subgroups of 259 microarray samples (ie, 30 runs in total). FARMS output includes informative/noninformative calls for genes and probe-sets, in addition to intensity values. The informative/noninformative calls can be used to filter out poorly performing probe-sets and transcripts in the data set.
RMA, IRON, and FARMS data sets each were filtered to exclude microRNA, open reading frame, nonprotein coding, pseudogene, antisense, small nucleolar RNA, and uncharacterized RNA. Transcript clusters for highly variable regions of some genes localized on haplotype chromosomes and unplaced contigs, such as HLA antigen, also were excluded from the analysis. Final analysis thus was performed on 3 transcript-level data sets that each contained log-transformed signal intensities for 18,646 known genes that have a well-annotated official gene symbol.
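Quantile normalization, which is the second RMA step and is also applied by FARMS, forces every array to share one signal distribution. A minimal pure-Python sketch follows. It is not the Affymetrix Expression Console or FARMS implementation; the function name is hypothetical, and ties are broken by position as a simplification.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of arrays (one equal-length list per chip).

    Each value is replaced by the mean, across chips, of the values that
    hold the same rank within their own chip.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across chips.
    reference = [sum(a[k] for a in sorted_arrays) / len(arrays) for k in range(n)]

    normalized = []
    for a in arrays:
        # Indices of this chip's features ordered from lowest to highest signal.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized
```

After this step every chip contains exactly the same set of values, so differences between chips reflect only the rank of each feature.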

Identifying Transcripts Associated With EED by Correlation and Differential Expression
Transcripts correlated to the continuous variable %L were identified by analysis of covariance to 257 normally distributed log 2 -transformed %L values (2 outlier %L values were removed from the total of 259 subjects) using Partek Genomic Suite software, version 6.6 (Partek, Inc, St Louis, MO). Differentially expressed transcripts were identified by analysis of variance between 60 healthy subjects (%L < 0.2) and 42 with severe EED (%L > 0.7) using the R package limma. 28
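The Results report differential expression as a fold-change, with a threshold of an absolute value greater than 1.1. A hedged sketch of how such a value could be derived from log-transformed intensities is shown here. It assumes base-2 logs and a signed fold-change convention (negative values for lower expression in severe EED); neither is stated in the text, and the function names are hypothetical. The P value produced by limma is not reproduced.

```python
from statistics import mean


def signed_fold_change(log2_severe, log2_healthy):
    """Signed fold-change of severe EED vs healthy from log2 intensities."""
    log2_diff = mean(log2_severe) - mean(log2_healthy)
    ratio = 2.0 ** log2_diff
    # Assumed convention: ratios below 1 are reported as negative reciprocals.
    return ratio if ratio >= 1.0 else -1.0 / ratio


def passes_fold_change(fold_change, threshold=1.1):
    """True when the absolute fold-change exceeds the threshold."""
    return abs(fold_change) > threshold
```

Under this convention a transcript whose mean log2 signal is 1 unit higher in severe EED has a fold-change of 2, and one that is 1 unit lower has a fold-change of -2.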

Identifying KEGG and Canonical Pathways Associated With EED
Transcripts that were correlated significantly with %L (analysis of covariance, P < .01) were used to identify canonical pathways associated with EED using the GeneGO web tool MetaCore (Thomson Reuters version 6.21, build 66768, Philadelphia, PA). 29 Fold-change data from differential expression analyses of all transcripts were used to identify enriched KEGG pathways using the R package for generally applicable gene set/pathway enrichment (GAGE). 30 All significant pathways were defined minimally at P < .01 and a false discovery rate less than 0.25.
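The text does not say how the false discovery rate was computed. Assuming the widely used Benjamini-Hochberg procedure, the two-part significance rule (P < .01 and false discovery rate < 0.25) could be sketched as follows. The function names and the pathway labels in the usage note are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values (q-values), in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def significant_pathways(pathway_p, p_cutoff=0.01, fdr_cutoff=0.25):
    """Names of pathways passing both the P value and the FDR cut-offs."""
    names = list(pathway_p)
    q_values = benjamini_hochberg([pathway_p[name] for name in names])
    return [
        name
        for name, q in zip(names, q_values)
        if pathway_p[name] < p_cutoff and q < fdr_cutoff
    ]
```

Calling `significant_pathways({"pathway_a": 0.001, "pathway_b": 0.2})` keeps only `pathway_a`, because `pathway_b` fails the P value cut-off.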

Interpreting Biologically Significant Transcripts and Pathways
Biological significance was defined as statistically significant associations between %L and the normalized luminescence measurements in both IRON and RMA data sets. Common transcripts associated with EED were identified by significance in both correlation and differential expression analyses in both IRON and RMA data sets, and then filtered to include only protein-coding genes detected in more than 10% of the 259 arrays. Common pathways associated with EED were defined by enrichment in both RMA and IRON data sets.
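The selection of common transcripts amounts to a set intersection followed by a detection filter. The sketch below illustrates that logic under stated assumptions: the input layout (one set of gene symbols per analysis and per normalization method, plus a per-gene fraction of arrays with a detection call) and all names are hypothetical, not taken from the authors' pipeline.

```python
def common_transcripts(results, detected_fraction, min_detected=0.10):
    """Transcripts significant in every analysis for both RMA and IRON.

    results: {"RMA": {"correlated": set, "differential": set}, "IRON": {...}}
    detected_fraction: {gene symbol: fraction of arrays where it was detected}
    """
    shared = None
    for method in ("RMA", "IRON"):
        # Significant in both correlation and differential expression analyses.
        hits = results[method]["correlated"] & results[method]["differential"]
        shared = hits if shared is None else shared & hits
    # Keep only genes detected in more than 10% of the arrays.
    return {g for g in shared if detected_fraction.get(g, 0.0) > min_detected}
```

A gene that is significant everywhere but detected in only 5% of arrays would be dropped by the final filter.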
Transcripts that were associated with %L also were tested for association with change in HAZ (dHAZ) over the next 3 months, because one of the primary clinical interests of EED is that it is associated with poor linear growth. Growth data were available for 213 of the 259 subjects, and Spearman correlation analysis was performed on 211 normally distributed dHAZ values (2 outlier dHAZ values were removed) using Partek Genomic Suite software, version 6.6 (Partek, Inc).

Validation of Fecal Transcriptome Results
The reproducibility of microarray signals from fecal extractions was validated with the 4 replicate arrays using the Pearson correlation test and illustrated in scatter plots. Furthermore, signal distribution was compared between fecal microarray data and publicly available colon tissue microarray data (Affymetrix Sample Data) 31 using the Kolmogorov-Smirnov test and shown in histogram (Supplementary Figure 1). Prior qPCR data for 42 genes were available for at least 50 of the 259 subjects, and Pearson correlation analysis was performed between qPCR and transcript level microarray signals for RMA, IRON, and FARMS data sets to validate normalization methods. 21 Additional qPCR assays were performed on 24 of the 51 transcripts identified by the microarray as associated with EED to validate analysis results.
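Both validation statistics named above are simple to compute. The following pure-Python sketch shows a Pearson correlation coefficient, as used for replicate arrays and for qPCR comparisons, and the two-sample Kolmogorov-Smirnov D statistic, as used to compare signal distributions. It is illustrative only; the function names are hypothetical, and P values are not computed.

```python
import math
from bisect import bisect_right


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two ECDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in a_sorted + b_sorted:
        ecdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        ecdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d
```

A D value of 0 means the two signal distributions are identical, and a value of 1 means they do not overlap at all.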

Subjects
A total of 259 rural, asymptomatic, Malawian children at risk for EED were studied (Table 1). The %L was associated with reduced linear growth, expressed as dHAZ in the subsequent 3-month period (Figure 2).

Human Fecal Messenger RNA Is Reproducibly and Reliably Measured by Microarray
Expression of all transcripts was highly correlated (Pearson r > 0.95) in replicate arrays regardless of the normalization method (mean ± SD for RMA, 0.98 ± 0.00; IRON, 0.96 ± 0.01; and FARMS, 1.00 ± 0.00), which is comparable with Affymetrix reference microarray data from colon biopsy specimens (Figure 3). Approximately 80% of the 18,646 transcripts were detectable in at least 10% of 259 samples (Figure 3). More similarity between fecal and colon tissue microarrays was observed in the distribution of signal in RMA and IRON data than in FARMS (Kolmogorov-Smirnov D values: RMA, 0.482; IRON, 0.462; and FARMS, 0.654) (Figure 3). Microarray and qPCR signals also were correlated highly in RMA and IRON normalized data, with significant correlations (P < .05) in 79% and 69% of 42 genes tested, respectively, whereas FARMS signals were less correlated to qPCR (Figure 4).

Microarray Identified Biologically Relevant Transcripts and Pathways Associated With EED
The numbers of transcripts that were correlated significantly to %L (P < .01) using either RMA or IRON signal normalized data or those transcripts that were expressed differentially (P < .05 and absolute value of fold-change > 1.1) between healthy subjects and those with severe EED are summarized in Table 2. Further interpretation of biological significance focused on those transcripts correlated and expressed differentially in both IRON and RMA normalized data sets because both were well validated by qPCR, and indicated a higher similarity in data distribution between good-quality colon RNA and degraded fecal RNA. Fifty-one common significant transcripts were identified as correlated and expressed differentially in EED (Table 3). The gene symbols are defined and further descriptors of these transcripts are listed in Supplementary Table 1. Twenty-four of these also were tested by qPCR, and the 18 that were detectable all correlated highly to microarray signals (Table 4).
Almost all of the 51 transcripts code for immunologically active proteins, such as IgG or IgE, or for cytokines that modulate the immune response. The molecules encoded include proteins that are made in response to members of various microbial kingdoms, including parasites, bacteria, and viruses (Table 5). Among the 51 transcripts are 6 that code for proteins that affect cell adhesion between epithelial cells. There was a paucity of transcripts that code for structural proteins or enzymes believed to be unique to the small intestine.
Common pathways associated with EED were identified by enrichment in both IRON and RMA data sets, and consist of 6 GeneGO canonical pathways and 15 KEGG signaling pathways (P < .01, false-discovery rate < 0.25) that are related predominantly to cell adhesion and immunologic responses (Figure 5 and Supplementary Tables 2 and 3). A subset of 12 of the 51 common transcripts associated with EED map to significantly enriched common KEGG pathways and include chemokines that stimulate T-cell proliferation, Fc fragments of multiple immunoglobulin families, interferon-induced proteins, activators of neutrophils and B cells, and mediators that dampen cellular responses to hormones (Table 6 and Figure 6).
Four mucins (MUC2, MUC4, MUC12, and MUC20), epidermal growth factor receptor, and 3 mitogen-activated protein kinases (MAPK7, MAPK8IP1, and MAPK8IP2) were correlated negatively with %L (P < .05 for the Pearson correlation coefficient and the Spearman correlation coefficient using either the IRON or the RMA data set). These 8 proteins, each of which is relevant to mucus biology, are remarkable in that almost all of the other correlations with %L are positive, thereby denoting transcripts that are more abundant with EED. In addition, a negative regulatory transcription factor in goblet cells (recombination signal binding protein for immunoglobulin kappa J region) shows a highly significant correlation with %L (P < .01).
Abbreviation (table footnote): HIV, human immunodeficiency virus.
Figure 4. Of the 42 transcripts that were assessed by both microarray and PCR, significant correlations were found in 36 of them by one or more normalization methods. The transcripts were not chosen simply for their association with EED, because some were not associated, but to assess the accuracy of the microarray across the spectrum of protein coding genes.
Figure 3. (A) Scatter plots of technical replicates showing high reproducibility of the microarray data generated using fecal RNAs. These data were quite comparable with those generated using high-quality colon RNA (colon tissue RNAs were adopted from Affymetrix publicly available Sample Data). Note that the FARMS summarized data appear to show a substantial level of compression. Conversely, a somewhat higher degree of variation was noted within the IRON normalized data. (B) Histogram of signal detection level showing that microarray technology is reliable in the detection of low copy numbers of fecal RNAs. We calculated the detection of transcript clusters (genes) based on the P values reported at probe-set level for intensity data (there were no P values reported for the transcript cluster level of intensity data). First, the total number of detected probe-sets for a given transcript cluster was counted across the entire 259 chips at P < .05, then this number was divided by the total number of probe-sets on a chip for this given transcript cluster. Approximately 80% of the 18,646 known genes were detectable in at least 10% of 259 samples.
Linear growth data over the subsequent 3 months after stool sampling were available from 213 of the 259 children, and expressed as dHAZ to normalize for age. Among the 51 common transcripts associated with %L, 17 also were correlated with dHAZ after normalization with either RMA or IRON (AQP9, CLEC7A, FCGR2A, FCGR3B, IFITM1, IFITM2, IFITM3, LYN, LYZ, MNDA, MSN, NCF2, PLEK, PROK2, S100A8, SAMSN1, and SELL) (Table 3). Among the 12 genes that reside in KEGG pathways that are overexpressed, 6 correlated with dHAZ.

Discussion
In this study the human transcriptome was assessed in individual fecal samples. Previously, host fecal RNA analyses had been performed in samples that were aggregated from similar subjects, to allow for larger amounts of RNA available for analyses, 32,33 or in analyses that targeted specific loci. 21,34 Our findings suggest that the 25mer, high-density microarray technology coupled with careful fecal specimen collection and a conservative RNA isolation method allows for interrogation of the gut transcriptome. The extensive, whole-human-transcriptome nature of the read-outs, together with the ability to quantify signals and discern nonrandom pathways, lends credence to using transcript capture, rather than sequencing or more tedious and potentially biased specific transcript quantitative PCR, for host organ analysis in stool.
The primary limitation of our methodology was that we did not directly validate the read-outs with transcriptome analyses from biopsy specimens. This would have been impossible in rural Africa, and incompatible with the amplification we used on the fecal specimens. We did not observe many reductions in gene and pathway expression that were associated with EED; this might suggest that our analyses were operating at the edge of the detection limit.
Greater sensitivity might have allowed us to detect changes in hormones and cytokines that adversely affect linear growth, although it is not clear that the gut epithelium is the site from which growth-affecting transcriptional responses are generated. We also recognize that our findings are from rural African children consuming a plant-based diet without public sanitation services, and we do not know if these data and this technology can be applied to other populations.
The greatest challenge in this work was to identify a technique that would quantify components of the human fecal transcriptome accurately, given the paucity of specific human messenger RNA (mRNA) in any one sample. Host mRNA is overwhelmed by a much larger population of bacterial transcripts. Further RNA enrichment after extraction, using polyA selection and ribodepletion, did not increase the sensitivity in our microarray method, and risked the loss of target. We assume that, in addition to being present in low numbers, human host transcripts were likely to be fragmented, having passed through a potentially harsh milieu in the gut. Perhaps by avoiding capture-based enrichment early in the process, and using arrays as the sole, end-preparation, hybridization step before enumeration, we retained more of the fragments of human mRNAs. This yield therefore might produce sufficient human mRNAs in our samples to anneal to the high-density microarray and to provide reproducible transcript-level signals.
The bias introduced by sequence amplification was a limitation of our data. The low copy number of transcripts present in fecal samples requires amplification to detect them reproducibly by microarray. It is well known that amplification introduces bias on the basis of the probe length, GC content of the probe, and hybridization preference for certain sequences. 35 These biases prevent us from comparing our data with those from samples that do not require amplification, such as bowel biopsy specimens. However, amplification was applied equally to our entire data set, thus comparisons between samples from children with and without EED are likely to be informative. RNA sequencing and microarray hybridization both use amplification, and thus both incur these biases.
We used the Affymetrix default data processing and normalization algorithm (RMA-quantile) for our microarray analysis. In light of the presumed degraded nature of fecal RNAs, we also used 2 alternative, but complementary, signal intensity normalization methods (IRON and FARMS) in our analyses. RMA quantile normalization relies on the assumption of Gaussian distributions of data, IRON performs pair-wise intensity normalization without the assumption of Gaussian distribution, and FARMS does not apply background subtraction but summarizes probe-level intensities based on a linear model with Gaussian noise and Bayesian maximum a posteriori assumptions. FARMS identified just 12 significant transcripts in the differential analysis and no significant pathways in association with EED, suggesting that when samples with low quantities of poor-quality RNA are analyzed, meaningful data are lost when summarizing signals based solely on a linear model. Methods of signal normalization better suited for interpretation of analyses from specimens with low transcript copy numbers need to be developed.
A framework to understand EED emerges from these data, which is summarized in Figure 7. A disrupted mucous layer allows luminal microbes to inflame the mucosa, creating a chronic inflammatory state. The host response is perpetuated by the steady stream of microbes present in the contaminated environments of these rural African children.
Abbreviations (table footnote): ANCOVA, analysis of covariance; ANOVA, analysis of variance; FDR, false-discovery rate; GAGE, generally applicable gene set enrichment for pathway analysis; MetaCore, integrated software for functional analysis.
NOTE (table footnote). Transcripts with differential expression: healthy %L less than 0.2 vs severe EED %L greater than 0.7; Pearson correlations with %L; and Spearman correlation with change in height-for-age Z score in the subsequent 3-month period.
A relatively high proportion of the transcripts showing a correlation with EED (Table 6) are associated with myeloid (monocyte, macrophage, dendritic cell, and neutrophil) function. More specifically, a substantial number of these genes are linked to granulocyte colony-stimulating factor (G-CSF) signaling within these and other cell types. CSF 3 receptor is the primary high-affinity receptor for G-CSF and uses the Janus kinase/signal transducer and activator of transcription and V-Yes-1 Yamaguchi Sarcoma Viral Related Oncogene Homolog signal transduction pathways. [36][37][38][39] Furthermore, SOCS3, BCL2A1, and CXCR2 are induced in response to G-CSF activity. G-CSF pathway activation includes dendritic cell differentiation, neutrophil mobilization, and, more generally, cell survival. G-CSF might protect against infection and also serve survival/repair functions in tissues including the intestine. Given the compromised barrier cell junctions as well as the diffuse villous atrophy observed in EED, it is quite plausible that the G-CSF increase is a compensatory and appropriate response to microbial threat because it augments defense against bacterial translocation and promotes villous repair.
The KEGG pathway analyses suggest that there are immune responses to a diversity of microbes in EED. The role of host-microbial interactions in the duodenum and jejunum currently may be underappreciated because laboratory methods of assessing the bacterial component of the microbiota, predominant in the colon, are so widely used. The expression of genes and the activation of pathways that promote cell adhesion and phagocytosis suggest that in EED the host is endeavoring to clean up and repair damaged paracellular junctions in the duodenum and jejunum. The diversity of genes and pathways activated in EED supports the speculation that the etiology is multifactorial. 40 The diversity of immune responses found concurrently in the transcriptome analyses suggests that the mucosa of the small intestine is not shielded from the many microbes in the gut lumen; this is in contrast to gut infections caused by a particular microbe, during which we might expect to see a stronger, more specific immune response. Goblet cells typically respond to inflammatory stimuli with increased mucus secretion. We observed that there was a reduction in transcripts coding for mucin, the protein core of the mucus layer, and the protein kinases that confer the barrier properties to the mucous layer. All of the mucin genes identified with reduced expression are present in the small bowel. 41,42 The failure of rifaximin to ameliorate EED 15 suggests that EED is not simply a condition of overgrowth of bacteria in the small intestine, or of infection with organisms susceptible to this antibiotic. Taken together, these data support speculation that EED is the result of, or at least accompanied by, inadequate mucus secretion of the duodenum and jejunum. Our transcriptome findings are summarized in a cartoon in Figure 7.
In addition, penetration of the epithelial paracellular junctions by nonviable vesicles secreted by bacteria evokes a strong inflammatory response, and this possibility also should be considered in EED. 43 Of the 51 transcripts in which there is increased expression in EED, 29 are reported to respond to viral infection. We speculate that virus presence and infection might play a role in EED, and as our ability to characterize and understand the role of viruses in the duodenum and jejunum increases, researchers should investigate this possibility further. Indeed, children from resource-poor regions have a much more diverse intestinal virome than those from high-income settings, and our data suggest that the childhood gut might be responding to these agents. 44 The association of 33%-50% of the EED-associated transcripts with subsequent linear growth is consistent with the association between %L and linear growth in this population.
Finally, the list of transcripts differentially expressed in EED may provide direction to those seeking a better biomarker for this elusive, formidable scourge of children in the developing world. The opportunity to interrogate the host transcriptome of the gastrointestinal tract through a fecal specimen may be exploited to develop biomarkers for other inflammatory and carcinogenic diseases of the gut in the future. Our data also showed the feasibility and potential superiority of direct from extract (ie, does not require additional treatment or manipulation as in previous protocols) immobilization of RNA on a support platform, on which transcripts can be quantified directly. It is possible that this methodology optimizes yield, and purity, of the molecules of interest (ie, human mRNAs) in the complex milieu of the fecal biomass.
Figure 5. (A) Thirteen common KEGG pathways identified within both RMA- and IRON-normalized microarray intensity data. Numbers after the titles of pathways in parentheses are the number of genes in the data set that were mapped to the given pathways. The significant genes shown are those with an absolute fold-change greater than 1.1 at P < .05 in differential analysis. The percentage of up-regulation was calculated using mean fold-change values of significant genes divided by mean fold-change values of nonsignificant genes on the pathways. The -log10 (P value) was from pathway analysis, indicating the statistical significance. (B) There are 6 common canonical pathways identified using both IRON- and RMA-normalized microarray data. The analysis was performed on genes with a significant correlation between signal intensity and %L value at P < .01. The numbers following the titles of pathways are the number of genes in the maps of given pathways. These pathways were significant at P < .01 and a false-discovery rate less than 0.25, and the log (P values) are shown in the dotted red lines. The genes with positive correlation coefficients are shown in gold, and the genes with negative correlation coefficients are shown in blue. COPD, chronic obstructive pulmonary disease; Fc, fragment crystallizable region; HIF-1, hypoxia-inducible factor 1; NF, nuclear factor; NOD, nucleotide-binding oligomerization domain; RI, Fc epsilon RI or high-affinity IgE receptor; TNF, tumor necrosis factor.
Figure 6. Heat map for 12 common differentially expressed significant genes, also mapped to significant KEGG pathways, correlated to %L in both IRON- and RMA-normalized microarray expression data.
Raw microarray data are at Geo Link upon request from the corresponding author.

Conflicts of interest
This author discloses the following: Since this work was completed, Erica Barnell started a company, Geneoscopy, which uses RNA detection in feces. The remaining authors disclose no conflicts.

Funding
Supported by the Bill & Melinda Gates Foundation and Affymetrix (Santa Clara, CA), and by National Institutes of Health grant P30DK052574 (Administrative and Resource Access Care) (P.T.). Phillip Tarr is a coinventor on a device to noninvasively measure gut permeability. The findings and conclusions contained within are those of the authors and do not necessarily reflect the positions or policies of the Bill & Melinda Gates Foundation or Affymetrix.