Genome-Wide Methylation Profiling in 229 Patients With Crohn’s Disease Requiring Intestinal Resection: Epigenetic Analysis of the Trial of Prevention of Post-operative Crohn’s Disease (TOPPIC)

Background & Aims: DNA methylation alterations may provide important insights into gene-environment interaction in cancer, aging, and complex diseases, such as inflammatory bowel disease (IBD). We aim first to determine whether the circulating DNA methylome in patients requiring surgery may predict Crohn's disease (CD) recurrence following intestinal resection; and second to compare the circulating methylome seen in patients with established CD with that we had reported in a series of inception cohorts.

Methods: TOPPIC was a placebo-controlled, randomized controlled trial of 6-mercaptopurine at 29 UK centers in patients with CD undergoing ileocolic resection between 2008 and 2012. Genomic DNA was extracted from whole blood samples from 229 of the 240 patients taken before intestinal surgery and analyzed using 450K HumanMethylation and Infinium Omni Express Exome arrays (Illumina, San Diego, CA). Coprimary objectives were to determine whether methylation alterations may predict clinical disease recurrence; and to assess whether the epigenetic alterations previously reported in newly diagnosed IBD were present in the patients with CD recruited into the TOPPIC study. Differential methylation and variance analysis was performed comparing patients with and without clinical evidence of recurrence. Secondary analyses included investigation of methylation associations with smoking, genotype (MeQTLs), and chronologic age. Validation of our previously published case-control observation of the methylome was performed using historical control data (CD, n = 123; Control, n = 198).

Results: CD recurrence in patients following surgery is associated with 5 differentially methylated positions (Holm P < .05), including probes mapping to WHSC1 (P = 4.1 × 10^-9, Holm P = .002) and EFNA3 (P = 4.9 × 10^-8, Holm P = .02). Five differentially variable positions are demonstrated in the group of patients with evidence of disease recurrence including a probe mapping to MAD1L1 (P = 6.4 × 10^-5). DNA methylation clock analyses demonstrated significant age acceleration in CD compared with control subjects (GrimAge +2 years; 95% confidence interval, 1.2-2.7 years), with some evidence for accelerated aging in patients with CD with disease recurrence following surgery (GrimAge +1.04 years; 95% confidence interval, -0.04 to 2.22). Significant methylation differences between CD cases and control subjects were seen by comparing this cohort in conjunction with previously published control data, including validation of our previously described differentially methylated positions (RPS6KA2 P = 1.2 × 10^-19, SBNO2 P = 1.2 × 10^-11) and regions (TXK [false discovery rate, P = 3.6 × 10^-14], WRAP73 [false discovery rate, P = 1.9 × 10^-9], VMP1 [false discovery rate, P = 1.7 × 10^-7], and ITGB2 [false discovery rate, P = 1.4 × 10^-7]).

Conclusions: We demonstrate differential methylation and differentially variable methylation in patients developing clinical recurrence within 3 years of surgery. Moreover, we report replication of the CD-associated methylome, previously characterized only in adult and pediatric inception cohorts, in patients with medically refractory disease needing surgery.

DNA methylation is an important epigenetic mechanism that associates with alteration in gene expression with no underlying change in the genetic code. DNA methylation changes have been implicated in cancer; aging; 1-4 and many complex diseases, including inflammatory bowel disease (IBD). 5,6 In our original studies, we described the circulating "methylome" in patients with IBD and control subjects, 7,8 including in a large inception cohort of newly diagnosed patients. 9 These methylation differences across the genome in peripheral blood leukocyte DNA correlate with known clinical parameters of inflammation, but importantly relate to underlying genotype. A key potential importance of DNA methylation changes relates to an association with alteration of gene expression. We were able to demonstrate the appropriate inverse relationship between methylation and gene expression, in a cell-specific manner in separated circulating leukocytes. 9 Most recently, we have provided strong replication of these methylation signals in a large inception cohort of patients with IBD recruited across Northern Europe, and replication of some signals in Southern Europe. 10 Although genome-wide methylation differences have been demonstrated between IBD cases and control subjects, identifying methylomic differences between IBD subphenotypes is more nuanced.
Multiomic data have also been used to prognosticate in IBD, attempting to delineate patients at risk of severe disease phenotype requiring surgery or more intensive drug regimens. 11-15 Using an unsupervised clustering method in our index study of an inception cohort of patients with Crohn's disease (CD) and ulcerative colitis, we identified groups of patients potentially at higher risk of surgery or treatment escalation. 9 In a large treatment-naive inception cohort in Europe, we identified 3 methylation probes (TAP1, TESPA1, RPTOR) that associated with the need for treatment escalation to biologic agents or surgery. 10

Patients with CD have a high lifetime risk of surgery for refractory or complicated disease. Approximately half of patients undergo surgery within 10 years of diagnosis; 16 however, with the introduction of newer biologic treatment, surgery rates seem to be falling. 17 The TOPPIC trial sought to determine the efficacy of 6-mercaptopurine (6-MP) in prevention of the recurrence of disease following ileocolic resection. 18 Two hundred and forty patients were randomized across 29 UK centers to receive 6-MP or placebo following ileocolic resection for CD. The primary end point was a composite clinical end point that included an increase in Crohn's disease activity index score, requirement for treatment escalation, or further surgery. The trial showed a modest benefit with 6-MP treatment versus placebo for the primary clinical end point (hazard ratio, 0.54; 95% confidence interval [CI], 0.27-1.06). There was a more pronounced benefit for 6-MP for smokers (hazard ratio, 0.13; 95% CI, 0.04-0.46). 18

The coprimary aims of the present study were to determine whether circulating DNA methylation differences in patients before surgery differ between patients with and without evidence of clinical or endoscopic recurrence following surgical resection; and to extend our observations of methylation alterations made in inception cohorts of newly diagnosed patients by studying an independent cohort of patients with established CD requiring surgery (Figure 1).

Participants, Demographics, Data Processing, and Quality Control
There were 233 TOPPIC samples available for analysis with no samples failing quality control. Patient demographic information is presented for TOPPIC participants in Table 1. Data processing procedures demonstrated visually improved characteristics on density plots (Figure 2A-D) and multidimensional scaling (MDS) plots (Figure 2E-G). After filtering, 429,944 probes were available for analysis. No samples failed sex check (Figure 2F). QQ plots, lambda values, and clustering of cohorts on MDS plots improved following ComBat correction (for both array number and intra-array position; Figure 3). Four TOPPIC patients had missing outcome data and were excluded from disease recurrence analyses.

Abbreviations used in this paper: 6-MP, 6-mercaptopurine; AHRR, aryl hydrocarbon receptor repressor; CD, Crohn's disease; CI, confidence interval; DMP, differentially methylated position; DMR, differentially methylated region; DVP, differentially variable position; FDR, false discovery rate; IBD, inflammatory bowel disease; MDS, multidimensional scaling; meQTL, methylated quantitative trait loci; SNP, single-nucleotide polymorphism.
The biologic and functional relevance of the DMPs and DVPs associated with CD recurrence following surgery is outlined in Table 5.
Methylated Quantitative Trait Loci. There were 216 samples with paired methylation and genotype data available for methylated quantitative trait loci (meQTL) analysis. The 5 DMPs and 5 DVP methylation probes were investigated for genotype association (meQTLs) using age, sex, and smoking status as covariates. There were 35 cis meQTLs with a false discovery rate (FDR) P < .05, consisting of 35 different single-nucleotide polymorphisms (SNPs) and 7 of the 10 CpGs (Figure 9,

Table footnote: Defined as increase in Crohn's disease activity index (CDAI) of more than 150 and an increase of 100 points from baseline measurement and institution of immunosuppressive treatment or further surgery.

Differentially Variable Positions (CD vs BIOM Control Subjects). Differential variability was performed comparing CD cases (BIOM CD and TOPPIC) versus control subjects (BIOM control subjects) using the iEVORA method. 28 There were 18,993 DVPs hypervariable in CD compared with BIOM control subjects. Previously described IBD-associated DMPs were included as DVPs (SBNO2, var

Smoking and Epigenetic Age
Smoking. We performed a methylation analysis of smokers versus exsmokers and nonsmokers using the combined cohort (n = 554, regardless of case or control status). There were 169 methylation probes that associated with smoking (Holm-corrected P < .05). Aryl hydrocarbon receptor repressor (AHRR) methylation has been strongly associated with smoking status and we confirm hypomethylation in current smokers (cg05575921, beta difference -10.8, Holm adjusted P = 5.46 × 10^-45; Figure 7A) with 5 AHRR probes in the top 20 most significant probes (cg05575921, cg21161138, cg26703534, cg14817490, cg25648203; Table 10). Of the 169 significant probes, 137 (81%) have previously been described by Gao et al 29 in a meta-analysis of smoking-related probes. There was a modest but significant correlation between the log fold differences in beta values observed here and those published by Gao et al 29 (Figure 7B).
To delineate CD-specific smoking-associated methylation we then analyzed smoking-related methylation in CD cases (n = 356) and control subjects (n = 198) separately. There were 9 CpGs associated with smoking in patients with CD that did not overlap with the control or combined cohort or CpGs that had previously been described by Gao et al 29 (cg24497361, cg17777683, cg03088955, cg01218206, cg05895711, cg08006672, cg09273683, cg18688062, cg21963318; Figure 7C). When comparing these CD-specific smoking-associated CpGs, there were 4 overlapping probes compared with the CD case control DMPs described in the replication analyses later (cg14753356, cg00295485, cg21963318, cg130338858; Figure 7D). The functional relevance of these smoking-related methylation probes is detailed in Table 11.
Epigenetic age acceleration is demonstrated in patients with CD compared with control subjects using all clocks (Figure 8B). When comparing age acceleration in newly diagnosed patients with CD in the BIOM cohort with that in patients with established disease requiring surgery in the TOPPIC cohort, there was some evidence of age acceleration in those requiring surgery using the DNAmAge clock, deceleration using the GrimAge clock, and no difference when using the other 3 clocks. The GrimAge clock also demonstrated some evidence of age acceleration in patients with disease recurrence following surgery compared with those without recurrence (+1.04 years; 95% CI, -0.04 to 2.22; P = .09; Figure 8C). GrimAge acceleration strongly associated with smoking status (Figure 8D), but not with inflammatory markers (C-reactive protein: r = 0.03, P = .6; albumin: r = 0.08, P = .2).

Discussion
This study presents a detailed DNA methylation analysis from a multicenter UK randomized controlled trial. We demonstrate differential methylation and differentially variable methylation in patients developing CD recurrence following surgery. Furthermore, the results strongly validate our previous studies, 7-10 which described methylation differences in IBD cases versus control subjects and had involved newly diagnosed patients, rather than those with established disease.

Prediction of CD Recurrence Following Surgery
DMPs. The present study includes a unique and homogeneous cohort of patients with CD sampled before surgical resection and followed up within the rigorous confines of a randomized controlled trial with accurate clinical and endoscopic follow-up data. A smaller study with a similar cohort of patients postresection for ileal CD did not demonstrate systemic differences in DNA methylation in those experiencing a recurrence. 42 We demonstrate 5 significant DMPs following stringent correction for multiple testing. The significant DMPs include a probe mapping to EFNA3 (ephrin A3), part of the ephrin/Eph receptor tyrosine kinase signaling family that plays a role in maintaining gut epithelial integrity and T-cell activation 21 and has been implicated in CD 28 and ulcerative colitis. 23 The ephrins have been postulated as potential therapeutic targets in CD. 24 WHSC1 (also known as NSD2, nuclear receptor binding SET domain protein 2) is associated with Wolf-Hirschhorn syndrome. Notably, the methylation probe lies close to a proinflammatory microRNA (mir-943). 19

DVPs. Most epigenome-wide association studies have focused on case-control quantitative differences in DNA methylation at specific sites (DMPs). In the context of complex diseases such as IBD, the absolute differences in mean DNA methylation are often small (<5%), with unclear biologic consequence. There has been interest in measuring DNA methylation variability, or the pattern of variance at these sites. DVPs have been described as heterogeneous outlier events, first in cancer but increasingly in complex diseases, including twin studies of type 1 diabetes mellitus and rheumatoid arthritis. 43,44 We have identified 5 DVPs associated with disease recurrence following surgery. The most interesting DVP maps to MAD1L1 (mitotic arrest deficient 1 like 1), which acts at the spindle assembly checkpoint between metaphase and anaphase. MAD1L1 was a key finding in our previous work as a DMP that demonstrates IBD-specific appropriate inverse correlation between methylation and gene expression. 10 MAD1L1 differential methylation has additionally been seen at the gut level, within intraepithelial cells in ulcerative colitis. 25 The biologic significance of differential variability of methylation has not been well delineated. Unlike DMPs, DVPs lack clinical utility as biomarkers because this technique relates to groups rather than individual patients.
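To make the idea of differential variability concrete, the following is a minimal sketch of the variance-comparison step only, written in Python with the standard library. It is not the iEVORA implementation used in this study (which was run through the row_ievora() function in R and additionally ranks probes by a t-statistic); the function name and the beta values are hypothetical and chosen so that the two groups have similar means but very different spread.

```python
import math
from statistics import variance

def bartlett_two_groups(a, b):
    """Bartlett's test for equal variances in two groups.

    Returns (statistic, p_value). With two groups the statistic is
    referred to a chi-square distribution with 1 degree of freedom.
    """
    n1, n2 = len(a), len(b)
    s1, s2 = variance(a), variance(b)          # sample variances (n - 1 denominator)
    n, k = n1 + n2, 2
    pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n - k)
    numerator = (n - k) * math.log(pooled) - (
        (n1 - 1) * math.log(s1) + (n2 - 1) * math.log(s2)
    )
    correction = 1 + (1 / (3 * (k - 1))) * (
        1 / (n1 - 1) + 1 / (n2 - 1) - 1 / (n - k)
    )
    stat = max(numerator / correction, 0.0)    # guard against tiny negative rounding
    # Chi-square survival function for 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical beta values at one CpG (illustration only, not study data)
recurrence = [0.40, 0.55, 0.30, 0.70, 0.45, 0.62, 0.25, 0.68]
no_recurrence = [0.48, 0.50, 0.49, 0.51, 0.50, 0.52, 0.47, 0.50]

stat, p_value = bartlett_two_groups(recurrence, no_recurrence)
print(f"Bartlett statistic = {stat:.2f}, P = {p_value:.2g}")
```

In this toy example the group means are almost identical, so a test of mean difference would find little, whereas the variance test flags the site. That is the sense in which a DVP describes a group-level pattern rather than a per-patient marker.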
meQTLs. Our group and others have previously demonstrated that genetic variation between IBD cases and control subjects relates to differential methylation, 9,10,13 raising the possibility that methylation may be a mediator of genetic susceptibility. Key DMRs including VMP1 and ITGB2 have been shown to be meQTLs. 8,10 In the present study, there was a cis-genetic association in 8 of 10 methylation sites of interest (5 DMPs and 5 DVPs). Three meQTLs were associated with disease outcome (cg00475456, cg18068256, cg24864518; Figure 10); however, it is likely that these associations are driven by small differences in allele frequency between patients with and without disease recurrence.
Smoking. There is a very strong relationship between smoking and CD susceptibility, 45 behavior, 46 and postsurgical recurrence; 47 indeed, in the TOPPIC trial, smoking habit was not only a determinant of recurrence but was also unexpectedly associated with the efficacy of thiopurine therapy. 18 The mechanism is uncertain, but given the significant effects of smoking on DNA methylation, 29,30,32 the relationship between smoking, CD, CD recurrence after surgery, and DNA methylation is of particular interest. Using the entire cohort (CD and control subjects), we were able to replicate the previously published smoking-related methylation probes 29,30,32 and correlate beta fold differences between smokers and nonsmokers in ours and published series. 38 AHRR methylation has been strongly associated with smoking status and we confirm hypomethylation in current smokers (beta difference -10.8; Holm adjusted P = 5.46 × 10^-45) with 5 AHRR probes in the top 20 most significant probes. We then looked to identify smoking-associated probes that were present in patients with CD (and not control subjects). There were 3 CD-specific smoking-related probes that had not been associated with smoking in other published series. One probe mapped to JOSD1 (cg03088955), a deubiquitinating enzyme with a role in autophagy, 33 and another mapped to PIP4KA2 (cg09273683), a gene with an SNP that was found to be an environmental interactor between smoking and colorectal cancer. 35

Epigenetic Clock. DNA methylation data can be used to predict the biologic age of patients/samples, and DNAm age acceleration is associated with mortality and a poorer prognosis in a range of conditions. 48,49 In the present dataset, we have used an online tool (Clock Foundation) to calculate epigenetic age using a range of more recently developed methylation clocks. We observe DNAm age acceleration in patients with CD compared with control subjects, replicating the same finding in our previous work. 10 Using the GrimAge clock we also demonstrate some evidence of epigenetic age acceleration in patients with CD recurrence following surgery, a finding not observed when using the other clocks. GrimAge may outperform the other clocks when predicting all-cause mortality and other age-related morbidity (healthspan). 50 The GrimAge clock was developed to include DNAm-based surrogate markers for smoking and other plasma proteins. 41 Epigenetic age acceleration occurs following major surgery, in particular following emergency hip fracture surgery, but returns to baseline 4-7 days following surgery. 51,52 Of more relevance, elective colorectal surgery was not associated with epigenetic age acceleration. 51,52 GrimAge acceleration associating with smoking and CD recurrence, but not traditional markers of inflammation, is particularly interesting given that smoking was found to be an important factor for disease recurrence in the original TOPPIC study.

Replication of CD Versus Control Subjects (Case vs Control)
A significant strength of this large DNA methylation dataset was the ability to validate our previous findings of differential methylation occurring in IBD cases and control subjects. 9 Critically, this demonstrates validation in a distinct cohort of patients recruited across multiple sites across the United Kingdom. Whereas our previously published case-control analyses involved newly diagnosed patients, 9 the TOPPIC cohort consists of patients with established disease. Using TOPPIC data we replicated our previous key DMRs TXK (FDR P = 3.6 × 10^-14), WRAP73 (FDR P = 1.9 × 10^-9), VMP1 (FDR P = 1.7 × 10^-7), and ITGB2 (FDR P = 1.4 × 10^-7). Data from the RISK cohort, a treatment-naive pediatric inception cohort, demonstrated a tendency for most methylation signals to revert following treatment, 13 notably with the exception of IBD-associated RPS6KA2 hypomethylation, a finding replicated using this novel cohort (Holm adjusted P = 1.2 × 10^-19). Data from this present study suggest that these methylation findings may either endure from diagnosis or, alternatively, be present at diagnosis, resolve in remission, and recur in patients with uncontrolled disease, reflecting active inflammation at the time of sampling. Although the present study cannot address these issues, longitudinal analyses suggest that for most of the loci, resolution may occur with disease control; in a small proportion, including notably RPS6KA2, the changes may be constant regardless of inflammatory status. 13 This area is under further analysis.

Table notes: Feature, location of methylation probe in relation to nearby gene on the 450K annotation manifest; logFC, log fold change; sym, gene symbol associated with methylation probe on the 450K annotation manifest.

Table fragment: biologic and functional relevance of probes associated with CD recurrence (probe; gene; relevance; rows as recovered).
cg24864518; *; This probe maps to an intergenic region close to the TSS of RASGEF1b, a guanine nucleotide exchange factor for Rap2, a member of the family of Rap G-protein signallers. 20
cg06058618; EFNA3; Ephrin A3. Tyrosine kinase family of receptors. Ephrin-mediated repulsion of cells has a role in maintaining the integrity of the gut epithelial layer and may modulate T-cell activation. 21 Has previously been implicated in Crohn's disease 22 and ulcerative colitis, 23 and has been postulated as a potential therapeutic target in Crohn's disease. 24 Also extensively implicated in gastric and hepatocellular cancers. Target of miR-210-3p.
cg23939096; *; Maps to a noncoding area.
cg25981920; *; Maps to a noncoding area close to LY6L (lymphocyte antigen 6 family member).
Differentially variable probes
cg24696067; MAD1L1; Mitotic arrest deficient like 1 acts as a spindle assembly checkpoint between metaphase and anaphase. MAD1L1 was a key finding in our previous work that demonstrated IBD-specific correlation between DNA methylation and gene expression. 10 A different probe mapping to MAD1L1 was differentially methylated in colonic intraepithelial cells in UC. 25 Because of its role in regulating the cell cycle, MAD1L1 is also implicated in a variety of cancers.

Differentially Variable Positions
In this study, we describe CD-associated differentially variable methylation for the first time in IBD versus control subjects. The enrichment of DMPs and DMRs among the DVPs is an artefact of the analytical technique, because the iEVORA method ranks DVPs higher if they are also DMPs at genome-wide significance level, or as close as possible to being a DMP. 28 Variable methylation has been hypothesized to account for differences in disease susceptibility among individuals and between ethnicities. 53 It has been noted in healthy individuals that there is higher variability in specific regions of the genome, in particular in immune-related pathways, and low variability in highly conserved regions associated with basic cellular functions. 54 The pathobiologic significance of the DVPs described here warrants further investigation.

Figure 8 (caption, beginning not recovered). ... 40 Hannum, 41 tissue specific (skin and blood clock), 21 phenoAge, 42 and GrimAge clocks. 22 (A) Correlation plot of methylation age (y-axis) and biologic age (x-axis) using the methods above; inset, density plot of methylation age. Cor, Pearson's r correlation estimate. (B) Boxplots of age acceleration using the methods above in patients with Crohn's disease requiring surgery (CD_TOPPIC), newly diagnosed Crohn's disease patients (CD_BIOM), and control subjects. (C) Boxplots of age acceleration in patients included in the TOPPIC trial who went on to develop recurrence or no recurrence following surgery. (D) Boxplot for each methylation clock age acceleration and smoking status: current, exsmoker (recorded in the BIOM cohort), exsmoker/never smoker (grouped together as part of the TOPPIC cohort), and never smoked (recorded in the BIOM cohort). Ns = P > .05, *P < .05, **P < .01, ***P < .001, ****P < .0001 (Wilcoxon test).

Strengths and Limitations
This is a large dataset of phenotypically homogeneous patients with IBD with established disease and provides complementary information to our previously published work in newly diagnosed patients. The combined datasets provide one of the largest series of genome-wide DNA methylation data in CD to date and provide compelling replication of our previous key findings in a novel dataset of patients with established disease. The TOPPIC trial was a well-conducted randomized controlled trial performed across multiple sites across the United Kingdom with well-phenotyped data and accurate follow-up data to 3 years. Raw data were normalized together and included more than 40 technical replicate samples performed across chip positions and across separate methylation runs for each separate cohort (TOPPIC, BIOM), with appropriate clustering on MDS plots increasing the confidence of performing analyses across cohorts (Figure 5), limiting the impact of the control samples arising from 1 of the 2 datasets. Notwithstanding this, novel DMPs described in the TOPPIC CD versus BIOM control subjects require further replication. Despite rigorous correction and technical replicates, results from this analysis are likely to be inflated, as noted by the number of positive DMPs in the TOPPIC CD versus control subjects being higher than in the combined analysis. The blood sample used for methylation analysis was taken before administration of the study treatment (6-MP) or placebo; treatment allocation will therefore not affect the methylation data itself but may affect the studied outcome of disease recurrence (despite nonstatistically significant findings in the original randomized controlled trial). RNA was not available to attempt to associate differential methylation variance and expression.

Figure 9. cis meQTLs of DVP/DMP probes. Top SNP shown. Age, sex, smoking status used as covariates. MAF of <10% filtered. Cis distance 1 × 10^6, P value threshold <2 × 10^-6.

Conclusions
We identify methylation changes present at the time of surgery that are associated with future CD recurrence within 3 years. The 5 site-specific differences (DMPs) and 5 DVPs associate with the underlying genotype and relate to genes with biologic relevance to CD. Given the relationship between smoking, methylation, and IBD, we have identified CD-specific smoking-related methylation sites. Replication of the CD-associated methylation alterations, previously characterized only in adult and pediatric inception cohorts, is achieved in patients with well-established disease requiring surgery.

Datasets
TOPPIC was a placebo-controlled, randomized controlled trial of 6-MP at 29 UK centers in patients with CD undergoing ileocolic resection between 2008 and 2012. 18 Genomic DNA was extracted from whole blood samples from 229 of the 240 patients taken before intestinal surgery. The IBD-BIOM cohort consists of 123 patients with newly diagnosed CD and 198 control subjects, further details of which are described in the original paper (Figure 1). 9

Samples
Peripheral blood leukocyte DNA was bisulphite converted and DNA methylation profiling was performed using the Illumina HumanMethylation450K platform (Illumina, San Diego, CA). Samples from patients treated with 6-MP or placebo were randomly distributed across chips. A total of 41 technical replicates were distributed across chips, runs, and cohorts. Genotype analysis was performed using the Illumina Omni Express Exome (500k SNPs) array for the TOPPIC cohort and the Illumina CoreExome Beadchip array.

DNA Methylation Analysis
Data Preprocessing. DNA methylation data were read from iDats using the R package minfi. 55 Estimated cell proportion admixture 56 was obtained using the estimateCellCounts function of the same package. The minfi processing stream was then followed: quantile normalization (preprocessQuantile); probes on sex chromosomes were removed (11,458 probes); samples with >1% of probes with detection P values >5% were filtered (0 samples); and methylation probes containing SNPs (dropLociWithSnps, 17,541 probes) and cross-reactive probes (26,569 probes) were also removed. 57 Batch correction was performed using ComBat for array (72 batches) and subsequently chip position (12 batches). Processing steps were visualized in the ShinyMethyl interface. 58 There were no sex mismatches. Forty technical replicates were used across different chips and runs. Technical variation was assessed using MDS plots and intraclass correlation of the top 1000 most variable methylation probes. Technical replicates were removed before downstream analyses.
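The probe counts removed at each filtering step can be checked against the number of probes reported as available for analysis (429,944). The short Python calculation below is only a bookkeeping sketch: the starting count of 485,512 loci is an assumption about the 450K array after mapping to the genome and is not stated in the text, and the dictionary keys are illustrative labels.

```python
# Probes removed at each filtering step, as reported in the text.
removed = {
    "sex_chromosome": 11_458,
    "snp_containing": 17_541,
    "cross_reactive": 26_569,
}

# ASSUMPTION: loci on the 450K array before filtering (not stated in the text).
starting_probes = 485_512

total_removed = sum(removed.values())
remaining = starting_probes - total_removed

print(total_removed)  # 55568
print(remaining)      # 429944
```

Under that assumption the three filters account exactly for the 429,944 probes carried forward, which suggests the reported counts refer to probes removed sequentially rather than to overlapping sets.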

DNA Methylation and Risk of Disease Recurrence Following Surgery in Patients With CD
Clinical recurrence was defined using the composite clinical outcome of the original TOPPIC trial: an increase in Crohn's disease activity index of more than 150 and an increase of 100 points from baseline measurement together with the institution of immunosuppressive treatment, or further surgery. Secondary outcomes of CD disease recurrence included the highest endoscopic scores (CDEIS, Rutgeerts) measured at 49 and 157 weeks following randomization. TOPPIC data alone were read into R and processed using the previously mentioned steps. DMP analysis (recurrence vs no recurrence) was performed as mentioned with the following covariates: age, sex, smoking status, treatment/placebo, and cell proportions. DVPs were assessed using the iEVORA method, via the row_ievora() function in the matrixTests package, with default parameters of a raw t-test threshold of P < .05 and an FDR-corrected P threshold for the Bartlett test step of <0.001. 28 To adjust for covariates, a matrix of the residual values from a linear model of the covariates (age, gender, smoking status, cell proportions) was used as the input for the DVP iEVORA method.

Data were submitted to the DNA methylation Clock Foundation (https://dnamage.clockfoundation.org/) for estimation of epigenetic age scores using methods by Horvath, 37 Hannum, 38 phenoAge, 39 tissue specific (skin and blood clock), 40 and GrimAge. 41 Correlation was made with actual biologic age and estimates of age acceleration were made (methylation age - biologic age).

Smoking-associated probes (DMPs) were identified using a linear model of smoking as the outcome (current vs exsmoker/never smoked) with cell proportions as covariates. Smoking-associated probes were correlated with previously published smoking-related probes. 29,59

Genotype and meQTL Analysis
Genotypes were called by GenomeStudio and data were processed using plink. 60 Data were assessed for sex mismatches. meQTLs were identified using the matrixEQTL package. 61 Significant DMP and DVP methylation probes were tested using the modelLinear function with age, sex, and smoking status as covariates (MAF >0.1, cis distance of 1 × 10^6, minimum P value 1 × 10^-6). P values were FDR corrected. For disease-specific meQTLs, the modelLinearCross function was used, including only significant DMP and DVP methylation probes with the same covariates (age, sex, smoking status), to identify meQTLs associated with disease recurrence in the entire TOPPIC dataset (MAF >0.1, cis distance of 1 × 10^6, minimum P value 1 × 10^-6).
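The age-acceleration estimate described in the epigenetic clock methods above is a simple difference between the clock-predicted (methylation) age and the recorded age. A minimal Python sketch of that calculation follows, summarized by group; the records, field names, and helper functions are hypothetical, and the clock predictions themselves are taken as given (in the study they were produced by the Clock Foundation tool).

```python
from statistics import mean

def age_acceleration(methylation_age, chronologic_age):
    """Age acceleration in years; positive values mean the sample looks epigenetically older."""
    return methylation_age - chronologic_age

# Hypothetical records (illustration only, not study data)
samples = [
    {"group": "CD", "methylation_age": 41.5, "chronologic_age": 38.0},
    {"group": "CD", "methylation_age": 29.5, "chronologic_age": 28.0},
    {"group": "control", "methylation_age": 50.5, "chronologic_age": 50.0},
    {"group": "control", "methylation_age": 33.5, "chronologic_age": 34.0},
]

def mean_acceleration_by_group(records):
    """Average the per-sample age acceleration within each group."""
    by_group = {}
    for rec in records:
        accel = age_acceleration(rec["methylation_age"], rec["chronologic_age"])
        by_group.setdefault(rec["group"], []).append(accel)
    return {group: mean(values) for group, values in by_group.items()}

group_means = mean_acceleration_by_group(samples)
print(group_means)  # {'CD': 2.5, 'control': 0.0}
```

A between-group difference in these means is the kind of quantity reported in the Results (for example, the GrimAge difference between patients with and without recurrence), although the study additionally reports confidence intervals and test statistics that this sketch does not compute.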

Validation of DNA Methylation Changes in IBD Cases and Control Subjects
Raw 450K HumanMethylation iDats from IBD BIOM and TOPPIC cohorts were read into R using minfi and both datasets were normalized together using the previously mentioned steps. Batch correction was performed using ComBat for array (72 batches) and chip position (12 batches). 62,63 DMP analysis was performed using limma comparing CD cases (BIOM and TOPPIC separately) with control subjects (BIOM only 9). 64 The 2 CD cohorts (BIOM, TOPPIC) were analyzed together against control subjects (BIOM only) (Figure 1). The following covariates were used in linear models (age, sex, smoking status, cell deconvolution values). 65 Correction for multiple testing was performed using the Holm adjusted P value. 66 Overlap with previously published DMP lists was assessed for overrepresentation using the phyper test for hypergeometric distribution. 67 DMR analysis was performed using DMRcate with an FDR threshold of P < .001, Gaussian kernel bandwidth lambda of 500, and scaling factor C of 5. 68,69 DVP analysis was performed using the residual matrix of a linear model of covariates (~ age + sex + smoking status + cell counts) with the iEVORA algorithm using the row_ievora() function in the matrixTests package with default parameters of a raw t-test threshold of P < .05 and an FDR-corrected P threshold for the Bartlett test step of <0.001. 39
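For readers unfamiliar with the Holm adjustment used for multiple testing above, the following Python sketch shows the step-down procedure on a small hypothetical set of P values. It is an illustration of the general method, not the code used in the study (where the adjustment would typically come from R), and the function name is hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest P value is multiplied by (m - k + 1) and capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical P values (illustration only)
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 4) for p in holm_adjust(raw)])

# For the smallest P value the adjustment reduces to m * P.
# ASSUMPTION: m = 429,944 tested probes and the WHSC1 probe being the top hit;
# neither is stated explicitly alongside the adjusted value in the text.
top_hit_adjusted = 429_944 * 4.1e-9
print(round(top_hit_adjusted, 4))
```

Under those stated assumptions, the top-hit calculation gives approximately 0.0018, which is in line with the Holm P = .002 reported for WHSC1 in the abstract.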