Chromatin accessibility in canine stromal cells and its implications for canine somatic cell reprogramming

Abstract Naturally occurring disease in pet dogs is an untapped and unique resource for stem cell‐based regenerative medicine translational research, given the complexity and the many similarities such diseases share with their human counterparts. Canine‐specific regulators of somatic cell reprogramming and pluripotency maintenance are poorly understood. While retroviral delivery of the four Yamanaka factors successfully reprogrammed canine embryonic fibroblasts, adult stromal cells remained resistant to reprogramming in spite of effective viral transduction and transgene expression. We hypothesized that adult stromal cells fail to reprogram due to an epigenetic barrier. Here, we performed assay for transposase‐accessible chromatin using sequencing (ATAC‐seq) on canine stromal and pluripotent stem cells, analyzing 51 samples in total, and establishing the global landscape of chromatin accessibility before and after reprogramming to induced pluripotent stem cells (iPSC). We also studied adult stromal cells that do not yield iPSC colonies to identify potential reprogramming barriers. ATAC‐seq analysis identified distinct cell type clustering patterns and chromatin remodeling during embryonic fibroblast reprogramming. Compared with embryonic fibroblasts, adult stromal cells had a chromatin accessibility landscape that reflects phenotypic differentiation and somatic cell‐fate stability. We ultimately identified 76 candidate genes and several transcription factor binding motifs that may be impeding somatic cell reprogramming to iPSC, and that could be targeted for inhibition or activation in order to improve the process in canines. These results provide a vast resource for better understanding of pluripotency regulators in dogs and provide an unbiased rationale for novel canine‐specific reprogramming approaches.


| INTRODUCTION
Induced pluripotent stem cells (iPSC) are pluripotent stem cells derived from somatic cells that have adopted an embryonic stem cell-like phenotype. Somatic cells of murine, human, and other species can be reprogrammed by forced expression of pluripotency transcription factors (TF). 1,2 Cellular reprogramming technology has revolutionized the field of stem cell biology and regenerative medicine research, both by shifting our perspective of cellular development and differentiation, and by providing unprecedented opportunities for the creation of human disease models in a dish, 3,4 in vitro pharmacological, functional, and toxicity studies in laboratory-derived human tissues, 5 and regenerative medicine applications. 6,7 The time is ripe for the application of iPSC technology in regenerative medicine, although numerous translational challenges still need to be addressed before such innovative therapies can reach patients at a larger scale. The unique challenges associated with the translation of iPSC-based cellular products to the clinic, along with the historically low approval rates of new candidate drugs that go into human clinical trials, call for a paradigm shift in translational biomedical research.
Such a paradigm should be specifically designed to test critical aspects such as safety, 8 product scale-up and delivery, 9 and long-term engraftment and immune compatibility, 10 which are poorly modeled by traditional animal models.
Naturally occurring diseases in companion dogs are an extremely valuable and readily available resource as a preclinical model. Similarly to humans, dogs suffer from various complex multifactorial diseases, such as cancer, diabetes mellitus, cardiomyopathies, age-related cognitive dysfunction, and neural damage, that are targets for the development of iPSC-derived cellular therapies. 11 Dog population genetics mirror human population genetics, with great variation across breeds and increasing homozygosity within breeds. 12 The extended longevity of dogs is useful to test longer-term immune compatibility and safety of stem cell-derived cellular products. 13 Furthermore, pet dogs in modern societies often receive excellent health care, which allows the use of veterinary medicine as a platform to conduct clinical trials that mirror human clinical trials.
We have successfully used previously described canine reprogramming protocols to reprogram canine embryonic fibroblasts (CEF) into stable ciPSC lines. 21,29,30 On the other hand, adult cell types such as canine dermal fibroblasts (CDF) or canine adipose-derived mesenchymal stem cells (cASC) remained resistant to reprogramming, despite effective transgene delivery, and multiple attempts, donors, and protocol modifications. Dynamic global chromatin remodeling underlies the reprogramming of somatic cells to iPSC and restructures chromatin accessibility across the entire genome. Such chromatin remodeling enables the inactivation of somatic loci and activation of pluripotency ones. 31 We hypothesize that resistance to reprogramming in canine adult cells is due to a failure to close somatic-fate loci, a failure to open pluripotency loci, or both, during the reprogramming process.
To test our hypothesis, we have determined global chromatin accessibility by assay for transposase-accessible chromatin using sequencing (ATAC-seq). 32-34 We have identified genomic loci that "open" or "close" during the reprogramming of CEF into ciPSC, and further identified loci that are differentially accessible between the different cell types. Finally, we have identified 76 genes as potential canine-specific somatic cell reprogramming barriers.

| MATERIALS AND METHODS
Additional materials and methods can be found in Data S1.

| Cell lines and culture
All animals and protocols in this study were approved by the Institutional Animal Care and Use Committee at the University of California, Davis (UCD), and all experiments conform to the relevant regulatory standards.
CEF were derived from elective spays of pregnant uteri obtained from the Community Surgery Service at the UCD School of Veterinary Medicine (SVM). Fetal age was determined prior to the spay by ultrasonographic evaluation by the attending clinician, or afterwards by pulmonary maturation evaluation in hematoxylin and eosin-stained slides, as previously described. 35 After removal of the uterus and the ovaries, the uterine horns and the embryonic sacs were cut open with scalpels and the embryos were released by severing the umbilical cords. The head and viscera were removed, and the remaining stromal tissue was minced with scalpels and digested in 0.05% trypsin/EDTA (Gibco, Gaithersburg, MD, USA) at 37 °C for 45 minutes. After washing the digested tissue with phosphate-buffered saline, the pellet was plated and incubated in complete Dulbecco's modified Eagle's medium (DMEM), consisting of DMEM with 20% fetal bovine serum (Corning, Corning, NY, USA), 0.1 mM nonessential amino acids, 2 mM GlutaMax, 1 mM sodium pyruvate, 100 U/mL penicillin, and 100 μg/mL streptomycin (Pen/Strep) (all Gibco).

Significance statement
Poor understanding of canine-specific regulators of cellular reprogramming and pluripotency maintenance impedes the use of naturally occurring disease in companion dogs for translational regenerative medicine research. In this work, the authors identify differences in global chromatin landscape that occur during cellular reprogramming and further define candidate barriers that prevent adult canine cells from reprogramming. Specifically, the genes and loci reported can be the basis of novel canine-specific reprogramming protocols in future research efforts. The authors believe that these findings will be of great interest to the stem cell research community, especially to those seeking novel and realistic animal models for translational research.
CDF were derived from skin samples from deceased dogs obtained from the pathology service at the UCD SVM. All dog owners consented to unrestricted use of their dog's remains. With scalpel and scissors, fat and capillaries were scraped away from the dermis, and the skin was cut into small 2 to 4 mm² sections and digested with collagenase type II (Worthington, Lakewood, NJ, USA) at 1 mg/mL, at 37 °C for 1 hour, with agitation. The digested cell suspension was centrifuged and the pellet filtered through a 100-μm cell strainer; flowthrough was plated in complete DMEM. The remnant tissue sections were also plated in complete DMEM with the dermis side down.
cASC were a gift from Dr. Borjesson's laboratory and cultured as previously described. 36

HTStream was used for data preprocessing, and fragments were mapped to the CanFam3.1 canine genome. Biological replicates were merged before peak calling. Differential openness analyses were conducted using the limma-voom Bioconductor pipeline, and peaks were annotated using the Bioconductor package ChIPseeker, version 1.20.0. Correlation plots, read-depth heat maps, and profile plots were generated with deepTools.

| Generation of canine-induced pluripotent stem cells
Hierarchical clustering was performed by distance calculation with Cluster 3.0 and visualized in Java TreeView. Gene ontology (GO) term classification was performed with PANTHER. TF motif enrichment analysis was performed with HOMER.
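The distance calculation that underlies this kind of correlation-based clustering can be sketched as follows. This is a minimal illustration in Python, not the Cluster 3.0 implementation: the function names, the sample labels used as dictionary keys, and the signal values are hypothetical, and the distance is assumed to be one minus the Pearson correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length signal vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Assumed distance: 0 for identical profiles, 2 for perfectly opposite ones."""
    return 1.0 - pearson(x, y)


def closest_pair(profiles):
    """Return (distance, name_a, name_b) for the two most similar samples,
    that is, the first pair an agglomerative clustering would merge."""
    names = sorted(profiles)
    pairs = [
        (correlation_distance(profiles[a], profiles[b]), a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
    ]
    return min(pairs)


# Hypothetical log2 peak signals for three samples over four peaks.
profiles = {
    "CEF": [1.0, 2.0, 3.0, 4.0],
    "CDF": [1.1, 2.1, 2.9, 4.2],
    "ciPSC": [4.0, 3.0, 2.0, 1.0],
}
```

With these made-up values, `closest_pair(profiles)` pairs the two stromal samples first, because their profiles are nearly collinear, whereas the pluripotent profile sits at the maximal distance from CEF.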

| Statistical analysis
Pairwise Student's t tests and analysis of variance (ANOVA) tests were used to analyze statistical differences in all cases, except for the differential openness analysis (see Supplemental Information, ATAC-seq data analysis section). GraphPad Prism v8 37 tools were used for both statistical analysis and graphical representation of results. A P value of <.05 was considered statistically significant.

| Canine embryonic fibroblasts but not adult stromal cells can be reprogrammed to ciPSC
We began our study with the transduction of the four Yamanaka factors (OCT4, SOX2, KLF4, and MYC; OSKM) into low passage CEF, cASC, and CDF. The characteristics of the dogs from which the cell lines were derived are detailed in Table S1. Reprogrammed CEF formed colonies of ciPSC with stem cell-like morphology and high alkaline phosphatase (AP) activity (Figure 1A). ciPSC colonies showed induced expression of core pluripotency genes [...] nucleofected into CEF-derived ciPSC, and luciferase (luc) activity assays were conducted, indicating that the OCT4 locus in ciPSC is controlled by the proximal enhancer (PE) and not the distal enhancer (DE), suggesting a primed state for these iPSC (Figure 1F). cASC and CDF, as well as testicular and ovarian fibroblasts, did not yield stable colonies, even though the efficiency of infection was comparable to that of CEF (Figure S4).

| ATAC-seq analysis identifies distinct cell type clustering pattern
We hypothesized that resistance of canine adult stromal cells (ie, CDF and cASC) to OSKM-mediated cellular reprogramming is due to a failure to close somatic-fate loci, a failure to open pluripotency loci, or both, during the reprogramming process. To test our hypothesis, we studied the chromatin accessibility landscape of canine stromal and pluripotent cells by ATAC-seq. Specifically, we studied two adult stromal cell types (ie, CDF and cASC), CEF, CEF-syngeneic ciPSC, and cESC. We sequenced four different cell lines of each cell type mentioned, except for cESC, of which we only had one cell line available.

FIGURE 2 ATAC-Seq shows chromatin accessibility differences between adult and embryonic stromal cells, and canine pluripotent cells. A, Representative histogram of the frequency distribution of DNA library fragment size from ATAC-sequencing. B, Venn diagram of the distribution of ATAC-seq peaks for the cell types studied. C, Two-dimensional scaling plot of the relative distances for ATAC-seq peaks between cell types CDF, cASC, CEF, ciPSC, and cESC. D, Read-depth heat map of the whole data set of ATAC-seq peaks. Representation from 2 kb upstream to 2 kb downstream of each locus. All genes represented from TSS to TES. E, Pearson correlation heat map of a representative sample of all peaks, downsized to enhance processing efficiency, for all the cell type data sets. Each column is a cell type and each row is an ATAC-seq peak. Color scale shows relative ATAC-seq peak signal, on a log2 scale. ATAC-Seq, assay for transposase-accessible chromatin using sequencing; cASC, canine adipose-derived mesenchymal stem cells; CDF, canine dermal fibroblasts; CEF, canine embryonic fibroblasts; cESC, canine embryonic stem cells; ciPSC, canine-induced pluripotent stem cells; TES, transcription end site; TSS, transcription start site
The following are results from the analysis of all the peaks in the complete data set. The frequency of sequenced fragment size followed the expected pattern for an ATAC-seq library, with periodic peaks corresponding to the nucleosome-free regions (NFR, 100 bp and under) and to mononucleosome, dinucleosome, and trinucleosome fragments (approximately 200, 400, and 600 bp, respectively) 33 (Figure 2A). [...] (Figure 2D). Finally, we used Pearson correlation unsupervised hierarchical clustering of all peaks obtained to identify cell type clustering and distances, which indicated that the primary node of separation was between pluripotent stem cells and stromal cells, and that a secondary node separated CEF from the adult stromal cells (ie, cASC and CDF) (Figure 2E).
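The fragment-size periodicity described above can be illustrated by binning fragment lengths into nucleosomal classes. The sketch below is a hypothetical Python example: only the NFR limit (100 bp and under) and the approximate class centers (200, 400, and 600 bp) come from the text, whereas the window boundaries between classes are assumptions chosen as midpoints for illustration.

```python
from collections import Counter

# (class name, lowest length, highest length) in bp.
# Only the 100 bp NFR limit is taken from the text; the other
# boundaries are assumed midpoints around 200, 400, and 600 bp.
FRAGMENT_CLASSES = [
    ("NFR", 0, 100),
    ("mononucleosome", 101, 300),
    ("dinucleosome", 301, 500),
    ("trinucleosome", 501, 700),
]


def classify_fragment(length_bp):
    """Assign a fragment length to a nucleosomal class, or 'other'."""
    for name, low, high in FRAGMENT_CLASSES:
        if low <= length_bp <= high:
            return name
    return "other"


def summarize(lengths):
    """Count fragments per class for a list of fragment lengths."""
    return Counter(classify_fragment(n) for n in lengths)
```

A library of good quality would be expected to show most counts in the NFR and mononucleosome classes, with progressively fewer fragments in the larger classes.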
When we studied the genomic element distribution of differential peaks, we found that proper regulatory areas (promoter and 2 kb upstream and downstream), exons, and introns were enriched in peaks with the highest fold change (Figure S7), indicating that this data subset is more effective for the current study of chromatin accessibility. This subset, composed of peaks annotated as promoter, upstream 0 to 2 kb, downstream 0 to 2 kb, exons, and introns, was used in all the analyses that follow.
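The subsetting step described above amounts to a filter on the peak annotation. The following Python sketch shows one way to express it; the record layout, the peak identifiers, and the annotation label strings are hypothetical and do not reproduce the labels emitted by ChIPseeker.

```python
# Annotation categories retained for downstream analyses, as listed in the text.
# The label spellings are illustrative assumptions.
KEPT_ANNOTATIONS = {
    "promoter",
    "upstream_0_2kb",
    "downstream_0_2kb",
    "exon",
    "intron",
}


def filter_peaks(peaks):
    """Keep only peaks whose annotation falls in the retained categories."""
    return [peak for peak in peaks if peak["annotation"] in KEPT_ANNOTATIONS]


# Hypothetical annotated peaks.
peaks = [
    {"id": "peak_1", "annotation": "promoter"},
    {"id": "peak_2", "annotation": "distal_intergenic"},
    {"id": "peak_3", "annotation": "intron"},
]
```

In this toy input, the distal intergenic peak is dropped and the promoter and intron peaks are retained.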

| Chromatin remodeling during CEF reprogramming
FIGURE 3 Genomic accessibility landscape remodeling during canine stromal cell reprogramming. A, Pearson correlation heat map of 2403 peaks with differential openness when comparing embryonic stromal cells (ie, CEF) vs ciPSC, representing the CEF-ciPSC transition, for all cell type data sets. Each column is a cell type and each row is an ATAC-seq peak. Color scale shows relative ATAC-seq peak signal, on a log2 scale. B, Read-depth heat map of OC and CO gene clusters, for the CEF vs ciPSC data set. Genes represented from TSS to TES. C, GO term enrichment by PANTHER overrepresentation test with fold enrichment >2, P < .05, and FDR <0.05. Showing fold enrichment and corresponding FDR, for the CO and OC gene groups. D, Selected Integrative Genomics Viewer (IGV) genomic views of ATAC-seq data for stemness genes SOX2, NANOG, OCT4, KLF4, and MYC. All genome view vertical scales were group autoscaled to normalize for read-depth. Genes are oriented 5′ to 3′ and graphed from 2 kb upstream of the TSS to 2 kb downstream of the TES. E, Open TF motif discovery for pairwise comparison CEF vs ciPSC. TF families and motifs are indicated on the right of the fold-change heat map. ATAC-Seq, assay for transposase-accessible chromatin using sequencing; cASC, canine adipose-derived mesenchymal stem cells; CDF, canine dermal fibroblasts; CEF, canine embryonic fibroblasts; ciPSC, canine-induced pluripotent stem cells; CO, closed to open; FDR, false discovery rate; GO, gene ontology; OC, open to closed; TES, transcription end site; TSS, transcription start site

FIGURE 4 Genomic accessibility landscape in canine adult and embryonic stromal cells. A, Pearson correlation heat map of 976 peaks with differential openness when comparing embryonic stromal cells (ie, CEF) vs adult stromal cells, for all the cell type data sets. Each column is a cell type and each row is an ATAC-seq peak. Color scale shows relative ATAC-seq peak signal, on a log2 scale. B, Read-depth heat map of clusters III and IV, for the CDF-cASC vs CEF data set. C, GO term enrichment by PANTHER overrepresentation test with fold enrichment >2, P < .05, and FDR <0.05. Showing fold enrichment and corresponding FDR, for the "Adult-open Embryonic-closed" group. No enrichment found for "CEF-ciPSC-open" group. D, Selected Integrative Genomics Viewer (IGV) genomic views of ATAC-seq data for representative genes from each cluster. All genome view vertical scales were autoscaled to normalize for read-depth. Genes are oriented 5′ to 3′ and graphed from 2 kb upstream of the TSS to 2 kb downstream of the TES. E, Open TF motif discovery for pairwise comparison CDF-cASC vs CEF. TF families and motifs are indicated on the right of the fold-change heat map. ATAC-Seq, assay for transposase-accessible chromatin using sequencing; cASC, canine adipose-derived mesenchymal stem cells; CDF, canine dermal fibroblasts; CEF, canine embryonic fibroblasts; ciPSC, canine-induced pluripotent stem cells; FDR, false discovery rate; GO, gene ontology; TES, transcription end site; TF, transcription factors; TSS, transcription start site

FIGURE 5 Candidate genetic barriers to the reprogramming of adult canine stromal cells. A, Pearson correlation heat map of 935 peaks with differential openness when comparing adult stromal cells vs embryonic stromal cells (CEF) and ciPSC, representing the reprogramming barriers for generation of ciPSC from CDF and cASC but not from CEF, for all the cell type data sets. Each column is a cell type and each row is an ATAC-seq peak. Color scale shows relative ATAC-seq peak signal, on a log2 scale. B, Read-depth heat map of adult and undifferentiated-specific gene clusters, for the CDF-cASC vs CEF-ciPSC data set. C, GO [...]

In order to define the global chromatin remodeling that occurs during CEF reprogramming into ciPSC, we compared ATAC-seq results from CEF vs CEF-syngeneic ciPSC, by unsupervised hierarchical clustering and heat map plotting (Figure 3A). Two clusters emerged from this analysis: the "Open to Closed" (OC) peaks cluster, and the "Closed to Open" (CO) peaks cluster. [...] CEF OC genes are more closed than CO genes, especially in the coding areas. Next, enrichment of GO terms for the OC and CO groups (Figure 3C) revealed that among the pathways that are enriched in the OC group are immunity-related genes, and the Wnt, VEGF, PDGF, and FGF pathways, all related to cell identity establishment or maintenance, among many others. This is in alignment with expected chromatin remodeling during reprogramming, but the fact that we did not observe an opening or enrichment of classic stemness genes between [...] (Figure 3D). This might not be the case when the donor cells are CDF or cASC, where stemness genes are not accessible, but it was impossible to assess this at the ciPSC level, since CDF and cASC did not form established ciPSC lines from which we could extract material to study. Genes enriched in the CO group include oxidoreductase and reactive oxygen species metabolism-related genes, which underlines the importance of oxygen metabolism in stem cells, as well as cadherin, the p53 pathway, and cytokines.
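The "Open to Closed" (OC) and "Closed to Open" (CO) grouping used in this section can be expressed as a simple rule on the output of a differential openness test. The Python sketch below is illustrative only: the sign convention (log2 fold change computed as ciPSC over CEF), the adjusted P value cutoff of 0.05, and the function name are assumptions, not the thresholds of the analysis reported here.

```python
def classify_peak(log2_fold_change, adjusted_p, alpha=0.05):
    """Label a differential peak for the CEF to ciPSC transition.

    log2_fold_change is assumed to be log2(ciPSC signal / CEF signal).
    Returns "CO" (closed to open), "OC" (open to closed), or None when
    the peak does not pass the assumed significance cutoff.
    """
    if adjusted_p >= alpha:
        return None
    if log2_fold_change > 0:
        return "CO"  # more accessible after reprogramming
    if log2_fold_change < 0:
        return "OC"  # less accessible after reprogramming
    return None
```

Under these assumptions, a peak with a log2 fold change of 2.5 and an adjusted P value of 0.01 would be labeled CO, and a peak with a log2 fold change of -1.8 at the same significance would be labeled OC.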
Finally, we studied the TF motif patterns in CEF before and after reprogramming to ciPSC ( Figure 3E) and found that TF motifs for stemness factors such as OCT4, SOX2, and NANOG, as well as other

| Chromatin landscape differences between adult and embryonic stromal cells
We analyzed ATAC-seq peaks that were significantly more open or closed when comparing CDF-cASC to CEF, by unsupervised hierarchical clustering (Figure 4A). Genomic patterns derived from this analysis were grouped into clusters, which, when overlaid with all the cell lines studied, resulted in a hierarchical organization similar to the one observed before (Figures 2C-E). Furthermore, the read-depth heat map for the genes derived from these peak cluster groups (Figure 4B) shows the same peak enrichment flanking the TSS and TES, as previously observed. In addition, we have identified genes that are more open in adult stromal cells than in CEF, especially in gene-flanking areas, as observed in Figure 4B, top panel graphs. This group was labeled "Adult open/Embryonic closed". This can also be seen in Figure 4A, where these peaks appear in green for CDF and cASC, and mostly black for CEF. Meanwhile, CDF present an intermediate chromatin openness level in regulatory areas, which is between that of CEF and cASC, but are clearly more closed in coding areas (Figure 4B, top panel graphs). Moreover, chromatin accessibility in pluripotency core genes was diminished in CDF when compared with CEF, especially in SOX2, NANOG, and OCT4 (Figure S6), suggesting that CEF might indeed be primed for pluripotency, at least more so than CDF. GO term analysis of this CEF vs CDF-cASC comparison [...]

Here, we demonstrate the ability to reprogram CEF to ciPSC, and report the inability of canine adult stromal cells to generate stable ciPSC colonies. Cellular age could have an effect on reprogramming efficiency, but this was not the case in the current study. CEF had a tendency to proliferate faster, and while a higher proliferative rate is associated with increased reprogramming efficiency, which could explain the tendency of CEF to be reprogrammed more efficiently, there were no statistical differences in population doubling time (PDT) among the cell types studied (Figure S5).
Furthermore, as is the case with murine strains, 42 there are no literature reports that suggest there is a canine breed dependence for somatic cell reprogramming, and although we did not observe any influence of breed in reprogramming, this cannot be ruled out.
Through the generation of an OCT4-eGFP reporter system, we show that the endogenous OCT4 locus is engaged during reprogramming of CEF, and that this is controlled mainly from the PE, 43,44 which suggests that our reprogrammed ciPSC are in a primed pluripotency state. Furthermore, to study the difference in reprogramming capabilities between embryonic and adult canine stromal cells, we performed effective and reproducible ATAC-seq and data analysis on different types of canine cells. Our data analysis approach allowed us to identify groups of genes that may be involved in the inability to reprogram adult stromal cells.
We initially studied the process of reprogramming of CEF to ciPSC, identifying major pathways that change during this process. Among them, the Wnt, VEGF, FGF, and PDGF pathways were of interest. 45-47 Notable loci expected to close that indeed did were AFP, ANGPT1, AR, and KDR, all markers of committed lineages. 48 [...] previously reported to be involved in somatic cell reprogramming 66-68 and could be barriers to canine-specific cell reprogramming.
Interestingly, some TF motifs that were enriched in CEF and ciPSC as opposed to adult stromal cells were FOSL2, Jun-AP1, and JUNB, which have been shown to be involved in identity maintenance, proliferation, immune response, and cell death. 69,70 Several members of the AP-1 family have previously been shown to inhibit reprogramming, 71 and because our data showed no increased chromatin accessibility for these loci, we hypothesize that the TF motif enrichment is not related to increased AP-1 formation and/or activity. These motifs might confer different TF functions in canines, as opposed to the human and murine models. In conclusion, these genes constitute possible reprogramming barriers, whose expression should be investigated within this model and which can be targets in interference experiments, with the aim of identifying a pathway that, when blocked, can serve as a strong reprogramming enhancer.
TF motif discovery for CEF vs ciPSC showed surprising results, with motifs for OCT4, SOX2, and NANOG enriched in CEF rather than in ciPSC, where they were expected, OCT4 being one of the essential pioneering factors needed for cellular reprogramming in most models studied. 72 When we performed TF motif discovery for adult stromal cells against embryonic fibroblasts we found, as expected, that TF family motifs enriched in adult stromal cell peaks are related to somatic loci activation and lineage identity development and maintenance, such as MyoD/G, involved in muscle gene expression and cell determination; and Tcf21, associated with AP-1 binding. 73 On the other hand, the motifs enriched in CEF included Klf4 and NFY motifs, both associated with pluripotency. 74 We hypothesized that the latter findings reflect the "reprogram primed" state of CEF, though additional functional validation is required. Lastly, when we studied the motifs enriched in adult stromal cells against the motifs shared by CEF and ciPSC, we found that motifs associated with pluripotency maintenance were enriched in sequences differentially open in ciPSC, such as Nrf2, 75 Bach1, 74 and Atf3, a target of Myc that could be related to pluripotency maintenance through cell proliferation effects. These motifs could be targeted for chromatin opening, in order to improve a canine-specific adult stromal cell reprogramming protocol.
Moreover, genome annotation in a non-model species like the dog is still suboptimal. As much as 84 Mb of canine transcribed sequence is not found in the existing Ensembl canine reference. In addition, one study reports finding most ATAC-seq peaks in promoter areas, 76 whereas we found most peaks in distal intergenic areas. The inaccuracy of the genome annotation makes it difficult to assign genomic areas to the appropriate genes or genomic functions. For the dog genome in particular, annotation for a large number of loci and regulatory regions is only predicted, imprecise, or entirely absent, which creates not only the need for manual data analysis or curation, but also an inability to detect associations. Continued work on the species will generate more interest, more research, and more investment in resequencing the species' genome and performing more functional annotation. In addition, different genomic areas of a given gene underwent bidirectional chromatin accessibility changes; this points to a very complex chromatin rearrangement during reprogramming that is probably based on both positive and negative enhancer areas being differentially accessible.

| CONCLUSIONS
Our data provide a deeper understanding of nuclear chromatin remodeling during cellular reprogramming in dog cells, and define candidate barriers for somatic cell reprogramming. Such candidate reprogramming barriers could be the target of future studies that aim to generate a better understanding of the regulatory networks that govern canine pluripotency. The uncovering of particular targets to facilitate cellular reprogramming to ciPSC will support the generation of robust ciPSC for translational research in a more efficient manner.

ACKNOWLEDGMENTS
We would like to thank the UCD DNA Technologies & Expression Analysis Core and Bioinformatics Core, for their services and support in sequencing and bioinformatics analysis. We would also like to thank Michelle Halstead for her help regarding bioinformatics tools and analysis methods. Funding for this project was provided by the Center for Companion Animal Health and by the Veterinary Institute for Regenerative Cures, at the UCD School of Veterinary Medicine.