Cell Stem CellResourceInduced Pluripotent Stem Cellsand Embryonic Stem Cells Are Distinguishedby Gene Expression SignaturesMark H. Chin,1 Mike J. Mason,1,8 Wei Xie,1 Stefano Volinia,10 Mike Singer,11 Cory Peterson,3 Gayane Ambartsumyan,2Otaren Aimiuwu,2 Laura Richter,2 Jin Zhang,4 Ivan Khvorostov,4 Vanessa Ott,11 Michael Grunstein,1 Neta Lavon,9Nissim Benvenisty,9 Carlo M. Croce,10 Amander T. Clark,2,5,6,7 Tim Baxter,11 April D. Pyle,3,5,6,7 Mike A. Teitell,4,5,6,7Matteo Pelegrini,2,6 Kathrin Plath,1,5,6,7,* and William E. Lowry2,5,6,7,*1Departmentof Biological Chemistry, David Geffen School of Medicineof Molecular, Cell and Developmental Biology3Department of Microbiology, Immunology, and Molecular Genetics4Departments of Pathology and Pediatrics, David Geffen School of Medicine5Broad Stem Cell Center6Molecular Biology Institute7Jonsson Comprehensive Cancer Center8Department of StatisticsUniversity of California, Los Angeles, CA 90095, USA9Department of Genetics, Hebrew University, Jerusalem 91904, Israel10Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University, Columbus, OH 43210, USA11Roche NimbleGen, Inc., Madison, WI 53719, USA*Correspondence: [email protected] (K.P.), [email protected] (W.E.L.)DOI d pluripotent stem cells (iPSCs) outwardlyappear to be indistinguishable from embryonicstem cells (ESCs). A study of gene expressionprofiles of mouse and human ESCs and iPSCssuggests that, while iPSCs are quite similar to theirembryonic counterparts, a recurrent gene expression signature appears in iPSCs regardless of theirorigin or the method by which they were generated.Upon extended culture, hiPSCs adopt a geneexpression profile more similar to hESCs; however,they still retain a gene expression signature uniquefrom hESCs that extends to miRNA expression.Genome-wide data suggested that the iPSC signature gene expression differences are due to differential promoter binding by the reprogramming factors.High-resolution array profiling demonstrated thatthere is no common specific subkaryotypic alterationthat is required for reprogramming and that reprogramming does not lead to genomic instability.Together, these data suggest that iPSCs should beconsidered a unique subtype of pluripotent cell.INTRODUCTIONEmbryonic stem cells (ESCs) are in vitro representations of theinner cell mass of developing embryos (Gokhale and Andrews,2006) and therefore present a valuable tool for regenerativemedicine and serve as models of embryonic developmentin vitro (Keller, 2005). Induced pluripotent stem cells (iPSCs)are not derived from embryos but are in vitro constructs thoughtto mimic ESCs (Hochedlinger and Plath, 2009; Nishikawa et al.,2008; Takahashi and Yamanaka, 2006). Therefore, a number ofissues must be addressed before iPSC technology can beapplied to regenerative medicine or in vitro modeling of diseaseor development. Are iPSCs as good as ESCs at replicating thestate of bona fide embryonic cells? Do iPSCs generate differentiated progeny as efficiently as ESCs? Do the methods employedto generate iPSCs confound their use in a clinical or experimentalsetting? These questions should be at the forefront when considering whether iPSCs will serve as useful models of human development and disease. However, before these questions can beanswered, it is critical to understand any molecular differencesbetween iPSCs and ESCs in their undifferentiated state.Even though many groups have now shown that both human(h) and mouse (m) somatic cells can be reprogrammed by overexpression of variable sets of a few transcription factors to whatappears to be an embryonic state (Lowry et al., 2008; Maheraliet al., 2007; Wernig et al., 2007; Okita et al., 2007; Park et al.,2008; Takahashi et al., 2007; Takahashi and Yamanaka, 2006;Yu et al., 2007), the degree of molecular similarity between iPSCsand ESCs has not been completely elucidated. Every studysuggests that iPSCs are ‘‘nearly identical’’ to their embryoderived counterparts, but it remains unclear whether the smallpercentage of genes that are differentially expressed betweeniPSCs and ESCs are shared among different iPSC lines andwhether this difference is biologically significant. Careful studyis warranted to discern whether these small differencesobserved between iPSC and ESC lines are particular to individual experiments or whether reprogramming of somatic cellsgenerates a state that is common among iPSCs and uniquefrom ESCs. Because of the methods used to reprogram somaticcells to an embryonic state, iPSCs could possess significantCell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc. 111

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCsFigure 1. Expression Profiling Demonstrates Differences between hiPSCs and hESCs(A) Unsupervised hierarchical clustering of global gene expression data in human fibroblasts (hFibr), early-passage hiPSCs (e-hiPSC), late-passage hiPSCs(l-hiPSC), and hESCs. Expression values are presented as the log2 ratio of the given gene divided by the average of the ESC lines (all subsequent expressionheatmaps are presented in this manner). Individual cell lines used are indicated below the heatmap.112 Cell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc.

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCsdifferences at various molecular levels, including the following:genomic integrity; epigenetic stability; noncoding, and perhapseven coding, RNA expression. To date, no one has describedthe full extent of differences between iPSCs and ESCs, andwhether these differences are shared among reprogrammedlines derived by various methods and labs.Here, we applied genome-wide methods to compare mouseand human iPSCs with ESCs by array CGH, to uncover subkaryotypic genome alterations; coding RNA profiling, to uncover geneexpression changes; miRNA profiling, to determine changes inexpression of small noncoding RNAs; and histone modificationprofiling, to determine whether epigenetic changes correlatewith gene expression differences. The sum of these analysesuncovers a novel gene expression signature that is uniquefrom ESCs and shared among iPSC lines generated fromdifferent species and in different reprogramming experiments.Whether the iPSC signature described here plays a functionalrole in self-renewal or differentiation warrants extensive furtherinvestigation.RESULTSDistinct Gene Expression Signatures Are Associatedwith hiPSCs at Different PassagesTo determine whether gene expression differences observedbetween ESCs and iPSCs are stochastic or indicative of differences between these pluripotent cells types, a detailed genomewide expression analysis was carried out between three hESClines that we routinely maintain in the lab (HSF1, H9, and CSES4)and hiPSC clones at different passages (Table S1, a summary ofcell lines and passages used in this study, is available online). ThehiPSC clones used here were obtained in a single fibroblastreprogramming experiment published previously (Lowry et al.,2008) through retroviral expression of OCT4, SOX2, NANOG,KLF4, and C-MYC. Five hiPSC clones (#1, 2, 5, 7, and 18), twoof which had integrated the NANOG virus in addition to the otherfour factors (clones 1 and 5), were expanded for further analysis ofpluripotency, including teratoma formation and in vitro differentiation (Karumbayaram et al., 2009; Lowry et al., 2008; Park et al.,2009). These clones were all profiled at early passage (passages[p] 5–9) and clones 1, 2, and 18 were also analyzed at latepassage (p54–61). Unsupervised hierarchical clustering of theexpression data across hESCs, early- and late-passage hiPSCs,and fibroblasts highlighted interesting patterns of gene expression between these cell types (Figure 1A). First, even thoughhiPSCs are considered highly similar to hESCs, they are moresimilar to each other than to hESCs, as shown previously (Lowryet al., 2008). Second, late-passage hiPSCs cluster more closelywith hESCs than with early-passage hiPSCs. In agreement withthese findings, Pearson correlation analysis also demonstratedthat the gene expression profile of late-passage hiPSCs is moreclosely related to hESCs than to early-passage hiPSCs usingFisher’s z0 -transformation comparison of correlations (z 0,Figure S1).A Unique Expression Signature for Early-PassagehiPSCsAnalyzing the expression differences between hESC lines andour early-passage hiPSC lines, we found 3,947 (out of 17,620)genes that are significantly different between all hiPSC linesand hESC lines as determined by a Student’s t test (p 0.05)and requiring an at least a 1.5-fold expression change betweenhiPSCs and hESCs (Figure 1B; termed early-passage hiPSCsignature genes; Table S2). Since these expression differencesto hESCs are shared among all five independent hiPSC clones,the data suggest that hiPSCs represent a common cell typethat is similar to but distinct from hESCs. Within this expressionsignature, 79% of the genes are expressed at a lower level iniPSCs than ESCs (Figure 1B0 ). Gene Ontology analysis suggeststhat these genes have a role in basic processes (energy production, RNA processing, DNA repair, mitosis), while genes relatedto differentiation (organ development and signal/secreted glycoprotein) are more abundantly expressed in hiPSCs than ESCs(Figure 1B000 ).These findings suggest that hiPSCs have not efficientlysilenced the expression pattern of the somatic cell from whichthey are derived and failed to induce genes important for undifferentiated, highly proliferative hESCs. Indeed, a classificationof the early-passage hiPSC signature genes according to theirexpression difference between fibroblasts and hESCs showsthat 82% of the genes that are expressed at a higher level inhESCs versus hiPSCs are also more highly expressed in hESCsversus fibroblasts (Figure 1B0 ), indicating that an important difference between hESC and early-passage hiPSCs is the lack of thecomplete induction of these genes. When analyzing the geneswith more abundant transcripts in early-passage hiPSC thanhESCs, 71% appear to be inefficiently silenced from the fibroblast state (Figure 1B0 ). The remaining smaller portion of earlyhiPSC signature genes can be explained by excessive inductionof an ESC-specific expression program or suppression of thefibroblast pattern (Figure 1B0 ).While the expression differences between early-passagehiPSC and hESC lines appear to be reprogramming dependent,one obvious explanation for the difference could be that we(B) e-hiPSC signature genes. As in (A) except for the 3947 genes significantly different between e-hiPSC and hESC based on two criteria, Student’s t test (p 0.05)and at least a 1.5-fold change. Genes are ordered according to the decreasing average expression ratio between hESCs and e-hiPSCs. Expression data forpassage 5 and 28 hESC (P5 and P28) for this set of genes are added to the right. (B0 ) e-hiPSC genes were divided into those expressed at higher levels in hESCsthan e-hiPSCs and vice versa. Each of these two groups was further subclassified into two groups, either more highly expressed in hESCs than fibroblasts (red) ormore highly expressed in fibroblasts than hESCs (blue). (B00 ) Boxplots of the absolute value of the log2 fold change between hESC and the following: e-hiPSC,l-hiPSC, P5 or P28 hESC (*all p 0). (B000 ) Gene ontology (GO) analysis for e-hiPSC signature genes upon division into those for which hESC expression wasgreater than e-hiPSC and vice versa. Only significant GO-terms with an enrichment value 3 (p 0.001) are presented.(C) l-hiPSC signature genes. As in (B) except for the 860 genes significantly different between hESCs and late-passage-hiPSCs. (C0 ) Barplot similar to (B0 ) usingl-hiPSC signature genes. (C00 ) Boxplot similar to (B00 ) for l-hiPSC signature genes.(D) Common hiPSC signature genes were determined as the overlap between early- and late-passage signature genes from (B) and (C), respectively, as shown inthe Venn diagram. The expression heatmap below is presented for the 318 genes in the overlap as in (B). (D0 ) Barplot similar to (B0 ) using common hiPSC signaturegenes. (D00 ) Boxplot similar to (B00 ) for common hiPSC signature genes.Cell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc. 113

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCscompared early-passage hiPSCs with hESCs at higher passage(p37, 41, and 51) since the availability of hESCs at earlypassage is limited. Thus, the distinct expression pattern ofearly-passage hiPSCs versus late-passage hESCs could simplybe due to differences induced by extended culturing. To estimatethe contribution of culture-induced transcriptional changes, early(p5)- and middle (p28)-passage hESCs were obtained, profiled,and compared to our cell lines. This analysis suggested thatthe vast majority of the genes consistently differentially expressed between early-passage hiPSCs and hESCs do notdiffer dramatically between early-, middle-, and late-passagehESCs (Figure 1B, far right, and Figure S2). Together, thesedata indicate that the early-passage hiPSC signature is nota common feature of low-passage pluripotent stem cells but isspecific to hiPSCs.Expression Differences between Early- andLate-Passage hiPSCsUpon extended passaging, the gene expression profile ofhiPSCs appears to become more similar to hESCs (Figures 1A,1B, and S1). In agreement with this conclusion, late-passagehiPSCs have a significantly decreased amplitude of expressiondifferences for early-passage hiPSC signature genes (Figures1B00 and S3A). As expected, the same was true when comparingearly-, middle-, and late-passage hESCs (Figure 1B00 ). Looking at48 genes that are specifically expressed in hESCs (taken fromLowry et al., 2008), it is clear that hESC signature genes areexpressed at lower levels in all early hiPSC lines but recover afterextended culture to a level commensurate with that found inhESCs (Figure S4). These data indicate that many of the expression differences that occur between early-passage hiPSCs andhESCs get resolved upon extended passaging.However, Figure 1A shows that late-passage iPSC still differfrom ESCs. The differential expression between late-passagehiPSCs and hESCs of some of these genes was validated atthe protein level (Figure S5). We therefore defined genes thatare differentially expressed more than 1.5-fold between latepassage hiPSCs and hESCs and found 860 genes that fit thesecriteria (Figure 1C; termed late-passage hiPSC signature genes;Student’s t test, p 0.05; Table S3). Gene ontology analysisfailed to uncover enrichment for any particular functional category among late-passage hiPSC signature genes, in agreementwith the finding that at late passage the majority of expressiondifferences of hiPSCs with hESCs that exist at early passageare resolved. Comparing fibroblast, hESC, and hiPSC expression, we found that 80% of the late-passage hiPSC signaturecan be attributed to inefficient silencing of the fibroblast expression pattern and lack of full induction of hESC-specific genes,similar to what was found for the early-passage hiPSC geneexpression signature (Figure 1C0 ). In agreement with this notion,318 genes (37%) are shared between early- and late-passagehiPSCs versus hESCs (Figure 1D; Table S4). This enduring(also termed common) hiPSC signature is clearly a result ofdifferences between cells generated by the reprogrammingprocess versus those derived from human embryos and doesnot appear to differ dramatically in expression between earlyand later passages of hiPSC (Figures 1D00 and S3C). Nearly allof the genes in this group insufficiently induce hESC-specificgenes and suppress fibroblast-specific genes (Figure 1D0 ).114 Cell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc.Furthermore, the common hiPSC signature genes exhibit themost dramatic change in gene expression between fibroblastsand hESCs among all signature expression groups (Figure S7).Surprisingly, many late-passage hiPSC signature genes aremore similarly expressed between early hiPSCs and hESCsthan in late hiPSCs (Figures 1C, 1C00 , and S3B). This is consistentwith the notion that an overall readjustment of the expressionsignature occurs upon passaging of hiPSCs, rather than simplyclosing in on the ESC expression. Taken together, the comparison of hiPSC and hESC expression patterns indicates that(1) at early passage hiPSC lines are incompletely reset toa hESC-like expression pattern and (2) even at late passagedifferences between hESCs and hiPSCs persist and reflect animperfect resetting of somatic cell expression to an ESC-likestate.To exclude the possibility that gene expression differencesbetween hESCs and hiPSCs at late passage could be due todifferential proliferation of hiPSCs and hESCs, cell-cycle analysiswas performed by FACS. This analysis demonstrated that latepassage hiPSCs and hESCs do not proceed through the cellcycle at different rates, and thus the late hiPSC signature is notdue to varying proliferation capacity (Figure S6).Conservation of the hiPSC Expression Signature acrossIndependent Reprogramming ExperimentsNext, we determined if expression signatures observed betweenestablished hESC lines and hiPSCs from our lab also occur inreprogramming experiments by different labs to establishwhether these differences are shared among reprogrammedlines derived by various methods and labs. To this end, we performed a similar analysis as described above with data availablefrom other laboratories (NIH, Gene Expression Omnibus) andcompared the overlapping signatures with those signaturesderived from our early and late hiPSCs. In Maherali et al.(2008), neonatal foreskin fibroblasts were reprogrammed to theiPSC state by expressing OCT4, SOX2, KLF4, NANOG, and CMYC using tetracycline-inducible lentiviral vectors (Maheraliet al., 2008). Gene expression profiling from this experiment revealed 1653 genes at least 1.5-fold differentially expressedwhen comparing their hiPSCs to their hESCs (Figure 2). Of these,618 overlapped with the 3947 early-passage hiPSC signaturegenes found in our hiPSC clones (p 10 47; Figure 2).The same analysis was performed with data from Soldneret al., who reprogrammed dermal fibroblasts obtained frompatients with Parkinson’s disease using a single doxycyclineinducible lentivirus carrying either four (OCT4, SOX2, c-MYC,and KLF4) or three (OCT4, SOX2, and KLF4) reprogrammingfactors (Soldner et al., 2009). Importantly, in this study, thereprogramming factors were removed after establishment ofhiPSC lines because the viral sequences encoding the factorswere Cre-recombinase excisable. We found a 1.5-fold differential expression of 899 genes between their hiPSCs and theirhESCs before excision of the reprogramming factors (2loxhiPSCs). Of these genes, 329 overlapped with our early-passagehiPSC signature (p 10 22; Figure 2). Following Cre-mediateddepletion of the factors and subcloning of the iPSCs (1loxhiPSCs), 553 genes remained differentially expressed followingour criteria, and 222 of these genes overlapped our earlypassage hiPSCs (p 10 20).

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCsFigure 2. Differential Expression Patterns betweenhESC and hiPSC Are Conserved among IndependentReprogramming ExperimentsiPSC signature genes defined as genes differentially expressedbetween hiPSC and hESC lines (Student’s t test [p 0.05] andat least a 1.5-fold change) were obtained from Figure 1 of thisstudy (Chin et al.) and from additional published reprogramming experiments. The matrix summarizes the overlap of hiPSCsignature genes between the different experiments. The valueson the diagonal designate the total number of genes identifiedas significantly different in expression between hESCs and theindicated hiPSCs. The intersection of the rows and columnsgive the number of genes that are in common between thetwo respective experiments and the corresponding significance (p value) as determine using Fisher’s exact test. Forthe Soldner et al. experiment (Soldner et al., 2009), data wereanalyzed before (2lox) and after (1lox) excision of the reprogramming factors. In Yu et al. (2009), iPSCs were generatedwith episomal vectors and analyzed before (episomal) and aftersubcloning. Genomic integrations were not detected for any ofthese subclones. The Maherali iPSCs (Maherali et al., 2008)were reprogrammed with integrating lentiviruses.Yu et al. reprogrammed neonatal foreskin fibroblasts usingnonintegrating episomal vectors encoding OCT4, SOX2,NANOG, c-Myc, KLF4, LIN28, and SV40LT (episomal hiPSC)(Yu et al., 2009). Upon continuous passaging, the episomalvectors are lost and hiPSC subclones without any ectopic DNAcould be isolated (subcloned hiPSCs). An analysis of theirexpression data again revealed a set of genes that are differentially expressed between hESCs and hiPSCs and a highly significant overlap of these differentially expressed genes with thosefound differentially expressed between our hiPSCs and hESCs(Figure 2). This finding was particularly relevant not only becausethe Yu et al. lines never experienced integration, but alsobecause the combination of reprogramming factors useddiffered slightly from that in other reprogramming experiments.These analyses described the degree of similarity of differential expression between hESCs and hiPSCs generated in independent experiments. To determine the extent by which thesame genes are differentially regulated in the same directionamong independent experiments, a similar analysis was performed with the added requirement that direction of the expression change between hESCs and hiPSCs must be conservedin both experiments being compared. These data suggest thatmany of the genes shown in Figure 2 to be differentially expressed in multiple experiments were also changed in thesame direction (Figure S8).Further analysis to demonstrate the degree of overlap betweenany three hiPSC signatures also suggests a highly significantoverlap. Between the Chin, Maherali, and Soldner signatures,79 genes were shared (p 10 44); between the Chin, Maherali,and Yu signatures, 106 genes were shared (p 10 96); betweenChin, Soldner, and Yu, 48 genes were shared (p 10 34). Amongall the experiments of all four laboratories, 15 genes are differentially expressed between early-passage hiPSCs and hESCs (p 10 54; Table S5). The highly significant overlap between each ofall four of these completely independent reprogramming experiments suggests that the hiPSC state is not stochastic. Confirmingthis conclusion, a gene ontology analysis of the genes differentially expressed in the experiments from the four groups againsuggest that the signatures that arise in each reprogrammingexperiment share a functional similarity (Figure S9).Conservation of an iPSC Expression Signature betweenMouse and Human Reprogramming ExperimentsTo determine if the early hiPSC phenotype is specific to humanreprogramming or a general iPSC phenomenon, a comparisonof mouse iPSCs and ESCs was performed. Hierarchical clustering of mESCs and different miPSC cell lines that wereobtained in a fibroblast reprogramming experiment with retrovirally delivered Oct4, Sox2, Klf4, and c-Myc was performed. Thisanalysis demonstrated that, although highly similar, miPSCsand mESCs also differ in their expression (Figure 3A). As withthe human reprogramming data, the sample tree of the hierarchical clustering revealed that mESCs and miPSCs cluster separately. Specifically, 1388 genes significantly differ in expressionlevels between miPSCs and their embryonic equivalents asdetermined by a Student’s t test (p 0.05) and have at least a1.5-fold difference. Many of these genes are functionallyinvolved in transcriptional regulation and organ development(Figure S9B), as observed with human iPSCs signature genes.To further assess the coregulation of genes in mouse and humaniPSC reprogramming experiments, the subsequent analysis waslimited to only those with identifiable homologs between mouseand human transcriptomes (HomoloGene database Release 63).Twenty-nine percent of the trimmed-down miPSC signaturegenes were also differentially expressed in our early hiPSCs(p 10 7), suggesting that the iPSC state is remarkably robustacross species (Figure 3B; Table S6).Similar to our observation with the human iPSC signaturegenes, the majority of miPSC signature genes appeared to beESC-specific genes that were insufficiently induced andfibroblast-specific genes that were not repressed completely(Figure 3C). To determine whether differential regulation of targetgenes by the reprogramming factors themselves could drive thedifferential expression of genes between iPSCs and ESCs, wetested whether expression differences between miPSC andmESCs correlate with binding differences of the reprogrammingCell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc. 115

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCsFigure 3. Reprogramming of Human and Mouse Fibroblasts Results in Conserved Expression Differences between ESC and iPSC(A) Unsupervised hierarchical clustering of global gene expression data from mouse (m) ESCs, miPSCs, and mouse embryonic fibroblasts (MEFs) obtained fromthe Maherali et al. data set (Maherali et al., 2007). The lines included are biological replicates of V6.5 and E14 mESC lines and of 1D4 and 2DA iPSC lines,respectively. All log2 ratios are relative to averaged expression in mESCs.(B) Comparison of iPSC signatures between human and mouse reprogramming experiments. Human early-passage iPSC signature genes as defined in Figure 1Bof this study (Chin et al.) and mouse iPSC signature genes defined with the Student’s t test (p 0.05) and an at least 1.5-fold change between miPSCs and mESCswere further annotated to only include genes that have homologous partners across the two species according to the HomoloGene database, resulting in 2834genes for the early hiPSC signature and 1018 genes for the miPSC signature. The Venn diagram shows the overlap between these two groups of genes, andsignificance was determined using the Fisher’s exact test. The heatmap above displays the expression values for the 294 iPSC signature genes conservedbetween mouse and human reprogramming experiments across the different cell types and species, as log2 ratio relative to the average ESC expression foreach species.(C) Similar to Figure 1B0 , miPSC signature genes were divided into those expressed at higher levels in mESCs than miPSCs and vice versa. Each of these twogroups was further subclassified into two groups, either more highly expressed in mESCs than fibroblasts (red color) or those more highly expressed in fibroblaststhan hESCs (blue color).(D) The heatmap depicts the log2 expression ratio between mESCs and miPSCs for miPSC signature genes ordered according to decreasing ratios. For thesegenes, the binding strength ( log10pXbar) of each reprogramming factor (Oct4 [O], Klf4 [K], Sox2 [S], or c-Myc [C]; data obtained from Sridharan et al., 2009) inmiPSCs was subtracted from the binding strength in mESCs. Higher binding strength in mESCs is represented in yellow and in miPSCs in blue. The intensity of thecolors increases as differential binding increases. The Pearson correlation of the differential binding strength relative to differential gene expression is given in theattached table.116 Cell Stem Cell 5, 111–123, July 2, 2009 ª2009 Elsevier Inc.

Cell Stem CellExpression Signatures Distinguish iPSCs from hESCsfactors between the two cell types. This analysis took advantageof genome-wide location data of the target genes of c-Myc, Klf4,Sox2, and Oct4 proteins in the mESCs and miPSC lines thatwere used for the expression analysis described here (Sridharanet al., 2009). We previously reported that binding patterns of thereprogramming factors are highly similar in iPSCs and ESCs butthat subtle differences exist, which we did not analyze further.Reanalysis of these minor differences in binding demonstratesthat the promoter regions of those genes that are expressed ata higher level in mESCs than miPSCs are correlated withstronger binding by each of the reprogramming factors, particularly by c-Myc and Klf4 (Figure 3D). Conversely, the promoterregions of those genes that are expressed at a higher level inmiPSCs are correlated with stronger binding by the reprogramming factors in miPSCs (Figure 3D).To determine whether the iPSC signature is specific to reprogramming with fibroblasts as opposed to other cell types, thistype of analysis was extended to iPSCs generated from mouseB cells by a different lab (Mikkelsen et al., 2008). As was shownwith fibroblast-derived miPSCs, miPSC lines made from B cellsalso display a common group of genes differentially expressedcompared to mESC lines (Figure S10). The high degree of overlap of B cell miPSC signatures with fibroblast miPSC linessuggests that iPSC gene expression signatures arise regardlessof the cell type of origin (522 genes, p 10 43). Furthermore,a significant portion of the B cell miPSC signature genes arealso found to be differentially expressed in our early-passagehuman iPSCs (729 genes, p 10 4). Taken together, early iPSCspossess a conserved gene expression signature that is sharedregardless of the lab of origin, species, or cell type from whichthey were derived.Global Reprogramming of Histone H3K27 PromoterMethylation in Late-Passage hiPSCsPerhaps as intriguing as the finding that all early hiPSCs appearto share a common gene expression signature that sets themapart from hESCs is the fact that this signature disappears afterextended culturing, albeit not completely. To further define at themolecular level how similar late-passage hiPSCs are to hESCs atsimilar passage, the state of histone H3 lysine 27 (K27) trimethylation was analyzed, since to date the genome-wide chromatinstructure of hiPSCs has not been probed. This chromatin modi

Cell Stem Cell Resource Induced Pluripotent Stem Cells and Embryonic Stem Cells Are Distinguished by Gene Expression Signatures Mark H. Chin,1 Mike J. Mason,1,8 Wei Xie,1 Stefano Volinia,10 Mike Singer,11 Cory Peterson,3 Gayane Ambartsumyan,2 Otaren Aimiuwu, 2Laura Richter, Jin Zhang,4 Ivan Khvorostov,4 Vanessa Ott,11 Michael Grunstein,1 Neta Lavon,9 Nissim Benvenisty,9 Carlo M. Croce,10 .