Gene regulatory network underlying endomesoderm specification in S. purpuratus embryos: Many of the individual projects reported below contribute to the understanding of this GRN. At present, over 60 regulatory and signaling genes have been linked into this network. The architecture of the network is emerging from an interdisciplinary approach in which high resolution spatial and temporal regulatory gene expression data are combined with perturbation data obtained by gene expression knockouts and with the results of cis-regulatory analysis, to provide a causal explanation of the observed embryology. A model of the GRN through time is emerging which indicates the inputs and outputs of the cis-regulatory elements at its key nodes. This model essentially provides the genomic regulatory code for specification of the endomesodermal territories of the embryo, up to gastrula stage (30 h after fertilization). This year we published the dynamic mechanism by which the endoderm/mesoderm fate decision is made, as well as that by which future anterior vs. posterior endoderm is specified. As of late 2011, the pre-gastrular skeletogenic lineage GRN and the endodermal GRNs are largely solved. The endodermal GRN project is now focused on specification during development of the post-gastrular gut, which consists of many distinct regions (foregut, midgut, hindgut, sphincters, blastopore/anus region). The initial major effort is to achieve a comprehensive determination of the dynamic regulatory states of these regions. Within the next year the pre-gastrular oral and aboral mesodermal GRNs, which produce different mesodermal cell types, will have been brought to a level of completeness similar to that of the pre-gastrular endodermal GRNs, including a complete analysis of mesodermal Notch target genes. (endoderm: Dr. Isabelle Peter, Jonathan Valencia, Miao Cui, Jina Yun, Natnaree Siriwon; mesoderm: Dr. Andrew Ransick, Dr. Stefan Materna*) *Graduated 2011

Dynamic Boolean model of endomesoderm gene regulatory network: We have constructed a dynamic model representing the control system operative in life, such that the regulatory response capabilities of each gene in the endomesoderm GRN are formalized in a vector equation indicating the inputs and logic processing functions executed by the relevant genomic cis-regulatory module(s). The vector equations encompass all the regulatory interrelations stated explicitly in the GRNs, and the model as a whole provides a direct test of the overall completeness of the experimental analysis underlying the GRN. Original strategies were devised for incorporating signaling interactions, embryonic geometry, and lineage. A wholly novel computational and graphic display apparatus was created to support model operations. At each hourly step the output of every gene in the model (if any) is computed from the inputs available at that time, for each endomesodermal spatial domain (skeletogenic, oral and aboral mesoderm, anterior and posterior endoderm); thus the model computes the dynamically changing regulatory states of the embryo. The relation between real time and change in transcriptional status had been calculated for sea urchin embryos earlier, in a first principles kinetic model (Bolouri and Davidson, PNAS, 2003), and these kinetics were applied to the temporal animation of the Boolean model. The results thus far are as follows: i, The model perfectly predicts the observed spatial domain of expression of each gene throughout the endomesodermal domains. ii, The model recreates the temporal dynamics directly observed for the spatial patterns of expression of almost all genes, with a few exceptions; thus the model demonstrates by direct comparison between model output and observation that the GRNs are essentially complete (the oral and aboral GRNs only up to 18 h, the remainder to 30 h). iii, The model immediately pinpoints exactly where gaps in our knowledge remain. iv, The model can be used for in silico perturbation, simulating gene knockouts and experimental embryology; we have shown that it almost perfectly predicts the regulatory changes occasioned by certain gene over-expressions and gene knockouts, and even recreates the regulatory results of a famous experiment in which transplantation of early cleavage skeletogenic cells from the vegetal to the animal pole produces a second perfectly organized endomesoderm. (Dr. Isabelle Peter, Dr. Emmanuel Faure, Eric Davidson)
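
The stepwise logic described above can be illustrated with a minimal Python sketch of a synchronous Boolean update. This is not the laboratory's model: the four gene names, their logic rules, the two spatial domains, and the knockout mechanism are all hypothetical, chosen only to show how per-gene logic functions, hourly steps, separate domains, and in silico knockouts fit together.

```python
from typing import Callable, Dict, FrozenSet, List

State = Dict[str, bool]
Rule = Callable[[State], bool]

# Hypothetical toy circuit; gene names and logic are illustrative only.
RULES: Dict[str, Rule] = {
    "signal": lambda s: s["signal"],                       # maintained input
    "activator": lambda s: s["signal"] or s["activator"],  # locks on by feedback
    "repressor": lambda s: s["repressor"],                 # domain-specific input
    "target": lambda s: s["activator"] and not s["repressor"],
}


def step(state: State, rules: Dict[str, Rule],
         knockouts: FrozenSet[str] = frozenset()) -> State:
    """Compute every gene's next output from the current inputs (one step)."""
    new_state = {gene: rule(state) for gene, rule in rules.items()}
    for gene in knockouts:  # a knocked-out gene never produces output
        new_state[gene] = False
    return new_state


def simulate(initial: State, rules: Dict[str, Rule], hours: int,
             knockouts: FrozenSet[str] = frozenset()) -> List[State]:
    """Return the regulatory state at hour 0, 1, ..., hours."""
    state = dict(initial)
    for gene in knockouts:
        state[gene] = False
    trajectory = [state]
    for _ in range(hours):
        state = step(state, rules, knockouts)
        trajectory.append(state)
    return trajectory


# Two hypothetical spatial domains that differ only in their initial inputs.
DOMAINS: Dict[str, State] = {
    "domain_A": {"signal": True, "activator": False,
                 "repressor": False, "target": False},
    "domain_B": {"signal": True, "activator": False,
                 "repressor": True, "target": False},
}

if __name__ == "__main__":
    for name, initial in DOMAINS.items():
        on_hours = [h for h, s in enumerate(simulate(initial, RULES, 4))
                    if s["target"]]
        print(name, "target on at hours:", on_hours)
```

In this toy circuit the target gene switches on only in the domain lacking the repressor, and passing `knockouts=frozenset({"activator"})` to `simulate` abolishes its expression there, which is the kind of comparison against perturbation data that the text describes.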

Oral and aboral ectoderm GRNs: In an effort to extend GRN analysis to most of the domains of the embryo, we are working out the GRNs for oral and aboral ectoderm specification, including about 50 more regulatory genes (the one remaining major territory, the apical neurogenic region, is being studied in other sea urchin laboratories). The ectoderm is a complex mosaic of spatial regulatory states. Both the aboral and oral ectoderms produce several sub-regional regulatory state domains, and they are separated by another territory with its own regulatory state, the neurogenic ciliated band. A very large amount of spatial expression analysis has been required to complete the roster of regulatory genes expressed in the ectoderm, and to unravel the constituent regulatory genes of the ectodermal domains abutting the endoderm, the remaining oral and aboral epithelia, the mouth region on the oral side, and the ciliated band. Complex inter- and intra-domain signaling events must also be taken into account. Based on extensive perturbation analyses and cis-regulatory data, GRNs are emerging that will soon approach the completeness of the endomesodermal GRNs. These GRNs will then be used to expand the predictive dynamic Boolean model to nearly the whole embryo. (Dr. Enhu Li, Dr. Smadar Ben-Tabou deLeon, Dr. Julius Barsi)

High throughput cis-regulatory analysis, and its impact on GRN analysis: A recent technological breakthrough is revolutionizing the processes of GRN validation and discovery, as well as vastly improving the efficiency with which cis-regulatory control systems can be analyzed. This is the development of multiplexed cis-regulatory analysis using vectors marked with "bar-coded" sequence tags ("nanotags"), up to 130 of which can be injected together into a single batch of sea urchin eggs. The individual vectors are regulated independently in vivo and their outputs can be de-convolved at once by NanoString technology. To this end a NanoString codeset was designed to recognize the barcode tag sequences. Expression of individual vectors can also be examined spatially since each vector also expresses GFP. The effects of perturbations of gene expression can now be determined at the same time on endogenous genes and on their cis-regulatory systems isolated by high throughput functional genomic scans. The uses of cis-regulatory nanotag technology include: i, scans of very large genomic regions to find cis-regulatory sequences, in which large numbers of constructs can be assessed together; ii, time course measurements of quantitative output of >100 diverse cis-regulatory constructs at once; iii, effects of perturbations of regulatory state on large numbers of cis-regulatory constructs; iv, analysis of >100 mutant constructs in single experiments. (Dr. Jongmin Nam, Ping Dong)
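
As a rough illustration of what de-convolving a pooled barcode readout involves, the Python sketch below maps per-tag counts back to individual constructs. It is only a sketch under stated assumptions: the tag names, the construct names, the use of negative-control tags, and the simple background subtraction are all hypothetical, and none of them is taken from the laboratory's actual analysis pipeline or from NanoString software.

```python
from statistics import mean
from typing import Dict, Iterable


def deconvolve(counts: Dict[str, float],
               tag_to_construct: Dict[str, str],
               negative_tags: Iterable[str]) -> Dict[str, float]:
    """Assign pooled barcode counts to constructs.

    Assumption: background is estimated as the mean count of hypothetical
    negative-control tags and subtracted, with the result floored at zero.
    """
    background = mean(counts[tag] for tag in negative_tags)
    signal: Dict[str, float] = {}
    for tag, construct in tag_to_construct.items():
        signal[construct] = max(counts.get(tag, 0.0) - background, 0.0)
    return signal


if __name__ == "__main__":
    # Hypothetical readout from one pooled injection at one time point.
    counts = {"tag01": 520.0, "tag02": 35.0, "neg1": 30.0, "neg2": 50.0}
    tag_to_construct = {"tag01": "construct_A", "tag02": "construct_B"}
    print(deconvolve(counts, tag_to_construct, ["neg1", "neg2"]))
```

Running the same function on readouts from successive time points, or from perturbed and control embryos, would give the per-construct time courses and perturbation responses listed among the uses above.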

Specific cis-regulatory projects using high throughput methods: Cis-regulatory systems at certain GRN nodes are of particular importance, and many of these are the subjects of particular experimental analysis. During this year the cis-regulatory systems of a number of genes were studied at the level of their sequence specific inputs and their functional meanings (some of these projects are now complete and have been or will soon be published). Among the genes currently or recently thus characterized are alx1, foxa, brachyury, gcm, hnf6, tbx2/3, dlx, hox11/13b. (Dr. Smadar Ben-Tabou deLeon, Dr. Julius Barsi, Dr. R. Andrew Cameron, Dr. Andrew Ransick; Dr. Sagar Damle*, Miao Cui)

Embryonic transcriptome database and analysis: We have embarked on a large-scale S. purpuratus transcriptome sequencing and analysis effort. RNAs from 10 timed embryonic stages, from various feeding larval stages, and from all accessible adult tissues have been sequenced in depth and assembled, and quantitative per transcript databases are under construction. Three valuable kinds of data have been obtained and after computational analysis are being mounted on our public sea urchin genomics database: i, In over one third of cases we were able to correct erroneous gene models in the genome sequence, replacing the a priori predictions with the actual mRNA structure(s); we also added several thousand new genes to the genome annotation; and we verified the remaining gene model predictions. The transcriptome data have vastly improved the usefulness and accuracy of the genome sequence. ii, The sets of transcripts expressed in each stage and tissue have been discovered, and classified in terms of a custom ontology of our own construction. This ontology reflects the classes of particular interest to the research community to which we belong and which we serve, such as transcripts coding for immune proteins, for cytoskeletal proteins, for transcription factors, for signaling factors, for biomineralization proteins, etc. The ontological classes were based on the expert annotations of genes in the S. purpuratus genome project. iii, We now possess global data on dynamic changes in prevalence of given transcripts during development and on their absolute levels. These values agree well with NanoString and QPCR measurements in most cases. This data set contains a wealth of data of biological interest. As one example, the egg transcriptome provides a comprehensive definition of maternal mRNA (first discovered in sea urchin eggs) both qualitatively and quantitatively, a subject now revisited for the first time in 30 years. (Dr. Qiang Tu, Dr. R. Andrew Cameron, Eric Davidson)

Physical isolation of embryonic cells expressing given regulatory states: Another technological breakthrough has been the development of methods for disaggregation of sea urchin embryos to the single cell level, and efficient FACS sorting, without significant loss of cells or reduction of viability. The cells are sorted on the basis of expression of recombineered BAC vectors, in which a fluorophore is expressed under control of the cis-regulatory system of a gene canonically representing a given domain-specific regulatory state. Recoveries of expressing cells are quite acceptable, and controls show that the procedure does not affect the distribution of transcripts. The availability of this technology leads in two different directions: First, it will allow us to characterize the transcriptomes of many developmental compartments at different times, including complete knowledge of differentially expressed regulatory genes. This is the primary requirement for extension of GRN analysis to later and more complex developmental stages, a major near future laboratory objective. Second, we can obtain the transcriptomes of cells expressing given regulatory states. For example, in skeletogenic cells isolated on the basis of expression of two different specifically expressed BACs, all known biomineralization gene transcripts are enriched, and other effector genes expressed specifically in these cells can now be accessed. This in turn will lead to construction of "Global GRNs" in which the control systems of all specifically expressed downstream genes (of given ontological classes) are discovered and linked into our current upstream GRNs. (Dr. Julius Barsi, Dr. Qiang Tu, Erika Vielmas)
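
One common way to express enrichment of transcripts in a sorted cell population is a log2 ratio against a reference sample. The Python sketch below is an assumption about how such a comparison could be set up, not a description of the laboratory's analysis: the gene names, the counts, the choice of unsorted cells as reference, and the pseudocount are all hypothetical, and the counts are assumed to be already normalized for sequencing depth.

```python
import math
from typing import Dict


def log2_enrichment(sorted_counts: Dict[str, float],
                    reference_counts: Dict[str, float],
                    pseudocount: float = 1.0) -> Dict[str, float]:
    """Log2 ratio of normalized counts in sorted cells vs. a reference.

    Only genes present in both samples are compared; the pseudocount
    (an assumption of this sketch) avoids division by zero.
    """
    shared = sorted_counts.keys() & reference_counts.keys()
    return {
        gene: math.log2((sorted_counts[gene] + pseudocount)
                        / (reference_counts[gene] + pseudocount))
        for gene in shared
    }


if __name__ == "__main__":
    # Hypothetical normalized counts; gene names are placeholders.
    sorted_cells = {"gene_x": 799.0, "gene_y": 99.0}
    unsorted_cells = {"gene_x": 99.0, "gene_y": 99.0}
    for gene, value in sorted(log2_enrichment(sorted_cells,
                                              unsorted_cells).items()):
        print(gene, round(value, 2))
```

A positive value marks a transcript as over-represented in the sorted population, which is how candidate effector genes specific to a regulatory state could be shortlisted under these assumptions.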

Evolutionary co-option at the cis-regulatory level: The major mechanism of evolutionary change in GRN structure is co-option of regulatory and signaling genes to expression in new spatial/temporal domains of the developing organism. This means change of cis-regulatory modules at the sequence level, so that they respond to different regulatory states; or alternatively, changes in the cis-regulatory modules of genes encoding the spatial allocation of regulatory states. An excellent example is the use of Delta-Notch signaling to promote mesoderm specification in sea urchins, but to promote endoderm specification in sea stars (the sea urchin mode is the derived co-option). Sea stars and sea urchins shared a last common ancestor about 500 million years ago. To determine what happened in the lineage leading to sea urchins, we are carrying out a cis-regulatory study of sea star delta, for comparison to sea urchin delta, including cross-species transfer of expression constructs. Current results show that although delta is expressed quite differently in sea stars, a cis-regulatory module of sea star delta produces expression in sea urchin skeletogenic lineages, though no such lineage exists in sea stars. (Dr. Feng Gao)

Eucidaris tribuloides, an evolutionary window on the origins of the euechinoid endomesoderm specification GRN: The euechinoids are the modern sea urchins, of which the main research model is S. purpuratus, for the last 40 years our laboratory workhorse. The euechinoids diverged from the Paleozoic precursor echinoid lineage about 275 million years ago. Eucidaris tribuloides is a descendant of the other surviving branch of echinoids deriving from the same common ancestor stock. Its endomesodermal specification process is quite different from that of S. purpuratus; for example, it lacks a precociously invaginating skeletogenic micromere lineage altogether. Current results show the endodermal specification functions of E. tribuloides are similar to those of S. purpuratus, but its mesodermal specification is different in several respects. Its micromeres produce delta signals as do those of S. purpuratus, but control of their specification is differently wired, and they fail to express key skeletogenic genes in pre-gastrular development. The plesiomorphic specification GRN of E. tribuloides mesoderm will reveal exactly how that of S. purpuratus evolved since divergence. In addition, we are attempting to reprogram the development of the skeletogenic cell lineage in E. tribuloides, by inserting regulatory apparatus from S. purpuratus. We term this Synthetic Experimental Evolution. (Eric Erckenbrack)

New genomics projects: A large amount of additional echinoderm sequence is in the process of being obtained. The leaders in this project are Richard Gibbs and Kim Worley at the Baylor College of Medicine Human Genome Sequencing Center (BCM-HGSC) in Houston, in close collaboration with us. An initial draft sequence of the genome of Lytechinus variegatus has been obtained, and the genomes of the sea star referred to above, Patiria miniata, and of E. tribuloides are in the process of being sequenced. Much additional genome sequence of S. purpuratus has also been obtained, so as to significantly improve its quality; and earlier skim sequences of two congeners, S. franciscanus and Allocentrotus (Strongylocentrotus) fragilis, have been augmented. All of these data are being curated and mounted on the public genome databases that we maintain and continuously augment. (BCM-HGSC, R. Andrew Cameron, Eric Davidson)

Additional research endeavors:
Principles of developmental GRN design. We are formulating a general view of developmental GRN structure, and its implication for development and evolution: (i) Developmental GRNs are deeply hierarchical; (ii) they are composed of modular subcircuits executing discrete logic functions; (iii) these subcircuits evolve at different rates within the same GRN and may have diverse evolutionary origins; (iv) multiple subcircuits are brought to bear on given developmental processes, including dynamic lockdowns by feedback circuitry, to ensure that they function accurately and resiliently: the "wiring" is clearly not parsimonious in design; (v) different processes, e.g., embryonic spatial specification, terminal differentiation, physiological response, are controlled by differently structured GRNs, which have different depths and are composed of different types of subcircuit; (vi) GRN structure provides specific guides to processes by which body plans have evolved. (Dr. Isabelle Peter, Eric Davidson)
BioTapestry. The GRN visualization software BioTapestry, developed by our collaborator Wm. Longabaugh (Institute for Systems Biology), is now in wide use, and we are further expanding its capacities so that it will automatically generate allowed network architectures from machine readable data on the time and place of expression plus results of perturbation analysis. A second-generation version with much enhanced capacities has been published. (Wm. Longabaugh, Eric Davidson)
Recombineered BACs. Our BAC libraries have provided the source material for in vitro recombineered BACs used by the outside research community as well as ourselves. More than 100 different recombinant BACs from five echinoderm species have been constructed for use as reporter constructs, with the use of our own in-house sequencing instrumentation. These include constructs in which a fluorescent protein coding region (GFP, RFP, mCherry, Cerulean) has been inserted into the coding region of a gene of interest, as well as numerous constructs in which cis-regulatory modules (CRMs) have been deleted or specifically mutated. (Julie Hahn, Ping Dong, Miki Jun)
Dynamic imaging of regulatory state clones. A method was perfected to allow periodic confocal imaging of immobilized embryos for many hours during development, and was used to track clones of cells expressing given BAC constructs. A main result was the distinct behavior during gastrulation of veg2 cells expressing foxa, which execute convergent extension to build the archenteron after 30 h, vs. veg1 cells expressing evenskipped, which much later invaginate as a coherent cone to generate the hindgut. The immobilization method is also being used to create a standardized canonical digital embryo through time, on which gene expression patterns and ultimately network circuitry can be mounted. (Dr. Emmanuel Faure, Dr. Isabelle Peter, Dr. Mat Barnett*)
Re-engineering the GRN to achieve a predicted cell fate switch. We used recombineered BAC vectors to place a gene encoding an upstream regulator of pigment cell specification under control of a skeletogenic regulatory system. On introduction into the embryo this BAC causes skeletogenic cells to switch fate and become differentiated pigment cells. Some cells adopt a mixed fate but many undergo complete or nearly complete transformation to the pigment cell state. We observed a cryptic repression function which downregulates the skeletogenic GRN in the transformation process. (Sagar Damle*)