List of Research Projects
|
Computational analysis of the emerging sea urchin genome sequence.
R. Andrew Cameron, Kevin Berney, C. Titus Brown, Ian Lipsky
The whole genome shotgun sequence for the purple sea urchin, produced and posted by the Human Genome Sequencing Center at the Baylor College of Medicine with support from NHGRI, has reached 8 million traces. The Sea Urchin Genome Project, housed in the Center for Computational Regulatory Genomics, provided the materials for this project, including libraries. The sequences are currently being assembled, and the resulting sequence scaffolds are expected to average 20,000-30,000 bp. The computational arm of the Center will continue to provide simple analyses in order to facilitate gene discovery and future annotation of the genomic sequences for the broader sea urchin community. We have developed a data pipeline to collect, assemble and analyze trace sequences collected from searches against small candidate gene databases. In addition to the transcription factor gene collection produced in a parallel effort, we have produced gene lists for putative innate immunity genes and sex determination genes. The computational staff of the Center is also continuously conducting other analyses on the available trace sequences. Quality control and distribution measurements of the trace collection are derived from comparison to previously collected sequence data sets, viz. the BAC-end STC sequences determined at the High-throughput Sequencing Center at the University of Washington and posted on the website. The Sea Urchin Genome Project web site (http://sugp.caltech.edu/) serves as the information exchange site for these projects.
|
The conservation of cis-regulatory information in lower deuterostomes.
R. Andrew Cameron, Jonathan P. Rast, Ping Dong, Julie Hahn, Jane Wyllie
Sequences conserved between the genomes of two species can be presumed to reflect a functional significance. The evolutionary distances at which informative conserved elements emerge from the background are poorly understood. We have shown that in sea urchins a divergence of 50 million years provides adequate change for conservation of sequence to be visible in many of the examples so far examined. Often these conserved sequence regions possess cis-regulatory function and current results show that this approach may yield a more than 10-fold increase in rate of experimental cis-regulatory element discovery, compared to the most efficient "blind" search methods. We have launched a project to explore the rules for efficient cis-regulatory sequence prediction by interspecific sequence analysis. The work is focused on 20 different gene candidates whose expression pattern, regulatory inputs and downstream targets are well characterized in the purple sea urchin. The orthologous genes come from several different echinoderm species that display a range of phylogenetic relatedness. We have constructed BAC libraries for 4 species of sea urchins, and for more distant comparisons, a sea star and a hemichordate. We are currently screening the libraries to obtain BACs containing the candidate genes. As the BACs are identified they are put in the pipeline for sequencing. The BACs are characterized and a reporter coding region inserted into the gene using a bacterial recombination system. Gene transfer is used to determine the spatial and temporal expression pattern compared to the endogenous expression. Using the BAC sequences putative cis-regulatory regions indicated by interspecific sequence comparison at diverse distances are identified. The BAC reporter construct is then manipulated in order to test the function of these regions by gene transfer. A method that uses the BAC reporter for sequencing is under investigation as well. 
This will shortcut the sequencing procedure since only the immediate, useful gene region will be sequenced. We expect that this approach will reveal rules for computational cis-regulatory analysis while extending the current repertoire of BAC libraries, improving computational tools, and generating more efficient laboratory methods for this essential research area.
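The interspecific comparison described above can be illustrated with a minimal sketch. It is not the analysis software used in this project; it simply assumes two sequences that have already been aligned to the same length (gaps written as "-") and reports which fixed-size windows exceed an identity threshold. The function names, the window size and the threshold are all hypothetical choices made for illustration.

```python
def windowed_identity(seq_a, seq_b, window=20):
    """Return (start, fraction_identical) for every window of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        # A position counts as conserved only if both bases agree and neither is a gap.
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        results.append((start, matches / window))
    return results


def conserved_windows(seq_a, seq_b, window=20, min_identity=0.9):
    """Return the start positions of windows whose identity meets the threshold."""
    return [
        start
        for start, identity in windowed_identity(seq_a, seq_b, window)
        if identity >= min_identity
    ]
```

For example, `conserved_windows("ACGTACGTAC", "ACGTTTTTAC", window=4, min_identity=1.0)` keeps only the window starting at position 0, where all four bases agree.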
|
A microsatellite-based physical map of the purple sea urchin genome.
R. Andrew Cameron, Kevin Berney, Elly Chow, Autumn Yuan, Eve Helguero, Ochan Otim
The sea urchin genome is the first large animal genome to be sequenced for which no physical map is available. However, mapping data will be required in order to enhance the quality of the genome sequence assembly. To produce a linkage map, we will extend our previously published microsatellite mapping strategy using animals of known relationship from our inbred lines. Genotyping by microsatellite markers will yield two kinds of evidence that address two related problems: clustering markers in linkage groups, i.e., chromosomes; and determining the order of the markers within each chromosome. In preparation for the genotyping of single embryos resulting from crosses in the pedigreed lines, we have scaled down the PCR reactions to use extracts of single embryos at about 200-300 cells, or less than one nanogram of template DNA. In practice, the steps of this project are:
1) Design primer pairs computationally. Using the more than 8 million purple sea urchin whole genome traces, we have pursued a computational strategy to identify useful primer sets, taking into account sequence quality and primer frequency. We have tested this algorithm and demonstrated that it produces effective primers that amplify efficiently under standard conditions.
2) Test each primer pair using unlabeled primers and three genomes: two from the inbred animals and a third from the animal being sequenced. Considering the high degree of genomic polymorphism (4-5%) present in purple sea urchins, these latter tests are crucial to the design of a useful panel of microsatellite markers. Indeed, about one third of the primer sets display polymorphism in this screen.
3) Measure the fragment size on an AB 3730 Sequencer, using GeneMapper software. There are 27 DNA samples from the pedigreed animals, which will total about 120 96-well plates of reactions for this phase of the analysis. Segregation behavior of these polymorphic variants will define linkage groups.
4) Generate crosses among the pedigreed lines to genotype with linkage group markers in order to define the order of markers in each group. About 1600 embryos from each cross should produce sufficient recombination to determine order. At this density of observation we will have a fair chance of detecting as little as 5-10% recombination, since both of the reciprocals will be present. This will suffice to provide at least ordered groups of markers within each chromosome.
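To give a feel for the numbers in step 4, the following sketch estimates a recombination fraction between two markers from genotyped embryos, together with its binomial standard error. It is an illustration only, assuming each embryo can be scored simply as recombinant or non-recombinant for a marker pair; the function name and the example counts are hypothetical and are not results from this project.

```python
import math


def recombination_fraction(recombinants, total):
    """Estimate the recombination fraction and its binomial standard error."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    r = recombinants / total
    # Standard error of a binomial proportion: sqrt(r * (1 - r) / n).
    se = math.sqrt(r * (1 - r) / total)
    return r, se


# Hypothetical example: 80 recombinant embryos among 1600 scored.
r, se = recombination_fraction(80, 1600)
print(f"r = {r:.3f} +/- {se:.4f}")
```

Under these assumptions, a true recombination fraction of 5% scored in 1600 embryos would be estimated with a standard error of roughly half a percentage point.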
|
Understanding gene regulation through motif analysis
Meredith Howard
The genetic program directing the development of a single-cell fertilized egg into a sea urchin embryo is encoded in the organism's genomic DNA. The essence of this program is a network of genes encoding transcription factors and the cis-regulatory modules controlling the expression of those genes. Each module can receive multiple inputs at multiple sequence-specific target sites for other transcription factors in the network, and these signals are integrated into a single output, resulting in the gene being turned on or off in different areas of the developing organism at different points in time. Understanding the developmental process therefore requires finding the functional linkages of the network: connecting the output of regulatory genes to the genomic target sites to which those products bind to activate further rounds of specification. This task is made challenging by our incomplete understanding of how transcription factors discern functional target sites from the vast population of non-functional sites with the same sequence in the genome.
In order to study how functional target sites within cis-regulatory modules differ from non-functional sites scattered throughout the genome, we propose assembling a target site database of transcription factors known to be active in the gene regulatory network (GRN) that controls early development of the sea urchin. This data will then be incorporated into a cis-regulatory prediction algorithm which maps onto a sequence the most probable positions of binding sites and background. Possible cis-regulatory modules can then be identified by choosing a window size and threshold of binding sites per window.
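The window-and-threshold step described above can be sketched as follows. This is a minimal illustration, not the proposed prediction algorithm: it assumes the binding-site positions have already been predicted, and it simply reports intervals where at least a chosen number of sites fall within a chosen window. The function name and the default window and threshold values are hypothetical.

```python
def candidate_modules(site_positions, window=500, min_sites=4):
    """Return merged (start, end) intervals holding >= min_sites sites within `window` bp."""
    positions = sorted(site_positions)
    modules = []
    left = 0
    for right, pos in enumerate(positions):
        # Shrink the window from the left until it spans fewer than `window` bp.
        while pos - positions[left] >= window:
            left += 1
        if right - left + 1 >= min_sites:
            start, end = positions[left], pos
            if modules and start <= modules[-1][1]:
                # Overlaps the previous interval, so extend it instead of adding a new one.
                modules[-1] = (modules[-1][0], end)
            else:
                modules.append((start, end))
    return modules
```

With sites at 100, 150, 220, 300, 5000, 5100, 9000, 9050, 9100 and 9400 and the default settings, the sketch reports two candidate intervals, (100, 300) and (9000, 9400); the isolated pair near 5000 falls below the threshold.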
Since the quality of data available on transcription factor binding sites in the literature is highly variable, the SELEX method (systematic evolution of ligands by exponential enrichment) will be used to generate libraries of sequences which bind specifically to S. purpuratus endomesodermal transcription factors. To this end, a set of proteins has been cloned as fusions to both a polyhistidine tag, for purification purposes, and a V5 epitope, to allow SELEX immunoprecipitation. The proteins SpOtx, SpEve, SpGataE, SpFoxA, SpFoxB, SpBra, SpGataC, SpTbr, SpGcm, and SpPmar1 have all been successfully cloned and purified. Once the SELEX methodology has been verified using the well-characterized sea urchin protein SpP3A2, we will experimentally determine the binding sites for these ten proteins. These data will not only deepen our understanding of the current model, but also help us extend the model by identifying co-regulated genes among the set of transcription factors recently identified from the sea urchin genome and characterized by expression pattern.
|
A genome-wide survey of sea urchin transcription factors
Meredith Howard, Lili Chen, Stefan Materna, C. Titus Brown, and R. Andrew Cameron
The sea urchin genome sequencing project now underway has given us the opportunity to do a definitive survey of transcription factors involved in the organism's development. The goal is to characterize when and where these genes are expressed so that they may be incorporated into the current Strongylocentrotus purpuratus endomesodermal gene regulatory network (GRN), filling in any missing connections and rendering the model complete.
The project has proceeded in several phases. The first step was to obtain a non-redundant set of sea urchin transcription factors from among the unassembled traces of the genome. This was accomplished by a tblastn search of a set of known transcription factors against the collection of genome traces to fish out any putative transcription factor sequence. The resulting traces were then sorted into bins by the reverse blastx search, such that each trace was associated with its most likely homolog. Finally, small-scale assembly was done within each bin to limit redundancy among merges. In all, more than 250 previously unknown sea urchin proteins were identified, including members of the homeobox, sox, ets, zinc finger, nuclear receptor, forkhead, bHLH, and bzipper families. The zinc finger family presented a particular challenge in that the inherent similarity of all zinc fingers made it nearly impossible to unambiguously assemble individual traces into merges or associate the various exons of individual proteins. Hence to date only the most easily distinguishable zinc finger proteins have been studied, though work is underway to resolve this problem.
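The binning step can be pictured with a short sketch. It is not the pipeline used for this survey; it assumes the reverse blastx results have already been parsed into (trace, protein, score) tuples, and it assigns each trace to the protein giving its highest score. The function name, the tuple layout and the example identifiers are hypothetical.

```python
from collections import defaultdict


def bin_traces_by_best_hit(hits):
    """Group trace IDs by their best-scoring protein hit.

    `hits` is an iterable of (trace_id, protein_id, score) tuples.
    """
    best = {}
    for trace_id, protein_id, score in hits:
        # Keep only the highest-scoring hit seen so far for each trace.
        if trace_id not in best or score > best[trace_id][1]:
            best[trace_id] = (protein_id, score)
    bins = defaultdict(list)
    for trace_id, (protein_id, _score) in best.items():
        bins[protein_id].append(trace_id)
    return {protein: sorted(traces) for protein, traces in bins.items()}


# Hypothetical example hits: trace t1 matches two proteins and goes to the better one.
example_hits = [
    ("t1", "FoxA", 120.0),
    ("t1", "FoxB", 95.0),
    ("t2", "FoxB", 80.0),
    ("t3", "FoxA", 60.0),
]
print(bin_traces_by_best_hit(example_hits))
```

Each resulting bin would then be the input to the small-scale assembly described above.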
In the next stage, QPCR time courses were obtained to map the expression of each gene from the unfertilized egg up to 48 hours of development. From the data, it is apparent that the vast majority of transcription factors, 86%, are used at least once during these 48 hours. In addition, new transcription factors are activated at a nearly steady rate throughout this time period irrespective of gene family, indicating that the spatial complexity of the embryo is being steadily elaborated through rounds of specification even before visible morphological evidence of this appears. In addition, the data indicate that 99.6% of maternally expressed genes are later expressed by the embryo, further illustrating the economical usage of regulatory proteins in development.
Currently, whole mount in situ hybridization (WMISH) studies are underway to identify the spatial expression domains of transcription factors during development. To date, we have located the expression of previously unknown genes to all territories of the embryo, including the vegetal plate, primary mesenchyme cells, the archenteron, and both the oral and aboral ectoderm. This information, along with perturbation analyses of currently known genes, will facilitate placing the newly found transcription factors into the endomesodermal GRN being constructed in the lab.
|
Transcriptional control of the sea urchin brachyury gene.
R. Andrew Cameron, William Chiu, Jane Wyllie, Elly Chow
The brachyury gene is a participant in the endomesoderm specification pathway and the founding member of the T-box family of transcription factors. Gene expression is localized to the vegetal plate, as seen by in situ hybridization in the blastula stage. By the gastrula stage, transcripts are present in the oral ectoderm and in the region of the blastopore. Expression then subsides and increases again during the larval stage. Previously we had identified two sequence fragments that recapitulate the temporal and spatial extent of this pattern: a 4 kb region just 5-prime of and including the transcription start site, and a region of about 1800 bp that occupies most of the intron between the 6th and 7th exons of this transcription unit. In the process of analyzing these two sub-regions we identified another sequence tract that lies 5-prime of the ubiquitous enhancer near the transcription start site and that contains a sequence identical to the Smad inhibiting protein (SIP) binding site found in the upstream region of the Xenopus 'blimp' gene, an ortholog of sea urchin brachyury. The activity of this site is under investigation. We have now narrowed the basal promoter of the brachyury gene to a 50 bp sequence containing a TATA box and lying just 5-prime of the start of transcription. An artificial construct containing these two elements, the intron sequence and the basal promoter, is being used to test the inputs to brachyury identified by Q-PCR experiments with other members of the endomesoderm specification gene regulatory network.
|
Cis-Regulation of Spgcm
Andrew Ransick
Using a variety of experimental approaches, work continues toward defining the cis-regulatory architecture and critical 'trans' inputs of Spgcm, the echinoderm ortholog of the transcription factor glial cells missing. GFP-reporter constructs microinjected into fertilized eggs and assayed in embryos demonstrated that the regulatory sequences that promote expression of this gene in the secondary mesenchyme domain at the mesenchyme blastula stage are distributed across ~15 kilobases of sequence upstream of Spgcm coding exons, but are concentrated into a proximal (P) and distal (D) module working in concert with a relatively short but indispensable enhancer (E) element. Specification of the secondary mesenchyme cells is known to require a functional ligand/receptor interaction between Delta (D) and Notch (N), which suggests that this gene is regulated via Suppressor of Hairless (Su(H)), a transcription factor known to be an effector of D/N signaling. Interestingly, while D/N signaling can lead to activation of specific target gene expression via a Su(H)/Notch interaction, this factor will also apparently repress transcription of those same target genes when D/N signaling is absent, via Su(H) interactions with prevalent co-repressor proteins. Six binding sites for Su(H) occur in Spgcm regulatory sequences, including a pair of sites in the E-element that resemble a characterized and broadly distributed paired Su(H) site. Inserting point mutations in all six Su(H) sites in the combined E+D+P GFP-constructs produces significant defects in spatial expression patterns, although the E-module paired Su(H) site still functions at about 30% efficiency. Additional alterations intended to completely disable that element are in progress. Microinjection of dominant negative (dn) Su(H) mRNA results in strong down-regulation of Spgcm. 
Q-PCR on cDNA harvested from injected embryos shows 5- to 10-fold lower amounts of Spgcm message, while WMISH signals range from staining in a few cells down to completely undetectable. All embryos injected with dnSu(H) mRNA develop with pigment cell defects, many with an albino phenotype similar to that obtained after microinjection of GCM morpholinos, or dnD or dnN mRNAs. The spatial distribution of Spgcm expression is also known to be limited by the forkhead domain factor, foxa, and work is just beginning to discern the critical sites for this factor in Spgcm regulatory sequences.
|
Cis-regulatory analysis of SpWnt8
Takuya Minokawa, Athula Wikramanayake*
*University of Hawaii
SpWnt8 is a signaling molecule that is involved in the early specification of endomesoderm in the sea urchin embryo. The goal of this project is to understand the regulatory mechanisms responsible for SpWnt8 expression in the context of the endomesodermal gene regulatory network (GRN). Our GRN studies predict that there are at least two positive inputs in the cis-regulatory region for SpWnt8. These are mediated by the transcription factors (TF), TCF and Krox. This prediction is being tested by cis-regulatory analysis using gene transfer.
We have already identified two DNA regions flanking the SpWnt8 exons (Fragments A and C) that are responsible for correct SpWnt8 expression at cleavage stage. These regions were cloned into CAT reporter constructs. Sequence analysis found several putative TCF binding sites in both the A and C fragments and a putative Krox binding site in the C fragment. To test whether these fragments contain functional TF binding sites, perturbation experiments have been done. Synthetic mRNA of the Cadherin intracellular domain (Cad) and a morpholino-substituted antisense oligonucleotide (MASO) for Krox have been used to block the TCF and Krox inputs, respectively. Each of these inhibitors has been co-injected into embryos with one of the CAT expression constructs. The transcription of CAT mRNA from both A- and C-CAT was strongly suppressed by the overexpression of Cad mRNA, indicating that at least one of the TCF binding sites in each of Fragments A and C is functional. Inhibition of CAT mRNA transcription from C-CAT by Krox MASO has also been observed, indicating that the putative binding site for Krox in Fragment C is functional. A series of mutagenesis experiments on these putative TF binding sites is scheduled to determine the actual functional binding sites for both TCF and Krox.
|
Cis-regulatory analysis of SpFoxb
Takuya Minokawa, Gwendolyn Giok Bwee Ong
Forkhead class transcription factors are known for their multiple roles in the differentiation of endoderm and axial mesoderm in vertebrates. A forkhead transcription factor, SpFoxb, is expressed dynamically during the early embryogenesis of Strongylocentrotus purpuratus, suggesting that SpFoxb plays multiple roles in sea urchin development, too. SpFoxb is exclusively expressed in the PMCs (which are micromere descendants) at mesenchyme blastula stage. This PMC-restricted expression diminishes at early gastrula stage. As PMC expression diminishes, the oral parts of both endoderm and ectoderm start to express SpFoxb.
Our extensive studies of the PMC gene regulatory network suggest that SpFoxb is downstream of the SpPmar1 subnetwork system, and receives positive inputs from SpAlx, SpTbr and SpEts1 (see our website: http://sugp.caltech.edu/endomes/, Oliveri et al. 2003). The purpose of this project is to understand the regulatory mechanisms responsible for SpFoxb expression, especially in the context of the PMC gene regulatory subnetwork. The identification of a DNA region responsible for PMC expression and testing of the predicted inputs are the immediate focus of this project.
A software analysis tool, "FamilyRelations", was used to find candidate cis-regulatory elements. In the region flanking the coding sequence we identified more than ten tracts highly conserved between S. purpuratus and L. variegatus. Using a CAT-reporter vector system, we examined the function of these conserved regions. One of the expression constructs, which contains a 5' DNA region next to the transcription start site of SpFoxb, is expressed exclusively in the PMCs at the 24h mesenchyme blastula stage, indicating that this DNA region contains cis-regulatory element(s) responsible for correct PMC expression. Detailed sequence analysis, followed by mutagenesis experiments targeted to putative transcription factor binding sites in this fragment, is scheduled for the near future to test the prediction from our PMC gene regulatory network study.
Oliveri, P., McClay, D.R. and Davidson, E.H. Activation of pmar1 controls specification of micromeres in the sea urchin embryo. Developmental Biology 258:32-43 2003.
|
Transposition of the "regulatory module"-from the study of the negative modules in the Endo16 promoter.
Chiou-Hwa Yuh, Sagar Damle
Across different phyla, transcription factors belonging to particular classes show a high degree of sequence similarity and, in some cases, a striking degree of conservation in expression domains and functions. These features bestow transcription factor genes with unique advantages for studying the evolution of development and the origin and diversification of body plans. Not only is the protein itself conserved, but the regulatory region is also highly conserved. One hypothesis that explains the evolutionary diversity of gene regulation is that pre-existing regulatory modules, rather than being created by novel mechanisms, can move to different positions in the genome by transposition. As a consequence, the diversification of gene expression patterns arises from the combination of these transposable regulatory modules. I have found an ideal tool with which to address this important evolutionary question.
The Endo16 promoter contains a unique negative module (the EF module), which is surrounded by repeat sequences. My previous work showed that 5' serial deletion of the Endo16 regulatory region had a very minor effect on endoderm-specific expression. In contrast, on an intact Endo16 promoter, mutation of putative transcription factor binding sites in the EF module resulted in a dramatic increase in ectopic expression.
The sea urchin genome is highly polymorphic. Dr. Wray's lab recently sequenced the Endo16 regulatory region from ten S. purpuratus individuals, and found that only one has the whole EF module. The remaining nine individuals either lacked the EF module or contained a different DNA sequence. For instance, the individual SpuLA29 is missing the EF module, whereas SpuSB68U is missing EF and also contains a long insertion in its G module. SpuSB68L is also missing EF, and SpuSB49 is similar to the originally isolated clone. On the other hand, SpuR1 has a totally different sequence in place of the EF module. To test the function of each allele, I used fusion PCR to replace the 5' Endo16 regulatory region with regulatory modules from different alleles and detected CAT mRNA localization by in situ hybridization. The in situ pattern showed that SpuR1 has a restricted expression pattern identical to that of the published Endo16 sequence. This implies that two unrelated insertions at approximately the same location both have a similar functional impact on repression in the ectoderm. Fusion constructs from individuals lacking the EF module released repression toward the ectoderm, and the CAT mRNA expression domain expanded slightly toward the ectoderm border. This result is consistent with those from the in vitro mutagenesis experiment.
We have searched the sea urchin genome traces and discovered 242 matches to EF modules. We also found 154 matches to the SpuR1-specific repression domain, but only 3 that matched the BA module. At the time of the search, the sea urchin genome had been sequenced to 3X coverage. This result implies that the genome has about 81 sites that contain the original EF module and 52 sites containing the SpuR1-specific repression allele, but only one site (which is in front of Endo16) that contains the BA module. This evidence strongly suggests that the EF module is a transposable regulatory module. It will be interesting to find out which other genes are regulated by EF modules. This may solve a fundamental mystery of evolution: how recombination among modular regulatory regions results in novel patterns of developmental gene expression. The evolutionary hypothesis is that the EF module originated elsewhere in the genome and was transposed by chance into the 5' end of the Endo16 gene.
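The copy-number reasoning above amounts to dividing the number of matching traces by the sequencing coverage. The sketch below spells out that arithmetic under the simplifying assumption that every genomic site is sampled, on average, once per unit of coverage; the function name is hypothetical.

```python
def estimated_genomic_sites(trace_matches, coverage):
    """Estimate the number of genomic sites from trace matches at a given fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return trace_matches / coverage


# Trace match counts from the search described above, at 3X coverage.
for name, matches in [("EF module", 242), ("SpuR1-specific domain", 154), ("BA module", 3)]:
    print(name, round(estimated_genomic_sites(matches, 3), 1))
```

This simple division gives approximately 80.7, 51.3 and 1.0, close to the estimates quoted in the text; the small difference for the SpuR1-specific domain presumably reflects rounding or a slightly different coverage figure.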
|
Identifying, proving and cloning UI, a protein involved in the sea urchin developmental regulatory gene network.
Chiou-Hwa Yuh, Elizabeth R. Dorman, C. Titus Brown
The study of the transcriptional regulation of Endo16, an early vegetal plate marker gene, can help us understand endomesoderm specification. Detailed analyses have been carried out on the positive modules (A, B and G) of the Endo16 promoter. Module B is responsible for late stomach expression. Mutation of the UI site in module B completely shut off this module's activity, as shown by in situ hybridization on the mutated construct. The UI site has therefore been identified as a spatial controller of module B. Experimental and computational analysis of Endo16 modules A and B has revealed aspects of cis-regulatory organization that are of great importance.
To elucidate the mechanisms by which proteins that bind to the UI site activate transcription, we purified transcription factors by affinity chromatography and obtained partial amino acid sequences. Nuclear extracts from 20 hour embryos were prepared from a large-scale embryo culture. After affinity column purification, proteins were separated by SDS-PAGE and Coomassie blue stained bands were cut from the gel. In-gel digestion of the protein and amino acid sequencing were then performed using an amino acid sequencer. Degenerate oligos deduced from the peptide sequences were then designed to hybridize to cDNA clones from an arrayed library. Since the sea urchin genome has been sequenced, we also used the peptide sequences for UI to computationally mine the genome for UI candidate genes.
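Designing degenerate oligos from peptide sequence usually involves picking a stretch of residues with as few alternative codons as possible. The following sketch shows that calculation using the codon counts of the standard genetic code. It is an illustration only, not the design procedure used here, and the function names and the example peptide are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def oligo_degeneracy(peptide):
    """Return how many distinct coding sequences can encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def least_degenerate_window(peptide, length=6):
    """Return (start, subpeptide, degeneracy) for the least degenerate window."""
    peptide = peptide.upper()
    if len(peptide) < length:
        raise ValueError("peptide is shorter than the requested window")
    windows = [
        (start, peptide[start:start + length])
        for start in range(len(peptide) - length + 1)
    ]
    # Choose the window with the fewest possible coding sequences.
    start, sub = min(windows, key=lambda w: oligo_degeneracy(w[1]))
    return start, sub, oligo_degeneracy(sub)
```

For the made-up peptide "LSRMWKDE" and a window of four residues, the sketch selects "MWKD" (4 possible coding sequences) over "LSRM" (216), which reflects the usual preference for stretches rich in methionine and tryptophan.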
We found two cDNA clones by screening the cDNA library with degenerate oligos: one encodes a zinc-finger binding protein, and the other an octamer binding protein. Computational searching identified five different genomic sequences that align to the UI peptide sequences. We named them Merge 0 to Merge 4. Q-PCR primers were made according to the cDNA sequences predicted by BLAST search. Preliminary real-time RT-PCR results show that a class 3 POU domain transcription factor (Merge 0) has very low expression at early stages and starts to be expressed after 30 hours post-fertilization. It reaches a peak at 60 hours post-fertilization, which correlates with the late expression peak of endogenous Endo16. We also found that the two cDNA clones obtained from cDNA library screening have similar expression time courses: they are expressed at the 20 to 30 hour stage and show low expression at later stages. The Merge 2 sequence, a mesenchyme homeobox 2 protein according to a BLAST search against other genomes, has a very interesting expression profile, with a sharp peak at 24 hours and a later peak at 60 hours.
The endogenous expression pattern for the two cDNA clones and Merge 0 and Merge 2 will be obtained soon by in-situ hybridization. We are currently trying to get full-length cDNA clones by either RACE library or additional screening of the cDNA library. Once full-length cDNAs have been obtained, antisense morpholinos against these genes will be tested on the expression of the Endo16 gene as well as other genes in the endomesoderm network.
|
Develop a high throughput way to identify the inter-relationship between regulatory modules and finding the non-conserved repression modules in the Otx locus.
Chiou-Hwa Yuh
Analysis of cis-regulatory systems of territorially expressed genes is one of the main focuses of our laboratory. Traditionally, we have used in vitro mutagenesis and reconstruction of regulatory systems with synthetic DNA fragments in order to unravel the cis-regulatory "information processing system". With the help of "FamilyRelations", we are able to find highly conserved genomic DNA sequences between two species of sea urchin, S. purpuratus and L. variegatus. Linking these conserved elements to reporter genes has been the most straightforward way to test for functionality. There are several problems with this method. First, the interrelationship between modules is difficult to identify by this strategy. Second, negative functional modules are usually not evolutionarily conserved, so they can be missed by searching only for conserved modules. Third, the Endo16 basal promoter which we use in the expression constructs may not universally interact with regulatory elements from other genes. Fourth, this method is time-consuming and requires a great deal of effort.
We have invented several new methods for high-throughput gene-expression analysis. With homologous recombination, we can very easily replace an endogenous gene's small exon with a GFP reporter gene. Using PCR amplification, we can make GFP expression fragments containing different portions of genomic DNA. By fusion PCR, we can fuse conserved DNA fragments together, and test their effect in combination. With the help of a newly purchased cell sorting machine, COPAS (Complex Object Parametric Analysis and Sorting), we can detect GFP activity quantitatively within minutes. Furthermore, by gel-shift assay, in which we use a bait regulatory DNA as a probe and add a different sequence of unlabeled putative regulatory DNA to the reaction, we can find out which DNA fragments physically interact with each other through protein-protein contacts mediated by bound transcription factors. We can further test their function directly by fusing them to the bait and a CAT reporter system through fusion-PCR.
To understand the regulatory mechanism, the locus containing the Otx gene was identified by screening BAC libraries from two different species of sea urchin. Using the newly developed program "FamilyRelations", we were able to find seventeen partially conserved regions. Among them, element 15 lies immediately upstream of Otx-beta1/2 and element 17 lies immediately upstream of Otx-alpha. Four elements (11, 14, 15 and 17) had been shown to drive expression in endomesoderm when linked individually to the CAT reporter system. We used the homologous recombination technique to replace exon 6 with GFP. This construct was designed to detect the transcriptional activity of the Otx-alpha promoter. We also replaced exons 4 and 5 with GFP to detect the transcriptional activity of the Otx-beta1/2 promoter. Five PCR fragments surrounding the Otx-alpha transcription start site and ten different PCR fragments surrounding the Otx-beta1/2 transcription start site were made within a few days, and their activity was measured quantitatively by the COPAS. We found that conserved region 16 has a strong up-regulatory effect on Otx-alpha transcription, a 9.2-fold increase compared to element 17 alone. We also discovered a 3.23-fold increase in Otx-beta1/2 transcription compared to element 15 alone. Furthermore, element 17 can enhance the expression of Otx-beta1/2 by 2-fold, and element 15 represses the expression of Otx-alpha by 2-fold. An interesting finding was that the non-conserved DNA fragment between elements 14 and 15 has a repressive effect.
To test the repressor, we carried out a gel-shift assay to detect non-conserved negative interactive modules. We used PCR amplification to perform an unbiased screen for modules that can interact with the otx bait fragment. The bait fragment contains three Otx binding sites and a GATA4,5 binding site that have been shown to be important for expression in endomesodermal territory, but the fragment can still drive expression in ectoderm. By gel shift assay, we identified a few DNA fragments that can compete with DNA-protein complex formation on the bait. We tested their function by linking these domains to the bait DNA using fusion-PCR and joining this construct to a CAT-reporter. Interestingly, two fragments suppressed oral ectoderm expression, and one fragment suppressed aboral ectoderm expression. Upon further examination of the DNA sequence, we found that the two oral-ectoderm suppression modules had classical CREB binding sites and oral-ectoderm repressor contained binding sites identified from the Spec2a promoter study.
The purpose of the project is to develop a high-throughput way of identifying not only the positive modules and the interrelationships between modules, but also the functional but non-conserved negative modules.
|
Trans-specification of primary mesenchyme cells through genetic rewiring of the mesoderm specification network.
Sagar Damle
In the sea urchin Strongylocentrotus purpuratus, the identities and regulatory relationships of a number of transcription factors involved in endomesoderm development have been well characterized. However, the ultimate demonstration of intellectual control of the causal moving parts of a system is to reengineer it. The goal of my project is to determine whether SpGcm is sufficient to trans-specify primary mesenchyme cells (PMCs) into a secondary mesenchyme cell (SMC) fate. This will be done by placing the SpGcm coding sequence under the control of a promoter that directs PMC-specific gene expression. The Davidson lab has developed a system whereby BAC-sized DNA fragments can be introduced into fertilized sea urchin eggs through microinjection and integrated into the genome as early as the two-cell stage. With this system, it should be possible to explore the effects of creating novel connections between regulatory pathways.
SpGcm is thought to play two roles in development. Its early expression in all presumptive mesoderm suggests it is capable of setting up a mesodermal transcriptional state. This state gives cells the competency to respond to signals that specify various SMC or mesodermal cell lineages. Some evidence for this theory already exists. For example, embryos injected with antisense morpholino against SpGcm do not correctly express SpGataC in the oral arch of veg2 mesoderm (A. Ransick, unpublished data). The later role of SpGcm in pigment cell specification is perhaps more difficult to characterize through morpholino analysis. However, in-situ hybridization shows late Gcm expression exclusively in pigment cells.
The T-box transcription factor Tbrain is expressed in the large micromeres at swimming blastula stage. Its expression persists through gastrulation as micromeres differentiate into PMCs and arrange themselves within the blastocoel. I will replace the first exon of SpTbrain with the SpGcm coding sequence. This recombined BAC construct will have the SpGcm coding sequence under control of SpTbrain's regulatory system, which induces expression in PMCs.
The degree to which Gcm can both override the skeletogenic program of PMCs and specify either pigment cell or mesodermal cell fate will be determined by measuring the spatial distribution of transcripts of a number of well-known molecular markers. SpGata-C is a mesoderm-specific transcription factor controlled by SpGcm, whereas SpPks, SpSult and SpFmo are all thought to play roles in the biosynthesis of the pigment echinochrome in pigment cells. Whole mount in-situ hybridization with probes for these genes will be used to identify the extent of Gcm respecification. While both PMCs and SMCs have the ability to migrate, their final positions in the embryo are different. The location of Gcm-expressing PMCs and their degree of pigmentation will therefore also be useful in characterizing Gcm's role in mesoderm specification.
|
Cis-regulatory analysis of the sea urchin delta gene
Roger Revilla
The delta gene plays two different roles in the specification of the endomesoderm of the sea urchin embryo. Each of these roles requires Delta to be localized in a specific territory of the embryo. It is first required in the micromeres to serve as a signal that is necessary to segregate the mesodermal and endodermal fates of the surrounding cells. It is later localized in the prospective SMCs, where it signals the endodermal cells and is required for gastrulation to occur. The goal of this project is to analyze the cis-regulatory system that localizes the expression of Delta in the right place and at the right time to serve its roles in the specification of the endomesoderm. It has already been shown that the early localization of Delta in the micromeres depends on activator(s) that are present ubiquitously, and a repressor that is present everywhere except in the micromeres. Comparison of genomic DNA sequences of Strongylocentrotus purpuratus containing the delta gene with the orthologous region of the Lytechinus variegatus genome has been used to identify conserved patches of sequence that might contain cis-regulatory elements. Two sequence elements have been identified that are able to recapitulate the two phases of expression of the delta gene. The element that recapitulates its early phase of expression has been shown to contain binding sites for the ubiquitously present activator(s) and binding sites for the repressor that localizes delta in the micromeres. Future work will identify the sites in the DNA that bind these factors, and it will also elucidate the binding sites of the key factors that are responsible for localizing Delta in the prospective SMCs. Finally, we also hope to identify the transcription factor that acts as a repressor of delta everywhere in the embryo except the micromeres, which has been suggested to play a key role in the installation of the skeletogenic program of gene expression.
|
A cis-regulatory analysis of SpGata-e
Pei Yun Lee and Kelly Lin
SpGata-e is the S. purpuratus ortholog of vertebrate Gata genes 4/5/6. The expression of SpGata-e is first detected in presumptive secondary mesenchyme cells (SMCs) during the hatching blastula stage. Its expression expands to include both future SMCs and endoderm in the mesenchyme blastula. In the gastrula, SpGata-e is expressed at the tip of the archenteron and in the hindgut. By the end of embryogenesis, SpGata-e is expressed in the midgut and coelomic pouches.
A 600bp DNA sequence in the first intron is the only cis-regulatory element discovered thus far that is active during embryogenesis. It is responsible for directing SpGata-e expression in the vegetal plate from the onset of zygotic SpGata-e expression in the 15hr blastula. This element also maintains expression in mesoderm cells at the tip of the invaginating archenteron and in endoderm cells until mid-gastrulation. After gastrulation, this element loses its specificity in directing expression; instead it controls ubiquitous low-level expression. Ongoing cis-regulatory analysis includes cloning overlapping 5kb fragments of the SpGata-e genomic region into reporter vectors to identify the element(s) responsible for regulating post-gastrulation expression of SpGata-e.
A search of the 600bp cis-regulatory element for putative DNA binding sites of transcription factors known to be upstream of SpGata-e identified 3 putative SpOtx binding sites. Co-injection of mRNA encoding an Otx-engrailed fusion, or of a morpholino against the Otx-b transcript, abolished the expression of GFP RNA molecules as assayed by quantitative real-time PCR. Gel shift analysis has shown that the Otx transcription factor binds cooperatively to a pair of Otx binding sites in the cis-regulatory element.
|
Constitution of transcription complexes at the C-element of sm50 gene
Ochan Otim
The complexity of interactions in embryonic nuclear extracts associated with the functions of the 25 bp C-element of the cis-regulatory region of the sea urchin Strongylocentrotus purpuratus sm50 gene was examined by extensive mutation and gel shift analyses. The C-element exercises the primary spatial control function for accurate expression of the SM50 protein in the skeletogenic lineages. Six nuclear extract preparations representing the early stages of development, ranging from 6 h (32-cell stage) to 26 h (mesenchyme blastula stage), were employed to visualize radiologically the binding activities at the element. At least eight uniquely identifiable transcription complexes are formed as a consequence of direct or indirect interaction of transcription factors with the C-element. Site-specific mutagenesis of the C-element has revealed a cluster of at least three distinct target sites within the 25 bp, all of which are utilized in varying combinatorial arrangements depending on the embryonic developmental age; one of these sites (also recognized by SpHnf6) is required constitutively. Attempts to identify the complexes are under way.
|
Understanding the transcriptional control of cyIIIa
C. Titus Brown
CyIIIa is a cytoskeletal actin expressed at high levels throughout development in the aboral ectoderm of S. purpuratus. The 2.3kb of genomic DNA immediately adjacent to the transcription start site is sufficient to direct correct spatiotemporal expression of a CAT reporter gene, and contains binding sites for nine distinct proteins present in 22hr crude nuclear extract (Calzone et al., 1988). Eight of these nine proteins have been characterized to some extent, but one protein remains unidentified, and several proteins play roles that have not been fully examined.
I am continuing to expand our understanding of the CyIIIa cis-regulatory region in several ways. First, I have used the results of the whole-genome transcription factor catalog (Howard et al., unpub.) to screen for candidates for the remaining unidentified transcription factor, and have found a strong candidate in a zic/odd-paired transcription factor ortholog. I have also expressed the TEF-1 transcription factor in vitro and shown via gel shift that it binds to the P5 binding site. Finally, I am using Q/RTPCR on cDNA from embryos staged at every 6 hrs through 60 hrs to determine the time courses of the 8 known proteins.
The ultimate goal of the CyIIIa project is to build a complete understanding of its transcriptional regulation by showing that the spatiotemporal expression patterns of its regulators are sufficient to explain, both quantitatively and spatially, the expression pattern of CyIIIa itself. In addition to the above work, we are building a cis-regulatory "wiring diagram" (Yuh et al., 1988) for CyIIIa that will put the individual binding sites into a combined functional context.
|
Cis-regulation of the early and late forms of SpKrox1.
Carolina Becker Livi
The sea urchin zinc-finger transcription factor SpKrox1 is an important early regulator in the endomesoderm network. Previously we have characterized some of the downstream functions of SpKrox1 and established a subset of its downstream connections. More recently, we have made progress characterizing its cis-regulatory region in an effort to test the predictions made by the network model regarding upstream regulators of SpKrox1.
SpKrox1a and b are two splice forms expressed in different temporal patterns in the sea urchin embryo. Their respective regulatory regions have been found by both computational and experimental means. First we compared the sequences 5' and 3' of the exons (including introns) of Strongylocentrotus purpuratus and Lytechinus variegatus using "FamilyRelations". Patches indicating significant levels of conservation were selected for further experimental work. An analysis of the distribution of putative binding sites for TFs was also performed, searching specifically for binding sites for those TFs thought, on the basis of perturbation analysis, to regulate SpKrox1 expression. Fragments far upstream of the exons were also tested, but none yielded expression when the Endo16 basal promoter was used. We believe that the endogenous basal promoter is necessary for the appropriate expression of promoter constructs.
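The idea of picking out conserved patches by interspecific comparison can be illustrated with a minimal sketch. This is not the FamilyRelations implementation: the function name `conserved_windows`, the window size, and the identity threshold are assumptions chosen for illustration, and the sketch compares the forward strands only.

```python
def conserved_windows(seq_a, seq_b, window=20, min_identity=0.9):
    """Return (i, j, identity) for each pair of fixed-size windows,
    one from each sequence, whose fraction of identical positions
    is at least min_identity. Brute force, for illustration only."""
    hits = []
    needed = min_identity * window  # minimum number of matching bases
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count positions at which the two windows carry the same base.
            matches = sum(1 for x, y in zip(win_a, win_b) if x == y)
            if matches >= needed:
                hits.append((i, j, matches / window))
    return hits
```

Windows that pass the threshold would then be merged into patches and ranked before being chosen for experimental testing; that step is omitted here.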
We used fusion PCR to attach genomic DNA fragments containing the putative cis-regulatory elements and basal promoter to the coding region of green fluorescent protein (GFP). A construct containing a 900bp fragment together with exon 1a recapitulates the expression pattern of the late form, driving expression in the gut in 95% of the embryos with GFP expression. This fragment contains several Otx binding sites, but we do not yet know if they mediate SpKrox1 expression.
We used the homologous recombination technique to replace the coding region of exon 1b with GFP in the BAC clone of SpKrox1. This construct was designed to detect the transcriptional activity of the SpKrox1b promoter under its own basal promoter control. Using PCR amplification, we made about ten GFP expression fragments containing different portions of genomic DNA. A 600bp fragment located 5' of exon 1b recapitulates the expression pattern of the early form, driving expression in the vegetal plate endoderm in 85% of the embryos with GFP expression. This fragment also contains several Otx sites, and preliminary experiments show that injecting antisense morpholino oligos against SpOtx reduces the number of embryos expressing this construct. In similar experiments we have found that inhibiting the β-catenin pathway by injecting the intracellular domain of Cadherin also reduces the percentage of embryos that express constructs driving GFP.
Now we will further test whether or not mutating binding sites within these fragments will affect GFP expression and therefore confirm the inputs predicted by the endomesoderm network.
|
Delta expression in starfish and sea urchin embryo after 500my divergence
Feng Gao, Veronica F. Hinman, Kirsten Welge
Delta expression is required for normal endomesodermal specification in both starfish and sea urchin embryogenesis; the two groups last shared a common ancestor around 500 mya.
Three phases of Delta expression were identified in sea urchin embryos as a part of the S. purpuratus GRN project. Delta is first expressed in the micromere descendant cells around 6 hrs after fertilization, where it provides the Delta-Notch signal that targets mesodermal Gcm and segregates the mesodermal and endodermal fates of Veg2 descendant cells. By mesenchyme blastula stage, its expression is extinguished in the ingressed PMCs, while it is activated in SMCs, which results in an upregulation of GataE in endomesodermal precursor cells that is required for gastrulation to occur. During gastrulation, expression is seen in the apical plate of the embryo while disappearing in the vegetal plate.
As a part of a comparative network analysis project undertaken between S. purpuratus and A. miniata, the following work was done with the AmDelta gene: 1) a full length AmDelta cDNA was isolated by cDNA library screening; 2) a time course of Delta expression was determined by real-time QPCR; 3) WMISH was performed on post-hatching embryos; 4) 25 BAC clones positive for AmDelta were found through BAC library screening; 5) homologous recombination was used to insert the GFP reporter in place of the first exon of AmDelta BAC DNA; 6) AmDelta-BAC-GFP and antisense morpholinos against AmDelta were microinjected into fertilized starfish eggs. Two phases of Delta expression were identified in starfish embryos based on WMISH and AmDelta-BAC-GFP microinjection. The first phase was found in the center of the vegetal plate at blastula stage, where AmDelta morpholino microinjection shows that Delta is required for normal endomesodermal specification and acts as an input into AmGataE. After gastrulation, expression is seen around the blastopore and in the oral ectoderm, in a pattern strikingly similar to that of the Brachyury gene. On the basis of timing, spatial pattern, gene expression and developmental fate, we speculate that the 1st phase of Delta expression in the starfish embryo is homologous to the 2nd phase of expression in the sea urchin, and probably represents the most basal function of the Delta gene in echinoderms. The 1st phase of expression in sea urchins would then be a later-derived, echinoid-specific feature that accompanied the invention of the micromeres. The 2nd phase of expression in starfish and the 3rd phase in sea urchin have diverged widely over their 500my of separation.
Identification of the cis-regulatory regions of Delta in starfish and sea urchin will allow the inferences above to be tested. Two cis-regulatory elements of SpDelta have been identified, which are respectively responsible for the first two phases of expression in the sea urchin embryo. A new experimental method is being developed to locate the cis-regulatory regions of starfish AmDelta, given that the genomic sequence of the starfish Delta gene is still unknown. This involves the following steps: 1) PCR walking outward from both ends of the Delta coding sequence along AmDelta-BAC-GFP; 2) sequencing of the PCR products; 3) designing primers from the two distal ends of the AmDelta sequence; 4) PCR again; 5) microinjection of the PCR products to locate the regions regulating the two phases of AmDelta expression.
|
Regulatory gene network evolution: A comparison of endomesoderm specification in starfish and sea urchins.
Veronica F. Hinman, Albert Nguyen
We are undertaking an evolutionary comparison of the gene regulatory network (GRN) of transcription factors underlying the specification of endomesoderm in sea urchins and starfish. The extensive analysis of this network in sea urchins has provided a unique opportunity for a comparative investigation to elucidate mechanisms of evolution at this level. We would like to answer questions such as: which components of such a regulatory system are conserved, how are changes incorporated into a GRN, and how do these changes relate to the evolution of morphology? The starfish Asterina miniata has been developed as an experimental model for this analysis. Gametes are readily available, and gene transfer and perturbation of gene products have been performed. Starfish last shared a common ancestor with sea urchins around 500 million years ago and they appear to be at an ideal evolutionary distance for meaningful comparisons; they share many conserved aspects of their development and yet there exist specific morphological differences.
We have previously shown that a common developmental feature of starfish and sea urchin GRNs is the use of an orthologous three gene positive regulatory feedback loop that serves to 'lock down' gene expression required for the specification of the endoderm and thus to drive development forward. The conservation of this feature across the immense period of evolutionary time such as separates these echinoderms demonstrates the indispensable nature of such a process in their development. Several differences were also noted in the GRN architecture. We have noted that tbrain (tbr) is incorporated into the endomesoderm-specification network in starfish while it is involved in primary mesenchyme cell specification in sea urchins. Also, the starfish gatae gene is repressed from the mesoderm by foxa while this is not the case in sea urchins.
We are continuing these GRN architecture comparisons, principally by examining the specification of the mesoderm. Certain differences in this process between starfish and sea urchins were already noted, e.g. the different usage of the tbr and gatae genes (see above). We have cloned several starfish regulatory genes that are expressed in the mesoderm, viz. the transcription factors AmGcm and AmGataC and the signalling ligand AmDelta. Unlike in sea urchins, the starfish gcm gene is not expressed in the vegetal plate but is expressed in a patch of cells in the pre-blastula that appear to migrate throughout the aboral ectoderm. The delta gene of sea urchins has two phases of expression, firstly in the micromeres, where it is required for the expression of the mesodermal factors gatac and gcm, and secondly in the vegetal plate, where it is needed for the specification of the endoderm. It appears that the second phase of expression and function is conserved in the starfish, but Delta is not required at all for the expression of the mesoderm factors AmGcm and AmGataC. Significant differences may thus have arisen in the specification of mesoderm in these two taxa that are related to the novel acquisition of the micromeres by sea urchins.
We are further expanding this work to identify and analyse the cis regulatory elements of several genes of the starfish GRN. The regulatory relationships underlying the architecture of the GRN are inherited as the coding sequence of the transcription factor and the DNA sequence of its binding site. Binding site modifications are likely to provide the greatest potential for evolution.
We know from the comparative GRN analysis that the brachyury (bra), orthodenticle (otx) and gatae genes in starfish and sea urchins are similarly regulated, yet comparative sequence analyses using "FamilyRelations" fail to find any significant patches of sequence conservation in the surrounding 100-150 kbp of DNA. In order to understand how the sequence of a cis-regulatory region has evolved, we are analysing, in detail, the cis-regulatory regions of the starfish bra, otx and gatae genes and are comparing these to the regulatory regions of the orthologous sea urchin genes. We have used several methods to identify the regulatory regions. We currently have an approximately 500bp region including the 5' UTR of AmBra DNA that drives correct reporter gene expression in pregastrular starfish embryos. Expression driven by this regulatory element is enhanced by fusing it with either of two genomic regions downstream of the coding region. We have also identified an approximately 500bp region of DNA downstream of the otx coding region that drives correct reporter gene expression. This region was identified using the "Cluster Buster" software, which searched for a statistical over-representation of consensus binding motifs for the Otx, Krox and Gatae transcription factors, all of which are known to regulate otx expression. The arrangement of these binding sites in the cis-regulatory elements of the starfish and sea urchin otx genes was found to be remarkably similar, suggesting that some functional constraint must exist on their relative arrangement.
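The notion of looking for a region dense in binding motifs can be illustrated with a deliberately simplified sketch. It does not reproduce Cluster Buster, which uses a probabilistic model; here a plain sliding-window count of exact consensus matches is swapped in. The function names, the window size, and the motif strings passed in by the caller are all assumptions for illustration, and only the forward strand is scanned.

```python
def find_motif_hits(seq, motifs):
    """Return a sorted list of (position, motif_name) for every exact
    occurrence of each consensus string in seq (forward strand only)."""
    hits = []
    for name, consensus in motifs.items():
        start = seq.find(consensus)
        while start != -1:
            hits.append((start, name))
            # Resume one base further on so overlapping hits are found.
            start = seq.find(consensus, start + 1)
    return sorted(hits)


def densest_window(seq, motifs, window=500):
    """Return (start, count) for the first window containing the most
    motif start positions. A hit counts if it starts inside the window."""
    hits = find_motif_hits(seq, motifs)
    best_start, best_count = 0, 0
    for start in range(max(1, len(seq) - window + 1)):
        count = sum(1 for pos, _ in hits if start <= pos < start + window)
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count
```

A real analysis would use position weight matrices and a background model rather than exact strings, and would score both strands.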
We are also collaborating with Christophe Battail and Hamid Bolouri to use a bioinformatics approach to search for over-represented "words" in the genomic regions surrounding coding sequence as a method to identify regulatory elements. Testing is underway on potential regions.
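As a rough illustration of what a search for over-represented "words" can look like, the sketch below compares k-mer frequencies in a candidate region against a background sequence. It is not the collaborators' method; the function names, the word length `k`, the enrichment threshold, and the pseudocount are assumptions made for this example.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-letter word in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented_words(region, background, k=6, min_ratio=2.0, pseudocount=1.0):
    """Return (word, ratio) pairs, highest ratio first, for words whose
    frequency in region is at least min_ratio times the background
    frequency. The pseudocount avoids division by zero for words that
    never occur in the background."""
    region_counts = kmer_counts(region, k)
    background_counts = kmer_counts(background, k)
    region_total = sum(region_counts.values())
    background_total = sum(background_counts.values())
    results = []
    for word, count in region_counts.items():
        region_freq = count / region_total
        background_freq = (background_counts[word] + pseudocount) / (
            background_total + pseudocount
        )
        ratio = region_freq / background_freq
        if ratio >= min_ratio:
            results.append((word, ratio))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

A frequency ratio is only a first filter; a statistical test against a suitable background would be needed before candidate regions are taken into experiments.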
|
Regulatory aspects of sea urchin biomineralization
Gabriele Amore
In echinoderms, larval skeletogenesis is observed in only two of the five classes of the phylum, echinoids and ophiuroids. It is considered a derived feature in these groups. In sea urchins (echinoids), the skeletal matrix (sm) genes involved in making the larval skeletal spicules are believed to be the same as those used in adult spines and tests. A co-option of the sm-genes from the adult to the larval life phase is hypothesized. Studying the regulation of the expression of sea urchin sm-genes provides an opportunity to approach the more general problem of the evolution of the regulation of gene expression.
To reveal the genomic organization of sm-genes, BAC libraries were screened with probes obtained from sm-cDNAs and BLAST analyses were performed. The four sm-genes pm27, sm37, sm50 and sm32 were found in a cluster. The presence in the same cluster of pm27 and sm32 is a new finding. Also, the sm50 and sm32 genes share the same first exon, so they represent splice variants. The same arrangement of these genes was found in two closely related sea urchin species, S. purpuratus and L. variegatus, although little conservation was observed in the non-coding sequences. To determine if all the sm-genes are clustered in the sea urchin genome, more BAC clones will be isolated and characterized.
The cluster arrangement of the sm-genes is suggestive of a common mechanism responsible for their coordinated expression, maybe via a locus control region (LCR). Recombination techniques that allow the knock-in of the green fluorescent protein (GFP) coding sequence into sm-genes in the BAC clones will be employed. In these constructs, GFP expression will be driven by the promoter of only one of the four sm-genes. Sequences far upstream or downstream from each gene will then be modified. The short- and long-range effects of these mutations will then be tested by microinjecting the whole BAC clones into sea urchin zygotes and analyzing GFP expression.
|