Thursday 23 February 2023

AUTOLOGOUS REGULATION

Our biological destiny is inherited in every cell, and our DNA is the repository of that information. DNA is another incredible wonder of Nature. Todd Smith gives a great description (1): Six billion base pairs of DNA are packaged into 22 pairs of chromosomes, plus two sex chromosomes. Each base pair is 3.4 angstroms in length (0.34 nanometers, or ~0.3 billionths of a meter), so six billion base pairs (all chromosomes laid out head to toe) form a chain that's two meters long. If we could hang this DNA chain from a hook, it would be slightly taller than an average human. But that's just the DNA from one cell. Each of us has around 50 trillion cells (50,000 billion). If we took the DNA from all of those cells and laid it out in a linear fashion, it could wrap around the earth 2.5 million times, or reach to the sun and back 300 times! Yet cells manage to pack all that DNA into a structure so small we can't even see it without a microscope.
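As a rough check on those numbers, here is a minimal Python sketch of the arithmetic. The base pair count, base pair length and cell count come from the description above; the Earth's circumference and the Earth-Sun distance are standard reference values that I am assuming here, and the function names are just illustrative.

```python
# Back-of-the-envelope check of the DNA length figures quoted above.
BASE_PAIRS_PER_CELL = 6e9        # six billion base pairs (from the text)
BASE_PAIR_LENGTH_M = 0.34e-9     # 0.34 nanometers per base pair (from the text)
CELLS_PER_BODY = 50e12           # about 50 trillion cells (from the text)

# Assumed reference values, not taken from the text:
EARTH_CIRCUMFERENCE_M = 4.0075e7  # about 40,075 km around the equator
EARTH_SUN_DISTANCE_M = 1.496e11   # about 149.6 million km on average


def dna_length_per_cell_m():
    """Length of the DNA in one cell if laid end to end, in meters."""
    return BASE_PAIRS_PER_CELL * BASE_PAIR_LENGTH_M


def total_dna_length_m():
    """Length of the DNA from every cell in the body, in meters."""
    return dna_length_per_cell_m() * CELLS_PER_BODY


def earth_wraps():
    """How many times the total DNA would wrap around the Earth."""
    return total_dna_length_m() / EARTH_CIRCUMFERENCE_M


def sun_round_trips():
    """How many trips to the Sun and back the total DNA would cover."""
    return total_dna_length_m() / (2 * EARTH_SUN_DISTANCE_M)


if __name__ == "__main__":
    print(f"DNA per cell: {dna_length_per_cell_m():.2f} m")
    print(f"Earth wraps: {earth_wraps():.2e}")
    print(f"Sun round trips: {sun_round_trips():.0f}")
```

With these inputs the per-cell length comes out at about two meters, the Earth wraps at roughly 2.5 million, and the Sun round trips at a little over 300, so the quoted figures hold up as order-of-magnitude estimates.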

This long hard disk is twisted, braided and compressed into the tiny nucleus of each tiny cell. Each cell contains all the information needed to build a complete living organism or human being. Basically it carries two types of information. The first is for autologous regulation: continuously managing itself according to an inherited temporal program. This silences most of the genes and activates only those that give the cell its identity and characteristics. It uses multiple layers of tools, and collaborations between those tools, to accomplish this according to the instructions embedded in it. The second type of information governs its mesh-like behaviour. A cell functions on its own according to the type of cell it becomes, but it also functions collectively with other cells to build according to the design of the organism. On top of this mesh there are non-cellular players, like bioelectrical networks, that influence each cell individually and all of them collectively, acting as another language of communication amongst them.

Cells are very crowded places: there are some 42 million protein molecules in a simple cell, according to a team of researchers led by Grant Brown, a biochemistry professor in the University of Toronto's Donnelly Centre for Cellular and Biomolecular Research. The majority of proteins exist within a narrow range, between 1,000 and 10,000 molecules. Some are outstandingly plentiful at more than half a million copies, while others exist in fewer than 10 molecules in a cell. These molecules move very fast inside the cell. In a blog post where he quotes from the book Molecular Biology of the Cell, Ken Shirriff writes: You may wonder how things get around inside cells if they are so crowded. It turns out that molecules move unimaginably quickly due to thermal motion. A small molecule such as glucose is cruising around a cell at about 250 miles per hour, while a large protein molecule is moving at 20 miles per hour. Note that these are actual speeds inside the cell, not scaled-up speeds. I'm not talking about driving through a crowded Times Square at 20 miles per hour; to scale this would be more like driving through Times Square at 20 million miles per hour!

Because cells are so crowded, molecules can't get very far without colliding with something. In fact, a molecule will collide with something billions of times a second and bounce off in a different direction. Because of this, molecules are doing a random walk through the cell and diffusing all around. A small molecule can get from one side of a cell to the other in 1/5 of a second.

 As a result of all this random motion, a typical enzyme can collide with something to react with 500,000 times every second. Watching the video, you might wonder how the different pieces just happen to move to the right place. In reality, they are covering so much ground in the cell so fast that they will be in the "right place" very frequently just by chance.

 A rendition of a cross section of a cell and how crowded it is.

 In addition, a typical protein is tumbling around, a million times per second. Imagine proteins crammed together, each rotating at 60 million RPM, with molecules slamming into them billions of times a second. This is what's going on inside a cell.
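The unit conversions behind the quoted figures are easy to verify. This is a small sketch that only converts the numbers given above (a million tumbles per second, 250 and 20 miles per hour); the function names are my own and the only outside fact assumed is that one mile is 1609.344 meters.

```python
# Unit conversions for the speeds and rotation rates quoted above.
METERS_PER_MILE = 1609.344   # definition of the international mile
SECONDS_PER_HOUR = 3600
SECONDS_PER_MINUTE = 60


def rotations_per_second_to_rpm(rotations_per_second):
    """Convert a tumbling rate in rotations per second to RPM."""
    return rotations_per_second * SECONDS_PER_MINUTE


def mph_to_m_per_s(miles_per_hour):
    """Convert miles per hour to meters per second."""
    return miles_per_hour * METERS_PER_MILE / SECONDS_PER_HOUR


if __name__ == "__main__":
    print(rotations_per_second_to_rpm(1_000_000))  # protein tumbling rate in RPM
    print(round(mph_to_m_per_s(250), 2))           # glucose speed in m/s
    print(round(mph_to_m_per_s(20), 2))            # large protein speed in m/s
```

A million rotations per second is indeed 60 million RPM, and 250 miles per hour is a little over 110 meters per second, inside a cell that is only microns across.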

In the super tiny, tightly packed strands of DNA, heritable intelligence decides which gene (a segment of the DNA) will be read and which parts of the strands will be tightly sealed to avoid being read. The ‘reading’ of the strands is a process that uses enzymes and many floppy, phase-changing proteins, as described in the previous post. So many things have to come together at the right place for the gene to be read, all inside a tiny, tightly packed part of a tiny cell.

From what is read and what is not read in its DNA, a cell gets its identity and function. Only 10% to 20% of the coding genes are active at any given time in a cell. There is intelligence even in the spatial arrangement of each of the 200 types of cells. It’s as if each type of cell were a puzzle piece of a particular color and shape, and Nature arranges them to form 80 different 3-D organs with incredible functions, like our eyes, which allow us to see, and our liver, which does complex processing. Cells also form bones, cartilage, tendons and muscles. All of these different things are made from the same basic cell. And each cell can be made to turn into any other type of cell. Unbelievably, each cell has the information on its ‘hard disk’ to build each and every one of the organs, bones, muscles and skin. We literally start from a single cell!

As we read in my earlier post Headwaters, this ‘reading’, or transcription, of our DNA is quite pervasive and is observed in 85% of our genome. Of this, only 2% is involved in protein coding. The rest is involved in regulating this 2% and its translation. The more complex the organism, the larger the non-coding portion is relative to the coding portion, but this tells only one part of the story.




Even in such a crowded cell with such a huge genome, Nature maximizes this space by alternatively splicing 95% of the genome. So instead of 50,000 genes (coding and non-coding) generating 50,000 transcripts, not only is 85% of the genome transcribed, but 98% of this transcriptome undergoes alternative splicing, creating uncountable isoforms! By alternative splicing we mean that the same region of our genome can be ‘read’ in multiple versions. To explain it simply, suppose we mark one region from 1 to 10 and a neighbouring region from 11 to 20 as two genes. Due to alternative splicing, the first gene can be transcribed as 5 to 9, or 3 to 6, or 2 to 9, making 3 transcripts from the same gene. This splicing can also include neighbours, so a transcript can run from 7 to 15 or from 3 to 12, etc.

In the preceding Headwaters post we learnt how most of the transcription from non-coding regions, along with some proteins, creates layer upon layer of regulation of the protein-coding genes, driving the changes that take us from an egg to an adult and, after puberty, launching the process of aging. In this post we find that this is not all that happens in the genome and its housing structures like histones and chromosomes. On top of this there is the spliceosome, which processes what is transcribed from the genome into not just linear transcripts across its length but an unending variety of isoforms due to rampant alternative splicing.

Look at the packaging brilliance of Nature: a 2 meter long DNA, 85% of which is transcribed, inside a nucleus that is about 10 microns across (one micron is one millionth of a meter), would be miraculous enough. But Nature maximizes this by adding pervasive alternative splicing that creates multiple transcripts from the same gene, thereby multiplying many times over the number of transcripts produced from those 2 meters.
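The numbered-region example of alternative splicing above can be sketched in a few lines of Python. This follows my simplified numbering only: the gene labels and function names are hypothetical, and real splicing joins exons and removes introns rather than reading arbitrary stretches. The sketch shows which toy gene each example transcript overlaps, and how many distinct start-to-end spans a numbered region allows.

```python
# Toy model of the numbered-region example: two neighbouring "genes"
# occupying positions 1-10 and 11-20. The labels are hypothetical.
GENES = {"gene_A": (1, 10), "gene_B": (11, 20)}


def genes_touched(start, end):
    """Return the toy genes that the span start..end (inclusive) overlaps."""
    return [
        name
        for name, (gene_start, gene_end) in GENES.items()
        if start <= gene_end and end >= gene_start
    ]


def count_contiguous_spans(first, last):
    """Count every span (start, end) with first <= start <= end <= last."""
    n = last - first + 1
    return n * (n + 1) // 2


if __name__ == "__main__":
    # The example transcripts mentioned in the text.
    for start, end in [(5, 9), (3, 6), (2, 9), (7, 15), (3, 12)]:
        print(f"{start} to {end}: overlaps {genes_touched(start, end)}")
    print("spans within one gene:", count_contiguous_spans(1, 10))
    print("spans across both genes:", count_contiguous_spans(1, 20))
```

Even in this crude picture, a single region of ten positions allows dozens of different spans, and letting transcripts cross into the neighbouring gene multiplies the possibilities further.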

 

 




From Universal Alternative Splicing of Noncoding Exons by Tim Mercer et al.

Only a limited number of transcripts, whether of a full gene or of an alternatively spliced gene, are translated into protein. In my previous post Headwaters we read about how these shapeless, floppy proteins gather near a gene activation site and magically phase change into a condensate that hovers over the site. Similarly, a different condensate activates splicing.

An article published in the journal Genome Biology on 28 November 2018 by Dr. Steven Salzberg et al. states the following: “We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We detected over 30 million additional transcripts at more than 650,000 genomic loci, nearly all of which are likely nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells.”
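To get a feel for the CHESS figures quoted above, here is a short Python sketch that derives a few ratios from them. Every input is a number from the quoted passage; the derived ratios are my own arithmetic, not results reported by the authors, and "over 30 million" is treated as a lower bound.

```python
# Figures quoted from the CHESS catalog description above.
PROTEIN_CODING_GENES = 20_352
NONCODING_GENES = 22_259
TOTAL_GENES = 42_611
TOTAL_TRANSCRIPTS = 323_258
NOVEL_TRANSCRIPTS = 116_156
NOISE_TRANSCRIPTS_LOWER_BOUND = 30_000_000  # "over 30 million"


def transcripts_per_gene():
    """Average number of catalogued transcripts per catalogued gene."""
    return TOTAL_TRANSCRIPTS / TOTAL_GENES


def novel_transcript_share():
    """Fraction of catalogued transcripts that were reported as novel."""
    return NOVEL_TRANSCRIPTS / TOTAL_TRANSCRIPTS


def noise_to_catalog_ratio():
    """Lower bound on additional transcripts per catalogued transcript."""
    return NOISE_TRANSCRIPTS_LOWER_BOUND / TOTAL_TRANSCRIPTS


if __name__ == "__main__":
    # The two gene categories should add up to the stated total.
    print(PROTEIN_CODING_GENES + NONCODING_GENES == TOTAL_GENES)
    print(f"transcripts per gene: {transcripts_per_gene():.1f}")
    print(f"novel share: {novel_transcript_share():.1%}")
    print(f"additional per catalogued transcript: {noise_to_catalog_ratio():.0f}")
```

The catalogued genes average between seven and eight transcripts each, and the additional transcripts outnumber the catalogued ones by roughly ninety to one.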




The interesting thing to note is the huge number of transcripts they found: 30 million! They claim that most of them are non-functional, but Nature rarely spends resources constructing huge volumes of non-functional things. The non-protein-coding transcripts have very important roles too. In a paper titled ‘Pervasive Transcription of the Human Genome Produces Thousands of Previously Unidentified Long Intergenic Noncoding RNAs’ by Matthew J. Hangauer et al., the authors say: “It is now becoming more and more clear instead that, far from being genetic “deadwood” these repetitive expanses are actively and deliberately transcribed into non-coding RNAs which play a major role in regulating gene expression and silencing, organizing nuclear architecture, compartmentalizing the nucleus, and modulating protein function.” My previous post explains in detail the various types of non-coding transcripts and the regulatory roles they play. Here we additionally examined the alternative splicing that generates not only a variety of coding transcripts but also, as we read above, a huge number of non-coding transcripts.

What is fascinating is how these transcripts govern their own births. If you recall, we covered long non-coding RNAs in the previous post. In a paper titled ‘Epigenetic regulation of alternative splicing: How LncRNAs tailor the message’, authors Pisignano and Ladomery write about some of the ways in which LncRNAs regulate alternative splicing, which in turn leads to various transcripts, including LncRNAs. An excerpt from their paper: “Both short (<200 nt) and long (>200 nt) non-coding RNAs can contribute to the regulation of alternative splicing in many different ways; either indirectly by regulating the activity of splice factors; or directly, by interacting with pre-mRNAs. Long non-coding RNAs (lncRNAs) are particularly well suited to these roles due to their demonstrated capacity to act as regulatory molecules that modulate gene expression at every level. Either alone, or in association with partner proteins, these long RNA polymerase II transcripts have been shown to take part in a wide range of developmental processes and disease in complex organisms.” So what are the ways they mention in which LncRNAs regulate alternative splicing?

1. LncRNAs regulate alternative splicing through chromatin modification: An intimate relationship exists between lncRNAs and chromatin conformation. LncRNAs regulate chromatin modifications by recruiting or directly interacting with histone-modifying complexes or enzymes at specific chromosomal loci. A possible lncRNA-mediated crosstalk between histone modifications and the pre-mRNA splicing machinery has also been proposed. Several lncRNAs appear to control important aspects of chromatin organization including chromatin looping, either remaining tethered to the site of transcription or moving over distant loci.

2. LncRNAs regulate pre-mRNA splicing through RNA-DNA interactions: LncRNAs can tether DNA, forming an RNA-dsDNA triplex by targeting specific DNA sequences and inserting themselves as a third strand into the major groove of the DNA duplex. These are known as R-loops: three-stranded nucleic acid structures, composed of RNA–DNA hybrids, frequently formed during transcription. Aberrant R-loops are generally associated with DNA damage, transcription elongation defects, hyper-recombination and genome instability. Recent lines of evidence indicate a potential role for R-loops in alternative pre-mRNA splicing. A class of lncRNAs, the so-called circular RNAs (circRNAs), are abundant, conserved transcripts that originate from a non-canonical AS process (back-splicing) leading to the formation of head-to-tail splice junctions, joined together to form circular transcripts.

3. LncRNAs regulate pre-mRNA splicing through RNA-RNA interactions: Identified in multiple eukaryotes, Natural Antisense Transcripts (NATs) are a class of long non-coding RNA molecules, transcribed from both coding and non-coding genes on the opposite strand of protein-coding ones. Regardless of their genomic origin, NATs can hybridize with pre-mRNAs and form RNA-RNA duplexes. In some cases a double function is also possible: NATs can encode proteins on one hand, while at the same time working as non-coding molecules modulating the splicing of a neighbouring gene’s transcript.

4. LncRNAs regulate pre-mRNA splicing by modulating the activity of Splicing Factors (SFs): LncRNAs interact in a dynamic network with many SFs and their pre-mRNA target sequences to modulate transcriptome reprogramming in eukaryotes. LncRNAs regulate the localization and phosphorylation status of Splicing Factors.

 The authors conclude by stating that “With the increasing prevalence of splicing events and the discovery of over a hundred thousand lncRNAs, it is likely that the involvement of lncRNAs in regulating AS is far greater than the currently known.”

 

  

Regulation of pre-mRNA splicing by lncRNAs. LncRNAs (red) are able to control pre-mRNA splicing by (a) modifying chromatin accessibility through recruiting or impeding access to chromatin modifying complexes at the transcribed genomic locus. In some cases, this might result in more drastic long-range structural changes; (b) interacting with the transcribed genomic locus through an RNA-DNA hybrid; (c) hybridizing with the pre-mRNA molecule (light blue); (d) promoting SF recruitment or by sequestering SFs into specific subnuclear compartments, thereby interfering with SF activities. Credit: Epigenetic Regulation of Alternative Splicing: How LncRNAs Tailor the Message. Authors: Giuseppina Pisignano and Michael Ladomery

In my preceding post Headwaters we saw various ways in which many types of non-coding RNAs regulate gene expression, not only inside the cell but also through the circulating secretome. Here we saw how alternative splicing leads to protein diversity and non-coding transcription by creating alternative transcripts from the same gene. But what is amazing is that non-coding RNAs influence the spliceosome that performs this alternative splicing.

A very interesting paper titled ‘Aging is associated with a systemic length-associated transcriptome imbalance’ by Dr. Luis Amaral et al. finds that as we age, longer transcripts decline, and many of them are associated with longevity genes. The authors cite various possible causes for these changes, such as heat shock proteins leaving translation with truncated protein lengths, and the spliceosome and splice factors deliberately producing shorter transcripts. But the best clue is that in a subset of tissues and cell types they found the exact opposite happening: short transcripts decline and long transcripts increase. So what is this a dead giveaway of? A temporal program of autologous regulation. The age-related changes are not random but are orchestrated by the transcription and splicing machinery and their co-players.

In a paper titled ‘Aging associated changes in the expression of LncRNAs in human tissues reflect a transcriptional modulation of ageing pathways’ by Dr. João Pedro de Magalhães et al., the authors observed that LncRNAs are very tissue- and lineage-specific and typically show highly specific spatio-temporal expression patterns. This again shows evidence of an intricately designed regulatory plan that unfolds along the timeline of the living organism. All this intricately complex regulation in such a tiny environment is for the spatial and temporal organization of a life form:

Spatial organization: Imagine that a tiny cell, 1/10th the diameter of a human hair, holds information that it reads, which tells it where to locate itself with respect to the other cells in our body. So a cell that is designated to be an eye cell, as it emerges from the multiplication of cells from one single fertilized egg, knows it has to move precisely towards the sockets being formed in the head, and then through epigenetic changes it becomes an eye cell! It will not float away and end up on the hand, or turn into a skin cell in the eye. The precision is mind-boggling. Where is that information, that instruction that it must move there to become an eye cell? It’s already labeled in its DNA. Imagine tens of trillions of cells, each knowing exactly where it needs to locate itself in the 3-dimensional space of the life form and then what it needs to become to form the various organs, tissues, muscles and bones! It must coordinate and jostle with its neighbours to land at its physical destination. Dr. Michael Levin says there is a bioelectrical memory which connects all cells in a mesh and guides each cell to where it needs to be. This process is called spatial organization.

Temporal organization: Once a cell takes its place and its epigenetic buttons are clicked to transform it into a type of cell, a whole different process of organization begins. In this process the 10% to 20% of the coding genes that are typically active begin to print proteins that fulfill their various tasks in line with their cell’s type. So a pancreatic cell will code for insulin, for example. These are the functional tasks of the cell, but in parallel, as we have read above, there is also highly complex regulation of those protein-coding genes and their transcripts. This continuous background regulation creates constant changes in the cell from birth till death. Initially these macro changes are related to development: they make us grow from an egg to an adult. After puberty, the main theme of these changes is to dial down important repair and recycling systems so that, within a given range, the life form dies. These latter changes manifest as aging. These regulatory changes of the spliceosome, alternative splicing, and non-coding and coding gene transcription all together lead to a particular proteomic configuration, which in turn influences the efficiency of all the tasks that are done by those proteins. The changes stop some proteins, change some proteins, reduce some proteins and increase some proteins. This is ongoing all our lives. Ironically, these transcriptional and proteomic changes also affect the cell’s DNA itself: double strand breaks progressively increase as we age, and their repair efficiency declines just when it’s needed even more. This brings us to the main observation driving this post:

Autologous Regulation: Nature has created this unit of mind-boggling complexity and intricate design: the cell. All life forms on our planet are built from this unit. Incredibly, this unit produces the regulators that govern it! It produces transcripts and proteins that regulate the regulators! So basically it writes its own biological destiny. Inherited genetic factors and lifestyle factors do also influence our biological destiny, but only within a narrow range. The main driver remains the inherited repository of information in the cell itself. The information it carries enacts its spatial organization, and the same source of information also enacts its temporal organization. It transcribes transcripts that influence the transcriptional machinery and splicing machinery to decide whether to transcribe the entire gene, to transcribe an alternate version, or to silence it. Some of those transcripts, along with some of the translated proteins, will make further alterations to the transcriptional decisions and splicing decisions in a continuing loop of self-regulation driving the two major themes: development before adulthood and aging after adulthood. Besides these two main themes there are also changes that occur due to environmental stimuli. But overall, unless they are extreme or fatal, these are dominated by the two main themes. Some of these instructions are exchanged between cells through direct connections with neighboring cells or through the secretions of one cell entering another.

This self-regulation is a very interesting process created by Nature, one we rarely get to witness anywhere else. It’s easy to miss how remarkable this technology developed by Nature and evolution is. DNA carries information that, when read sequentially, builds us into an adult starting from a single cell, and DNA also carries information that, when read sequentially after puberty, leads to gradual aging and death. We inherit both our youth code and our death code from the moment we are a fertilized egg. Let me try to explain it with an example. Let’s say a branch office (the cell) is opened in which there is no manager but only an SOP manual, a standard operating procedure master handbook for the entire year that all the staff have to follow. It gives instructions to the HR department on what kind of staff to hire. It has various printers that print out instructions daily, giving tasks to all the staff. But imagine that only 30% of the employees actually do the tasks that produce the parts that the branch manufactures. The other 70% of the employees get instructions daily from the SOP to manage those 30% and what they produce, by making changes in the master SOP that is daily giving instructions to those 30%. So the SOP itself has instructions to make daily changes in the SOP, thereby resulting in changes in production. But those changes and their edits are so complex that they require 70% of the employees just to take new instructions daily from the SOP and edit the future chapters of the SOP manual. These self-edit instructions flow out sequentially as each new page of the SOP is read on each new day of the year. Other branches also exchange data (the secretome) and send their employees to make edits in each other’s SOPs. In the beginning there is tremendous excitement: new teams are hired and production goes into full swing, making wonderful products that sell very well (puberty).
At its peak the training reaches a point where a team of employees can go and open another branch (reproduction). But once that is done, the SOP begins to give out instructions to edit itself (autologous regulation) so that in forthcoming pages the production quality, hiring quality and raw material quality are all purposely and gradually brought down (aging). In the beginning it’s hard to notice, but after some months of such gradual changes the consequences begin to show and unsold products start piling up. Cash flow is affected, salaries are affected. What at its peak was a dynamic factory full of enthusiastic, productive workers becomes demoralized and stressed out, leading to even further degradation at the branch and creating a snowballing stranglehold from which the branch can’t escape. At some point it shuts down, which is death. This is done so that there is no overcrowding of branches creating an oversupply that would destroy the company itself, and also to ensure that every new branch recruits fresh young staff who are enthusiastic and hard-working.
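For readers who think in code, the branch office analogy can be sketched as a manual whose pages carry both a production instruction and instructions to rewrite later pages. This is only a toy illustration of the analogy, not a biological model: the page structure, the growth and decline multipliers and the peak day are all hypothetical values I picked for the example.

```python
# Toy version of the SOP analogy: each page sets the day's production
# multiplier and may carry edits that rewrite later pages of the manual.
# All numbers and names here are hypothetical, chosen for illustration.


def build_manual(days=10, peak_day=4, growth=1.5, decline=0.5):
    """Create a manual where the page read on peak_day rewrites all later pages."""
    manual = [{"multiplier": growth, "edits": []} for _ in range(days)]
    manual[peak_day]["edits"] = [
        (later_day, decline) for later_day in range(peak_day + 1, days)
    ]
    return manual


def run_branch(manual, start_output=1.0):
    """Read the manual page by page and return the output after each day."""
    output = start_output
    history = []
    for page in manual:
        # Production staff follow today's page.
        output *= page["multiplier"]
        history.append(output)
        # Editing staff apply today's self-edit instructions to future pages.
        for target_day, new_multiplier in page["edits"]:
            manual[target_day]["multiplier"] = new_multiplier
    return history


if __name__ == "__main__":
    for day, output in enumerate(run_branch(build_manual())):
        print(f"day {day}: output {output:.3f}")
```

Nothing outside the manual tells the branch to decline. The instruction to rewrite the later pages is already sitting in the manual on day one and simply waits for its page to be read, which is the point of the analogy.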

Coming back to our biology, there are two basic goals of autologous regulation. One is to build an adult from a fertilized egg. The second is to gradually make the adult age, culminating in sufficient degradation to cause death anywhere between the average lifespan and the maximum lifespan of that species. One of the key reasons for this regular recycling every generation emerged from a paper last year by Dr. Vadim Gladyshev, from which we learnt of a marvelous event occurring during early embryogenesis: all the inherited errors and insults of the germline cells are wiped clean to make a brand new, error-free baby. Have humans outgrown this need for regular recycling? Can our intelligence help us resolve the challenge of the accumulation of biological errors and insults? As mentioned in my previous post, I continue to take inspiration from certain life forms which seem to be immortal, in permanent youth. I cite the Ginkgo biloba tree because a researcher, Dr. Richard Dixon, has studied it. Even after a thousand years, the tree that he studied still had the photosynthesis efficiency and immune resilience of a 20-year-old tree. The question arises as to how it is able to do this. In almost all other life forms the DNA harbors temporal instructions that, as we read above, make changes to the spliceosome, the splicing factors, the transcription factors and the epigenetic marks, which result in the gradual collapse of repair and recycling systems and ultimately death. How does Ginkgo biloba allow all the changes related to development up to adulthood, but freeze, block or erase further regulatory changes, thereby permanently remaining in youth? Many scientists wonder: if we can prolong our youth, would we still die when we reach 122 or 125? The Ginkgo biloba tree says no.

Two technologies are moving towards reversing human biological age. One of them is partial reprogramming of the cell using some of the Yamanaka factors. This will in effect reverse the epigenetic signature, the gene expression and the proteome back to an earlier point closer to our youth. The question is, does it also change the transcriptome? If not, aging-related changes would again bring the cell back to an impaired state. If it does turn back the temporal needle of the transcription program to where it was in our twenties, then it would take decades for the cell to become impaired again. The only catch is that this is the same path the cell would take if it were reverting to an embryonic cell state, and that state can lead to cancer. So does partial reprogramming fully protect against cancer? One cannot know until many years later. The second technology is an arbitrage. Signaling and regulatory molecules circulating in the plasma of the young are injected into the circulatory system of the old. As those molecules enter the impaired cells, they reset the proteome of each cell back to how it was in youth, thereby rejuvenating those cells. This does not stop the legacy transcription in the cell, which after a point would begin the degradation all over again. The question here is: if the pro-youth molecules are injected repeatedly, would that at some point ‘flip’ the transcriptome to how it was during youth? If yes, then it would take decades before the cell became impaired again.

Human biology is incredibly complex. But does it have to be this complex? The complexity arises to maintain autologous management of the entire body. But can it be improved? Why do we need to generate voltage only from the food we eat? Why can’t we re-engineer ourselves so that we need only sunlight for energy, like trees and plants do so beautifully? So many of our body’s parts are devoted to eating, digestion and excretion. If we did not need to eat to generate electrical energy we could do away with 50% of our organs. Also, why can’t we store electrical energy in our body or cells? We humans have created batteries to store electrical power, so are we now ahead of evolution? Can we create an alternate source of electrical energy in our cells? We have the intelligence to do it. Can we obviate the need for oxygen? We will also be able to safely edit the process of embryogenesis to alter our human organs, systems and form. I guess all of this is possible in the distant future. It will all start with our control over biological age.