Showing posts with label Mystery of Aging. Show all posts

Thursday, 23 February 2023

AUTOLOGOUS REGULATION

Our biological destiny is inherited in every one of our cells, and DNA is the repository of that information. DNA is another incredible wonder of Nature. Todd Smith gives a great description (1): Six billion base pairs of DNA are packaged into 22 pairs of chromosomes, plus two sex chromosomes. Each base pair is 3.4 angstroms in length (0.34 nanometers, or ~0.3 billionths of a meter), so six billion base pairs (all chromosomes laid out head to toe) form a chain that's two meters long. If we could hang this DNA chain from a hook, it would be slightly taller than an average human. But that's just the DNA from one cell. Each of us has around 50 trillion cells (50,000 billion). If we took the DNA from all of those cells and laid it out in a linear fashion, it could wrap around the earth 2.5 million times, or reach to the sun and back 300 times! Yet cells manage to pack all that DNA into a structure so small we can't even see it without a microscope.
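The arithmetic in that description is easy to check for yourself. Below is a minimal sketch in Python. The base pair count, base pair length and cell count come from the passage above; the Earth's circumference (about 40,075 km) and the Earth to Sun distance (about 149.6 million km) are standard reference values that I have added and are not part of the quoted description.

```python
# Figures taken from the description above.
BASE_PAIRS_PER_CELL = 6e9        # six billion base pairs
BASE_PAIR_LENGTH_M = 0.34e-9     # 0.34 nanometers per base pair
CELLS_PER_BODY = 50e12           # around 50 trillion cells

# Assumed reference values (not from the quoted description).
EARTH_CIRCUMFERENCE_M = 4.0075e7  # about 40,075 km around the equator
EARTH_SUN_DISTANCE_M = 1.496e11   # about 149.6 million km


def dna_length_per_cell_m() -> float:
    """Length of all the DNA in one cell, laid end to end, in meters."""
    return BASE_PAIRS_PER_CELL * BASE_PAIR_LENGTH_M


def dna_length_per_body_m() -> float:
    """Length of the DNA from every cell in the body, in meters."""
    return dna_length_per_cell_m() * CELLS_PER_BODY


def earth_wraps() -> float:
    """How many times the body's DNA could circle the Earth."""
    return dna_length_per_body_m() / EARTH_CIRCUMFERENCE_M


def sun_round_trips() -> float:
    """How many times the body's DNA could reach the Sun and come back."""
    return dna_length_per_body_m() / (2 * EARTH_SUN_DISTANCE_M)


if __name__ == "__main__":
    print(f"DNA per cell: {dna_length_per_cell_m():.2f} m")
    print(f"Wraps around the Earth: {earth_wraps():,.0f}")
    print(f"Round trips to the Sun: {sun_round_trips():,.0f}")
```

With these inputs the per-cell length comes out at about two meters, and the whole-body figures land close to the 2.5 million wraps and roughly 300 round trips quoted above.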

This long hard disk is twisted, braided and compressed so amazingly into the tiny nucleus of our tiny cell. Each tiny cell contains all the information needed to build a complete living organism or human being. Basically it carries two types of information. The first is for autologous regulation: continuously managing itself as per an inherited temporal program. This silences most of the genes and activates only those that give the cell its identity and characteristics. It uses multiple layers of tools, and collaborations between those tools, to accomplish this as per the instructions embedded in it. The second type of information governs its mesh-like behaviour. A cell functions on its own as per the type of cell it becomes, but it also functions collectively with other cells to build as per the design of the organism. On top of this mesh there are non-cellular players like bioelectrical networks that influence each cell individually and also collectively, as another language of communication amongst them.

Cells are very crowded places: there are some 42 million protein molecules in a simple cell, revealed a team of researchers led by Grant Brown, a biochemistry professor in the University of Toronto's Donnelly Centre for Cellular and Biomolecular Research. The majority of proteins exist within a narrow range, between 1,000 and 10,000 molecules. Some are outstandingly plentiful at more than half a million copies, while others exist in fewer than 10 molecules in a cell. These molecules move extremely fast inside the cell. Ken Shirriff, in a blog post where he quotes from the book Molecular Biology of the Cell, writes: You may wonder how things get around inside cells if they are so crowded. It turns out that molecules move unimaginably quickly due to thermal motion. A small molecule such as glucose is cruising around a cell at about 250 miles per hour, while a large protein molecule is moving at 20 miles per hour. Note that these are actual speeds inside the cell, not scaled-up speeds. I'm not talking about driving through a crowded Times Square at 20 miles per hour; to scale this would be more like driving through Times Square at 20 million miles per hour!

Because cells are so crowded, molecules can't get very far without colliding with something. In fact, a molecule will collide with something billions of times a second and bounce off in a different direction. Because of this, molecules are doing a random walk through the cell and diffusing all around. A small molecule can get from one side of a cell to the other in 1/5 of a second.

 As a result of all this random motion, a typical enzyme can collide with something to react with 500,000 times every second. Watching the video, you might wonder how the different pieces just happen to move to the right place. In reality, they are covering so much ground in the cell so fast that they will be in the "right place" very frequently just by chance.

 A rendition of a cross section of a cell and how crowded it is.

 In addition, a typical protein is tumbling around, a million times per second. Imagine proteins crammed together, each rotating at 60 million RPM, with molecules slamming into them billions of times a second. This is what's going on inside a cell.

In the super tiny, tightly packed strands of DNA, heritable intelligence decides which gene (a segment of the DNA) will be read and which parts of the strands will be tightly sealed to avoid being read. The ‘reading’ of the strands is done by a process using enzymes and many floppy, phase-changing proteins, as described in a previous post. So many things have to come together at the right place for the gene to be read, all inside a tiny, tightly packed part of a tiny cell.

From what is read and what is not read in our DNA, a cell gets its identity and function. Only 10% to 20% of the coding genes are active at any given time in a cell. There is intelligence even in the spatial arrangement of each of the 200 types of cells. It’s as if each type of cell were a puzzle piece of a particular color and shape, and Nature arranges them to form 80 different 3-D organs with incredible functions, like our eyes, which allow us to see, and our liver, which does complex processing. Cells also form bones, cartilage, tendons and muscles. All of these different things are made from the same basic cell. And each cell can be made to turn into any other type of cell. Unbelievably, each cell has the information on its ‘hard disk’ to build each and every one of the organs, bones, muscles and skin. We literally start from a single cell!

As we read in my earlier post Headwaters, this ‘reading’ or transcription of our DNA is quite pervasive and is observed in 85% of our genome. Of this, only 2% is involved in protein coding. The rest is involved in regulating this 2% and its translation. The more complex the organism, the bigger the ratio of non-coding to coding DNA, but this tells only one part of the story.




Even in such a crowded cell with such a huge genome, Nature maximizes this space by alternatively splicing 95% of the genome. So instead of 50,000 genes (coding and non-coding) generating 50,000 transcripts, not only is 85% of the genome transcribed, but 98% of this transcriptome undergoes alternative splicing, creating uncountable isoforms! By alternative splicing we mean that the same region of our genome can be ‘read’ in multiple versions. Suppose we mark one region from 1 to 10 and the neighbouring region from 11 to 20 as two genes. Due to alternative splicing, the first gene can be transcribed as 5 to 9, or 3 to 6, or 2 to 9, making 3 transcripts from the same gene. This splicing can also include neighbours, so a transcript can run from 7 to 15 or 3 to 12, etc., to explain it simply.

In the preceding Headwaters post we learnt how most of the transcription from non-coding regions, along with some proteins, creates layer upon layer of regulation of the protein-coding genes, driving the changes that take us from an egg to an adult and, after puberty, launching the process of aging. Now in this post we find that this is not all that happens in the genome and its housing structures like histones and chromosomes. On top of this there is the spliceosome, which cuts up what is transcribed from the genome, producing not just linear transcripts along its length but an unending variety of isoforms due to rampant alternative splicing. Look at the packaging brilliance of Nature: a 2 meter long DNA, 85% of which is transcribed, inside a nucleus that is 10 microns across (one micron is one millionth of a meter) would be miraculous enough, but Nature maximizes this by adding pervasive alternative splicing that creates multiple transcripts from the same gene, thereby multiplying many times over the number of transcripts produced from those 2 meters.
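The numbered-region example above can be played with in a few lines of Python. This is only a toy sketch of my simplified picture, not of real splicing: the gene names `gene_A` and `gene_B` and both helper functions are made up for illustration, and every transcript is treated as one unbroken stretch, whereas a real spliceosome joins separate exons and skips the introns between them.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) positions on the toy genome

# Two neighbouring "genes" laid out as in the example: 1 to 10 and 11 to 20.
# The names are hypothetical labels for illustration only.
GENES: Dict[str, Span] = {"gene_A": (1, 10), "gene_B": (11, 20)}

# The alternative transcripts named in the example.
TRANSCRIPTS: List[Span] = [(5, 9), (3, 6), (2, 9), (7, 15), (3, 12)]


def genes_touched(transcript: Span) -> List[str]:
    """Return the genes that a transcript overlaps, in genome order."""
    start, end = transcript
    return [
        name
        for name, (gene_start, gene_end) in GENES.items()
        if start <= gene_end and end >= gene_start
    ]


def count_possible_spans(region: Span) -> int:
    """Count every distinct unbroken stretch that fits inside a region."""
    length = region[1] - region[0] + 1
    return length * (length + 1) // 2


if __name__ == "__main__":
    for transcript in TRANSCRIPTS:
        print(transcript, "->", genes_touched(transcript))
    print("Stretches within one gene:", count_possible_spans((1, 10)))
    print("Stretches across both genes:", count_possible_spans((1, 20)))
```

Even in this stripped-down model a single 10-position gene allows 55 different unbroken stretches, and the two neighbours together allow 210, which gives a feel for how quickly the number of possible transcripts multiplies.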

 

 




From Universal Alternative Splicing of Non Coding Exons by Tim Mercer et al.

Only a limited number of transcripts, whether of a full gene or an alternatively spliced gene, translate into protein. In my previous post Headwaters we read about how these shapeless, floppy proteins gather near a gene activation site and magically phase change into a condensate that hovers over the site. Similarly, a different condensate activates splicing.

An article published in the journal Genome Biology on 28 November 2018 by Dr. Steven Salzberg et al. states the following: “We assembled the sequences from deep RNA sequencing experiments by the Genotype-Tissue Expression (GTEx) project, to create a new catalog of human genes and transcripts, called CHESS. The new database contains 42,611 genes, of which 20,352 are potentially protein-coding and 22,259 are noncoding, and a total of 323,258 transcripts. These include 224 novel protein-coding genes and 116,156 novel transcripts. We detected over 30 million additional transcripts at more than 650,000 genomic loci, nearly all of which are likely nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells.”
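To put the figures in that excerpt into proportion, here is a small Python sketch. All the counts are copied from the quoted text; the only liberty taken is treating "over 30 million" as exactly 30 million, so the last ratio is a lower bound. The function name `summarize` is my own.

```python
# Counts copied from the quoted CHESS excerpt.
PROTEIN_CODING_GENES = 20_352
NONCODING_GENES = 22_259
TOTAL_GENES = 42_611
CATALOGUED_TRANSCRIPTS = 323_258

# "Over 30 million" in the excerpt; used here as a lower bound (assumption).
ADDITIONAL_TRANSCRIPTS = 30_000_000


def summarize() -> dict:
    """Derive a few simple ratios from the quoted counts."""
    return {
        # Do the two gene categories add up to the stated total?
        "genes_add_up": PROTEIN_CODING_GENES + NONCODING_GENES == TOTAL_GENES,
        # Average number of catalogued transcripts per catalogued gene.
        "transcripts_per_gene": CATALOGUED_TRANSCRIPTS / TOTAL_GENES,
        # Additional transcripts for each catalogued transcript.
        "additional_per_catalogued": ADDITIONAL_TRANSCRIPTS / CATALOGUED_TRANSCRIPTS,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, "=", value)
```

The gene counts do add up, the catalogue averages between seven and eight transcripts per gene, and the additional transcripts outnumber the catalogued ones by more than ninety to one.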




The interesting thing to note is the huge number of transcripts they found: 30 million! They claim that most of them are non-functional, but Nature rarely spends resources to construct huge volumes of non-functional things. The non-protein-coding transcripts too have very important roles. In a paper titled ‘Pervasive Transcription of the Human Genome Produces Thousands of Previously Unidentified Long Intergenic Noncoding RNAs’ by Matthew J. Hangauer et al., the authors say: “It is now becoming more and more clear instead that, far from being genetic “deadwood” these repetitive expanses are actively and deliberately transcribed into non-coding RNAs which play a major role in regulating gene expression and silencing, organizing nuclear architecture, compartmentalizing the nucleus, and modulating protein function.” My previous post explains in detail the various types of non-coding transcripts and the regulatory roles they play, but here we have additionally examined the alternative splicing that generates not only a variety of coding transcripts but also, as we read above, a huge number of non-coding transcripts.

What is fascinating is how these transcripts govern their own births. If you recall, we covered long non-coding RNAs (LncRNAs) in the previous post. In a paper titled ‘Epigenetic regulation of alternative splicing: How LncRNAs tailor the message’, authors Pisignano and Ladomery write about some of the ways in which LncRNAs regulate alternative splicing, which in turn leads to various transcripts including LncRNAs. An excerpt from their paper: “Both short (<200 nt) and long (>200 nt) non-coding RNAs can contribute to the regulation of alternative splicing in many different ways; either indirectly by regulating the activity of splice factors; or directly, by interacting with pre-mRNAs. Long non-coding RNAs (lncRNAs) are particularly well suited to these roles due to their demonstrated capacity to act as regulatory molecules that modulate gene expression at every level. Either alone, or in association with partner proteins, these long RNA polymerase II transcripts have been shown to take part in a wide range of developmental processes and disease in complex organisms.” So what are the ways they mention in which LncRNAs regulate alternative splicing?

1.     LncRNAs regulate alternative splicing through chromatin modification: An intimate relationship exists between lncRNAs and chromatin conformation.  LncRNAs regulate chromatin modifications by recruiting or directly interacting with histone-modifying complexes or enzymes at specific chromosomal loci. A possible lncRNA-mediated crosstalk between histone modifications and the pre-mRNA splicing machinery has also been proposed. Several lncRNAs appear to control important aspects of chromatin organization including chromatin looping, either remaining tethered to the site of transcription or moving over distant loci. 

2.     LncRNAs regulate pre-mRNA splicing through RNA-DNA interactions: LncRNAs can tether DNA forming an RNA-dsDNA triplex by targeting specific DNA sequences and inserting themselves as a third strand into the major groove of the DNA duplex. These are known as R-loops; three-stranded nucleic acid structures, composed of RNA–DNA hybrids, frequently formed during transcription. Aberrant R-loops are generally associated with DNA damage, transcription elongation defects, hyper-recombination and genome instability. Recent lines of evidence indicate a potential role for R-loops in alternative pre-mRNA splicing. A class of lncRNAs, the so-called circular RNAs (circRNAs), are abundant, conserved transcripts that originate from a non-canonical AS process (back-splicing) leading to the formation of head-to-tail splice junctions, joined together to form circular transcripts.

3.     LncRNAs regulate pre-mRNA splicing through RNA-RNA interactions: Identified in multiple eukaryotes, Natural Antisense Transcripts (NATs) are a class of long non-coding RNA molecules, transcribed from both coding and non-coding genes on the opposite strand of protein-coding ones. Regardless of their genomic origin, NATs can hybridize with pre-mRNAs and form RNA-RNA duplexes. In some cases, a double function is also possible, and NATs can encode for proteins on one hand, while at the same time working as non-coding molecules modulating the splicing of a neighbouring gene’s transcript. 

4.     LncRNAs regulate pre-mRNA splicing by modulating the activity of Splicing Factors: lncRNAs interact in a dynamic network with many SFs and their pre-mRNA target sequences to modulate transcriptome reprogramming in eukaryotes. LncRNAs regulate the localization and phosphorylation status of Splicing Factors. 

 The authors conclude by stating that “With the increasing prevalence of splicing events and the discovery of over a hundred thousand lncRNAs, it is likely that the involvement of lncRNAs in regulating AS is far greater than the currently known.”

 

  

Regulation of pre-mRNA splicing by lncRNAs. LncRNAs (red) are able to control pre-mRNA splicing by (a) modifying chromatin accessibility through recruiting or impeding access to chromatin modifying complexes at the transcribed genomic locus. In some cases, this might result in more drastic long-range structural changes; (b) interacting with the transcribed genomic locus through an RNA-DNA hybrid; (c) hybridizing with the pre-mRNA molecule (light blue); (d) promoting SF recruitment or by sequestering SFs into specific subnuclear compartments, thereby interfering with SF activities. Credit: Epigenetic Regulation of Alternative Splicing: How LncRNAs Tailor the Message. Authors: Giuseppina Pisignano and Michael Ladomery

In my preceding post Headwaters we saw various ways in which many types of non-coding RNAs regulate gene expression, not only inside the cell but also through the circulating secretome. Here we saw how alternative splicing leads to diversity in both proteins and non-coding transcripts by creating alternative transcripts from the same gene. But what is amazing is that non-coding RNAs influence the spliceosome and its alternative splicing. A very interesting paper titled ‘Aging is associated with a systemic length-associated transcriptome imbalance’ by Dr. Luis Amaral et al. finds that as we age, longer transcripts decline, and many of them are associated with longevity genes. The authors cite various possible causes of these changes, such as heat shock proteins leaving translation with truncated protein lengths, and the spliceosome and splice factors deliberately producing shorter transcripts. But the best clue is that in a subset of tissues and cell types they found the exact opposite happening: there, short transcripts are seen reducing and long transcripts are seen increasing. So what is this a dead giveaway of? A temporal program of autologous regulation. The age-related changes are not random but are orchestrated by the transcription and splicing machinery and their co-players. In a paper titled ‘Aging associated changes in the expression of LncRNAs in human tissues reflect a transcriptional modulation of ageing pathways’, Dr. João Pedro de Magalhães et al. observed that LncRNAs are very tissue- and lineage-specific and typically show highly specific spatio-temporal expression patterns. This again is evidence of an intricately designed regulatory plan that unfolds along the timeline of the living organism. All this intricately complex regulation in such a tiny environment is for the spatial and temporal organization of a life form:

Spatial organization: Imagine that a tiny cell, 1/10th the diameter of a human hair, reads information that tells it where it should locate itself with respect to the other cells in our body. So a cell that is designated to be an eye cell, as it emerges from the multiplication of cells from one single fertilized egg, knows it has to move precisely towards the sockets being formed in the head, and then through epigenetic changes it becomes an eye cell! It will not float away and end up on the hand, or turn into a skin cell in the eye. The precision is mind-boggling. Where is that information, that instruction that it must move there to become an eye cell? It’s already labeled in its DNA. Imagine tens of trillions of cells, each knowing exactly where it needs to locate itself in the 3-dimensional space of the life form and then what it needs to become to form various organs, tissues, muscles and bones! It must coordinate and jostle with its neighbours to land at its physical destination. Dr. Michael Levin says there is a bioelectrical memory which connects all cells in a mesh and guides each cell to where it needs to be. This process is called spatial organization.

Temporal organization: Once a cell takes its place and its epigenetic buttons are clicked to transform it into a type of cell, a whole different process of organization begins. In this process the 10% to 20% of the coding genes that are typically active begin to print proteins that fulfill their various tasks in line with their cell’s type. So a pancreatic cell will code for insulin, for example. These are the functional tasks of the cell, but in parallel, as we have read above, there is also highly complex regulation of those protein-coding genes and their transcripts. This continuous background regulation creates constant changes in the cell from birth till death. Initially these macro changes are related to development: they make us grow from an egg to an adult. After puberty the main theme of these changes is to dial down important repair and recycling systems so that the life form dies within a given range. These latter changes manifest as aging. These regulatory changes of the spliceosome, alternative splicing, and non-coding and coding gene transcription all together lead to a particular proteomic configuration, which in turn influences the efficiency of all the tasks that are done by those proteins. The changes stop some proteins, change some proteins, reduce some proteins and increase some proteins. This is ongoing all our lives. Ironically, these transcriptional and proteomic changes also affect the cell’s DNA itself: double strand breaks progressively increase as we age, and their repair efficiency reduces just when it’s needed even more. This brings us to the main observation driving this post:

Autologous Regulation: Nature has created this unit of mind-boggling complexity and intricate design: the cell. All life forms on our planet are built from this unit. Incredibly, this unit produces the regulators that govern it! It produces transcripts and proteins that regulate the regulators! So basically it writes its own biological destiny. Inherited genetic factors and lifestyle factors do also influence our biological destiny, but only in a narrow range. The main driver remains the inherited repository of information in the cell itself. The information it carries enacts its spatial organization, and the same source of information also enacts its temporal organization. It transcribes transcripts that influence the transcriptional machinery and splicing machinery to decide whether to transcribe the entire gene, to transcribe an alternate version, or to silence it. Some of those transcripts, along with some of the translated proteins, will make further alterations to the transcriptional and splicing decisions in a continuing loop of self-regulation driving the two major themes: development before adulthood and aging after adulthood. Besides these two main themes there are also changes that occur due to environmental stimuli. But overall, unless they are extreme or fatal, these are dominated by the two main themes. Some of these instructions are exchanged between cells through direct connections with neighboring cells or through the secretions of one cell entering another.

This self-regulation is a very interesting process created by Nature which we rarely get to witness anywhere else. It’s easy to miss how remarkable this technology developed by Nature and evolution is. DNA carries information that, when read sequentially, builds us into an adult starting from a single cell, and DNA also carries information that, when read sequentially after puberty, leads to gradual aging and death. We inherit both our youth code and our death code from the moment we are a fertilized egg. Let me try to explain it with an example. Let’s say a branch office (the cell) is opened in which there is no manager but only an SOP manual: a standard operating procedure master handbook for the entire year that all the staff have to follow. It gives instructions to the HR department on what kind of staff to hire. It has various printers that print out instructions daily, giving tasks to all the staff. But imagine that only 30% of the employees actually do the tasks that produce the parts that the branch manufactures. The other 70% of the employees get instructions daily from the SOP to manage those 30% and what they produce, by making changes in the master SOP that is giving daily instructions to those 30%. So the SOP itself has instructions to make daily changes in the SOP, thereby resulting in changes in the production. But those changes and their edits are so complex that they require 70% of the employees just to take new instructions daily from the SOP and come over and edit the future chapters of the SOP manual. These self-edit instructions flow out sequentially as each new page of the SOP is read each new day of the year. Other branches also exchange data (the secretome) and send their employees to make edits in each other’s SOPs. In the beginning there is tremendous excitement, new teams are hired, and production goes full swing, making wonderful products that sell very well (puberty). At its peak the training reaches a point where a team of employees can go and open another branch (reproduction). But once that is done the SOP begins to give out instructions to edit itself (autologous regulation) so that in forthcoming pages the production quality, hiring quality and raw material quality are all purposely, gradually brought down (aging). In the beginning it’s hard to notice, but after some months of such gradual changes the consequences begin to show and unsold products start piling up. Cash flow is affected, salaries are affected. And what at its peak was a dynamic factory full of enthusiastic, productive workers becomes demoralized and stressed out, leading to even further degradation at the branch and creating a snowballing stranglehold from which the branch can’t escape. At some point it shuts down, which is death. This is done so that there is no overcrowding of branches creating an oversupply that would destroy the company itself, and also to ensure that fresh, young, enthusiastic and hard-working staff are recruited with every new branch.

Coming back to our biology, there are two basic goals of autologous regulation. One is to build an adult from a fertilized egg. The second is to gradually make the adult age, culminating in sufficient degradation to cause death anywhere between the average and the maximum lifespan of that species. One key reason for this regular recycling every generation emerges from a paper published last year by Dr. Vadim Gladyshev, from which we learnt of a marvelous event occurring during early embryogenesis: all the inherited errors and insults of the germline cells are wiped clean to make a brand new, error-free baby. Have humans outgrown this need for regular recycling? Can our intelligence help us to resolve the challenge of accumulating biological errors and insults? As mentioned in my previous post, I continue to take inspiration from certain life forms which seem to be immortal in permanent youth. I cite the Ginkgo biloba tree because a researcher, Dr. Richard Dixon, has studied it. Even after a thousand years, the tree that he studied still had the photosynthesis efficiency and immune resilience of a 20 year old tree. The question arises as to how it’s able to do this. In almost all other life forms the DNA harbors temporal instructions that, as we read above, make changes to the spliceosome, the splicing factors, the transcription factors and the epigenetic marks, which result in the gradual collapse of our repair and recycling systems and ultimately death. How is Ginkgo biloba allowing all the changes related to development until it reaches adulthood, but freezing or blocking or erasing further regulatory changes, thereby permanently remaining in youth? Many scientists wonder: if we can prolong our youth, would we still die when we reach 122 or 125? The Ginkgo biloba tree says no.

Two technologies are moving towards reversing human biological age. One of them is partial reprogramming of the cell using some of the Yamanaka factors. This will in effect reverse the epigenetic signature, the gene expression and the proteome back to an earlier point closer to our youth. The question is, does it also change the transcriptome? If not, aging-related changes would again bring the cell back to an impaired state. If it does also turn back the temporal needle of the transcription program to where it was in our twenties, then it would again take decades for the cell to get impaired. The only catch is that this is the same path that the cell would take if it were reverting to an embryonic cell state, and that state can lead to cancer. So does partial reprogramming fully protect against cancer? One cannot know until many years later. The second technology is an arbitrage. Signaling and regulatory molecules circulating in the plasma of the young are injected into the circulatory system of the old. As those molecules enter the impaired cells they reset the proteome of each cell back to how it was in youth, thereby rejuvenating those cells. This does not stop the legacy transcription in the cell, which after a point would begin the degradation all over again. The question here is: if the pro-youth molecules are injected repeatedly, would that at some point ‘flip’ the transcriptome to how it was during youth? If yes, then it would take decades before the cell would get impaired again.

Human biology is incredibly complex. But does it have to be this complex? The complexity arises to maintain autologous management of the entire body. But can it be improved? Why do we need to generate voltage only from the food we eat? Why can’t we re-engineer ourselves so that we need only sunlight for energy, as trees and plants do so beautifully? So many of our body’s parts are devoted to eating, digestion and excretion. If we did not need to eat to generate electrical energy we could do away with 50% of our organs. Also, why can’t we store electrical energy in our body or cells? We humans have created batteries to store electrical power, so are we now ahead of evolution? Can we create an alternate source of electrical energy in our cells? We have the intelligence to do it. Can we obviate the need for oxygen? We will also be able to edit the embryogenic process safely to alter our human organs, systems and form. I guess all of this is possible in the distant future. It will all start with our control over biological age.



Saturday, 14 December 2019

HEADWATERS: FROM WHERE THE AGING CASCADE BEGINS


ETERNAL YOUTH 




One of the greatest mysteries of biology is the source of the sequential orchestration of genetic events in the DNA. For example, when we are around 6 years old we begin to lose our milk teeth. What tells our body to do this at exactly that age? What decides the timing of puberty or menopause? Or, just after puberty, what tells our cells to bring down the efficiency of the protein production support machinery by 70% to begin aging? As stated in my earlier posts, there are deliberate, very harmful changes that occur in a timed manner to promote aging. If we can find out where all these instructions come from, maybe the detrimental ones could become a therapeutic or gene therapy target. Is it a single part of our brain that gives these instructions at a particular time or age? No. Apparently each of our 30 trillion cells has its own manager that releases time-regulated instructions, or temporal regulation. So it seems there is no single boss but 30 trillion managers, each managing its own cell and coordinating with the other managers. There is a postal system, enabled by our blood circulation, that scurries messages amongst all the managers, probably to react to environmental stimuli or to even out the changes in gene expression. The messages in this postal system are carried in envelopes: EVs (extracellular vesicles), whose cargo is the message. In our body something keeps time and releases instructions for developmental changes in the first phase and aging-related changes in the next phase. No one has been able to point out from where. Even the scientists who are computing clocks that measure changes in methylation in our genome, epigenome and rDNA have not been able to pinpoint the source of these changes. We will try to do some detective work later in this essay. Next we will share some of the mind-blowing discoveries, and the movies being made of them, showing us for the first time how genes are turned on. It is important for our quest to understand this.
The brilliant scientist Ibrahim Cisse. Credit: Bryce Vickmark
Ibrahim Cisse is a great success story. From a background with few resources in Niger, Africa, through education and merit he is today a biophysicist at MIT, USA. His tweak to single-cell high-resolution microscopy allowed him to take films of RNA transcription in action. This has changed the way we think about how genes are activated. It seems there are lots of tiny floppy proteins that coalesce into phase-change droplets at a target gene switch. They form a mesh in which many proteins flit in and out, sometimes within seconds. Collectively they turn on a gene. The length of their stay determines how many RNA copies get transcribed. Once the job is done these floppy proteins disperse like a flash mob.
It takes a village of proteins to turn on genes. W.K. Cho Science 2018
It is indeed fascinating to see how the required proteins gather at the required address. Then, on their own, as if by magic, they condense into droplets that coalesce to form a new village of fast-moving droplets. In this temporary body other proteins flit in, do their job and zip out, sometimes within seconds. Where do these transcription proteins come from? What gives each of those proteins the instructions for all of these complex tasks, which would seem to need some level of intelligence at such a nano scale?! And voila, the selected gene is turned on and transcription of an RNA copy begins. These proteins do not have a Google Maps that homes them in on the right address in the long DNA coils. They bump around like manic blind mice till they bind into the lock made for them. It’s so tempting to believe that some central intelligence is guiding and sequencing these trillions of gene controls: activation, silencing, methylation, acetylation, etc. But such a complex central control tower would be unmanageable. So this becomes our clue as to the source of all instructions. We will come back to it later.
Next let us review our DNA. The proteins, as we saw, execute almost all our processes. There may be 90,000 types of proteins, each one made in its designated cells and doing its given tasks. Where do proteins come from? Our DNA, but only 3.5% of our DNA is protein coding. Until recently scientists used to consider the remaining 96.5% junk DNA. But our DNA is the culmination of hundreds of millions of years of adaptations and optimizations. There is no way so much DNA would be wasted from generation to generation. Recent discoveries have shown that the non-coding 96.5% carries out some of the most important functions. Our body is a collection of cells. Each cell is born with the same DNA ‘brain’. From the time cells differentiate into the 200 different types that give our body its distinct organs and structures until they die, they follow instructions. The instructions come from a cell’s own coding DNA, from its non-coding DNA or from other cells. So just as I formed a conclusion about how aging is executed in our body, I have also come to believe that the answer to the greatest mystery of biology is this: which instructions will reach which cell, and when, is decided by the cells themselves as per Nature’s program carried in the non-coding section of the DNA.
Now I want you to imagine that there are 30 trillion offices around the world. Each office has a central computer with pre-installed software and a 3D printer (transcription/translation). Each office gets designated to a department (organs, teeth, bone, etc.). All the offices are connected by internet. The internet is the signalling system. This is a true democracy, so there is no concept of leaders. Each office is its own leader and co-exists and coordinates with other leaders. All the offices have a common instruction manual, but the pre-installed software controls which pages are visible and actionable and which pages cannot be opened. The pre-installed software in each office activates tasks that result in the smooth operation of the entire company. Each office does its part, and collectively the software orchestrates the functioning of a complex working enterprise. Our DNA is the pre-installed software in each cell (office). Only 3.5% of the software can assign printing jobs that result in the posting of instructions or actionable items. The rest of the software makes sure that, by executing orders individually, the offices collectively run our entire body and all its systems and functions. It’s like a live jigsaw puzzle of 30 trillion parts in which each part has a chip that ensures it goes and locks, on its own, into its correct location in the puzzle. If all cells carry the same blueprint, how do cells differentiate into the 200 different types, and how do they function in the role assigned and not some other role? It is done by control of gene expression.
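The office analogy can be made concrete with a toy sketch. The Python below is only an illustration of the idea, not a model of real gene regulation: the gene names, the two cell types and the sets of "open pages" are all invented for the example. Every cell holds the same manual, and a per-cell-type setting decides which pages can be read and printed.

```python
# Toy illustration of the office analogy: one shared manual (the genome)
# and a per-office setting that decides which pages may be opened.
# All gene names, cell types and page sets are hypothetical placeholders.

GENOME = {
    "gene_a": "protein A",
    "gene_b": "protein B",
    "gene_c": "protein C",
    "gene_d": "protein D",
}

# Pages of the manual each department is allowed to open (assumed values).
OPEN_PAGES = {
    "muscle_cell": {"gene_a", "gene_b"},
    "nerve_cell": {"gene_a", "gene_c"},
}


def expressed_proteins(cell_type):
    """Return the proteins a cell 'prints', given which pages it may open."""
    open_pages = OPEN_PAGES[cell_type]
    return sorted(GENOME[gene] for gene in GENOME if gene in open_pages)


if __name__ == "__main__":
    for cell_type in OPEN_PAGES:
        print(cell_type, expressed_proteins(cell_type))
```

Both cell types read from the same `GENOME` dictionary; only the set of open pages differs, which is the point the analogy makes about identical DNA producing different cell types.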
At any given time only a few genes, approximately 3% to 5%, are active in a cell. So how are a cell’s type, its function and its production of proteins, which in turn control various processes in our body, controlled or regulated? There are various mechanisms of gene regulation, structural as well as chemical. They include chromatin accessibility; histone modifications such as acetylation, deacetylation, ubiquitination and phosphorylation; DNA methylation and demethylation; alterations in binding affinity; repressors; and controls acting during transcription, during transport of mRNA, on the stability of mRNA, during translation and after translation. As one can see, the regulation of genes is highly complex, but it works in concert with the regulation of genes in other cells to achieve the purpose of such activity, both in the cell and collectively in the body. We are built and we operate based on these controls. Overall there are three basic types of changes that occur due to regulation of gene expression: developmental changes, which build us from a single egg into an adult; adaptive changes in response to stimuli; and finally aging-related changes (post-developmental changes). As soon as developmental changes stop, just a little after puberty, deliberate negative or harmful changes begin our multi-decade process of aging. These changes are also highly conserved to ensure that all humans degrade to the point of death. But the question still remains: from where do the instructions come to execute exactly the control action (these changes) needed in each cell at exactly the time it is needed?

We have seen above what a fascinating process activating a gene is. The activation and repression of genes in each cell’s DNA culminates in our creation and operation. What scientists have not yet been able to identify is where the activation and repression instructions come from. Not only that, the instructions that are not responses to stimuli arrive in a sequential manner. If they didn’t, we would turn 20, then suddenly turn old, then turn back into a 7-year-old, at random. Another aspect of the mystery is how the correct set of genes is switched on and off at exactly the correct location. If it were not, eyes would appear on the chest and a finger could grow on the skull. What is incredible is the low error rate over 200+ million years. The fidelity and precision of the system is mind-boggling. As mentioned above, controlling millions of complex processes in trillions of cells is impossible to achieve from a single source such as a part of our brain. It would be 100,000 times more complex than managing all the flights in the world from a single air traffic control tower. So we can infer that the origin of the instructions is also decentralized. In that case the only place in a cell that contains such data is our DNA. We have researched the coding part of our DNA quite a bit. From that we can conclude that the coding part needs upstream instructions to execute its control over gene expression and protein creation. From the little we have studied of the 96.5% non-coding part of the DNA, we have noticed some interesting functionalities. The majority of the non-coding part of the DNA (which does not print proteins) has been highly conserved for over 200 million years. The research papers that I read hinted that this portion of DNA controls ‘which’ instruction will be executed ‘where’ and, most importantly, ‘when’.
The ‘when’ part ensures we move from a baby to an adolescent and later to an elderly person in a sequential manner, not suddenly becoming old, then turning into a baby, then turning middle-aged, or following any such random order. This clock, which decides when to trigger which change and where, lies hidden and protected in the depths of our non-coding DNA. There was an ongoing debate in which one side said that almost all of the non-coding part of DNA is junk accumulated over the years and has no function. Edward Rubin's team at Lawrence Berkeley National Laboratory snipped out 3% of the non-coding DNA in mice and did not find any abnormalities. On the other side, Martin Sauvageau and colleagues at Harvard University and the Broad Institute found that when they created knockout mouse models lacking 18 long non-coding RNAs (lncRNAs), the result was major growth defects, including abnormalities in the lungs, heart, gastrointestinal tract and neocortex. A deletion in a small part of non-coding DNA that shows no abnormalities does not prove that the rest of the non-coding DNA has no function, whereas fatal abnormalities caused by deleting non-coding RNAs do show that non-coding sections of DNA have critical functions. Hundreds of recent studies have uncovered more and more functions of the non-coding elements. Data from ENCODE suggests that more than 75% of the human genome is transcribed into RNAs, whereas only 3% of these RNAs are from protein-coding genes (Djebali et al., 2012; Ecker, 2012; Pennisi, 2012). The remaining 72% of transcribed RNAs will have functions, most of which are yet to be discovered.
This is how we visualize DNA

But this is how it functions with layers of regulation

There are about 20,000 genes in the human genome, but as many as 1 million regulatory elements in the non-coding part of DNA. If we compare the number of protein-encoding genes in worm and human, for example, humans don’t have that many more protein-coding genes than worms. The non-coding genome scales up much better with the developmental and pathological complexity of an organism. The fraction of protein-coding DNA in the genome decreases with increasing organismal complexity. In bacteria, about 90% of the genome codes for proteins. This number drops to 68% in yeast, to 23-24% in nematodes and to 1.5-2% (or 3.5% as per some studies) in mammals. Using data from comparative genetic studies, researchers found that the 300,000 functional elements they had identified made up about 70 per cent of the evolutionarily conserved non-coding DNA shared by mice and humans. Spatial organization is how regulatory elements know where to execute their tasks. How the regulatory elements contribute to activating a gene is determined not by a specific recognition tag but by where precisely the gene is in the genome, says scientist Francois Spitz. The DNA winds around histones to form nucleosomes, which fold again into a 30 nm fiber, which in turn forms loops. This packaged DNA, called chromatin, regulates transcription by remaining tightly condensed or by opening up to allow the village of phase-separating proteins to activate genes. All this folding is not stochastic but precise. The control of regulation occurs due to specific addresses or locations. Any misfolding can result in incorrect genetic actions. The non-coding DNA controls our biological cycle by structural and chemical manipulations. The million non-coding regulatory elements are interspersed with coding regulatory elements. There are co-regulators, and there are regulatory elements that control other regulatory elements, which sometimes in turn control still other regulatory elements.
There are hundreds of different mechanisms for manipulating transcriptional activity, acting individually or in clusters and creating a complex web of intricately managed cellular and biological regulation. This complex management leads us to grow from an egg into an adult and from young to old. The non-coding part of our DNA starts this complex regulatory cascade.
Some of the important elements of non-coding DNA are listed below. This is not an exhaustive listing; it is meant to show the massive amount of regulation that originates from the non-coding part of DNA and how complex its control over our biological life is:

Transposons: I would like to quote from a very good essay by Francesca Tomasi and Olivia Rhoades: Nearly 46% of our DNA is made up of transposons! For millions of years, transposons enjoyed plenty of travel around our genome. They inserted themselves throughout our evolving DNA for as long as they could before these changes started to make the host human less suited for survival in a given environment. When their random insertion provided some sort of life advantage—increased ability to absorb certain nutrients, for instance—or took place with no negative effect, the resulting modifications to the genome were passed on to future generations. Any insertions that caused death or illness, meanwhile, were a lot less likely to make it past a single generation. As such, over millions and millions of years of trial and error, transposons gradually integrated themselves in increasing numbers throughout our genomes. Eventually, their ability to move without negative consequence likely became, for the most part, saturated. And as a result, over 99% of the transposons in the human genome lost their ability to move. But we still have some active transposable elements within us: sometimes they can wreak havoc and cause disease. At a much finer level of resolution, transposons contribute to creating genes, modifying them, and programming and reprogramming them. Many transposons and retroelements contain captured gene fragments and can be part of gene regulatory regions. The bottom line for genomes is that the cleavage and resection of DNA by transposases virtually guarantees sequence variation, genome scrambling, and the appearance of transposons at rearrangement breakpoints. Simply put, transposases drive genome evolution.

Non Coding RNA: a good reference is a chapter: Loudu Srijyothi, Saravanaraman Ponne, Talukdar Prathama, Cheemala Ashok and Sudhakar Baluchamy (October 10th 2018). Roles of Non-Coding RNAs in Transcriptional Regulation, Transcriptional and Post-transcriptional Regulation, Kais Ghedira, IntechOpen, DOI: 10.5772/intechopen.76125. Available from: https://www.intechopen.com/books/transcriptional-and-post-transcriptional-regulation/roles-of-non-coding-rnas-in-transcriptional-regulation.
Non-coding RNAs are functional RNA molecules that are transcribed from our non-coding DNA but do not code for proteins. There are many types of non-coding RNA, such as small non-coding RNAs (sncRNAs): miRNA, piRNA, siRNA and snRNA; and long non-coding RNAs (lncRNAs): lincRNA, NAT, eRNA, circRNA, ceRNA and PROMPTs. ncRNAs play critical roles in defining DNA methylation patterns as well as in chromatin remodeling, thus having a substantial effect on epigenetic signaling. ncRNAs also play roles in transcriptional and post-transcriptional regulation. Methylation patterns change in a linear fashion throughout our life, to such an extent that they are used by algorithms to predict biological age. ncRNA regulation is tissue specific and also changes in a linear fashion following the stages of our lifecycle: first ensuring developmental changes and then, just after puberty, age-related changes.
Find below a diagram encompassing the RNA universe:


Small interfering RNA (siRNA) and micro RNA (miRNA): These molecules induce mRNA degradation or translational repression, which changes gene expression. Surprisingly, about 60% of the translated protein-coding genes are negatively regulated by miRNAs! To make it even more complex, there are lncRNAs which bind and degrade target miRNAs, thereby upregulating the expression of the genes those miRNAs repress. Layer upon layer of regulation. miRNAs also play a role in cell proliferation, cell differentiation, development and cell death.
Long non coding RNAs: Their actions can be divided into 4 types. The diagram below will explain them:

LncRNAs have diverse regulatory functions and might regulate gene expression by modulating chromatin remodeling, cis and trans gene expression, gene transcription, post-transcriptional regulation, translation, protein trafficking and cellular signaling. These below are some of the ways in which lncRNAs regulate:
Transcriptional regulation is done by:
Enhancer ncRNAs – eRNAs – as name suggests they upregulate gene expression.
Activating ncRNAs – transcriptional activating function. Although function is similar to eRNAs their mechanisms are different.
lncRNAs that recruit chromatin modifiers- they recruit chromatin remodeling complexes to specific DNA location to activate or repress genes.
ncRNAs involved in genomic imprinting- participate in epigenetic silencing of an allele inherited from either parent. One example is X-chromosome inactivation in females.
Post-transcriptional regulation is done by acting as competing endogenous RNAs that regulate microRNA levels, which in turn modulate mRNA levels by altering mRNA stability, mRNA decay and translation. Some of their regulatory actions:
LncRNAs as a source of miRNAs- 50% of the miRNAs are produced from non-coding transcripts. LncRNA genes contain embedded miRNA sequences, which may be located within an exon or an intron or occur in clusters within the genome. Though the sources are different, the pathways converge at the level of the pre-miRNA structure, which produces miRNA.
LncRNAs as negative regulators of miRNA- as mentioned above, miRNAs act as negative regulators of gene expression. LncRNAs competitively bind them and degrade them, thereby upregulating target gene expression.
LncRNA-mediated mRNA degradation- they degrade mRNA directly, independent of miRNAs.

Cis Regulatory Elements: Non-coding regulatory elements near a gene. Cis-regulatory events are complex processes that involve chromatin accessibility, transcription factor binding, DNA methylation, histone modifications and the interactions between them, as well as control of chromosomal replication, condensation, pairing and segregation. Types:
Promoters: help in initiating transcription of a gene.
Enhancers: provide binding sites for Transcription Factors to enhance gene expression.
Silencers: repress the gene after binding with Transcription Factors.
Response Elements: provide locational homing for Transcription Factors.
Insulators: act as a boundary wall.
Almost 1/3rd of the genome, about 1 billion base pairs, may be involved in cis-regulatory functions.

Trans Regulatory Elements: modify expression of distant genes.

Introns: Introns are non-coding elements inside a gene. They are transcribed along with the rest of the gene and then spliced out of the RNA before translation, leaving the coding segments (exons). Alternative splicing allows multiple proteins to be generated from the same gene. 90% of our protein-coding genes have introns, and of those 95% undergo alternative splicing! The human genome contains an average of 8.4 introns per gene: 139,480 in our entire genome. Introns account for 25% of our genome. Why our genome has so many conserved introns is still being discovered, but some of the regulatory functions that have been found are transcription initiation, transcription termination, time delay during transcription, alternative splicing, recruitment of nuclear export factors, recruitment of shuttle proteins, increased translation yields, etc.
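As a quick consistency check on the intron figures quoted above, dividing the total intron count by the average number per gene gives the number of intron-containing genes those two numbers imply. This is plain arithmetic on the quoted values, not an independent estimate; the variable names are only for illustration.

```python
# Figures quoted in the text above.
introns_per_gene = 8.4
total_introns = 139_480

# Number of intron-containing genes that these two figures imply together.
implied_genes = total_introns / introns_per_gene

print(round(implied_genes))  # prints 16605
```

So the two figures together correspond to roughly 16,600 intron-containing genes.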

Repeated non coding DNA sequences: repeated noncoding DNA sequences at the ends of chromosomes form telomeres. Telomeres protect the ends of chromosomes from being degraded during the copying of genetic material. Repetitive noncoding DNA sequences also form satellite DNA, which is a part of other structural elements. Satellite DNA is the basis of the centromere, which is the constriction point of the X-shaped chromosome pair. Satellite DNA also forms heterochromatin, which is densely packed DNA that is important for controlling gene activity and maintaining the structure of chromosomes.

ENCODE data suggests that 75% of the human genome is transcribed into RNAs, but only 3% of that comes from protein-coding genes. So the rest of the RNA and other factors have complex functions, as mentioned above, most of which are yet to be discovered. We read above how multiple transcription factors help activate a gene. Many of the regulatory sites and RNAs that recruit and guide these transcription factors are encoded in the non-protein-coding part of our DNA. Each has a function, creating an affinity for co-factors, with shapes that lock onto the exact location. There are regulatory factors that regulate other regulatory factors. For example, one set of factors would regulate as per their purpose, but then another layer of factors emerges for temporal reasons, as during aging, to bind and block the first layer from activating genes. Such factors, emerging from transcription of non-coding DNA, seem to regulate at various levels: pre-transcription, post-transcription, post-translation, etc. Both the spatial and the temporal organization of the genome are incredible, but what fascinates me most is the temporal organization of genomic activity. The little information we have discovered about the world of non-coding DNA shows incredibly complex and precise management of our body, achieved by managing the intracellular and extracellular environment through the management of gene expression and of what follows gene expression. To go one step further, one can say that which proteins get printed correctly, are stable and get activated, and which proteins are blocked at any given time in our lifecycle, determines the homeostatic status of our various systems, which culminates in our life stage at that time. The non-coding elements regulate not only gene expression and repression but ultimately also protein production and repression. Our biological destiny is defined by the proteome, which is regulated by the transcriptome.
The 90,000 types of proteins that help run our bodies are produced from 3.5% of the genome and regulated by the factors transcribed from the other 96.5%! We can safely deduce that the highly conserved part of the non-coding genome holds the ‘clock’ that triggers various transcriptions based on time or life stage. This temporal execution turns us from an egg into an adult and then gradually and deliberately, in a calibrated manner, destabilizes the homeostatic balance and the efficiency of important processes like repair to make us grow old and eventually die. Just like the developmental program, the aging program too is global, unfolding in every cell in our body. Change is constant in our lifecycle. I am not talking about the changes that occur for day-to-day operations. I am talking about macro, global changes. If there were no changes we would remain an egg. We would remain a baby if changes stopped after we developed into one. These changes make us an adult and then begin aging us till we die. Now here is what is fascinating: if we stop these lifecycle changes from occurring after we become an adult, there is no reason why we could not remain young forever. This is not theoretical speculation. Researcher Richard Dixon from the University of North Texas and collaborators discovered that a 1,000-year-old Ginkgo biloba tree's gene expression was the same as that of a 20-year-old tree, with no sign of senescence or deterioration. The tree's ability to photosynthesize, germinate seeds, grow leaves and resist disease and infections remains the same as that of trees a thousand years younger.

A 1,400-year-old Ginkgo biloba tree, planted by a Chinese Emperor of the Tang Dynasty (618-907), in full bloom at a Zen monastery in Shaanxi!

Our biological systems are so brilliantly designed that at their homeostatic peak they could maintain optimum operations almost forever. If the Ginkgo biloba tree can figure it out, why can't we? If we want to remain forever in homeostatic bliss and at our youthful peak, we will need to discover the region of non-coding DNA that gives birth to the elements that trigger the various changes leading to the aging phenotype. Once these are discovered, we would need to either edit or block those regions or elements. We would need to freeze our gene expression pattern as soon as we become an adult. The reward will be eternal youth. This, whenever it does happen, would be a permanent solution to remaining young, but what about now? Another strategy is to hack the factors and proteins that cause progressive age-related changes in the activation and repression of genes and replace them with factors and proteins from a young environment. This will change the gene expression signature back to what it was in youth. In turn that would make us young again. The only catch is that, unlike the permanent change to youth that becomes possible by discovering the birthplace of the elements that make the age-related changes in gene expression, the hacking protocol requires regular hacks for life to maintain youth.