Wednesday, September 23, 2015

Junk DNA: Nessa Carey's new book about the actually important stuff in the genome.

In 2001, when the first draft of the human genome was completed, researchers were surprised to learn that only 2% of the human genome codes for proteins. At the time, scientists were very focused on proteins and thought that there would be a much larger number of protein-coding genes in the human genome due to our complexity. The term "junk DNA" has been used to describe the other 98% of the genome. With only 20,000 protein-coding genes, the human genome contains almost the same number of genes as the simple roundworm and model organism C. elegans. However, C. elegans has very little excess DNA, suggesting that this junk DNA could be part of the explanation for the increased complexity of humans. This is the starting point for Nessa Carey's second book Junk DNA: A Journey Through The Dark Matter of the Genome, which explains the importance of the non-coding portion of the genome.

Some scientists have argued that the term junk DNA should be scrapped for a more neutral term like non-coding DNA. They suggest that the term is dated and inaccurate. In addition, calling it junk is rather pejorative and is based on the protein-focused view of the genome. Carey's book nicely demonstrates that the other 98% isn't always junk.

What is the other 98% of the genome good for then? Some non-coding DNA has well-established functions. For example, the centromeres are the stretches of DNA that allow the chromosomes to attach to the cell's chromosome segregation apparatus (the mitotic spindle) when the cell copies and divides its DNA. Another example is the telomeres, the lengthy repeat regions of DNA at the ends of the chromosomes. Because telomeres shorten with every cell division, they are linked with aging.


Junk DNA also encodes several special types of RNAs, including long non-coding RNA (lncRNA), microRNA (miRNA), and small interfering RNA (siRNA), that control gene expression. One of the earliest described examples of these special RNAs is found in the biology of the sex chromosomes. In XX females, one X chromosome is inactivated to ensure that genes on the X chromosome are not overexpressed. This process, called X chromosome inactivation, is controlled by a gene called Xist (X-inactive specific transcript). Xist encodes a long non-coding RNA, which coats one X chromosome and inactivates it (Xi). Interestingly, on the opposite strand from Xist is a gene called Tsix, which is expressed on the active X chromosome (Xa). The expression of these genes is mutually exclusive, ensuring that only one X chromosome remains active. The Xist/Tsix story highlights the power of special RNAs in controlling gene expression. These RNAs are the subject of intense research in both basic and clinical settings. Carey describes several approved drugs and promising clinical trials based on anti-sense approaches.

In short, Junk DNA is quite readable and should be informative for readers at any level of knowledge about molecular biology. My only complaint about the book is Carey's decision not to include protein or gene names in her writing. In the first chapter, she explains that half of her readers find such names disruptive. Instead, where applicable, she includes footnotes with the gene or protein names. Unfortunately for me, I am in the half that finds it disruptive to read footnotes to learn the name of the gene in question. Otherwise, the book is very up to date and comprehensive. I also liked her use of simple graphics to explain complex concepts in molecular biology. I recommend Junk DNA for those who want to learn more about why the non-coding regions of our DNA are not junk.

Sunday, September 13, 2015

The Emperor of All Maladies - comments on the second part of the PBS special

In all my reading about cancer biology, I have not yet tackled The Emperor of All Maladies, which is said to be the best book on the subject. Luckily, PBS and Ken Burns have delivered an excellent three-part series based on Siddhartha Mukherjee's 2010 book. This post covers the contents of part two, "The Blind Men and The Elephant."

This part of the series focuses on discovering the cause of cancer. The title, an allusion to the parable, refers to the fact that for many years scientists could not find the connection between the three major causes of cancer: viral, chemical, and genetic. The ideas were separated ideologically and scientifically. At conferences, the scientists that supported each of these ideas did not interact. It was only relatively recently that the connections between these causes were illuminated.

The first cancer-causing virus was discovered in 1911 by Peyton Rous, who described the viral origin of avian sarcoma (for more information check out this great story by Jessica Wapner). In 1964, Burkitt lymphoma was linked to the Epstein-Barr virus. These results led to a rapid increase in the focus on viral carcinogenesis with the idea that a vaccine could prevent cancer. This focus came at the cost of other ideas about the causes of cancer. Unfortunately, human papillomavirus (HPV) and the hepatitis B and C viruses are among the few other viral carcinogens that have been identified.

The second idea was that chemicals cause cancer. Lung cancers became increasingly common in the late 1940s. Epidemiological studies showed links between cigarette smoking and lung cancer, but tobacco companies obfuscated the results. In 1964, scientific links between cigarettes and lung cancer were firmly established thanks in part to the Kennedy administration's blue ribbon panel tasked with investigating the matter. Once the epidemiological methods were established for tobacco, other chemicals were added to the carcinogen list.

The final idea was that genes cause cancer. The major breakthrough came from Michael Bishop and Harold Varmus, who were studying the Rous sarcoma virus. Their timing was perfect: the tools of molecular biology were becoming readily available. Work from their labs led to the discovery of a gene called Src, the first described oncogene. The oncogene idea was that normal genes in our bodies that control cell growth can be turned on at high levels and cause cancer. Robert Weinberg later identified the first human oncogene, Ras. Dozens of other oncogenes were found in subsequent years, leading to optimism that the cure for cancer was surely close at hand. However, we have since learned that cancer is a complex disease (some argue a collection of diseases) with a diverse range of etiologies, which makes it impossible to treat with a one-size-fits-all approach.

Interspersed with the description of cancer research was a narrative of one woman's treatment for breast cancer. This story begins with a history of breast cancer treatment, including a discussion of William Halsted's radical mastectomy. As I covered in more detail in my recent post on Pandora's DNA, Halsted's method was the standard treatment for breast cancer for nearly a century. The success rate was not impressive, but the treatment approach was unchallenged until Bernard Fisher criticized its use. Fisher performed a clinical trial to compare the lumpectomy with the radical mastectomy. In 1985, his results showed that the two approaches were equally effective, but that the lumpectomy was less invasive and led to improved quality of life. Radical mastectomy was no longer the standard therapy: "cutting more did not mean curing more."

The intention of this segment was to illustrate the personal side of cancer, but for me the link between the two segments was how the treatment of cancer has evolved in parallel with developments in cancer research. This highlights how basic scientific research is a critical starting point for successful clinical outcomes.