Debunking myths on genetics and DNA

Thursday, December 29, 2011

Beta blockers and genetic variation

It's happening again.
I'm sitting in a meeting, and suddenly I feel my hands prickling. My heart thumps faster. I can't concentrate on what people around me are saying. My head is buzzing, and cold sweat trickles down my neck. I feel the tingling of panic biting at the tip of my fingers. My muscles tense, adrenaline spikes. Every cell of my body screams, "Danger!"

Now, part of me wants to take out my notebook and jot everything down for my next high-adrenaline, action-packed story.

The other (more sensible) part of me wants to run out of the room, pick up the phone, dial my doctor's office number, and yell, "I NEED THE BETA BLOCKERS AGAIN!!!!!"

In case you didn't know, beta blockers are wonderful drugs. I'm told you can't take them during Olympic competitions, but that's okay, I gave up my Olympic dreams a long time ago. They are considered performance enhancers because they fend off the action of adrenaline and hence prevent stage fright and all sorts of anxieties. In other words, they make you happy. Olympics aside, they are prescribed in clinical cases with a high risk of myocardial infarction or ischemia, or, as in my case, when the thyroid acts up and starts producing too much thyroid hormone. (I've actually been feeling really well for the past year, knock on wood!)

They block the beta-adrenergic receptors (hence the name), which are the receptors stimulated by adrenaline, the hormone produced by the adrenal glands. When adrenaline is released, it binds to the beta-adrenergic receptors and causes a bunch of things to happen: the heart pumps faster in order to better oxygenate the muscles; blood flow is diverted from non-essential organs to the muscles; pupils and airways dilate; vessels narrow. Basically, the body is getting ready for "fight or flight."

So here comes the beautiful, casually charming and nonchalantly laid-back, beta-blocker -- our hero. He sits right on the receptor and when the adrenaline comes, he smiles, puffs out some smoke, and, talking around a charred cigarette butt, says, "So long, babe. Spot's taken."

Yeah. Been reading too much Chandler.

I've been quite happy with beta-blockers, though my problem was not of a cardiac nature. So, I was quite surprised to find out that
"Recent evidence suggests that there is substantial inter-individual difference in how patients respond to beta-blockers: Some patients experience strong side effects such as excessive hypotension and bradycardia, whereas others experience no measurable response. Several lines of evidence suggest that the individual genetic background is responsible for these observed response differences [1]."

In order to understand this, one needs to understand how drugs are metabolized within the body. Cytochrome enzymes are involved in the oxidation of organic substances and, as a consequence, in drug metabolism. The enzyme responsible for the metabolism of most beta-blockers is CYP2D6. Now, here's the interesting part: the corresponding gene shows large variability due to genetic polymorphisms (variants present in at least 1% of the population). In other words, different individuals may carry different alleles, and the frequency of these variants varies across ethnic groups. In [1], Nagele and Liggett discuss the most important SNPs and gene variants within the adrenergic receptors and CYP2D6 and their possible effects on the metabolism of beta-blockers.

They go into far more detail than I want to here, so I'll limit myself to highlighting the importance of their review: it turns out that patients with cardiovascular risk factors can suffer potentially fatal complications after noncardiac surgery, and for that reason beta-blockers have been used as a preventive treatment for perioperative infarction. In this context, it is relevant to be able to predict the drug response based on the patient's genetic variation. Beta-blockers can indeed lower the risk of perioperative MI and cardiac death, but they also carry a substantial risk of adverse cardiovascular side effects, such as hypotension and bradycardia. Given that genetic variation in CYP2D6-dependent metabolism and adrenergic signaling may affect the outcome, the authors conclude:
"Given the apparent inter-individual variation in efficacy and adverse effects of beta-blockers for prevention of perioperative MI, the biologic plausibility, and the low costs of genotyping by modern methods, it seems to us that a rigorous pharmacogenomic investigation is indicated. Ultimately, this could lead to a “genetic scorecard” that would recommend when a beta-blocker should be used and the dose, for prevention of perioperative MI."
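The "genetic scorecard" idea can be sketched as a simple lookup. This is a toy illustration of the activity-score concept used for CYP2D6 genotypes; the allele scores and phenotype thresholds below are placeholders for illustration, not clinical values:

```python
# Illustrative activity scores per CYP2D6 allele.
# These numbers are placeholders, NOT clinical values.
ALLELE_SCORE = {"*1": 1.0, "*2": 1.0, "*4": 0.0, "*10": 0.25}

def metabolizer_phenotype(allele1, allele2):
    """Sum the two allele activity scores and bin the total into a
    metabolizer class (thresholds are illustrative)."""
    score = ALLELE_SCORE[allele1] + ALLELE_SCORE[allele2]
    if score == 0:
        return "poor metabolizer"
    if score < 1.0:
        return "intermediate metabolizer"
    if score <= 2.0:
        return "normal metabolizer"
    return "ultrarapid metabolizer"

print(metabolizer_phenotype("*4", "*4"))   # poor metabolizer
print(metabolizer_phenotype("*1", "*2"))   # normal metabolizer
```

A real scorecard would of course use validated allele functions and dose recommendations, but the structure — genotype in, predicted drug response out — is the point.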

As for me, I'm actually fine. I don't know what CYP2D6 alleles I carry, but the days full of adrenaline and jumping nerves are over. So, hooray for the beta-blockers!

[1] Nagele P, & Liggett SB (2011). Genetic Variation, β-blockers, and Perioperative Myocardial Infarction. Anesthesiology, 115 (6), 1316-27 PMID: 21918425

Photo: crystal sculpture, Santa Fe, NM. Canon 40D, focal length 85mm, shutter speed 1/30.

Monday, December 26, 2011

Sense and antisense in the human genome

I hope you all had a wonderful holiday. Short post today, as I'm sure we're all still digesting all the yummy holiday food and sweets, and maybe some of you are still celebrating. One of my recurrent topics on the blog has been antisense genes. Until recently, I had no idea such things existed, let alone in humans. It turns out they are quite abundant.

Antisense genes are overlapping genes that are transcribed on opposite DNA strands. I've discussed how antisense genes regulate conjugation in bacteria, and how antisense RNA transcripts can be used in gene therapy. Today I'd like to discuss a paper that examined five different human cell types and found evidence for antisense transcripts in thousands of genes.

As you know, a gene is a piece of DNA, and a gene transcript is the RNA transcribed from that gene. DNA is made of two strands coiled together, which are conventionally referred to as the plus strand and the minus strand. The general thought has been that sense transcripts produce functional proteins, whereas antisense transcripts have regulatory functions. For example, they can "silence" a gene, since the antisense RNA will attach to the sense RNA and a double-stranded RNA can no longer produce a protein.

In [1], He et al. developed a technique that modifies the RNA transcript so that, once converted back into DNA, it matches only either the plus or the minus DNA strand. This way one can establish from which strand it was transcribed. The researchers analyzed five cell types: PBMC, peripheral blood mononuclear cells isolated from a healthy volunteer; Jurkat, a T-cell leukemia line; HCT116, a colorectal cancer cell line; MiaPaCa2, a pancreatic cancer line; and MRC5, a fibroblast cell line derived from normal lung. They called "S genes" the ones that contained only sense tags or had a sense/antisense tag ratio of 5 or more; "AS genes" contained only antisense tags or had a sense/antisense tag ratio of 0.2 or less; and finally, "SAS genes" contained both sense and antisense tags and had a sense/antisense ratio between 0.2 and 5.
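The classification rule above is mechanical enough to write down. Here is a small sketch of it (the thresholds come from the paper as summarized above; the example tag counts are made up for illustration):

```python
def classify_gene(sense_tags, antisense_tags):
    """Classify a gene by its sense/antisense tag ratio:
    ratio >= 5 (or sense tags only)   -> "S"
    ratio <= 0.2 (or antisense only)  -> "AS"
    in between                        -> "SAS"
    """
    if antisense_tags == 0:
        return "S" if sense_tags > 0 else None  # no tags at all
    if sense_tags == 0:
        return "AS"
    ratio = sense_tags / antisense_tags
    if ratio >= 5:
        return "S"
    if ratio <= 0.2:
        return "AS"
    return "SAS"

# hypothetical tag counts for illustration
print(classify_gene(50, 4))   # S   (ratio 12.5)
print(classify_gene(3, 40))   # AS  (ratio 0.075)
print(classify_gene(10, 8))   # SAS (ratio 1.25)
```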

I found this figure in particular to be quite interesting:

From the figure, it's clear that sense genes tend to accumulate in the exons (the coding bits of a gene), whereas antisense genes accumulate more in the promoters, regions upstream of a gene that regulate and promote transcription, and, though to a lesser extent, in the terminator regions. The authors of the paper used these data to argue that, while
"promiscuous expression would lead to a uniform distribution of antisense tags across the genome, the observed distribution was nonrandom, localized to genes and within particular regions of genes, much like sense transcripts."
In other words, antisense genes are non-randomly distributed and may in fact contribute to an antisense-mediated regulatory mechanism that, according to the data presented in [1], affects between 2,900 and 6,400 human genes. More on this in the next post! Happy Holidays, everyone!

[1] He, Y., Vogelstein, B., Velculescu, V., Papadopoulos, N., & Kinzler, K. (2008). The Antisense Transcriptomes of Human Cells Science, 322 (5909), 1855-1857 DOI: 10.1126/science.1163853

Friday, December 23, 2011

CHIMERAS is on wallpaper!

I've been on Google+ for a couple of months, but I really got active only a couple of weeks ago. I discovered the Daily Photography Themes through a reader here and absolutely loved it. There are some truly, truly amazing photographers who share photos and tips, and in general, it's a great and very supportive community where you can't help but strive to take better pictures, encouraged by the work of these superb photographers.

Well, one of those superb photographers, Jamie Furlong, organizes a weekly wallpaper contest, and I still can't believe that this week he picked my photo as the winner! You can download the image as a full resolution wallpaper (in various formats) here. All images there are fantastic, so don't leave the site without checking out Jamie's amazing portraits and street shots from India.

Happy Holidays everyone!

Thursday, December 22, 2011

Guest post: How to camouflage a virus and why it's important

Last week I covered a couple of recent gene therapy studies and discussed the different types of vectors used in order to make these therapies more efficient. One of the obstacles that hinders the efficiency of gene therapy is the immune system: if the patient has previously developed immunity against the viral vector, the virus will be quickly cleared out of the system without being able to deliver the genes. Therefore, the question is: how can we prevent the immune system from attacking the viral vector?

One of my regular readers here on the blog, antisocialbutterflie, is an expert on "capsid recognition," and kindly offered to discuss the topic. The capsid is the outer shell of the virus, and the idea is to disguise it so the antibodies won't recognize it. Without further ado, I'm going to let my guest take over the post.

The monumental advances in genetics over the last six decades have improved our understanding of diseases and where they originate. Now armed with this knowledge, there are many disorders that we could fix if we had a way to deliver the right DNA. In many ways gene therapy is like the Cold War. We have nuclear warheads, but they aren't going to do any good if we don't have an effective delivery system. A delivery system must first be able to hit the target, but it must also allow the device to do its job, and do it covertly enough that it doesn't call up the defenses of its enemy too quickly. In the case of gene therapy you can't just inject someone with naked DNA and expect something to happen. Thankfully nature helped us out by providing a fabulous rocket-powered, laser-guided missile system in the form of viruses.

You typically think of viruses in terms of disease, but there are plenty of viruses that coexist with us without any deleterious effects. The virus of choice for many gene therapy trials is the adeno-associated virus, or AAV. AAV is a single-stranded DNA virus indigenous to primates from the family Parvoviridae, which may sound vaguely familiar if you've ever had a pet. This is the same family of viruses that gives us canine parvo and B19 in humans, but in the case of AAV it doesn't really do much of anything. In fact, AAV can't even replicate unless its host is also infected with a helper virus like adenovirus or herpes. AAV is also only mildly immunogenic, so it can travel (mostly) under the immune system's radar without eliciting a strong response.

The virus has only two genes, which can make seven proteins when you factor in alternate start sites and splice variants. Most gene therapy vectors utilize the cap gene and its three resultant VP proteins to create a viral shell for the therapeutic gene they want to deliver. Sixty of these proteins interlock to make an icosahedron. Basically the pieces, when assembled, look like a 20-sided die with a therapeutic gene stuffed inside. As you might imagine, the core fold of the protein has to be fairly specific to fit together, but there are loopy bits on the exterior of the capsid called "variable regions" that give each variant its unique properties. These variable regions act like velcro to grab onto cell receptors. In multicellular organisms different tissues express different receptors, so you can use these regions to target a specific cell type. Unfortunately, it is also these variable regions that immune cells target for neutralizing antibody production. Since these viruses target humans, antibodies already exist to the naturally occurring variants, which is why many of the gene therapy vectors have to be tweaked to be effective.

There are several ways of engineering a good gene therapy vector that can escape the immune system while still targeting the tissues that you want. One strategy is to mix and match. This can be accomplished by mixing up whole capsid proteins from several serotypes or cutting out bits and pieces of different protein sequences and pasting them together to make a single new one. There is also directed evolution where you introduce errors in the gene to create a new protein sequence and then select for the mutations that confer the appropriate tissue specificity while still escaping the neutralizing antibodies. Another strategy involves adding a peptide to the capsid that you know binds a specific receptor as a way to target it. You can even create protective coats for your virus made from things like PEGs or lipids to allow it to evade the immune system in the same way enveloped viruses do. All of these alterations run the risk of blocking capsid assembly, changing tissue specificity, and/or reducing infectivity so it’s a bit of a crap shoot as to whether your efforts are all for naught. For every successfully engineered vector there are hundreds that didn’t work.

As mentioned in a previous post, there is also the issue of gene size. There is a finite amount of space inside that d20 and some of it has to be taken up by the inverted terminal repeat sequence that packages the DNA. If you have a bigger gene you need a bigger virus like adenovirus, HSV or HIV. These bigger viruses are made up of several capsid proteins that have to be accounted for and provoke a bigger immune response making the can of worms even larger.

I hope this enlightens you a bit to the field of gene therapy vector design and its challenges. Thanks to Elena for having me.

That was not only enlightening, but also exceptionally clear -- thank you! It was fascinating to learn that one of the ways this is accomplished is by introducing artificial mutations in order to trick the immune system... Another lesson learned from viruses themselves, as that is exactly how HIV escapes our immune defenses.

Antisocialbutterflie is an X-ray crystallographer with a PhD in biomedical science. She is currently a postdoc and dreaming of a day when she can step away from the bench, preferably before she sets fire to her FPLC. In addition to the random blog comment she occasionally writes fiction when time permits.

Pulicherla, N., & Asokan, A. (2011). Peptide affinity reagents for AAV capsid recognition and purification Gene Therapy, 18 (10), 1020-1024 DOI: 10.1038/gt.2011.46

Tuesday, December 20, 2011

Fingerprint evidence: not exactly what CSI showed you

Every scientific type of analysis has an error rate. I've mentioned it before: science is not about certainty, it's about accurately measuring the uncertainties. Unfortunately, when it comes to forensic sciences, this causes a logical problem: scientists like to talk about being 90% sure about something, but in trials there are only two outcomes: innocent or guilty. You can't do 90% guilty and 10% innocent.

It occurred to me that this was an issue when I heard somebody claim that fingerprint analysis has no error rate. Being a statistician, I saw all sorts of red flags. Of course there's an error rate! Any procedure has one, because the error rate is a definition: you count the number of erroneous identifications and divide by the total number of comparisons made. Maybe fingerprint matching has never failed? Unfortunately, that's not true.
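Since the error rate is just a ratio, it takes two lines to compute, and as a statistician I'd want an uncertainty estimate to go with it. A minimal sketch, with made-up numbers (and using the crude normal approximation; small counts really call for an exact binomial interval):

```python
import math

def error_rate(errors, total):
    """Point estimate: erroneous identifications / total comparisons."""
    return errors / total

def wald_interval(errors, total, z=1.96):
    """Rough 95% normal-approximation interval for the error rate
    (illustrative only; not appropriate for very small counts)."""
    p = errors / total
    half = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half), min(1.0, p + half)

# hypothetical numbers: 3 erroneous calls in 1,000 comparisons
print(error_rate(3, 1000))      # 0.003
print(wald_interval(3, 1000))
```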

As some of you know, I was in Los Angeles last month and I was lucky enough to speak to a fingerprint analyst with the LAPD. She explained the procedure to me, which is pretty cool, actually. Blood fingerprints are photographed. Everything else is dusted: black powder is used for clear surfaces, and white powder for dark surfaces. The powder sticks to the oily residue from hands and the rest is brushed away. The analyst then applies tape to the surface, which lifts the dust that has attached to the oily residues. This accurately reproduces the print onto the tape and the tape is then transferred to a card.

The typical procedure is to scan the card into the computer and send it to a number of databases, the largest of which is IAFIS, maintained by the FBI. An automated algorithm comes back with two or three possible matches, and at that point a person compares the candidates from the database with the prints lifted from the scene in order to make a final identification. If the database comes back with no match, but the police have obtained prints from a possible suspect, then again the comparison is done by hand.

Just like any other procedure that deems itself "scientific," fingerprint analysis needs an error rate, and I'm not alone in making this claim: in 2010 UCLA law professor Jennifer Mnookin obtained a federal grant to study error rates in fingerprint evidence. The difficulties are intrinsic: you have to fold in the variability from analyst to analyst and from case to case, as some fingerprints are lifted in ideal conditions whereas others are only partial, blurred, etc.

To explain the importance of this type of research, in her paper [1] Mnookin poses the following question to two hypothetical law students:
"Is fingerprint identification one of the most secure forms of evidence we have, or is its scientific validity remarkably untested?"
Sadly, one of her hypothetical students
"would discover that the scientific validity of fingerprint evidence is surprisingly untested. He would ascertain, for example, that there were no generally shared and validated objective standards for declaring a ‘match’ either required by judges within the courtroom as a precondition to admissibility, or self-imposed by the fingerprint community—that in the end, deciding whether or not a match existed was left to the discretion of each examiner based on his judgement and experience. He would find, in sharp contrast to, say, DNA profiling, that there was no fully specified statistical model of fingerprinting, that would permit the likelihood that two prints came from the same person to be expressed in probabilistic terms."
Mnookin concludes:
"Assuming that errors were not randomly distributed across the array of test prints, proficiency tests designed genuinely to test both the method and its limits might provide useful information about the circumstances that tended to lead to errors, and perhaps could lead to practices that could reduce the likelihood of making such mistakes. Such proficiency tests are technically feasible, and whatever they found would teach us a good deal more about the frequency of errors than we know at present."
To get a more rounded perspective on the matter, I contacted my friend Mark Pryor, an assistant district attorney with the County of Travis, in Texas. Mark was very happy to hear from me, and he was even happier to tell me that he had just found a home for his novel The Bookseller, signing a three-book-deal with Seventh Street Books, a new mystery and thriller imprint from Prometheus Books. Way to go, Mark! He answered my questions while floating twelve inches above ground...

EEG: I understand that every time you have an expert witness take the stand they have to state what their expertise is, and their background and how many cases they've investigated, etc. Are they ever asked, over the course of the testimony, what the error rate for their kind of analysis is? Is it at all required for some kind of analyses?

MP: Yes, they always explain their qualifications, experience and training. This is for two reasons: one, to qualify them as an expert, so that they pass what's known as the Daubert test for scientific expert testimony. That's a call for the judge. But it's also so the jury can know them a little, understand who this person giving the scientific opinion/testimony is. I have never heard one asked about error rates, actually.

EEG: Has a defense lawyer ever questioned fingerprint analyses during one of your cases, and if so, how did you reply to that? If not, how would you reply to that?

MP: I have not taken a fingerprint case to trial yet, so no. I'm sure if I did, defense counsel would thoroughly question the expert, maybe about error rates, certainly about the procedures they use and the practices in place to make sure there were no mistakes. I think what I would emphasize in my re-direct of such an expert is precisely those same things: making it clear neither I nor the fingerprint expert has any interest in making a mistake and blaming the wrong person for a crime. In other words, I want my print expert to come across precisely as he is: a disinterested expert on the science of fingerprint analysis.

EEG: More in general, do you ever "question" the scientific evidence that detectives bring to your desk before deciding whether or not to issue an arrest warrant?

MP: I personally don't have anything to do with arrest warrants, a detective will get that from the judge directly. Sometimes I'll review one if it's a big case, but I generally won't see any scientific evidence - usually because the arrest is based on non-scientific evidence because a lot of the science stuff takes a while to process (I'm thinking of DNA here). Now, for trial I always look very hard at ALL my evidence, scientific included. Again, for two reasons: I don't want to try an innocent person, and also because I don't want to find out any mistakes/weaknesses once I'm in trial, I want to know those in advance!

I confess I'm still a bit troubled by the process. TV shows like CSI make it look like scientific evidence is either black or white, while unfortunately there's a full range of grays in between. And like Mark said, the bottom line is not to try an innocent person. I'm glad the law is represented by scrupulous people like Mark, but I'm also grateful that Professor Mnookin got her grant. I couldn't find any update on her website, but I'll keep an eye on it because I'm eager to find out about their results.

As for Mark, his book The Bookseller will hit the shelves towards the end of 2012, but in the meantime you can follow his blog, DA Confidential, where he discusses being an assistant district attorney, a father, a husband, and a writer.

[1] Mnookin, J. (2007). The validity of latent fingerprint identification: confessions of a fingerprinting moderate Law, Probability and Risk, 7 (2), 127-141 DOI: 10.1093/lpr/mgm022

Sunday, December 18, 2011

Enough with OXTR associations. Here's what I really want to know.

EDIT: After reading the post, please check out the comments. Luke, from Genomes Unzipped, helped me understand the matter better, so don't miss his comment!

Another OXTR paper came out in PNAS, the third since September. OXTR is the gene coding the oxytocin receptor. Given the benefits of oxytocin (dubbed the "love hormone"), people have focused on studying this gene and, in particular, possible associations between a common OXTR polymorphism, rs53576, and various behaviors:
"One SNP in the third intron of OXTR has emerged as a particularly promising candidate in recent studies on human social behavior: rs53576 (G/A). In recent studies, the A allele of rs53576 has been associated with reduced maternal sensitivity to child behavior, lower empathy, reduced reward dependence, lower optimism and self-esteem, and, in men, negative affect. Moreover, the A allele has also been associated with a larger startle response and reduced amygdala activation during emotional face processing. Associations have also been reported between other variants of OXTR and amygdala volume, risk for autism, the quality of infants' attachment bonds with their caregivers, attachment anxiety in adult females, and autistic-like social difficulties in adult males [1]."
This study in particular [1] recruited 194 individuals and found an association between the SNP in question and the way participants reacted to positive feedback during stressful situations. They did this by measuring cortisol responses to stress, based on the fact that psychosocial stress increases the level of salivary cortisol. In AA carriers they found that these levels remained unchanged whether or not the participants received support. The researchers conclude:
"Physiologically, it can be speculated that oxytocin released in the context of social support influences stress processing systems via oxytocin receptors in hypothalamic-limbic circuits. One likely important site of action is the amygdala, critically involved in basic emotional processing and the regulation of complex social behavior."
I confess I've been eagerly following these OXTR studies and indeed they make a great story. There's a part, though, that puzzles me, and the reason why I'm discussing this paper today is to ask a general question. If you're an expert on these things I welcome your input.

I understand these are important studies because, despite some recent criticism, they are still getting published, and PNAS, as we all know, is one of the top science journals out there. However, here's the thing I don't understand: rs53576 is a silent SNP. That in itself is not surprising, because, as it turns out, most common polymorphisms are silent. What is surprising, though, is that most silent SNPs are non-functional, and yet none of the studies I've read seems to raise the question. Let me explain.

Rs53576 sits in an intron, a part of the gene that is transcribed but then spliced out of the mature RNA, and hence, in this case, does not affect the way the oxytocin receptor is made. In the analogous studies we do in my group, which are NOT on humans, we look for non-silent mutations because those are the ones that affect the crystal structure of the protein. We then look at what differences in structure these mutations yield in order to explain how more or fewer molecules bind to the protein, and this is how we explain the observed effects. If rs53576 were a non-silent mutation, I'd know where to look to explain these associations: I'd look at how the SNP affects the crystal structure of the receptor, the hypothesis being that the oxytocin receptor in AA carriers binds less oxytocin than in GG carriers (or something along those lines; I obviously don't know the details of this particular receptor). But rs53576 is silent. Hence, if the associations are real, something else is going on. So, why hasn't anybody raised the question of what else is going on here?

The first thing that comes to mind is that this particular SNP could be in linkage disequilibrium with some other SNP or group of SNPs which, instead, are non-silent. We tend to inherit polymorphisms in groups, so if rs53576 comes in the same "package" (they're called haplotype blocks) as some other functional SNP, then rs53576 is NOT the causal SNP for all these effects and we should really be looking elsewhere. The way to find out, of course, is to repeat all these studies with whole-genome data. But it could also be an epigenetic change or a post-transcriptional modification occurring between the primary RNA transcript (which contains both introns and exons) and the mature messenger RNA (which is then translated into the protein). The positions of introns can indeed affect the translational properties of the RNA, and that's what gives rise to the so-called "functional intronic SNPs." The fact that intronic polymorphisms can be functional is extremely interesting, and in fact, last year, this study showed that one particular SNP found in an intron of GH1, the growth hormone gene, could indeed be functional.
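To make the linkage-disequilibrium idea concrete, here is a toy calculation of the standard LD statistics D and r² for two biallelic SNPs. The haplotype and allele frequencies are invented for illustration (this is the textbook definition, not an analysis of rs53576):

```python
def ld_stats(p_AB, p_A, p_B):
    """Linkage disequilibrium between alleles A and B:
    D  = P(AB) - P(A)P(B)
    r2 = D^2 / (P(A)(1-P(A)) * P(B)(1-P(B)))
    Frequencies are illustrative, not from any real locus."""
    D = p_AB - p_A * p_B
    r2 = D**2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return D, r2

# made-up example of perfect LD: allele A always travels with allele B
D, r2 = ld_stats(p_AB=0.3, p_A=0.3, p_B=0.3)
print(D, r2)  # D = 0.21, r2 = 1.0
```

When r² is close to 1, an association detected at one SNP is statistically indistinguishable from an association at its partner, which is exactly why rs53576 could be a mere marker for a causal variant elsewhere in the block.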

Whatever it is, at this point, isn't it more interesting to investigate what's going on with this SNP at the molecular level rather than piling up association studies that may or may not hold up?

[1] Chen, F., Kumsta, R., von Dawans, B., Monakhov, M., Ebstein, R., & Heinrichs, M. (2011). Common oxytocin receptor gene (OXTR) polymorphism and social support interact to reduce stress in humans Proceedings of the National Academy of Sciences, 108 (50), 19937-19942 DOI: 10.1073/pnas.1113079108

Photo: Fall colors along the Rio Grande. Shutter speed 1/40, F-stop 5.6, ISO speed 100, and focal length 85mm.

Saturday, December 17, 2011

Epigenetic reprogramming: how cells start afresh

Last week I talked about chromatin, the complex of DNA and proteins that resides inside the nucleus. There were two key points to that post: (1) the topology of the chromatin, or, in other words, how the chromosomes are arranged inside the nucleus, is correlated with which genes are active and which aren't; (2) the changes in the chromatin that allow for gene expression and gene silencing can be inherited, though how is still a mystery.

I admit I left that second point a bit vague last week. So, with the help of a fantastic review I found on PubMed [1], today I'd like to develop the topic further.

The rearrangements of the chromosomes within the chromatin determine what are known as epigenetic marks:
"Epigenetic marks are covalent modifications of the DNA (DNA methylation) or post-translational modifications of the histone proteins (histone modifications) that make up the chromatin into which our DNA is packaged. [1]"
Different cells in the body present different epigenetic marks depending on which genes are expressed and which are silent. Within a specific cell line, epigenetic marks are conserved as cells divide, thus maintaining the differentiated state of the cell. For example, skin cells will divide into skin cells and not change into brain cells, right?

This is true for all cells in the body except one very special set: the germline cells. If you think about it, it makes perfect sense: germ cells give rise to an embryo, and hence have to remain undifferentiated. Therefore, all epigenetic marks must be reset in order to enable a completely new undifferentiated state, a process called epigenetic reprogramming.
"It is almost twenty years since the discovery of the biological importance of germline DNA methylation in the context of imprinted genes, and ten years since the identification of the key enzymes responsible for de novo DNA methylation in mammals. Even so, what specifies why specific DNA sequences become epigenetically distinguished in germ cells is still only partially understood."
During developmental epigenetic reprogramming, primordial germ cells emerge with their own epigenetic marks and, as these cells migrate and proliferate, the marks are gradually lost (DNA methylation is globally erased):

It's interesting to see how the new marks are established in an asymmetric fashion in males and females. In the male embryo, de novo methylation takes place and the new marks are established and completed by birth. In the female embryo, the process is arrested in the oocytes and resumed at puberty. When an oocyte is fertilized, the marks are erased again, as illustrated by the blue and red lines descending again in the above figure.

Smallwood and Kelsey explain the various phases of the above processes in great detail. Interestingly,
"DNA methylation is distributed throughout the genome, at repetitive elements and single-copy sequences. With the recent development of genome-wide methylation profiling techniques employing next-generation sequencing, the full pattern of DNA methylation in gametes, and how it is laid down during germ-cell development, is beginning to emerge. [...]Despite the advances in the identification of key factors in DNA methylation in the germline, many questions remain over mechanism – in particular, how a select number of imprinted gDMRs and CGIs are specified for DNA methylation. The development of deep-sequencing technologies has opened new horizons, and it is now possi- ble to profile DNA methylation on a genome-wide scale in very small amounts of genomic DNA, providing an unparalleled opportunity to shed new light on mechanisms of de novo DNA methylation in germ cells [13,81]. Because the interaction of DNMT3 proteins with nucleosomes is regulated by several histone modifications (at least in vitro) it is now imperative that such capabilities are matched by the development of chromatin immunoprecipitation sequencing (ChIP-Seq) protocols to profile histone modifications in vivo in limited amounts of starting material; this would undoubtedly represent an important advance in the field of epigenetic reprogramming."
I asked my dad, a developmental biologist from the University of Pisa, what his thoughts were on the matter, and this is what he had to say:
"The conclusion to be drawn from these latest findings is thus as follows. So far we have been looking at single epigenetic changes and asked the question what does each one of them mean in relation to the phenotypic effects envisioned on an organismic scale. Needless to say that we have not gone very far by pursuing this simple-minded approach. The newly emerging evidence is pointing to another direction. Taken together, the epigenetic markers of chromatin imprinting, histone acetylation and base methylation should perhaps be considered as systemic modifications rather than simple one-to-one cause-effects relationships. By this I mean to say that the nuclear context in which such modifications occur is as important as any other macromolecular co-factor sustaining their interaction with the phenotypic counterparts. Perhaps by knowing how epigenetic markers are changed on a genomic scale it would be possible in the future to understand how they relate to one another and how altogether have provided living creatures with an adequate responding repertoire to adapt to ever changing environments during evolution."
[1] Smallwood SA, & Kelsey G (2011). De novo DNA methylation: a germ cell perspective. Trends in Genetics. PMID: 22019337

Thursday, December 15, 2011

So mice can be vaccinated against HIV. What about humans, though?

I hope I can get away with yet another paper on gene therapy this week. You may have already heard about this one: it came out at the end of November and attracted quite a bit of attention because the researchers claimed to have established lasting protection against HIV in mice using, once again, gene therapy.

I have already discussed the potential use of gene therapy to cure HIV. In fact, the only human ever "cured" of HIV was a leukemia patient who, after receiving a genetically modified vector, developed HIV-resistant T-cells. In that case gene therapy was the only way to save the patient's life, as he was dying of leukemia. It was quite an interesting study and I loved learning about it. However, in general, gene therapy is NOT a feasible way to end the AIDS pandemic: it's too expensive and too risky, and two-thirds of infected people live in sub-Saharan Africa, where even drugs are often unaffordable, let alone gene therapy.

No, the most efficient means to wipe out the virus, from both an economic and a clinical perspective, is a vaccine.

Okay, I'm biased. I work on HIV vaccine design. And when this paper appeared in Nature many colleagues rolled their eyes. "Too risky." "Too impractical." "It'll never work in humans." Which meant I had to read the paper. So I did.

From the abstract:
"As an alternative to immunization, vector-mediated gene transfer could be used to engineer secretion of the existing broadly neutralizing antibodies into the circulation. Here we describe a practical implementation of this approach, which we call vectored immunoprophylaxis (VIP), which in mice induces lifelong expression of these monoclonal antibodies at high concentrations from a single intramuscular injection. This is achieved using a specialized adeno-associated virus vector optimized for the production of full-length antibody from muscle tissue. We show that humanized mice receiving VIP appear to be fully protected from HIV infection, even when challenged intravenously with very high doses of replication-competent virus. Our results suggest that successful translation of this approach to humans may produce effective prophylaxis against HIV."
So, it is some kind of vaccine, and at the same time it's not. In a standard vaccine you inject an inactivated form of the virus in order to elicit antibody production in the host. With this new method, instead, you inject a virus that carries the genes for the antibodies themselves: rather than letting the immune system find a way to produce the antibodies, the researchers provided the "instructions" on how to make them, delivered straight into the muscle by a viral vector.

The vector used in the study is a self-complementary adeno-associated virus (scAAV), which I discussed here. The researchers produced AAV vectors that expressed either luciferase (for the controls) or the neutralizing antibody b12, and administered them to mice through a single intramuscular injection. The mice were then populated with human peripheral blood mononuclear cells (huPBMCs) and challenged with HIV. After the challenge, most mice expressing luciferase showed a dramatic loss of CD4 cells (the cells infected by HIV), whereas mice expressing the b12 antibody showed no CD4 cell depletion. Basically, the therapy was working.

They also tested a cocktail of historically known broadly neutralizing antibodies: b12, 2G12, 4E10, and 2F5. Again, after being adoptively populated with huPBMCs, the mice were
"challenged by intravenous injection with HIV and sampled weekly to quantify CD4 cell depletion over time. Animals expressing b12 were completely protected from infection, whereas those expressing 2G12, 4E10 and 2F5 were partly protected."
Finally, they repeated the experiment with one of the newest and most potent broadly neutralizing antibodies, VRC01, and found similar results, with higher protection established at higher doses of the vector.

The fact that the mice were challenged intravenously is quite impressive, because mucosal routes present a bottleneck for the virus, whereas intravenous challenges are much more efficient at initiating infection.

A number of things remain to be seen, the safety and efficacy of the therapy in humans in particular. In this hemophilia B study the intravenous administration of AAV showed lasting results, even though the new genes were expressed at a low level. However, other scAAV studies have failed and, as an additional word of caution, we should not forget all the therapies successfully tested in mice that later failed in humans: assuming this technique passes the required safety checks, it still remains to be seen whether results in humans would be comparable to the mouse model.

Still. Despite my original bias, I confess I find these results pretty cool. Don't tell my boss, though!

Balazs, A., Chen, J., Hong, C., Rao, D., Yang, L., & Baltimore, D. (2011). Antibody-based protection against HIV infection by vectored immunoprophylaxis. Nature. DOI: 10.1038/nature10660

Photo: Sunset on Croc Rock and the Rio Grande. Shutter speed 1/25, focal length 38mm, f-stop 5.0, ISO speed 100.

Tuesday, December 13, 2011

Not all vectors are created equal

As I was reading the paper I discussed yesterday, I realized there was a part I didn't fully understand and I needed to research more. I received some great comments on that post that pointed me in the right direction.

A gene delivery vector is an engineered virus modified so that it carries the genes needed for therapy. Once inside the cell, the genetic material needs to reach the nucleus where, in the case of single-stranded vectors, a complementary DNA strand has to be synthesized before the gene can be expressed.

Conceptually, it seems easy enough. In practice, every step of the way has its hurdles, and of all the vector particles you inject into the host, only a fraction ends up with its genes expressed. With adeno-associated virus (AAV) the major bottleneck is the de novo synthesis of the DNA: not all the single-stranded DNA delivered to the nucleus is successfully converted into double-stranded DNA, and this hinders the efficiency of the vector.

Luckily, there are a few alternatives. Suppose you have two viruses, each carrying one of the two complementary DNA strands. Assuming they both reach the nucleus, the two strands will find each other (no need to synthesize a complementary strand de novo), and voilà: you have double-stranded DNA. In general this wouldn't be possible with just any virus, because most have a preference for which strand they package. But with AAV we're in luck, because it packages either strand with equal efficiency.

This approach has its own issues, though. For example, it's hard to predict whether the two complementary strands, once inside the nucleus, will actually find one another. The likelihood increases with the dose, but so do the chances of recombination.

What about packaging both strands inside the same vector? It turns out it's possible, though you lose capacity (you can't package as much DNA inside the virus, roughly half of what you could with a conventional AAV). As McCarty explains in [1]:
"This can be achieved by taking advantage of the tendency to produce dimeric inverted repeat genomes during the AAV replication cycle. If these dimers are small enough, they can be packaged in the same manner as conventional AAV genomes, and the two halves of the ssDNA molecule can fold and base pair to form a dsDNA molecule of half the length. Although this further restricts the transgene carrying capacity of an already small viral vector, it offers a substantial premium in the efficiency, and speed of onset, of transgene expression because dsDNA conversion is independent of host-cell DNA synthesis and vector concentration."

The above figure shows the steps through which this is achieved. Technical details aside, the mechanism exploits the virus's natural tendency to form short, self-complementary genomes. The "shortness" diminishes the capacity, but if you can get away with delivering short genes, you can bypass the de novo synthesis bottleneck and greatly increase the efficiency of the vector. These are called self-complementary vectors, or scAAV in the case of adeno-associated virus.

There are still hurdles to circumvent, humoral immunity in particular: the immune system might recognize the virus and destroy it before it reaches its destination. In [1] McCarty reviews several applications of scAAV, including cell lines where studies have been successful and others where they haven't. But the paper I discussed yesterday was certainly a great step forward and a success story in the use of scAAV vectors.

[1] McCarty, D. (2008). Self-complementary AAV vectors: advances and applications. Molecular Therapy, 16 (10), 1648-1656. DOI: 10.1038/mt.2008.171

Monday, December 12, 2011

Another gene therapy success story

Last October I reported an incredible story in which researchers used an HIV chimeric virus to cure leukemia. Here's another success story.

Hemophilia B is a blood clotting disorder caused by mutations in the Factor IX gene, leading to a deficiency of Factor IX, an enzyme essential to blood coagulation. The gene is expressed mostly in the liver, where the enzyme is produced and then released into the bloodstream. Factor IX levels below 1% of normal cause severe hemophilia and require lifelong treatment with intravenous injections of FIX protein concentrate 2-3 times a week.
"Somatic gene therapy for hemophilia B offers the potential for a cure through continuous endogenous production of FIX after a single administration of vector, especially since a small rise in circulating FIX to at least 1% of normal levels can substantially ameliorate the bleeding phenotype [1]."
Unfortunately, previous gene therapy attempts were unsuccessful, showing only transient expression of Factor IX (the effects waned after a while), possibly because the patients' immune system mounted a T-cell response against the transduced cells. In essence, gene therapy is the transfer of a healthy gene into the cell line affected by the defective gene. The transfer is usually achieved with a modified virus (the "vector"), because viruses have the innate ability to attack a cell and inject it with their own genetic material. When choosing a viral vector, therefore, it is essential that the patient's immune system not "recognize" it. That's often the problem with gene therapy: you are trying to "fool" the immune system by injecting extraneous genetic material and hoping it successfully compensates for the defective gene. But the immune system is not so easy to fool.

Nathwani et al. [1] tried a new gene therapy approach. Instead of a conventional single-stranded vector, they used a self-complementary adeno-associated virus (scAAV) vector, which yields higher efficiency in transgene expression. They also chose a virus serotype with lower prevalence in humans, and thus lower chances that patients had already developed humoral immunity against it. Lastly, whereas previous approaches infused the virus directly into the liver, in this study patients received the vector through a peripheral vein, a less invasive and safer route.

The results were quite promising: "AAV-mediated expression of FIX at 2 to 11% of normal levels was observed in all participants. Four of the six discontinued FIX prophylaxis and remained free of spontaneous hemorrhage; in the other two, the interval between prophylactic injections was increased."

The difference in responses was partly explained by dose: patients received low, intermediate, or high doses of the genetically modified virus, and FIX expression was dose dependent. The results also suggest, however, that individual immune responses and prior exposures to the virus greatly affect the outcome; understanding these factors will require larger numbers of participants in future studies. Furthermore, as with any gene therapy, there are risks: the six participants are being monitored for hepatic dysfunction. Still, the researchers are optimistic that, even with such potential risks, this kind of therapy
"has the potential to convert the severe bleeding phenotype into a mild form of the disease or to reverse it entirely."

Edited to add a cool comment from antisocialbutterflie:

Intriguingly enough my grad lab worked on AAV capsid proteins (not my project but I could give the first 15 minutes of a seminar from memory). It's a great gene therapy vector assuming (a) there isn't the preexisting immunity that you mentioned and (b) the gene you are therapifying (word?) is small enough.

The engineering of non-natural variants that exhibit the appropriate tissue specificity while escaping the existing immune response is pretty cool. It all boils down to a series of surface loops (variable regions) that mediate both the receptor-binding but also present the antigens that antibodies are developed against. Swapping out the antigenic residues are likely to disrupt the receptor binding and mess up the tissue specificity making it useless as a therapy vector.

There is also a finite amount of DNA it can package. If I remember correctly the max is around 8 kb. If it's bigger you have to turn to things like adenoviruses or the previously mentioned HIV.

[1] Nathwani, A., Tuddenham, E., Rangarajan, S., Rosales, C., McIntosh, J., Linch, D., Chowdary, P., Riddell, A., Pie, A., Harrington, C., O'Beirne, J., Smith, K., Pasi, J., Glader, B., Rustagi, P., Ng, C., Kay, M., Zhou, J., Spence, Y., Morton, C., Allay, J., Coleman, J., Sleep, S., Cunningham, J., Srivastava, D., Basner-Tschakarjan, E., Mingozzi, F., High, K., Gray, J., Reiss, U., Nienhuis, A., & Davidoff, A. (2011). Adenovirus-Associated Virus Vector–Mediated Gene Transfer in Hemophilia B. New England Journal of Medicine. DOI: 10.1056/NEJMoa1108046

Photo: sculpture by Joshua Tobey, Santa Fe, NM. Shutter speed 1/40, F-stop 7.1, focal length 75mm, ISO speed 100.

Remember back when...

I generate most of my figures in R. What do you guys use? I suppose it changes from field to field. However, no matter what your field is, there comes a time when you have to use Adobe Illustrator to beautify your figures and get them paper-ready.

I have a love/hate relationship with AI.

I love it because it lets me do so many things.
I hate it because it won't let me do so many things.

So, the other day I was having one of my AI fits when everything around me got blurry, a heavy fog rolled into my office, and I was propelled back to many, ahem, some years back, when I was in elementary school and my dad was preparing his paper figures...

Okay, the blurring and the fog I added for special effects, but back to my dad: he is a developmental biologist, and back in the days when Apples only came with black screens, fixed menus, and a green font, my dad would start by developing his pictures in the darkroom. He would then trim the pictures and mount them on paper board -- one board for each figure, each figure with different panels.

Now, for the labeling, this is what he'd use:

Remember those? I loved them when I was a kid! We'd get comic books where you could add your characters to a scene -- it was fun! But those tiny letters my dad would use to label his photos -- man, they were a pain in the ***! And the letters were a piece of cake compared to the thin lines and geometrical shapes he'd use for the graphs. You had to press very delicately. If you pressed too hard you'd ruin the photo, or the lines would break, and you'd have to start over. If you pressed too lightly they'd come off and you'd find them all over your hand, the lines especially, sticking out like misplaced hairs.


So, on second thought... I love AI. I really, really love AI.

Friday, December 9, 2011

Understanding the cell nucleus in order to unravel the mystery of epigenetic heritability

The above image is a striking view of the surface of a cell nucleus (in pink). The dark crater is a hole in the nucleus and offers a peek inside: the granular material you see there is the chromosomes, bundled together in what may appear to be a random distribution but, in reality, is anything but random:
"In all eukaryotic species analyzed so far, spatial genome arrangements are nonrandom: chromosomes or genomic loci occupy preferential positions with respect to each other and/or to nuclear landmarks [1]."
The nucleus contains a combination of DNA and proteins (mostly histones) called chromatin. Histones can be thought of as spools around which the DNA wraps, forming structures called nucleosomes. Genes packaged in the chromatin can be silenced or activated, thus allowing differentiated cells to express only the genes necessary for their specific function. The budding yeast Saccharomyces cerevisiae was the first eukaryote to have its entire genome sequenced and, due to its relatively compact genome (16 small chromosomes), it has been studied extensively to understand the structure of the cell nucleus. For example, one of the largest protein complexes on the nuclear envelope is the nuclear pore complex, or NPC, which modulates the exchange of components between the nucleus and the cytoplasm. Several genes relocate to the NPC when activated, and, as Zimmer and Fabre note [1],
"The region close to the nuclear envelope thus emerges as a mosaic, with the vicinity of NPCs representing zones favorable to transcription, whereas the zones between NPCs are more repressive."
These spatial arrangements are not static: they undergo rearrangements driven by chemical modifications such as cytosine methylation and post-translational modification of the histone amino acids. The degree of nucleosome packaging affects gene expression; however, to this day little is known about what determines this delicate spatial arrangement.

And here's the intriguing bit: the rearrangements the chromatin undergoes are generally reversible. And yet a subset of these modifications is not only stable, it is inherited [2]:
"Chromatin modifications are often termed epigenetic marks; however, an unresolved issue in the field is the relationship between these modifications, including those established during transcription, and epigenetic inheritance (that is, the stability of these alterations during cell divisions and development). It seems that most, if not all, histone modifications are reversible, so it remains to be determined how epigenetic persistence of chromatin states is achieved, and which modifications are heritable."
These are the transgenerational epigenetic modifications I have discussed here and here. It's a real puzzle, because heritability happens through the germline cells, yet in this cell line transcription only happens de novo after fertilization. So at what level, and how, are epigenetic changes inherited? In [2], Berger reviews the various types of chromatin modifications and concludes with a nice analogy:
"Language is defined by the Webster dictionary as systematic means of communicating ideas using conventionalized signs or marks having understood meanings. This definition can be used to describe the complexity of the relationship between epigenetic marks and the biological processes they influence. As scientists, it falls to us to learn and understand this language a task that we have only begun to undertake."
EDIT: as I was preparing this post, I found this article on Scientific American, which talks about untangling the 3D human genome, and how the topology inside the nucleus determines which genes are on and off. There's a neat video, if you scroll to the bottom of the article.

[1] Zimmer, C., & Fabre, E. (2011). Principles of chromosomal organization: lessons from yeast. The Journal of Cell Biology, 192 (5), 723-733. DOI: 10.1083/jcb.201010058

[2] Berger, S. (2007). The complex language of chromatin regulation during transcription Nature, 447 (7143), 407-412 DOI: 10.1038/nature05915

Thursday, December 8, 2011

Timing the AIDS pandemic and why it made history (Part II)

In Part I of this post I discussed the Science paper that proved HIV was the result of a cross-transmission from chimpanzees to humans. In that paper, Hahn et al. conclude with an open question:
"The timing of SIVcpz transmission to humans, leading ultimately to the HIV-1 pandemic, has been a challenging question. We know from analyses of stored samples that humans in west central Africa had been infected with HIV-1 group M viruses by 1959 and with group O viruses by 1963. But how much earlier were these viruses introduced into the human population? [...] It should be possible to estimate the timing of the onset of the pandemic by calculating the date of the last common ancestor of HIV-1 group M."

In a phylogenetic tree (see the definition I gave last time), the last common ancestor is the root of the tree: the "patriarch" of the sample, if you will, the one sequence from which, one divergence event at a time, the whole sample originated. Phylogenetic analyses allow us not only to reconstruct the evolutionary history of the sequences but also, given a rough estimate of the mutation rate (i.e. how often new mutations arise), to date them. This technique is often referred to as the "molecular clock," and it originated from the observation that the number of molecular differences between lineages increases linearly with time, with substitutions accumulating according to a Poisson distribution.
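The clock assumption can be sketched in a few lines. This is a toy simulation, not HIV data: the rate and times below are invented purely for illustration. Each lineage accumulates substitutions as a Poisson process, so the expected genetic distance between two diverged lineages grows linearly with the time since their split.

```python
import numpy as np

# Toy molecular-clock simulation. The rate and times are invented
# for illustration; they are not HIV parameters.
rng = np.random.default_rng(42)
rate = 2.0                            # hypothetical substitutions per lineage per year
times = np.array([10.0, 20.0, 40.0])  # years since the two lineages diverged

# Each lineage independently accumulates Poisson(rate * t) substitutions,
# so the pairwise distance is the sum of two Poisson draws. Averaging over
# many replicate pairs shows the linear trend through the noise.
reps = 20000
mean_dist = np.array([
    (rng.poisson(rate * t, reps) + rng.poisson(rate * t, reps)).mean()
    for t in times
])

# Each ratio should be close to 1: mean distance is about 2 * rate * t.
print(mean_dist / (2 * rate * times))
```

This linear growth of the mean, with Poisson noise around it, is what makes it possible to regress genetic distance against time and read dates off the fit.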

Korber et al. used parallel computers to apply maximum-likelihood tree-building methods to the envelope sequences (the envelope is one of the HIV genes) from 159 individuals. They note:
"Although it is unrealistic to expect that HIV-1 evolution will always rigidly adhere to a molecular clock, it is, however, the average behavior of many sequences that we consider here, and our control estimates of known times were accurate."
They combined this with another data point: the year in which each sequence used to reconstruct the tree was sampled.

(A) The phylogenetic tree used for the calculation. (B) The branch lengths from the tree plotted versus the year of sampling and projected backwards in time.

Once they reconstructed the phylogenetic tree, with the root sitting more or less in the middle, and thus at the same distance from the various HIV subgroups (the clusters marked with capital letters in panel A above), they plotted the branch lengths of the tree against time (panel B) and did a linear fit to extrapolate the time since the last common ancestor: 1931, with a 95% confidence interval of 1915 to 1941. Furthermore, testing a known HIV-1 group M isolate from 1959 gave an accurate estimate for the date of its origin, indicating that the assumptions of the method are reasonable.
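The dating step itself boils down to a linear regression. Here is a minimal sketch with invented root-to-tip distances and sampling years (not Korber et al.'s actual data, which came from maximum-likelihood trees over 159 envelope sequences): fit distance against sampling year, then extrapolate back to the year at which the fitted distance hits zero, i.e. the estimated date of the last common ancestor.

```python
import numpy as np

# Hypothetical root-to-tip distances vs. sampling year; the numbers are
# invented so that the true intercept sits near 1931, mimicking the result.
years = np.array([1959, 1983, 1987, 1990, 1994, 1997])
dists = np.array([0.042, 0.078, 0.084, 0.089, 0.095, 0.099])

# Fit distance = slope * year + intercept, then solve for the year at
# which the fitted distance is zero: the estimated last common ancestor.
slope, intercept = np.polyfit(years, dists, 1)
t_mrca = -intercept / slope
print(round(t_mrca))  # → 1931
```

The confidence interval in the paper comes from propagating the uncertainty in the branch lengths and the fit; this sketch only reproduces the point estimate.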

Notice that 1931 marks the year the first HIV-1 lineage, the M group, started to spread and diversify in humans. It does not tell us whether the virus was transmitted at the same time it started to diversify: the virus may have crossed into humans earlier and remained isolated within a small population until, around the 1930s, socioeconomic changes allowed it to spread:
"Strictly speaking, our estimate is neither an upper nor a lower bound on the date of the actual zoonosis. Rather, it is the approximate time of the bottleneck event that was the genesis of the M group and captures the moment of the beginning of the expansion of the M group. If the M group originated in humans, then this would date the founder virus of the pandemic."
Another important question is addressed in the following commentary by David Hillis:
"If HIV has been present in human populations since at least the 1930s (and probably much earlier), why did AIDS not become prevalent until the 1970s? The phylogenetic trees of HIV-1 indicate that the spread of the virus was initially quite slow‚ by 1950 there existed 10 or fewer HIV-1 M-group lineages that left descendants that have survived to the present. The epidemic exploded in the 1950s and 1960s, coincident with the end of colonial rule in Africa, several civil wars, the introduction of widespread vaccination programs (with the deliberate or inadvertent reuse of needles), the growth of large African cities, the sexual revolution, and increased travel by humans to and from Africa. Given the roughly 10-year period from infection to progression to AIDS, it was not until the 1970s that the symptoms of AIDS became prevalent in infected individuals in the United States and Europe."
B. Korber, M. Muldoon, J. Theiler, F. Gao, R. Gupta, A. Lapedes, B. H. Hahn, S. Wolinsky, and T. Bhattacharya. (2000). Timing the Ancestor of the HIV-1 Pandemic Strains. Science, 288 (5472), 1789-1796. DOI: 10.1126/science.288.5472.1789

Tuesday, December 6, 2011

Ethnobotany, shamanism and extremophiles: Alison Sinclair talks about world-building in science fiction

My guest today writes science fiction "to indulge a passion for knowledge of all kinds and science and medicine in particular," and to "have an excuse to read about everything from oceanography to nanotechnology, from color theory to the history of microscopy." Alison Sinclair is the author of the Darkborn Trilogy as well as the novels Legacies, Blueheart, Cavalcade, and Throne Prince (co-authored with Lynda Williams). Her writing is beautiful and flawless, and her world-building is even more amazing, delving into "ethnobotany, shamanism and extremophiles (bacteria which live in hostile environments)"[1]. As she states in her bio, "I like to be able to ditch all assumptions and conventional wisdom and start entirely from scratch, running my fictional 'thought experiments' (Ursula Le Guin’s words) according to any parameters I please. Science fiction gives my imagination elbow room."

When she's not writing, Alison works at a hospital-based technology assessment unit, doing literature reviews and contributing to meta-analyses and cost-effectiveness analyses. I tried to keep track of all her degrees, but got lost after physics, biochemistry, and medicine...

I'm truly excited to welcome Alison to my blog today!

EEG: I'll start off with a geek question: I was browsing your blog and noticed that you had just posted some code in R, which of course I use all the time, so my jaw dropped in admiration...

AS: I've just (last year) finished an MSc in Epidemiology, and a number of our lecturers used R, so I did a lot of assignments using R (and some SAS, and some STATA). Since then, I've mainly used R for its graphics capabilities, but I have a couple of side projects that are on my list of things to get back to.

EEG: Do you think of yourself as primarily a scientist, a writer, or both?

AS: Still primarily a scientist, though I think the two have converged. "Scientist" was my ambition from childhood, although I was writing almost from the time I could hold a pen.

EEG: How much of your writing is influenced by your work as a scientist?

AS: I'm not sure what came first, the reading and TV viewing (Star Trek and other TV fiction, the Apollo and other space projects) influencing the career choice, or the career choice influencing the writing. Since I was intensely interested in science, when I discovered SF properly, I immediately started writing it. Pastiches of John Wyndham and Ray Bradbury, mainly, with forays into Clarke and Asimov. I had a long spell of trying to write mainstream (because I didn't connect with SF fandom until later), but it felt like a huge chunk of my reality was being left out.

As to how science influenced the writing: all of it. Characterization, problems, narratives, ethical framework. My first six novels were SF (3 published, 3 still unpublished). Legacies was the least influenced by my work as a scientist, because it was a world I'd been living with since childhood, though I started worldbuilding because I was keen on geology, physical geography and natural history. While I was writing Blueheart, I was working as a lab scientist, so along with my perennial interest in natural history and the sea, Blueheart got the molecular biology and bioengineering. While I was writing Cavalcade, I was at medical school, so the interest in medicine - and nanotechnology, and social dislocation - are all through it. Three of my main characters are scientists, and one of them does an autopsy under very strange circumstances. The next novel (unpublished) came out of a lecture in culture and ethics at medical school, talking about the culture clash between patients who came out of a different healing tradition - and I took to wondering what a physician from a different tradition would look like. As a novel, it wasn't a terribly successful experiment, and one of these days, I'll go back to it and see if distance has given me insight into its problems. Then, while working as a journal editor, I took my first course in epidemiology (if my lecturers only knew what I do with the knowledge they impart), which fed into two novels of what I describe as my "medical starship" series. And then I had an idea for a fantasy novel.

EEG: Tell us about the Darkborn Trilogy: how were the characters of Bal, Telmaine, and Ishmael born? (BTW, I love the names!)

AS: In irritation: I was reading a fantasy novel in which the whole light/dark trope was blatantly foregrounded in the characterization and the imagery, so much so that it annoyed me. So the first thing I thought about was the literal division into day and night, and then starting sympathetically with the "dark" characters. Balthasar and Floria and their paper wall came first. I'd been doing a lot of reading around the decades leading up to World War I for another project, and between having cut my teeth on fantasy written either before or between the wars, and feeling more comfortable with a more urban and developed setting, I came up with the Darkborn society. Balthasar seemed the kind of man who'd be married and settled, out of which came Telmaine, who is a contrast to him, temperamentally and socially. Then there was the question as to why she would be so determined to marry beneath her, which led to the nature of her magic. Ishmael was a bit of a surprise, when Telmaine tripped down the stairs and encountered him. He was originally supposed to be a kind of John Buchanesque hero, and never quite lost that, though he developed in his own direction. As part of the society I was building, I wanted names that were fairly elaborate. Balthasar and Ishmael came out of the Oxford Dictionary of First Names, as best I can recall. Telmaine's name was made up. I had a bit of an argument with myself over Ishmael's name, knowing it would have particular associations for US readers, but none of the alternatives would stick. Ishmael was Ishmael.

EEG: Interesting. I confess I had a similar experience, and my very first story (which sat in my head for decades!) finally made it onto paper after reading a book that triggered that kind of irritation. It made me realize that yes, it's good to read literary masterpieces, but the occasional crappy book can have a positive side, too!

Thanks so much for being here today, Alison. To find out more about Alison's books (and scientific work, too) visit her at

Monday, December 5, 2011

Timing the AIDS pandemic and why it made history (Part I)

This week I would like to discuss two Science papers that have marked a milestone in HIV research. In order to place them in the right context, I need to start with a brief historical digression. If you're interested in the history of the discovery of the AIDS disease, I highly recommend watching the movie And the Band Played On. It's very well done and realistically portrays how the medical investigation was conducted. For the purpose of my discussion here, though, I will start from the movement known as AIDS denialism.

From Wikipedia:
"AIDS denialism is the view held by a loosely connected group of people and organizations who deny that the human immunodeficiency virus (HIV) is the cause of acquired immune deficiency syndrome (AIDS). Some denialists reject the existence of HIV, while others accept that HIV exists but say that it is a harmless passenger virus and not the cause of AIDS."
Famous "denialists" include Nobel laureate Kary Mullis, UC Berkeley professor Peter Duesberg (the first to isolate a cancer gene), and biologist Lynn Margulis (who discovered the origin of mitochondria through symbiosis). Oh, and I almost forgot Serge Lang, whose math books I revered back in grad school. (In case you didn't know, being a good scientist doesn't mean you get everything right.) There are some really sad stories associated with AIDS denialism, including that of a woman whose firm beliefs didn't falter even after her three-year-old daughter died of AIDS complications. In fact, she even founded an organization to discourage HIV-positive pregnant women from taking anti-HIV medications. Even sadder is what happened in South Africa: although HAART therapy (a potent cocktail of anti-retroviral drugs) became available around the mid-nineties, its introduction there was delayed because then-president Thabo Mbeki, along with the rest of the African National Congress party, had been convinced by the denialist movement that AIDS was the result of poverty and malnutrition.

Part of the puzzle was that people didn't really know how the HIV virus had originated. There were various theories, often inconsistent and sometimes verging on science fiction: several variations on the theory that HIV was a bio-warfare virus engineered by the US government; another that it had spread through the smallpox vaccination campaigns; and, the most realistic, that it had spread through the polio vaccine, which had been developed on chimpanzee tissue, and there was a real possibility that the tissue could've been contaminated. Of course, the fact that nobody knew for sure deepened the roots of AIDS denialism.

Now fast forward to January 2000, when Hahn et al. published a paper in Science [1] showing that HIV had been transmitted to humans from other primates and had originated from the SIV virus. This is the paper I would like to discuss today. 2000 was also the year South Africa's President Thabo Mbeki invited several HIV/AIDS denialists to join his Presidential AIDS Advisory Panel. That same year over 5,000 scientists and physicians signed the Durban Declaration, in which they affirmed that AIDS is caused by HIV. Unfortunately, it didn't stop the estimated 300,000 AIDS deaths in South Africa that could have been prevented by introducing HAART therapy.

Hahn et al. analyzed the full-length genomic sequences of distinct primate lentiviruses. These fell into five major, approximately equidistant phylogenetic lineages:

"Evolutionary relationships of primate lentiviruses based on maximum-likelihood phylogenetic analysis of full-length Pol protein sequences. The five major lineages are color-coded. The scale bar indicates 0.1 amino acid replacement per site after correction for multiple hits."

The above figure is a phylogenetic tree, that is, a graphical representation of the genetic distances across the sample. Each leaf in the tree represents a genetic sequence, and the most similar sequences are clustered together. Moving from the tips back toward the root, you can reconstruct the evolutionary history of each sequence: for example, the two sequences HIV-1/LAI and HIV-1/ELI are only a few mutations apart, which means they share a recent common ancestor. Going further back, that ancestor in turn shares an earlier common ancestor with the sequence HIV-1/U455. Each internal node represents a "coalescent" event, a point where two lineages trace back to a single ancestral sequence, with new mutations accumulating along each branch after the split. (This is a schematic explanation; things can get more complicated than that, but let's keep things simple for the sake of the argument.) Phylogenetic trees like this one are constructed using maximum-likelihood methods: in principle, you'd evaluate every possible tree and choose the one that maximizes the probability function associated with it (the most likely tree). In practice, the number of possible trees explodes as you add sequences, so this is not done by hand but by a computer program that searches tree space through many iterations, and it still takes a very long time. Today, supercomputing machines are utilized to speed up the process.
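A full maximum-likelihood search is well beyond a blog post, but the core intuition, that similar sequences cluster together, can be sketched with a much simpler distance-based method. Here's a toy Python example (the sequence names echo the lineages above, but the sequences themselves are made up) that builds a tree by repeatedly merging the two closest clusters, a bare-bones version of the UPGMA algorithm rather than the likelihood method the paper actually used:

```python
from itertools import combinations

# Toy alignment: hypothetical short sequences standing in for viral genomes.
# (Illustration only -- the real analysis used full-length Pol protein sequences.)
seqs = {
    "HIV-1/A": "ACGTACGTAC",
    "HIV-1/B": "ACGTACGTCC",
    "SIVcpz":  "ACGAACGTCC",
    "HIV-2":   "TCGATCGATG",
    "SIVsm":   "TCGATCGTTG",
}

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

# Each cluster starts out containing a single sequence.
clusters = {name: (name,) for name in seqs}

def cluster_dist(c1, c2):
    """Average pairwise distance between two clusters (the UPGMA criterion)."""
    pairs = [(a, b) for a in clusters[c1] for b in clusters[c2]]
    return sum(hamming(seqs[a], seqs[b]) for a, b in pairs) / len(pairs)

# Greedy agglomeration: repeatedly merge the two closest clusters
# until a single tree (written in nested-parenthesis form) remains.
while len(clusters) > 1:
    c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_dist(*p))
    merged = "(%s,%s)" % (c1, c2)
    clusters[merged] = clusters.pop(c1) + clusters.pop(c2)

tree = next(iter(clusters))
print(tree)
```

On these toy sequences, the two HIV-1 strains pair up and then join SIVcpz, while HIV-2 pairs with SIVsm, mirroring the host-specific clustering visible in the real tree. Real programs instead score entire candidate trees by their likelihood under a model of sequence evolution, which is what makes the computation so expensive.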

What can we learn from the above tree? First of all, notice that the lineages are color-coded. Each color represents one lineage found in one particular primate species, and the fact that colors tend to aggregate in host-specific clusters tells us two things: (1) each lineage has been infecting its respective host for a relatively long time; (2) a "jump" from one host to another represents a divergence in the evolution of the virus.
"HIV infections have also resulted from cross-species transmission events. Five lines of evidence have been used to substantiate the zoonotic origins of these viruses: (i) similarities in viral genome organization, (ii) phylogenetic relatedness, (iii) prevalence in the natural host, (iv) geographic coincidence, and (v) plausible routes of transmission."
Following the above logic, evidence collected from chimpanzees in Cameroon led to the conclusion that the HIV-1 epidemic arose as a consequence of SIVcpz transmission from a particular chimpanzee subspecies, P. t. troglodytes, to humans.
"The seeds of the HIV-1 epidemic appear to have been planted in west equatorial Africa in the region encompassing Gabon, Equatorial Guinea, Cameroon, and the Republic of Congo (Congo-Brazzaville). It is only here that HIV-1 groups M, N, and O cocirculate in human populations and where chimpanzees (P. t. troglodytes) have been found to be infected with genetically closely related viruses."
The most likely transmission route from primates to humans would have been through blood exposure while butchering and consuming raw meat from infected animals. Such cross-species transmissions are not unusual (several flu strains are acquired that way), but they often represent an evolutionary dead-end for the virus, which may not be well-adapted to the new host. That was obviously not the case with HIV-1, whose high variability allowed it to readily adapt and dodge the human immune system.

In my next post, I'll discuss the second Science paper that made history in this field.

Hahn, B. H., Shaw, G. M., De Cock, K. M., & Sharp, P. M. (2000). AIDS as a Zoonosis: Scientific and Public Health Implications. Science, 287(5453), 607-614. DOI: 10.1126/science.287.5453.607

Friday, December 2, 2011

Another genetic puzzle: why is mitochondrial DNA only inherited from the mother's side?

Remember when I told you that bacteria have circular DNA? Well, we have it too, only not in the nucleus where the rest of our DNA sits. It's a rather interesting story, one that biologist Lynn Margulis argued for in 1967 [1]: our cells contain organelles called mitochondria, which were originally free-living organisms (prokaryotes) and at some point entered a symbiotic relationship with eukaryotic cells, a process called endosymbiosis. As a result, they still contain their own circular DNA, called mitochondrial DNA or, in short, mtDNA.

Circularity is not the only fascinating thing about mtDNA. It contains 37 genes, and, because it's not confined to the nucleus, non-nucleated cells like the precortical cells found in hair shafts can't be used for nuclear DNA analysis but can indeed be used to extract mtDNA. However, whereas nuclear DNA is unique to each individual, mtDNA is not. That's because it's inherited exclusively through the maternal lineage. As you know, paternal and maternal chromosomes undergo recombination, and the two parental genomes then combine to make the unique DNA of a new individual. mtDNA, by contrast, does not undergo recombination, and the only variation comes from random mutations introduced as cells divide. These are quite rare, and in fact it's not unusual to share identical mtDNA with our siblings, and/or to inherit it unchanged from our mothers.
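To get a feel for why unchanged mtDNA across a generation is so common, here's a toy Python simulation of strictly maternal transmission. The sequence length and per-site mutation rate are made-up illustrative numbers, not measured values:

```python
import random

random.seed(1)  # fixed seed so the toy run is reproducible

MUTATION_RATE = 0.001  # per-site chance of a point mutation per transmission (toy value)
BASES = "ACGT"

def transmit(mtdna):
    """Copy a mother's mtDNA to a child, introducing rare random point mutations."""
    return "".join(
        random.choice(BASES.replace(base, ""))  # substitute one of the other three bases
        if random.random() < MUTATION_RATE
        else base
        for base in mtdna
    )

# A hypothetical 500-base maternal mtDNA sequence, passed to three children.
mother = "".join(random.choice(BASES) for _ in range(500))
children = [transmit(mother) for _ in range(3)]

identical = sum(child == mother for child in children)
print(identical, "of 3 children carry mtDNA identical to their mother's")
```

With a per-site rate this low, each child expects only about half a mutation over the whole 500-base sequence, so most children come out identical to their mother, which is the intuition behind siblings often sharing the exact same mtDNA.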

Maternal mtDNA inheritance occurs in most eukaryotic species, indicating that, from an evolutionary point of view, it's an old and conserved mechanism. One might argue that paternal gametes (sperm) are much smaller than maternal gametes (eggs) and therefore contribute a limited number of mitochondria, which then simply get lost. In fact, the general belief was that, at least in some species, paternal mitochondria were excluded because only the head of the spermatozoon enters the oocyte's cytoplasm. Well, that doesn't quite explain the whole story, and the mechanism that clears paternal mitochondria during early embryogenesis was not understood until recently.

Two studies published in the November 25 issue of Science [2, 3] used a Caenorhabditis elegans model to show that the degradation of paternal mitochondria is achieved through autophagosomes, double-membrane vesicles that recruit the organelles, engulf them, and then destroy them. Al Rawi et al. also showed that autophagy is triggered in the mouse too, within minutes after fertilization, whereas, when autophagy was artificially blocked in some animals, the paternal mitochondria persisted in the embryos.

This is a great step forward, but many questions remain unanswered, as Levine and Elazar note in the accompanying perspective:
"The findings of Sato and Sato and Al Rawi et al. help to explain how paternal mitochondria and mtDNA are destroyed, but why they are destroyed remains a mystery. Is heteroplasmy, the occurrence of more than one mtDNA genotype, dangerous for the developing embryo? Or is the degradation of paternal mitochondria merely a primitive defense in which the fertilized oocyte views the paternal mitochondria as a potentially dangerous intruder that must be destroyed?"

[1] Sagan, L. (Margulis, L.) (1967). On the origin of mitosing cells. Journal of Theoretical Biology, 14(3). DOI: 10.1016/0022-5193(67)90079-3

[2] Sato, M., & Sato, K. (2011). Degradation of Paternal Mitochondria by Fertilization-Triggered Autophagy in C. elegans Embryos. Science, 334(6059), 1141-1144. DOI: 10.1126/science.1210333

[3] Al Rawi, S., Louvet-Vallee, S., Djeddi, A., Sachse, M., Culetto, E., Hajjar, C., Boyd, L., Legouis, R., & Galy, V. (2011). Postfertilization Autophagy of Sperm Organelles Prevents Paternal Mitochondrial DNA Transmission. Science, 334(6059), 1144-1147. DOI: 10.1126/science.1211878