Monday, May 18, 2020

Virus attenuation

Vaccines and IgA

It is widely held that attenuated live-virus vaccines are the best vaccines, at least for respiratory diseases, because they are able to induce an IgA response.  Plus, nearly all viruses enter the host via mucosal surfaces (oral cavity, nose, gut, lungs), where the first line of defense includes IgA.

Poliovirus is the exception that proves the rule.  Polio grows extensively in the intestine, hence its transmission by the fecal-oral route, resulting in a unique epidemiological role for swimming pools in advanced countries.  However, polio doesn't seem to cause much pathology there.  The major trouble occurs when it moves to the nervous system.

OPV (oral polio vaccine) induces IgA while IPV (inactivated polio vaccine) does not.  This restriction is not a serious problem for the inactivated virus vaccine because it seems that IgG is sufficient to prevent polio's movement to the nervous system.

There is some prospect for development of adjuvants that would drive B-cells to switch to producing IgA.  One of the best understood is cholera toxin, where binding to certain cells stimulates IgA-promoting cytokines (IL-1, IL-6, IL-10).  However, it is too toxic for use in humans.  Perhaps someday we'll understand enough to be able to get an IgA response with a killed virus, but that moment has not come yet.  ref

Virus attenuation

We are left with the fact that historically, live virus vaccines have mainly been produced via attenuation.  We can look at some modern approaches elsewhere, like smallpox or adenovirus or retroviruses that are already attenuated and can display viral antigens.

Yellow fever virus (YFV), the cause of "the black vomit", is now a disease of the tropics, but it was once common in the US.  10% of Philadelphia's population was lost in an epidemic in 1793, and even more in New Orleans some years later.

In the mid-1930s, Max Theiler found that YFV could grow in mouse embryos.  So it was passaged from one mouse embryo to another, and then it was found that, somehow, the virus acquired the ability to grow in chicken embryos.  So one general approach is to try to grow the virus in some kind of cells (anything):  human, if necessary, or monkey or mouse, and then adapt it to grow in chicken cells.

The chicken cells can either be in a whole embryonated hen's egg (i.e., with a growing embryo), or cells growing in a culture dish.  Here is a picture showing the different sites within the egg that are suited for different viruses (I believe these are mostly viruses that have been adapted to grow in eggs already).

I haven't read enough to know, but I would suspect that cultured cells for virus were usually CEF (chick embryo fibroblasts), which are easy to prepare.  You mince the embryo, first removing the head, and put the pieces in culture.  A few days later you harvest the growing cells by treatment with the enzyme, trypsin.  A few cells are transferred to a new flask.  Now, you have fibroblasts which will grow for a number of generations, and no more embryo.  These are called primary cultures.

Today a number of cell lines have been derived from chicken that will divide forever.  There is a famous cell line from humans called HeLa which you may have read about.

The measles virus (MV) was first grown in human kidneys in tissue culture, then in human placentas in tissue culture, and then in chicken eggs.  Later, it was adapted to primary cultures of CEF.  In rare cases cells derived from an aborted human embryo have been used.

Another approach is to adapt the virus to growth at lower temperatures.  Due to paywall restrictions, I haven't been able to read much of the literature on this, but I believe it was done in CEF growing at 25°C.  There is a well-known influenza live virus vaccine of this type.

So, a virus that normally infects humans and causes disease is adapted to grow in chicken cells, or adapted to grow at 25°C instead of 37°C, or both.  Afterward, you may find that the procedure yields a virus that no longer grows very well and does not cause disease in humans.

This may be because a virus can specialize in one or the other but not both.  Or it may be that during prolonged replication mutations accumulate that affect growth under the original condition, but these are not selected against as they would be in the original host or at the original temperature.

Molecular biology

Surprisingly little is known about the molecular basis of attenuation.  Probably that's because such work requires a system for reverse genetics.  That would be some DNA-based clone where the mutations to be tested could be introduced, followed by a method to produce the live (typically RNA) virus.  I've written about a new system of this type for SARS-CoV-2 where the clones are maintained in yeast.

In addition, the gold standard would be to test the virus in primates like monkeys.  That's really expensive and would need to be fully justified.  Without a strong need to know, it might be hard to get approval for such a study today.

One case where new antigens are substituted into an attenuated virus is the live influenza vaccine.

Influenza is a segmented virus.  If two different strains of influenza infect the same cell, you can get reassortment.  Suppose we start with a known attenuated mutant (due to changes in PB1 and PB2), and coinfect with an influenza virus whose HA and NA we want in the new vaccine.  Just take the progeny viruses, clone them (propagate descendants from isolated single viruses), and then choose the one with the right genes:  PB1 and PB2 from virus 1, and HA and NA from virus 2.
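To get a feel for why the progeny have to be cloned and screened, here is a minimal sketch that enumerates the possible reassortants.  It assumes the eight genome segments of influenza A and treats each segment as coming from either parent;  real reassortment is not uniformly random, so this only counts possibilities and says nothing about their frequencies.  The names `all_reassortants` and `is_wanted` are made up for illustration.

```python
from itertools import product

# The eight genome segments of influenza A.
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]

def all_reassortants():
    """Yield every genotype as a dict mapping segment -> parent (1 or 2)."""
    for parents in product((1, 2), repeat=len(SEGMENTS)):
        yield dict(zip(SEGMENTS, parents))

def is_wanted(genotype):
    """PB1 and PB2 from virus 1 (attenuated), HA and NA from virus 2."""
    return (genotype["PB1"] == 1 and genotype["PB2"] == 1
            and genotype["HA"] == 2 and genotype["NA"] == 2)

genotypes = list(all_reassortants())
wanted = [g for g in genotypes if is_wanted(g)]
print(len(genotypes), "possible genotypes;", len(wanted), "carry the right four genes")
```

Only a minority of the possible genotypes satisfy the four constraints, which is why each isolated clone has to be genotyped before one is chosen.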

In summary, attenuation has been widely used but is still more magic than science.  The best thing would be if an attenuated vaccine for the original SARS had been developed.  Then you could just substitute the new RBD (receptor binding domain) and try it out.  Unfortunately, it doesn't seem that was ever accomplished.

Tuesday, May 12, 2020

Measles virus

This is the first of a short series of posts in which I give some basic information on viruses that cause serious disease in humans.  It's motivated by the current pandemic with SARS-CoV-2.

The focus is on viruses that are human-specific but may have "jumped" species, the development of live attenuated virus vaccines, and the general nature of host restriction.

In this post, we'll talk about measles.  Measles is commonly known as rubeola (not to be confused with rubella), but also red measles and "English measles".


Measles virus is a single-stranded, negative-sense, enveloped RNA virus.  It is a member of the family Paramyxoviridae, genus Morbillivirus.  Below on the left is measles virus;  its relative mumps virus (also a paramyxovirus) is on the right.


Typically the first symptoms of measles include high fever of about 4 days duration, a characteristic rash, and what are called the "three C's":  cough, coryza (runny nose) and conjunctivitis.

Here are two pictures showing the characteristic rash.

The rash is described as maculopapular:
A maculopapular rash is a type of rash characterized by a flat, red area on the skin that is covered with small confluent bumps. It may only appear red in lighter-skinned people. The term "maculopapular" is a compound: macules are small, flat discolored spots on the surface of the skin; and papules are small, raised bumps. It is also described as erythematous, or red.
There is a special pathognomonic sign called Koplik's spots, seen inside the mouth on the cheek next to the molars.  Pathognomonic means it is characteristic enough to make a diagnosis by itself.  However, the spots are transitory and frequently missed.


Measles is a highly contagious infectious disease.  Nine out of ten people who are not immune and share living space with an infected person will be infected. People are infectious to others from four days before to four days after the start of the rash.

Not only are presymptomatic individuals infectious, but the virus spreads by aerosol, meaning that it can survive in an infectious state even in the very small droplets that waft around and don't fall to the ground within a few minutes.  The particles can stay airborne for hours.  This ability is unusual, and indicates the virus resists inactivation due to drying out.

The CFR (case fatality rate) ranges from 1-3/1000 for a well-nourished, healthy individual, to as much as 10% or more, for other populations.  Vitamin A-deficiency is very problematic, and supplementation is recommended.

According to wikipedia, measles killed 20 percent of Hawaii's population in the 1850s. In 1875, measles killed over 40,000 Fijians, approximately one-third of the population.

Typically the first tissue infected is the lining of the airways, but the virus eventually travels through lymph nodes, infects cells of the immune system, and then moves into the blood causing widespread viremia.

Bacterial pneumonia is one of the common sequelae, and that's what most people die from.  Other problems include ear infections, blindness, severe diarrhea, encephalitis (1/1000) and problems in pregnancy.  In very rare cases (1/1M), measles can reactivate years later to cause SSPE (subacute sclerosing pan-encephalitis).

Host restriction
Measles virus infection is presumed to be sustained through an unbroken chain of human-to-human transmission, and no animal or environmental reservoir is known to exist. However, nonhuman primates can be infected with measles virus and can develop an illness similar to measles in humans with rash, coryza, and conjunctivitis. Many primate species are susceptible to measles virus infection, including Macaca mulatta ... Much of the evidence for the susceptibility of these nonhuman primates comes from laboratory colonies and the use of nonhuman primates as animal models for the study of measles virus pathogenesis.
One of the most interesting aspects of measles epidemiology is that the virus is so infectious, it runs out of hosts if the population is too small.  With a larger population, it comes back every few years as a new crop of susceptible hosts develops.
To provide a sufficient number of new susceptibles through births to maintain measles virus transmission in humans, a population size of several hundred thousand persons with ∼5000–10,000 births per year is required
Surveys of wild populations have sometimes revealed non-human primates with antibodies to measles virus.  It is believed that the virus was spread from humans to one of these animals, followed by limited spread and then die-out.



The first laboratory to grow the virus was that of John Enders and colleagues.  They also were first to culture poliovirus, which led to work on the vaccines by Salk and Sabin.  Enders et al received the Nobel Prize in 1954 for this work.

The vaccine strain is named for the boy from whom that virus was cultured, Edmonston.

The virus was weakened by successive culture in
- human kidneys
- human placenta
- hen's eggs
- chick embryos

Although significantly weakened by this serial culture,  it still caused rash and fever, sometimes high enough so that children had seizures.

The first thing Hilleman did was give the vaccine together with gamma globulin from people who had recovered from measles.  He then passed Enders' measles vaccine strain through chick embryo cells more than 40 times.

Vaccinated is a biography of Hilleman.  It tells the story of Hilleman obtaining specially-bred chickens that were free of chicken leukemia virus.

The vaccine is highly effective.

Despite significant diversity of virus isolates, Measles virus remains a monotypic virus for which protective immunity is induced by vaccine strains first isolated in the 1950’s.


The genus Morbillivirus includes similar viruses that infect dogs, cats, whales, seals and cattle.  The disease of cattle is referred to by a German term:  Rinderpest ("cattle plague").  Of the relatives, the rinderpest virus is the closest to measles.

At NCBI I searched for measles and found 363 genome nucleotide sequences, most of which appear complete.  I just chose one at random, NC_001498, and then got the sequence of its nucleocapsid gene, NP_056918.  A BLAST search gave a large number of hits with other Measles virus isolates down to 97% identity.

Restricting the search to Rinderpest (taxid: 11241), I got numerous hits as well, in the range of 75-80% identity.  But if you look at the alignments, there is a C-terminal region that diverges.  The N-terminal 400 aa (of 524) matches very well.  Restricting the search to 1-400 the matches were more like 88% identical, like this one:

Here is a phylogenetic tree of Morbilliviruses from this review:

One can often recognize present-day diseases in descriptions from ancient times, but measles is missing from those accounts.  The first systematic description of measles, and its distinction from smallpox and chickenpox, is credited to the Persian physician Rhazes (860–932), who published The Book of Smallpox and Measles.

By analyzing the diversity of the sequences of viral isolates, it is believed that the last common ancestor of Measles virus and Rinderpest occurred about 1000 AD plus or minus.  It is also thought that the virus "jumped" from cattle to humans, due to domestication of livestock and growth of the human population to a level that could support the virus.


Enders Nobel prize and lecture
Enders biography
Hilleman biography
Hilleman obit

Monday, May 11, 2020

Introduction to animal viruses

A long time ago, in a place far far away, I used to give lectures to medical students about microbial physiology and genetics.  I also gave two lectures introducing the viruses that infect humans:  not much about disease yet, but morphology and replication strategies, and so on.

Here is one figure I used:  a cartoon of what various RNA viruses look like in the EM, drawn to scale.  (It's from Lange).  The morphology is quite diverse.

Our attention is currently focused on the one in the middle of the top row.  Now, there are a lot of properties that viruses have:  is the genome RNA or DNA, single- or double-stranded and so on.  Also shape of the capsid, lipid envelope or not.

I had a hard time remembering all this stuff (I actually could not), so I made up a picture that was successful.  This is the general organization for RNA viruses.

Here is how I used it.
So the idea is, you remember the order of the different viruses in the table, + sense on top, - sense underneath, with one double-stranded at the end of the second row.

Then you memorize a pattern of active dots for each property that you need to know.  I required them to know which viruses were enveloped, and which had segmented genomes.

There is another thing that's a bit of a detail, but some may find it useful.  There is so much material in lectures, especially to medical students, that it has been described as "trying to drink from a fire hose."  I am very sympathetic to them on this issue.  I adopted the strategy to color-code text on slides:  blue means you must know this, black means it's important and you may need to know this, and gray means it's something I want to talk about but you do not need to know this.  And then another color, like salmon, for the title of a slide to tell what it's about.

Here's another summary slide showing the Arboviruses.  These are "arthropod-borne" viruses (i.e. carried by mosquitoes, ticks and the like).

This is the sum total of Coronavirus information (characteristics of the infectious process were taught later, in the context of lung infections).

Finally, here's another cartoon of DNA viruses.

I was pretty proud of myself for coming up with that aid to memory.

I have put links to the two lectures up on Dropbox.  I can't guarantee they'll stay up forever, but we'll see.  Introduction to viruses   Virus systematics

Sunday, May 10, 2020


Take a rectangular tank and epoxy in electrodes at opposing ends, like this

Add an appropriate buffer, and then hook the electrodes up to + and - output from a regulated power supply, and you will get a voltage between the two wires that causes a current to flow.  Biological molecules carry electric charge, so they will move too, in an appropriate supporting medium.

Electrodes are made from thin platinum wire.  Since the molecules are negatively charged, they move toward the anode (the positive electrode, red).


Two materials have traditionally been used for gels:  polyacrylamide and agarose.  Acrylamide is on the left, and its polymerized form is on the right.

Acrylamide is cross-linked into a mesh by the inclusion of a small amount of bis-acrylamide (technically, N,N' methylene-bisacrylamide).

You mix a solution of acrylamide (5-20% is a typical range) and bis-acrylamide plus the appropriate buffer, then the reaction is started by addition of a small amount of TEMED and ammonium persulfate.  The mixture is poured into a mold (glass plates separated by spacers, with something at the bottom to keep the liquid from running out).

Once it's set, the plug at the bottom is removed.  Electrical continuity is maintained by wedging a piece of sponge into the bottom.

A gel mold contains two glass plates, one of which is notched.  The "ears" of the notched plate have a tendency to get broken, so usually a set comes with an extra notched plate when you buy them.

The other material is agarose.  This is a purified form of the agar that is used for bacteriological plates (Petri dishes).  Agar is a polysaccharide extracted from certain kinds of seaweed.  Agar has been used to solidify desserts for a long time.  It was introduced to Koch's laboratory by the wife of one of his assistants, who knew about it.  He never publicly credited her with the idea.

[ It's amusing that the very first medium used for isolation of single colonies of bacteria was a potato, sliced through.  Appropriate for a German laboratory, I think. ]

Biochemically agarose is a repeating polymer of dimers of galactose plus a galactose derivative.

Agar and agarose have the property that when mixed with water, boiled and then cooled, the material sets into a gel (like jello, but generally stiffer) at around 45°C.  Once set, it can be heated a lot higher without losing its physical properties.  The stiffness depends on the concentration of agarose used.  0.8-1.5% would be usual.

Agarose gel electrophoresis also requires an appropriate buffer (e.g. Tris-acetate).  This type of electrophoresis is extremely convenient.  The gel is non-toxic, easily prepared by boiling, and the gel can be poured flat (see the picture above).  You can't do this with polyacrylamide because oxygen inhibits the polymerization reaction.

Samples are loaded into wells under the surface of the buffer.  The aqueous samples are made dense by addition of glycerol to 10% or so.  Dyes (bromophenol blue and sometimes xylene cyanol) are also added to the samples.  They move at characteristic rates and allow you to visualize the progress of the separation.

Separation in electrophoresis

For DNA at neutral pH, charge is carried by the phosphate groups, which contribute 1 or 2 negative charges (the average depends on the exact pH).  This means that the charge to mass ratio is constant for DNA or RNA of different lengths.

The reason that DNA or RNA molecules of different sizes separate is the existence of a retarding force that is greater for larger molecules.  Or maybe it's better to turn that around:  we observe that the distance traveled falls off roughly linearly with the log of the length of the molecule, and infer the existence of a force that depends on length.
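In practice the size dependence is used as a standard curve:  run markers of known length next to the sample, fit a line, and read off the unknown.  Here is a minimal sketch, assuming the usual working approximation that log10(length) is roughly linear in distance migrated.  The marker sizes and distances are invented numbers for illustration, not measurements, and `make_size_estimator` is a made-up helper name.

```python
import math

# Hypothetical marker fragments: (length in bp, distance migrated in mm).
MARKERS = [(3000, 12.0), (2000, 17.0), (1000, 26.0), (500, 35.0), (250, 44.0)]

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def make_size_estimator(markers):
    """Fit log10(length) against distance and return (estimator, slope)."""
    distances = [d for _, d in markers]
    log_sizes = [math.log10(bp) for bp, _ in markers]
    slope, intercept = fit_line(distances, log_sizes)

    def estimate(distance_mm):
        return 10 ** (slope * distance_mm + intercept)

    return estimate, slope

estimate_bp, slope = make_size_estimator(MARKERS)
print("slope (log10 bp per mm):", round(slope, 4))
print("band at 30 mm is roughly", round(estimate_bp(30.0)), "bp")
```

The fit is only trustworthy inside the range covered by the markers;  very large fragments in particular stop following the straight line.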

Samples for protein gels are typically prepared by boiling a protein mixture in the presence of a detergent SDS (sodium dodecyl sulfate).  The hydrophobic part coats the protein and destroys its secondary structure.  The evenly spaced sulfate groups impart negative charge.  As with DNA,  the charge to mass ratio is constant for polypeptides of different lengths.  Separation occurs by means of the size-dependence of the retarding force.


The classic method for visualizing protein gels is to stain with a blue dye (Coomassie brilliant blue).  In this picture we can see a protein gel drying after the electrophoresis has been run.  The blue spots are proteins.

DNA gels were often stained with a fluorescent dye such as ethidium bromide.

Ethidium is moderately mutagenic, so substitutes have been developed.

Alternatively if the material is radioactive, you just expose the dried (or even wet) gel to X-ray film.  These days, they have fancy apparatus that records the emitted beta particles without the use of film.  I remember the revolution caused by the introduction of automatic film processors.


To get the best resolution, you want the bands of protein or nucleic acid to be as thin as possible.  Here is a gel with very nice resolution:

The thickness of the bands depends on how much of each protein is present in the sample.

To get a pretty gel (one with nice thin bands), for DNA or RNA the important thing is to have as little sample as possible and to run a thin gel (like 0.4 mm).

For protein gels, there is a trick, invented by Laemmli.  There is a combination of two gels, one on top called the stacking gel, and a larger one below called the running gel.  The system has 3 different buffers.  The upper and lower tank buffers contain glycine as the mobile anion and are at pH 8.3.  The gels are:

Stacking gel:  3% acrylamide, pH 6.8
Running gel:  5% - 20% acrylamide, pH 8.8

This system compresses a sample which might be almost a centimeter from top to bottom when first loaded, into a set of protein bands much less than one mm thick as they exit the stacking gel.

One last thing:  a lab running protein gels will have a characteristic smell of sulfur.  That's because a sulfhydryl reagent like beta-mercaptoethanol will be present in the samples to break disulfide bonds in the proteins.  It's minimally dangerous in small quantities, but these days the safety police make you boil your samples in a fume hood.


Paper chromatography

Chromatography was first used to separate plant pigments such as chlorophyll and carotenes, which is how it got its name, because these are colored molecules.

Take a piece of thick paper (cellulose), spot a sample of spinach extract on it, and then place the paper in a jar of solvent (spot above the liquid).  Capillary action moves the solvent up the paper, and if the components of the sample are soluble in the solvent, they will move up as it rises.  The components move at different rates, so they get separated.

Here is an idealized result:

A typical solvent would be a mixture of petroleum ether, acetone, and water (3:1:1).  Various protocols are available on the web. [link]  Luckily petroleum ether is not really an ether so it doesn't form explosive products, although it is flammable.


One advance was to manufacture derivatized cellulose where other chemical groups are attached.  Cellulose is just glucose units with a particular linkage.

One very common substitution is DEAE (diethylaminoethyl) cellulose, which is positively charged around pH 7.

Another advance is to pack the material into a column and use gravity or a small pump to move the solvent through the column under moderate pressure.

Here's an example from one of my old lectures where extracts of E. coli are fractionated on DEAE-cellulose to separate the different DNA polymerase activities.  On the x-axis is the fraction number, the first material to come out  (elute) from the column is on the left, last on the right.  The y-axis is the DNA polymerase activity.

The polA mutant lacks the primary DNA polymerase activity (Pol I), yet it grows fairly normally.  That suggests that one of the other activities is responsible for replication of the chromosome.  Mutants lacking Pol I are sensitive to mutagens such as UV light, which suggests that the primary role is in repair of damaged DNA.

Another very common substrate for protein purification is phospho-cellulose, where phosphate is attached.  This is negatively charged near pH 7.  DNA and RNA polymerases bind pretty tightly to this.

An important modification (used for the column above) is to change the composition of the buffer in the column as time goes on.  Here is a simple device for doing that, called a gradient maker.

You put say, low salt buffer in the left cylinder, and high salt buffer on the right.  Open the valve between the two, and pump out from the left hand side.  Mix well as you introduce high-salt into the low-salt chamber.  The result is a linear gradient from low salt to high salt.
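It isn't obvious that this arrangement gives a straight line, so here is a small simulation.  It is a sketch under stated assumptions:  the two cylinders are identical and open at the bottom to each other, so their levels stay equal;  mixing is instantaneous;  and the function name, volumes and salt concentrations are invented for illustration.  Every time a small volume is pumped out of the mixing chamber, half that volume flows in from the high-salt reservoir to keep the levels even.

```python
def simulate_gradient(low_salt, high_salt, chamber_volume, steps=1000):
    """Return a list of (volume pumped so far, salt in the mixing chamber)."""
    mix_volume = chamber_volume        # mixing chamber starts full of low salt
    mix_conc = low_salt
    dv = 2 * chamber_volume / steps    # volume pumped out per step
    pumped = 0.0
    profile = []
    for _ in range(steps):
        profile.append((pumped, mix_conc))
        # Half of each pumped volume is replaced from the high-salt side.
        salt = mix_conc * mix_volume + high_salt * dv / 2
        mix_volume += dv / 2
        mix_conc = salt / mix_volume   # assumes perfect mixing
        # The pump then removes dv at the new concentration.
        mix_volume -= dv
        pumped += dv
    return profile

profile = simulate_gradient(low_salt=0.05, high_salt=0.5, chamber_volume=100.0)
for volume, conc in profile[::250]:
    print(f"{volume:6.1f} ml pumped   {conc:.3f} M salt")
```

The printed concentrations rise in equal increments for equal volumes pumped, which is the linear gradient.  Cylinders of unequal cross-section would give a curved gradient instead.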

Since proteins bind more or less tightly to DEAE (and P-cell) by interactions with charged amino acids, they will "come off the column" at different salt concentrations.

Saturday, May 9, 2020

DNA Sequencing: classical approaches

Recall that in Crick's phrase:  DNA makes RNA makes protein.

Of the three, proteins (polypeptide chains) were the first to be sequenced.

The very earliest nucleic acid sequences were of RNA molecules.  Robert Holley sequenced an alanine transfer RNA, 77 bases in length, and received the Nobel prize for this work in 1968 (pdf of Nobel lecture).  The problem with tRNA is complicated by the presence of modified bases such as pseudouridine.

(There is a small problem with the structure shown in his lecture.  This is yeast tRNA Phe, which is charged with phenylalanine).

There were other early pioneer chemists as well.  Fred Sanger received two Nobel prizes, one for determining the amino acid sequence of insulin, and a second for developing the dideoxy sequencing method for DNA.  He also worked on RNA sequencing methods.

RNA sequencing

Some basic tools they used were:  32-P labeling, various sequence-specific nucleases including ribonuclease T1, and thin-layer chromatography and paper electrophoresis.

Labeling:  The 32-P (pronounced P 32) isotope of phosphorus is radioactive (and energetic, emitting 1.7 MeV beta particles).  Since DNA and RNA have phosphodiester bonds in them they can be made radioactive, incorporating 32-P by various means.

These include:  (i) growing cells in radioactive phosphate, (ii) using radioactive precursors for in vitro synthesis, and (iii) using the enzyme T4 polynucleotide kinase to add 32-P and label the 5' ends of molecules.

With a purified RNA, sequence analysis would usually start by making digests such as with T1 ribonuclease, which cuts specifically after G.  So if a molecule was


T1 nuclease digestion would give three products


which can be individually purified.  The sequences of the small fragments were then determined by a variety of methods.  A different method was used to produce small fragments that overlap the joints.
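The digestion logic is easy to mimic in code.  This is a sketch with a made-up 13-base RNA (not a sequence from any of the papers), chosen so that T1 gives three products.  For the second digest I assume an enzyme that cuts after the pyrimidines C and U, as pancreatic ribonuclease does;  the helper names are invented.

```python
def digest_after(rna, bases):
    """Cut an RNA string after every occurrence of any base in `bases`."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base in bases:
            fragments.append(current)
            current = ""
    if current:                      # whatever is left at the 3' end
        fragments.append(current)
    return fragments

def fragments_spanning_joints(first, second):
    """Fragments of the second digest that straddle a cut site of the first."""
    joints, position = set(), 0
    for fragment in first[:-1]:
        position += len(fragment)
        joints.add(position)         # the cut lies just before this index
    spanning, start = [], 0
    for fragment in second:
        end = start + len(fragment)
        if any(start < joint < end for joint in joints):
            spanning.append(fragment)
        start = end
    return spanning

rna = "AUCGUUACGCCAU"                          # hypothetical molecule
t1_products = digest_after(rna, "G")           # T1: cuts after G
pyrimidine_products = digest_after(rna, "CU")  # assumed second enzyme
overlaps = fragments_spanning_joints(t1_products, pyrimidine_products)

print(t1_products)   # ['AUCG', 'UUACG', 'CCAU']
print(overlaps)      # ['GU', 'GC']
```

The two short fragments from the second digest each contain a G followed by the first base of the next T1 product, which is the kind of information needed to put the T1 products in order.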

Here are a couple of figures from Sanger's autobiographical Annual Reviews article which I hope convey the flavor of that kind of analysis, if not all the detail.  The first shows a total digest of 5S rRNA.
If you have CpApApApUpCpA and obtain by hydrolysis the trinucleotide CpApA, you know a part of the sequence of the original compound.  By determining a number of these and finding the overlaps, the whole sequence may be assembled.

A powerful idea is partial hydrolysis of end-labeled material.

The sequence above can be read as TTACCCTT.  (This particular one is DNA).

lac operon control region

In 1975 the first natural DNA sequence was determined.

This was also the summer of the famous moratorium on recombinant DNA work.  It was obvious that a new era was beginning for biology.

Reznikoff et al. used a genetic trick in E. coli to transfer the lac operon's promoter/operator region onto lambda phage, transcribed across the control region in vitro, and then used a biochemical trick to enrich for sequences that were only from the lac control region.  The latter was to hybridize the RNA to separated strands of lambda phage without lac, and keep the stuff that didn't hybridize.

(PMID 1088926 --- this paper is locked behind a paywall at the journal Science, 45 years later, but I found a pdf linked on Bill Reznikoff's page at UW-Madison).

Maxam-Gilbert:  chemical method

In 1977 Wally Gilbert's lab introduced a chemical sequencing method for DNA.  I employed this method to sequence 70 bp of the promoter region for a late gene from phage T4 called gene 23.  I used to boast that mine was the shortest DNA sequence ever published but that's not actually true.  This one probably is.

The first step was to use alkaline phosphatase to remove 5' phosphates, then DNA kinase and 1 mCi of 32-P-ATP to label the ends.  If you do this with double-stranded DNA then both ends will be labeled.  You must somehow separate the strands or, cut with a restriction enzyme to get two different-sized pieces, and separate them by gel electrophoresis.  Here's a strand separation from the paper.

I used to do this on very thin acrylamide gels by denaturing the sample, loading it, and immediately cranking the power supply to 4000 V.  If you left it too long (> 15 sec), the heat would crack the glass plates.

The next steps were relatively simple organic chemistry
... the purines (A+G) are depurinated using formic acid, the guanines (and to some extent the adenines) are methylated by dimethyl sulfate [destabilizing the glycosidic bond], and the pyrimidines (C+T) are hydrolysed using hydrazine. The addition of salt (sodium chloride) to the hydrazine reaction inhibits the reaction of thymine for the C-only reaction. The modified DNAs may then be cleaved by hot piperidine; (CH2)5NH at the position of the modified base.
 What this does is to generate for each reaction a population of molecules, some of which end at the base in question.  So if the original molecule is


with * marking the 32-P, then after partial cleavage after A (adenosine) you have


Here is the wikipedia figure:

After gel electrophoresis and autoradiography (exposure of the gel to X-ray film), you get something like this:

And you can read the sequence.
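Reading the film can be written down as a small procedure.  The sketch below assumes the four lanes named in the quoted description (G, A+G, C+T, C), a 5' end label, and the simplification that cleavage at a base leaves a labeled fragment containing only the bases in front of it.  The input sequence and the function names are invented for illustration.  Because cleavage at the very first base leaves nothing but the label, the read starts at the second base.

```python
# Which bases each chemical reaction cleaves at.
LANES = {"G": "G", "A+G": "AG", "C+T": "CT", "C": "C"}

def labeled_fragment_lengths(seq):
    """For each lane, the set of labeled fragment lengths seen on the gel."""
    lanes = {name: set() for name in LANES}
    for index, base in enumerate(seq):
        if index == 0:
            continue                 # no labeled fragment survives
        for name, targets in LANES.items():
            if base in targets:
                lanes[name].add(index)   # fragment = bases before the cut
    return lanes

def read_ladder(lanes):
    """Call one base per band, from the shortest fragment to the longest."""
    longest = max(max(lengths) for lengths in lanes.values() if lengths)
    calls = []
    for length in range(1, longest + 1):
        if length in lanes["G"]:
            calls.append("G")        # band in both G and A+G
        elif length in lanes["A+G"]:
            calls.append("A")        # band in A+G only
        elif length in lanes["C"]:
            calls.append("C")        # band in both C and C+T
        elif length in lanes["C+T"]:
            calls.append("T")        # band in C+T only
        else:
            calls.append("N")        # no band at this position
    return "".join(calls)

sequence = "GATTACCCTT"              # hypothetical end-labeled strand
ladder = labeled_fragment_lengths(sequence)
print(read_ladder(ladder))           # ATTACCCTT
```

On a real film the hard part is not this logic but deciding whether a faint band in the A+G or C+T lane is really there.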

Sanger sequencing

Fred Sanger developed a method for DNA sequencing.  It generates a population of molecules like above by using synthesis instead of degradation.  The synthesis is stopped short in some of the molecules by including a reagent which poisons the reaction, namely, a dideoxynucleotide triphosphate.

Here is the principle of the method in a figure from Sanger's (second) Nobel lecture.

In 1988, I spent about 3 months running about 100 sequencing gels to obtain about 3.5 kb of sequence data to determine the sequence of my favorite gene, hemA.  The data looked like this:

With somewhat better methods I spent about 6 months to obtain about 12 kb of sequence data in 1993-94.

Year  Amt         Time invested
1983  70 bp       months
1988  3500 bp     3 months
1994  12000 bp    6 months

Probably the biggest challenge in the method was to prepare the large polyacrylamide gels with no bubbles and a thickness at the top of 0.4 mm.

This era ended when commercial DNA sequencing services started.  Together with PCR, including PCR to amplify transposon-genome junctions, it revolutionized our work in the mid-1990s.  We would do a PCR reaction, send it in the mail, and 2 days later get a result by email.

Two very important adaptations were later made to the method.  First, rather than label the DNA, the dideoxy terminators were themselves labeled with fluorescent dyes.  Among other things, this meant that the four reactions could be run together as a single sample and read by laser excitation and a detector.

The other was replacement of polyacrylamide gels by capillary electrophoresis.  This extended the amount of information obtainable on one analysis of a sample from about 200 nt to about 1000 nt.

Machines do many reactions in parallel.  ABI 310.  You can buy a used one cheap, but caveat emptor.

That is the technology which was used to sequence the human genome, declared complete in 2003.

Reconstruction of SARS-CoV-2

A few days ago a paper was published in Nature after rapid review:

Rapid reconstruction of SARS-CoV-2 using a synthetic genomics platform.

It's quite amazing.  For starters, the work was completed in less than 6 weeks, since the first genomic sequence of the SARS-CoV-2 virus was released on Jan 10 and the paper was submitted on Feb 22.  Most of that time was preparation of the DNA fragments.

9 DNA fragments

The first step was to PCR amplify DNA segments spanning the genome.  This was done by RT-PCR.  Usually one verifies the sequence has not been mutated in the PCR reaction (I don't see this in the paper).  

Alternatively one can just order synthetic DNA of < 8 kb these days (with a 10 day turnaround)!  link NPR story


Some DNA fragments are unstable when cloned into E. coli.  So they used YACs (yeast artificial chromosomes).  An additional advantage is that all 8 fragments can be assembled into the final product in one step!  (TAR-cloning)
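One-step assembly in yeast relies on recombination between matching sequences at the ends of neighboring fragments.  Here is a minimal Python sketch of that idea, assuming the fragments are given in order and each shares a short exact overlap with the next.  The sequence, the fragment boundaries and the function names are invented for illustration and are not taken from the paper.

```python
def join_by_overlap(left, right, min_overlap=4):
    """Merge two fragments where the end of `left` matches the start of `right`."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments do not share an overlap")


def assemble(fragments, min_overlap=4):
    """Join an ordered list of overlapping fragments into one molecule."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = join_by_overlap(product, fragment, min_overlap)
    return product


# Hypothetical 29-nt "genome" cut into three overlapping pieces
genome = "ATGGCGTACGTTAGCCGATAGGCTAACGT"
fragments = [genome[0:12], genome[8:21], genome[17:29]]
print(assemble(fragments) == genome)
```

The yeast cell does all the joins at once rather than one after another, but the requirement is the same:  each fragment has to carry enough of its neighbor's sequence at the end to be matched up.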

In vitro transcription

The clones are made with a promoter for phage T7 RNA polymerase upstream and poly A tail downstream followed by a restriction site for Pac I.  YAC DNA was prepared, cut with Pac I and transcribed with T7 polymerase.


The transcript was then transfected into mammalian cells, together with mRNA for one viral protein.  The system is a bit more complicated than that:  there are two cell types.  But it seems easy enough, and the result is that the transfected cell system produces virus.

The virus can then be cloned (classic terminology, progeny from a single individual produced) and grown in culture.  Simply amazing.

Attenuating mutations from viruses for the live-attenuated vaccine for SARS-CoV can be assembled into live SARS-CoV-2 and tested for growth properties and attenuation, and then investigated as vaccines.

Friday, May 8, 2020

Jonathan Swift

Falsehood flies, and the Truth comes limping after it.

Lewis Thomas on Streptococcal pneumonia

I've read a few case reports of COVID-19 cases. They talk about a slowly progressing disease, not particularly severe although the patients are miserable, until about day 10. At that time you get the severe pneumonia. I think it's probably a "cytokine storm", and I wanted to call it the "crisis", though I'm not sure that's an official term. Today I started to wonder about where I heard that before. And then I remembered:
"the patient complained of the sudden onset of chills and fever, cough, sometimes with blood-tinged sputum, and pain in one side of the chest; physical examination revealed dullness to percussion with one’s fingertips over the affected lung area and a characteristic change in the breath sounds heard with the stethoscope at the same spot. Given this amount of information you could begin making predictions. The prognosis for a young adult was the most surely predictable: an acute illness lasting ten to fourteen days, with a high fever each day, more chest pain and more cough, perhaps with alarming manifestations of exhaustion and debilitation near the end of this period, and then, suddenly and as triumphantly as the bright sunshine after a thunderstorm, one of the great phenomena of human disease — the crisis. On one day or another, after two weeks of his seeming to come closer and closer to death’s door, the patient’s temperature would drop precipitously within a few hours from 106 degrees to normal, and at the same time, with a good deal of sweating, the patient would announce that he felt better now and would like something to eat, and the illness would end, like that."
from Lewis Thomas, The Youngest Science.
Curious to know how they treated it? Streptococcal pneumonia was treated (1938) by first finding out what type of Streptococcus the patient had, and then injecting rabbit anti-Streptococcal serum. We wouldn't do that today, it's pretty risky. But I'm looking forward to convalescent patient serum, and perhaps humanized monoclonal antibodies for SARS-CoV-2.
If you've never read Lewis Thomas, you should. A biologist, an MD and a poet.

Originally from FB (2020-03-29).


You've probably heard about remdesivir, it's all over the news.  Tony Fauci was on TV being excited about a not very exciting study the other day that "showed promise."

I've been quite interested in this drug ever since I read in NEJM that the first patient to be diagnosed as SARS-CoV-2 positive in the US ("Snohomish") was treated with remdesivir on a compassionate-use basis when they thought he was going south, and the next day, he got better.  Much better.  Supplemental oxygen was discontinued that day.  There is plenty of anecdotal evidence on the web in addition to Snohomish.  I read a piece about an ER doc the other day but I can't find it now.  The patient said "that magic juice works, Doc."

Remdesivir has a weird name.  Many drugs do.  In this case, it's a result of a naming convention that antivirals end in -vir, monoclonal antibodies in -mab and so on.  It's a very useful convention because one can immediately tell the class of a drug from its name.  There's a whole list of them here.  I don't know that remdes- is significant itself.

Like many antiviral drugs, remdesivir is a nucleoside analog.  Here it is compared with ATP (remdesivir on the left, adenosine triphosphate on the right).

Remdesivir was initially developed for Ebola and Marburg viruses (filoviruses).  It has broad antiviral activity (in vitro) for other virus families including paramyxoviruses and coronaviruses such as the first SARS-CoV.  These viruses are all RNA viruses, and remdesivir inhibits the enzyme that replicates the genome, which is an RNA polymerase.

As part of its clinical testing for Ebola, the safety profile in humans is known to be good, although there are some things to watch for.

As with all antivirals, it is likely to be most effective if given early in the disease.  The problem with that is that there doesn't seem to be much of it.  The maker, Gilead Sciences, just contributed its entire stockpile of 1.6 M doses to the federal government, and they are busy sending most of it to states that voted for Trump (I'm not kidding).

Remdesivir is apparently hard to make.  Here is an article about some chemists who were able, with a lot of effort, to synthesize one gram of it.  Although you can take that with a grain of salt (or a shot of tequila).  This story has twists and turns in it.


So Gilead is a company started in the late 1980s, the time when many people including my former professors John Abelson and Mel Simon had a startup to find drugs active against reverse transcriptase.  It is named after the Balm of Gilead.

Gilead developed a couple of antivirals in the 1990s, and then (small world department) acquired a company called NeXstar Pharmaceuticals in Boulder (this is Larry Gold's company, another colleague from the phage T4 world).  If I'm reading Wikipedia right, NeXstar was their key acquisition because it gave them a sales and marketing ability for Europe and other markets.

In any event, Gilead is the fish that keeps eating other fish its size or even a little larger, exactly the right ones, and then gaining weight.  Big enough to give away a lot of money.  According to the wikipedia article, "Charitable donations to HIV/AIDS and liver disease organizations totaled over 440 million in 2015."

One of the companies they acquired had developed sofosbuvir, an antiviral for Hepatitis C virus.  They paid 11 B and yet Forbes calls it "one of the best pharma acquisitions ever".

You may have heard about these treatments.  HCV (hepatitis C virus) is essentially a death sentence.  Gilead's drug will cure you, but it'll cost you your house.

the cat story

So finally, there's a weird tie-in with cats, which I read this morning in the Atlantic.  (In fact, I've been so impressed with them, especially recently, that I purchased my first magazine subscription since Mad magazine).

Remdesivir has a cousin, a closely related drug called GS-441524, also developed by Gilead.  A research scientist working on the disease FIP discovered that GS-441524 cures cats of this fatal GI illness, which is caused by a coronavirus.  In spite of the pleas of legions of cat-lovers, Gilead will not try to license GS-441524 for this purpose, apparently because
any adverse effects uncovered in cats might have to be reported and investigated to guarantee remdesivir’s safety in humans.
So there is a shady black market in China for GS-441524.  I kid you not.  Read the story, it's great stuff.


This is just my two cents, which means little.  I am highly skeptical that a vaccine can be successfully developed in 18 months, let alone 6.  As I said in another post today, the optimistic experts say "maybe we will get lucky."

But I am very optimistic about therapeutics and particularly remdesivir, given early.  The thing about COVID-19 is that the lung damage is so severe.  Even if you survive it takes a long time to heal.  And antivirals always work best given early.  See oseltamivir (another Gilead drug).

DNA and RNA vaccines for SARS-CoV-2

People ask about the prospects for a vaccine for SARS-CoV-2.

The NYT had a good article about it about a month ago.

The history of vaccine development is a series of long drawn-out and frequently failed attempts.  The idea that we throw everything at this problem and it's just going to work, is pie-in-the-sky.  The optimistic experts say "maybe we will get lucky."

The proposals that would move the fastest are for DNA and RNA vaccines.  These are vaccines that would contain (for example) the gene for the Spike surface protein of the virus, with accessory elements giving high expression.

They would either be mRNA (modified to be resistant to breakdown) or DNA.

But you should be aware that no DNA or RNA vaccine has been licensed for humans, period.  There is one for West Nile Virus in horses.  Many have been tried but they frequently don't elicit neutralizing antibodies.

One problem is efficiency.  The nucleic acid just doesn't get into cells very well.  People are trying gene "guns" to improve efficiency:  the nucleic acid is bound to small metal microbeads, and propelled into your arm with compressed helium.

Moderna's candidate is an RNA vaccine that is in ongoing Phase 1 clinical trials, and has been approved to start Phase 2 even though the results from Phase 1 aren't known yet.  Phase 3 is expected to begin in early summer.


Phase 1:  Small N.   Is the vaccine grossly toxic or relatively safe?
Phase 2:  Medium N.  Is the vaccine effective?
Phase 3:  Large N.   Is the vaccine very safe and also effective?

Analysts are even wondering where they are going to get all the vials they will need to put the vaccine in, and the syringes and needles for injections.  It's a real problem.

Thursday, May 7, 2020

Links to recent posts inspired by the pandemic

Covid-19 pathogenesis and epidemiology:
link Unsupported claim of increased pathogenicity (2020-04-23)
link Covid-19 deaths by county (2020-04-29)
link South Carolina cases (2020-04-30)
link Covid-19 CFR (2020-05-01)
link SARS-CoV-2 transmission (2020-05-03)
link Covid:  good news, bad news (2020-05-05)

SARS-CoV-2 virology:
link Phylodynamics, NextStrain, and "Snohomish" (2020-04-30)
link Rapid reconstruction of SARS-CoV-2 using a synthetic genomics platform

General virology:
link Introduction to animal viruses
link Measles virus

link Overview of vaccination (2020-05-07)
link DNA and RNA vaccines for SARS-CoV-2 (2020-05-08)
link Virus attenuation (2020-05-18)

link Remdesivir

link Covid tests:  PCR (2020-05-06)
link Covid tests:  LAMP + CRISPR (2020-05-06)
link Sensitivity, specificity and Bayes (2020-05-06)
link DNA sequencing:  classical methods (2020-05-08)
link Chromatography (2020-05-10)
link Electrophoresis (2020-05-10)

link Lewis Thomas on Streptococcal pneumonia (orig FB 2020-03-29)

Stupid stuff debunked here (links):
link Lyme disease is a U.S. government-developed bioweapon (false)
link Mouse retrovirus associated with Chronic Fatigue Syndrome (false)
link Vaccines contain heavy metals (other than mercury) (false)
link Judy Mikovits (crazy person) + this
link Didier Raoult and HCQ (crazy person)
link No saline placebo trials for vaccines (false) --- No true Scotsman fallacy

A connection between thimerosal in vaccines and autism is refuted by systematic studies but also by the simple fact that since the removal of thimerosal from childhood vaccines, rates of autism diagnosis have not decreased as predicted but actually increased.

Overview of vaccination

A month ago there was an article in Nature that summarized current efforts toward a vaccine for SARS-CoV-2.  A total of 115 different projects were identified, 73 of them clearly moving forward.  Not much specific is publicly known as yet about these different projects.

In this post, I want to start with some history about vaccination and a broad overview of vaccine strategies.

We'll take a look at specific projects for Covid in another write-up.  I'm particularly interested to read up on modern approaches and what their chances of success are.


Smallpox was a greatly feared viral disease that was eliminated from the earth in the 1970s.  This guy is the last person to ever have smallpox.

(Except that the Russians kept samples, and apparently that lab had a fire recently).

Both Washington and Lincoln had smallpox, Washington as a young man when he visited the West Indies with his tuberculous brother Lawrence, and Lincoln about the time of the Gettysburg address.

Variolation involves taking pus from a smallpox patient and inoculating it to an uninfected person using a needle.  Often the resulting case would be mild.  It is an old practice dating back at least to the 1500s if not before.  It was brought to England in the early 18th century.
Maitland conducted an experimental variolation of six prisoners in Newgate Prison in London. In the experiment, six condemned prisoners were variolated and later exposed to smallpox with the promise of freedom if they survived. The experiment was a success, and soon variolation was drawing attention from the royal family, who helped promote the procedure throughout England.
The problem with variolation was that sometimes the smallpox that developed was severe.  This led Jenner to his famous development of the first vaccination (from the Latin vacca, cow).  Supposedly he noticed that milkmaids had clear complexions, completely lacking the smallpox scars that were quite common for other people.  About 1798, he took samples of cowpox, a virus related to smallpox, and used that for inoculations with great success.


Classical vaccines are of three types.  The first to be developed were inactivated proteins.  These were made from protein (toxins) secreted by pathogenic bacteria as they are grown in culture.  [link]  The toxins are collected from the supernatant, concentrated and inactivated with formaldehyde.
Diphtheria once was a major cause of illness and death among children. The United States recorded 206,000 cases of diphtheria in 1921, resulting in 15,520 deaths. Diphtheria death rates range from about 20% for those under age five and over age 40, to 5-10% for those aged 5-40 years. Death rates were likely higher before the 20th century. Diphtheria was the third leading cause of death in children in England and Wales in the 1930s. 
Since the introduction of effective immunization, starting in the 1920s, diphtheria rates have dropped dramatically in the United States and other countries that vaccinate widely. Between 2004 and 2008, no cases of diphtheria were recorded in the United States. 
There is a famous dog named Balto, part of a sled team that brought the diphtheria vaccine to Nome, Alaska during an outbreak in the 1920s.  I'm not sure he'd be happy about it but Balto was stuffed and currently resides in a museum in Cleveland.

[Update:  the treatment for diphtheria was not a vaccine, but an "antitoxin", that is, antibodies to the toxin protein.  Antitoxins are commonly made in large animals, especially horses.  There is a vaccine, but that came later. ]

The other types of vaccine use the actual infectious agent.  There is a vaccine for tuberculosis called BCG, which is an attenuated strain of Mycobacterium bovis, a close relative of Mycobacterium tuberculosis.

A vaccine against a virus like polio may be inactivated or alternatively, a live but attenuated virus.  These vaccines depended originally on the development of mammalian cell culture and its use to produce virus in quantities large enough to immunize large numbers of people.

Inactivation is by treatment with chemicals (e.g. formaldehyde) or possibly UV.  Attenuation means that the virus grows, but does not cause disease, either because it grows too slowly, or because it is unable to infect a particular tissue (poliovirus in the nervous system).  Finding an attenuating mutant is a laborious process.  In testing pathogenicity, there is no substitute for tests with live animals.

Besides tetanus, the first group of toxoid vaccines includes diphtheria.  An example of the second is the Salk polio vaccine, and the third class includes many important vaccines including

- Yellow fever
- MMR (measles, mumps and rubella)
- Varicella (chickenpox)
- the oral (live) Sabin polio vaccine
- live (nasal spray) influenza vaccine
- Smallpox

In the last 30 years or so, vaccines have been developed for Streptococcus pneumoniae using its capsular polysaccharide, which may be chemically attached (conjugated) to a protein to make it more immunogenic.  The capsule is important in preventing engulfment (phagocytosis) by cells of the immune system called neutrophils.  Antibodies to the polysaccharide capsule allow phagocytosis and prevent disease.

Most viruses are first encountered by the host on mucosal surfaces in the mouth, the gut (digestive tract) or the airways (respiratory tract).  In these environments a particular type of antibody called IgA is predominant.

Vaccine dogma holds that to produce strong immunity, a good IgA response is essential.  The vaccine type which is best at achieving a good IgA response is thought to be an attenuated live virus vaccine, because it is the gift that keeps on giving.  Continuous presence of the antigen drives antibody producing cells (B-cells) to switch to making IgA.

Three vaccines


The news of the success of the killed polio vaccine developed by Jonas Salk was greeted with joy in 1955.  This vaccine is called the Salk vaccine or IPV (inactivated poliovirus vaccine), and it's based on three wild, virulent reference strains:  Mahoney (type 1 poliovirus), MEF-1 (type 2 poliovirus), and Saukett (type 3 poliovirus).  The three viruses are grown in tissue culture (Vero cells, a type of monkey kidney cell), and then collected and inactivated with formaldehyde (formalin).

Salk's approach was widely criticized as dangerous, and indeed, in that same year a batch from the Cutter Laboratories in Berkeley, California was not properly inactivated and it resulted in 200 cases of polio and 11 deaths.


A different vaccine using attenuated live virus was developed by Albert Sabin, called the oral polio vaccine (OPV).  It came into use around 1961.  The OPV contains the three prevalent serotypes of poliovirus which have been passaged for many generations in tissue culture and accumulated mutations that do not interfere too much with growth in the gut, but prevent neurovirulence, growth in the nervous system.  The individual components of the MMR vaccine were also developed in the 1960s.

It seems that the main reason the oral polio vaccine won out at that time was that the Salk vaccine is an injection, while the live virus is administered orally, often a drop or two on a sugar cube.  Kids prefer the sugar cube.  Also, sterile syringes are in short supply around the world.  As an attenuated live virus, the Sabin vaccine is expected to be better at inducing a good IgA response.  There is also evidence that the Sabin vaccine interferes with shedding of live poliovirus when they are both present, causing the live virus to die out.

The attenuated virus in the Sabin vaccine has been sequenced (years later).  Although there had been speculation about deletion mutations, apparently there are just a number of single nucleotide substitutions.  Only a few of these are responsible for the attenuation phenotype in any given vaccine strain.  [review]

It happens rarely that the OPV causes authentic polio, a phenomenon called VAPP (vaccine-associated paralytic polio).  The sequence of some VAPP strains shows that at least some of the attenuating mutations in one of the vaccine strains have reverted to wild type.  Also, it turns out that there are other naturally occurring viruses in the same family as poliovirus (Picornaviridae, for small RNA virus), and these may sometimes recombine with the live Sabin virus to give a recombinant that has recovered the ability to grow well in the nervous system (ref).

In any event, this reversion is problematic not so much for the vaccine recipient (who is well on the way to immunity) but for other residents of an area with poor sanitation.  The transmission of vaccine strains to others in poor communities was originally held to be a feature, not a bug.

VAPP, which is extremely rare, finally became of crucial importance when polio was eliminated from most parts of the world.  Then, the risk of developing authentic polio outweighed the advantage of the oral vaccine.  When our son was born, we wrestled with the question of which vaccine to use for immunization.  We chose Salk's IPV.  Later, the vaccine schedule was changed to have the killed virus first and the live virus for boosters.

Here is a slide that shows the difference between IPV and OPV in the IgA response.

Although an IgA response is desirable, it is obviously not essential.  IPV works well.  That may well have something to do with the real pathogenesis happening in neural tissue.


The third vaccine to talk about is the live influenza vaccine.  As a virus, influenza is unusual because it has a segmented genome:  each gene is on a separate piece of RNA within the virus particle.

In epidemiology, influenza is different from most viruses because it mutates rapidly enough to escape the immune response within two or three years.

As with all viruses, a major component of the immune response is directed against proteins (antigens) on the surface of the virus.  For influenza these are HA (hemagglutinin) and NA (neuraminidase).

HA (hemagglutinin) is a viral protein that binds to the receptors for influenza virus on animal cells.  These receptors are proteins with sugar chains attached to them and with the special sugar sialic acid attached at the end.  HA binds to sialic acid.  There are subtle differences in the structure of different forms of sialic acid between tissues or say, comparing human and avian hosts.  Some HA bind better to human cells.  HA is sometimes further abbreviated as H.

There is a second surface protein called NA (neuraminidase), further abbreviated N.  Its role is to cleave sialic acid from the sugar chain when newly synthesized virus is leaving the cell.  Different strains of influenza virus have different types of H and N.

The influenza virus from the 1918-20 pandemic, that has circulated in various permutations ever since, is H1N1.  The virus of the 1957-58 pandemic was H2N2, and that for 1968-69 was H3N2.

This antigenic drift means that a new vaccine must be formulated to best match the viruses that are circulating in any particular year.  The segmented genome makes it fairly easy to produce a new vaccine strain by coinfecting cells with the virus whose HA and NA you want, plus the old vaccine strain.  Reassortment of viral genome segments produces different combinations which can be screened to find the desired type that has attenuation mutations but the target HA and NA.
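The screening problem can be sketched in a few lines of Python.  This is a toy enumeration, assuming the eight genome segments of influenza A and assuming every segment is picked independently from one parent or the other, which real coinfections don't strictly obey.  The labels "vaccine" and "circulating" and the function names are just for illustration.

```python
from itertools import product

# The eight RNA segments of influenza A
SEGMENTS = ["PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS"]


def reassortants(parents=("vaccine", "circulating")):
    """Yield every way of drawing each segment from one of two parents."""
    for combo in product(parents, repeat=len(SEGMENTS)):
        yield dict(zip(SEGMENTS, combo))


def is_desired(virus):
    """HA and NA from the circulating virus, everything else from the vaccine strain."""
    return all(
        source == ("circulating" if segment in ("HA", "NA") else "vaccine")
        for segment, source in virus.items()
    )


all_viruses = list(reassortants())
wanted = [v for v in all_viruses if is_desired(v)]
print(len(all_viruses), len(wanted))
```

With two parents and eight segments there are 2^8 = 256 possible combinations, and exactly one of them has the circulating HA and NA on the vaccine strain's other six segments, which is why a screening step is needed.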

However, this also means that it is fairly easy (in evolutionary time) for completely new HA and NA genes to be transferred from avian flu strains to human flu strains.  This is called antigenic shift, and it happened successfully in 1918, 1957 and 1968.  The 2009 virus came from pigs (although it must be noted that we gave it to them first, in 1918, so it's not really fair to complain).

Some more slides from one of my old lectures:

Other approaches

HPV (Human papilloma virus) comes in many serotypes.  These viruses cause warts, but certain types are also associated with various cancers.

HPV is a small DNA virus and the proteins of the virus can self-assemble into a virus particle.  One HPV vaccine is made from hollow "virus-like-particles" assembled from proteins made by recombinant DNA methods, without the genome inside.

There are other approaches to making vaccines that have the promise of more rapid scale-up including DNA and RNA vaccines, and recombinant proteins (e.g. the S or Spike protein).  Also of importance are viruses that have been modified to not be harmful but can be used as carriers to express the desired coronavirus protein antigens.  These include adenovirus and retrovirus-based vaccines.

For many of these, there is a concern that they will not boost the immune system as much as is necessary to give a good response.  Various compounds (or even killed bacterial cells) act as adjuvants, treatments that increase the immune response.  Many different combinations are possible and probably most of them are being tried.

It is noteworthy that the subunit vaccine (recombinant proteins) for pertussis (whooping cough) doesn't seem to be as active as the old vaccine, which had the drawback that most kids had soreness and redness at the site of the injection for several days.


We do not yet have successful vaccines for several important targets.  These include HIV and malaria.  In the case of HIV, the success of therapeutics may have something to do with the lack of progress on a vaccine, but it is a problem where many approaches have failed.


There are a couple of negative points to mention about vaccines.

One is that it is important to keep the vaccine formulation sterile.  A tragic early event was the death of some Australian children due to injection with diphtheria vaccine that was contaminated with Staphylococcus aureus (the Bundaberg disaster).  The famous Australian scientist Macfarlane Burnet had a role in solving this mystery.

A second issue is that occasionally, for a particular vaccine, the immune response makes an infection with the virus worse.  The example we have of this is a Dengue vaccine called Dengvaxia.  However, this hasn't been seen with other viruses, or even other vaccines for Dengue.


Finally, there is the issue of Thimerosal.  After problems with vaccine preparations which had been stored and allowed bacterial growth, it became the practice to add low concentrations of a compound containing mercury called thimerosal.

In the 1990s anti-vaccine activists popularized the notion that thimerosal in vaccines may have a connection to autism.  In response, thimerosal was removed from vaccines for children by 2001.  Rates of autism haven't changed, and there was never any evidence connecting thimerosal to autism.  One prominent activist, Andrew Wakefield, was found guilty of scientific fraud.  (He was motivated by payments by a group of lawyers who hoped to profit by suing vaccine manufacturers).

Non-human vaccines

There are a number of important vaccines for animals, including pets.  The "core" vaccines recommended for all dogs are rabies, distemper, parvo, and adenovirus.