Hidden Dimensions of Vibrio cholerae Pathogenesis

What actually happens during an infection is still largely unknown. Microbial virulence factors are very small and not very abundant, so the act of detecting them might actually perturb or destroy them. It’s thought that many bacteria, including V. cholerae, change their gene expression patterns upon entering the host.

Over the years, we have developed methods to observe host-pathogen systems and find out what happens during infection, notably in vivo expression technology (IVET) and signature-tagged mutagenesis (STM). We have been using these tools to learn more about Vibrio cholerae pathogenesis.

A little history lesson:

Since the time of Hippocrates, people have been suffering from cholera. Cholera is a really gnarly disease that causes diarrhea so aggressive that its victims can die of dehydration. It’s spread through the fecal-oral route.

Before it was known that Vibrio cholerae causes cholera, people relied on all kinds of folk remedies. Then Dr. John Snow came along and drew his famous map of the 1854 London outbreak. He plotted the disease cases and noticed that they clustered around a single water pump, on Broad Street. Lo and behold, once the pump was taken out of service the outbreak subsided, and the well was later found to be contaminated by a nearby cesspool into which a sick baby’s diaper had been washed.

Now, we need to go much further in our understanding of this disease and the factors that influence it. Some cholera outbreaks coincide with seasonal algal blooms (V. cholerae clings to the surface of aquatic plants and crustaceans), but the factors that enable the bloom and spread of the pathogen are not well understood.

What do we want? Vaccines!

Above all else, we would love to develop a vaccine for cholera. V. cholerae lives on the mucosa of the small intestine, so it’s thought that a vaccine would need to trigger a host response there. People have created vaccines that do in fact protect against cholera. However, they also cause diarrhea, cramps, nausea, and vomiting (just not deadly). The challenge, then, is to uncouple the factors that make the vaccine work (induce immunity) from those that provoke an aggressive reaction from the body. Luckily, we have a few fancy methods that help us look at V. cholerae in vivo.

Analytic Methods

IVET (in vivo expression technology) lets us create a system in which (1) you have a whole library of pathogen strains, each carrying a different transcriptional fusion, and (2) only the strains whose fusion is induced (i.e. they express the gene you inserted) survive.

STM (signature-tagged mutagenesis) involves randomly inserting bits of DNA into the genome and tagging each inserted piece with a short sequence tag. At every step of the procedure, you can do PCR with primers for those tags, and then hybridize the DNA to a “master dot blot” that has all the complementary tags lined up in order, so you can see which insertions are still around. After infection, you collect all the surviving bacteria and check which tags are still present. If a tag is missing, the mutant carrying it was out-competed by the bacteria that could still infect, so the gene that was knocked out is probably useful for infection!
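The comparison at the heart of STM is a set difference between the tags that went into the host and the tags that came back out. Here is a minimal sketch of that bookkeeping, assuming the dot-blot readout has already been reduced to lists of tag names. The tag names, the gene names, and the function `attenuated_tags` are all made up for illustration; they are not from any real screen.

```python
def attenuated_tags(input_pool, output_pool):
    """Return tags present before infection but missing afterwards."""
    return sorted(set(input_pool) - set(output_pool))


# Hypothetical map from signature tag to the gene that the insertion disrupted.
tag_to_gene = {
    "tag01": "geneA",
    "tag02": "geneB",
    "tag03": "geneC",
    "tag04": "geneD",
}

input_pool = ["tag01", "tag02", "tag03", "tag04"]   # mutants fed to the host
output_pool = ["tag01", "tag03"]                    # mutants recovered afterwards

lost = attenuated_tags(input_pool, output_pool)
candidates = [tag_to_gene[tag] for tag in lost]
print("Tags lost during infection:", lost)
print("Candidate infection genes:", candidates)
```

In a real experiment the readout is a hybridization signal rather than a clean yes/no, so some cutoff would be needed before this step.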

But really, let’s get to the star of the show: RIVET

RIVET stands for recombination-based in vivo expression technology. The goal here is to detect in vivo induced genes (ivi). Just like a lot of the other technologies for genetic manipulation, this tech takes advantage of a reporter gene, in this case tnpR, which codes for the site-specific DNA resolvase of the transposon Tnγδ (written Tngd in these notes). Let’s remember that a resolvase is an enzyme that performs recombination, so this is an enzyme that will only perform recombination at a specific site or sequence. The resolvase mediates recombination between sites known as res sites.

  1. Insert two res sites into the V. cholerae genome, flanking a reporter gene such as an antibiotic resistance gene (presumably using standard restriction enzyme insertion).
  2. Use restriction enzymes to digest the microbe’s entire genome, and clone each fragment into a conditional plasmid just upstream of a promoterless tnpR gene. tnpR has no promoter of its own, so it will only be expressed if the inserted gene fragment is expressed.
  3. Introduce the conditional plasmid into the microbe with the ‘res-ed up’ genome. The conditional plasmids will integrate into the main genome, presumably by homologous recombination (I’m very unclear on this step).
  4. If the inserted gene is expressed during infection, you will also get tnpR expression. The resolvase will then recombine the res sites, excising the reporter gene (antibiotic resistance).
  5. * You need to carefully select strains of the microbe (V. cholerae) with the correct level of resolution in vitro. (What is resolution?) My guess is that resolution means excision of the reporter gene, and that “resolution in vitro” means you get excision even when no infection is happening. If you select strains that have no activity in vitro, they probably won’t have activity in vivo either. But if you select strains with some activity in vitro, then the gene you identify as being involved in infection may be noise. They say to err on the side of inactivity.
  6. * Once the strains are screened for activity in vitro, you also have to screen them for pathogenicity. You put the buggies into animals, and then retrieve them from infected tissues.
  7. * TLDR: If the inserted gene is expressed during infection, the strain goes from antibiotic resistant to antibiotic sensitive.
  8. ??? I don’t understand how killing them helps select them. (Step 9 suggests it doesn’t, at least not directly: the sensitive strains used to be found by replica plating.)
  9. Also, adding in a res cassette that incorporates the sacB gene
    (makes bugs die on sucrose) in addition to the antibiotic resistance gene allows us to do both positive and negative selection. If your inserted gene is expressed, you get both sacB and antibiotic resistance cut out of your genome. Now you die on antibiotics, but you survive on sucrose. This solves the problem of figuring out which strains actually had tnpR activity (which used to be done by replica plating).
  10. What if the genes that are being turned on during infection are actually still expressed at baseline in vitro? We create different varieties of the tnpR gene, each with mutations in the ribosome binding site that make it translate less protein for a given amount of transcript. ??? Somehow, this lets us detect the change in transcript abundance for these genes induced by infection ???
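To make the logic of the steps above concrete for myself, here is a toy model. It is a sketch under big assumptions, not the real biology: I assume the cassette is excised once "promoter activity × ribosome-binding-site efficiency" crosses a threshold, and every name and number in it (`Fusion`, `is_resolved`, `rbs_efficiency`, the activity values) is invented for illustration. Under that assumption, it also suggests one way the weakened ribosome binding sites in step 10 could help: they pull the baseline signal under the threshold while leaving the induced signal above it.

```python
from dataclasses import dataclass


@dataclass
class Fusion:
    name: str
    in_vitro: float  # promoter activity in lab culture (arbitrary units)
    in_vivo: float   # promoter activity during infection (arbitrary units)


def is_resolved(activity, rbs_efficiency=1.0, threshold=1.0):
    """Assumed rule: the cassette is excised once resolvase output crosses a threshold."""
    return activity * rbs_efficiency >= threshold


def phenotype(resolved):
    """Phenotype of a strain whose res cassette carries antibiotic resistance plus sacB."""
    if resolved:  # cassette has been cut out of the genome
        return {"antibiotic": "sensitive", "sucrose": "survives"}
    return {"antibiotic": "resistant", "sucrose": "dies"}


def classify(fusion, rbs_efficiency=1.0):
    """Sort a fusion strain by where (if anywhere) it resolves."""
    if is_resolved(fusion.in_vitro, rbs_efficiency):
        return "resolved in vitro (discard)"
    if is_resolved(fusion.in_vivo, rbs_efficiency):
        return "in vivo induced"
    return "not induced"


quiet = Fusion("quiet_gene", in_vitro=0.1, in_vivo=5.0)
leaky = Fusion("leaky_gene", in_vitro=2.0, in_vivo=20.0)

print(classify(quiet))                      # silent in the lab, on in the host
print(classify(leaky))                      # already resolves in the lab
print(classify(leaky, rbs_efficiency=0.2))  # same gene with a weakened RBS
print(phenotype(resolved=True))
```

With these made-up numbers, the leaky fusion is thrown out when tnpR has a full-strength ribosome binding site, but is picked up as in vivo induced once the binding site is weakened.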


They are using the tech to identify genes in V. cholerae that are turned on when the bacteria colonize the cyanobacterium Anabaena. The idea is to figure out what enables V. cholerae to survive in aquatic environments. They have also found a whole set of genes that are expressed when V. cholerae colonizes the intestinal space, and were able to figure out when those genes are expressed (relative to time of infection). More later if relevant.


BERT – Embedding Algorithm

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding



Basic Overview:

BERT is a pre-trained neural network that has learned to take in sentences with some words obscured and predict those obscured words (their dictionary id, at least). It is a deep, bidirectional network, and it can be fine-tuned to a specific question-answering or other language task with a “small” amount of training data. At the time of writing, it is part of the ensemble method that is winning the SQuAD challenge, the question answering dataset out of Stanford.
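To see what "obscured words" and "dictionary id" mean in practice, here is a minimal sketch of how the training examples could be built. It only covers the data side (hiding words and recording their ids), not the network. The function `mask_tokens`, the toy vocabulary, and the simplification of always substituting `[MASK]` are my own assumptions; the 15% masking rate is the one the BERT paper reports.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and record the vocabulary id of each hidden one."""
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}  # position -> vocabulary id the model should predict
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = vocab[tok]
    return masked, targets


tokens = "the bacteria colonize the small intestine".split()
# Toy vocabulary: one integer id per distinct word.
vocab = {tok: idx for idx, tok in enumerate(sorted(set(tokens)))}

masked, targets = mask_tokens(tokens, vocab, mask_prob=0.5, seed=1)
print(masked)
print(targets)
```

The model only ever sees `masked`; `targets` is what its predictions are scored against.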

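BERT is meant to be used for transfer learning (taking the pre-trained weights of a neural network, adding a layer or a few on top, then training on a specific task). In the simplest version the pre-trained weights are fixed and only the new layers are trained; the BERT paper itself fine-tunes all of the weights. It is NOT a set of embeddings.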
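Here is a toy sketch of the frozen-weights version of transfer learning, in plain Python so it runs anywhere. It is not BERT: `FROZEN_W` is a made-up stand-in for a pre-trained network, the "task" is two hand-written data points, and the new layer is a single logistic unit. The only point is to show which numbers get updated during training and which do not.

```python
import math

# Stand-in for a pre-trained network: fixed weights that training never touches.
FROZEN_W = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0], [0.3, -0.7]]


def frozen_features(x):
    """Map a 2-d input to 4 features using the frozen weights (one tanh layer)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def predict(head, bias, x):
    """New task-specific layer: a logistic unit on top of the frozen features."""
    z = sum(h * f for h, f in zip(head, frozen_features(x))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only the new layer with per-example gradient descent on log loss."""
    head, bias = [0.0] * len(FROZEN_W), 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = frozen_features(x)
            err = predict(head, bias, x) - y  # gradient of log loss w.r.t. z
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


# Tiny made-up task: label 1 when the first coordinate is the larger one.
data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
head, bias = train_head(data)
print(predict(head, bias, [1.0, 0.0]), predict(head, bias, [0.0, 1.0]))
```

Fine-tuning as done in the BERT paper would differ in one place: the update step would also adjust the pre-trained weights instead of leaving them fixed.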

The model architecture is a multi-layer bidirectional Transformer encoder. We’ll find out what a Transformer encoder is in a moment, but an implementation is available in the tensor2tensor library, and there is an annotated walkthrough of the Transformer here: http://nlp.seas.harvard.edu/2018/04/03/attention.html