Wednesday, June 14, 2017

male speaker: please join me in welcoming today's speaker: dr. elaine mardis. [applause]

elaine mardis: thanks. all right. thanks so much, andy. i always feel so much better about myself after i hear your introductions. it's fantastic to be here, and thank you for joining me today to talk about my first love, which is sequencing technologies, so i think we should just get going without hesitation and jump right in. now, i do have one conflict to announce, just to keep in mind throughout the presentation, which is that i am on the supervisory board for qiagen, a company based over in europe.


okay, so let's jump right in and talk about what i like to refer to as massively parallel sequencing. you'll also hear me slip up sometimes and call it next-generation sequencing; they're essentially the same thing. i want to walk you through the basics behind this, then move forward into some newer sequencing technologies, and finish up with a little vignette on how we're using all of these technologies, and the associated bioinformatics pipelines, to begin making inroads in changing the course of outcomes for cancer patients. so i'll leave you with that little teaser, and you can hopefully enjoy it at the end. as andy said beautifully in his introduction, massively parallel sequencing and next-generation sequencing have really transformed biomedical inquiry.


you can see this output-per-instrument-run figure, shown here from a little perspective that i wrote for nature in 2011, cited at the bottom of the slide, that shows the magnificent jump in the amount of sequence data we could generate with the advent of next-generation sequencing devices between 2004 and 2006. but above and beyond the sequence output, which has continued to climb in a radical way, as i'll talk about in a moment, there are other procedural aspects of next-generation sequencing that have freed us from some of the old ways and contribute to this overall acceleration in our ability to generate sequencing data. so, just for purposes of illustration, you can see sequencing as i learned it back in the day, where we did a lot of bacterial work with sub-cloning, plating, and dna preparation of individual sub-clones, and, importantly, we did a separate set of sequencing reactions on those sub-clones, followed by a separate electrophoresis and detection step.


so the molecular biology of sequencing was really decoupled from the actual sequence data generation. the contrast here is remarkable. if you look at panel b, which illustrates the step-wise process for next-generation sequencing: it starts with standard fragmentation of the dna, done using sound waves or other shearing-based approaches; we do some repair on the ends of the dna and put adapters onto the ends with a ligase enzyme; attach these to some kind of surface; amplify those in situ; and then proceed, importantly, to a combined molecular biology and sequencing detection step.


so, rather than the separate processes of old-style sanger sequencing, next-generation or massively parallel sequencing does everything together at the same time. let me illustrate in a little more detail how massively parallel dna sequencing works. first of all, as i already alluded to, you have to create a library to do sequencing on, but library generation is actually a very rapid process. it can be completed even by a high school senior with a reasonable attention to detail -- my daughter is a testament to this back in the day -- and can be done in the space of an afternoon, or maybe a day if you're not in a super-big hurry.


so this library approach is really just using these custom linkers, or adapters, as they're called. they're attached to the ends of the dna with a ligase enzyme, as i mentioned, and over time the genesis of different library kits has led to kits with much more efficient ligation procedures. this is important for low-input dna, which we'll talk a little bit about. and this is now a wide-open field for additional commercial development in terms of improving library kits for next-gen sequencing platforms. the other aspect, which i showed in that mini-figure at the bottom of the last slide, is that we do need an amplification step for these resulting adapter-ligated fragments. why is that?


well, unlike some of the technologies i'll talk about in a few minutes, this is sequencing of a population of fragments, so we require amplification of each one of these library fragments so that the downstream molecular biology and detection actually work. in other words, the instruments i'll be talking about for the next few minutes aren't sufficiently sensitive to sequence from a single molecule. rather, they need that molecule amplified into multiple copies in order to generate sufficient signal to be seen by the imaging optics or other detectors on the sequencing instrument. the way this amplification is accomplished is by enzymology. first, there is attachment to a solid surface; this can be either a bead, which is round -- spherical, of course -- or a flat silica-derived surface, depending on the technology, as i'll illustrate in a few minutes.


and the way that attachment happens is that the surface of the bead or the flat glass has complementary adapters covalently attached to it, so they're available there for hybridization of the library fragments, followed by an enzymatic amplification. it's a really straightforward way to attach these library fragments onto the surface and get them amplified up so that you can see them. the next step is the combined molecular biology of sequencing, often referred to as sequencing by synthesis, or sbs, with detection of the nucleotide base or bases that have been incorporated in the molecular biology reaction.


so this is a step-wise process, as you'll see from the illustrations that follow: you provide the substrates for sequencing, let the sequencing reaction happen, and then detect it as a subsequent step. really, the thing that distinguishes massively parallel sequencing from sanger sequencing, as i've already alluded to, is that we're not sequencing 96 reactions at a time, which was the maximum per-machine throughput in the past for electrophoresis and detection approaches; rather, because we can use multiple beads or decorate the surface of this flat silica glass with hundreds of thousands of library fragments, we can literally generate the dna sequence needed to sequence an entire organism's genome in a single run of the instrument.


so you're really talking about massively parallel sequencing as hundreds of thousands to hundreds of millions of reactions being detected in this step-wise process on the instrument, all at the same time. the throughput acceleration, as you saw from that graph, is extraordinary compared to what we used to be able to do with sanger, where we would just buy more and more machines the more sequencing we needed to do. and then, lastly -- and i'll talk about this a little bit -- keep in mind that because each one of these amplified fragments starts as an original single molecule, you're really getting digital read-type information. what that means is that if we have, for example, a portion of the human genome that's amplified -- so, multiple copies beyond the normal diploid; a great example would be her2 amplification in specific subtypes of breast cancer -- you can literally go in and count the amplification of that locus relative to diploid regions and understand in exquisite detail the number of extra copies that are there and, in whole genome sequencing, the exact boundaries on that chromosome where the amplification starts and stops.


so this is incredibly powerful. and, turning to rna, you can also calculate very exact digital expression values for given genes from rna sequencing data, which starts with rna, converts it to dna, and goes through very similar sequencing processes, as i'll describe for dna here in just a moment.
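to make that digital-counting idea concrete, here is a minimal sketch -- not the pipeline used in the talk -- of how copy number might be estimated from read depth; the depths and the normalization below are invented for illustration, and real workflows also correct for gc content, mappability, and tumor purity.

```python
# toy illustration: estimating copy number from read-depth ratios
# (hypothetical numbers; real pipelines add gc, mappability, and purity corrections)

def copy_number(tumor_depth, tumor_genome_avg, normal_depth=None, normal_genome_avg=None, ploidy=2):
    """estimate an integer copy number at a locus from relative read depth."""
    ratio = tumor_depth / tumor_genome_avg
    if normal_depth is not None and normal_genome_avg is not None:
        # normalize against the matched normal to remove locus-specific bias
        ratio /= (normal_depth / normal_genome_avg)
    return round(ploidy * ratio)

# a locus covered ~8x deeper than the diploid genome average suggests ~16 copies
print(copy_number(tumor_depth=480, tumor_genome_avg=60))                                   # -> 16
print(copy_number(tumor_depth=480, tumor_genome_avg=60, normal_depth=30, normal_genome_avg=30))  # -> 16
```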


lastly -- and we'll get to this important point in just a moment when it comes to talking about bioinformatics, or the analysis of sequencing data -- unlike the sanger sequencers of the past, one of the downsides, or confounders, of massively parallel sequencing is that the overall read length, the number of bases you sequence from any dna fragment, is quite a bit shorter than we're used to seeing from conventional sanger sequencing. typically, back in the day, sanger read lengths were on the order of about 800 base pairs, let's say for the sake of argument. most massively parallel sequencers will give you around 100 to 200, maybe 300, base pairs; so, significantly less. and when we're talking about analyzing data from a whole human genome, this can lead to some significant consequences in the analysis of those data.


okay, so let's get a little deeper into the weeds on the molecular biology steps and other aspects of massively parallel sequencing. i mentioned the need to construct a library prior to sequencing. as i talked about already, we fragment the dna into smaller pieces, starting from high molecular weight isolated genomic dna, for example. there are a variety of enzymatic steps listed here that are a workup to the adapter ligation, which is important for the subsequent amplification of fragments and for the sequencing as well.


these are really just the step-wise processes. we can also now include in this adapter a so-called dna barcode, which is a stretch of eight or more nucleotides with a defined sequence. what that means is that we can ultimately take fragments from different libraries, mix them together into an equimolar pool, sequence them all together with this increasing throughput that i told you about, generating data from a multitude of different individuals at the same time, and then deconvolute that pool using the dna barcode information once we have the sequence data available to us. so the barcode just takes its place as another read that samples the dna barcode sequence to identify the library that the fragment came from.
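as a hedged illustration of that deconvolution step, here is a small sketch of assigning pooled reads back to their source libraries by barcode; the barcode sequences and reads are made up, and production demultiplexers also handle quality scores and dual indexes.

```python
# minimal demultiplexing sketch: assign each read to its source library by its barcode read,
# allowing one mismatch (the barcode sequences below are invented for illustration)

BARCODES = {"ACGTACGT": "library_A", "TGCATGCA": "library_B"}

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def assign_library(barcode_read, max_mismatches=1):
    for barcode, library in BARCODES.items():
        if hamming(barcode_read, barcode) <= max_mismatches:
            return library
    return "undetermined"

reads = [("ACGTACGT", "TTGACC..."), ("TGCATGCG", "GGATTA..."), ("AAAAAAAA", "CCGTAA...")]
for barcode_read, insert_seq in reads:
    print(assign_library(barcode_read), insert_seq)
# -> library_A, library_B (one mismatch tolerated), undetermined
```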


once we have these libraries created, we do a quantitation. that tells us the dilution of the library that should go onto the sequencer or into the pool, and then we either proceed directly to whole genome sequencing, or we can perform exome sequencing or specific-gene hybrid capture approaches, which i'll tell you about next. so i want to talk first about this required amplification of the library, which uses pcr from the adapter sequences to increase the number of fragments in the sequencing population, and the reason for mentioning this is that there are some confounders here that you have to know about for the downstream data analysis, so let me walk you through this.


obviously, pcr has been with us for quite some time now, since the mid-'80s, and it's a very effective way of amplifying dna, but there are some downsides to it. in massively parallel library construction, because we're doing pcr after the adapter ligation, you can get preferential amplification; this is sometimes referred to as jackpotting. what this means is that smaller fragments tend to amplify better, so you may get an overrepresentation of those fragments in the read population, which leads to duplication. the problem here is that if you also incorporate, for example, a polymerase error early in the pcr amplification, the multiple copies that come from duplicate reads can make that error masquerade as a true variant when you go downstream to analyze the data, so we need to be aware that jackpotting can occur.


this used to be a problem, but we now have good algorithms that essentially look for the exact same start and stop sites among reads aligned to the genome and eliminate all but one representative copy of those duplicate reads. so this is algorithmic now, whereas we used to have to do very careful examination of the read alignments back in the day. i think about jackpotting a lot in my work, because in cancer samples you often receive a very, very tiny amount of tissue from which dna can be extracted, and of course the less dna you have to put into a library, the more of a problem this becomes. and with formalin fixation and paraffin embedding, the dna is often fragmented into small pieces even before the fragmentation step, and smaller pieces lead to an increase in jackpotting and duplicate reads, so this is a real concern for the data analysis piece.
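the duplicate-marking idea can be sketched very simply; this toy example, with invented alignments, keeps one representative per identical start/stop position, which captures the spirit (though not the exact logic) of tools like picard markduplicates.

```python
# sketch of duplicate marking: alignments sharing the exact same chromosome, start, and end are
# treated as pcr duplicates, and only one representative (here, the highest-quality one) is kept.

def mark_duplicates(alignments):
    """alignments: list of dicts with chrom, start, end, mean_base_quality."""
    best_by_position = {}
    for aln in alignments:
        key = (aln["chrom"], aln["start"], aln["end"])
        kept = best_by_position.get(key)
        if kept is None or aln["mean_base_quality"] > kept["mean_base_quality"]:
            best_by_position[key] = aln
    return list(best_by_position.values())

alignments = [
    {"chrom": "chr17", "start": 100, "end": 250, "mean_base_quality": 35},
    {"chrom": "chr17", "start": 100, "end": 250, "mean_base_quality": 38},  # duplicate of the first
    {"chrom": "chr17", "start": 400, "end": 550, "mean_base_quality": 30},
]
print(len(mark_duplicates(alignments)))  # -> 2
```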


i've already kind of alluded to this in my comments about jackpotting, but we can also get false positive artifacts because pcr is an enzymatic process and it can introduce errors; it's not perfect. if an error occurs in early pcr cycles, it will appear as a true variant. and then, lastly, as i'll talk about, there's cluster formation, or amplification: once you get these fragments onto the solid surface where the amplification happens, that's also a type of pcr, and it can introduce bias in amplifying high or low g + c content fragments. since there's no guarantee of the base content of any given fragment, some will amplify better or worse depending upon the percentage gc, and as a result you may not detect these fragments from the library as well as something with a more balanced atgc content.


so these are all considerations that come from the use of pcr. the trend over time has been to make the dna sequencing instruments more sensitive so that you have to do fewer cycles of pcr in the library amplification and solid-surface amplification processes, and this can lead to a reduction in jackpotting and also better representation of high-gc and low-gc fragments. but i would argue that it's still not perfect just yet, so these are things to be aware of.


now, i alluded to hybrid capture. just historically, when we first had next-generation sequencing instruments, whole genome sequencing was pretty much the only option. but around 2009, several groups developed this approach in various forms and flavors, and it gives us the opportunity to use dna/dna or rna/dna hybridization kinetics to subset the genome down to just the regions we care about. this is often referred to as exome sequencing, where the probes used in hybrid capture correspond to all of the genes annotated in the reference genome, but of course you don't have to look at all the genes. you can look at a subset of the exome, such as all kinases, for example, and generate custom hybrid capture probe sets to do this. so how does this work?


well, like everything, we start from whole genome sequencing libraries, but in this case the sub-setting happens by combining the whole genome library with a capture reagent, which consists of synthetic probes that correspond to the exons of the genes you're interested in. the secret sauce here is these blue circles shown on the surface of the probes, which represent biotinylated nucleotides that are part of those probe sequences. by mixing the synthetic probes with the whole genome library under appropriate conditions for dna/dna or dna/rna hybridization -- sometimes these probes can be rna instead of dna -- you effect a hybridization between the dna fragments in the library and the corresponding sequences of the probes, which represent the regions you want to focus your sequencing on.


the secret sauce, biotin, comes into play when you actually want to isolate the hybrid-captured fragments of your library, and this is done by mixing the whole hybridization mixture with streptavidin-linked magnetic beads. of course, biotin binds very tightly to streptavidin, so then, by applying this horseshoe magnet or some facsimile thereof, you can selectively pull down the hybridized fragments from your library and throw away all of the regions of the genome that you don't want to sequence. this is imperfect, so we do several washes to release additional spuriously hybridized fragments, and then we can simply denature the library fragments away from the captured probes.


the probes stay with the magnetic beads, and the hybridized library fragments that have been selected out float free in solution and can then be amplified and sequenced after they're quantitated. this has actually reached a fairly high degree of art, again, where we can take these barcoded dna libraries, make an equimolar pool, and go against a single aliquot of the exome capture reagent for all of the library molecules from all of the individuals we want to sequence; sequence those out on a single lane of the sequencer, for example; and then deconvolute the data into different pots that correspond to the dna barcodes and, indeed, to the different individuals that have been sequenced. so this is now very, very high throughput, where we can combine 12 to 14 individuals from an exome capture into a single lane of a high-throughput sequencer and get that information very, very rapidly.


and then, just to finish, as i mentioned, we don't have to do the whole exome; we can design custom capture reagents just for specific loci or genes of interest, and only study those, from the standpoint of winnowing down the whole genome to the regions we care most about. now, that's not a perfect or infinite winnowing, so, just to be clear, there's a lower limit of about three or four hundred kilobases of target that you can efficiently sequence through a hybrid capture process. below that, for purposes of sequencing efficiency and decreasing the amount of spurious hybridization or off-target effects, you really need to go to something more like this.


so, back to our old friend pcr. we now have ways to design probes -- sorry, primers -- for pcr amplification across different loci in the genome to amplify out small numbers of genes in their entirety, take these multiplex pcr products, turn them into a library, and sequence them directly, and this is really best for very small regions of the genome -- as i said, below about 300 kilobases -- where you don't want to pay a price in off-target sequencing. this is also not a perfect approach, because, of course, it's hard to come up with pcr primers that all play well together in the same pcr amplification -- gc bias and those sorts of things come into play. but most manufacturers of these primer sets now have the ability either to sell you something that's already configured and well tested in terms of giving good representation, or to work with you to design a custom set of multiplex pcr amplicon primers.


okay, so back to the actual sequencing reaction, now that we've decided to do whole genome sequencing, an exome or subset capture, or multiplex pcr. how does the actual sequencing reaction work? since i'm a chemist by training, this is the part that i really enjoy talking about, so hopefully you'll indulge me here. in this illustration, what we're going to focus on is the illumina sequencing process in particular, so let me walk you through it.


now, earlier i said that we have this flat silica surface with adapters covalently attached to it, and those sequences correspond to the adapter sequences on our library, okay? so once we've got our library quantitated, the instrument will introduce the library fragments onto the surface, and you'll get hybridization of individual fragments under the appropriate conditions. and, of course, the reason for carefully diluting these fragments is so that you get the right distribution of the amplified clusters across the surface that's going to be viewable by the instrument optics. what follows next is a series of amplification steps, which i referred to earlier as bridge amplification.


the reason for that name is that in the course of this amplification, the free end of each library fragment finds its complement down on the surface of the chip, and then you get essentially a step-wise polymerase process that builds increasing numbers of fragments in situ for each one of these hundreds of millions of library fragments on the surface of your chip. so, at the end of this bridge amplification cycle, you might end up with a cluster of fragments that looks like this, on the order of a hundred thousand or so copies of the exact same molecule. if you image this cluster, it looks like this bright dot, and if you look at a bunch of clusters together in one small area of the chip, they look a little bit like a star field. and indeed, the oldest versions of the software for this type of sequencing were really derived from people who had previously been studying deep-space images, so it's a little bit like deconvoluting that: you have to identify the cluster and then isolate its signal from all adjacent clusters as best you can, so that you get the truest set of signals coming out of it.


so we don't sequence this amplified cluster as-is; we actually have to go through a series of steps that releases one free end of all of the molecules in the cluster. there's just a single representative shown here, but onto that freed end we then hybridize a sequencing primer that corresponds to part of the adapter sequence, and this now points down towards the surface of the chip.


and as you'll see here in this blow-up of the to-be-sequenced fragment, we can then get a polymerase molecule to recognize that dna/dna hybrid, and now, with the inclusion of sequencing substrates such as these labeled deoxynucleotides, we can start the sequencing process. this is the amplified fragment shown in isolation, but imagine that there are hundreds of thousands of copies of it in the cluster. they've all been hybridized by this very specific primer sequence here, and at its end we've got a free 3' hydroxyl for the polymerase to begin adding nucleotides. in the illumina process, these nucleotides are very specialized, as they are on other platforms, but in particular they have two attributes that are shown here.


one is that they carry a fluorophore that's specific for the identity of the nucleotide, so a fluoresces at a different wavelength than c, g, and t. the second specialized thing about them is that the 3' hydroxyl group is blocked with a chemical blocker. the reason for this is that in each sequencing-by-synthesis step of the illumina process you ideally want to add in just a single nucleotide at a time, detect it with the optics, and then remove the block from the 3' end so that you can bring in the next nucleotide -- g, in this example -- and cleave the fluorophore, so that when this new g nucleotide gets imaged by the instrument optics, there's no leftover residual t fluorescence to interfere with identifying that it is, in fact, a g that's been incorporated.


looking at it this way, what we ideally end up with at the end of these two cleavage steps is a free 3' hydroxyl and the absence of a fluorescent group where there was one, so that the next incorporation can be successfully detected. now, this is a point where you might be asking yourself, "well, this sounds really great. why can't we just sequence this entire fragment, and make the fragments even longer than 300 or 400 base pairs? then we could get really, really long reads out of this technology and our lives would be simpler." would that be the case? i would love it. the limitation here is signal to noise, okay?


two things contribute to that. one, chemistry is never 100 percent, so although you try to cleave all of these fluorophores off, there will be some residual fluorescence that remains, and that will interfere with subsequent imaging cycles. it might disappear in later cycles, but it may be there to interfere nonetheless; it's unclear and not 100 percent, as i pointed out. similarly, there may be the absence of a blocking group on some of the nucleotides, so rather than incorporating just the t in this first cycle, i might incorporate a t -- or a set of t's without blockers -- and then g's can come in right away, because everything is supplied at once in this type of sequencing, and then i would get a set of fragments that are so-called out of phase.


that means they're now sequencing one nucleotide ahead of everybody else in the population, and over time this is an increasing phenomenon. what happens over increasing cycles of incorporation is that noise increases and at some point becomes equal to the signal being produced by all of the fragments being sequenced in the cluster, and so you begin to lose the ability to define with high accuracy which nucleotide just got incorporated. this has improved over time: the first illumina solexa sequencers that we used back in 2007 had read lengths of about 25 base pairs; the current read lengths are now 150 base pairs, so there has been an improvement over time in the read lengths that are available.
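a toy model makes that read-length limit intuitive: if some small fraction of molecules in a cluster falls out of phase each cycle, the in-phase signal decays geometrically, and the usable read length is roughly where that fraction crosses a threshold. the error rates and threshold below are illustrative, not instrument specifications.

```python
# toy phasing model: a fraction p of molecules gets out of step each cycle (missing or extra
# incorporation), so the in-phase fraction decays geometrically; the usable read length is
# roughly the cycle at which that fraction drops below a chosen threshold.

def usable_read_length(p_phasing_error=0.005, min_in_phase_fraction=0.5, max_cycles=1000):
    in_phase = 1.0
    for cycle in range(1, max_cycles + 1):
        in_phase *= (1.0 - p_phasing_error)
        if in_phase < min_in_phase_fraction:
            return cycle
    return max_cycles

print(usable_read_length(p_phasing_error=0.02))    # noisier chemistry: ~35 cycles
print(usable_read_length(p_phasing_error=0.005))   # cleaner chemistry: ~139 cycles
```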


similarly, after we go through one set of sequencing like i've just shown you, coming from this end of the adapter, we can go through some additional amplification cycles, release the other end through a different chemical cleavage, prime it with a different primer, and now sequence the opposite end of the fragments. this is so-called paired-end sequencing, where we can collect 150 base pairs from each end of the fragment, and those pairs, as i'll talk about in a minute, can be mapped back to the genome of interest. now, just a couple of last slides on illumina. their overall approach has changed: whereas they used to have these open lanes on the silica-derived surface shown here, now the surface is patterned into little pits on the flow cell, so each lane consists of hundreds of millions of these in a very defined order, sort of like a honeycomb; this is the so-called patterned flow cell.


what this allows you to do is pack the clusters very, very closely together and also not have to find the clusters: you know where they are, essentially, based on how the flow cell sits in the instrument and the fact that this set of patterned pits on the surface is a very uniform array. so now what we get -- and this is a highly idealized shot from their website -- is that this is where the amplification reaction takes place, and in the best possible world, all of the regions around this particular well of the patterned flow cell are entirely clean, so you get a very clean, distinct signature from it and from all of its companion wells.


all of its companion wells as well. okay, just to finish with illumina -- andthis is just a shot from their website to show you one thing, which is that you cansequence a little or you can sequence a lot, i think, just basically to cut to the chase.this is sort of their highest throughput instrument, the hiseq x, which can sequence on the orderof -- i think it's like 12 human genomes, you know, in a 24-hour period. so it's very,very high throughput. this is more like the desktop sequencer, and if you talk to peoplein the field about illumina's, you know, strengths and weaknesses, you'll find that the accuracyof the sequencing is high, so less than one percent error rate collectively on both reads.there's a range, as you can see, of capacity


some of these platforms -- the miseq, for example -- have relatively long read lengths; you can get 300-base-pair paired-end reads from it, which improves part of the problem as well, but it's a lower-throughput sequencer, so you can't sequence a whole human genome on it. and there are some improvements that have been coming along over time, including the ability to do cloud computing, which we'll talk about in a minute. now let me shift gears to a different type of sequencer than the fluorescence-based illumina sequencer, just for the sake of completeness. this uses a different idea, which is the fact that when you incorporate a nucleotide into a growing chain that's being sequenced, there's a release of hydrogen ions, and this approach uses that release of hydrogen ions -- the resulting change in ph -- to detect when and how many nucleotides have been incorporated in the sequencing reaction.


this is offered in the form of the ion torrent sequencer, which is available commercially as well. so the idea here is bead-based amplification; you can see the round bead here with the derivatized surface carrying these adapters, onto which the library molecules are individually amplified. the best-case scenario is that each bead represents multiple copies, again, of the same library fragment. this is done in an emulsion pcr approach, where you mix everything together and make micelles that contain, in the best case, a single bead, a single library fragment, and all of the pcr amplification reagents necessary, and then go through pcr-type cycles to decorate the surface of this bead with copies of the library fragment you'd like to sequence.


these beads are then loaded onto a chip, which is this idealized structure here, and it consists of two parts. the upper part, where the bead sits and the nucleotides flow across, is really the molecular biology part of the action, if you will, and the lower part is just a very miniaturized ph meter that senses the release of these hydrogen ions during flows of different nucleotides and registers the corresponding amount of signal, to tell you which nucleotide was incorporated -- based on which one is flowing across at the time -- and how many of them were incorporated. well, how does it get to how many? now, these are native nucleotides, so they have no fluorescent groups, no modifications whatsoever, so keep that in mind.


so if you have a string of a's, for example, in the template being sequenced, you can incorporate as many t's as needed to correspond to the number of a's that are there. the way we discern what got incorporated, and how many, is based on a key sequence, which is shown here. this is on the adapter itself that's used to make the library, and it's the point at which sequencing begins. so when we flow through a defined set of the four nucleotides, we get a signal from each of those incorporations equivalent to a single base worth of ph change, if you will, and that sets the standard for what a single-nucleotide incorporation looks like. so that's the key sequence, which forms the reference off of which all subsequent incorporations are gauged.


so, in that sense, where you have four a nucleotides in a row, you'll have a signal that's approximately four times greater in terms of ph change than a single-nucleotide incorporation, and the software can go through after the run and evaluate this, and the resulting sequence comes out and is available for downstream interpretation. you can see this idealized here, where we have the key sequence and then multiple incorporations, some of which spike to higher degrees of signal -- ph change -- than others. this is really the way that sequencing takes place in the ion torrent system.
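here is a hedged sketch of that flow-based logic: each nucleotide flow yields a signal roughly proportional to the homopolymer length, normalized against the single-base signal set by the key sequence. the flow order and signal values are invented for illustration.

```python
# toy flowgram interpretation: each flow of one nucleotide gives a signal roughly proportional
# to the homopolymer length at that position, normalized to the single-base signal measured
# from the known key sequence. flow order and signals below are invented.

FLOW_ORDER = ["T", "A", "C", "G"]

def call_bases(signals, single_base_signal=1.0):
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = FLOW_ORDER[i % len(FLOW_ORDER)]
        n_incorporated = round(signal / single_base_signal)   # linearity degrades for long runs
        bases.append(nucleotide * n_incorporated)
    return "".join(bases)

# signals for flows T, A, C, G, T: ~1 T, no A, ~3 C's, ~1 G, ~4 T's
print(call_bases([1.1, 0.1, 2.9, 1.0, 3.8]))  # -> "TCCCGTTTT"
```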


this comes in two platforms: more of a desktop sequencer called the pgm, and a larger-throughput sequencer called the proton, and you can see their different attributes here. just to point out, this is not paired-end sequencing; because the read lengths here go up to about 400 base pairs, you just sequence from a single end. so there's no read pairing with this type of approach, but the read length is longer, and if you look across the various attributes of the platform, we really like to use this oftentimes to counter-check the illumina sequencer, because it has a very low substitution error rate. since each nucleotide type flows one at a time with a wash in between, you almost never see a substitution error where the wrong nucleotide got incorporated, because of the way the reagent flow works. however, because of the relative ph change that i just talked about, when you get above a certain number of the same nucleotide occurring in a row, you can lose the linearity of response, and this masquerades as insertion/deletion errors in a single read; however, an averaging approach with multiple reads of coverage across that homopolymer run can usually get you to the right answer.


so it's a consensus accuracy that's really needed in this type of sequencing, and this is a relatively inexpensive, fast-turnaround platform for data production. as i said, we typically use this in the lab for focused sets like the multiplex pcr i talked about earlier, and also for data-checking variants that we want to proceed with from our illumina sequencing pipeline.


okay, let's talk just a little bit about bioinformatics. i don't want to get too deep in the weeds here, but i do want to give you an appreciation of the challenges of short-read sequencing. i'll focus my attention on the human genome, which is the one i study the most, but you can substitute any reference genome in place of the human genome in these slides. really, what we're doing now with these short-read technologies, unlike back in the day with sanger long reads, is not an assembly of the sequencing reads, where we try to match up long stretches of similar nucleotide sequence and build up a contig, or long fragment of sequence, over time. here, with short-read sequencing, and especially with genomes as complicated as the human genome, you actually have to align reads onto the reference sequence rather than assemble them.


the smaller you get in terms of genome size -- so, viruses and some simple bacterial genomes -- you actually can do assembly, but for large, complicated genomes it's really not a practical possibility. so, alignment of reads to the human or another reference sequence is really the first step before you go and identify where the variants exist. and in the spectrum of using paired-end reads, we can identify, for example, a chromosomal translocation, where a group of reads on one end maps to one chromosome and the other end of those fragments maps to another chromosome, thereby identifying that something has gone on there to marry up those two chromosomal segments.
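as a sketch of how paired-end data expose such events, the toy code below flags "discordant" pairs whose two ends map to different chromosomes and reports breakpoints supported by several pairs; the coordinates are invented, and real structural variant callers apply many more filters.

```python
# toy translocation screen: pairs whose two ends map to different chromosomes are discordant,
# and a pile-up of such pairs joining the same two regions suggests a rearrangement.

from collections import Counter

def candidate_translocations(read_pairs, min_supporting_pairs=3):
    """read_pairs: list of ((chrom1, pos1), (chrom2, pos2)) mapped positions of the two ends."""
    support = Counter()
    for (chrom1, pos1), (chrom2, pos2) in read_pairs:
        if chrom1 != chrom2:
            # bin positions coarsely so nearby breakpoint-spanning pairs group together
            key = tuple(sorted([(chrom1, pos1 // 10000), (chrom2, pos2 // 10000)]))
            support[key] += 1
    return [(key, n) for key, n in support.items() if n >= min_supporting_pairs]

pairs = [(("chr9", 133_620_000 + i * 50), ("chr22", 23_630_000 + i * 50)) for i in range(4)]
pairs.append((("chr1", 1_000_000), ("chr1", 1_000_350)))   # a normal, concordant pair
print(candidate_translocations(pairs))   # -> one candidate junction supported by 4 pairs
```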


and i alluded to rna sequencing earlier; rna sequencing data goes through the exact same process of alignment followed by downstream interpretation. there are no differences, and in fact, across all the things you can do with omic data, alignment to the reference genome is really always the first step. just think about it like this: because the human genome is large -- three billion base pairs, lots of repeats, about 48 percent repetitive -- it looks a little bit like this jigsaw puzzle, where all the repeat space is the grass and the sky, and the tree is the genes interspersed into all of this. so here are all of your short-read sequence data, and you have to figure out where they originally came from when you made your library; a lot of the pieces look a lot like each other, so it's difficult to figure out exactly where they go, but when you find this tree, you can place it with a pretty high degree of certainty.


so how do we deal with this confusion about where, and how accurately, a read maps? that's handled because we've been able to come up with a variety of statistical measures of certainty: if a read can map here or here, our best possible guess comes from a variety of mapping scores that tell us the read is most likely to map here as opposed to the other places. this has been tremendously enhanced, i should point out, by paired-end sequencing, because oftentimes one read end maps into a repetitive sequence, but as long as the read from the other end of the fragment maps to a unique sequence, you increase the certainty of mapping for that fragment in the genome.
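the flavor of those mapping-certainty scores can be illustrated with a toy calculation that converts alignment scores at candidate placements into a phred-scaled confidence; this is in the spirit of what aligners report, not any specific tool's formula.

```python
# toy mapping confidence: given alignment scores for the candidate places a read could map,
# estimate the probability that the best placement is correct and express it phred-scaled.

import math

def mapping_quality(alignment_scores, scale=1.0):
    likelihoods = [math.exp(score / scale) for score in alignment_scores]
    p_best = max(likelihoods) / sum(likelihoods)
    p_wrong = max(1.0 - p_best, 1e-10)
    return round(-10.0 * math.log10(p_wrong))

print(mapping_quality([60.0, 20.0]))          # one clearly best placement: high confidence
print(mapping_quality([60.0, 59.5, 59.0]))    # repeat-like near-ties: low mapping quality
```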


so, once we have the reads aligned, we then have to go through a series of steps that i won't spend a lot of time on, but just by their sheer number i want you to know that this is an important aspect before you actually get to variant detection. if you're interested in structural variation or other aspects of read pairing, you have to go find the read pairs that are properly mapped at the distance you expect, based on your average library fragment size, and the ones that are not, if you then want to go on to identify structural variations, as i talked about.


as i talked about. you also want to eliminate -- mark and eliminatethe duplicate reads -- this is done through an algorithmic approach -- correct any localmisalignments -- this is just getting rid of sequences that are aligned improperly -- calculatequality scores, and then, finally, go through a variant detection process which is, again,an algorithmic look at how well the sequence of your target maps back to the reference.this allows you to identify single nucleotide differences between your sequence and thereference sequence, and on average in the human genome you see about three million ofthese per individual sequence. we then need to evaluate coverage, because coverage iseverything. coverage really reflects the number


we then need to evaluate coverage, because coverage is everything. coverage really reflects the number of times you've oversampled the genome -- how deep the sequencing is over any given region. what this really goes to, ultimately, is your certainty in identifying a variant that's real, as opposed to something that simply doesn't have enough coverage to support that the variant is accurate. one of the things we do in evaluating coverage is compare the snps we've identified from our next-gen data to snps that come from array data, like a genotyping array; if you have high concordance there, you probably have sufficient coverage on your genome to look at those data downstream in an interpretive way. we can also compare snps from tumor to normal in the cancer sequencing realm, where we've sequenced both the tumor genome and the normal genome and identified the number of variants shared between them, because, of course, the constitutional snps of any individual will come through in the tumor genome along with the somatic variants that are unique to the tumor itself.


we can also look at the data. this is comforting to people like me who, over the years, were used to looking at autoradiograms and then chromatograms on a computer screen, and are now left asking, "what do i look at?" but, as i'll show you in a minute, there's a nice viewer that we often use for what we call manual inspection of sequencing data, and we can also build tools to give us information about coverage, as i'll show. and when all of these things are done, then and only then can we finally analyze the data to interpret the variants we find there.


here's just a quick look at igv, which is the commonly used tool across next-gen sequencing laboratories for looking at coverage and other aspects. you can get a whole-chromosome view, or you can zoom in to a specific area. you can see here in the gray bars all of the coverage in this area, and you can even get down to the single-nucleotide level to identify the clear presence of a variant compared back to the human reference genome. and here's another igv shot just showing what i'd normally do, which is look at the normal coverage and the tumor coverage, where you're clearly identifying a somatic variant that's unique to the tumor genome itself. and then, lastly, just to give a plug for a tool that we've come up with -- there's a very long list of contributors here, but my slides are available, and there's a url here.


this is just a tool that takes a bunch of the capture data i talked about earlier in bulk and compares the coverage levels according to a variety of color-coded depth levels, and looks at the breadth of sequencing coverage and also the amount of enrichment you've been able to achieve. these bulk tools are really necessary in high-throughput sequencing so that you can rapidly evaluate whether these are data that now need to go downstream to subsequent analysis, or back to the sequencing queue to generate additional coverage if the coverage levels are inadequate.
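the kind of summary such a coverage tool produces can be sketched in a few lines: for each depth threshold, what fraction of the targeted bases is covered at or above it. the depths below are invented.

```python
# small breadth-of-coverage summary over a targeted region: fraction of bases at or above
# each depth threshold (per-base depths here are invented for illustration).

def coverage_breadth(per_base_depths, thresholds=(1, 10, 20, 50)):
    total = len(per_base_depths)
    return {t: sum(d >= t for d in per_base_depths) / total for t in thresholds}

depths = [0, 3, 12, 55, 60, 58, 14, 9, 2, 0]   # a 10-base toy "target" with uneven capture depth
for threshold, fraction in coverage_breadth(depths).items():
    print(f">= {threshold}x: {fraction:.0%} of target bases")
# a sample whose breadth at the working depth is too low would go back to the sequencing queue
```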


one of the things i get asked a lot is, "well, what's better, whole genome sequencing or exome sequencing?" i guess my typical answer is that it depends on what you want to do with the data, so i won't try to bias you one way or the other, but i'll just give you some facts and figures here. this is just looking at the amount of sequencing data: exome is about six gigabases; whole genome is much larger; and this can increase when you're sequencing a cancer genome, because more coverage is always better there. you can see, obviously, the different target spaces, and this number varies depending upon the exome reagent you're using, so this is an averaged amount. mapping rates are higher for exome than for whole genome, because whole genome is harder to map -- all the repetitive sequences i mentioned already. duplication rates tend to be a little higher for exome sequencing -- it's less of a unique set of molecules -- and these are the kinds of coverages we typically achieve, although they can be higher if you want.


it's really just a matter of economics. then, how many of the coding regions, or cds, are covered at greater than 10x? on average, you get about this much coverage for exome and significantly higher for whole genome, and that's just because where the probes hybridize in the genome is sometimes differential, and you may get better coverage simply by sequencing across the whole genome. and then, really important, is what you can get from exome sequencing, which is good point mutation and indel calling, but not so much resolution on copy number, and it's difficult, but not impossible, to call structural variants, whereas pretty much everything is available from whole genome, which is intuitive.


the most challenging thing is calling structural variants, just because there's a very high false positive rate with this type of approach. and then, lastly, what do you worry about? well, it depends on whether you're in a clinical setting or a research setting. in a clinical setting we worry about false negatives, because we don't want to miss anything, and most of these are actually due to lack of coverage; so careful examination, or post-variant-calling filtering, to flag regions where a lack of coverage means we may have missed mutations is important. false positives are maybe more important in the research space, where these are some of the sources of false positivity, but we've actually found over time that going back and revisiting these sites can lead us to filters that remove those sources of false positivity, such as variants that are only called on one strand, or right at the end of a read, where signal and noise start to approach each other.


i'll end this bioinformatics diatribe here, just wanting you to appreciate, in a microcosm, all of the factors that go into this, and why incorporating massively parallel sequencing, especially in the clinical space, is actually fairly problematic: you have to not only understand and appreciate all of these factors, you actually have to build into your sequence analysis pipelines the ability to deal with all of them, so that the result you get out is as high-quality and high-certainty as possible, because often therapeutic decisions and other types of clinical decisions are being made off of this.


one of the current trends in the marketplace is to package together, for clinical utility, multiple systems that allow you -- like this qiagen system shown here -- to produce the dna or rna, produce the library, do the sequencing, and then have all of the bioinformatics and analysis packaged for you at the end of it, so it's a sample-to-insight type of solution, as they call it. this is just one example where you have modules for all of these things and then onboard analysis and interpretation software for clinical utility of the sequencing data.


okay, i want to turn my attention to third-generation sequencers and focus on single-molecule detection now, as opposed to the bulk-molecule detection we've been talking about. what you're looking at here is the so-called smrt cell of the pacbio sequencer, which is used for real-time sequencing of single dna molecules. this is the surface of the smrt cell, which is the action part of the operation, and it consists of about 150,000 so-called zero-mode waveguides: little isolated pockets that individual dna molecules and polymerases can fit down into, to be sequenced and watched in real time as the sequencing reaction occurs. so how does this work? first of all, you make a dna-polymerase complex, and this gets immobilized down at the bottom of the zmw, where the bottom is specifically derivatized.


the sides of the zmw are not, and so, on average, what ends up happening is that this polymerase sits right at the bottom of the zero-mode waveguide. well, why do we want it there? because we want to add in fluorescently labeled nucleotides that the polymerase can incorporate relative to the strand it has glommed onto, and these fluorescent nucleotides are evaluated as they come into the active site of the polymerase, which is in the viewing area of the instrument optics, focused on each of these 150,000 zero-mode waveguides. what happens when you get an incorporation, of course, is that the nucleotide sits long enough in the active site of the polymerase for the fluorophore to be excited by the impinging wavelength coming into the bottom of the zmw.


you get a fluorescent readout that's captured by the optics in the detection system of the instrument, and the phosphate group -- which carries the fluorophore -- is cleaved during incorporation, so it diffuses away, and now you're ready for the next incoming nucleotide to sit in the active site long enough to excite its fluorescence and give a readout. so you get a base-by-base readout based on the fluorescence emission wavelength of each of the nucleotides, which are specifically labeled, and this process occurs in parallel in all of the zmws that contain a polymerase with a properly primed dna fragment. and what you're essentially doing with this device is taking a movie of all these zmws over a defined period of time, during which the data are accumulated.


now, i'm not going to walk through this extensive workflow in any way, shape, or form, but just to give you an appreciation of what it takes to process the library: some of these steps are very similar to what we've already talked about for massively parallel sequencing instruments, but the big difference here is that you actually mix the polymerase, the primer, and the library fragments together and then apply that onto the sequencing instrument. as you can see here, we collect data over four to six hours of collection time from the zero-mode waveguides in the smrt cell. this is a real departure from what we've been talking about, for a variety of reasons. first of all, the idea here is to start, again, with high molecular weight genomic dna, but we actually shear to very long sizes -- about 30 to 50, even up to 80, kilobases.


the reason why is that during that four- to six-hour run, from individual fragments that go into the library prep at that length, we can literally generate sequence reads in excess of 30,000 base pairs. the average is about 15,000, so, as you can see, this is diametrically opposed to all of the short-read sequencing we were just talking about. we've worked out methods for consistent shearing of dna to these long lengths; as you can imagine, it's reasonably fragile, so there are a variety of devices we use to maintain stability and to make the library as high-quality as possible, and this does take a pretty good amount of dna.


to sequence a whole human genome, as i'll talk about in a minute, is a considerable investment that, at this point in time, really does not lend itself to, say, an individual core biopsy from a cancer sample. here we're talking about sequencing from cell lines, where you can generate lots and lots of dna to get the kind of coverage we need -- sixty-fold or higher -- to sequence the human genome. this is just an example of some of the read lengths attainable from some recent runs, where you can see the mean read lengths are in the 13,000 to 15,000 base pair range, and some of these bins go extraordinarily high.
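the arithmetic behind that coverage requirement is simple lander-waterman-style accounting; the numbers below are round illustrations, not instrument specifications.

```python
# back-of-the-envelope coverage arithmetic: how much sequence, and roughly how many long reads,
# does ~60-fold coverage of a ~3.1-gigabase genome imply? numbers are illustrative.

GENOME_SIZE = 3.1e9          # haploid human genome, base pairs
MEAN_READ_LENGTH = 15_000    # typical pacbio mean read length mentioned above

def reads_needed(target_coverage, genome_size=GENOME_SIZE, mean_read_length=MEAN_READ_LENGTH):
    total_bases = target_coverage * genome_size
    return total_bases, total_bases / mean_read_length

total_bases, n_reads = reads_needed(target_coverage=60)
print(f"{total_bases / 1e9:.0f} gigabases of sequence, i.e. ~{n_reads / 1e6:.1f} million reads")
# -> 186 gigabases of sequence, i.e. ~12.4 million reads
```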


the reason for pointing this out is that we're now moving from a realm where we need to align reads back to a human reference genome to, rather, coming up with algorithms that can assemble these reads into portions of entire human chromosomes, and this is, of course, really important for a variety of reasons. one that i'll talk about today is that we're trying to use these long-read technologies not only to improve the existing human genome reference, but also to produce additional high-quality human genomes -- to spread out the knowledge about diversity in the human genome across different populations of the world, and also to really understand the unique content in genomes that you can't get by simply aligning reads back to a fixed reference. so this is just a shot from our website describing our reference genomes improvement project, which is funded by nhgri, and it shows you that we plan to produce gold reference genomes and have already produced some platinum reference genomes.


the difference here is that these are haploid genomes; they come from an abnormality called a hydatidiform mole, where you get a single enucleated egg fertilized by a sperm, and this grows out to a certain stage and then gets turned into a cell line. so those are haploid, high-quality human genomes. these others will be from diploid individuals, as you can see, across different populations of the world. the plan is outlined here and is, again, linkable from our website, the url i just showed. it starts from pacbio sequencing reads and a de novo assembly of those reads, which will be contiguous across long stretches, but not perfect; then, using a different technology called bionano, which makes physical maps of the genome, we can get an accurate representation of how good our pacbio assembly is and how big the gaps are that we still have left to fill.


and in some cases, as i'll show you, we're also using the pacbio sequencer to sequence from bacterial artificial chromosomes made from these same cell lines; we fill in the gaps using those data and come up with a high-quality gold reference genome that's as contiguous as possible across the chromosomes. so this is just an example of the first gold genome we produced. this is from a yoruban individual from the 1000 genomes project, and you can see all kinds of metrics here for the quality of the assembly, but our biggest contig is 20 million base pairs of assembled sequence data.


that's pretty remarkable when you stop and think about it, and on average the contigs are about six million base pairs. we can align this, as i said, to the maps created using bionano, which takes similarly long pieces of dna, does a restriction digest, calculates the restriction fragment sizes, and maps those back. and when you compare, in this particular example there's a conflict between the pacbio assembly and the bionano data, which you can see is resolved by aligning the reads back using the pacbio long-read assembly approach here, and we can resolve this very complex region of the genome, which involves a segmental duplication. these are historically the hardest parts of the human genome to finish to high quality and contiguity, but here we've actually done it.


these are the approaches we're using -- sorry, the different genomes, their source or origin, and the level of coverage that's planned -- and you can see a current snapshot from just a couple of weeks ago of where we're at with producing these data, which of course will be available for use from ncbi. okay, i just want to finish up here with a couple of new technologies to mention. this is 10x genomics, a company that's really aimed at high-quality contiguity but using a different approach than long-read technology. so how does this work? well, the idea here is to generate these little segments: starting with a long piece of dna, similar to what we would use for a pacbio library, it gets combined in an isothermal incubation in micelles -- similar to the emulsion amplification approach used for ion torrent -- so, oil-and-buffer micelles where you have these little molecular barcodes that sit down on the surface of this long molecule and then get extended for a certain period of time.


you then turn these into fully sequenceable libraries by ligating adapters onto the ends -- so, with the molecular barcode at one end and the adapter at the other end. you can sequence and analyze these, and then, using bioinformatics, take the finished sequence reads and combine them back, using a linked-read approach, into a full contig similar to what i just showed from the assembly of pacbio reads.


so this is getting long-range information from short-read technology, and indeed, this platform uses illumina to read out. you're starting with this gel bead-in-emulsion, or gem, approach that i just showed you to take long molecules into a micelle partition, amplify off of these different molecular barcodes in sequence, and then use these as the sequencing library that gets read out; the barcodes then identify those individual short sequence reads as having come from the original long fragment isolated in that micelle. and so, using this approach, you can generate very long contiguity by matching up the barcodes from the short reads and linking them together using specific algorithms.
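a minimal sketch of that linked-read bookkeeping: reads sharing a gem barcode are grouped, and the span of their alignments approximates the original long molecule. barcodes and coordinates are invented for illustration.

```python
# toy linked-read grouping: short reads sharing a gem barcode are pooled, and the span of their
# alignment positions approximates the original long input molecule.

from collections import defaultdict

def infer_molecules(reads):
    """reads: list of (barcode, chrom, position); returns per-barcode inferred molecule spans."""
    by_barcode = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_barcode[(barcode, chrom)].append(pos)
    return {key: (min(p), max(p), len(p)) for key, p in by_barcode.items()}

reads = [("BC01", "chr3", 10_050_000), ("BC01", "chr3", 10_082_500), ("BC01", "chr3", 10_118_000),
         ("BC02", "chr3", 44_200_000), ("BC02", "chr3", 44_236_000)]
for (barcode, chrom), (start, end, n) in infer_molecules(reads).items():
    print(f"{barcode}: ~{(end - start) / 1000:.0f} kb molecule on {chrom} from {n} linked reads")
```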


so there are lots of things that you can do with this. i see that this slide didn't animate properly, so sorry about that, but just go to the website for this approach and it will show you the different things you can do, including long-range haplotyping and, as i just talked about, diploid de novo assemblies using the supernova assembler created by the 10x genomics crew. last but not least, i'll talk briefly about oxford nanopore sequencing to give you an update. this is a protein-based nanopore through which a dna fragment is pulled using a variety of mechanisms. each nanopore is linked to an application-specific integrated circuit that collects the data and basically fits them to a model of what each nucleotide combination looks like inside of that pore, in order to call the base sequence.


so during the run you basically get a little output that looks like this, which reflects the translocation of the dna sequence through the pore, and you can also sequence these reads twice -- one direction, and then the other -- to get higher-accuracy information. the read lengths from this are variable, so there's no set read length; you really just sequence until you have the amount of sequence information you need for the coverage you need. and as i mentioned, the data collection is really based on the electrical current differential across the membrane that the pore sits in, so as the dna translocates, different combinations of nucleotides give differential changes in the electrical current, and this can then be fit back to a model of all possible multimers.


this has been evolving over time. the error rates initially were quite high; with improvements in the pores and the software, the newest iterations of this approach have error rates of around 10 percent for the dual read, where you sample the sequence twice, and about 20 percent if you're only able to read through the sequence one time. so, you know, probably in the same realm at this point as the error rate on pacbio sequencing, which, i apologize, i forgot to mention earlier. this is kind of what the device looks like. this is small -- about the size of a stick drive or thumb drive that you might put in your computer's usb port. it actually fits


and connects to the computer via usb. and then this is the actual sequencing device here, the flowcell, which contains the array of nanopores that the dna translocates through; the data are collected and fed into the associated computer and can be analyzed after the run is completed to give you information. and in a promised next version, the company will basically put together a lot of these nanopore devices into a very large sort of compute-cluster device, shown here, which has the formidable name of promethion. okay, i'm going to just finish up with three more slides and then we'll open for questions, to give you a feel for kind of where things are going with regard to the application


of next-generation sequencing in the clinical cancer care of patients, and this refers to what i call immunogenomics. in the past, people who know a lot more about immunity in cancer than i do -- thierry boon, hans schreiber, and others -- actually predicted that because you have specific mutations, as we've talked about, in tumor genomes that produce proteins that are mutated and therefore have a different sequence, these proteins might look different to the immune system if you could sort of tell the patient's immune system about them, if you will. this could happen through some sort of vaccine-mediated approach or otherwise that could alert the immune system to the presence of these abnormal cells and lead to their destruction. the problem in the past is that identifying these neo-antigens,


these proteins or peptides that look most different to the patient's immune system, was extraordinarily difficult, as i'll illustrate in a few quick slides here. this has largely been overcome by next-generation sequencing and bioinformatic analysis. in the immunogenomics realm, in identifying these neo-antigens we have three sources of data. we have exome sequencing from cancer and normal to identify cancer-unique peptides; we can also obtain from next-gen sequencing the hla haplotypes of the individual -- so what are their specific hla molecules that will bind these cancer-unique peptides and present them to the immune system? and then, what are the rna sequencing data? because most of the samples we'll be talking about are very high mutation-level samples, so not


every dna mutation results in an expressed gene or an expressed mutation. we combine all three of these data types into an algorithmic approach that compares the binding of these altered peptides, the mutants, to the wild-type peptides and gives us the ones that look most different to the immune system based purely on binding to the mhc. we call these tsmas, or neo-antigens, and this describes the cancer's neo-antigen load. just for those of you in the bioinformatics realm, we have a pipeline available on github for doing everything that i just described, including the rna filtering and coverage-based filtering; that was just published in january of this year.
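to make the comparison step concrete, here's a toy version in python -- this is not the published pipeline, and the mutation names, ic50 values, and the 500-nanomolar binding cutoff are purely illustrative -- that keeps mutant peptides which are predicted to bind the patient's mhc well, bind better than their wild-type counterparts, and are actually expressed in the tumor rna:

    # Toy sketch of neo-antigen ranking: compare predicted MHC binding (IC50 in
    # nM; lower = tighter binding) of each mutant peptide to its wild-type
    # counterpart and require evidence of RNA expression.  All values below are
    # invented; real pipelines take them from binding predictors and RNA-seq.
    candidates = [
        # (mutation, mutant IC50, wild-type IC50, expressed in tumor RNA?)
        ("GENE1_R273H",  45.0,  900.0, True),
        ("GENE2_G12D",  600.0,  650.0, True),
        ("GENE3_P286R",  30.0, 1500.0, False),
    ]

    BINDING_CUTOFF_NM = 500.0  # illustrative "binder" threshold

    neoantigens = [
        (mut_id, mut_ic50, wt_ic50)
        for mut_id, mut_ic50, wt_ic50, expressed in candidates
        if mut_ic50 < BINDING_CUTOFF_NM and mut_ic50 < wt_ic50 and expressed
    ]

    for mut_id, mut_ic50, wt_ic50 in neoantigens:
        print(f"{mut_id}: mutant {mut_ic50} nM vs wild-type {wt_ic50} nM -> candidate")

in the real pipeline this same comparison is layered with the rna- and coverage-based filters i mentioned, and the surviving peptides define the tumor's neo-antigen load.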


so why do we care about this? well, there's evidence from the medical literature that patients with the highest mutation rate, the highest neo-antigen load, are the ones that are most responsive to a remarkable new class of drugs commonly referred to as checkpoint blockade therapy. these are antibody-based drugs that release the brakes on the immune system and allow t-cells to infiltrate, identify, and selectively kill cancer cells in most cases. these are just two publications from the literature showing what i just told you, which is that the response curves looked dramatically different in patients with high versus low mutation loads. here's just another study from the hopkins group, again looking at patients who are either mismatch repair-proficient, in red, or mismatch repair-deficient, where you have a germline susceptibility or sometimes


a somatic sort of tendency to develop many, many, many more mutations. we refer to these people as ultra-mutators, whereas msi-stable genomes tend not to respond to these checkpoint inhibitor drugs. this is an important thing to be able to monitor from the genomics of a tumor, because people with high neo-antigen loads may be suitable for checkpoint blockade therapies, many of which are now approved by the fda, including, just last week, the genentech drug atezolizumab, which was approved in bladder cancer -- another high mutation-load tumor. this is just a quick case history, because i love n-of-1s and i try to present one every time i come here to give this lecture; it always has to be a different one. this is


the most recent one, so let me walk you through it real quickly. this is a male patient who presented with glioblastoma and a history of colon polyps. he's much too young to be having colon polyps, i think we would all agree. his gbm was diagnosed when he started having seizures while on vacation, and it was removed at ucsf, and part of the sample from this tumor was shared with us for the genomic studies i'll tell you about. he received, as do most gbm patients, a drug called temozolomide, which is an alkylating chemotherapy that's often given in the context of radiation therapy for gbm patients, but when he was back in st. louis, he was diagnosed with a spinal metastasis in a cervical vertebra that was identified through some additional sequelae and conditions that he was experiencing. this went to foundation


medicine, which is a commercial testing service for cancer tissues, and it came back with practically every gene on their cancer panel having some sort of a mutation, which should tell you something. it did to the oncologist, and so he sent a blood sample for a colorectal cancer panel, and this individual has a mutation in a gene called polymerase epsilon, which is important in dna repair processes in the normal cell. this is a known variant that causes defects in polymerase epsilon activity. based on all of this, and based on the clinical evidence that i just showed you, he was assigned to take an anti-pd1 checkpoint immunotherapy called pembrolizumab. curiously, just two weeks after he started on the course of anti-pd1 therapy, a second spinal


metastasis was identified, so there was some concern that he was resistant to this drug, but actually, when we went back to the initial mris -- not me, but the radiologist -- it looked like this second metastasis had probably just been physically too small to pick up accurately. it was also removed, and we studied all of these samples against the blood normal of the individual using exome sequencing and also some immunohistochemistry, which i'll show you in a minute. this is sort of clonal evolution over time -- not much time, but admittedly across the different presentations of this individual's tumors. you can see a very complex tumor across the spectrum of all the variants that are identified. and this is the clonality plot showing that there's a founder clone set of mutations and then three detectable additional sub-clones that are present in the disease.
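as a rough picture of how a clonality plot like this separates founder from sub-clonal mutations, here's a small sketch in python -- the variant allele fractions are invented and the fixed bins are a simplification; real analyses use statistical clustering with purity and copy-number correction -- that buckets somatic variants by their vaf:

    # Toy sketch of reading a clonality plot: in a pure, copy-number-neutral
    # tumor, founder (truncal) mutations sit near 0.5 variant allele fraction
    # because they are in every tumor cell, while sub-clones form lower-VAF
    # clusters.  The VAFs and cutoffs below are invented for illustration.
    from collections import Counter

    vafs = [0.48, 0.51, 0.46, 0.27, 0.25, 0.12, 0.11, 0.10]

    def bucket(vaf: float) -> str:
        if vaf >= 0.40:
            return "founder clone"
        elif vaf >= 0.20:
            return "sub-clone A"
        return "sub-clone B"

    print(Counter(bucket(v) for v in vafs))
    # -> three founder-clone variants plus two lower-VAF sub-clonal clusters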


in the spinal metastasis, you now see a much more complex picture post-temozolomide -- and we know this is a phenomenon that's associated with alkylating chemotherapy -- so a very complex disease. and now, look at the winnowing influence that the anti-pd1 checkpoint blockade has had on this individual's tumor that was removed from the second spinal metastasis. it's now much simpler from a genomic standpoint, and that shows the impact of the drug. why is this important? these are antibody-based therapies, as i mentioned. there has been long speculation that because they're antibodies and not small molecules they wouldn't cross the blood-brain barrier, and i think this is evidence to the contrary. if you need more


evidence, then you can look at immunohistochemistry staining for different immune molecules in the t-cell repertoire, which is absolutely not there in the first metastasis -- this is, again, post-temozolomide but before the use of anti-pd1 -- and once we go through a couple of cycles of anti-pd1, the removed second metastasis is now full of immune infiltrate, which indicates why this subsequent tumor has a significantly different genomic context than the previous one that was removed from the spine. this patient remains without evidence of disease so far on the pembrolizumab therapeutic, and this also raises another very interesting question medically, which is that if patients like this, with these constitutional polymerase epsilon mutations, are identified early, they may actually


benefit from so-called prophylactic therapy in the checkpoint blockade realm, where it helps them stave off developing tumors because of this activation of the immune system against the tumors, as you see here. so that's a formal possibility, but, you know, it hasn't been tested, obviously. i should also point out that as we were evaluating this patient, a report came out in the jco from a group in toronto on two sib pairs who also had germline constitutional dna repair defects, had developed brain cancers, and responded to checkpoint blockade therapy; so, together with this individual, there are now two pieces of evidence that this is suitable for patients in the neurological cancer space. and then a recent mri indicated that a remaining lesion that had been left behind during the neurosurgery


on this patient initially -- to remove this very large gbm -- was actually diminishing over time on additional mri imaging. so he's clearly responding in that area, the brain, as well. i think this is sort of an exciting new area. we're also developing personalized vaccines, which i didn't have time to talk about, but which you can read about: we published a manuscript in science in 2015 describing our first-in-human melanoma trial, where we used the same approach but then used the neo-antigens that we identified to design a personalized vaccine for patients with advanced melanoma, and we're expanding this out into other tumor types as well. so, just a little vignette on how genomics is now really having an impact in the clinical setting for the care of cancer patients. there are obviously multiple other


examples from people's work, but i think this is particularly exciting. i'll finish up and just thank my colleagues at the genome institute, including rick wilson, who's the director of our institute. i also want to give a nod to my very important collaborators who provided the samples that i just talked about, and to bob schreiber, who taught me everything i know about immunology, which still isn't much, but he's absolutely the expert and i really value his collaboration. and then i also really want to say thanks, for all of the different components of the technologies that i talked about, to all of the individuals who contributed slides into the mix. and lastly, with respect to the last vignette, i want to acknowledge the patients and families that contribute valuable samples so that i can


present exciting information like this to you all. thanks for your attention. [end of transcript]









