How are mutations and protein synthesis linked to cancer?


I know that a mutation in DNA can change the triplet code on the mRNA so that different amino acids are incorporated, and that a different amino acid sequence means a different protein is made. But how does this link to cancer, given that cancer is caused by uncontrolled mitosis?

Thanks

This question is at GCSE level, so roughly freshman/sophomore level.


A mutation in DNA can be beneficial, harmful, or have no effect. Which of these it is depends on where in the genome the mutation occurs.

Several classes of genes are linked to cancer:

Tumor suppressor genes: These genes regulate cell growth by monitoring cell division, repairing mismatched DNA, and controlling cell death. A mutation that inactivates this type of gene leads to unregulated cell growth and can result in a tumor.

Examples:

  • TP53 is the most commonly mutated tumor suppressor gene. Mutations in this gene are found in more than 50% of cancers.
  • Mutations in the BRCA1 or BRCA2 genes increase the risk of breast and ovarian cancer. They can also raise the risk of pancreatic cancer and melanoma.

Proto-oncogenes: These genes help cells grow normally. A mutation in such a gene can switch it on permanently, so that cell growth becomes unregulated and cancer can develop. The abnormal gene is then called an oncogene.

Examples:

  • HER2, which is associated with breast and ovarian cancer.

  • The RAS family of genes, which is involved in cell communication, cell growth and cell death.

DNA repair genes: These genes help repair DNA that is mismatched during replication. A defect in a DNA repair gene allows errors in the DNA to persist, which ultimately leads to faulty protein synthesis.


Genes and proteins regulate mitosis. The most common example of a protein that regulates the cell cycle and prevents cancer is p53.

If you want to learn more, look up cell cycle checkpoints.


Mutations and disease

DNA is constantly subject to mutations, accidental changes in its code. Mutations can lead to missing or malformed proteins, and that can lead to disease.

We all start our lives with some mutations. Mutations inherited from your parents are called germline mutations. However, you can also acquire mutations during your lifetime. Some arise during cell division, when DNA is duplicated. Others are caused when DNA is damaged by environmental factors, including UV radiation, chemicals and viruses.

Few mutations are bad for you. In fact, some mutations can be beneficial. Over time, genetic mutations create genetic diversity, which keeps populations healthy. Many mutations have no effect at all. These are called silent mutations.

But the mutations we hear about most often are the ones that cause disease. Some well-known inherited genetic disorders include cystic fibrosis, sickle cell anemia, Tay-Sachs disease, phenylketonuria and color blindness, among many others. All of these disorders are caused by the mutation of a single gene.

Most inherited genetic diseases are recessive, which means that a person must inherit two copies of the mutated gene to develop the disorder. This is one reason why marriage between close relatives is discouraged: two genetically similar adults are more likely to give a child two copies of a defective gene.

Diseases caused by just one copy of a defective gene, such as Huntington's disease, are rare. Thanks to natural selection, these dominant genetic diseases tend to be removed from populations over time, because affected carriers are more likely to die before reproducing.

Scientists estimate that each of us has between 5 and 10 potentially lethal mutations in our genes. The good news is that because there is usually only one copy of the bad gene, these diseases do not manifest.

Cancer usually results from a series of mutations within a single cell. Often a faulty, damaged or missing p53 gene is to blame. The p53 gene makes a protein that stops mutated cells from dividing. Without this protein, cells divide unchecked and become tumors.


Epigenetic changes in cancer

Gene silencing through epigenetic mechanisms is common in cancer cells and involves changes to histone proteins and DNA.

Learning objectives

Describe the role that epigenetic changes in gene expression play in the development of cancer

Key takeaways

Key points

  • The DNA in the promoter region of genes that are silenced in cancer cells is methylated on cytosine residues in CpG islands.
  • Histone proteins surrounding the promoter region of silenced genes lack the acetylation modification that is present when the genes are expressed in normal cells.
  • When DNA methylation and histone deacetylation occur together in cancer cells, the gene present in that chromosomal region is silenced.
  • Epigenetic changes that are altered in cancer can be reversed and could therefore be useful in the design of new drugs and therapies.

Key terms

  • epigenetics: the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence
  • methylation: the addition of a methyl group to cytosine and adenine residues in DNA, which leads to the epigenetic modification of DNA and the reduction of gene expression and protein production
  • acetylation: the reaction of a substance with acetic acid or one of its derivatives; the introduction of one or more acetyl groups into a substance

Cancer and epigenetic changes

Cancer epigenetics is the study of epigenetic modifications to the genome of cancer cells that do not involve a change in the nucleotide sequence. Epigenetic changes are as important as genetic mutations in a cell's transformation to cancer. Mechanisms of epigenetic silencing of tumor suppressor genes and activation of oncogenes include changes in CpG island methylation patterns, histone modifications, and dysregulation of DNA-binding proteins.

Epigenetic changes in cancer cells: In cancer cells, silencing of genes through epigenetic mechanisms is a common occurrence. Mechanisms can include modifications to histone proteins and DNA associated with these silenced genes.

Silencing of genes through epigenetic mechanisms is very common in cancer cells and includes modifications to the histone proteins and DNA associated with the silenced genes. In cancer cells, the DNA in the promoter region of silenced genes is methylated on cytosine residues in CpG islands, genomic regions that contain a high frequency of CpG sites, where a cytosine nucleotide occurs next to a guanine nucleotide. Histone proteins that surround that region lack the acetylation modification (the addition of an acetyl group) that is present when the genes are expressed in normal cells. This combination of DNA methylation and histone deacetylation (epigenetic modifications that lead to gene silencing) is commonly found in cancer. When these modifications occur, the gene present in that chromosomal region is silenced. Scientists increasingly understand how these epigenetic changes are altered in cancer. Because these changes are temporary and can be reversed (for example, by blocking the action of the histone deacetylase protein that removes acetyl groups, or of the DNA methyltransferase enzymes that add methyl groups to cytosines in DNA), it is possible to design new drugs and new therapies that take advantage of the reversible nature of these processes. Indeed, many researchers are testing how a silenced gene can be switched back on in a cancer cell to help re-establish normal growth patterns.

Genes involved in the development of many other illnesses, ranging from allergies to inflammation to autism, are also thought to be regulated by epigenetic mechanisms. As our knowledge of how genes are controlled deepens, new ways to treat these diseases and cancer will emerge.


What is the link between protein synthesis and mutations?

MUTATION: A change in the sequence of bases in DNA

A change in one or a few nucleotides can change the gene product that is expressed!

Mutations can be: i) Beneficial (+) ii) Neutral ( ) iii) Harmful (-)

Interesting animal mutations: beneficial, neutral or harmful?

Two-headed turtle

Hairless ferret, hairless cat

Siamese twins (award-winning sumo wrestlers)

Silver wood ducks (waterfowl color mutations)

Snowflake at the Barcelona Zoo. Albinism: a disorder that affects this gorilla's skin and hair color. Snowflake died of cancer in 2003, when he was about 37 years old. He had a number of offspring, none of whom had albinism.

What is the link between mutations and evolution? Mutations are the basis of evolution! Without mutations there is no change over time, and so no evolution! Changes in DNA can code for changes in both appearance and behavior. Natural selection favors ("selects") or disfavors such changes because they affect the ability to survive and reproduce (fitness!).

Mutations can be: i) Beneficial (+) ii) Neutral ( ) iii) Harmful (-)

Beneficial mutations. E.g.: sickle cell anemia. Red blood cells are sickle-shaped rather than round. One copy of this gene makes a person resistant to deadly malaria.

E.g.: bacteria that can digest oil, which are used to clean up oil spills. E.g.: superbugs. A mutation lets some bacteria resist antibiotics instead of being killed by them, which is beneficial for the bacteria but bad for humans!

Harmful mutations. E.g.: cystic fibrosis -> mucus builds up in the lungs; lung infections can be fatal.

Diabetes (some forms): the insulin protein is not made properly, so blood sugar levels are out of control.

How do mutations occur? I. Frameshift mutations: addition or deletion of a nucleotide. Original message: THE CAT ATE THE RAT. Deleting the 'C' gives: THE ATA TET HER AT. The whole message after the deletion is affected (all codons!).
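The frameshift example can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper names `read_in_triplets` and `delete_letter` are made up for this example, and the English sentence stands in for a real nucleotide sequence.

```python
def read_in_triplets(message: str) -> str:
    """Strip spaces and regroup the letters into 'codons' of three."""
    letters = message.replace(" ", "")
    return " ".join(letters[i:i + 3] for i in range(0, len(letters), 3))


def delete_letter(message: str, index: int) -> str:
    """Remove the letter at `index` (counted without spaces)."""
    letters = message.replace(" ", "")
    return letters[:index] + letters[index + 1:]


original = "THE CAT ATE THE RAT"
mutant = delete_letter(original, 3)  # index 3 is the 'C' of CAT

print(read_in_triplets(original))
print(read_in_triplets(mutant))
```

Every triplet after the deleted letter changes, which is why a single-base insertion or deletion usually alters the protein far more than a single-base substitution.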

II. Point mutations: a single nucleotide is changed, so only 1 codon is affected. (The triplets below are DNA template-strand triplets.) i) Silent: no effect. GAT -> GAC: both code for leucine. ii) Missense: the protein's final shape and function are affected. CTT -> CAT: valine replaces glutamate. iii) Nonsense: the protein cannot function. AGT -> ATT: serine is replaced by STOP.
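The three point-mutation examples can be checked against the standard genetic code. The sketch below assumes, as the examples appear to, that each triplet is written on the DNA template strand and is complemented position by position to get the coding-strand codon; the function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def amino_from_template(triplet: str) -> str:
    """Complement a template-strand triplet and look up the amino acid."""
    coding_codon = triplet.translate(COMPLEMENT)  # same as mRNA with T for U
    return CODON_TABLE[coding_codon]


def classify(before: str, after: str) -> str:
    """Label a substitution as silent, missense or nonsense."""
    old, new = amino_from_template(before), amino_from_template(after)
    if old == new:
        return "silent"
    return "nonsense" if new == "*" else "missense"


for before, after in [("GAT", "GAC"), ("CTT", "CAT"), ("AGT", "ATT")]:
    print(before, "->", after, classify(before, after))
```

One-letter amino acid codes are used here: L is leucine, E is glutamate, V is valine and S is serine.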

Mutagens: environmental factors that cause mutations. 1. High-energy radiation: X-rays, gamma rays, UV light. 2. Chemical mutagens: benzene, dioxins, some substances in cigarette smoke and in pesticides.

https: //www. kankernavorsing. org/oor kanker/oorsake-van-kanker/kanker-kontroversies/plastiekbottels-en-voedselhouers

Processed meats and risk of childhood leukemia https://www.ncbi.nlm.nih.gov/pubmed/8167267


DNA sequences

In the nucleus, two strands of DNA are held together by nitrogenous bases (also called nucleobases or bases). Four bases, cytosine, guanine, adenine and thymine, form the letters of the words in the DNA recipe book.

One strand of DNA holds the original code. If the instructions in this code are followed carefully, a specific, correct polypeptide can be assembled outside the nucleus. The second DNA strand, the template strand, is the complement (a "mirror image") of the original strand. It has to be, because nucleobases can only attach to complementary partners: cytosine pairs only with guanine, and thymine pairs only with adenine.

You will probably have seen codes such as CTA, ATA, TAA and CCC in various biology textbooks. If these are the codons (sets of three bases) of the original DNA strand, the template strand attaches to them using their partner bases. So, for the given examples, the template DNA attaches to the original DNA strand using GAT, TAT, ATT and GGG.

Messenger RNA then copies the template strand, which means that it ends up as an exact copy of the original strand. The only difference is that mRNA replaces thymine with a base called uracil. For the given examples, the mRNA copy of the template strand reads CUA, AUA, UAA and CCC.
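The base-pairing steps described above can be sketched in Python. This is a simplified illustration: it pairs bases position by position and ignores strand direction, and the function names `template_strand` and `transcribe` are chosen for this example only.

```python
# DNA-DNA pairing used to build the template strand from the original strand.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
# DNA-RNA pairing used when mRNA copies the template (A pairs with uracil).
RNA_PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def template_strand(coding: str) -> str:
    """Return the complementary DNA triplet for an original-strand triplet."""
    return "".join(DNA_PAIR[base] for base in coding)


def transcribe(template: str) -> str:
    """Return the mRNA triplet made by copying a template-strand triplet."""
    return "".join(RNA_PAIR[base] for base in template)


original_codons = ["CTA", "ATA", "TAA", "CCC"]
template_codons = [template_strand(c) for c in original_codons]
mrna_codons = [transcribe(t) for t in template_codons]

print(template_codons)
print(mrna_codons)
```

The mRNA triplets match the original strand except that every T has become U, which is the point the text makes.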

These codes can be read by transfer RNA outside the nucleus, so the recipe can be understood by a molecule that does not fully understand the language used in the original (it does not understand thymine, only uracil). Transfer RNA helps to bring the right parts to the assembly line of the ribosome. There, a protein chain is built that matches the instructions in the original DNA strand.


Effect of mutation on protein structure

1. Point mutations:

Point mutations involve the addition, deletion or substitution of a base pair in a gene.

2. Chromosomal mutations:

Addition or deletion of DNA during crossing over causes mutations.

3. Jumping genes also change protein structure by moving from one chromosome location to another.

The abnormal mRNA produced as a result of mutations leads to a change in protein structure and function. A frameshift mutation caused by the addition or deletion of one or two bases leads to the formation of an entirely new polypeptide.

A mutation can change a normal codon into a terminator codon, which results in an incomplete polypeptide. The altered or incomplete polypeptide may be inactive and may be lethal to the cell. For example, sickle cell anemia is caused by a single base substitution, which replaces glutamic acid with valine at position 6 in the β chain of hemoglobin.

However, a change in the third base of a triplet may not cause any change in the polypeptide. This is because the altered codon can still pair with the anticodon of the same tRNA. This phenomenon is known as the wobble hypothesis. It was proposed by Crick in 1966.

According to this hypothesis, the first two bases of the mRNA codon pair strictly with the corresponding bases of the tRNA anticodon, while pairing at the third position is less strict. The third position is called the wobble position. Mutations that cause no change in the protein are called silent mutations.
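How often a third-base change is silent can be counted directly from the standard genetic code. The sketch below does only that counting; it does not model tRNA pairing itself, and the function name `third_position_summary` is hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def third_position_summary():
    """Count third-base substitutions in sense codons that keep the amino acid."""
    synonymous = total = 0
    for codon, amino in CODON_TABLE.items():
        if amino == "*":
            continue  # start only from codons that encode an amino acid
        for base in BASES:
            if base == codon[2]:
                continue
            total += 1
            if CODON_TABLE[codon[:2] + base] == amino:
                synonymous += 1
    return synonymous, total


synonymous, total = third_position_summary()
print(f"{synonymous} of {total} third-base changes are silent")
```

Roughly two thirds of third-position substitutions leave the amino acid unchanged, which is why a change in the third base often has no effect on the polypeptide.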


Proto-oncogenes

The genes that code for the positive cell cycle regulators are called proto-oncogenes. Proto-oncogenes are normal genes that, when mutated, become oncogenes: genes that cause a cell to become cancerous. Consider what might happen to the cell cycle in a cell with a recently acquired oncogene. In most instances, an alteration of the DNA sequence will result in a less functional (or non-functional) protein. The result is detrimental to the cell and will likely prevent the cell from completing the cell cycle, but the organism is not harmed because the mutation will not be carried forward. If a cell cannot reproduce, the mutation is not propagated and the damage is minimal. Occasionally, however, a gene mutation causes a change that increases the activity of a positive regulator. For example, a mutation that allows Cdk, a protein involved in cell cycle regulation, to be activated before it should be could push the cell cycle past a checkpoint before all of the required conditions are met. If the resulting daughter cells are too damaged to undertake further cell divisions, the mutation will not be propagated and no harm comes to the organism. However, if the atypical daughter cells are able to divide further, the subsequent generation of cells will likely accumulate even more mutations, some possibly in additional genes that regulate the cell cycle.

The Cdk example is only one of many genes that are considered proto-oncogenes. In addition to the cell cycle regulatory proteins, any protein that influences the cycle can be altered in such a way as to override cell cycle checkpoints. Once a proto-oncogene has been altered such that there is an increase in the rate of the cell cycle, it is then called an oncogene.


Some human diseases are caused by spontaneous mutations

Many common human diseases, often devastating in their effects, are due to mutations in single genes. Genetic diseases arise through spontaneous mutations in germ cells (egg and sperm), which are transmitted to future generations. For example, sickle-cell anemia, which affects 1 in 500 individuals of African descent, is caused by a single missense mutation at codon 6 of the β-globin gene; as a result of this mutation, the glutamic acid at position 6 in the normal protein is changed to a valine in the mutant protein. This change has a profound effect on hemoglobin, the oxygen-carrier protein of erythrocytes, which consists of two α-globin and two β-globin subunits (see Figure 3-11). The deoxygenated form of the mutant protein is insoluble in erythrocytes and forms crystalline arrays. The erythrocytes of affected individuals become rigid and their passage through capillaries is blocked, causing severe pain and tissue damage. Because the erythrocytes of heterozygous individuals are resistant to the parasite that causes malaria, which is endemic in Africa, the mutant allele has been maintained. It is not that individuals of African descent are more likely than others to acquire a mutation causing the sickle-cell defect; rather, the mutation has been maintained in this population by interbreeding.

Spontaneous mutation in somatic cells (i.e., non-germline body cells) is also an important mechanism in certain human diseases, including retinoblastoma, which is associated with retinal tumors in children (see Figure 24-11). The hereditary form of retinoblastoma, for example, results from a germline mutation in one Rb allele and a second, somatically occurring mutation in the other Rb allele (Figure 8-7a). When an Rb heterozygous retinal cell undergoes somatic mutation, it is left with no normal allele; as a result, the cell proliferates in an uncontrolled manner, giving rise to a retinal tumor. A second form of this disease, called sporadic retinoblastoma, results from two independent mutations that disrupt both Rb alleles (Figure 8-7b). Since only one somatic mutation is required for tumor development in children with hereditary retinoblastoma, it occurs at a much higher frequency than the sporadic form, which requires the acquisition of two independently occurring somatic mutations. The Rb protein has been shown to play a critical role in controlling cell division (Chapter 13).
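The frequency argument can be made concrete with a rough calculation. Assume, purely for illustration, that a retina contains N susceptible cells and that each Rb allele in a cell is independently inactivated by somatic mutation with a small probability p. Both symbols are assumptions introduced here, not values from the text, and the model ignores cell division history and everything else about tumor growth.

```latex
% Hereditary form: every cell already carries one mutant allele,
% so a single somatic hit in any one of the N cells is enough.
P_{\text{hereditary}} = 1 - (1 - p)^{N} \approx N p

% Sporadic form: both alleles must be hit in the same cell.
P_{\text{sporadic}} = 1 - (1 - p^{2})^{N} \approx N p^{2}

% Ratio of the two, valid while N p \ll 1:
\frac{P_{\text{hereditary}}}{P_{\text{sporadic}}} \approx \frac{1}{p}
```

Because p is very small, 1/p is very large, which is the sense in which a tumor is far more likely in a child who has inherited one mutant Rb allele.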

Figure 8-7

Role of spontaneous somatic mutation in retinoblastoma, a childhood disease marked by retinal tumors. Tumors arise from retinal cells that carry two mutant Rb− alleles. (a) In hereditary retinoblastoma, a child receives a normal Rb+ allele from ...

In a later section, we will see how normal copies of disease-related genes can be isolated and cloned.


Abstract

The list of genetic diseases caused by mutations that affect mRNA translation is growing rapidly. Although protein synthesis is a fundamental process in all cells, the disease phenotypes show a surprising degree of heterogeneity. Studies of some of these diseases have provided intriguing new insights into the functions of proteins involved in the process of translation; for example, evidence suggests that several have other functions in addition to their roles in translation. Given the numerous proteins involved in mRNA translation, it is likely that further inherited diseases will turn out to be caused by mutations in genes that are involved in this complex process.


RESULTS

Validation test

To test the ability of the FIS to predict the functional impact of a mutation, we applied the computational protocol to known 'disease-associated' and 'common polymorphism' variants and mutations as annotated in UniProt (HUMSAVAR, release 2010_08). By comparing the FIS values for disease-associated and common polymorphic variants, we tested the hypothesis that disease-associated variants and mutations typically have a negative effect on protein function and are therefore more likely to be observed at positions with low rates of mutation fixation, i.e. at conserved positions, while functionally neutral or weakly deleterious polymorphic variants are more likely to be observed at evolutionarily promiscuous positions. We therefore require that the evolutionary conservation score discriminate between disease-associated and polymorphic variants, so that it can be used as a measure of the functional impact of mutations. To explore how best to meet this requirement, we tested 1000 discrete thresholds in FIS and counted the number of disease-associated and polymorphic variants on either side of each threshold. The recognition accuracy, defined as the percentage of correctly assigned variants, is 79% when false positives and false negatives are balanced (equal fractions) (Figure 2). The summary results of the validation tests and a calibration curve of the functional score are given in Figures 2 and 3 and in the Supplementary Data (Supplementary Table S1).
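The threshold scan described above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the scores are synthetic (drawn from two normal distributions chosen arbitrarily), and the function name `balanced_threshold` and its parameters are assumptions made for this example.

```python
import random


def balanced_threshold(disease, polymorphic, lo=-6.0, hi=6.0, steps=1000):
    """Scan evenly spaced thresholds and return (threshold, sensitivity,
    specificity) where the two rates are closest to each other."""
    best = None
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        # Disease variants scoring above t are counted as correctly assigned.
        sensitivity = sum(s > t for s in disease) / len(disease)
        # Polymorphic variants scoring at or below t are correctly assigned.
        specificity = sum(s <= t for s in polymorphic) / len(polymorphic)
        gap = abs(sensitivity - specificity)
        if best is None or gap < best[0]:
            best = (gap, t, sensitivity, specificity)
    return best[1:]


# Synthetic scores for illustration only; these are not UniProt data.
rng = random.Random(0)
disease_scores = [rng.gauss(2.8, 1.3) for _ in range(1000)]
polymorphic_scores = [rng.gauss(0.9, 1.3) for _ in range(1000)]

threshold, sens, spec = balanced_threshold(disease_scores, polymorphic_scores)
print(f"threshold={threshold:.2f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

At the returned threshold the fraction of misclassified disease variants roughly equals the fraction of misclassified polymorphic variants, which is the "balanced" operating point the text refers to.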

Separation of disease-associated and polymorphic variants by functional impact score. (A) Normalized smoothed distributions of the values of the functional score as computed for 19 179 known 'disease-associated' and 35 608 'common polymorphism' variants and mutations annotated in UniProt (HUMSAVAR, release 2010_08, http://www.uniprot.org/docs/humsavar). (B) The cumulative distributions of the score values computed for disease-associated and polymorphic variants, same data as in (A). An equally balanced separation (79%) between the two variant classes is achieved at a score threshold of FIS ~1.9. At this threshold, ~79% of all disease-associated variants score above the threshold and ~79% of all polymorphic variants score below it. The maximum separation (~80.3%) between the two classes is achieved at a threshold value of 2.26; at this threshold, ~70% of disease-associated variants score higher and 86% of polymorphic variants score lower.


ROC analysis of classification between disease-associated and polymorphic variants. The observed score range (-6, 6) was divided into 1000 discrete thresholds, and for each threshold the percentages of disease-associated and polymorphic variants above and below the score threshold were determined. The percentage of disease-associated variants above the score threshold is defined as 'true positives', while the percentage of polymorphic variants above the score threshold is defined as 'false positives'. The ROC curves were built for two test sets. In the first set, all available ~55.7 K variants (~19.2 K disease-associated and ~36.5 K polymorphic) were used; the scores of variants falling in regions with no sequence homology were taken to be zero. In the second set, the scores were computed for a reduced set of ~27.4 K variants (~13.7 K disease-associated and ~13.6 K polymorphic) using alignments of 75 or more sequences.
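The ROC construction in this caption can be sketched as follows. It is an illustrative reimplementation under stated assumptions: the toy score lists are invented, the function names `roc_points` and `auc` are hypothetical, and the area is computed with the trapezoidal rule, which the source does not specify.

```python
def roc_points(positives, negatives, lo=-6.0, hi=6.0, steps=1000):
    """Return sorted, de-duplicated (false positive rate, true positive rate)
    pairs, one per threshold on an evenly spaced grid."""
    points = set()
    for i in range(steps):
        t = lo + (hi - lo) * i / (steps - 1)
        tpr = sum(s > t for s in positives) / len(positives)
        fpr = sum(s > t for s in negatives) / len(negatives)
        points.add((fpr, tpr))
    return sorted(points)


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy scores for illustration only.
disease_like = [3.1, 2.4, 1.9, 0.7, 2.8]
polymorphic_like = [0.2, 1.1, -0.5, 2.0, 0.9]
print(round(auc(roc_points(disease_like, polymorphic_like)), 3))
```

An area of 1.0 means the score separates the two classes perfectly, and 0.5 means it does no better than chance.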


Robustness analysis

A significant fraction of polymorphic variants falls in regions with low coverage by sequence homologs (Figure 2). Of the total ~55 K variants, ~10.5 K fall in regions with low homology coverage (the MSA has <10 sequences). Among these variants, ~90% were polymorphic and only ~10% were disease-associated. By definition, variants with low homology coverage get low score values. How does the unequal distribution of variants with low homology coverage affect the overall accuracy of separation between disease-associated and polymorphic variants? How does the accuracy of separation depend on the size of the multiple sequence alignment? To answer these questions, we compared the score distributions for disease-associated and polymorphic variants with homology coverage of 1 to 600 or more sequences in a family alignment. We found that the enrichment of low-homology polymorphic variants disappears when the minimal alignment size exceeds 75 sequences per family. With coverage of 75 or more sequences, ~14 K disease-associated and ~14 K polymorphic variants (~1/2 of all tested mutations) are separated with an accuracy better than 76% and an AUC of 0.83 in ROC analysis (Figure 3). We repeated this test with larger minimal alignment sizes. The accuracy of separation between disease-associated and polymorphic variants stayed the same when the minimal alignment size varied from 30 to 350 sequences. Thus, the observed enrichment of polymorphic variants in regions with low homology coverage does not, in practice, affect the discrimination of disease-associated from polymorphic variants. Additional details of the recognition tests at different alignment sizes are given in Supplementary Table S1.

Validation of the FI score on experimentally tested TP53 mutations

Cases in which the predicted functional impact of mutations can be compared with direct measurements of functional activity are of special interest for validating the method. We therefore tested the FI score on data obtained from experimental studies of the functional impact of TP53 mutations as collected in the IARC TP53 database (55). TP53 mutants in cancer can result in both 'loss of function' and, in some cases, 'gain of function' (56). Although the biological effects of TP53 mutations in cancer are influenced by various post-transcriptional factors (56), direct measurements of the transcriptional activity of mutant TP53 are very useful for assessing the impact of TP53 mutations in cancer.

For each of the 2314 mutations, the 'TP53MUTfunction2R15' table of the IARC TP53 database gives eight promoter-specific transcriptional activities measured in yeast functional assays and expressed as a percentage of wild-type activity. Although these eight normalized activities are correlated across all studied mutants, the individual activities measured for a particular TP53 mutant can differ considerably. In particular, the mean value of the standard deviation of the eight activities is ~25%. To reduce the mutation impact variation and experimental errors, we computed mean values of the transcriptional activities and compared the FIS distributions for mutations whose mean activities fell into eight separate bins: [0-20], [20-40], [40-60], [60-80], [80-100], [100-120], [120-140], [140-250] (Figure 4). The activity of normal TP53 equals 100.
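The averaging and binning step can be sketched in a few lines of Python. This is an illustration under assumptions: the eight activity values are invented, the names `mean_activity` and `activity_bin` are hypothetical, and the treatment of values that fall exactly on a bin edge (assigned to the higher bin) is a choice made here because the text does not state it.

```python
BIN_EDGES = [0, 20, 40, 60, 80, 100, 120, 140, 250]


def mean_activity(activities):
    """Average the eight promoter-specific activities (% of wild type)."""
    return sum(activities) / len(activities)


def activity_bin(mean):
    """Return the (low, high) bin that contains a mean activity, or None.
    Edge values go to the higher bin, except 250, which stays in the last."""
    for low, high in zip(BIN_EDGES, BIN_EDGES[1:]):
        if low <= mean < high:
            return (low, high)
    if mean == BIN_EDGES[-1]:
        return (BIN_EDGES[-2], BIN_EDGES[-1])
    return None


# Hypothetical measurements for one mutant; not taken from the IARC database.
example = [5.0, 12.0, 0.0, 30.0, 8.0, 15.0, 2.0, 20.0]
mean = mean_activity(example)
print(mean, activity_bin(mean))
```

Grouping mutants by their mean activity in this way is what allows the score distributions of the eight classes to be compared side by side.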

FIS distributions of mutations in TP53 grouped into eight classes based on mutation impact. The normalized transcriptional activities of 2314 TP53 mutants were averaged and, depending on the mean activity value, the mutations were grouped into eight classes; the ranges of mean transcriptional activity are given under the bin points. The FIS distributions are presented as box plots: thick black lines show the medians of the distributions; each box is drawn between the lower and upper quartiles of the distribution; the dotted lines extend to the minimum and maximum values of the distributions. Mutations with greater functional impact, i.e. with transcriptional activity lower or higher than normal ('loss of function' or 'gain of function'), tend to have higher FIS values.


Dit is duidelik dat mutasies in bakkies [0-20] en [80-100] aansienlik verskil deur hul funksionele impak. Mutasies in bin [0-20] verminder opvallend die transkripsionele aktiwiteit van TP53, terwyl mutasies in bin [80-100] naby aan normaal is. Die verskil tussen twee mutasieklasse word duidelik uitgebeeld deur die ooreenstemmende FIS-verdelings. Die mutasies van bin [0-20] kry aansienlik hoër punte as mutasies van bin [80-100]. Die oppervlakte onder die ontvanger-bedryf-kenmerk-kromme vir twee-klas onderskeid tussen mutasies van bin [0-20] en mutasies van bin [80-100] is naby aan 0.93. Die FIS-verspreidings van bakkies [20-40], [40-60] en [60-80] word verskuif van die hoër na die laer waardes, wat in ooreenstemming is met die toename in transkripsie-aktiwiteit van TP53. Die FIS-verspreidings in bakkies [100-120], [120-140], [140-250] word verskuif van laer waardes na hoër waardes, ook in ooreenstemming met die toename in die transkripsionele aktiwiteit van TP53. Dus, die funksionele impak telling is gekorreleer met eksperimenteel gemeet funksionele impak van mutasies: die telling is hoër vir mutasies wat lei tot 'verlies van funksie' en in 'wins of function' van TP53. Meer besonderhede oor die FIS-verspreidings van TP53-mutasies word in Aanvullende Data S4 gegee.

Functional mutations in the COSMIC database

There are currently ~10.7 K non-synonymous point mutations in various tumors listed in the Catalogue of Somatic Mutations in Cancer (COSMIC, v49). Many of these mutations have been studied experimentally, and their functional impact and role in cancer are reasonably well characterized. For the majority of mutations, however, the functional impact and role in cancer remain unknown.

Cancer mutations can be ranked by the number of occurrences of particular mutations; similarly, the genes implicated in cancer can be ranked by the total number of mutations detected for a particular gene. Obviously, the particular numbers of mutations depend on sampling. However, in general, mutations that promote cancer will be selected more frequently than neutral mutations, and therefore recurrent mutations and recurrently mutated genes are likely to play a key role in cancer. Thus, mutations of frequently mutated genes are likely to be functional.

In this study, we ranked mutations and mutated genes by their potential role in cancer by combining several factors: the predicted impact of a mutation on protein function, the occurrence of an individual mutation in different tumors, the total numbers of mutations detected for a particular gene and the gene’s role in cancer (tumor suppressor or oncogene) provided by a Cancer Gene resource at MSKCC ( 57 ).

To substantiate the ranking of mutations and genes, we conducted three computational tests. In the first test, we tested the hypothesis that recurrently observed cancer mutations are significantly enriched in mutations of predicted high functional impact and can therefore be differentiated from single mutations, many of which are passenger mutations with low functional impact. In the second test, we tested a similar hypothesis, namely that mutations of frequently mutated genes are enriched in predicted functional mutations compared to mutations of singly mutated genes. In the third test, the scores of mutations in tumor suppressors (TS) or oncogenes (OG) were compared to the scores of mutations in genes not annotated as TS or OG. Here we tested the hypothesis that mutations in primary cancer genes (TS and OG) are enriched in high-scoring functional mutations compared to the mutations of non-cancer genes and can therefore be differentiated from all other mutations.

The ‘case and control’ sets of mutations used in these tests are not completely independent because many of the recurrent mutations affect multiply mutated genes with key roles in cancer (TS or OG). However, conducted together, these tests give a more complete picture of the distribution of functional mutations in cancer than any of the tests conducted individually.

The results of the tests are in Figures 5 and 6 .

Cumulative score distributions computed for recurrent cancer mutations in the COSMIC database (release 49, September 2010). The scores were computed for 10 005 unique non-synonymous point mutations affecting 3630 genes. Recurrent cancer mutations observed two or more times (1828) and highly recurrent mutations observed five or more times (712) score significantly higher than mutations observed only once (8177). The ROC analysis (not shown) of the separation of recurrent mutations from one-time-observed mutations gives AUC = 0.75; the accuracy of separation is ∼69% when the percentage of false positives is equal to the percentage of false negatives.

Cumulative score distributions computed for mutations of multiply mutated genes in the COSMIC database. Mutations in COSMIC are distributed non-uniformly across genes: one mutation per gene is detected in 1349 genes; two or more mutations are detected in 620 genes, three or more in 265 genes, five or more in 96 genes, 10 or more in 51 genes, and 19 or more in 37 genes. Multiply mutated genes (mutated two or more times) are enriched in high-score mutations compared to singly mutated genes and polymorphisms.

The FIS score distributions of Figure 5 show that cancer mutations collected in COSMIC are more significantly enriched in high score mutations than are polymorphic variants. Interestingly, recurrent mutations (observed in two or more samples) have a score distribution very close to the score distribution of disease-associated variants, while highly recurrent mutations (observed in five or more samples) are even more significantly enriched in high-score mutations than disease-associated variants ( Figure 5 ). We also found that mutations of singly mutated genes in COSMIC are two times more enriched in high scoring mutations than are polymorphic mutations.

These results confirm the hypothesis that recurrent mutations are likely to be functional mutations and can be differentiated from single mutations by the evolutionary derived functional impact score.

The score distributions of Figures 6 and 7 also confirm that mutations of multiply mutated genes and mutations in annotated tumor suppressors and oncogenes are enriched in functional mutations: multiply mutated genes (mutated two or more times) are more enriched in high-score mutations than singly mutated genes and polymorphisms.

Cumulative score distributions computed for mutations in genes annotated as tumor suppressors and oncogenes in the COSMIC database. The 4413 mutations in tumor suppressors and oncogenes are enriched in high-scoring mutations compared to the 5592 mutations in genes not annotated as TS or OG. The ROC analysis (not shown) of separation of recurrent mutations from one-time-observed mutations gives AUC = 0.6745; the accuracy of separation is 64% when the percentage of false positives is equal to the percentage of false negatives.

We found that the more mutations are observed in a gene, the bigger the fraction of high-scoring mutations in this gene (Figure 6). However, the portion of high-scoring mutations in multiply mutated genes is smaller than in disease-associated variants or in recurrent individual mutations. We found similar results for mutations detected in key cancer genes, i.e. tumor suppressors and oncogenes (Figure 7). Mutations in TSs and OGs score significantly higher than mutations in genes not annotated as TSs or OGs. However, the fraction of high-scoring mutations in TSs and OGs is smaller than in the reference set of disease-associated mutations. Taking into account that the score generally distinguishes functional from non-functional mutations correctly (Figures 2–5), this difference emphasizes that not all mutations in multiply mutated genes or in known cancer genes are automatically functional and, hence, not all of them play a role in cancer. Thus, a functional analysis of mutations is necessary to narrow down a list of potential driver mutations.

Ranked list of cancer mutations and cancer genes

Ranking mutations by a functional impact score makes possible the determination of mutation sets that are enriched by either functional or non-functional mutations. Obviously, there is no strict value of the score that can definitely separate functional and non-functional mutations. However there is a score threshold that separates sets of likely functional and likely non-functional mutations. Using this threshold, one can assess a number of functional mutations in a given set of mutations.

Using available sequence data, the automated procedure could assess a functional impact of ∼10 K unique mutations of the total ∼10.7 K unique mutations of COSMIC database (release 49). Based on the computed scores and the optimal separation threshold ( Figure 4 ), a portion of mutations of high and medium impact can be estimated as ∼51%. A summary of functional analysis of missense mutations from COSMIC database is given in Table 1 and Figure 8 . Note significant enrichment of predicted functional mutations in recurrent mutations, cancer genes (TS or OG) and in genes with multiple mutations.

Ranking mutated genes by significance for cancer. The cancer gene ranking score ($R_s$), derived from information reported in the COSMIC database, is defined as $R_s = \log_2(N_m \cdot N_c)$, where $N_m$ is the number of unique cancer-associated mutations reported in the gene and $N_c$ is the number of different cancer types with mutations in this gene. All 3629 analyzed genes were divided into four categories depending on the presence or absence of predicted functional mutations and known association with cancer (a gene is considered cancer associated if it is annotated as TS or OG, or if it interacts with one or more TS or OG). Cancer-associated genes are enriched with predicted functional mutations ($P < 10^{-20}$ in a two-tailed Fisher test) compared to genes with unknown cancer association. Using a reasonable cutoff, one nominates a list of 957 genes with significance for cancer (arrow). A gene is above the cut either because it is observed to be multiply mutated ($R_s > 1$, three or more mutations) or, for $R_s = 1$ (two mutations), if at least one of the mutations in the gene is predicted as functional. Detailed statistical information on mutated genes is in Supplementary Table SM2. The higher proportion of genes with at least one predicted functional mutation (orange or brown) in frequently mutated genes (peak at left) is not surprising; in fact, a fair number of these mutations have been functionally validated in the literature. A particularly interesting set of genes (998, bottom left) are those that (so far) have been observed just once ($R_s = 0$) but contain a mutation predicted to be functional. Such genes may be rare, but functionally significant, contributors to oncogenesis and are good candidates for experimental follow-up.
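The ranking score and cutoff rule described in this caption can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `ranking_score` and `above_cutoff` are hypothetical, and the cutoff is expressed on the product of the two counts (a product above 2 corresponds to a score above 1, a product of exactly 2 to a score of 1).

```python
import math


def ranking_score(n_mutations, n_cancer_types):
    """Gene ranking score: log2 of (unique mutations x cancer types)."""
    return math.log2(n_mutations * n_cancer_types)


def above_cutoff(n_mutations, n_cancer_types, has_predicted_functional):
    """Apply the cutoff rule from the caption (hypothetical helper).

    A gene passes if its score is above 1, or if its score equals 1 and
    at least one of its mutations is predicted to be functional.
    """
    product = n_mutations * n_cancer_types
    if product > 2:        # score > 1
        return True
    if product == 2:       # score == 1
        return has_predicted_functional
    return False           # score == 0: observed just once


# Illustrative counts only (not taken from Supplementary Table SM2).
print(ranking_score(4, 2))          # gene with 4 unique mutations in 2 cancer types
print(above_cutoff(2, 1, True))     # two mutations, one predicted functional
print(above_cutoff(1, 1, True))     # observed once: below the cut
```

Genes in the last situation (observed once, but carrying a predicted functional mutation) are the ones the caption singles out as candidates for experimental follow-up, even though they fall below the cut.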

Prediction of the functional impact of mutations observed in cancer (COSMIC database a)

Mutations/Genes | All scored (taken as 100%), n | Neutral impact b (FIS ≤ 0.8), n (%) | Low impact (0.8 < FIS ≤ 1.9), n (%) | Medium impact (1.9 < FIS ≤ 3.5), n (%) | High impact (FIS > 3.5), n (%)
Mutations total | 10 005 | 2049 (20) | 2814 (28) | 3748 (37.5) | 1349 (13.5)
Mutations observed ≥2 times | 1828 | 89 (5) | 254 (14) | 862 (47.2) | 623 (34.1)
Mutations observed 1 time | 8177 | 1960 (24) | 2560 (31) | 2886 (35.3) | 771 (9.4)
Mutations in one-time-mutated genes not annotated as TS c or OG c | 2324 | 699 (30) | 777 (33) | 693 (29.8) | 155 (6.7)
Mutations in genes mutated ≥5 times | 5174 | 616 (12) | 1198 (23) | 2314 (44.7) | 1046 (20.2)
Mutations in TS or OG | 4413 | 477 (11) | 996 (23) | 2000 (45.3) | 940 (21.3)
Mutations in genes not annotated as TS or OG | 5592 | 1572 (28) | 1818 (33) | 1748 (31.3) | 454 (8.1)
Genes total d | 3629 | 841 (23) | 1115 (31) | 1268 (34.9) | 405 (11)
Genes annotated as TS or OG | 338 | 50 (15) | 124 (37) | 92 (27.2) | 72 (21)
Genes mutated ≥5 times | 188 | 2 (1) | 9 (5) | 96 (51.1) | 81 (43)

a Of the total 10716 missense mutations in COSMIC 45, 10 005 mutations were mapped on sequences of UniProt, and determined as unique non-synonymous.

b Approximately 50% of polymorphic variants and ∼7% of disease-associated variants got FIS score <0.8; ∼27% of polymorphic variants and ∼14% of disease-associated variants got FIS score between 0.8 and 1.9; ∼17% of polymorphic variants and ∼50% of disease-associated variants got FIS score between 1.9 and 3.5; ∼21% of polymorphic variants and ∼79% of disease-associated variants got FIS score >1.9; ∼3% of polymorphic variants and ∼30% of disease-associated variants got FIS score >3.5.

c TS and OG stand, respectively, for tumor suppressor and oncogene.

d A gene is scored according to the highest FIS bin of any of its mutations.
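The impact classes in the table follow directly from the FIS thresholds in its column headers, and footnote d gives the rule for scoring a gene. A minimal sketch of both is shown below; the function names are hypothetical, and the example scores are made up for illustration.

```python
def impact_class(fis):
    """Map a functional impact score (FIS) to the impact class of the table."""
    if fis <= 0.8:
        return "neutral"
    if fis <= 1.9:
        return "low"
    if fis <= 3.5:
        return "medium"
    return "high"


def gene_impact_class(fis_scores):
    """Score a gene by the highest FIS bin of any of its mutations (footnote d)."""
    return impact_class(max(fis_scores))


# Illustrative scores only (not taken from Supplementary Table SM1).
print(impact_class(2.4))                    # a single mutation
print(gene_impact_class([0.2, 2.4, 1.0]))   # a gene with three scored mutations
```

With these thresholds, the medium and high classes together correspond to the predicted functional mutations, which is how the table's 37.5% and 13.5% add up to the ∼51% quoted in the text.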

In Supplementary Table SM1 , we present a list of COSMIC mutations (10 005) for which the functional impact score was computed. For each of the mutations, the table provides its sequence and genomic coordinates, the functional impact characteristics of the mutation, the characteristics of the protein domain family, the statistics of cancer mutations in a gene and the basic oncogenic annotations, the URL links presenting mutation in context of MSA and homologous 3D structures of PDB.

We used our assessments of mutation impact to rank genes by their significance for cancer. To that end, we divided all genes into four categories, taking into account the presence or absence of predicted functional mutations in a gene and the gene’s known associations with cancer. Genes annotated as TS or OG and genes interacting with TS or OG genes were defined as associated with cancer; gene interactions were taken from the PIANA database (58), and cancer annotations were taken from the Cancer Gene resource at MSKCC (57).

Genes were classified into the following categories: (i) genes with functional mutations and known cancer association; (ii) genes with functional mutations and no available associations with cancer; (iii) genes with no functional mutations and with known cancer association; and (iv) genes with no functional mutations and no available associations with cancer. It is reasonable to assume that the more unique mutations are detected in a gene, and the more cancer types are affected by these mutations, the more important this gene is for the development of cancer. Therefore, we used as a gene ranking score the product of the ‘number of unique mutations’ and the ‘number of different cancer types’ affected by these mutations. Note that truncating mutations, i.e. premature stop codons (so-called nonsense mutations), are not taken into account.

The ranked list of 3629 genes is given in Supplementary Table SM2. Distributions of the gene ranking scores are in Figure 8. Genes with multiple mutations and genes with cancer associations are at the top of the list. Based on this ranking, we nominated ∼957 genes as genes with very likely cancer implications (Figure 8). These genes are of primary interest for experimental cancer genomics projects. The specific oncogenic roles of many of the top-scoring mutated genes are known or being studied. However, there are many genes of ‘moderate significance’, i.e. mutated approximately two times in approximately two cancers, for which the specific oncogenic role is not yet well determined. A particularly interesting set of genes (Figure 8, bottom left) are those that have been observed just once, as reported in the COSMIC database, but contain a mutation predicted to be functional. Such genes may be rare, but functionally significant, contributors to oncogenesis and are good candidates for experimental follow-up.

Switch-of-function: a new type of functional impact?

The effect of a functional mutation can be described as a change in the specificity (selectivity) of interactions between a mutated protein and its specific interactors: proteins, nucleic acids or small molecules. One can imagine a set of free energies of interaction between a given protein and all other proteins and ligands. As a result of a mutation, the native spectrum of the binding free energies will change. An extreme example of a strong functional impact of a mutation is the destabilization of a protein globule, resulting in the complete loss of specificity, i.e. a ‘loss of function’ (LOF). The opposite example of a functional impact is a ‘gain of function’ (GOF), which can result from a change in the specificity of particular protein-substrate interactions or a change in the specificity of interactions with regulatory proteins. Both LOF and GOF mutations involve changes in the free energies of interaction with native binders.

However, a mutation impact can also result in a ‘switch of function’ (SOF), which is an acquisition of new specific interactors and, consequently, a new biological function. A mutation in a protein-binding site can result in new specific interactions. One mutation in a binding site of isocitrate dehydrogenase 1 (IDH1) that resulted in a switch of molecular function was recently discovered in glioma ( 44 ). Mutations of R132 to C, H, L or S alter the activity of IDH1, such that isocitrate is no longer converted to alpha-ketoglutarate, but, instead, alpha-ketoglutarate is converted to R(-)-2-hydroxyglutarate, which elevates the risk of brain tumors ( 44 ). The affected position, R132 is highly conserved in the protein family alignment and all the above mutations get a high FIS score.

In cells, there are many families of homologous proteins (and protein domains); each protein in such a family has its own specific function and specific interactors. Mutations can switch the specific interactions between members of a protein family, resulting in a drastic impact on the phenotype. Mutations in evolutionarily selected specificity residues are likely candidates for SOF. The switch of the specific signaling of Rho GTPases caused by mutations was studied experimentally by Heo and Meyer (59). All of the experimentally studied functional mutations reported in (59) have a high specificity component of the FIS score. In Figure 9, we show the multiple sequence alignment and the 3D position of one of the key mutations that switch the Rac1-signaling phenotype (lamellipodia) to the Cdc42-signaling phenotype (filopodia) (59). This mutation affects one of the key predicted specificity positions of the Rho family.

Functional mutation in a predicted specificity position of RAC1 (Ras-related C3 botulinum toxin substrate 1). (A) The mutation affects a residue that is conserved as A (Ala) in subfamily #1 (top sequences, close homologues of RAC1) and as E (Glu) in subfamily #2 (bottom sequences, close homologues of CDC42). The UniProt name, species identifier, residue number range and subfamily number are in the left columns. The sequence subfamilies and specificity scores (vertical bars at top) were computed from a non-redundant MSA (multiple sequence alignment) of 274 sequences using CEO clustering. The mutation A95E of RAC1 has a high specificity score in RAC1. (B) The position affected by the mutation is in the binding interface of RAC1 in contact with the T-lymphoma invasion and metastasis factor 1 (Tiam1) (PDB code 1foe).

The mutation-caused rewiring of protein interaction networks is of high interest in cancer, where missense mutations are one of the common factors of oncogenesis. Currently, it is impossible to determine such mutations by direct de novo modeling. However, one can narrow down a list of potential SOF mutations by identifying mutations in binding sites that change the identity of an amino acid residue located in one of the functional, evolutionarily selected (predicted specificity) positions and, especially, by identifying mutations that change amino acid identities between already existing groups (classes) of residue specificity. We estimated the number of such mutations by identifying mutations that fall into binding sites and that have a high specificity score and a low conservation score. Among the ∼10 K mutations of COSMIC, 3631 affect annotated functional regions and binding sites and, among those, 554 mutations (∼5%) have a specificity score >2.5 (top 25%) and a conservation score less than the specificity score. This set of mutations is enriched in potential SOF mutations. A few examples of putative SOF mutations are given in Supplementary Data S5.
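The selection criteria described above amount to a simple three-part filter. The sketch below shows one way to express it; the record type `ScoredMutation`, its field names and the example values are assumptions made for illustration, not the format of the supplementary tables.

```python
from typing import NamedTuple


class ScoredMutation(NamedTuple):
    """Hypothetical record for one scored mutation."""
    label: str
    in_binding_site: bool   # falls in an annotated functional region or binding site
    specificity: float      # specificity component of the score
    conservation: float     # conservation component of the score


def is_sof_candidate(mutation, min_specificity=2.5):
    """Return True if the mutation meets the three criteria from the text."""
    return (
        mutation.in_binding_site
        and mutation.specificity > min_specificity
        and mutation.conservation < mutation.specificity
    )


# Made-up values, for illustration only.
mutations = [
    ScoredMutation("example-1", True, 3.1, 1.2),    # meets all three criteria
    ScoredMutation("example-2", True, 3.0, 3.4),    # conservation exceeds specificity
    ScoredMutation("example-3", False, 3.3, 0.9),   # outside any binding site
    ScoredMutation("example-4", True, 2.5, 1.0),    # specificity not above threshold
]

candidates = [m.label for m in mutations if is_sof_candidate(m)]
print(candidates)
```

A filter like this only enriches for potential switch-of-function mutations; as the text notes, it does not establish that any individual mutation acquires a new function.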


Identifying the mutations

Figure 7: Identifying sequence mutations

Using the worksheets, the students will compare a section of DNA sequence from a healthy cell and a tumour cell from the same patient. The easiest way to identify whether a mutation has occurred is to write the DNA sequence below the coloured peaks (there is a colour key on the sheet to help) and to compare the written sequences.

If one of the letters is different (a peak has changed colour), this indicates a mutation in the sequence. In Figure 7 (right), the A in the DNA sequence from the healthy cell has been replaced by G in the tumour cell.
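The comparison the students do by hand, writing out both sequences and looking for a changed letter, can also be sketched in code. This is only an illustration: the function name `find_substitutions` and the short sequences are made up and are not taken from the worksheets.

```python
def find_substitutions(healthy, tumour):
    """List (position, healthy_base, tumour_base) for every differing base.

    Positions are 1-based. Only substitutions are handled, so the two
    sequences must have the same length.
    """
    if len(healthy) != len(tumour):
        raise ValueError("sequences must be the same length")
    return [
        (index + 1, h, t)
        for index, (h, t) in enumerate(zip(healthy, tumour))
        if h != t
    ]


# Made-up example: the A at position 3 is replaced by G in the tumour cell.
print(find_substitutions("CTAGA", "CTGGA"))
```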


Figure 9: Ticking off the gene regions that have been checked and marking any mutations

Image courtesy of the Wellcome Trust Sanger Institute Communication and Public Engagement team

Figure 8: A heterozygous mutation

Images courtesy of the Wellcome Trust Sanger Institute Communication and Public Engagement team

If the students find a double peak at one base position, they should record the two alternative bases at that position, one above the other. In Figure 8, the healthy DNA sequence has a G, whereas the tumour sequence has both G and C. This is not an insertion: it represents a heterozygous mutation, in which only one of the two copies of the gene has a C in place of the G.

All students should indicate the gene regions they have checked by ticking off the relevant region on the gene sheet (see Figure 9).

Students who find a mutation should indicate the specific base by circling it on the gene sheet (see Figure 9, left) and make a note of which codon this lies in (in this example, codon 12).

They should also fill in the table at the base of the worksheet, using the codon wheel to translate the DNA sequence into the amino acid, as shown in Table 1:

Table 1: Mutations as listed on the individual worksheets
Amino acid number | Healthy cell DNA sequence | Tumour cell DNA sequence | Healthy cell amino acid | Tumour cell amino acid
12 | GGT | GTT | Glycine (G) | Valine (V)

When all mutations have been found, record them on the summary data sheet (see Table 2).

Table 2: Mutations as recorded on the summary data sheet
Amino acid number | Healthy cell DNA sequence | Tumour cell DNA sequence | Healthy cell amino acid | Tumour cell amino acid
12 | GGT | GTT | G (glycine) | V (valine)
13 | GGC | GAC | G (glycine) | D (aspartic acid)
30 | GAC | GAT | D (aspartic acid) | D (aspartic acid)
61 | CAA | CGA | Q (glutamine) | R (arginine)
146 | GCA | CCA | A (alanine) | P (proline)
173 | GAT | GAC | D (aspartic acid) | D (aspartic acid)

Discussing the results

The results above are all single base substitutions. These mutations within the protein-coding region of the KRAS gene may be classified into one of three types, depending on the information encoded by the altered codon.

  • Silent mutations code for the same amino acid.
  • Missense mutations code for a different amino acid.
  • Nonsense mutations code for a stop and can truncate the protein.

Discuss whether the mutations are significant – will they have an impact on protein function or are they ‘silent’? In this activity, codons 30 and 173 are silent and therefore do not have a functional impact.

Table 3: Type of mutation, as recorded on the summary data sheet
Amino acid number | Healthy cell DNA sequence | Tumour cell DNA sequence | Healthy cell amino acid | Tumour cell amino acid | Type of mutation | Significant (yes / no)
12 | GGT | GTT | G (glycine) | V (valine) | Point (missense) | yes
13 | GGC | GAC | G (glycine) | D (aspartic acid) | Point (missense) | yes
30 | GAC | GAT | D (aspartic acid) | D (aspartic acid) | Point (silent) | no
61 | CAA | CGA | Q (glutamine) | R (arginine) | Point (missense) | yes
146 | GCA | CCA | A (alanine) | P (proline) | Point (missense) | yes
173 | GAT | GAC | D (aspartic acid) | D (aspartic acid) | Point (silent) | no
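The codon-wheel lookup and the silent / missense / nonsense classification can be reproduced with a short script. This is a sketch, assuming the standard genetic code written for DNA codons (bases in the order T, C, A, G, with `*` marking a stop codon); the function name `classify` is hypothetical, and the codon pairs are the six recorded on the summary data sheet.

```python
from itertools import product

# Standard genetic code for DNA codons, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(healthy_codon, tumour_codon):
    """Classify a single-codon substitution as silent, missense or nonsense."""
    before = CODON_TABLE[healthy_codon]
    after = CODON_TABLE[tumour_codon]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


# Codon number -> (healthy codon, tumour codon), as on the summary data sheet.
observed = {
    12: ("GGT", "GTT"),
    13: ("GGC", "GAC"),
    30: ("GAC", "GAT"),
    61: ("CAA", "CGA"),
    146: ("GCA", "CCA"),
    173: ("GAT", "GAC"),
}

for number, (healthy, tumour) in observed.items():
    print(number, CODON_TABLE[healthy], CODON_TABLE[tumour], classify(healthy, tumour))
```

None of the six mutations in this activity is a nonsense mutation; a change such as CAA to TAA, which turns a glutamine codon into a stop codon, would be classified as one.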


Figure 10: A 3D representation of the KRAS protein. Amino acids 12 (blue), 13 (yellow), 61 (orange) and 146 (pink) are those which carry mutations

Image courtesy of the Wellcome Trust Sanger Institute Communication and Public Engagement team, created with RasMol

The presentation has a 3D space-fill image of the KRAS protein (Figure 10, right). Slides 26–30 show where on the protein the significant mutations are, and you will notice they are all in the same region. Codons 12, 13 and 61 were the first mutations to be associated with oncogenic transformation in the KRAS protein; mutation 146 was only discovered in 2005. Use these slides to discuss the impact that the mutations could have on protein structure and KRAS’s function in growth signalling.

As an optional activity, the students can use RasMol, the molecular modelling software used to create the images on slides 26–30, to highlight the mutated amino acids in the protein structure. See the teacher notes w6 for details.

