Range of feasible coefficients in an unlimited growth model



Suppose you are given an unlimited growth model of the form:

$\frac{dP(t)}{dt} = k P(t)$

Of course, population growth will never be truly unlimited, but let us suppose for the moment that we introduce a species into an environment where unlimited growth is possible, at least for a given time -- i.e. an invasive species.

$k$ is a measure of the growth rate of the population at time $t$, denoted by $P(t)$.
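The model above has the closed-form solution $P(t) = P(0)e^{kt}$, which can be checked numerically. The values below ($P_0 = 100$, $k = 0.5$ per year) are illustrative only, not taken from the question:

```python
import math

# Closed-form solution of the unlimited growth model dP/dt = k*P:
#   P(t) = P(0) * exp(k*t).
# P0 = 100 and k = 0.5 per year are illustrative values.
def population(p0, k, t):
    return p0 * math.exp(k * t)

p3 = population(100.0, 0.5, 3.0)    # population after 3 years

# Cross-check with a crude forward-Euler integration of dP/dt = k*P.
p, dt = 100.0, 0.001
for _ in range(3000):               # 3000 steps of 0.001 -> t = 3
    p += 0.5 * p * dt
```

The Euler estimate lands within a fraction of a percent of the closed form, confirming the exponential solution.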

What are some feasible values of $k$? In other words, how far above or below the plausible range would a value have to be before I could tell that the research I am reading is ridiculously off base?

I am sure this differs for different kinds of animals, including mammals, birds, bacteria, etc.


A hard limit: $k$ must be greater than zero -- unless you are talking about some cannibalistic species or something else that does not fit the model at all.

As long as the species is productive in the new environment, $k$ is greater than 1: the population is presumably growing, or you probably would not be using an exponential growth model in the first place.

As mentioned before, you will need to know the species for more precise figures. But looking at generation times and litter sizes:

  • Some bacterial generation times (from here) range from 10 to 2000 minutes (33 hours). That is $k=2$ per generation time. Per day, you are looking at a lower bound of about 2 per day and an upper bound of $k=2^{144}\approx 10^{43}$ per day.
  • Mice have a generation time of roughly 12 weeks and a litter of about 10, which gives something like $k=10^{4}$ per year.
  • Elephants have one calf every 25 years, so $k=16$ per century or so.
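The bacterial arithmetic in the list above can be sketched directly: raise the per-generation factor to the number of generations per day. The generation times are the rough figures from the text:

```python
# Rough per-day growth factors from generation times, mirroring the
# bullet-list arithmetic above (all numbers are rough assumptions).
def per_day_factor(per_generation, generation_minutes):
    generations_per_day = 24 * 60 / generation_minutes
    return per_generation ** generations_per_day

fast = per_day_factor(2, 10)     # 144 doublings/day -> about 2**144 ~ 1e43
slow = per_day_factor(2, 2000)   # ~0.72 doublings/day -> about 1.6 per day
```

The six-order-of-magnitude-per-hour spread between the two bounds is exactly why a stated $k$ is only meaningful together with the organism's generation time.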

Of course, this is all based on crude assumptions. But you are looking for guidelines to spot an unrealistic model, so hopefully they will do.


Household dairy production and child growth: Evidence from Bangladesh

Milk production/consumption is strongly linked to child growth in European and African populations.

This is the first study to examine dairy production and child growth in an Asian setting.

Uses unique survey data to exploit quasi-experimental variation in exposure to milk production.

Milk production raises children's HAZ scores by 0.52 standard deviations in the 6–23 months age group.

However, milk production is associated with a 20-point decline in exclusive breastfeeding.


Background

Carbon catabolite repression (CCR) is the main mechanism controlling carbohydrate uptake in bacteria, and thus also whether different carbon sources are metabolized in parallel or sequentially. Although it is described as a paradigm of the regulation of bacterial metabolism, the underlying mechanisms remain disputed (see [1, 2]). The system shows a high level of complexity involving metabolism, gene expression and signal processing. A typical example of CCR is the phenomenon of diauxic growth (Fig. 1).

Diauxic growth as a typical manifestation of carbon catabolite repression in bacteria (experimental data taken from [14]). The plot shows the sequential uptake of glucose (blue circles) and lactose (blue squares) by Escherichia coli growing on a mixture of carbon sources. This leads to a two-stage accumulation of biomass (red circles), first at a high growth rate (on glucose) and then at a lower growth rate (on lactose), until all carbon sources are exhausted.

Different hypotheses concerning the dynamic functioning of the system have been explored with a variety of modelling approaches [2]. The aim of this study is to compare these hypotheses and their ability to capture several key characteristics of CCR within a single modelling framework. To this end, based on a (simple) core model structure with only four intracellular metabolites, we developed an ensemble of model variants, all of which show diauxic growth behaviour during batch cultivation on two substrates. The model variants differ only in a few structural properties and have only a small number of free parameters. Using small models with few parameters allows us to focus on the underlying network structure when comparing the different model variants.
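The diauxic pattern itself can be reproduced with a few lines of code. The sketch below is a generic Monod-type toy model, not one of the paper's ensemble variants: lactose uptake is suppressed by an inhibition factor standing in for catabolite repression, and all parameter values are made up:

```python
# A generic Monod-type diauxie sketch, NOT one of the paper's model
# variants: biomass x grows on glucose s1 first; lactose s2 uptake is
# suppressed by an inhibition factor ki/(ki + s1) standing in for
# catabolite repression.  All parameter values are invented.
def simulate(s1=1.0, s2=1.0, x=0.01, dt=0.01, steps=4000):
    mu1, mu2, ks, ki, yield_ = 1.0, 0.5, 0.05, 0.01, 0.5
    traj = []
    for _ in range(steps):
        r1 = mu1 * s1 / (ks + s1) * x                   # glucose uptake
        r2 = mu2 * s2 / (ks + s2) * ki / (ki + s1) * x  # repressed lactose uptake
        s1 = max(s1 - r1 * dt / yield_, 0.0)
        s2 = max(s2 - r2 * dt / yield_, 0.0)
        x += (r1 + r2) * dt
        traj.append((s1, s2, x))
    return traj

traj = simulate()
glucose_gone = next(i for i, (a, _, _) in enumerate(traj) if a < 0.05)
```

In this run glucose is nearly exhausted while most of the lactose remains; lactose is then consumed in a second, slower phase until biomass reaches its final yield, qualitatively matching the two-phase pattern of Fig. 1.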

Ensemble modelling approaches have been used to explore and analyse different model structures and/or different sets of parameters (see [3, 4] for examples of ensemble modelling). As can be seen in Fig. 2, we extend the scope of ensemble modelling in the present study by adding another dimension. Instead of restricting the ensemble of model variants to static or dynamic models that quantitatively describe the mechanisms of carbohydrate assimilation and its regulation, we also introduce model variants that make up for a lack of mechanistic information by employing different (linear and non-linear) optimization programs, applied either statically or dynamically. An important representative of such optimization-based models is the flux balance model [5, 6].

Overview of the ensemble modelling strategy employed in this study. We distinguish not only between the type of model equations (static or dynamic) but also between mechanistic models and models based on (linear or non-linear) optimization. The vertical axis reflects the increasing complexity of the optimization program: a non-linear problem is harder to solve than a linear program. The zero of this axis corresponds to models without optimization. Abbreviations used: AE (algebraic equations), ODE (ordinary differential equations), FBA (flux balance analysis), dFBA (dynamic flux balance analysis)

The models in the ensemble can be categorized according to regulatory, stoichiometric and physiological constraints, and differ from each other in only a single aspect. We distinguish four groups of models: (1) flux balance models that define reaction kinetics only for substrate uptake and by-product excretion, (2) kinetic models including the effect of growth dilution, (3) kinetic models with regulation at the metabolic and/or genetic level, and (4) resource allocation models. Figure 3 gives an overview of all model variants considered here. To quantify the output of the models and to enable a comparison of the model variants, the diauxic growth index $d$ is introduced, which indicates the degree of sequential utilization of the two carbon sources.

Overview of all model variants in the ensemble, divided into four groups: constraint-based models, kinetic models with growth dilution, kinetic models with regulatory mechanisms, and resource allocation models

To further evaluate the performance of the models, we analysed two new experimental conditions: a glucose pulse applied to a culture growing on minimal medium with lactose, and a batch culture with unequal initial conditions for the transport systems. In the latter case, the less preferred substrate, lactose, is used in the preculture, so the respective enzymes are abundant at the start of the experiment. By comparing experimental data for these two conditions with model predictions, a number of models can be excluded, while other model variants remain indistinguishable.

Based on the analysis, we conclude that models incorporating known regulatory mechanisms, such as inducer exclusion and activation of the expression of transporters and enzymes by a global transcription factor [1], are best able to account for the different experiments. However, a precise quantitative explanation of the control of carbohydrate uptake and metabolism likely involves a superposition of several different molecular mechanisms acting on different time scales. The exact contribution of each individual mechanism during a specific stimulation of the system remains to be determined.


Logistic population growth levels off at a carrying capacity

To consider how resource limitation affects population growth, we need to incorporate the concept of carrying capacity, the maximum population size that the environment can sustain. Any individuals born into the population increase its size, unless the number of deaths balances or exceeds the number of births. If the population size stays the same from one generation to the next, individuals must be dying at roughly the same rate at which they are born. With exponential population growth, the population growth rate $r$ was constant; with the addition of a carrying capacity imposed by the environment, the growth rate slows as the population size increases, and growth stops when the population reaches carrying capacity.

When resources are limited, populations exhibit logistic growth. In logistic growth, population expansion slows as resources become scarce, and it levels off when the carrying capacity of the environment is reached, resulting in an S-shaped curve. Source: OpenStax Biology

Mathematically, we can achieve this by incorporating a density-dependent term into the population growth equation, where $K$ represents carrying capacity:

$\frac{dN}{dt} = rN\left(\frac{K-N}{K}\right)$

Now the equation shows the population growth rate $r$ modified by the density-dependent term, $(K-N)/K$.

What happens to population growth when $N$ is small relative to $K$? When $N$ is close to $K$? And when does the population add the most individuals per generation?
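The three questions above can be answered numerically with a short Euler integration of the logistic equation; the values of $r$ and $K$ below are illustrative, not from the text:

```python
# Euler integration of logistic growth dN/dt = r*N*(K - N)/K.
# r = 0.5 and K = 1000 are illustrative values.
r, K, dt = 0.5, 1000.0, 0.01
n = 10.0
growth_at = {}                      # growth rate recorded at each (rounded) N
for _ in range(4000):               # integrate to t = 40
    dn = r * n * (K - n) / K
    growth_at[round(n)] = dn
    n += dn * dt

# When N << K, the factor (K - N)/K is near 1, so growth is nearly
# exponential; growth tends to 0 as N approaches K; the absolute growth
# rate peaks at N = K/2.
peak_n = max(growth_at, key=growth_at.get)
```

With these values, the fastest addition of individuals occurs near $N = 500$, i.e. at half the carrying capacity, and the population settles at $K$.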


Materials and Methods

Whale sharks have spot and stripe patterns on the body that are unique to individuals, and photographs of these patterns taken by snorkelers or divers can be used as identification markers in mark-recapture studies (Meekan et al., 2006; Holmberg et al., 2009). We combined this photo-identification approach with stereo-video and photogrammetry (Sequeira et al., 2016) to calculate the body size of individuals through time. This non-invasive approach to estimating growth rates is viable for whale sharks because they form predictable aggregations in shallow coastal waters of tropical and warm subtropical regions, and show a high degree of site fidelity to these aggregations, with some individuals recorded sporadically over periods of up to 20 years (Meekan et al., 2006; Norman and Morgan, 2016; Sequeira et al., 2016).

Data Collection

We collected annual length measurements of whale sharks at Ningaloo Reef, Western Australia (22° S, 113° E), from 2009 to 2019, during the annual peak in whale shark numbers, typically in the first week of May each year (Supplementary Figure 1). Once sharks were located (usually by aerial survey), snorkelers entered the water and (i) took high-resolution identification (ID) photographs of the flank above each pectoral fin, from the fifth gill slit to the posterior tip of the pectoral fin, on both sides (Speed et al., 2007), (ii) assessed sex by the presence or absence of claspers, and (iii) recorded full-body video sequences using a diver-operated stereo-video system (DOVS)1 to obtain measurements of body length. Every sighting of a whale shark received an individual identification code. We used the ID photographs to identify repeat sightings and measurements of individual whale sharks within and between years, based on distinctive patterns of spots or stripes (Meekan et al., 2006). The DOVS comprised two Canon HFG25 (25 frames per second, 1920 × 1080 pixel resolution, wide field of view) or GoPro Hero 4 Black cameras (30 frames per second, 1920 × 1080 pixel resolution, medium field of view) mounted ~0.85 to 1 m apart at an inwardly converged angle (~4°), set in a custom housing designed to maintain calibration stability. The separation distance between the cameras used here was greater than in conventional systems designed for fin-fish, as a larger separation distance enables more accurate measurements of targets that are likely to be further away (Boutros et al., 2015). Before each field trip, camera calibrations were carried out on a large calibration square (~2 × 2 m) at the bottom of a swimming pool, measured at a distance of ~5 m to match the likely range of targets in the field.
Calibrations were performed using the CAL software (see "Footnote 1"), and calibration accuracy was verified post-calibration by measuring known lengths on a scale bar in the pool. We measured the fork length (FL) and TL of individual whale sharks using the EventMeasure software (see "Footnote 1"). We found that FL was a more consistent and reliable measurement, as TL was more prone to bias from tail flexion. However, to improve comparability with other published studies of whale sharks, we converted estimates for all resighted individuals to TL using the derived FL–TL relationship (calculated using paired estimates across individuals fitted with a linear model; Figure 1).

Figure 1. Linear relationship predicting total length from fork length for whale sharks, based on 3 years of sampling and 124 paired shark measurements. Whale shark outline traced from Rohner et al. (2011).
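The FL-to-TL conversion step described above amounts to fitting a line to paired measurements and applying it to FL-only records. The sketch below uses invented data and coefficients, not the paper's fitted relationship:

```python
# Hypothetical sketch of the FL -> TL conversion: fit TL = a*FL + b by
# ordinary least squares on paired measurements, then convert an FL-only
# record.  The data and coefficients are invented, not the paper's fit.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

fl = [3.0, 4.0, 5.0, 6.0]          # fork lengths (m), invented
tl = [3.7, 4.9, 6.1, 7.3]          # total lengths: exactly 1.2*FL + 0.1
a, b = fit_line(fl, tl)
tl_converted = a * 5.5 + b         # convert a shark measured only by FL
```

Because the synthetic points lie exactly on a line, the fit recovers the slope and intercept exactly; with real paired measurements, the residual scatter would feed into the uncertainty of every converted TL.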

Analysis of Growth

We estimated sex-specific growth profiles for whale sharks from size differences between sightings, fitting the von Bertalanffy growth model (VBGM) to estimate the parameters L∞ (the mean asymptotic TL in m) and the growth coefficient K (which describes the curvature of growth towards L∞, in units of year⁻¹). We used the VBGM formulation adapted for mark-recapture data (Fabens, 1965; Francis, 1988):

$\Delta L = (L_\infty - L_1)\left(1 - e^{-K \Delta t}\right)$

where L1 is the TL at first sighting, ΔL is the difference in TL between first and final sightings, and Δt is the time at liberty (in decimal years) between first and last sightings of an individual. We solved for the variables K and L∞ using non-linear least-squares estimation via the nls() function in R (Baty et al., 2015). Plots of growth models were constrained (y-intercept) to 0.6 m TL to reflect estimates of size at birth (Joung et al., 1996), and growth parameters were estimated separately for males and females, because the numerical dominance of males would otherwise bias the interpretation of a combined growth profile. Models were fitted with parameter ranges set at 0.0–0.5 year⁻¹ for K and from 6 m upward for L∞. Confidence intervals were determined by 10,000 iterations of bootstrap resampling. To examine whether differences in sex-specific growth were simply caused by the low number and limited range of female sightings, we repeated the model fit for males while including only individuals matching the initial length range (4–7 m TL) and time at liberty (1–5 years) of the female sightings.
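The Fabens fit described above can be sketched without R's nls(): minimize the sum of squared errors of ΔL over (L∞, K). The recapture triples below are synthetic, generated from invented values (L∞ = 14 m, K = 0.04 year⁻¹, within the stated parameter ranges but not the paper's estimates), and a coarse grid search stands in for proper non-linear least squares:

```python
import math

# Fabens (1965) mark-recapture form of the von Bertalanffy growth model:
#   dL = (Linf - L1) * (1 - exp(-K * dt))
# Synthetic data from invented true values Linf = 14 m, K = 0.04 / year;
# a coarse grid search stands in for the paper's nls() fit in R.
def fabens(l1, dt, linf, k):
    return (linf - l1) * (1.0 - math.exp(-k * dt))

data = [(l1, dt, fabens(l1, dt, 14.0, 0.04))        # (L1, dt, dL) triples
        for l1, dt in [(4.0, 2.0), (5.5, 3.0), (7.0, 5.0), (8.0, 1.5)]]

best = None
for linf in [12.0 + 0.1 * i for i in range(41)]:     # 12.0 .. 16.0 m
    for k in [0.01 + 0.005 * i for i in range(19)]:  # 0.01 .. 0.10 / year
        sse = sum((dl - fabens(l1, dt, linf, k)) ** 2 for l1, dt, dl in data)
        if best is None or sse < best[0]:
            best = (sse, linf, k)
sse, linf_hat, k_hat = best
```

The grid search recovers the generating parameters; in practice a gradient-based non-linear least-squares routine (as nls() provides) converges far more precisely and also yields standard errors.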

We compared our data set of individual growth trajectories with published estimates of whale shark growth profiles (Wintner, 2000; Hsu et al., 2014; Perry et al., 2018; Ong et al., 2020). For those studies that provided growth parameter estimates, we plotted the individual shark growth trajectories from Ningaloo as line segments overlaid on the respective von Bertalanffy growth curves, whereby length-and-age combinations for initial sightings were assumed to fall on the estimated age-at-length predicted by the growth model. We carried out a similar exercise to examine the correspondence between our growth parameter estimates and observed growth trajectories of whale sharks in captivity (National Museum of Marine Biology and Aquarium, Taiwan; Okinawa Expo Aquarium and Osaka Aquarium Kaiyukan, Japan), using published data with growth intervals ranging from under a year to nearly two decades (five males, two females and one individual of unknown sex, ranging from 0.6 to 4.9 m TL at initial size; Chang et al., 1997; Uchida et al., 2000; Wintner, 2000; Nishida, 2001; Matsumoto et al., 2019), and also new data from the Georgia Aquarium (two males and two females, ranging from 4.1 to 4.7 m TL at initial size and observed for more than a decade).


Applications

Bayesian inference has been used across all fields of science. We describe a few examples here, although there are many other areas of application, such as philosophy, pharmacology, economics, physics, political science and beyond.

Social and behavioural sciences

A recent systematic review examining the use of Bayesian statistics reported that the social and behavioural sciences - psychology, sociology and political science - have experienced an increase in empirical Bayesian work4. Specifically, two parallel uses of Bayesian methods have grown in popularity within the social and behavioural sciences: theory development and model estimation.

Bayes' rule has been used as an underlying theory for understanding reasoning, decision-making, cognition and theories of mind, and has been especially common in developmental psychology and related fields. Bayes' rule has served as a conceptual framework for cognitive development in young children, capturing how children develop an understanding of the world around them144. Bayesian methodology has also been discussed in terms of improving the cognitive algorithms used for learning. Gigerenzer and Hoffrage145 discuss the use of frequencies, rather than probabilities, as a method to improve Bayesian reasoning. In another seminal article, Slovic and Lichtenstein146 discuss how Bayesian methods can be used for judgement and decision-making processes. Within this area of the social and behavioural sciences, Bayes' rule has been an important conceptual tool for developing theories and understanding developmental processes.
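The frequency format discussed by Gigerenzer and Hoffrage can be illustrated with a worked Bayes'-rule calculation. The prevalence and test accuracies below are invented for illustration and do not come from the cited studies:

```python
# Bayes' rule in the "natural frequencies" style: count people instead of
# multiplying probabilities.  Invented numbers: 1% prevalence, 90% hit
# rate, 9% false-positive rate.
population = 10000
sick = population * 1 // 100              # 100 people have the condition
healthy = population - sick               # 9900 do not
true_pos = sick * 90 // 100               # 90 are sick and test positive
false_pos = healthy * 9 // 100            # 891 are healthy but test positive

# P(sick | positive) = true positives / all positives
posterior = true_pos / (true_pos + false_pos)

# The same answer via the probability form of Bayes' rule:
posterior_prob_form = (0.9 * 0.01) / (0.9 * 0.01 + 0.09 * 0.99)
```

Both routes give a posterior of roughly 9%, and the frequency version makes it visible at a glance why a positive test is still most likely a false positive at low prevalence.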

The social and behavioural sciences are a natural setting for implementing Bayesian inference. The literature is rich with information that can be used to derive prior distributions. Informative priors are useful in complex modelling situations, which are common in the social sciences, as well as in cases of small sample sizes. Certain models used to explore educational outcomes and standardized tests, such as some multidimensional item response theory models, are intractable using frequentist statistics and require the use of Bayesian methods.
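How an informative prior helps with a small sample can be sketched with a conjugate normal-normal update for a mean with known observation variance; all numbers below are invented:

```python
# Conjugate normal-normal update: informative prior + tiny sample.
# All values are illustrative.
prior_mean, prior_var = 50.0, 4.0        # informative prior from earlier studies
obs_var = 100.0                          # known sampling variance
data = [62.0, 58.0, 64.0]                # a very small sample

n = len(data)
sample_mean = sum(data) / n
# Posterior precision is the sum of prior and data precisions.
post_var = 1.0 / (1.0 / prior_var + n / obs_var)
post_mean = post_var * (prior_mean / prior_var + n * sample_mean / obs_var)
```

With only three noisy observations, the posterior mean sits between the prior mean and the sample mean, pulled towards the more precise source of information, and the posterior variance is smaller than the prior's, which is exactly the behaviour that makes informative priors attractive when data are scarce.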

The number of publications on Bayesian statistics has risen steadily since 2004, with a more notable increase over the past decade. This focus on Bayesian methods is partly due to the development of more accessible software, as well as to the publication of tutorials aimed at applied social and behavioural scientists. A systematic review of Bayesian methods in the field of psychology identified 740 eligible regression-based articles using Bayesian methods. Of these, 100 articles (13.5%) were tutorials for implementing Bayesian methods, and an additional 225 articles (30.4%) were either technical papers or commentaries on Bayesian statistics (Box 4). Methodologists have tried to guide applied researchers towards the use of Bayesian methods within the social and behavioural sciences, although implementation has been relatively slow to catch on. For example, the systematic review found that only 167 regression-based Bayesian articles (22.6%) were applications using human samples. Nevertheless, some subfields regularly publish work implementing Bayesian methods.

The field has gained many interesting insights into psychological and social behaviour through Bayesian methods, and the substantive areas in which this work has been conducted are quite diverse. For example, Bayesian statistics has helped to uncover the role that craving suppression plays in smoking cessation147, to make population forecasts based on expert opinion148, to examine the role that infant-care-related stress plays in divorce149, to assess the impact of the ideology of the US president on US Supreme Court rulings150, and to predict behaviour limiting the intake of free sugars in one's diet151. These examples represent the different ways in which Bayesian methodology appears in the literature. It is common to find papers that highlight Bayes' rule as a mechanism for explaining theories of development and critical thinking144, that are explanatory152,153, that focus on how Bayesian reasoning can inform theory through the use of Bayesian inference154, and that use Bayesian modelling to extract findings that would have been difficult to derive using frequentist methods147. Overall, there is broad use of Bayes' rule within the social and behavioural sciences.

We argue that the increased use of Bayesian methods in the social and behavioural sciences is a major benefit for improving substantive knowledge. However, we also feel that the field must continue to develop rigorous implementation and reporting standards so that results are reproducible and transparent. We believe there are important benefits to implementing Bayesian methods within the social sciences, and we are optimistic that a strong focus on reporting standards can make the methods optimally useful for obtaining substantive knowledge.

Box 4 Bayesian methods in the social and behavioural sciences

Hoijtink et al.255 discuss the use of Bayes factors for informative hypotheses within cognitive diagnostic assessment, illustrating how Bayesian evaluation of informative diagnostic hypotheses can serve as an alternative to traditional diagnostic methods. The Bayesian approach offers additional flexibility, since informative diagnostic hypotheses can be evaluated with the Bayes factor using only data from the individual being diagnosed. Lee154 provides an overview of the application of Bayes' theorem in cognitive psychology, discussing how Bayesian methods can be used to develop more complete theories of cognition. Bayesian methods can also account for observed behaviour in terms of different cognitive processes, explain behaviour across a wide range of cognitive tasks, and provide a conceptual unification of different cognitive models. Depaoli et al.152 show how Bayesian methods can benefit health-based research conducted in psychology, highlighting how informative priors elicited from expert knowledge and previous research can be used to better understand the physiological impact of a health-based stressor. In that research scenario, frequentist methods would not have yielded viable results, because the sample size was relatively small for the model being estimated, owing to the cost of data collection and a population that is difficult to access for sampling. Finally, Kruschke153 presents the simplest example, a t-test aimed at experimental psychologists, showing how Bayesian methods can benefit the interpretation of any model parameter. That article highlights the Bayesian way of interpreting results, focusing on the whole posterior rather than a point estimate.

Ecology

The application of Bayesian analyses to answer ecological questions has become increasingly widespread, owing both to philosophical arguments, particularly regarding subjective versus objective reasoning, and to practical model-fitting advantages. This is combined with readily available software (Table 2) and numerous publications describing Bayesian ecological applications that use these software packages (see refs 155,156,157,158,159,160,161 for examples). The underlying Bayesian philosophy is in many respects attractive within ecology162, as it permits the incorporation of external, independent prior information, either from previous studies of the same or similar species or from inherent knowledge of the biological processes, within a rigorous framework163,164. Furthermore, the Bayesian approach allows both direct probability statements to be made about parameters of interest, such as survival probabilities, reproduction rates, population sizes and future predictions157, and the calculation of relative probabilities of competing models - such as the presence or absence of density dependence or of environmental factors driving the dynamics of the ecosystem - which in turn permits model-averaged estimates that incorporate both parameter and model uncertainty. The ability to provide probability statements is particularly useful for wildlife management and conservation. For example, King et al.165 provide probability statements regarding the level of population decline over a given period, which in turn provide probabilities associated with species' conservation status.
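The kind of direct probability statement described above can be computed from posterior draws in one line. The "posterior" below is a synthetic stand-in (normally distributed draws of an annual growth rate), not output from King et al.'s model:

```python
import random

# Direct probability statements from posterior draws: given (synthetic)
# posterior samples of an annual population growth rate, report the
# probability of a >30% decline over 10 years.
random.seed(1)
# stand-in posterior: annual growth rate ~ Normal(-0.03, 0.02)
draws = [random.gauss(-0.03, 0.02) for _ in range(20000)]

# a decline of more than 30% over 10 years means (1 + r)**10 < 0.7
p_decline_30 = sum(((1 + r) ** 10) < 0.7 for r in draws) / len(draws)
```

A statement such as "the probability of a 30% decline within a decade is about 0.4" follows directly from the draws, with no extra machinery; this is the style of summary that maps naturally onto conservation-status thresholds.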

A Bayesian approach is also often adopted in ecological research for pragmatic reasons. Many ecological models are complex - for example, they may be spatio-temporal in nature, high-dimensional and/or involve several interacting biological processes - leading to computationally expensive likelihoods that are slow to evaluate. Imperfect or limited data collection processes often lead to missing data and associated intractable likelihoods. In such circumstances, standard Bayesian model-fitting tools such as data augmentation may permit the models to be fitted, whereas in the alternative frequentist framework additional model simplifications or approximations may be required. The application of Bayesian statistics in ecology is extensive, covering a range of spatio-temporal scales from the individual organism to the ecosystem, and includes understanding the population dynamics of a given system166, modelling spatial point-pattern data167, investigating population genetics, estimating abundance168 and assessing conservation management169.

Ecological data typically arise from observational studies, in which a sample from the population of interest is observed using some survey protocol. The survey should be carefully designed, taking into account the ecological question(s) of interest and minimizing the complexity of the model required to fit the data, in order to provide reliable inference. Nevertheless, model-fitting challenges can still arise from data collection problems, such as those caused by equipment failure or poor weather conditions. There may also be inherent data collection issues in some surveys, such as the inability to record individual-level information. Such model-fitting challenges include - but are far from limited to - observations irregularly spaced in time owing to equipment failure or experimental design, measurement error owing to imperfect observations, missing information at a range of different levels, from the individual to the global environmental level, and challenges associated with multi-scale studies in which different aspects of the data are recorded at different temporal scales - for example, from hourly location data on individuals to daily and monthly collections of environmental data. The resulting data complexities, combined with the associated modelling choices, can lead to a range of model-fitting challenges that can often be addressed using standard techniques within the Bayesian paradigm.

For a given ecological study, separating out the individual processes acting on the ecosystem is an attractive mechanism for simplifying model specification166. For example, state-space models provide a general and flexible modelling framework describing two distinct types of process: the system process and the observation process. The system process describes the true underlying state of the system and how it changes over time. This state may be univariate or multivariate, such as population size or location data, respectively. The system process may also describe multiple processes acting on the system, such as birth, reproduction, dispersal and death. We are typically unable to observe these true underlying system states without some associated error, and the observation process describes how the observed data relate to the true unknown states. These general state-space models span many applications, including animal movement170, population count data171, capture-recapture-type data165, fisheries stock assessment172 and biodiversity173. For a review of these topics and further applications, we direct the reader elsewhere166,174,175. Bayesian model-fitting tools, such as MCMC with data augmentation176 and sequential Monte Carlo or particle MCMC177,178,179, permit general state-space models to be fitted to the observed data without further restrictions - such as distributional assumptions - having to be placed on the model specification, and without additional likelihood approximations having to be made.
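The system/observation split can be made concrete by simulating from a minimal state-space model; all parameter values below are illustrative:

```python
import math
import random

# Minimal state-space sketch: the system process is multiplicative
# population growth with environmental noise; the observation process adds
# measurement error to the true state.  Parameters are illustrative.
random.seed(7)
growth, sys_sd, obs_sd = 1.05, 0.02, 20.0

state = 500.0
states, observations = [], []
for _ in range(30):
    # system process: true (unobserved) population size
    state = state * growth * math.exp(random.gauss(0.0, sys_sd))
    states.append(state)
    # observation process: noisy count of the true state
    observations.append(state + random.gauss(0.0, obs_sd))
```

Fitting such a model means recovering the latent `states` (and the growth and noise parameters) from `observations` alone, which is precisely where the MCMC, data-augmentation and particle-filtering tools mentioned above come in.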

The process of collecting data continues to evolve with advances in technology. For example, GPS geolocation tags with associated accelerometers, remote sensing, drones for localized aerial photography, unmanned underwater vehicles and motion-sensor camera traps are increasingly used in ecological research. The use of these technological devices, together with the growth of citizen science, has led to new forms of data collected in large quantities, with associated model-fitting challenges, providing fertile ground for Bayesian analyses.

Genetics

Genetics and genomics studies have made extensive use of Bayesian methods. In genome-wide association studies, Bayesian approaches have provided a powerful alternative to frequentist approaches for assessing associations between genetic variants and a phenotype of interest in a population180. These include statistical models for incorporating genetic admixture181, fine mapping to identify causal genetic variants182, imputation of genetic markers not directly measured using reference populations183, and meta-analysis for combining information across studies. These applications further benefit from the use of marginalization to account for modelling uncertainties when making inferences. More recently, large cohort studies such as the UK Biobank184 have expanded the methodological requirements for identifying genetic associations with complex (sub)phenotypes by collecting genetic information alongside heterogeneous data sets including imaging, lifestyle and routinely collected health data. A Bayesian analysis framework known as TreeWAS185 has extended genetic association methods to allow the incorporation of tree-structured disease diagnosis classifications, by modelling the correlation structure of genetic effects across observed clinical phenotypes. This approach incorporates prior knowledge of phenotype relationships that can be derived from a diagnosis classification tree, such as information from the latest revision of the International Classification of Diseases (ICD-10).

The availability of multiple molecular data types in multi-omics data sets has also attracted Bayesian solutions to the problem of multimodal data integration. Bayesian latent variable models can be used as an unsupervised learning approach to identify latent structures corresponding to known or previously uncharacterized biological processes across different molecular scales. Multi-omics factor analysis 186 uses a Bayesian linear factor model to disentangle sources of heterogeneity that are common across multiple data modalities from those patterns that are specific to only a single data modality.

In recent years, high-throughput molecular profiling technologies have advanced to allow the routine multi-omics analysis of individual cells 187 . This has led to the development of many novel approaches for modelling single-cell measurement noise, cell-to-cell heterogeneity, high dimensionality, large sample sizes and interventional effects from, for example, genome editing 188 . Cellular heterogeneity lends itself naturally to Bayesian hierarchical modelling and formal uncertainty propagation and quantification owing to the layers of variability induced by tissue-specific activity, heterogeneous cellular phenotypes within a given tissue and stochastic molecular expression at the level of the single cell. In the integrated Bayesian hierarchical model BASiCS 189 , this approach is used to account for cell-specific normalization constants and technical variability to decompose total gene expression variability into technical and biological components.

Deep neural networks (DNNs) have also been utilized to specify flexible, non-linear conditional dependencies within hierarchical models for single-cell omics. SAVER-X 190 couples a Bayesian hierarchical model with a pretrainable deep autoencoder to extract transferable gene–gene relationships across data sets from different laboratories, variable experimental conditions and divergent species to de-noise novel target data sets. In scVI 191, hierarchical modelling is used to pool information across similar cells and genes to learn models of the distribution of observed expression values. Both SAVER-X and scVI perform approximate Bayesian inference using mini-batch stochastic gradient descent, the latter within a variational setting, a standard technique in DNNs, allowing these models to be fitted to hundreds of thousands or even millions of cells.

Bayesian approaches have also been popular in large-scale cancer genomic data sets 192 and have enabled a data-driven approach to identifying novel molecular changes that drive cancer initiation and progression. Bayesian network models 193 have been developed to identify the interactions between mutated genes and capture mutational signatures that highlight key genetic interactions, with the potential to allow for genomic-based patient stratification in both clinical trials and the personalized use of therapeutics. Bayesian methods have also been important in answering questions about evolutionary processes in cancer. Several Bayesian approaches for phylogenetic analysis of heterogeneous cancers enable the identification of the distinct subpopulations that can exist within tumours, and their ancestral relationships, through the analysis of single-cell and bulk tissue-sequencing data 194. These models therefore address the joint problem of learning a mixture model and inferring a graph, by considering the number and identity of the subpopulations and deriving the phylogenetic tree.


1. An Introduction to Structural Equation Modeling

Broadly, structural equation modeling (SEM) unites a suite of variables in a single network. They are generally presented using box-and-arrow diagrams denoting directed (causal) relationships among variables:

Those variables that exist only as predictors in the network are referred to as exogenous, and those that are predicted (at any point) as endogenous. Exogenous variables therefore only ever have arrows coming out of them, while endogenous variables have arrows coming into them (which does not preclude them from having arrows coming out of them as well). This vocabulary is important when considering some special cases later.

In traditional SEM, the relationships among variables (i.e., their linear coefficients) are estimated simultaneously in a single variance-covariance matrix. This approach is well developed but can be computationally intensive (depending on the size of the variance-covariance matrix) and additionally assumes independence and normality of errors, two assumptions that are generally violated in ecological research.

Piecewise structural equation modeling (SEM), also called confirmatory path analysis, was proposed in the early 2000s by Bill Shipley as an alternate approach to traditional variance-covariance based SEM. In piecewise SEM, each set of relationships is estimated independently (or locally). This process decomposes the network into the corresponding simple or multiple linear regressions for each response, each of which is evaluated separately and then combined later to generate inferences about the entire SEM. This approach has two consequences: (1) increasingly large networks can be estimated with ease compared to a single variance-covariance matrix, because the approach is modularized, and (2) specific assumptions about the distribution and covariance of the responses can be addressed using typical extensions of linear regression, such as fixed covariance structures, random effects, and other sophisticated modeling techniques.

Unlike traditional SEM, which uses a χ² test to compare the observed and predicted covariance matrices, the goodness-of-fit of a piecewise structural equation model is obtained using ‘tests of directed separation.’ These tests evaluate whether the hypothesized causal structure is consistent with the data. This is accomplished by deriving the ‘basis set,’ which is the smallest set of independence claims obtained from the SEM. These claims are relationships that are unspecified in the model; in other words, paths that could have been included but were omitted because they were deemed biologically or mechanistically insignificant. The tests ask whether these relationships can truly be considered independent (i.e., their association is not statistically significant within some threshold of acceptable error, typically α = 0.05) or whether some causal relationship may exist as indicated by the data.

For instance, the preceding example SEM contains 4 specified paths (solid, black) and 2 unspecified paths (dashed, red), the latter of which constitute the basis set:

In this case, there are two relationships that need to be evaluated: y3 and x1, and y3 and y2. However, there are additional influences on y3, specifically the directed path from y2. Thus, the claims need to be evaluated for ‘conditional independence,’ i.e., that the two variables are independent conditional on the already specified influences on both of them. This also pertains to the predictors of y2, including the potential contributions of x1. So the full claim would be: y2 | y3 (y1, x1), with the claim of interest separated by the | bar and the conditioning variable(s) following in parentheses.

As the network grows more complex, however, the independence claims only consider variables that are immediately ancestral to the primary claim (i.e., the parent nodes). For example, if there was another variable predicting x1 , it would not be considered in the independence claim between y3 and y2 since it is >1 node away in the network.
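The basis-set derivation described above can be sketched programmatically. The helper below is illustrative only: it assumes Shipley's rule of conditioning each non-adjacent pair on the parents of both variables, and the example edge set is an assumption inferred from the claims discussed in the text, not the actual diagram.

```python
from itertools import combinations

def basis_set(nodes, edges):
    """Derive independence claims for a DAG.
    edges: set of (parent, child) tuples; nodes: ordered list of variables.
    For each non-adjacent pair, condition on the parents of both variables."""
    adjacent = {frozenset(e) for e in edges}
    parents = {n: {p for (p, c) in edges if c == n} for n in nodes}
    claims = []
    for u, v in combinations(nodes, 2):
        if frozenset((u, v)) in adjacent:
            continue  # a specified path, so no independence claim
        conditioning = sorted((parents[u] | parents[v]) - {u, v})
        claims.append((u, v, conditioning))
    return claims

# Edge set loosely mirroring the example in the text (an assumption,
# since the diagram itself is not reproduced here):
edges = {("x1", "y1"), ("x1", "y2"), ("y1", "y2"), ("y1", "y3")}
claims = basis_set(["x1", "y1", "y2", "y3"], edges)
```

With this edge set, the two claims produced are x1 ⟂ y3 given y1, and y2 ⟂ y3 given x1 and y1, matching the full claim discussed above.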

The independence claims are evaluated by fitting a regression between the two variables of interest, with any conditioning variables included as covariates. Thus, the claim above, y2 | y3 (y1, x1), would be modeled as y3 ~ y2 + y1 + x1. These regressions are constructed using the same assumptions about y3 as specified in the actual structural equation model. So, for instance, if y3 is a hierarchically sampled variable predicted by y1, then the same hierarchical structure would carry over to the test of directed separation of y3 predicted by y2.

The P-values of the conditional independence tests are then combined in a single Fisher’s C statistic using the following equation:

C = -2 Σ ln(p_i), for i = 1, …, k

This statistic is χ²-distributed with 2k degrees of freedom, with k being the number of independence claims in the basis set.

Shipley (2013) also showed that the C statistic can be used to compute an AIC score for the SEM, so that nested comparisons can be made in a model selection framework:

AIC = C + 2K

where K is the likelihood degrees of freedom. A further variant, AIC_c, can be obtained by adding an additional penalty based on the sample size n:

AIC_c = C + 2K · n / (n - K - 1)
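The combination of p-values and the model-selection scores above can be sketched in a few lines. This is illustrative Python, not the piecewiseSEM implementation; the chi-squared survival function uses the closed form that exists for even degrees of freedom (2k), which is exactly the case here.

```python
import math

def fishers_c(pvalues):
    # C = -2 * sum(ln(p_i)); chi-squared with 2k df under the null
    return -2.0 * sum(math.log(p) for p in pvalues)

def chi2_sf_even_df(x, k):
    # Survival function of a chi-squared variable with 2k degrees of
    # freedom (closed form): exp(-x/2) * sum_{i<k} (x/2)^i / i!
    return math.exp(-x / 2) * sum((x / 2) ** i / math.factorial(i)
                                  for i in range(k))

def aic(c, K):
    # Shipley (2013): AIC = C + 2K
    return c + 2 * K

def aic_c(c, K, n):
    # Small-sample variant with an extra penalty based on sample size n
    return c + 2 * K * (n / (n - K - 1))
```

For example, two independence claims with p = 0.5 each give C ≈ 2.77 on 4 degrees of freedom, i.e., a goodness-of-fit p-value of about 0.60, so the hypothesized structure would not be rejected.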

The piecewiseSEM package automates the derivation of the basis set and the tests of directed separation, as well as extraction of path coefficients based on the user-specified input.


Methods

Building the model

The list of reactions

As a starting point, the S. cerevisiae bibliome was searched for references related to a list of genes known to be involved in iron homeostasis. The bibliome consists of all articles concerning S. cerevisiae found on PubMed.

From these articles, a list of reactions was manually inferred and selected on the basis of our knowledge.

The new genes cited in these articles that we felt to be critical for model accuracy were then used to direct a new PubMed search. This process was repeated until we were unable to identify any new reactions strictly related to iron homeostasis.

The same process was used for inorganic phosphate homeostasis, for some oxidative stress reactions and for any other processes that we felt to be important for our model that were not described in databases.

We searched the Swissprot database for a list of proteins requiring iron-sulphur clusters or haem as a cofactor. We then selected metabolic pathways involving these proteins that could be directly or indirectly linked to iron homeostasis or oxidative stress. The reactions describing these pathways were expressed according to SGD pathways [68] on the Saccharomyces Genome Database website http://www.yeastgenome.org.

However, when including a given pathway, we did not systematically describe all its steps. If a pathway included several steps producing intermediate metabolites not required by any other pathway included in our model, we wrote the whole pathway as one reaction, unless we had reasons to include the intermediate reactions. For example, siroheme biosynthesis from uroporphyrinogen-III involves three reactions. The two intermediate metabolites produced, precorrin-2 and sirohydrochlorin, are not required by any other reaction already included in the model, so we could have expressed the whole process as one reaction. However, the first step involves S-adenosyl-methionine and S-adenosyl-homocysteine, which are already included in several reactions in our model, and the last step involves iron. Thus, if siroheme cannot be produced - in a mutant for instance - we want to be able to determine whether this deficiency is related to S-adenosyl-methionine synthesis or iron availability. We therefore included siroheme biosynthesis as two reactions, the first producing precorrin-2 from uroporphyrinogen-III and two molecules of S-adenosyl-methionine, and the second producing siroheme from precorrin-2 and NAD.

Finally, we searched the yeast metabolome (described in SGD pathways) for reactions that might link several metabolites already included in our model. This is why we included alanine degradation in the model, for example.

The weights of the reactions

The default weight of a reaction was one. However, some reactions were given a weight lower than one (for most degradation reactions, the weight is 0.01) or higher than one. For example, the reaction catalysed by the Sod1 superoxide dismutase was given a weight of 100, to take into account the extremely high abundance of this protein in yeast cells (519,000 molecules per cell [69]), its very high catalytic efficiency and its turnover number. A list of reactions given weights other than one is provided in Table 2. We simulated our model with all weights set to 1, and it was unable to produce realistic outputs (data not shown). For example, the hydroxyl radical elements have PoPs lower than 1% in the complete model, whereas the PoP is higher than 70% in the model without weights, which is not realistic (the parameter PoP is defined below, in the Simulations subsection). The WT model corresponds to a model whose outputs are biologically meaningful according to the data in the literature and our own experience. We performed a sensitivity analysis of the outputs of the model (PoP at steady state) when the weights of the reactions are modified. Each weight was multiplied by a coefficient k. The differences between each PoP of the initial model and the PoP of the model with the modified weight were computed. Our results show that the model is robust to modifications of the weights when the coefficient k ranges from 0.1 to 10. See Additional file 2 for more details.

Simulations

Algorithm

We used a modified version of the Biocham asynchronous Boolean simulation algorithm, which can be summarised as follows:

1. Initial state: the list of elements that are "ON" at the beginning of the simulation.

2. Based on the list of elements that are "ON", the list of possible reactions is inferred: a reaction is possible if its reactants and modifiers are "ON".

3. A reaction is randomly selected from the list of possible reactions.

4. The products of the selected reaction are set to "ON".

5. The reactants are randomly either set to "OFF" or left "ON".

6. A new list of elements that are "ON" is computed.

Steps 2 to 6 are repeated for each simulation step.

In the Biocham algorithm, the reactants are not always set to "OFF", so it is possible to reselect the reaction in subsequent steps (see above step 5). This possibility reflects the presence of more than one molecule of each sort in biological systems. If the same reaction is selected over and over again, all the reactants will eventually be turned "OFF" and the reaction will cease to be possible unless other reactions set the reactants back to "ON". Indeed, in biological systems, all elements may be considered to exist in limited quantities.

We also had to overcome a limitation of Biocham to obtain simulations as close as possible to real biological systems: the algorithm does not mimic the large differences in reaction rates observed in real biological systems. Although we do not precisely know the rate of every reaction in the model, we can reasonably state that the degradation of an enzyme is far less likely to occur than the reaction this enzyme catalyzes. Another example relates to the functioning of the TCA cycle: some reactions of the cycle are technically reversible, but in practice the reaction always runs in one direction because that direction is thermodynamically more favorable. We needed to mimic this situation to make our model as realistic as possible. We therefore decided to extend the algorithm by weighting the reactions. From the weights of the possible reactions identified at step 2 of the algorithm above, we calculated the probability of a reaction being selected as its weight divided by the sum of the weights of all possible reactions. A reaction is therefore more likely to occur if it has a high weight.
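One weighted simulation step, as described above, can be sketched as follows. This is an illustrative reimplementation, not the Biocham internals: the reaction records and the 50% chance of switching a reactant "OFF" are assumptions for the example.

```python
import random

def possible_reactions(reactions, state):
    # A reaction is possible if all of its reactants and modifiers are "ON"
    return [r for r in reactions
            if all(e in state for e in r["reactants"] + r["modifiers"])]

def step(reactions, state, rng):
    """One asynchronous Boolean step with weighted reaction selection."""
    candidates = possible_reactions(reactions, state)
    if not candidates:
        return state
    # Selection probability = weight / sum of weights of possible reactions
    weights = [r["weight"] for r in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    state = set(state) | set(chosen["products"])   # products set to "ON"
    for e in chosen["reactants"]:
        if rng.random() < 0.5:                     # randomly "OFF" or left "ON"
            state.discard(e)
    return state
```

A reaction record here is a dict with "reactants", "modifiers", "products" and "weight" keys; a simulation simply iterates `step` from the initial "ON" set.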

If the experimental rates of all reactions in the model were known, we could set their weights accordingly. As these rates were not all known, we defined relative weights, using a default weight of one. The weight of a given reaction was modified if we had quantitative or qualitative experimental knowledge relating to its reaction rate (see Table 2 for a list of reactions with weights other than one).

In this paper, all the elements were set to "ON" at the beginning of the simulations, thus defining the "initial conditions". We tested different initial conditions (for the non-constant elements only), which always resulted in the same steady state (data not shown).

Mean profiles

The simulation algorithm was developed in the C (for computation) and Python (for file/data management) languages on a Linux workstation. The simulations were performed on a cluster of 40 nodes (dual-core 64-bit AMD Opteron bi-processor, 2 GB RAM, PBS/Maui scheduler). Each type of model was simulated 100 times, until a steady state was reached (see below). From these 100 simulations and for each element, we defined the PoP as the percentage of simulations in which the element was "ON" (present), at each simulation step. This series of values is referred to as the profile of an element. Figure 4 shows examples of such profiles in which the mean PoP was also calculated every 100,000 steps.
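The PoP calculation can be sketched as follows (illustrative Python, not the original C/Python pipeline; the per-step "ON" sets are assumed inputs):

```python
def presence_profiles(runs):
    """runs: list of simulations, each a list of per-step sets of elements
    that are "ON". Returns, per element, the PoP (percentage of
    simulations in which the element is "ON") at every step."""
    n_runs = len(runs)
    n_steps = len(runs[0])
    elements = set().union(*(s for run in runs for s in run))
    return {e: [100.0 * sum(e in run[t] for run in runs) / n_runs
                for t in range(n_steps)]
            for e in elements}
```

For 100 simulations, the value at each step is simply the count of runs with the element "ON", since each run contributes one percentage point.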

Steady states

We checked whether steady state had been reached by performing an ANOVA on the last 2 million steps for every element (the null hypothesis being that the PoPs are independent of the iterations). If a significant result was obtained (p-value below 0.001), the simulation was rerun for a larger number of steps. As background noise in the simulation can generate false positives, if the mean over the last million steps of the new simulation was, for each element, equal to the mean of the previous simulation, then we considered steady state to have been reached.

For some mutants, some elements have PoPs that increase or decrease slowly. For example, in some mutants the reactive oxygen species have increasing PoPs. Most of them will have a PoP of 100 at steady state, but some will reach this maximum value faster than others. We argue that comparing the simulations of the mutants after the same number of steps may reflect biological properties of the mutants. To take these differences into consideration in our systematic in silico mutant analysis, we used the PoP of the elements after 20 million steps, even though some simulations had not yet reached steady state.

74% of the simulations reached steady state before 20 million steps; among the remaining 26%, only some very slowly increasing or decreasing elements had not yet reached steady state.

In silico mutations and mutants

Model simulations began with all elements "ON" and continued until steady state was reached: this situation corresponds to "wild-type" simulations. To simulate a mutant for a given gene, a wild-type simulation was carried out for one million steps. The gene in question was then turned "OFF" and the simulation was continued for 19 million more steps for the systematic mutant analysis, or until steady state was reached for the other analyses. This method for simulating mutants mimics experimental Tet-OFF mutants, in which the transcription of a given gene is controlled by a tetracycline-responsive promoter and can be turned off by adding tetracycline to the growth medium [70].

The model was simulated according to this procedure with, for each set of simulations, one of the unlimited elements deleted. Each type of model in which an unlimited element was turned "OFF" was referred to as a "mutant". Note that none of the reactions were modified.

Clustering

For each "mutant" model, the mean of the last million steps of the simulation was calculated for each element. Then, for each element and for each "mutant", a distance to the "wild-type" was calculated as follows: (formula based on A.Ultsch RelDi [71]). In the resulting matrix, each column is an element of the model and each row a "mutant". This matrix was then clustered, using [email protected] (real location data type, relative error = 0.01), which identifies classes of "mutants" causing similar changes in the simulations. The matrix was then transposed and clustered again: this generated classes of elements changing in a similar fashion in the different "mutants". Figure 6 shows the matrix clustered for both elements and "mutants".

Figure 6 is annotated: the rows are annotated with the most significantly enriched Gene Ontology process (calculated with GoTermFinder [72]); the columns were annotated manually (only the columns containing many elements whose PoP changed significantly were annotated - see Additional file 6 for the complete clustering).

Phenotype analysis

The file containing the manually curated phenotypes of the yeast mutants was retrieved from SGD (file dated 03/13/10). From this file, we extracted the phenotypes associated with mutants of genes present in the model. From this selection, we further extracted the phenotypes that could be compared with our simulation results (e.g. the auxotrophies related to molecules present in the model). Six types of phenotypes were extracted: "auxotrophies" (cysteine, methionine, heme, glutamate), "chemical compound accumulation" (of elements present in the model), "oxidative stress resistance", "respiratory growth" and "nitrogen source utilization". Then, we compared these phenotypes with the PoP of the corresponding elements in the WT model and in the mutants. For "oxidative stress resistance", we did not simulate the mutants with an additional source of stress, as was done when the experimental phenotypes were observed. Instead, we looked for the production of ROS in the mutant as compared with the WT in standard simulations. Therefore, only the constitutively stressed mutants showed similar phenotypes in our simulations and in vivo. As for the "respiratory growth" phenotypes, we compared the PoP of the protons in the intermembrane space (noted "Hinter::mitochondria" in the model), because in the model an increase or a decrease in this element's PoP can be directly linked to a defect in the respiratory metabolism. For the "nitrogen source utilization" phenotypes, we compared the PoP of the products of the utilization of the nitrogen source. See Additional file 3 for the full results.


Introduction

In economics, many processes depend on past events, so it is natural to use time-delay differential equations to model economic phenomena. Two main areas of applications are business cycle and economic growth theories. In recent decades, the analysis of the effect of investment delay has been the focus of extensive examination as a tool for endogenous cycles to explain business cycles and growth cycles. Differential equations with time delay (discrete or distributed) and their mathematical methods have been seen to be the most adequate tools to model the business cycle and growth in an economy where the investment delay plays a crucial role [1,2,3,4,5], as well as in physics, finance and biology [6,7,8,9].

The mechanism of the supercritical Hopf bifurcation leading to a stable limit cycle is a well-known route to the self-sustainable cyclic behavior. It can be employed for both ordinary differential equations and delay differential equations. Many examples of its use can be found in economics [10,11,12,13,14] and in other sciences [15,16,17].

One of the most influential models of the business cycle with time delay is the Kaldor–Kalecki model [18], which is based on the Kaldor model, one of the earliest endogenous business cycle models [19,20,21]. The Kaldor model is a prototype of a dynamical system with cyclic behavior in which nonlinearity plays a crucial role in generating endogenous cycles. Nonlinearities are a common feature used to model the complexity of economic systems [22]. In turn, the investment delay was taken to be the average time of making an investment, as proposed by Kalecki [23].

Investment decisions are made given the current state of the economy. These past investment decisions lead to the change of capital stock in the present economy and may cause fluctuations in economic variables. This kind of time delay, i.e., the time required for building new capital, is an intrinsic (response) type of time delay, which can also be found in neurons due to autapse connections [24]. Time-delay systems with both response and propagation time delays have been studied in many domains of science [25].

The Kaldor–Kalecki business cycle model has been the topic of several studies as well as augmentations. One of these was to incorporate an exponential trend to describe the growth of an economy [26]. This new Kaldor–Kalecki growth model was formulated in a manner similar to the way the Kaldor growth model was developed from the Kaldor business cycle model [27].

The Kaldor–Kalecki model has been extensively studied. While mostly the discrete delay was investigated, some Kaldor–Kalecki models with distributed delays were also proposed. The Kaldor–Kalecki models with fixed delay include both models with one delay and two delays [28,29,30,31,32].

In the existing literature, time delays can be modeled by assuming either fixed time lags or continuously distributed time lags (distributed delays henceforth). The former refer to economic circumstances where there is a set time gap, institutionally or socially defined, for the agents concerned. The latter are suitable for economic situations where delays of different lengths are distributed across the various agents. A major difficulty is that time delays are not known exactly. Distributed delays, on the other hand, are based on the weighted average of all past data from time zero up to the current time period. Thus, distributed delays provide a more realistic description of economic systems with time delay, and there is also some experimental evidence indicating that models with distributed delays are more accurate than those with instantaneous time lags (see [33]). Cushing [34] introduced and used distributed delays in mathematical biology, while Invernizzi and Medio [35] introduced distributed delays into mathematical economics. Some examples in the context of economic growth are provided in [36] and [37].

In [38], we proposed an economic growth model in which the average time of investment completion is replaced by a distributed time length of investment. A gamma distribution function for the investment delay is assumed. This allows us to consider different time lengths of investment accomplishment. The resulting model is described by a dynamical system with a distributed time delay.

While the methods for delay differential equations are developing rapidly, the mathematical methods for ordinary differential equations are far better developed, especially where distributed delays are concerned. It is therefore convenient to approximate systems with distributed delays by systems of ordinary differential equations. One way to do this is provided by the so-called linear chain trick [39,40,41]. Consequently, an infinite-dimensional dynamical system is approximated by a finite-dimensional dynamical system, where the dimension of the system can be chosen. For an example of this method applied to delayed chemical reaction networks, see [42]. We note that another way to approximate a delay differential equation system by an ordinary differential equation system is via the Padé approximation [43,44].
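As a toy illustration of the linear chain trick (a generic delayed feedback system, not the Kaldor–Kalecki model itself): a distributed delay with a weak gamma kernel a·exp(-a·s) can be traded for one auxiliary variable z obeying dz/dt = a(x - z), so the delay system dx/dt = f(x, x_delayed) becomes a plain two-dimensional ODE system that can be integrated directly.

```python
def simulate(a=2.0, dt=1e-3, t_end=10.0):
    """Euler integration of dx/dt = -z, dz/dt = a*(x - z), where z is the
    weak-kernel (chain of length one) approximation of the delayed x."""
    x, z = 1.0, 1.0  # initial conditions (illustrative)
    for _ in range(int(t_end / dt)):
        dx = -z              # delayed negative feedback, f(x, z) = -z
        dz = a * (x - z)     # the linear chain equation
        x += dt * dx
        z += dt * dz
    return x, z
```

For a = 2 the linearized system has eigenvalues -1 ± i, so the trajectory is a decaying oscillation, the same qualitative behavior a Hopf-bifurcation analysis probes as parameters vary; a longer chain of z-variables would realize the strong kernel.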

The main aim of this paper is to study the emergence of a bifurcation due to the change of parameter values in the approximated Kaldor–Kalecki growth model. We consider the two simplest cases, three- and four-dimensional dynamical systems, obtained through the linear chain trick from the Kaldor–Kalecki growth model with distributed delay, corresponding to the weak and strong kernels, respectively. For both models, we establish conditions for the existence of a Hopf bifurcation with respect to the time-delay parameter and the rate-of-growth parameter. Both parameters are found to play a role in the scenario leading to the Hopf bifurcation and the resulting cyclic behavior.

In the numerical part of this paper, we determine in detail the ranges of parameter values for which cyclical behavior is possible. In this analysis, we use the investment function obtained by Dana and Malgrange for the French economy [27]. As in the theoretical part of the paper, we choose the time-delay parameter and the rate-of-growth parameter, as well as the adjustment parameter, for the bifurcation investigations. It is shown how some combinations of these three parameter values can trigger cycles through the Hopf bifurcation mechanism. In the three-parameter space of the model, we obtain the surface (a section of a paraboloid) separating the regions with stable and cyclic solutions.


Selective Phenome Growth Adapted

Aptamers are single-stranded oligonucleotides selected by evolutionary approaches from massive libraries, with significant potential for specific molecular recognition in diagnostics and therapeutics. A complete empirical characterisation of an aptamer selection experiment is not feasible due to the vast complexity of aptamer selection. Simulation of aptamer selection has been used to characterise and optimise the selection process; however, the absence of a good model for aptamer-target binding limits this field of study. Here, we generate theoretical fitness landscapes which appear to more accurately represent aptamer-target binding. The method used to generate these landscapes, selective phenome growth, is a new approach in which phenotypic contributors are added to a genotype/phenotype interaction map sequentially, in such a way as to increase the fitness of a selected fit sequence. In this way, a landscape is built around the selected fittest sequences. Comparison with empirical aptamer microarray data shows that our theoretical fitness landscapes represent aptamer-ligand binding more accurately than other theoretical models. These improved fitness landscapes have potential for the computational analysis and optimisation of other complex systems.

1. Introduction

1.1. Background

Aptamers are single-stranded nucleic acid sequences capable of specific high-affinity binding [1–4]. This makes them attractive candidates as recognition molecules in diagnostics and therapeutics. Aptamers are isolated by systematic evolution of ligands by exponential enrichment (SELEX), which involves several iterative steps of incubation with target, washing away of weak binders, and PCR amplification of strong binders.

Aptamer selection is complex. Many variables, such as library size, quantity of target, temperature, selection buffer, pH, degree of PCR amplification, and use of mutation or recombination diversification, need to be considered. Due in part to these factors, fewer than 30% of classical SELEX experiments succeed in isolating aptamers with dissociation constants below 30 nM [5]. Understanding the dynamics of the selection process is extremely important, but what does this entail? The DNA required to fully represent the number of permutations in a 75-base aptamer library would be roughly equal to the mass of the moon [6]. To represent this immense sequence space, an initial SELEX library may contain up to 10^15 molecules. A rigorous empirical analysis of anything close to this number of library members is simply not feasible.
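The "mass of the moon" figure above survives a rough back-of-the-envelope check; the ~330 g/mol average molar mass per ssDNA nucleotide used below is an assumption for the sketch.

```python
# Rough sanity check: mass of one copy of every possible 75-mer ssDNA.
AVOGADRO = 6.022e23                       # molecules per mole
permutations = 4 ** 75                    # all 75-base sequences, ~1.4e45
grams_per_molecule = 75 * 330 / AVOGADRO  # ~330 g/mol per nucleotide (assumed)
total_kg = permutations * grams_per_molecule / 1000.0
# total_kg comes out on the order of 1e22-1e23 kg, comparable to the
# lunar mass of roughly 7.3e22 kg.
```
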

Nevertheless, empirical analyses of smaller fractions of DNA aptamer libraries have been undertaken. The two main empirical selection analysis techniques used are high-density DNA microarrays and high-throughput sequencing (HTS). Briefly, high-density microarrays can contain up to approximately 1 million features, each representing an aptamer in a library. After array incubation with fluorescent target and a washing step, the binding affinity scores of all aptamers on the array can be measured by fluorescence scanning. Platt et al. [7], Knight et al. [8], and Rowe et al. [9] used microarrays both to evolve aptamers and to gain insight into an aptamer binding landscape. Additionally, DNA microarray data have been applied to aptamer specificity landscapes [10], fitness landscape morphology [11], and aptamer affinity maturation [12]. In comparison, the possible sequence space coverage using HTS is much greater, with Illumina’s HiSeq HTS capable of yielding sequence data for more than 70 million sequences from a single lane [13]. In this approach, the copy number of a sequence is used as a proxy for its target binding strength, so that the fitness of individuals in the library pools can be estimated. Cho et al. used this HTS approach to monitor microfluidic aptamer selection rounds and gauge enrichment [14]. PCR bias may distort this copy number/binding correlation but, by using a motif-based statistical framework such as MPBind [15], the binding potential of aptamers can be predicted, eliminating error from PCR bias. Although both DNA microarrays and HTS led to major breakthroughs in understanding library sequence space fitness and selection, these techniques are only capable of analysing a small fraction of a given library’s sequence space. Another approach to analysing aptamer selection is via computational simulation. The challenge in simulating aptamer selection is the design of a suitable model for aptamer binding fitness.

Computational approaches to modelling aptamer fitness by virtue of folding include secondary structure prediction by minimum free energy [16] and inverse folding [17]. These approaches can be computationally expensive and, while they are excellent models of folding, they may not capture the higher complexity of molecular binding. Hoinka et al. developed a program, “AptaSim”, to simulate the aptamer selection process [18]. Its binding model assigned aptamer affinities at random, without reference to sequence. While AptaSim was an important step forward in simulating selection enrichment and mutation copy number, it cannot appropriately represent the heritability or binding correlation between related sequences required for the study of genetic systems. Oh et al. used a string matching function as a binding fitness model to simulate aptamer selection [19]. This model does include heritability and binding correlation between related sequences, but because only close-range epistasis is possible with a single “optimal solution” aptamer, the landscape is cone shaped and would often be unrepresentative of a true aptamer binding landscape.
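The cone-shaped landscape produced by a string-matching fitness model is easy to see in code. This is a generic sketch of the idea, not the exact function used by Oh et al.; the sequences are invented.

```python
# Sketch of a string-matching binding fitness model: fitness is the number
# of positions agreeing with a single "optimal" aptamer, which yields a
# single-peaked, cone-shaped landscape.
def match_fitness(candidate: str, optimum: str) -> int:
    """Count of positions where candidate agrees with the optimal sequence."""
    return sum(a == b for a, b in zip(candidate, optimum))

optimum = "ACGTACGT"
print(match_fitness("ACGTACGT", optimum))  # 8: the unique global peak
print(match_fitness("ACGTACGA", optimum))  # 7: one step down the cone
print(match_fitness("TGCATGCA", optimum))  # 0: the floor of the cone
```

Because fitness falls off monotonically with Hamming distance from the one optimum, every uphill walk converges to the same peak, which is rarely true of real binding landscapes.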

Kauffman’s NK model is a robust mathematical model, related to the study of autocatalytic sets, which serves as an objective function relating genotypic sequence to phenotypic fitness [20]. Derivations of the original model have been used to describe complex interacting systems in areas as diverse as immunology [21], evolutionary biology [22], and economics [23]. The model describes a fitness landscape whose size is determined by the number of components in the system (N), and the ruggedness of the landscape can be tuned using the degree of interaction of these components (K). The system is perhaps best described when used to represent problems in evolutionary biology, as originally intended by Kauffman. In a population of genomes, each containing N genes, individuals are given fitness values based on the sum of the fitness contributions from each of their genes. The fitness contribution of each gene is determined by its interaction with other genes within its own genome. The interacting genes can be positioned sequentially, randomly, or by some other gene interaction pattern predetermined by an interaction map. Each allelic sequence of interacting genes is assigned a fitness contribution, usually drawn from a generated random distribution. In this way, the allelic substitution of one interacting gene produces a completely different fitness contribution score for the entire collection of interacting genes. In the NK model, increasing K, the number of interactions between genes, increases the complexity of the system and the ruggedness of the landscape. In addition to the value of K, the position of these interactions on the interaction map is of great importance to the fitness landscape.
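The fitness evaluation just described can be sketched directly. The version below uses the common convention of K cyclically adjacent neighbours and random contribution tables; it is a minimal illustration of the NK scheme, not the exact implementation used in this paper.

```python
# Minimal NK-landscape sketch: each of N binary genes contributes a fitness
# component that depends on its own allele and the alleles of K neighbours
# (here: the next K genes, cyclically). Contribution tables are random.
import random

def make_nk_landscape(n: int, k: int, seed: int = 0):
    rng = random.Random(seed)
    # One lookup table per gene, indexed by the (K+1)-allele neighbourhood.
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]

    def fitness(genome):
        total = 0.0
        for i in range(n):
            # Pack gene i and its K cyclic neighbours into a table index.
            idx = 0
            for j in range(k + 1):
                idx = (idx << 1) | genome[(i + j) % n]
            total += tables[i][idx]
        return total / n  # mean of the N contributions
    return fitness

f = make_nk_landscape(n=8, k=2)
print(f([0] * 8), f([1, 0, 1, 0, 1, 0, 1, 0]))
```

Flipping one allele changes the table index of K + 1 components at once, which is exactly how a single substitution rescores the whole collection of interacting genes.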

Typically the NK model is used as a scoring system for a population of genomes which can evolve via diversifications such as mutation or recombination. In this way the model is an objective function for a complex system. As mentioned earlier, the model can be adapted to many other areas of study. In this paper we use the model to represent the binding of an aptamer to a target analyte. In this representation, N would equal the length of the aptamers in the library and K the number of interactions between bases within each aptamer. Many modifications to the original model have been made, some to optimise it for a given research area. Herein we describe some modifications aimed at optimising the model to represent the binding of an aptamer to a target analyte. The NK model has been suggested to resemble molecular fitness landscapes similar to that of an aptamer binding an analyte [24]. In the NK model, mutational additivity usually holds for noninteracting positions in sequences. This mutational additivity is biologically accurate, as has been demonstrated for several proteins [25–33].
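Mutational additivity for noninteracting positions can be demonstrated in the degenerate K = 0 case, where every base contributes independently. The integer contribution values below are invented purely for illustration.

```python
# With K = 0 every base contributes independently, so the magnitude of the
# fitness effect of flipping base i is the same in every genetic background.
N = 6
contrib = [[i, i + 1] for i in range(N)]  # invented per-allele contributions

def fitness(genome):
    return sum(contrib[i][allele] for i, allele in enumerate(genome))

def flip_effect(genome, i):
    mutant = list(genome)
    mutant[i] ^= 1
    return fitness(mutant) - fitness(genome)

bg1 = [0, 0, 0, 0, 0, 0]
bg2 = [1, 1, 1, 1, 0, 1]
# The effect of flipping base 2 has the same magnitude in both backgrounds.
print(abs(flip_effect(bg1, 2)), abs(flip_effect(bg2, 2)))  # 1 1
```

As soon as K > 0 this guarantee is lost for interacting positions, which is precisely what lets the model encode epistasis.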

Wedge et al. used an NK model for the simulation of protein directed evolution (DE) [36], a field similar to aptamer selection. Binary strings of length 40 and 100 were used, with random epistatic interactions varying from K = 0 to 10. Genetic algorithms utilising mutation, crossover, different library sizes, and selection pressures were simulated and compared to deduce general rules for protein directed evolution, which are of great use to DE experiments. As noted in that study, the “No Free Lunch Theorem” (NFL) [37] establishes that all search algorithms perform exactly the same when averaged over all possible problems. This implies that, for an optimisation algorithm, any elevated performance on one class of problems is exactly paid for in performance on another class. If there is a discrepancy between a real-life system and a model used to describe it, any elevated performance in optimisation using simulation of the model is exactly paid for in performance on the real-life system. This illustrates the need for an accurate model when using simulation results to improve empirical ligand selection experiments.

Despite this biological accuracy with regard to mutational additivity, the classical NK model may have limitations in representing some biological systems. The model’s greatest utility is that ruggedness can be tuned using the epistasis variable K. However, this epistasis is quite uniform throughout the sequence. For some biological applications, a higher amount of epistasis is desirable. As K increases the landscape tends to become more multipeaked and rugged, to the point where it is too chaotic to allow adaptation. Kauffman refers to this phenomenon as the “complexity catastrophe” [38]. Kauffman goes further to say “the complexity catastrophe is averted in the model for those landscapes which are sufficiently smooth to retain high optima as N increases” [38]. Thinking along these lines provided a solution to the complexity catastrophe: creating complex landscapes which retain a smoother surface.
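The growth of ruggedness with K can be verified by exhaustive enumeration on a small landscape: counting genomes that are fitter than all of their single-bit-flip neighbours. This standalone sketch uses the common cyclic-neighbour NK convention with invented random tables.

```python
# Counting local optima on small NK landscapes: as K grows, the landscape
# typically becomes more multipeaked. N is kept small so all 2^N genomes
# can be enumerated.
import itertools
import random

def nk_fitness_fn(n, k, seed=0):
    rng = random.Random(seed)
    tables = [[rng.random() for _ in range(2 ** (k + 1))] for _ in range(n)]
    def fitness(g):
        total = 0.0
        for i, t in enumerate(tables):
            idx = 0
            for j in range(k + 1):          # gene i plus K cyclic neighbours
                idx = (idx << 1) | g[(i + j) % n]
            total += t[idx]
        return total / n
    return fitness

def count_local_optima(n, k):
    f = nk_fitness_fn(n, k)
    optima = 0
    for g in itertools.product((0, 1), repeat=n):
        fg = f(g)
        neighbours = (g[:i] + (1 - g[i],) + g[i + 1:] for i in range(n))
        if all(fg >= f(nb) for nb in neighbours):
            optima += 1
    return optima

for k in (0, 2, 4):
    print(k, count_local_optima(8, k))
```

At K = 0 the landscape is single-peaked (one local optimum); by K = 4 many competing peaks appear, illustrating the drift toward the chaotic regime Kauffman describes.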

1.2. Constructional Selection of Landscapes

Altenberg developed an evolutionary approach to selecting epistatic interactions, thereby creating landscapes which are smoother than classic landscapes with the same degree of epistatic interaction [34]. Altenberg achieved this using selective genome growth, a type of constructional selection, to create modular interaction matrices. These selected matrices have reduced epistasis, which gives rise to smoother fitness landscapes [34, 39]. Selective genome growth is a process by which the genome of the fittest individual is expanded one gene at a time (Figure S1a in Supplementary Material available online at https://doi.org/10.1155/2017/6760852). The new gene is kept only if the fitness of a selected optimum genome is increased. In this way the probable global optimum of the landscape is constructed, and all other points on the landscape are relative to this optimisation. A similar method for creating landscapes was devised by Hebbron et al., which uses a preferential attachment growth process to add genes to a genome [40]. A problem with these two approaches, when applied to specific applications, is that, due to the increasing returns of the selection system, they attribute extremely high pleiotropy to a handful of genes (vertical lines in Figure 2(c)). This phenomenon of increasing returns of gene control is biologically appropriate and accurate for a system describing a group of genomes, but when describing the binding of an aptamer to an analyte this highly aggregated pleiotropy is not biologically appropriate. Each base in an aptamer has a relatively low number of interactions due to its spatial capacity, meaning that high aggregated pleiotropy is not biologically representative for an aptamer.
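The accept-if-the-optimum-improves loop at the heart of selective genome growth can be sketched as follows. This is a heavily simplified toy, assuming independent (nonepistatic) gene contributions so that the optimum genome is trivial to evaluate; Altenberg's actual construction operates on epistatic interaction matrices.

```python
# Toy sketch of selective genome growth: candidate genes are proposed one at
# a time and kept only if the fitness of the current optimum genome improves.
# Independent gene contributions are an illustrative simplification.
import random

rng = random.Random(1)

def optimum_fitness(gene_effects):
    # With independent contributions, the optimum genome simply takes the
    # better allele of every gene.
    return sum(max(a, b) for a, b in gene_effects)

genome = []          # list of (fitness of allele 0, fitness of allele 1)
best = 0.0
for _ in range(50):  # 50 proposed gene additions
    candidate = genome + [(rng.random() - 0.5, rng.random() - 0.5)]
    f = optimum_fitness(candidate)
    if f > best:     # keep the new gene only if the optimum improves
        genome, best = candidate, f

print(len(genome), round(best, 3))
```

Even in this toy, the acceptance criterion means the construction history is biased toward the optimum, which is the "all other points are relative to this optimisation" property noted in the text.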

Herein we have created a new model that we have termed “selective phenome growth.” Selective phenome growth is a constructional selection technique in which phenotypic contributing factors are added to a genotype-phenotype interaction map incrementally (Figure S1b) in such a way that each new phenotypic contributing factor increases the fitness of global or local optima. Additionally, comparison is made between selective phenome growth landscapes and aptamer binding landscapes.

2. Model and Methods

2.1. Selective Phenome Growth to Create a Genotype/Phenome Interaction Map

Selective phenome growth is a new method of constructing an interaction matrix one phenotypic contributor at a time. The method of representing the interaction map is the same as Altenberg’s [34], with slight modification to represent aptamers, and is as follows: (1) The aptamer consists of N binary valued bases that have influence over F phenotypic functions, each of which contributes a component to the total fitness. (2) Each base controls a subset of the fitness components and, in turn, each fitness component is controlled by a subset of the bases. This genotype-phenotype map can be represented by a matrix,


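Such a genotype-phenotype map can be sketched as a binary matrix. The dimensions and random fill pattern below are illustrative assumptions, not the paper's actual construction; they only show how pleiotropy and polygeny read off the rows and columns.

```python
# Sketch of the genotype-phenotype interaction map as a binary matrix M:
# M[i][j] = 1 when base i influences fitness component j.
import random

rng = random.Random(0)
N_BASES, N_COMPONENTS = 8, 5   # illustrative dimensions

# Each base controls a random subset of the components (and vice versa).
M = [[1 if rng.random() < 0.3 else 0 for _ in range(N_COMPONENTS)]
     for _ in range(N_BASES)]

# Pleiotropy of base i = how many components it controls (row sum);
# polygeny of component j = how many bases control it (column sum).
pleiotropy = [sum(row) for row in M]
polygeny = [sum(M[i][j] for i in range(N_BASES)) for j in range(N_COMPONENTS)]
print(pleiotropy, polygeny)
```

The "vertical lines" criticised in Figure 2(c) correspond to columns of this kind of matrix being dominated by a few heavily loaded rows, i.e. a handful of bases with very high row sums.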