• 【Markov models for ion channels: versatility versus identifiability and speed】
    DOI:10.1098/rsta.2008.0301
    Authors: Fink M, Noble D
    BACKGROUND & AIMS: Markov models (MMs) represent a generalization of Hodgkin-Huxley models. They provide a versatile structure for modelling single channel data, gating currents, state-dependent drug interaction data, exchanger and pump dynamics, etc. This paper uses examples from cardiac electrophysiology to discuss aspects related to parameter estimation. (i) Parameter unidentifiability (found in 9 of the 13 models considered) results in an inability to determine the correct layout of a model, contradicting the idea that model structure and parameters provide insights into underlying molecular processes. (ii) The information content of experimental voltage step clamp data is discussed, and a short but sufficient protocol for parameter estimation is presented. (iii) MMs have been associated with high computational cost (owing to their large number of state variables), presenting an obstacle for multicellular whole organ simulations as well as parameter estimation. It is shown that the stiffness of models increases computation time more than the number of states. (iv) Algorithms and software programs are provided for steady-state analysis, analytical solutions for voltage steps and numerical derivation of parameter identifiability. The results provide a new standard for ion channel modelling to further the automation of model development, the validation process and the predictive power of these models.
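The steady-state analysis mentioned in item (iv) can be illustrated with a minimal sketch. It is not the authors' software: the three-state scheme C <-> O <-> I, the rate values, and the function names `build_rate_matrix` and `steady_state` are all assumptions chosen for illustration. At a fixed voltage the rates are constant, so the stationary occupancies solve a linear system.

```python
import numpy as np


def build_rate_matrix(k_co, k_oc, k_oi, k_io):
    """Rate matrix for a hypothetical linear scheme C <-> O <-> I.

    Q[i, j] is the transition rate from state j to state i (i != j).
    State order: 0 = closed, 1 = open, 2 = inactivated.
    """
    Q = np.zeros((3, 3))
    Q[1, 0] = k_co  # C -> O
    Q[0, 1] = k_oc  # O -> C
    Q[2, 1] = k_oi  # O -> I
    Q[1, 2] = k_io  # I -> O
    # Diagonal entries make every column sum to zero (probability is conserved).
    Q -= np.diag(Q.sum(axis=0))
    return Q


def steady_state(Q):
    """Solve Q @ p = 0 subject to sum(p) = 1."""
    n = Q.shape[0]
    # One balance equation is redundant; replace it with the normalisation.
    A = np.vstack([Q[:-1, :], np.ones(n)])
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)


# Illustrative rate constants at one fixed voltage (arbitrary units).
Q = build_rate_matrix(k_co=2.0, k_oc=1.0, k_oi=0.5, k_io=0.25)
p_inf = steady_state(Q)
```

For a real channel model the rates would be voltage-dependent functions, and the same solve would be repeated at each clamp voltage.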
  • 【Goodness-of-fit and local identifiability of a receptor-binding radiopharmacokinetic system】
    DOI:10.1109/10.126608
    Authors: Vera DR, Scheibe PO, Krohn KA, Trudeau WL, Stadalnik RC
    BACKGROUND & AIMS: A four-state nonlinear model describing a radiopharmacokinetic system for a hepatic receptor-binding radiopharmaceutical, [99mTc]-galactosyl-neoglycoalbumin (TcNGA), was tested for goodness-of-fit and local identifiability using scanning data from nine healthy subjects and seven patients with severe liver disease. Based on standard deviations of liver and heart imaging data at equilibria as a measure of observational error, the reduced chi-square ranged from 0.5 to 2.6. Values above 1.2 occurred when the subject moved during the 30 min study. Relative standard errors for each parameter were: TcNGA-receptor forward binding rate constant kb, 13-54%; extra-hepatic plasma volume Ve, 0.8-15.0%; hepatic plasma volume Vh, 0.2-6.5%; hepatic plasma flow F, 54% to greater than 1000%; and receptor concentration [R]o, 0.3-13%. The highest standard errors occurred when the amount of TcNGA injected exceeded the total amount of receptor. Therefore, when TcNGA functional imaging was performed without excess patient motion and receptor saturation, the kinetic model provided data fits of low systematic error and yielded high precision estimates of receptor concentration and forward binding rate constant. In summary, optimal performance of the kinetic model occurred when the amount of injected TcNGA resulted in the nonlinear operation of the pharmacokinetic system.
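The goodness-of-fit statistic quoted in this abstract, the reduced chi-square, is the sum of squared residuals scaled by the observational error and divided by the degrees of freedom. The sketch below uses that textbook definition; the function name `reduced_chi_square` and the toy numbers are illustrative assumptions, not values from the study.

```python
import numpy as np


def reduced_chi_square(observed, predicted, sigma, n_params):
    """Chi-square per degree of freedom for a fitted model.

    sigma is the standard deviation of the observational error
    (a scalar or one value per data point).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dof = observed.size - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    chi2 = np.sum(((observed - predicted) / sigma) ** 2)
    return chi2 / dof


# Toy data: seven points, a two-parameter model, error SD of 0.1.
obs = [1.0, 2.1, 2.9, 4.2, 5.0, 5.9, 7.1]
fit = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
value = reduced_chi_square(obs, fit, sigma=0.1, n_params=2)
```

A value near 1 indicates residuals consistent with the assumed observational error; values well above 1 point to systematic misfit or an underestimated error.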
  • 【Identifiability of segregation parameters using estimating equations】
    DOI:10.1159/000154315
    Authors: Zhao LP, Grove JS
    BACKGROUND & AIMS: To eliminate the need for distributional assumptions and to reduce the computational burden associated with the method of maximum likelihood, several researchers have proposed using estimating equations techniques for segregation analysis. One concern with the application of this technique has been that the first and second order moments may not carry sufficient information for identifying all of the parameters in segregation models. It is shown that in addition to the marginal means and covariances from nuclear family data, up to the third order product moments need to be used in estimating equations for identifying all of the segregation parameters in a major gene model. A polygenic component and potentially a common family environment parameter can also be identified using up to the fourth order moments. Two weighting functions are developed to improve statistical efficiency.
  • 【Activated sludge model 2d calibration with full-scale WWTP data: comparing model parameter identifiability with influent and operational uncertainty】
    DOI:10.1007/s00449-013-1099-8
    Authors: Machado VC, Lafuente J, Baeza JA
    BACKGROUND & AIMS: The present work developed a model for the description of a full-scale wastewater treatment plant (WWTP) (Manresa, Catalonia, Spain) for further plant upgrades based on the systematic parameter calibration of the activated sludge model 2d (ASM2d) using a methodology based on the Fisher information matrix. The influent was characterized for the application of the ASM2d and the confidence interval of the calibrated parameters was also assessed. No expert knowledge was necessary for model calibration and a huge available plant database was converted into more useful information. The effect of the influent and operating variables on the model fit was also studied using these variables as calibrating parameters and keeping the ASM2d kinetic and stoichiometric parameters, which traditionally are the calibration parameters, at their default values. Such an "inversion" of the traditional way of model fitting allowed evaluating the sensitivity of the main model outputs to changes in the influent and the operating variables. This new approach is able to evaluate the capacity of the operational variables used by the WWTP feedback control loops to overcome external disturbances in the influent and uncertainties in the kinetic/stoichiometric model parameters. In addition, the study of the influence of operating variables on the model outputs provides useful information to select input and output variables in decentralized control structures.
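A Fisher information matrix (FIM) approach of the kind named in this abstract can be sketched on a toy problem. Nothing below comes from the ASM2d study: the two toy models, the Gaussian-noise form FIM = SᵀS/σ², and the helper names `sensitivities`, `fisher_information` and `numerical_rank` are assumptions for illustration. A full-rank FIM yields finite standard errors, while a rank-deficient FIM signals parameters that cannot be separated from the data.

```python
import numpy as np


def sensitivities(model, theta, t, rel_step=1e-6):
    """Central finite-difference matrix S[i, j] = d y(t_i) / d theta_j."""
    theta = np.asarray(theta, dtype=float)
    S = np.empty((t.size, theta.size))
    for j in range(theta.size):
        h = rel_step * max(abs(theta[j]), 1.0)
        up, down = theta.copy(), theta.copy()
        up[j] += h
        down[j] -= h
        S[:, j] = (model(up, t) - model(down, t)) / (2.0 * h)
    return S


def fisher_information(model, theta, t, sigma):
    """FIM for independent Gaussian noise with standard deviation sigma."""
    S = sensitivities(model, theta, t)
    return S.T @ S / sigma**2


def numerical_rank(M, rtol=1e-8):
    """Count singular values above rtol times the largest one."""
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > rtol * s[0]))


def decay(theta, t):
    """Toy model in which both parameters shape the output differently."""
    a, b = theta
    return a * np.exp(-b * t)


def product(theta, t):
    """Toy model in which only the product a*b affects the output."""
    a, b = theta
    return a * b * t


t = np.linspace(0.0, 5.0, 20)
fim_decay = fisher_information(decay, [2.0, 0.7], t, sigma=0.1)
fim_product = fisher_information(product, [2.0, 0.7], t, sigma=0.1)

# Cramer-Rao style standard errors exist only when the FIM is invertible.
std_err = np.sqrt(np.diag(np.linalg.inv(fim_decay)))
```

Here `numerical_rank(fim_decay)` is 2 and `numerical_rank(fim_product)` is 1, so only the first model supports confidence intervals for both parameters.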
  • 【The structure of binding curves and practical identifiability of equilibrium ligand-binding parameters】
    DOI:10.1085/jgp.201611703
    Authors: Middendorf TR, Aldrich RW
    BACKGROUND & AIMS: A critical but often overlooked question in the study of ligands binding to proteins is whether the parameters obtained from analyzing binding data are practically identifiable (PI), i.e., whether the estimates obtained from fitting models to noisy data are accurate and unique. Here we report a general approach to assess and understand binding parameter identifiability, which provides a toolkit to assist experimentalists in the design of binding studies and in the analysis of binding data. The partial fraction (PF) expansion technique is used to decompose binding curves for proteins with n ligand-binding sites exactly and uniquely into n components, each of which has the form of a one-site binding curve. The association constants of the PF component curves, being the roots of an n-th order polynomial, may be real or complex. We demonstrate a fundamental connection between binding parameter identifiability and the nature of these one-site association constants: all binding parameters are identifiable if the constants are all real and distinct; otherwise, at least some of the parameters are not identifiable. The theory is used to construct identifiability maps from which the practical identifiability of binding parameters for any two-, three-, or four-site binding curve can be assessed. Instructions for extending the method to generate identifiability maps for proteins with more than four binding sites are also given. Further analysis of the identifiability maps leads to the simple rule that the maximum number of structurally identifiable binding parameters (shown in the previous paper to be equal to n) will also be PI only if the binding curve line shape contains n resolved components.
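The partial fraction decomposition described in this abstract can be sketched numerically. The code assumes the binding curve is written with a binding polynomial P(x) = 1 + β₁x + ... + βₙxⁿ, so that the mean number of bound ligands is x·P′(x)/P(x); factoring P(x) as a product of terms (1 + κᵢx) then gives a sum of n one-site curves κᵢx/(1 + κᵢx). This parameterization and all function names are assumptions for illustration and may differ from the paper's notation.

```python
import numpy as np


def one_site_constants(betas):
    """Association constants kappa_i of the one-site components.

    betas = [beta_1, ..., beta_n] are the coefficients of the assumed
    binding polynomial P(x) = 1 + beta_1 x + ... + beta_n x^n.
    """
    # np.roots expects coefficients ordered from the highest power down.
    coeffs = np.concatenate([np.asarray(betas, dtype=float)[::-1], [1.0]])
    roots = np.roots(coeffs)
    # P(x) = prod_i (1 + kappa_i x), so each root r_i equals -1 / kappa_i.
    return -1.0 / roots


def bound_direct(betas, x):
    """Mean number of bound ligands, x P'(x) / P(x)."""
    k = np.arange(1, len(betas) + 1)
    terms = np.asarray(betas, dtype=float) * x**k
    return np.sum(k * terms) / (1.0 + np.sum(terms))


def bound_from_components(kappas, x):
    """Sum of n one-site curves kappa_i x / (1 + kappa_i x)."""
    return np.sum(kappas * x / (1.0 + kappas * x)).real


def all_real_and_distinct(kappas, tol=1e-9):
    """Check the condition the abstract ties to identifiability."""
    is_real = np.all(np.abs(np.imag(kappas)) < tol)
    ordered = np.sort(np.real(kappas))
    return bool(is_real and np.all(np.diff(ordered) > tol))


# P(x) = 1 + 3x + 2x^2 = (1 + x)(1 + 2x): real, distinct constants 1 and 2.
kappas_real = one_site_constants([3.0, 2.0])
# P(x) = 1 + x + x^2 has complex roots, hence complex one-site constants.
kappas_complex = one_site_constants([1.0, 1.0])
```

The check `all_real_and_distinct` only tests the algebraic condition stated in the abstract; it does not model measurement noise, which the paper's identifiability maps address.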
  • 【My sister's keeper?: genomic research and the identifiability of siblings】
    DOI:10.1186/1755-8794-1-32
    Authors: Cassa CA, Schmidt B, Kohane IS, Mandl KD
    BACKGROUND & AIMS: BACKGROUND: Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified. METHODS: We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy. RESULTS: Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs where one child is homozygotic major, with a minor allele frequency [...] CONCLUSIONS: These findings indicate that the use of inferred familial genomic data poses substantial discrimination and privacy risks.
  • 【Identifiability, exchangeability, and epidemiological confounding】
    DOI:10.1093/ije/15.3.413
    Authors: Greenland S, Robins JM
    BACKGROUND & AIMS: Non-identifiability of parameters is a well-recognized problem in classical statistics, and Bayesian statisticians have long recognized the importance of exchangeability assumptions in making statistical inferences. A seemingly unrelated problem in epidemiology is that of confounding: bias in estimation of the effects of an exposure on disease risk, due to inherent differences in risk between exposed and unexposed individuals. Using a simple deterministic model for exposure effects, a logical connection is drawn between the concepts of identifiability, exchangeability, and confounding. This connection allows one to view the problem of confounding as arising from problems of identifiability, and reveals the exchangeability assumptions that are implicit in confounder control methods. It also provides further justification for confounder definitions based on comparability of exposure groups, as opposed to collapsibility-based definitions.
  • 【Identifiability and identification of switched linear biological models】
    DOI:10.1016/j.biosystems.2014.02.001
    Authors: Guo Y, Tan J
    BACKGROUND & AIMS: Pulses are often used to excite biological systems. Inputs to biological systems such as irrigation, therapy, and treatments are also equivalent to pulses. This makes the biological system behave as a switched model under the action of the input. To reduce difficulty in model parameter estimation, the system could be represented as a switched linear model under the pulse excitation. In this research, we studied the identification of a class of switched linear biological models with a single input and a system matrix dependent on the intensity of excitation. System identifiability and identification were discussed. A recurrent-pulse excitation method was devised to provide necessary constraints for parameter estimation. The recurrent-pulse technique allowed determination of model parameters that would otherwise be difficult to determine uniquely. The usefulness of the method was demonstrated by examples including delayed fluorescence from photosystem II, which is well known in the literature as a versatile tool for sensing plant physiological status and environmental changes.
  • 【Effective warnings in the Deese-Roediger-McDermott false-memory paradigm: the role of identifiability】
    DOI:
    Authors: Neuschatz JS, Benoit GE, Payne DG
    BACKGROUND & AIMS: These experiments document that warnings can substantially reduce false memories in the Deese-Roediger-McDermott (DRM) paradigm when the critical items are easily identifiable. Participants in a norming study identified the critical item after hearing a list of words. The lists with critical items that could be identified by the largest proportion of participants (high identifiable [HI] lists) and the smallest proportion of participants (low identifiable [LI] lists) were used in the experiment. Participants heard lists of words (e.g., bed, rest, doze) related to a critical item (e.g., sleep) and were warned about the nature of the lists before the study phase. The results indicated that warnings reduced false recognition of critical items for HI lists but not LI lists.
  • 【Identifiability and online estimation of diagnostic parameters within glucose-insulin homeostasis】
    DOI:10.1016/j.biosystems.2011.11.003
    Authors: Eberle C, Ament C
    BACKGROUND & AIMS: Today, diagnostic decisions about pre-diabetes or diabetes are made using static threshold rules for the measured plasma glucose. In order to develop an alternative diagnostic approach, dynamic models such as the Minimal Model may be deployed. We present a novel method to analyze the identifiability of model parameters based on the interpretation of the empirical observability Gramian. This allows a unifying view of both the observability of the system's states (with dynamics) and the identifiability of the system's parameters (without dynamics). We give an iterative algorithm to find an optimized set of states and parameters to be estimated. For this set, estimation results using an Unscented Kalman Filter (UKF) are presented. Two parameters are of special interest for diagnostic purposes: the glucose effectiveness S(G) characterizes the ability of plasma glucose clearance, and the insulin sensitivity S(I) quantifies the impact of the plasma insulin on the interstitial insulin subsystem. Applying the identifiability analysis to the trajectories of the insulin glucose system during an intravenous glucose tolerance test (IVGTT) shows the following results: (1) If only plasma glucose G(t) is measured, plasma insulin I(t) and S(G) can be estimated, but not S(I). (2) If plasma insulin I(t) is captured additionally, identifiability is improved significantly such that up to four model parameters can be estimated including S(I). (3) The situation of the first case can be improved if a controlled external dosage of insulin is applied. Then, parameters of the insulin subsystem can be identified approximately from measurement of plasma glucose G(t) only.
  • 【Statistical identifiability and sample size calculations for serial seroepidemiology】
    DOI:10.1016/j.epidem.2015.02.005
    Authors: Vinh DN, Boni MF
    BACKGROUND & AIMS: Inference on disease dynamics is typically performed using case reporting time series of symptomatic disease. The inferred dynamics will vary depending on the reporting patterns and surveillance system for the disease in question, and the inference will miss mild or underreported epidemics. To eliminate the variation introduced by differing reporting patterns and to capture asymptomatic or subclinical infection, inferential methods can be applied to serological data sets instead of case reporting data. To reconstruct complete disease dynamics, one would need to collect a serological time series. In the statistical analysis presented here, we consider a particular kind of serological time series with repeated, periodic collections of population-representative serum. We refer to this study design as a serial seroepidemiology (SSE) design, and we base the analysis on our epidemiological knowledge of influenza. We consider a study duration of three to four years, during which a single antigenic type of influenza would be circulating, and we evaluate our ability to reconstruct disease dynamics based on serological data alone. We show that the processes of reinfection, antibody generation, and antibody waning confound each other and are not always statistically identifiable, especially when dynamics resemble a non-oscillating endemic equilibrium behavior. We introduce some constraints to partially resolve this confounding, and we show that transmission rates and basic reproduction numbers can be accurately estimated in SSE study designs. Seasonal forcing is more difficult to identify as serology-based studies only detect oscillations in antibody titers of recovered individuals, and these oscillations are typically weaker than those observed for infected individuals. To accurately estimate the magnitude and timing of seasonal forcing, serum samples should be collected every two months and 200 or more samples should be included in each collection; this sample size estimate is sensitive to the antibody waning rate and the assumed level of seasonal forcing.
  • 【Assessing parameter identifiability in phylogenetic models using data cloning】
    DOI:10.1093/sysbio/sys055
    Authors: Ponciano JM, Burleigh JG, Braun EL, Taper ML
    BACKGROUND & AIMS: The success of model-based methods in phylogenetics has motivated much research aimed at generating new, biologically informative models. This new computer-intensive approach to phylogenetics demands validation studies and sound measures of performance. To date there has been little practical guidance available as to when and why the parameters in a particular model can be identified reliably. Here, we illustrate how Data Cloning (DC), a recently developed methodology to compute the maximum likelihood estimates along with their asymptotic variance, can be used to diagnose structural parameter nonidentifiability (NI) and distinguish it from other parameter estimability problems, including when parameters are structurally identifiable, but are not estimable in a given data set (INE), and when parameters are identifiable, and estimable, but only weakly so (WE). The application of the DC theorem uses well-known and widely used Bayesian computational techniques. With the DC approach, practitioners can use Bayesian phylogenetics software to diagnose nonidentifiability. Theoreticians and practitioners alike now have a powerful, yet simple tool to detect nonidentifiability while investigating complex modeling scenarios, where getting closed-form expressions in a probabilistic study is complicated. Furthermore, here we also show how DC can be used as a tool to examine and eliminate the influence of the priors, in particular if the process of prior elicitation is not straightforward. Finally, when applied to phylogenetic inference, DC can be used to study at least two important statistical questions: assessing identifiability of discrete parameters, like the tree topology, and developing efficient sampling methods for computationally expensive posterior densities.
  • 【A priori identifiability of a one-compartment model with two input functions for liver blood flow measurements】
    DOI:10.1088/0031-9155/50/7/004
    Authors: Becker GA, Müller-Schauenburg W, Spilker ME, Machulla HJ, Piert M
    BACKGROUND & AIMS: An extended dual-input Kety-Schmidt model can be applied to positron emission tomography data for the quantification of local arterial (f(a)) and local portal-venous blood flow (f(p)) in the liver by freely diffusible tracers (e.g., [15O]H2O). We investigated the a priori identifiability of the three-parameter model (f(a), f(p) and distribution volume (Vd)) under ideal (noise-free) conditions. The results indicate that the full identifiability of the model depends on the form of the portal-venous input function (c(p)(t)), which is assumed to be a sum of m exponentials convolved with the arterial input function (c(a)(t)). When m ≥ 2, all three model parameters are uniquely identifiable. For m = 1, identifiability of f(p) fails if c(p)(t) coincides with tissue concentration (q(t)/Vd), which occurs if c(p)(t) is generated from an intestinal compartment with transit time Vd/f(a). Any portal input, f(p) c(p)(t), is balanced by the portal contribution, f(p) q(t)/Vd, to the liver efflux, leaving q(t) unchanged by f(p), and only f(a) and Vd are a priori uniquely identifiable. An extension to this condition of unidentifiability is obtained if we leave the assumption of a generating intestinal compartment system and allow for an arbitrary proportionality constant between c(p)(t) and q(t). In this case, only f(a) remains a priori uniquely identifiable. These findings provide important insights into the behaviour and identifiability of the model applied to the unique liver environment.
  • 【On the identifiability of transmission dynamic models for infectious diseases】
    DOI:10.1534/genetics.115.180034
    Authors: Lintusaari J, Gutmann MU, Kaski S, Corander J
    BACKGROUND & AIMS: Understanding the transmission dynamics of infectious diseases is important for both biological research and public health applications. It has been widely demonstrated that statistical modeling provides a firm basis for inferring relevant epidemiological quantities from incidence and molecular data. However, the complexity of transmission dynamic models presents two challenges: (1) the likelihood function of the models is generally not computable, and computationally intensive simulation-based inference methods need to be employed, and (2) the model may not be fully identifiable from the available data. While the first difficulty can be tackled by computational and algorithmic advances, the second obstacle is more fundamental. Identifiability issues may lead to inferences that are driven more by prior assumptions than by the data themselves. We consider a popular and relatively simple yet analytically intractable model for the spread of tuberculosis based on classical IS6110 fingerprinting data. We report on the identifiability of the model, also presenting some methodological advances regarding the inference. Using likelihood approximations, we show that the reproductive value cannot be identified from the data available and that the posterior distributions obtained in previous work have likely been substantially dominated by the assumed prior distribution. Further, we show that the inferences are influenced by the assumed infectious population size, which generally has been kept fixed in previous work. We demonstrate that the infectious population size can be inferred if the remaining epidemiological parameters are already known with sufficient precision.
  • 【An effective automatic procedure for testing parameter identifiability of HIV/AIDS models】
    DOI:10.1007/s11538-010-9588-2
    Authors: Saccomani MP
    BACKGROUND & AIMS: Realistic HIV models tend to be rather complex and many recent models proposed in the literature could not yet be analyzed by traditional identifiability testing techniques. In this paper, we check a priori global identifiability of some of these nonlinear HIV models taken from the recent literature, by using a differential algebra algorithm based on previous work of the author. The algorithm is implemented in a software tool, called DAISY (Differential Algebra for Identifiability of SYstems), which has been recently released (DAISY is freely available on the web site http://www.dei.unipd.it/~pia/ ). The software can be used to automatically check global identifiability of (linear and) nonlinear models described by polynomial or rational differential equations, thus providing a general and reliable tool to test global identifiability of several HIV models proposed in the literature. It can be used by researchers with a minimum of mathematical background.
