Regulation of Staphylococcus aureus alpha-toxin gene (hla) expression by agr, sarA, and sae in vitro and in experimental infective endocarditis.

Published: 2006-11-01
Journal: J Infect Dis

DOI: 10.1086/508210
Article type: Journal article

BACKGROUND: Protein remote homology detection and fold recognition are central problems in computational biology. Supervised learning algorithms based on support vector machines are currently one of the most effective methods for solving these problems. These methods are primarily used to solve binary classification problems and they have not been extensively used to solve the more general multiclass remote homology prediction and fold recognition problems.
RESULTS: We present a comprehensive evaluation of a number of methods for building SVM-based multiclass classification schemes in the context of the SCOP protein classification. These methods include schemes that directly build an SVM-based multiclass model, schemes that employ a second-level learning approach to combine the predictions generated by a set of binary SVM-based classifiers, and schemes that build and combine binary classifiers for various levels of the SCOP hierarchy beyond those defining the target classes.
CONCLUSION: Analyzing the performance achieved by the different approaches on four different datasets, we show that most of the proposed multiclass SVM-based classification approaches are quite effective in solving the remote homology prediction and fold recognition problems, and that the schemes that use predictions from binary models constructed for ancestral categories within the SCOP hierarchy tend to not only lead to lower error rates but also reduce the number of errors in which a superfamily is assigned to an entirely different fold and a fold is predicted as being from a different SCOP class. Our results also show that the limited size of the training data makes it hard to learn complex second-level models, and that models of moderate complexity lead to consistently better results.

