
Research on Single-Cell Feature Extraction Based on Fusing a Data Diffusion Algorithm with a Deep Generative Model

  • Abstract: Deep learning models can extract gene expression features from single-cell transcriptome sequencing (scRNA-seq) data at single-cell resolution. However, scRNA-seq acquisition suffers from "dropout" (missing data), which fills the gene expression matrix with a large number of technical zeros and masks or distorts the associations between some genes. Mining such noisy data blindly tends to harm both the training and the inference of deep learning models, leading to batch effects, spurious differential gene expression results, and degraded performance, and hiding the true expression relationships. To address these problems, this paper proposes a deep generative model that incorporates a data diffusion algorithm for single-cell transcriptome data. The diffusion algorithm shares information among similar cells, which removes noise from the cell count matrix and imputes dropout events at the same time, improving the clustering accuracy of the deep model and effectively removing batch effects.
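The abstract does not spell out the diffusion step, so the following is only a minimal sketch of the general idea of sharing information among similar cells. It assumes a generic kNN-style Markov diffusion over a cell-cell affinity graph, which may differ from the authors' algorithm. The function name `diffuse_counts` and the parameters `k` (neighbor rank used for the kernel bandwidth) and `t` (number of diffusion steps) are hypothetical.

```python
import numpy as np

def diffuse_counts(X, k=5, t=3):
    """Smooth a cells-by-genes count matrix by diffusing over a cell graph.

    Illustrative only: a generic Markov diffusion, not the paper's method.
    """
    X = np.asarray(X, dtype=float)
    n_cells = X.shape[0]
    # Log-transform so highly expressed genes do not dominate the distances.
    L = np.log1p(X)
    # Pairwise squared Euclidean distances between cells.
    sq = (L ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (L @ L.T), 0.0)
    # Adaptive Gaussian kernel: each cell's bandwidth is the squared
    # distance to its k-th nearest neighbor (index 0 is the cell itself).
    kth = np.sort(d2, axis=1)[:, min(k, n_cells - 1)]
    sigma2 = np.maximum(kth, 1e-12)
    A = np.exp(-d2 / sigma2[:, None])
    A = 0.5 * (A + A.T)  # symmetrize the affinities
    # Row-normalize into a Markov transition matrix between cells.
    M = A / A.sum(axis=1, keepdims=True)
    # t diffusion steps: each cell becomes a weighted average of similar cells.
    Mt = np.linalg.matrix_power(M, t)
    return Mt @ X
```

Because every row of the transition matrix sums to one, each imputed value is a weighted average of the same gene across cells, so a technical zero in one cell is filled from similar cells that did capture the gene. In a pipeline like the one the abstract outlines, the smoothed matrix would then be passed to the deep generative model; how the two are coupled is not stated in the abstract.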

     
