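This post covers how to deploy copykat on a Linux cloud server. It has two parts, installation and testing, and the end result is running the computation on the server.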
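Test environment
- Linux CentOS 7
- R 4.2.3
- Miniconda3
- 天意云 (Tianyi Cloud), 24 cores, 192 GB RAM
The overall approach is to create a new conda environment on Linux, install a recent version of R, and then install the dependency packages one by one. Because a Linux system sometimes lacks the required system libraries, you have to follow the error messages and resolve a chain of package installation failures. The problems I ran into centered on XML, GSVA, httpuv, Seurat, RcppEigen, and hdf5r; the fixes are described in turn below.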
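Installation steps
Create a new environment
It is best to create a dedicated environment named copykat, although this step is not strictly required.
conda create -n copykat -y
conda activate copykat
conda install r-base=4.1.2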
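If the environment may already exist from an earlier attempt, it can help to check before creating it. The following is a minimal sketch, not part of the original procedure: the helper name `env_in_list` is hypothetical, and the script only prints the command it would run rather than executing it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the environment name in $1 appears in the
# first column of `conda env list`-style text read from stdin.
env_in_list() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

ENV_NAME="copykat"   # environment name used in this post

if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list "$ENV_NAME"; then
        echo "environment $ENV_NAME already exists; run: conda activate $ENV_NAME"
    else
        # Dry run: print the command instead of executing it.
        echo "would run: conda create -y -n $ENV_NAME r-base=4.1.2"
    fi
else
    echo "conda not found on PATH; nothing to check"
fi
```

To create the environment for real, replace the `echo "would run: ..."` line with the command it prints.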
Install base packages
# Return TRUE if the package is already installed
checkPkg <- function(pkg){
  return(requireNamespace(pkg, quietly = TRUE))
}
if(!checkPkg("BiocManager")) install.packages("BiocManager")
if(!checkPkg("devtools")) install.packages("devtools")
Install dependencies
library(devtools)
# CRAN packages
if(!checkPkg("RcppArmadillo")) install.packages("RcppArmadillo")
if(!checkPkg("RcppProgress")) install.packages("RcppProgress")
if(!checkPkg("markdown")) install.packages("markdown")
if(!checkPkg("R.utils")) install.packages("R.utils")
# GitHub packages
if(!checkPkg("NNLM")) install_github("linxihui/NNLM")
if(!checkPkg("copykat")) install_github("navinlabcode/copykat")
# Packages installed through BiocManager
if(!checkPkg("Seurat")) BiocManager::install("Seurat")
if(!checkPkg("knitr")) BiocManager::install("knitr")
if(!checkPkg("GSVA")) BiocManager::install("GSVA")
if(!checkPkg("pheatmap")) BiocManager::install("pheatmap")
if(!checkPkg("ComplexHeatmap")) BiocManager::install("ComplexHeatmap")
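Install copykat
if(!requireNamespace("copykat", quietly = TRUE)) devtools::install_github("navinlabcode/copykat")
If all of the code above runs without any ERROR, congratulations, you were lucky. If an error occurs along the way, read on.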
Errors and fixes
GSVA installation fails because of XML
Fix: quit R and install the XML package with conda (the conda package name is lowercase, r-xml).
conda install r-xml
Seurat installation fails because of httpuv
ERROR: failed to lock directory ‘/home/zjw/miniconda3/envs/work/lib/R/library’ for modifying
Try removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/00LOCK-httpuv’
ERROR: failed to lock directory ‘/home/zjw/miniconda3/envs/work/lib/R/library’ for modifying
Try removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/00LOCK-RcppEigen’
ERROR: dependency ‘httpuv’ is not available for package ‘shiny’
* removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/shiny’
ERROR: dependency ‘shiny’ is not available for package ‘miniUI’
* removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/miniUI’
ERROR: dependencies ‘miniUI’, ‘shiny’, ‘RcppEigen’ are not available for package ‘Seurat’
* removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/Seurat’
Fix: add the argument INSTALL_opts = '--no-lock' and install again.
# In R
install.packages("httpuv", INSTALL_opts = '--no-lock')
# In the shell, with conda
conda install r-httpuv
RcppEigen error
ERROR: failed to lock directory ‘/home/zjw/miniconda3/envs/work/lib/R/library’ for modifying
Try removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/00LOCK-RcppEigen’
Fix: same as above.
# In R
install.packages("RcppEigen", INSTALL_opts = '--no-lock')
# In the shell, with conda
conda install r-rcppeigen
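Both lock errors above also suggest removing the leftover 00LOCK-* directory. A minimal sketch of that cleanup follows; the function names are hypothetical, and the demo runs on a throwaway directory rather than on a real R library. Make sure no R installation is still running before pointing it at the real library path shown in your own error message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list leftover 00LOCK-* directories directly under
# the R library directory given as $1.
list_stale_locks() {
    find "$1" -mindepth 1 -maxdepth 1 -type d -name '00LOCK-*' | sort
}

# Hypothetical helper: remove every lock directory found by list_stale_locks.
remove_stale_locks() {
    list_stale_locks "$1" | while IFS= read -r lock; do
        echo "removing $lock"
        rm -rf -- "$lock"
    done
}

# Demo on a temporary directory that imitates an R library.
demo_lib="$(mktemp -d)"
mkdir "$demo_lib/00LOCK-httpuv" "$demo_lib/00LOCK-RcppEigen" "$demo_lib/shiny"
remove_stale_locks "$demo_lib"
ls "$demo_lib"        # only the ordinary package directory is left
rm -rf "$demo_lib"
```

After the lock directories are gone, rerun the failed install.packages() call.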
hdf5r error
Error in Read10X_h5("matrix.h5") : Please install hdf5r to read HDF5 files
Fix: install hdf5-devel with yum, then install r-hdf5r with conda.
sudo yum install hdf5-devel
# install.packages("hdf5r")
conda install r-hdf5r
hdf5 version too old
configure: error: The version of hdf5 installed on your system is not sufficient. Please ensure that at least version 1.8.13 is installed
ERROR: configuration failed for package ‘hdf5r’
* removing ‘/home/zjw/miniconda3/envs/work/lib/R/library/hdf5r’
The downloaded source packages are in ‘/tmp/Rtmpr0m6Da/downloaded_packages’
Updating HTML index of packages in '.Library'
Making 'packages.html' ... done
Warning message:
In install.packages("hdf5r") : installation of package ‘hdf5r’ had non-zero exit status
If you see this version-mismatch message, fix it as follows:
- Download the hdf5-1.8.13 source
wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.8/hdf5-1.8.13/src/hdf5-1.8.13.tar.bz2
- Build and install
tar xjf hdf5-1.8.13.tar.bz2
cd hdf5-1.8.13
./configure --prefix=$HOME/.local/bin/hdf5-1.8.13
make && make install
- Set the environment variables
export PATH=$HOME/.local/bin/hdf5-1.8.13/bin:$PATH
export LD_LIBRARY_PATH=$HOME/.local/bin/hdf5-1.8.13/lib:$LD_LIBRARY_PATH
- Reinstall the R package hdf5r
install.packages('hdf5r')
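Before reinstalling hdf5r, it can be worth confirming that the HDF5 found on PATH meets the 1.8.13 minimum named in the error message. The sketch below is an illustration only: `version_ge` is a hypothetical helper built on GNU `sort -V`, and it assumes the `h5cc` wrapper installed with HDF5 prints an "HDF5 Version" line for `-showconfig`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 is greater than or equal to
# version $2, using GNU sort's version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_HDF5="1.8.13"   # minimum version named in the hdf5r configure error

if command -v h5cc >/dev/null 2>&1; then
    # Assumption: the configuration summary contains a line such as
    # "HDF5 Version: 1.8.13".
    found="$(h5cc -showconfig | awk -F': *' '/HDF5 Version/ { print $2; exit }')"
    if version_ge "$found" "$MIN_HDF5"; then
        echo "HDF5 $found is new enough"
    else
        echo "HDF5 '$found' is older than $MIN_HDF5 or could not be detected"
    fi
else
    echo "h5cc not found on PATH; check the PATH export above"
fi
```

If the reported version is still the old system one, the exported PATH is probably not active in the current shell.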
Usage test
Example script
t1 <- Sys.time()
library(copykat)
library(Seurat)
library(hdf5r)
# Download the 10x Genomics example matrix
download.file("https://cf.10xgenomics.com/samples/cell-exp/6.0.0/Brain_Tumor_3p/Brain_Tumor_3p_filtered_feature_bc_matrix.h5", destfile = "matrix.h5")
mat <- Read10X_h5("matrix.h5")
sco <- CreateSeuratObject(mat, project = "Glioblastoma")
# Run copykat on the raw count matrix with 18 cores
Glioblastoma <- copykat(rawmat = sco@assays$RNA@counts, sam.name = "Glioblastoma", n.cores = 18)
# Summarize the predictions
pred <- data.table::fread("Glioblastoma_copykat_prediction.txt")
table(pred$copykat.pred)
t2 <- Sys.time()
# Report the elapsed time
print(t2 - t1)
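On a remote server it is convenient to run the script non-interactively and keep a log. This is a minimal sketch under assumptions: the file names testcode.R and out.log are taken from the result listing in this post, and `run_logged` is a hypothetical helper, not something copykat provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, send stdout and stderr to the log file
# given as $1, and report the command's exit status.
run_logged() {
    local log="$1"
    shift
    local status=0
    "$@" >"$log" 2>&1 || status=$?
    echo "exit status $status, log written to $log"
    return "$status"
}

# Assumed file names: the example script saved as testcode.R, log in out.log.
if command -v Rscript >/dev/null 2>&1 && [ -f testcode.R ]; then
    run_logged out.log Rscript testcode.R
else
    echo "Rscript or testcode.R not found; nothing to run"
fi
```

For long jobs, the same command can be started with `nohup ... &` so that it survives a dropped SSH session.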
Results
29K Apr 11 20:34 Glioblastoma_copykat_clustering_results.rds
209M Apr 11 20:33 Glioblastoma_copykat_CNA_raw_results_gene_by_cell.txt
296M Apr 11 20:35 Glioblastoma_copykat_CNA_results.txt
924K Apr 11 20:35 Glioblastoma_copykat_heatmap.jpeg
42K Apr 11 20:34 Glioblastoma_copykat_prediction.txt
71M Apr 11 20:36 Glioblastoma_copykat_with_genes_heatmap.pdf
7.1M Apr 11 20:26 matrix.h5
333 Apr 11 20:26 out.log
533 Apr 11 20:13 testcode.R
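A quick way to confirm that a run finished is to check that every output file from the listing above is present. The sketch below assumes the file-name pattern seen in this run (sample name followed by `_copykat_` and a fixed suffix); `missing_outputs` is a hypothetical helper, and other copykat versions may write a different set of files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each expected copykat output file for sample $1
# that is missing from directory $2 (default: current directory).
missing_outputs() {
    local sample="$1" dir="${2:-.}" suffix
    for suffix in \
        clustering_results.rds \
        CNA_raw_results_gene_by_cell.txt \
        CNA_results.txt \
        heatmap.jpeg \
        prediction.txt \
        with_genes_heatmap.pdf
    do
        [ -e "$dir/${sample}_copykat_${suffix}" ] || echo "${sample}_copykat_${suffix}"
    done
}

# Prints nothing when all six files exist in the current directory.
missing_outputs Glioblastoma .
```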
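The test completed successfully: copykat runs normally for single-cell data analysis and makes full use of the server's compute resources. The run was fairly fast (about 4.4 minutes for this sample with n.cores = 18), so this setup should be able to handle larger single-cell datasets.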
[1] "running copykat v1.1.0"
[1] "step1: read and filter data ..."
[1] "36601 genes, 1615 cells in raw data"
[1] "filtered out 106 cells with less than 200 genes; remaining 1510 cells"
[1] "12101 genes past LOW.DR filtering"
[1] "step 2: annotations gene coordinates ..."
[1] "start annotation ..."
[1] "step 3: smoothing data with dlm ..."
[1] "step 4: measuring baselines ..."
number of iterations= 160
number of iterations= 111
number of iterations= 1030
number of iterations= 193
number of iterations= 186
number of iterations= 234
[1] "step 5: segmentation..."
[1] "step 6: convert to genomic bins..."
[1] "step 7: adjust baseline ..."
[1] "step 8: final prediction ..."
[1] "step 9: saving results..."
[1] "step 10: ploting heatmap ..."
Time difference of 4.411825 mins
Reference: https://www.omicsclass.com/article/1637