Using bio-ansible
Quick start
sudo apt-get install unzip make gcc git python-virtualenv
sudo apt-get install zlib1g-dev libbz2-dev liblzma-dev libncurses5-dev
sudo apt-get install openjdk-8-jdk ant golang-go
sudo apt-get install git htop tmux vim
virtualenv ~/ansible_env
source ~/ansible_env/bin/activate
pip install --upgrade pip
pip install ansible
git clone https://github.com/MonashBioinformaticsPlatform/bio-ansible
cd bio-ansible/
ansible-playbook -i hosts bio.yml --tags dirs,bds,rnasik,star,bwa,hisat2,subread,samtools,htslib,bedtools,picard,qualimap,fastqc,multiqc
export PATH=$HOME/bioansible/software/apps/BigDataScript-0.99999g:$PATH
export PATH=$HOME/bioansible/software/apps/RNAsik-pipe-1.4.9/bin:$PATH
RNAsik
Tools prerequisites
I tried to account for every sub-dependency in bio-ansible, definitely checked against vanilla ubuntu 16.04 linux distro, however other systems/linux distros might have slight deviation from this. If you run into trouble please double check dependencies for the tool that is failing. There is quite a spectrum of languages there in the pipeline, C/C++, java and python so far. One can image the difficulty to accommodate every distro and/or system. I'm doing my best !
System prerequisites
In order to install system dependencies you'll need admin privilege i.e sudo
General, these are you "stock" utils, that most running system will/should have
unzip
make
gcc
git
python-virtualenv
sudo apt-get install unzip make gcc git python-virtualenv
samtools, htslib and bwa deps, these are some what specific libraries
zlib1g-dev
libbz2-dev
liblzma-dev
libncurses5-dev
sudo apt-get install zlib1g-dev libbz2-dev liblzma-dev libncurses5-dev
Java and BigDataScript, these again rather generic packages, except golang. Note that golang is pretty easy to install, comes as a pre-compiled binary here if you don't want to get it through system package mamanger
openjdk-8-jdk
ant
golang-go
sudo apt-get install openjdk-8-jdk ant golang-go
Extras, these are optional dependencies, but tmux
especially recommended as pipeline run could take some time to complete
provided you are doing this on remote machine (server), which is also recommended
htop
tmux
vim
sudo apt-get install git htop tmux vim
Running bio-ansible
Follow ansible installation guid to get ansible then:
git clone https://github.com/MonashBioinformaticsPlatform/bio-ansible
cd bio-ansible/
ansible-playbook -i hosts bio.yml --tags dirs,bds,rnasik,star,bwa,hisat2,subread,samtools,htslib,bedtools,picard,qualimap,fastqc,multiqc
Alternative installation method for RNAsik
If you have all of the tools installed and you just need RNAsik
you can simply git clone
it. It doesn't require any
other installations/compilations. BUT you do need to have BigDataScript installed
and have it in your PATH
for RNAsik
to run
git clone https://github.com/MonashBioinformaticsPlatform/RNAsik-pipe
path/to/RNAsik-pipe/bin/RNAsik
Make RNAsik analysis ready
bio-ansible is complete bioinformatics stack (with heavily genomics focus at this stage) deployment written in ansible script, which depending on a type of deployment might require admin privilege i.e sudo
.
Given that system prerequisites are satisfied one don't need sudo
to install bioinformatics stack, primarily bio_tools.
In this docs there is an assumption that user either has sudo
rights and/or able to install system prerequisites OR already has those dependencies installed and therefore can simply use bio-ansible as per installing RNAsik
section above to get all required tools dependencies.
Also right now bio-ansible is focused on a particular tools/enviroment management type, which is lmod, where one can module load samtools
into their environment for use, by default samtools
isn't available in the current (shell) environment. This is rather common approach on HPC clusters. Because of that type of installation, if user doesn't have pre-installed lmod
they will needs to either export PATH
for every tool (sounds a bit annoying), OR export
RNAsik
into your PATH
and them simply let RNAsik
know where tools are through -configFile
option.
export PATH=$HOME/bioansible/software/apps/BigDataScript-0.99999g:$PATH
export PATH=$HOME/bioansible/software/apps/RNAsik-pipe-1.4.9/bin:$PATH
copy these lines into file e.g sik.config and add -configFile path/to/sik.config
to RNAsik
starExe = $HOME/bioansible/software/apps/STAR-2.5.2b/STAR
hisat2Exe = $HOME/bioansible/software/apps/hisat2-2.1.0/bin/hisat2
bwaExe = $HOME/bioansible/software/apps/bwa-v0.7.15/bwa
samtoolsExe = $HOME/bioansible/software/apps/samtools-1.4.1/bin/samtools
bedtoolsExe = $HOME/bioansible/software/apps/bedtools2-2.25.0/bin/bedtools
countsExe = $HOME/bioansible/software/apps/subread-1.5.2/bin/featureCounts
fastqcExe = $HOME/bioansible/software/apps/fastqc-0.11.5/fastqc
pythonExe = python
picardExe = java -Xmx6g -jar $HOME/bioansible/software/apps/picard-2.17.10/picard.jar
qualimapExe = $HOME/bioansible/software/apps/qualimap_v2.2.1/qualimap
multiqcExe = $HOME/bioansible/software/apps/multiqc-1.4/bin/multiqc
If the user happens to have lmod
installed, simply module use $HOME/bioansible/software/modules/bio
to let lmod
know about new modules and then simply module load RNAsik-pipe
, which will automatically "pull" other dependencies into your environment. You can check that by module list
to see what is in your environment