
Contributing

There are many ways to contribute. The most obvious are helping with the documentation, helping out in the user group and, of course, working on the source itself.

Documentation

I'm using mkdocs to generate this site, which has been very easy to use. All documentation is written in plain markdown and lives in the docs/ directory of the main repo. You can simply fork the RNAsik repository, make changes to the docs and send me a pull request (PR). Any changes are very welcome, even a one-letter spelling correction (there'll be more than one). All changes need to come through a PR, which not only acknowledges you as a contributor but also lets me review the changes quickly and pull them in easily.

A quick note on mkdocs: it is easy to install with pip, preferably inside a virtualenv.

To install mkdocs (the virtualenv is optional):

virtualenv mkdocs_env
source mkdocs_env/bin/activate
pip install mkdocs

mkdocs in a nutshell:

mkdocs build
mkdocs gh-deploy

This will deploy your copy of the RNAsik docs to your GitHub Pages (gh-pages) branch. You don't actually need to do that. Just use mkdocs serve (described further down) to preview your changes and send them through to me.

git clone https://github.com/MonashBioinformaticsPlatform/RNAsik-pipe
cd RNAsik-pipe
# do docs changes

Start a local server to preview your changes:

mkdocs serve

This gives you live updates of your copy of the docs. The default URL should be localhost:8000, and mkdocs prints the actual address once the server has started. Then simply use your favourite text editor to edit the markdown documents. Commit your changes, and don't be afraid to be verbose: say what you've added, changed or removed in your commit message. Then send me a PR.
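The commit-and-PR steps above can be sketched as a small shell function. This is only an illustration, assuming you are inside a clone of your own fork of RNAsik-pipe and that the fork's remote is called origin; the function name, branch name and commit message are hypothetical.

```shell
# Sketch only: commit docs edits on a new branch and push them to your fork.
# Assumes the current directory is a clone of your fork and that the fork's
# remote is named "origin" (an assumption, adjust to your setup).
push_docs_change() {
    local branch message
    branch="$1"      # hypothetical branch name, e.g. fix-docs-typos
    message="$2"     # verbose commit message describing the change

    git checkout -b "$branch" || return 1
    git add docs/ || return 1
    git commit -m "$message" || return 1
    git push origin "$branch"
}

# Example (not run here):
# push_docs_change fix-docs-typos "Fix typos in the contributing page"
```

After the push, open the pull request from that branch on GitHub.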

Developing pipeline further

I need to write a more comprehensive developer guide sometime soon. Contributions are again extremely welcome and, as mentioned in the documentation section above, they need to come through a pull request (PR).

A brief summary of the layout of src/:

RNAsik.bds is the main "executable" file that sources all required modules and runs the pipeline.
sikHeader.bds defines the help menu and all user input options. A couple of command line arguments are hidden from the main help menu, but if you take a peek at this file you'll see them all.
All other *.bds files contain functions for specific tasks. Those functions get called in RNAsik.bds.

Building conda package

First of all you need to install (mini)conda.

download miniconda .sh installer
run it and follow the prompts

I run it like this

bash Miniconda3-latest-Linux-x86_64.sh -b -p ~/.miniconda

These are fairly routine steps, but if this is your first time you'll need to do them.

Add a few conda "channels" so that conda knows where to get packages from:

conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda

Install a couple of required conda packages:

conda install conda-build anaconda-client

Note that you can pass the -y flag to assume yes instead of answering each yes/no prompt manually.

You will need a copy of the bioconda recipes. I haven't yet sent a PR from my fork to the official bioconda channel, so for now use my fork:

git clone https://github.com/serine/bioconda-recipes
cd bioconda-recipes
conda build recipes/rnasik

To install RNAsik locally from the package you've just built, you need these two commands. The first simply prints the location of the .tar.bz2 file on your system. You also need that location if you want to publish to the Anaconda repository. The second installs the package.

conda build recipes/rnasik --output
conda install -y --use-local rnasik

To upload the newly built package to the Anaconda repository:
- set up an account at Anaconda
- anaconda login
- anaconda upload <path_to_file.tar.bz2>
- or, to upload under the dev label, anaconda upload <path_to_file.tar.bz2> --label dev
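The build, locate and upload steps can be chained so that the package path never has to be copied by hand. The function below is a minimal sketch, not part of RNAsik: its name and arguments are hypothetical, it assumes you have already run anaconda login, and it assumes the recipe produces a single package so that conda build --output prints exactly one path.

```shell
# Sketch only: build a recipe, find the resulting package and upload it.
# Assumes `anaconda login` has been done and the recipe yields one package.
publish_recipe() {
    local recipe_dir label pkg_path
    recipe_dir="$1"     # e.g. recipes/rnasik
    label="${2:-}"      # optional label, e.g. dev

    # Build the package first.
    conda build "$recipe_dir" || return 1

    # Ask conda-build where the built .tar.bz2 file lives.
    pkg_path="$(conda build "$recipe_dir" --output)" || return 1

    if [ -n "$label" ]; then
        anaconda upload "$pkg_path" --label "$label"
    else
        anaconda upload "$pkg_path"
    fi
}

# Example (not run here):
# publish_recipe recipes/rnasik dev
```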

Once you've logged in, anaconda stores a login token somewhere in your home directory, possibly here:

~/.continuum/anaconda-client

RNAsik conda environment

Since version 1.5.4 the RNAsik conda package contains only the RNAsik pipeline, without any additional bioinformatics tools. This is because I was having issues building a new version of the conda package. I'm guessing this has to do with the fact that other bioinformatics tools such as samtools aren't true dependencies: although I was specifying them under "requirements: run:", the build was spending too much time in "solving environment". Instead, I exported the full environment and host it on GitHub so that users can simply grab that YAML file and re-create the RNAsik environment.

This is how to create a new environment YAML config file:
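For someone who only wants to use the exported environment, re-creating it comes down to a single conda env create call. The sketch below assumes the YAML file has already been downloaded; the function name and the file path passed to it are hypothetical, since the exact location of the file on GitHub isn't given here.

```shell
# Sketch only: re-create a conda environment from a downloaded YAML file.
# The environment name is taken from the "name:" field inside the file.
recreate_rnasik_env() {
    local yaml_file
    yaml_file="$1"      # hypothetical local path, e.g. rnasik-1.5.4.yaml

    if [ ! -f "$yaml_file" ]; then
        echo "environment file not found: $yaml_file" >&2
        return 1
    fi
    conda env create --file "$yaml_file"
}

# Example (not run here):
# recreate_rnasik_env rnasik-1.5.4.yaml
```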

cat > tools.txt

rnasik
bigdatascript
fastqc
multiqc
bedtools
star==2.7.2b
subread
samtools
picard
bwa
skewer
je-suite
qualimap

ctrl^D

while read -r t; do [ -n "$t" ] && conda install -y "$t"; done < tools.txt

I know you can give all of those tools on the command line at once, but for some reason that gets stuck in "solving environment" as well.

conda env export > rnasik-1.5.4.yaml

Optionally, you can create an empty, clean environment before installing the packages:

conda create --name rnasik-1.5.4
conda activate rnasik-1.5.4

Then install and export as above. This additional step is probably the better approach, since you end up with a clean RNAsik environment.
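Putting the clean-environment route together, the whole sequence can be sketched as one function. This is an illustration under a few assumptions: the function name is hypothetical, and it uses the --name option of conda install and conda env export instead of conda activate, because activation often needs extra shell setup inside scripts.

```shell
# Sketch only: create a clean environment, install the listed tools one at a
# time, then export the result to <env_name>.yaml in the current directory.
build_env_yaml() {
    local env_name tools_file tool
    env_name="$1"       # e.g. rnasik-1.5.4
    tools_file="$2"     # one package spec per line, e.g. tools.txt

    conda create -y --name "$env_name" || return 1

    while read -r tool || [ -n "$tool" ]; do
        # Skip blank lines so conda is never called without a package.
        [ -n "$tool" ] || continue
        # Redirect stdin so conda cannot swallow the rest of the list.
        conda install -y --name "$env_name" "$tool" < /dev/null || return 1
    done < "$tools_file"

    conda env export --name "$env_name" > "$env_name.yaml"
}

# Example (not run here):
# build_env_yaml rnasik-1.5.4 tools.txt
```

Installing one tool per call mirrors the loop above, which sidesteps the long "solving environment" stalls seen when all tools are given at once.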

Travis CI and testing

Continuous integration is very useful for making sure your code is checked continuously. RNAsik code is tested with every commit. However, that testing is only as good as I, or hopefully we, make it. BigDataScript provides a very nice unit testing mechanism; the catch, of course, is that the tests have to be written. Currently only a very small proportion of the code is actually covered by tests, and a lot of work is needed in this space. One might say that I should have been writing tests as I was writing the code. Perhaps, but I'm new to this, and better late than never!

Have a look at the bds docs on how to write tests.

