Get directory of running script

In shell scripts, it is often useful to know which directory the script is running from. Here is a way to get this information:

curdir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)

“$0” is the script name. “dirname” returns the corresponding directory. “cd” changes to that directory, which becomes the working directory. “pwd” returns it, so it can be assigned to the variable “curdir”. The option “-P” tells to only use the physical directory structure and to not follow symbolic links. Note that “cd” is encapsulated using the “$(…)” syntax, so it does not change the working directory for the rest of the script.

Advertisements

Bam to wig conversion

Conversion from bam to wiggle format can be done using the rsem-bam2wig utility, which takes a sorted bam file as input. The syntax is rather simple:

rsem-bam2wig sorted-bam-input-filename wig-output-filename wiggle-plot-name [--no-fractional-weight]

The option “–no-fractional-weight” should be set if the bam file has not been generated by rsem.

To sort a bam file, use samtools:

samtools sort bam-input-filename sorted-bam-output-filename

Merging in R

A common task I happen to do in R is merging tables (for instance, using gene symbol as a join) and renaming columns with a suffix to indicate their table of origin. Below is a function that automatizes most of the process, taking any number of dataframes as input. It is also posted on GitHub.

bigWig to bedGraph to wig

I am currently analyzing ChIP-seq data from ENCODE, starting from bigWig files, which I have to convert to wig. Unfortunately, in my case, the bigWigToWig program from UCSC converts to bedGraph format. The reason why this is happening is somehow explained in this thread. Briefly, it is likely because the bigWig files were generated from a bedGraph and not a wig file. To be noted, UCSC also has a bigWigToBedGraph conversion program. One difference between the two programs is that bigWigToWig outputs bedGraph files with uniform step size, whereas bigWigToBedGraph outputs bedGraph files with variable step size, by combining consecutive steps that have the same value.

Anyway, I had to convert from bedGraph to wig. A Perl script by Dave Tang does the job, but it outputs wig files with a step size equal to 1 bp. Because such files are unnecessarily big and for some other reasons, I wanted to be able to specify the step size. So, I wrote a new bedGraph to wig converter, inspired by Dave Tang’s.

The script is written below and can also be found on GitHub. Step size is specified on the command line, and it has an option to skip steps with null value, in order to save space. I hope this is useful, and I obviously welcome any feedback.