Linux is a very popular operating system in bioinformatics. In this training you will learn why that is and how it can help you with your bioinformatics analysis. After this training you will be able to:

install software on Linux
use the command line to run tools
use the command line to handle files
write small scripts to automate your analysis

Training material

Additional information

Exercises during the training

Exercises: Part 1

During the training, an Ubuntu Linux installation is available in a Google Cloud environment. To access it, we use Google Chrome and the 'VNC Viewer for Google Chrome' application.
When you launch the application, you have to enter an IP address; this will be provided during the training.

Installing Linux

Installing software

Install the tools from the presentation slides. Note that when you install something, everyone in the training has access to that tool!
Here are some exercises to try on your personal Linux installation:

Command line

File system

Bonus

Installing and compiling examples

Exercises: Part 2

Text mining, scripting and 'for' loops

NGS intro

Bonus

EXTRA

Bioinformatics one-liners

Sneak preview of the duplication rate of reads

gunzip -dc fastq.gz | head -n 1000000 | awk '{ if(NR%4==2) { print $1 } }' | sort | uniq -c | sort -g > sorted_duplicated
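
The one-liner above writes one line per distinct read sequence, prefixed by how often it occurs. As a minimal sketch of how that table could be turned into a duplicate count, the snippet below runs the same pipeline on a tiny made-up FASTQ file. The file name reads.fastq.gz, the four reads and the variable names are assumptions for illustration only.

```shell
# Hypothetical test data: 4 reads, two of which share the sequence ACGT.
workdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\nIIII\n@r4\nGGCC\n+\nIIII\n' \
    | gzip > "$workdir/reads.fastq.gz"

# Same steps as the one-liner: keep the sequence line of each 4-line record,
# then count how often each sequence occurs.
gunzip -c "$workdir/reads.fastq.gz" | head -n 1000000 \
    | awk 'NR % 4 == 2 { print $1 }' | sort | uniq -c | sort -g \
    > "$workdir/sorted_duplicated"

# Sum the counts (reads looked at) and count the lines (distinct sequences).
total_reads=$(awk '{ n += $1 } END { print n + 0 }' "$workdir/sorted_duplicated")
unique_seqs=$(wc -l < "$workdir/sorted_duplicated" | tr -d ' ')
duplicate_reads=$((total_reads - unique_seqs))

echo "total=$total_reads unique=$unique_seqs duplicates=$duplicate_reads"
```

Because only the first 1000000 lines (250000 reads) are inspected, the result is a quick estimate rather than the duplication rate of the whole file.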

Convert fastq to fasta

paste - - - - < in.fq | cut -f 1,2 | sed 's/^@/>/' | tr "\t" "\n" > out.fa
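
To see what each step of this conversion does, the sketch below applies the same pipeline to a small made-up in.fq. The read names, sequences and the temporary directory are assumptions for illustration. The second read has a quality string starting with '@' to show that joining four lines with paste first keeps sed from touching quality lines.

```shell
# Hypothetical input: two FASTQ records; the second quality line starts with '@'.
workdir=$(mktemp -d)
printf '@read1 sample\nACGT\n+\nIIII\n@read2\nTTGA\n+\n@III\n' > "$workdir/in.fq"

# paste - - - -  : join each 4-line record into one tab-separated line
# cut -f 1,2     : keep only the header and the sequence
# sed            : replace the leading '@' of the header by '>'
# tr             : put header and sequence back on separate lines
paste - - - - < "$workdir/in.fq" | cut -f 1,2 | sed 's/^@/>/' | tr "\t" "\n" \
    > "$workdir/out.fa"

fasta_records=$(grep -c '^>' "$workdir/out.fa")
cat "$workdir/out.fa"
```

Note that this approach assumes every record spans exactly four lines, so it is not suited to FASTQ files with wrapped sequences.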

Count all the variants called in all the vcf files

cat *.vcf | grep -v '^#' | wc -l
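
The command above gives a single total. As a sketch of how a 'for' loop could report the count per file as well, the snippet below builds two small made-up vcf files and loops over them. The file names sample1.vcf and sample2.vcf, their contents and the variable names are assumptions for illustration.

```shell
# Hypothetical input: two small vcf files with 2 and 1 variant lines.
workdir=$(mktemp -d)
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n' \
    > "$workdir/sample1.vcf"
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr2\t300\t.\tG\tA\n' \
    > "$workdir/sample2.vcf"

total=0
for vcf in "$workdir"/*.vcf; do
    # grep -vc counts the lines that do NOT start with '#', i.e. the variant lines.
    # '|| true' keeps the loop going when a file has no variant lines at all.
    n=$(grep -vc '^#' "$vcf" || true)
    echo "$(basename "$vcf"): $n"
    total=$((total + n))
done
echo "total: $total"

# Cross-check against the one-liner from the text.
all_variants=$(cat "$workdir"/*.vcf | grep -v '^#' | wc -l | tr -d ' ')
```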

Count the variants shared by all three vcf files

cat *.raw.vcf | grep -v '^#' | awk '{print $1 "\t" $2 "\t" $5}' | sort | uniq -c | grep ' 3 ' | wc -l
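
The command above reduces every variant line to chromosome, position and alternative allele, and keeps the combinations that occur exactly three times across the files. The sketch below checks this on three made-up files. The file names, their contents and the variable names are assumptions for illustration. It also shows awk '$1 == 3' as a swapped-in alternative to grep ' 3 ' that tests only the count column.

```shell
# Hypothetical input: chr1:100 A>G is in all three files, chr1:200 C>T in two,
# chr2:300 G>A in one.
workdir=$(mktemp -d)
header='##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n'
printf "${header}chr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n" > "$workdir/a.raw.vcf"
printf "${header}chr1\t100\t.\tA\tG\nchr2\t300\t.\tG\tA\n" > "$workdir/b.raw.vcf"
printf "${header}chr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n" > "$workdir/c.raw.vcf"

# Original approach: uniq -c prefixes each distinct variant with its count,
# grep ' 3 ' keeps the lines whose count is 3.
shared=$(cat "$workdir"/*.raw.vcf | grep -v '^#' \
    | awk '{print $1 "\t" $2 "\t" $5}' | sort | uniq -c | grep ' 3 ' \
    | wc -l | tr -d ' ')

# Alternative: compare the count column directly instead of pattern matching.
shared_awk=$(cat "$workdir"/*.raw.vcf | grep -v '^#' \
    | awk '{print $1 "\t" $2 "\t" $5}' | sort | uniq -c | awk '$1 == 3' \
    | wc -l | tr -d ' ')

echo "shared by all three files: $shared (awk check: $shared_awk)"
```

Both variants assume that a given variant appears at most once per file; a record duplicated within one file would inflate its count.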

Software installation exercises

Adding the Debian Med repository containing bioinformatics tools in Debian-derived distributions
