The command line seems boring at first, but after some time using it you will become addicted to its simplicity and power.
A number of tools let you handle NGS data at different levels. Those presented below overlap to some degree, but each also has unique features.
- 1 How to get help under UNIX?
- 2 Built-In Unix commands
- 3 File editing applications
- 4 File transfer
- 5 System/Job management applications
How to get help under UNIX?
Most Unix executables have built-in help that can be called in various ways.
# man pages constitute the most sophisticated level of built-in help.
# their content is stored in separate files and accessed by the system.
man <command-name>
# intrinsic help
<command-name> -h (--help | -? | ?)
# or often simply by typing
<command-name>
Built-In Unix commands
These are only a few of the many, but they are the absolute essentials.
- cat to print the content of files to the screen or combine several files into one
- grep to filter huge files and keep only the essential lines (the 'substantifique moelle')
- ls is the equivalent of the good old DOS 'dir' and will return a list of files and folders for the provided path
- sort to order lines based on one or more columns
- split to split big files into digestible chunks
- tr to replace one character with another in a whole file at once
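To give a feel for how these commands combine, here is a minimal sketch that runs each of them on two tiny tab-separated files. The file names, sample names and counts are made up for the illustration, and everything happens in a temporary folder.

```shell
# Work in a throwaway directory so nothing in the current folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small tab-separated files (hypothetical sample names and read counts).
printf 'sampleB\t250\nsampleA\t100\n' > part1.tsv
printf 'sampleD\t75\nsampleC\t300\n' > part2.tsv

# cat: combine both files into one.
cat part1.tsv part2.tsv > all.tsv

# ls: list the files created so far.
ls "$workdir"

# grep: keep only the lines that mention sampleC.
grep 'sampleC' all.tsv > sampleC.tsv

# sort: order lines numerically on the second column, using tab as delimiter.
sort -t "$(printf '\t')" -k2,2n all.tsv > sorted.tsv

# split: cut the sorted file into chunks of two lines (chunk_aa, chunk_ab, ...).
split -l 2 sorted.tsv chunk_

# tr: replace every tab with a comma.
tr '\t' ',' < sorted.tsv > sorted.csv
cat sorted.csv
```

The same commands work unchanged on files with millions of lines, which is where they start to pay off.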
File editing applications
These must-have programs make your day by lifting the weight of the data for you.
- awk, your best friend when it comes to playing with tabular files
- column presents columnar data nicely padded with spaces
- transpose to swap rows and columns. A must-have utility that pairs well with column.
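As a small illustration of working with tabular files, the sketch below uses awk to sum two columns and to swap rows and columns, then pads the result with column when that command is available. The table content and file names are invented for the example, and the awk transposition is only a stand-in for the transpose utility, which is not assumed to be installed.

```shell
workdir=$(mktemp -d)
table="$workdir/counts.tsv"   # hypothetical file name and content
printf 'gene\tctrl\ttreated\ngeneA\t10\t30\ngeneB\t5\t15\n' > "$table"

# awk: skip the header line and print each gene with the sum of its two counts.
awk 'BEGIN { FS = OFS = "\t" } NR > 1 { print $1, $2 + $3 }' "$table" \
  > "$workdir/sums.tsv"

# awk again: store every cell, then print column by column to swap
# rows and columns.
awk -F '\t' '
  { for (i = 1; i <= NF; i++) cell[NR, i] = $i
    if (NF > ncol) ncol = NF }
  END {
    for (i = 1; i <= ncol; i++) {
      line = cell[1, i]
      for (j = 2; j <= NR; j++) line = line "\t" cell[j, i]
      print line
    }
  }' "$table" > "$workdir/transposed.tsv"

# column -t pads the fields with spaces; fall back to cat if it is missing.
if command -v column >/dev/null 2>&1; then
  column -t "$workdir/transposed.tsv"
else
  cat "$workdir/transposed.tsv"
fi
```

Holding the whole table in memory is fine for small files; very large tables are better handled by a dedicated tool.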
query text files
- filo, Useful FILe and stream Operations (includes groupBy, shuffle and stats)
File transfer
get BIG Data from the internet
- wget will happily slurp down anything within reach of its greedy claws
System/Job management applications
create scripts with intelligent interfaces
- Expect helps you script the asking for input and the processing of that input within pipelines
run jobs in parallel
- parallel, the GNU tool to run commands in parallel and replace for loops to save time.
- ppss to perform parallel tasks from a folder of similar input files.
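The idea of replacing a for loop can be sketched as follows, assuming GNU parallel is installed; when it is not, the script falls back to the plain loop so that it still runs. The three input files are tiny placeholders for per-sample data, and counting lines stands in for a real analysis step.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three small hypothetical input files standing in for per-sample data.
for s in s1 s2 s3; do
  printf 'read1\nread2\nread3\n' > "$s.txt"
done

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # GNU parallel: {} is replaced by each input file, -j 2 runs two jobs at a time.
  parallel -j 2 'wc -l < {} > {}.count' ::: s1.txt s2.txt s3.txt
else
  # Fallback when GNU parallel is missing: the equivalent sequential for loop.
  for f in s1.txt s2.txt s3.txt; do
    wc -l < "$f" > "$f.count"
  done
fi

ls "$workdir"
```

With three tiny files the gain is nil, but the same pattern applied to dozens of large files keeps all cores busy instead of one.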
run jobs while you are Offline
This is particularly useful if you work on a remote computer and wish to disconnect and go home while the heavy load is running. It is also a great way to spread your applications over different screens and let them live their own lives while you do something else.
- screen to start experiments in separate 'screens' and be able to log out without losing them.
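A typical session could look like the sketch below, which only does something when screen is installed. The session name is made up, and 'sleep 30' stands in for a long-running job.

```shell
session="ngs_demo"   # hypothetical session name

if command -v screen >/dev/null 2>&1; then
  # -d -m starts the session detached, -S gives it a name.
  if screen -dmS "$session" sleep 30; then
    sleep 1
    # List the sessions known to screen (its exit code varies, so ignore it).
    screen -ls || true
    # To look inside, reattach with:  screen -r ngs_demo
    # and detach again with Ctrl-a d, which leaves the job running.
    # Close the demo session so nothing is left behind.
    screen -S "$session" -X quit || true
    status="demo session started and closed"
  else
    status="screen could not start a session here"
  fi
else
  status="screen is not installed on this machine"
fi
echo "$status"
```

In real use you would skip the last 'quit' step, log out, and reattach to the session the next day.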
list running processes
You will often need to evaluate the workload on your favorite server and identify nasty jobs before killing them.
- top and htop monitor your system and show what is running and how many resources are used.