From BITS wiki

Similar to: Filo, tabutils, awk

Summarize a text file by column(s)


Utility to process piped data and return simple stats [1].

Example usage

This tool reads one column of numeric data (one value per line). A simple use, computing the mean, would be:

echo -e "5\n1\n3" | qstats -m

# qstats also takes a filename (or multiple filenames) as an argument.
# This line will compute summary statistics on two files, each containing only one column of numeric data,
# and print them both out preceded by the file name (so you know which one is which)

qstats -s a_file.dat another_file.dat

# A more realistic example would be to remove the header of a CSV, subset it by a condition,
# extract one column (with cut or awk) and grab summary statistics:

tail -n +2 mycsv.csv | grep "COND1" | cut -d , -f 2 | qstats -s
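
To see what actually reaches qstats in a pipeline like this, it can help to print the intermediate column first. The following is a minimal sketch, assuming a small hypothetical CSV (the column names, the `COND1` label and the temporary file are placeholders, not part of qstats). The header is skipped before filtering, because a grep for a data value would normally drop the header line on its own, and a later header-removal step would then discard a data row.

```shell
# Hypothetical sample data; file contents and the COND1 label are placeholders.
csv=$(mktemp)
printf 'group,value\nCOND1,5\nCOND2,7\nCOND1,1\nCOND1,3\n' > "$csv"

# Drop the header, keep the COND1 rows, extract column 2.
column=$(tail -n +2 "$csv" | grep "COND1" | cut -d , -f 2)

# One numeric value per line: the shape qstats expects on stdin.
printf '%s\n' "$column"
rm -f "$csv"
```

Here the printed column is 5, 1 and 3, one per line; appending `| qstats -s` to the pipeline would summarize it.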

# If you need stats on, say, a comma-separated text file of numeric values that is not in column format,
# you can use tr (translate) to replace the commas with newlines and pipe the result to the program:

tr , '\n' < file.txt | qstats
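
As a quick check of the reshaping step on its own, the sketch below runs tr on a hypothetical two-line, comma-separated input (the values are made up for illustration) and prints the single column that would be piped to qstats.

```shell
# Hypothetical input: six comma-separated values spread over two lines.
values=$(printf '5,1,3\n4,6,10\n' | tr , '\n')

# Every comma becomes a newline, so each value ends up on its own line.
printf '%s\n' "$values"
```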

# You can also call this tool with filenames as arguments, but its utility that way is limited because,
# for the time being, it can only read input that is a single column of numeric values,
# and very few files consist of just one column.

# Summary of an arbitrary list of numbers, illustrating the available statistics:
echo -e "5\n1\n3\n4\n6\n10" | qstats
Min.     1
1st Qu.  3
Median   4.5
Mean     4.83333
3rd Qu.  6
Max.     10
Range    9
Std Dev. 2.79384
Length   6
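
The Mean and Std Dev. figures above can be cross-checked without qstats. The sketch below uses awk on the same six numbers; it is an independent calculation, not qstats' own implementation. It assumes the population form of the standard deviation (dividing by n rather than n - 1), which is inferred from the sample output above, since that is the form that reproduces 2.79384.

```shell
# Cross-check with awk: mean and population standard deviation.
stats=$(printf '5\n1\n3\n4\n6\n10\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    sd = sqrt(sumsq / n - mean * mean)   # population form (divide by n)
    printf "Mean     %.5f\nStd Dev. %.5f\nLength   %d\n", mean, sd, n
  }')
printf '%s\n' "$stats"
# Mean     4.83333
# Std Dev. 2.79384
# Length   6
```

The sum-of-squares shortcut is fine for a short list like this one; for long or large-valued inputs a two-pass or running-update calculation is numerically safer.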

  1. https://github.com/tonyfischetti/qstats
