Column Normalizer

for Galaxy

Introduction

This tool normalizes numeric columns in a text file.
It can be used from the command line, or from a Galaxy server.


Galaxy Usage


Command-line Usage

usage information

	$ column_normalizer -h
	Column Normalizer
	Copyright (c) 2009 by A. Gordon (gordon at cshl.edu)

	usage: column_normalizer [-o FILE] [-l] [-v] [-sfb] [-p N] [-P] [-h] [-a] [-A NAME] INFILE COLUMN

	  INFILE   - input file name.
	  COLUMN   - column to normalize.
	  -o FILE  - Output file (default is STDOUT).
	  -l       - skip first line.
	  -v       - verbose - print summary.
		   - If output goes to STDOUT, report goes to STDERR.
		   - If output goes to FILE, report goes to STDOUT.
	  -s       - Force scientific notation output.
	  -f       - Force decimal notation output.
	  -b       - Use 'best' notation of scientific or fixed notation output.
	  -p N     - Set precision - number of digits after decimal point (default=8).
	  -P       - Always output a decimal point.
	  -a       - Append new column (instead of overriding column COLUMN).
	  -A NAME  - If skipping first line (-l) AND appending new column (-a),
		     append a new column header to the first line.

	NOTE: The input file is read TWICE - so it can't be a pipe! must be a real file.
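
Because the input is read twice, data coming from another program has to be saved to a regular file before it is normalized. The Python sketch below shows one way to do that. The helper names `spool_to_file` and `normalize_stream` are hypothetical and not part of column_normalizer; only the `-l` flag and the `INFILE COLUMN` arguments are taken from the usage text above.

```python
import os
import shutil
import subprocess
import tempfile


def spool_to_file(stream):
    """Copy a text stream into a named temporary file and return its path.

    The caller is responsible for deleting the file afterwards.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
        return tmp.name


def normalize_stream(stream, column, skip_header=True):
    """Spool `stream` to disk, then run column_normalizer on the real file.

    Assumes the column_normalizer binary is on the PATH.
    """
    path = spool_to_file(stream)
    try:
        cmd = ["column_normalizer"]
        if skip_header:
            cmd.append("-l")  # skip first line
        cmd += [path, str(column)]
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return result.stdout
    finally:
        os.remove(path)
```

With this in place, something like `normalize_stream(sys.stdin, 2)` could sit at the end of a pipeline, since the tool itself only ever sees the temporary file.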

usage examples

	( input file )
	$ cat example.txt
	fruit		amount
	apples		48
	oranges		28
	blueberries	4
	

	( The following error is expected: the first line is a header and cannot be normalized, so it must be skipped with -l )
	$ column_normalizer example.txt 2
	column_normalizer: Input error: failed to read column 2 on line 1
	
	
	( Skipping the first line )
	$ column_normalizer -l example.txt 2
	fruit		amount
	apples		0.6
	oranges		0.35
	blueberries	0.05
	
	( Appending a new column, named "normalized_amount")
	$ column_normalizer -l -a -A normalized_amount example.txt 2
	fruit		amount	normalized_amount
	apples		48	0.6
	oranges		28	0.35
	blueberries	4	0.05
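
The numbers above are consistent with each value being divided by the column total (48 + 28 + 4 = 80, so 48 / 80 = 0.6). The Python sketch below reproduces that arithmetic under that assumption. It is an illustration, not the tool's actual implementation, and the function name `normalize_column` is hypothetical. It makes two passes over the data (one to sum, one to divide), which is in line with the tool's need to read its input twice.

```python
def normalize_column(rows, column, skip_header=False, append=False,
                     header_name=None):
    """Divide each value in the 1-based `column` by the column total.

    `rows` is a list of lists of strings, one inner list per input line.
    """
    idx = column - 1
    header, data = (rows[0], rows[1:]) if skip_header else (None, rows)

    # Pass 1: compute the column total.
    total = sum(float(row[idx]) for row in data)

    # Pass 2: emit each row with the normalized value, either replacing
    # the original column or appended as a new last column.
    out = []
    if header is not None:
        extra = [header_name] if (append and header_name) else []
        out.append(header + extra)
    for row in data:
        value = format(float(row[idx]) / total, "g")
        if append:
            out.append(row + [value])
        else:
            out.append(row[:idx] + [value] + row[idx + 1:])
    return out


rows = [
    ["fruit", "amount"],
    ["apples", "48"],
    ["oranges", "28"],
    ["blueberries", "4"],
]
for row in normalize_column(rows, 2, skip_header=True, append=True,
                            header_name="normalized_amount"):
    print("\t".join(row))
```

The sketch does not guard against a column that sums to zero; how column_normalizer handles that case is not stated here.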

	( Forcing scientific notation output, with default precision )
	$ column_normalizer -l -s example.txt 2
	fruit		amount
	apples		6.00000000e-01
	oranges		3.50000000e-01
	blueberries	5.00000000e-02

	( Forcing scientific notation output, with precision = 2 digits )
	$ column_normalizer -l -s -p 2 example.txt 2
	fruit		amount
	apples		6.00e-01
	oranges		3.50e-01
	blueberries	5.00e-02
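
The -s and -p outputs above look like printf-style `%e` formatting with N digits after the decimal point. As an illustration only (how the tool formats numbers internally is an assumption), Python's format specifiers produce the same strings; the helper name `scientific` is hypothetical.

```python
def scientific(value, precision=8):
    """Format `value` in scientific notation with `precision` digits
    after the decimal point (8 mirrors the default of the -p option)."""
    return format(value, ".{}e".format(precision))


for v in (0.6, 0.35, 0.05):
    # First column: default precision; second column: precision 2.
    print(scientific(v), scientific(v, 2))
```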





Download and Installation

Files

File                                 Version  Release Date   md5 sum
column_normalizer-0.1.tar.gz         0.1      23-April-2009  b9196540a3261138a38fb4b5ffc5ab83
Column Normalizer Binary (32-bit)    0.1      23-April-2009  66e88c9b37f42923cb97eeee1e907851
  (statically pre-compiled binary for x86 32-bit Linux systems)
Column Normalizer Binary (64-bit)    0.1      23-April-2009  55d9c9f4042df8a44f661e985714a1bd
  (statically pre-compiled binary for x86 64-bit Linux systems)
libgtextutils-0.2.tar.gz             0.2      23-April-2009  3f5021f579f8505bd67b16e359feb532


Installation

  1. install 'libgtextutils-0.2'
    	$ wget http://hannonlab.cshl.edu/column_normalizer/libgtextutils-0.2.tar.gz
    	$ tar -xzvf libgtextutils-0.2.tar.gz
    	$ cd libgtextutils-0.2
    	$ ./configure
    	$ make
    	$ make check     # optional
    	$ sudo make install
    
  2. install column_normalizer
    
    	$ wget http://hannonlab.cshl.edu/column_normalizer/column_normalizer-0.1.tar.gz
    	$ tar -xzvf column_normalizer-0.1.tar.gz
    	$ cd column_normalizer
    	$ make
    	$ sudo cp column_normalizer /usr/local/bin
    
note:
There is no 'configure' script. The makefile invokes 'pkg-config' directly to locate libgtextutils-0.2, so if libgtextutils was installed properly, 'make' should just work.
If pkg-config fails to find libgtextutils but the following file exists:
/usr/local/lib/pkgconfig/gtextutils-0.2.pc
add the directory containing the .pc file to the PKG_CONFIG_PATH variable and run make again:
  $ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig/
  $ make
Otherwise, edit the makefile by hand so that it points to the correct location of libgtextutils.

Galaxy Integration

  1. Copy the column_normalizer.xml file to one of Galaxy's tool sub-folders:
        $ cp column_normalizer.xml [GALAXY-DIR]/tools/filters/
    		
  2. Add the new tool to Galaxy's tool_conf.xml, like so:
        <section name="Text Manipulation" id="textutils">
    	...
    	<tool file="filters/column_normalizer.xml" />
    	...
        </section>
    	
  3. Restart Galaxy

License

Column Normalizer is released under the AGPLv3.
See the COPYING file in the tarball for more details.

Useful Links

contact

gordon at cshl.edu