Input file:
$ cat file.txt
IN,90,453
US,12,1,120
NZ,89,200
WI,500
TS,12,124
Required output: Sort the above comma-delimited file on its last field (column), i.e. the required output is:
US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500
Solution:
The solution using Awk in the UNIX bash shell is here. And here is the Python one:
$ python
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d_list = [line.strip() for line in open("file.txt")]
>>> d_list
['IN,90,453', 'US,12,1,120', 'NZ,89,200', 'WI,500', 'TS,12,124']
>>> d_list.sort(key = lambda line: line.split(",")[-1])
>>> d_list
['US,12,1,120', 'TS,12,124', 'NZ,89,200', 'IN,90,453', 'WI,500']
>>> for line in d_list:
... print line
...
US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500
>>>
Some notes:
Accessing the last element of a list in Python:
A negative index accesses elements from the end of the list counting backwards. The last element of any non-empty list is always list[-1].
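As a small illustration of the note above, here is a sketch using one row from the sample file. The variable names are only illustrative, and the guard for the empty list is an addition, since indexing an empty list with [-1] raises IndexError.

```python
# Split one sample row into its comma-separated fields.
fields = "US,12,1,120".split(",")

last = fields[-1]         # last field, counting from the end
second_last = fields[-2]  # the field just before it

# An empty list has no last element, so check before indexing.
empty = []
last_or_none = empty[-1] if empty else None

print(last, second_last, last_or_none)
```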
6 Comments:
If you want a numeric sort like in your bash solution, you need to convert the field to a number: d_list.sort(key = lambda line: float(line.split(",")[-1]))
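To see why the conversion matters, here is a minimal sketch. It assumes a hypothetical variation of the sample data in which the TS row ends in 1000 instead of 124, so that the last fields no longer all have the same number of digits. The helper name last_field_as_number is made up for this example.

```python
# Hypothetical data: same rows as file.txt, but TS now ends in 1000.
lines = ["IN,90,453", "US,12,1,120", "NZ,89,200", "WI,500", "TS,12,1000"]

def last_field_as_number(line):
    # Convert the last comma-separated field to a number for comparison.
    return float(line.split(",")[-1])

# Comparing the last field as text puts '1000' before '120'.
by_text = sorted(lines, key=lambda line: line.split(",")[-1])

# Comparing it as a number gives the intended order.
by_number = sorted(lines, key=last_field_as_number)

print(by_text)
print(by_number)
```

With the original file.txt, both keys happen to give the same order, because every last field there has three digits.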
@Ryan T. Thanks Ryan. It's useful.
What if I need to further sort according to another column?
For example:
Input
chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0
Output expected
chrom index forward reverse
chr01 13 1 2
chr01 45 45 0
chr01 212 5 2
chr01 3445 1 6
chr02 345 12 6
chr02 2311 3 1
chr03 12 1 4
chr13 23432 4 7
Please reply ASAP. How do I modify it to sort on multiple columns?
@Gaurav Kandoi : Thanks a lot for the question.
Using UNIX sort, I can think of this solution:
$ cat file.txt
chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0
$ head -1 file.txt ; tail -n +2 file.txt | sort -k1,1 -k2,2n
chrom index forward reverse
chr01 13 1 2
chr01 45 45 0
chr01 212 5 2
chr01 3445 1 6
chr02 345 12 6
chr02 2311 3 1
chr03 12 1 4
chr13 23432 4 7
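For anyone who wants to stay in Python, the same idea can be sketched as below: keep the header line aside, then sort the remaining rows on the first column as text and the second column as a number. This is only a sketch, not the method used above; the function name sort_keeping_header is made up, and the data is pasted inline instead of being read from file.txt so the example is self-contained.

```python
# Sample data from the question, inlined so the example runs on its own.
text = """chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0"""

def sort_keeping_header(lines):
    # The first line is the header; only the rest is sorted.
    header, rows = lines[0], lines[1:]

    def key(row):
        cols = row.split()
        # Column 1 compared as text, column 2 compared as an integer.
        return (cols[0], int(cols[1]))

    return [header] + sorted(rows, key=key)

result = sort_keeping_header(text.splitlines())
for line in result:
    print(line)
```

The key function returns a tuple, so rows are ordered by chrom first and by index only when the chrom values are equal.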
@Gaurav Kandoi :
I have added this post to my UNIX bash scripting blog. Thanks.
http://unstableme.blogspot.in/2012/07/unix-sort-file-ignoring-first-line.html
Thank you, merci, danka, muchas gracias! Helped solve my python conundrum! :-)