Wednesday, November 17, 2010

Python sort file based on last field


Input file:

$ cat file.txt
IN,90,453
US,12,1,120
NZ,89,200
WI,500
TS,12,124

Required: sort the above comma-delimited file on its last field (column). Note that the lines do not all have the same number of fields. The expected output is:

US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500

Solution:
The solution using Awk in the UNIX bash shell is here. And here is the Python one:

$ python
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d_list = [line.strip() for line in open("file.txt")]
>>> d_list
['IN,90,453', 'US,12,1,120', 'NZ,89,200', 'WI,500', 'TS,12,124']
>>> d_list.sort(key = lambda line: line.split(",")[-1])
>>> d_list
['US,12,1,120', 'TS,12,124', 'NZ,89,200', 'IN,90,453', 'WI,500']
>>> for line in d_list:
...     print line
...
US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500
>>>

Some notes:
Accessing the last element of a list in Python:
A negative index counts backwards from the end of the list, so the last element of any non-empty list is always list[-1]. That is why the sort key works even though the lines have different numbers of fields.
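
As a small illustration of the note above, here is a minimal sketch that inlines the sample rows instead of reading file.txt. The helper name last_field and the int() variant are assumptions added for the example, not part of the session above. The key used in the session compares the last field as text, which happens to give the numeric order here because every last field has three digits. The code is written so that it also runs on Python 3.

```python
# Sample rows from file.txt, inlined so the sketch is self-contained.
rows = ["IN,90,453", "US,12,1,120", "NZ,89,200", "WI,500", "TS,12,124"]

def last_field(line):
    # Hypothetical helper: split on commas and take the final element via index -1.
    return line.split(",")[-1]

# Same key as in the session above: last fields are compared as strings.
by_text = sorted(rows, key=last_field)

# Numeric variant: convert the last field to int before comparing.
by_number = sorted(rows, key=lambda line: int(last_field(line)))

print(by_text)
print(by_number)

# Text order and numeric order can disagree once the widths differ.
text_order = sorted(["9", "10"])             # compared character by character
number_order = sorted(["9", "10"], key=int)  # compared as integers
print(text_order)
print(number_order)
```

If the last fields can have different numbers of digits, the numeric variant is likely the one you want.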

6 Comments:

Ryan T. said...

If you want a numeric sort like in your bash solution you need to convert your field to a number: d_list.sort(key = lambda line: float(line.split(",")[-1]))

Jadu Saikia said...

@Ryan T. Thanks Ryan. It's useful.

Gaurav Kandoi said...

What if I need to further sort according to another column?

like

Input


chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0

Output expected


chrom index forward reverse
chr01 13 1 2
chr01 45 45 0
chr01 212 5 2
chr01 3445 1 6
chr02 345 12 6
chr02 2311 3 1
chr03 12 1 4
chr13 23432 4 7


Please reply ASAP... how do I modify it to sort based on multiple columns?

Jadu Saikia said...

@Gaurav Kandoi : Thanks a lot for the question.
Using unix sort I can think of this solution:

$ cat file.txt
chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0

$ head -n 1 file.txt ; tail -n +2 file.txt | sort -k1,1 -k2,2n

chrom index forward reverse
chr01 13 1 2
chr01 45 45 0
chr01 212 5 2
chr01 3445 1 6
chr02 345 12 6
chr02 2311 3 1
chr03 12 1 4
chr13 23432 4 7
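
The same ordering can probably be reproduced in Python too. What follows is only a sketch, assuming whitespace-separated columns and exactly one header line. The sample data is inlined rather than read from file.txt, and the helper name chrom_then_index is made up for the example. The key returns a tuple, so rows are compared on the first column as text and, when those are equal, on the second column as a number.

```python
# Sample data from the comment above, inlined so the sketch is self-contained.
text = """chrom index forward reverse
chr01 13 1 2
chr03 12 1 4
chr01 3445 1 6
chr02 2311 3 1
chr13 23432 4 7
chr01 212 5 2
chr02 345 12 6
chr01 45 45 0"""

lines = text.splitlines()
# Keep the header aside so that it is not sorted with the data rows.
header, rows = lines[0], lines[1:]

def chrom_then_index(line):
    # Hypothetical key: column 1 compared as text, column 2 compared as an integer.
    fields = line.split()
    return (fields[0], int(fields[1]))

rows.sort(key=chrom_then_index)

print(header)
for row in rows:
    print(row)
```

Further columns can be added to the tuple in priority order, for example int(fields[2]) as a third element.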

Jadu Saikia said...

@Gaurav Kandoi :
I have added this as a post on my UNIX bash scripting blog. Thanks.

http://unstableme.blogspot.in/2012/07/unix-sort-file-ignoring-first-line.html

LexieCita said...

Thank you, merci, danka, muchas gracias! Helped solve my python conundrum! :-)