Friday, June 12, 2009

Grouping related items using a Python dictionary


I thought I'd redo, in Python, an awk post that I wrote some days back on my bash scripting blog.

Input file:

$ cat data.txt
Manager1|sw1
Manager3|sw5
Manager1|sw4
Manager2|sw9
Manager2|sw12
Manager1|sw2
Manager1|sw0

Required output: group the engineers who report to the same manager, i.e.:

Manager3|sw5
Manager2|sw9,sw12
Manager1|sw1,sw4,sw2,sw0


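The Python program:

d = {}

fp = open("grp.txt", "w")
for line in open("data.txt"):
    # Split "Manager1|sw1" into ["Manager1", "sw1"]
    line = line.strip().split("|")
    # Create an empty list the first time a manager is seen
    d.setdefault(line[0], [])
    d[line[0]].append(line[1])

print d
for i, j in d.iteritems():
    # One line per manager: name, "|", comma-separated engineers
    fp.write(i + "|" + ','.join(j) + "\n")
fp.close()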

Output file after executing the above script (a dictionary has no fixed order here, so the lines may come out in a different order):

$ cat grp.txt
Manager3|sw5
Manager2|sw9,sw12
Manager1|sw1,sw4,sw2,sw0


Related concepts:
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
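
To make that description concrete, here is a minimal sketch of setdefault on a few made-up keys. The variable names (first, second) are only for illustration. It also shows that, because setdefault returns the value, the lookup and the append can be written on one line.

```python
# Minimal sketch of dict.setdefault; keys and values are illustrative.
d = {}

# Key is missing: the empty list is inserted and returned.
first = d.setdefault("Manager1", [])
first.append("sw1")

# Key is present: the existing list is returned, the default is ignored.
second = d.setdefault("Manager1", [])
second.append("sw4")

# Because the value is returned, lookup and append fit on one line.
d.setdefault("Manager2", []).append("sw9")

print(d)
```

Both calls for "Manager1" hand back the same list object, which is why appending through either name updates the dictionary.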

Dictionary iteritems : Read here
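
iteritems exists only in Python 2. As a rough sketch of how the same grouping might look in Python 3, where items() takes its place, the snippet below works on a list of lines instead of files. The helper names group_by_manager and format_groups are my own, not part of any library.

```python
# Python 3 sketch: dict.items() replaces the Python 2-only iteritems().
# group_by_manager and format_groups are hypothetical helper names.

def group_by_manager(lines):
    """Return {manager: [engineers]} from 'manager|engineer' lines."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        manager, engineer = line.split("|")
        groups.setdefault(manager, []).append(engineer)
    return groups


def format_groups(groups):
    """Return one 'manager|eng1,eng2' string per manager."""
    return [manager + "|" + ",".join(engineers)
            for manager, engineers in groups.items()]


sample = [
    "Manager1|sw1", "Manager3|sw5", "Manager1|sw4", "Manager2|sw9",
    "Manager2|sw12", "Manager1|sw2", "Manager1|sw0",
]

for row in format_groups(group_by_manager(sample)):
    print(row)
```

From Python 3.7 onwards a dictionary keeps insertion order, so with this sample the rows come out as Manager1, Manager3, Manager2 rather than in the order shown in grp.txt above.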