Python basics for newbies: June 2009

Friday, June 19, 2009

Remove duplicate based on field using python

Input file:


$ cat file.txt
DD:12
AA:11
EE:13
AA:11
BB:09
DD:13
AA:78

Required output: keep only the first occurrence of each distinct first field, i.e.:


DD:12
AA:11
EE:13
BB:09

Python script:


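seen = set()  # first fields that have already been printed

# Python 2 syntax; on Python 3 use print(line, end='')
for line in open('file.txt'):
   ff = line.split(':',1)[0]  # first field: text before the first ':'
   if ff not in seen:
      seen.add(ff)
      print line,  # trailing comma: the line already ends with a newline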

Awk alternative:


$ awk -F ":" '!x[$1]++' file.txt
DD:12
AA:11
EE:13
BB:09
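
For readers unfamiliar with the awk idiom: !x[$1]++ prints a line only when the counter for its first field is still zero, and then increments that counter. A minimal Python 3 sketch of the same logic follows. The function name first_occurrences and the in-memory sample list are assumptions for illustration, not part of the original post.

```python
from collections import defaultdict

def first_occurrences(lines, sep=":"):
    """Yield each line whose first field has not been seen before."""
    counts = defaultdict(int)          # plays the role of awk's x[] array
    for line in lines:
        key = line.split(sep, 1)[0]    # awk's $1 when run with -F ":"
        if counts[key] == 0:           # awk's !x[$1]
            yield line
        counts[key] += 1               # awk's post-increment ++

# Same records as file.txt above, held in a list for the example.
sample = ["DD:12", "AA:11", "EE:13", "AA:11", "BB:09", "DD:13", "AA:78"]
result = list(first_occurrences(sample))
print("\n".join(result))
```

Because it is a generator, the function can also be fed an open file object directly, so the whole file never has to sit in memory.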

Friday, June 12, 2009

Grouping related items using python dictionary

I thought I would try, in Python, a problem that I solved with awk some time ago on my bash scripting blog.

Input file:


$ cat data.txt
Manager1|sw1
Manager3|sw5
Manager1|sw4
Manager2|sw9
Manager2|sw12
Manager1|sw2
Manager1|sw0

Required output: group the engineers who report to the same manager, i.e.:


Manager3|sw5
Manager2|sw9,sw12
Manager1|sw1,sw4,sw2,sw0

The python program:


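d = {}  # manager -> list of engineers

for line in open("data.txt"):
   manager, engineer = line.strip().split("|")
   # create an empty list on first sight of a manager, then append to it
   d.setdefault(manager, []).append(engineer)

print d  # debug: show the grouped dictionary (Python 2 syntax)
fp = open("grp.txt", "w")
for manager, engineers in d.iteritems():  # Python 2; use d.items() on Python 3
   fp.write(manager + "|" + ','.join(engineers) + "\n")
fp.close()  # make sure everything is flushed to grp.txt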

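Output file after executing the above script (a Python 2 dictionary does not preserve insertion order, so the order of the managers may differ):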


$ cat grp.txt
Manager3|sw5
Manager2|sw9,sw12
Manager1|sw1,sw4,sw2,sw0
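
The order of the managers in grp.txt depends on how Python 2 happens to order dictionary keys. As a hedged sketch, assuming Python 3.7 or later (where dictionaries keep insertion order), the same grouping can be written so that managers come out in the order they first appear in the input. The function name group_by_manager is hypothetical, and the records are held in a list instead of being read from data.txt.

```python
def group_by_manager(lines, sep="|"):
    """Map each manager to the list of its engineers, in input order."""
    groups = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue                   # skip blank lines
        manager, engineer = line.split(sep, 1)
        groups.setdefault(manager, []).append(engineer)
    return groups

# Same records as data.txt above.
data = ["Manager1|sw1", "Manager3|sw5", "Manager1|sw4", "Manager2|sw9",
        "Manager2|sw12", "Manager1|sw2", "Manager1|sw0"]
groups = group_by_manager(data)
for manager, engineers in groups.items():
    print(manager + "|" + ",".join(engineers))
```

With this input the managers are printed as Manager1, Manager3, Manager2, that is, in order of first appearance and not in the order shown in grp.txt.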

Related concepts:
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.

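iteritems()
Returns an iterator over the dictionary's (key, value) pairs. It exists only in Python 2; in Python 3, items() fills the same role.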
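
To illustrate the two concepts above, here is a small Python 3 sketch. Since Python 3 has no iteritems(), it uses items() in its place. The dictionary contents are made up for the example.

```python
d = {}

# Key missing: setdefault inserts the default and returns it.
first = d.setdefault("Manager1", [])
first.append("sw1")

# Key present: setdefault returns the existing value; the default is ignored.
second = d.setdefault("Manager1", ["unused"])
second.append("sw4")

# With no default given, a missing key is inserted with the value None.
nothing = d.setdefault("Manager9")

# items() yields (key, value) pairs, as iteritems() does in Python 2.
pairs = list(d.items())
print(pairs)
```

Because setdefault returns the stored list itself, appending to the returned value updates the dictionary in place, which is why the grouping script needs no separate "if key not in d" check.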
