Python basics for newbies: Keep first unique field using python

Thursday, April 16, 2009

Keep first unique field using python

Input file:
$ cat file.txt
1239941013,A,K
1239941013,T,K
1239941013,Z,T
1239941210,J,L
1239941210,Q,W
1239941519,K,P
1239941013,N,P
1239941013,S,P

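Required: Blank out the first field on every line where it repeats the first field of the previous line, so that only the first line of each consecutive run keeps it. A value that shows up again later, after a different value, is printed again (see 1239941013 near the end). Required output: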


1239941013,A,K
,T,K
,Z,T
1239941210,J,L
,Q,W
1239941519,K,P
1239941013,N,P
,S,P

Python script for the same:


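# Read all lines; the with-block closes the file automatically.
with open("file.txt") as fp:
    lines = fp.readlines()

prev_first = None  # first field of the previous line
for line in lines:
    f = line.split(",")
    if f[0] == prev_first:
        # Same first field as the line before: print the rest only.
        print("," + ",".join(f[1:]).rstrip())
    else:
        prev_first = f[0]
        print(line.rstrip())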

Executing the script:


$ python remove-dup-ff.py
1239941013,A,K
,T,K
,Z,T
1239941210,J,L
,Q,W
1239941519,K,P
1239941013,N,P
,S,P
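
The same result can also be sketched with itertools.groupby, which groups only adjacent items that share a key. This is an alternative to the loop in the script, not a replacement for it; the function names first_field and blank_repeated_first_field are made up for this illustration, and the sample list stands in for the contents of file.txt.

```python
from itertools import groupby


def first_field(line):
    """Return the text before the first comma."""
    return line.partition(",")[0]


def blank_repeated_first_field(lines):
    """Keep the first field only on the first line of each consecutive run."""
    out = []
    # groupby starts a new group whenever the key changes, so only
    # adjacent lines with the same first field end up together.
    for _, run in groupby(lines, key=first_field):
        for i, line in enumerate(run):
            line = line.rstrip()
            if i == 0:
                out.append(line)
            else:
                # Drop the first field, keep the comma and the rest.
                rest = line.partition(",")[2]
                out.append("," + rest)
    return out


# Sample data copied from file.txt above.
sample = [
    "1239941013,A,K\n",
    "1239941013,T,K\n",
    "1239941013,Z,T\n",
    "1239941210,J,L\n",
    "1239941210,Q,W\n",
    "1239941519,K,P\n",
    "1239941013,N,P\n",
    "1239941013,S,P\n",
]

for row in blank_repeated_first_field(sample):
    print(row)
```

Because groupby looks only at neighbouring lines, 1239941013 is printed in full again on the seventh line, just as in the required output.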

Related functions and concepts:
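1) str.split([sep[, maxsplit]])
Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done.

2) str.rstrip([chars])
Return a copy of the string with trailing characters removed. If chars is omitted, trailing whitespace (including the newline) is removed.

3) str.join(seq)
Return a string made by concatenating the strings in seq, with the string the method is called on placed between them.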

An example of the split and join calls used above:


$ python
>>> line="1239941013,A,K"
>>> f=line.split(",")
>>> f
['1239941013', 'A', 'K']
>>> ",".join(f[1:])
'A,K'
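
The optional arguments of split and rstrip can be tried in the same way. The snippet below is only a small sketch; the sample string (with extra spaces before the newline) is made up for illustration.

```python
line = "1239941013,A,K  \n"

# maxsplit=1 splits at the first comma only, so the rest stays in one piece.
first, rest = line.split(",", 1)

# Without arguments, rstrip removes all trailing whitespace.
clean = rest.rstrip()

# With chars given, only those trailing characters are removed.
no_newline = rest.rstrip("\n")

# join puts the pieces back together with a comma between them.
rebuilt = ",".join([first, clean])

print(repr(first))
print(repr(clean))
print(repr(no_newline))
print(repr(rebuilt))
```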
