Sunday, December 26, 2010

Python list append example - divide by two


Input file:

$ cat file.txt
h1|u|1
h2|5|1|1
rec1|1239400800|Sat|fan1|AX|2|10035|-|2|50
rec2|1239400800|Sat|fan1|AX|2|-|-|2|17
rec5|1239400801|Sat|fan3|AY|5|10035|-|2|217
rec8|1239400804|Sat|fan5|AX|2|5|-|2|970

Requirements:
- Lines starting with "h1" or "h2" need no action; print them unchanged.
- For lines starting with "rec", divide each numeric value from the 6th field onward by 2 (integer division). Non-numeric fields such as "-" stay as they are.

Required output:

h1|u|1
h2|5|1|1
rec1|1239400800|Sat|fan1|AX|1|5017|-|1|25
rec2|1239400800|Sat|fan1|AX|1|-|-|1|8
rec5|1239400801|Sat|fan3|AY|2|5017|-|1|108
rec8|1239400804|Sat|fan5|AX|1|2|-|1|485

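The Python script:

# Read the whole file up front ("rU" = universal newlines in Python 2)
fp = open("file.txt", "rU")
lines = fp.readlines()
fp.close()

for line in lines:
    # Header lines are printed unchanged; the trailing comma stops
    # print from adding a second newline
    if line.startswith("h1"):
        print line,
    if line.startswith("h2"):
        print line,
    if line.startswith("rec"):
        f = line.split("|")
        r = f[5:]    # fields from the 6th onward
        l = []
        for each in r:
            try:
                # int / int is integer division in Python 2
                l.append(str(int(each)/2))
            except ValueError:
                # non-numeric field such as "-": keep it unchanged
                l.append(each)

        t = "|".join(f[0:5]) + "|" + "|".join(l)
        print t.rstrip()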
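
For readers on Python 3, the same append-in-a-loop idea can be sketched as a small function. This is only an illustration, not the original script: the names `halve_record` and `process` are made up here, `//` stands in for Python 2's integer `/`, and unlike the script above it passes through every line that does not start with "rec".

```python
def halve_record(line, first_numeric=5, sep="|"):
    """Halve every integer field from index `first_numeric` onward."""
    fields = line.rstrip("\n").split(sep)
    out = fields[:first_numeric]
    for value in fields[first_numeric:]:
        try:
            # // is floor division, the Python 3 spelling of Python 2's int / int
            out.append(str(int(value) // 2))
        except ValueError:
            # non-numeric placeholders such as "-" are appended unchanged
            out.append(value)
    return sep.join(out)


def process(lines):
    """Yield each line, halving the numeric tail of "rec" lines only."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("rec"):
            yield halve_record(line)
        else:
            yield line


sample = [
    "h1|u|1",
    "h2|5|1|1",
    "rec1|1239400800|Sat|fan1|AX|2|10035|-|2|50",
    "rec8|1239400804|Sat|fan5|AX|2|5|-|2|970",
]
result = list(process(sample))
for row in result:
    print(row)
```

Keeping the per-line work in a function makes it easy to check a single record without reading the file.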

Wednesday, December 8, 2010

Python - Replace based on another file


$ cat main.txt
P|34|90
T|12
R|0|1291870414|ip1|890
R|1|1291870415|ip5|690
R|2|1291870415|ip1|899
R|3|1291870412|ip2|896
R|4|1291870418|ip3|999
R|5|1291870419|ip5|191

$ cat lookup.txt
ip7|172.17.4.8
ip1|172.17.4.3
ip5|172.17.4.9
ip4|172.17.4.2
ip3|172.17.4.1
ip2|172.17.4.6
ip6|172.17.4.7

Required Output:
Replace the 4th field (pipe-delimited) of the 'R' lines in 'main.txt' with the corresponding value from 'lookup.txt', e.g. 'ip1' becomes '172.17.4.3', 'ip2' becomes '172.17.4.6', and so on.

P|34|90
T|12
R|0|1291870414|172.17.4.3|890
R|1|1291870415|172.17.4.9|690
R|2|1291870415|172.17.4.3|899
R|3|1291870412|172.17.4.6|896
R|4|1291870418|172.17.4.1|999
R|5|1291870419|172.17.4.9|191

The Python script:

import sys

# Build the lookup table: key -> replacement value
d = {}
for line in open("lookup.txt"):
    line = line.strip().split("|")
    d[line[0]] = line[-1]

# The main file name comes from the command line
for line in open(sys.argv[1]):
    if line.startswith('P'):
        print line,
    if line.startswith('T'):
        print line,
    if line.startswith('R'):
        line = line.strip().split("|")
        # Swap the 4th field for its lookup value
        print '|'.join(line[0:3])+'|'+d[line[3]]+'|'+'|'.join(line[4:])

Executing it:

$ python replace-from-file.py main.txt
P|34|90
T|12
R|0|1291870414|172.17.4.3|890
R|1|1291870415|172.17.4.9|690
R|2|1291870415|172.17.4.3|899
R|3|1291870412|172.17.4.6|896
R|4|1291870418|172.17.4.1|999
R|5|1291870419|172.17.4.9|191
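
One case the sample data does not exercise is an 'R' line whose key is missing from the lookup file; indexing a dictionary with an unknown key raises KeyError. A possible Python 3 sketch of the same dictionary lookup that leaves unknown keys untouched is shown here. The function names and the `ip9` row are invented for illustration, and it checks that the first field equals "R" rather than using startswith.

```python
def load_lookup(lines, sep="|"):
    """Build a key -> value table from 'key|value' lines."""
    table = {}
    for line in lines:
        parts = line.strip().split(sep)
        if len(parts) >= 2:
            table[parts[0]] = parts[-1]
    return table


def replace_field(line, table, index=3, sep="|"):
    """Replace field `index` of 'R' lines with its lookup value."""
    fields = line.strip().split(sep)
    if fields[0] == "R" and len(fields) > index:
        # dict.get falls back to the original value for unknown keys
        fields[index] = table.get(fields[index], fields[index])
    return sep.join(fields)


lookup = load_lookup(["ip1|172.17.4.3", "ip5|172.17.4.9"])
main = [
    "P|34|90",
    "R|0|1291870414|ip1|890",
    "R|9|1291870420|ip9|100",  # made-up row: ip9 is not in the lookup
]
replaced = [replace_field(row, lookup) for row in main]
for row in replaced:
    print(row)
```

Whether an unknown key should be kept, skipped, or reported as an error depends on the data; keeping it is just one choice.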

Wednesday, November 17, 2010

Python sort file based on last field

Input file:

$ cat file.txt
IN,90,453
US,12,1,120
NZ,89,200
WI,500
TS,12,124

Required output: sort the above comma-delimited file on its last field (column), i.e.:

US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500

Solution:
An Awk solution for the UNIX bash shell was posted separately. Here is the Python one, from an interactive session:

$ python
Python 2.5.2 (r252:60911, Jan 20 2010, 21:48:48)
[GCC 4.2.4 (Ubuntu 4.2.4-1ubuntu3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d_list = [line.strip() for line in open("file.txt")]
>>> d_list
['IN,90,453', 'US,12,1,120', 'NZ,89,200', 'WI,500', 'TS,12,124']
>>> d_list.sort(key = lambda line: int(line.split(",")[-1]))
>>> d_list
['US,12,1,120', 'TS,12,124', 'NZ,89,200', 'IN,90,453', 'WI,500']
>>> for line in d_list:
... print line
...
US,12,1,120
TS,12,124
NZ,89,200
IN,90,453
WI,500
>>>
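
A point worth keeping in mind about the sort key: `line.split(",")[-1]` is a string, and strings compare character by character, so a text key only matches numeric order when every value has the same number of digits. The rows in this small Python 3 sketch are made up to show the difference.

```python
rows = ["WI,500", "US,12,1,1000", "TS,12,90"]  # made-up rows

# Text key: "1000" < "500" < "90" because comparison is per character
by_text = sorted(rows, key=lambda line: line.split(",")[-1])

# Numeric key: 90 < 500 < 1000
by_number = sorted(rows, key=lambda line: int(line.split(",")[-1]))

print(by_text)
print(by_number)
```

The int() conversion assumes the last field is always a whole number; a non-numeric last field would raise ValueError.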

Some notes:
Accessing the last element of a list in Python:
A negative index counts backwards from the end of the list, so the last element of any non-empty list is always list[-1].
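
A short Python 3 illustration of the note above, using one row from the input file. The variable names are only for this example.

```python
fields = "US,12,1,120".split(",")

last = fields[-1]                   # same element as fields[len(fields) - 1]
second_last = fields[-2]
same_as_last = fields[len(fields) - 1]

# An empty list has no last element, so [-1] raises IndexError
try:
    [][-1]
    empty_ok = True
except IndexError:
    empty_ok = False

# str.split with an explicit separator always returns at least one item,
# so [-1] is safe even on an empty line
blank_last = "".split(",")[-1]

print(last, second_last, empty_ok, repr(blank_last))
```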