I am trying to create a Python script that adds quotation marks around part of a string, after the third comma.
So if the input data looks like this:
1234,1,1/1/2010,This is a test. One, two, three.
I want Python to convert the string to:
1234,1,1/1/2010,"This is a test. One, two, three."
The quotes will always need to be added after 3 commas
I am using Python 3.1.2 and have the following so far:
i_file=open("input.csv","r")
o_file=open("output.csv","w")

for line in i_file:
    tokens=line.split(",")
    count=0
    new_line=""
    for element in tokens:
        if count = "3":
            new_line = new_line + '"' + element + '"'
            break
        else:
            new_line = new_line + element + ","
            count=count+1

    o_file.write(new_line + "\n")
    print(line, " -> ", new_line)

i_file.close()
o_file.close()
The script closes immediately when I try to run it and produces no output
Can you see what’s wrong?
Thanks
,
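After fixing the two issues mentioned in my comment above (the comparison needs == rather than =, and count should be compared with the integer 3 rather than the string "3"), I tested the code below against your test input. Edit: it ALMOST works. Because split(",") also splits the free-text field at its own commas, only the text up to the fourth comma ends up inside the quotes and the rest is dropped. See the very short code sample further below for a fully tested and working version.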
i_file=open("input.csv","r")
o_file=open("output.csv","w")

for line in i_file:
    tokens=line.split(",")
    count=0
    new_line=""
    for element in tokens:
        if count == 3:
            new_line = new_line + '"' + element + '"'
            break
        else:
            new_line = new_line + element + ","
            count=count+1

    o_file.write(new_line + "\n")
    print(line, " -> ", new_line)

i_file.close()
o_file.close()
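To illustrate why the loop above only almost works, here is a small sketch using the sample line from the question. The variable names are mine and purely illustrative. It shows what a plain split does to the free-text field, and how the optional maxsplit argument of str.split avoids that.

```python
line = "1234,1,1/1/2010,This is a test. One, two, three."

# A plain split also breaks the free-text field apart at its own commas.
all_tokens = line.split(",")
print(len(all_tokens))  # 6

# Quoting only the fourth token (index 3) and stopping drops the rest.
truncated = ",".join(all_tokens[:3]) + ',"' + all_tokens[3] + '"'
print(truncated)  # 1234,1,1/1/2010,"This is a test. One"

# Limiting the split to 3 keeps everything after the third comma together.
head_tokens = line.split(",", 3)
print(head_tokens[-1])  # This is a test. One, two, three.
```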
Side note:
A relatively new feature in Python is the with statement. Below is an example of how you might take advantage of that more robust method of coding (note that you don’t need to add the close() calls at the end of processing):
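with open("input.csv","r") as i_file, open("output.csv","w") as o_file:
    for line in i_file:
        # Split at the first 3 commas only; the remainder stays in one piece.
        tokens = line.split(",", 3)
        if len(tokens) > 3:
            o_file.write(','.join(tokens[0:3]))
            # Quote the remainder, then restore the comma and the newline.
            o_file.write(',"{0}"\n'.format(tokens[-1].rstrip('\n')))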
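If it helps, the same idea can be pulled into a small function so it can be tried without touching real files. This is only a sketch: quote_after_commas and n_commas are names I made up, io.StringIO stands in for input.csv and output.csv, and I am assuming that lines with too few commas should be passed through unchanged.

```python
import io


def quote_after_commas(line, n_commas=3):
    """Wrap everything after the first n_commas commas in double quotes."""
    text = line.rstrip("\n")
    tokens = text.split(",", n_commas)
    if len(tokens) <= n_commas:
        # Assumption: lines with too few commas are left unchanged.
        return text
    return ",".join(tokens[:n_commas]) + ',"' + tokens[-1] + '"'


# In-memory stand-ins for the input and output files.
src = io.StringIO("1234,1,1/1/2010,This is a test. One, two, three.\nshort,line\n")
dst = io.StringIO()
for line in src:
    dst.write(quote_after_commas(line) + "\n")

result = dst.getvalue()
print(result)
```

Swapping the StringIO objects for real files opened in a with statement should give the same behaviour on disk.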
,
Shorter but untested:
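i_file=open("input.csv","r")
o_file=open("output.csv","w")

comma = ','
for line in i_file:
    # Strip the newline first so it does not end up inside the quotes.
    tokens=line.rstrip('\n').split(",")
    # Rejoin the first 3 fields, then everything else as one quoted field.
    new_line = comma.join(tokens[:3]+['"'+comma.join(tokens[3:])+'"'])
    o_file.write(new_line+'\n')
    print(line, " -> ", new_line)

i_file.close()
o_file.close()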
,
Perhaps you should consider using a regular expression to do this?
Something like
import re
t = "1234,1,1/1/2010,This is a test. One, two, three."
# Group 1: the first three comma-terminated fields; group 2: the rest.
first, rest = re.search(r'([^,]+,[^,]+,[^,]+,)(.*)', t).groups()
op = '%s"%s"' % (first, rest)
print(op)
1234,1,1/1/2010,"This is a test. One, two, three."
Does this satisfy your requirements?
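A slightly more general variant of the regular-expression idea, as a sketch only: the function name and the n_commas parameter are hypothetical, and the pattern uses a repeated non-capturing group instead of spelling out each field. Lines that do not contain enough commas are returned unchanged because the pattern simply does not match.

```python
import re


def quote_with_regex(line, n_commas=3):
    """Quote everything after the first n_commas commas using re.sub."""
    # (?:[^,]*,){n} consumes exactly n comma-terminated fields;
    # (.*) captures the rest of the line (the dot does not match a newline).
    pattern = r'^((?:[^,]*,){%d})(.*)$' % n_commas
    return re.sub(pattern, r'\1"\2"', line)


quoted = quote_with_regex("1234,1,1/1/2010,This is a test. One, two, three.")
print(quoted)
```

Because the dot does not match a newline, a trailing "\n" on the input line is left in place after the closing quote.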
,
>>> import re
>>> s
'1234,1,1/1/2010,This is a test. One, two, three.'
>>> re.sub("(.[^,]*,.[^,]*,.[^,]*,)(.*)" , '\\1\"\\2"' , s)
'1234,1,1/1/2010,"This is a test. One, two, three."'
import re
o=open("output.csv","w")
for line in open("input.csv"):
    line=re.sub("(.[^,]*,.[^,]*,.[^,]*,)(.*)" , '\\1\"\\2"' , line)
    o.write(line)
o.close()
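One more option, not taken from any of the answers above: since the goal is a valid CSV line, the standard csv module can add the quotes itself. This is a sketch under the assumption that default csv quoting is acceptable, that is, only fields containing a comma or a quote character get quoted. A side benefit is that any double quotes already inside the text are escaped by doubling them.

```python
import csv
import io

line = "1234,1,1/1/2010,This is a test. One, two, three."

# Split at the first 3 commas so the free text stays a single field.
fields = line.split(",", 3)

# io.StringIO stands in for the output file in this sketch.
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(fields)

out = buf.getvalue()
print(out)  # 1234,1,1/1/2010,"This is a test. One, two, three."
```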