Wednesday, June 29, 2011

Handling line endings with the Python 2 csv module

This is another note-to-self post, this time about cross-platform handling of CSV files with the Python 2 csv module. The typical boilerplate for processing a CSV file looks like this:

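import csv

# Open the file in plain text mode and iterate over its rows
with open("sample.csv", "r") as handle:
    reader = csv.reader(handle)
    fieldnames = reader.next()  # the first row holds the column names
    for row in reader:
        print row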

In general this code works. However, when the CSV file uses a single \r as its end of line marker (Mac Classic style), the following error is raised:

new-line character seen in unquoted field - do you need to open the file in universal-newline mode?
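
To see why the error appears, it helps to remember that csv.reader accepts any iterable of lines, not just a file. The sketch below is my own illustration rather than part of the original tests: try_parse is a hypothetical helper, and the two lists simulate what the reader receives when a \r-only file is read as one long line versus after newline translation. It avoids print so that it runs unchanged on Python 2.6+ and Python 3.

```python
import csv


def try_parse(lines):
    """Return the parsed rows, or the csv.Error message if parsing fails."""
    try:
        return list(csv.reader(lines))
    except csv.Error as exc:
        return "csv.Error: %s" % exc


# A file that only contains \r has no \n at all, so without newline
# translation the whole content reaches the reader as a single "line".
one_big_line = ["name,value\ra,1\rb,2\r"]

# After newline translation every \r has become \n, so the content is
# split into one line per record.
translated_lines = ["name,value\n", "a,1\n", "b,2\n"]

untranslated_result = try_parse(one_big_line)    # an error message string
translated_result = try_parse(translated_lines)  # a list of three rows
```

In the first case the reader finds a \r followed by more data inside what it believes is a single record, which is exactly the situation the error message describes.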

I tested this behaviour on several platforms and it is consistent across all of them. In the table below, Yes means the file was read without error using mode "r":

End of line marker    Mac OS X 10.6.8    Windows 7    Windows XP    Ubuntu 10.10
\n                    Yes                Yes          Yes           Yes
\r\n                  Yes                Yes          Yes           Yes
\r                    No                 No           No            No

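The solution is to open the file in universal-newline mode, using "rU" rather than just "r". In this mode \n, \r\n and \r are all translated to \n before the csv module sees them.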
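
As a minimal sketch of the fix, the helper below wraps the boilerplate in a function. The name read_csv_rows is hypothetical. The Python 3 branch is an assumption that goes beyond what I tested for this post: there, the csv documentation recommends opening the file with newline="" instead, and the "rU" mode is not available in recent releases.

```python
import csv
import sys


def read_csv_rows(path):
    """Return (fieldnames, rows) for a CSV file with any line-ending style."""
    if sys.version_info[0] == 2:
        # Python 2: universal-newline mode, the fix described in this post.
        handle = open(path, "rU")
    else:
        # Assumption (not covered by the tests in this post): on Python 3
        # the csv module expects the file to be opened with newline="".
        handle = open(path, "r", newline="")
    with handle:
        reader = csv.reader(handle)
        fieldnames = next(reader)  # built-in next() works on 2.6+ and 3
        rows = list(reader)
    return fieldnames, rows
```

Calling read_csv_rows("sample.csv") should then behave the same whether the file uses \n, \r\n or a single \r.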
