You will find this post useful if:
1. You are on Windows
2. You are using Python 3.0 or above
3. You are having problems with os.walk or os.listdir because some of your filenames contain Unicode (non-ASCII) characters
While using os.walk, you encounter an error similar to the one below.
  File "C:\Python32\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 226: character maps to <undefined>
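The traceback ends inside the cp437 codec, which suggests the failure happens when a name is encoded for the Windows console rather than while the directory is being read. Here is a minimal sketch that reproduces the same message without touching the filesystem. The helper try_encode and the filename are made up for illustration; only str.encode and its standard error handlers are real.

```python
def try_encode(text, encoding='cp437', errors='strict'):
    """Return (True, encoded bytes) on success or (False, error reason) on failure."""
    try:
        return True, text.encode(encoding, errors)
    except UnicodeEncodeError as exc:
        return False, exc.reason


# Hypothetical filename containing an en dash (U+2013), which cp437 lacks.
name = 'roads \u2013 north.txt'

# Strict encoding fails with the same reason shown in the traceback.
strict = try_encode(name)

# 'backslashreplace' swaps the unencodable character for an escape sequence.
escaped = try_encode(name, errors='backslashreplace')
```

If output to the console is indeed the trigger on your system, escaping names this way before printing them is one possible alternative to changing how the directory is walked.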
What would have gotten you going in Python 2.7 but will not work in Python 3.0:
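import os
rootdir = ur'D:\COUNTRY\ROADS'  # no trailing backslash: a raw string cannot end with one
for (root, dirnames, filenames) in os.walk(rootdir):
    pass  # do your stuff here
Python 3 will bluntly report a SyntaxError, because the ur prefix no longer exists there (every str literal is already Unicode).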
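A quick way to check how Python 3 treats such literals, without crashing your own script, is to hand them to the built-in compile function as source text. This is only a sketch; the helper name is_valid_source is made up for illustration.

```python
def is_valid_source(source):
    """Return True if Python accepts the source text, False on a SyntaxError."""
    try:
        compile(source, '<example>', 'exec')
    except SyntaxError:
        return False
    return True


# The Python 2 style raw Unicode prefix is rejected by Python 3.
old_prefix_ok = is_valid_source("rootdir = ur'D:\\COUNTRY\\ROADS'")

# A raw string ending in a single backslash never terminates, in any version.
trailing_backslash_ok = is_valid_source("rootdir = r'D:\\COUNTRY\\ROADS\\'")

# A plain raw string without the trailing backslash is accepted.
plain_raw_ok = is_valid_source("rootdir = r'D:\\COUNTRY\\ROADS'")
```

Under Python 3 only the last of the three literals compiles.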
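What will work (or, to be more precise, worked on my system!):
import os
rootdir = r'D:\COUNTRY\ROADS'  # no trailing backslash: a raw string cannot end with one
# Passing a bytes path makes os.walk return bytes names instead of str.
for (root, dirnames, filenames) in os.walk(rootdir.encode('utf-8')):
    pass  # do your stuff here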
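To make the effect of the workaround visible, here is a small sketch that walks a directory with a bytes path and collects what comes back. It uses os.fsencode instead of the hard-coded .encode('utf-8') above, which is my substitution and not what the original snippet does. The helper names list_files_bytes and printable are hypothetical.

```python
import os


def list_files_bytes(rootdir):
    """Walk rootdir using a bytes path and return every file path as bytes."""
    found = []
    # os.fsencode converts a str path to bytes with the filesystem encoding.
    # Given a bytes argument, os.walk yields bytes for all names it returns.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(rootdir)):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found


def printable(path_bytes):
    """Decode a bytes path to str and escape anything outside ASCII."""
    text = os.fsdecode(path_bytes)
    return text.encode('ascii', errors='backslashreplace').decode('ascii')
```

Because the names are bytes, they have to be decoded (for example with os.fsdecode) before being mixed with ordinary str paths. Passing the result through printable first keeps print from needing the console codec for non-ASCII characters.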
Useful links:
- If you are on Python 2.6 and face a similar problem, check out this thread on StackOverflow.
- Python 2.7 Unicode HOWTO (lots of detail and good material on Unicode)
- A general discussion of what's changing in Python 3.0 (aka Py3K, Python 3000), especially with respect to Unicode.