Handling Unicode filenames in Windows with Python 3.0

You will find this post useful if
1. You are on Windows
2. You are using Python 3.0 or above
3. You are having problems with os.walk or os.listdir because some of your filenames contain non-ASCII (Unicode) characters

While using os.walk, you encounter an error similar to the one below:
File "C:\Python32\lib\encodings\cp437.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 226: character maps to <undefined>
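
The traceback ends inside the cp437 codec, which suggests the failure happens when a filename is encoded for the console rather than inside os.walk itself. A minimal sketch that reproduces the same kind of error without touching the file system (the sample filename is made up for illustration):

```python
# Hypothetical filename containing an en dash (U+2013),
# a character that the cp437 code page cannot represent.
name = 'report \u2013 final.txt'

try:
    name.encode('cp437')
except UnicodeEncodeError as exc:
    # Keep the reason so it can be inspected after the except block.
    error_reason = exc.reason
    print('cannot encode:', error_reason)

# A lossy fallback: replace every unencodable character with '?'.
safe_name = name.encode('cp437', errors='replace').decode('cp437')
print(safe_name)
```

If this is what is happening on your machine, the filenames themselves are fine and only the step that prints them needs attention.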

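What would have gotten you going in Python 2.7 but will not work in Python 3.0:
rootdir = ur'D:\COUNTRY\ROADS'
for (root, dirnames, filenames) in os.walk(rootdir):
    pass  # do your stuff here
Python 3 does not accept the ur'' prefix, so it bluntly rejects this with a SyntaxError. (Note also that a raw string cannot end in a backslash in any Python version, so leave the trailing one off.)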

What will work (or to be more precise, worked on my system!):
rootdir = r'D:\COUNTRY\ROADS'
# a bytes path makes os.walk return the names as bytes objects
for (root, dirnames, filenames) in os.walk(rootdir.encode('utf-8')):
    pass  # do your stuff here
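
If you would rather keep ordinary str paths, here is an alternative sketch (not the approach above): walk the tree as usual and escape only at print time whatever the console encoding cannot show. The helper names printable and list_files are made up for this example, and the fallback to 'ascii' when no console encoding is reported is an assumption.

```python
import os
import sys


def printable(name, encoding=None):
    """Return name with characters the target encoding cannot represent escaped."""
    # Assumption: fall back to ASCII if the console encoding is unknown.
    encoding = encoding or sys.stdout.encoding or 'ascii'
    return name.encode(encoding, errors='backslashreplace').decode(encoding)


def list_files(rootdir):
    """Yield the full path of every file under rootdir as str."""
    for root, dirnames, filenames in os.walk(rootdir):
        for filename in filenames:
            yield os.path.join(root, filename)


if __name__ == '__main__':
    for path in list_files(r'D:\COUNTRY\ROADS'):
        print(printable(path))
```

The paths stay intact for opening or renaming files; only the text sent to the console is escaped.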

Useful links:
  1. If you are on Python 2.6 and face a similar problem, check out this thread on StackOverflow.
  2. Python 2.7 Unicode HOWTO (lots of detail and good material on Unicode)
  3. A general discussion of what's changing in Python 3.0 (aka Py3K, Python 3000), especially with respect to Unicode.

