Handling Unicode filenames in Windows with Python 3.0

You will find this post useful if:
1. You are on Windows
2. You are using Python 3.0 or above
3. You are having problems with os.walk or os.listdir because some of your filenames contain non-ASCII Unicode characters

While using os.walk, you encounter an error similar to the one below:
  File "C:\Python32\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2013' in position 226: character maps to <undefined>
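The cp437.py frame in the traceback hints that the failure happens when a filename is encoded for the console (whose code page here is cp437), rather than while the directory is being read. A minimal sketch that reproduces the same kind of error without touching the filesystem, assuming a name containing an en dash (U+2013); the filename is made up for illustration:

```python
# Hypothetical filename containing an en dash (U+2013),
# a character that the cp437 code page cannot represent.
name = 'report \u2013 final.txt'

try:
    # This is roughly what happens when the name is written to a cp437 console.
    name.encode('cp437')
    error_message = None
except UnicodeEncodeError as exc:
    error_message = str(exc)

print(error_message)
```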

What would have gotten you going in Python 2.7 but will not work in Python 3.0:
rootdir = ur'D:\COUNTRY\ROADS'
for (root, dirname, filename) in os.walk(rootdir):
    # do your stuff here
Python 3.0 will unceremoniously report a SyntaxError, because it does not accept the ur'' prefix.

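What will work (or, to be more precise, worked on my system!):
import os
rootdir = r'D:\COUNTRY\ROADS'  # a raw string cannot end in a backslash
for (root, dirname, filename) in os.walk(rootdir.encode('utf-8')):
    # root, dirname and filename come back as bytes here
    pass  # do your stuff here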
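If the error comes from printing the names rather than from walking the tree, another option is to keep str paths and escape only what the console cannot show. The following is a sketch of that alternative, not the approach used above; console_safe and list_files are hypothetical helper names, and the fallback to 'ascii' when the console encoding is unknown is an assumption:

```python
import os
import sys


def console_safe(name, encoding=None):
    """Return name with characters the console cannot encode shown as \\uXXXX escapes."""
    encoding = encoding or sys.stdout.encoding or 'ascii'
    return name.encode(encoding, errors='backslashreplace').decode(encoding)


def list_files(rootdir):
    """Yield the full path (as str) of every file below rootdir."""
    for root, dirnames, filenames in os.walk(rootdir):
        for filename in filenames:
            yield os.path.join(root, filename)


if __name__ == '__main__':
    # Same example directory as in the post; nothing is printed if it does not exist.
    for path in list_files(r'D:\COUNTRY\ROADS'):
        print(console_safe(path))
```

With this, the names stay as ordinary str objects for any further processing, and only the text sent to the console is altered.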

Useful links:
  1. If you are on Python 2.6 and face a similar problem, check out this thread on StackOverflow.
  2. Python 2.7 Unicode HOWTO (a lot of detail and good material on Unicode)
  3. A general discussion of what's changing in Python 3.0 (aka Py3K, Python 3000), especially with respect to Unicode.

