I'm 赵一开/BlahGeek, a CS&T Student @ Tsinghua U.

Get in touch:   See more:

把Unicode转换为合法的文件名(ASCII)

从pelican模块里面找到的,感觉很有用,mark一下。

import re

def slugify(value):
    if type(value) == unicode:
        import unicodedata
        from unidecode import unidecode
        value = unicode(unidecode(value))
        value = unicodedata.normalize('NFKD', value).encode('ascii', 'ignore')
    value = unicode(re.sub('[^\w\s-]', '', value).strip().lower())
    return  re.sub('[-\s]+', '-', value)

if __name__ == '__main__':
    print slugify(raw_input().decode(sys.stdin.encoding))

输入: 标点,空格 English \t\n? > < ?123.45

输出: biao-dian-kong-ge-english-tn-12345