From: Hongyi Zhao on 1 Feb 2010 04:35

Hi all,

Suppose I have some files whose names contain Chinese characters, and all of these files reside in the same directory. Now, I want to delete the Chinese characters from all of these filenames. What should I do?

Thanks in advance.

--
..: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Ben Bacarisse on 1 Feb 2010 06:23

Hongyi Zhao <hongyi.zhao(a)gmail.com> writes:

> Suppose I've some files with their names consist Chinese characters
> and all of these files are resided in the same directory. Now, I want
> to delete Chinese characters from all of these filenames. What should
> I do?

A very useful tool for such things is iconv. It will convert between one character encoding and another, but it can also be asked to drop any characters that can't be encoded in the target character set (the -c flag). Thus:

    iconv -c --from=utf-8 --to=ascii

drops all UTF-8 encoded characters that are not in the 7-bit ASCII table.

    for f in *; do
        mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)
    done

(untested -- check before using!!) should do what you want. Of course, UTF-8 is only one possible encoding for Chinese characters, so you might have to change that part of the example.

--
Ben.
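Since the loop in the post above is flagged as untested, one cautious way to try the idea is a dry run that only prints the planned renames. The following is a minimal sketch, not the poster's code: the function names `strip_non_ascii` and `preview_renames` are hypothetical, the names are assumed to be UTF-8 encoded, and the short options `-f`/`-t` are used in place of the long forms.

```shell
# Hypothetical helper: print NAME with every non-ASCII character dropped.
# Assumption: the name is UTF-8 encoded, as in the post above.
strip_non_ascii() {
    # Some iconv implementations exit non-zero when -c drops characters,
    # so the exit status is deliberately ignored in this sketch.
    printf '%s' "$1" | iconv -c -f UTF-8 -t ASCII || :
}

# Dry run: list the renames that would happen in the current directory.
# Nothing on disk is changed.
preview_renames() {
    local f new
    for f in *; do
        new=$(strip_non_ascii "$f")
        [ "$new" = "$f" ] && continue    # name is already pure ASCII
        printf '%s -> %s\n' "$f" "$new"
    done
}
```

Running `preview_renames` inside the directory and reading the list before renaming anything makes it easy to spot names that would end up empty or identical to another file's name.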
From: Ben Finney on 1 Feb 2010 06:35

Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:

> for f in *; do
> mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)
> done
>
> (untested -- check before using!!) should do what you want. Of course,
> UTF-8 is only one possible encoding for Chinese characters, so you
> might have to change that part of the example.

You will also need to consider the scenario when the removal of some characters results in a collision in the names.

--
 \     “The whole area of [treating source code as intellectual |
  `\    property] is almost assuring a customer that you are not going |
_o__)  to do any innovation in the future.” —Gary Barnett |
Ben Finney
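The collision scenario can be guarded against by refusing to rename onto a name that already exists. This is a minimal sketch under that assumption; `rename_no_clobber` is a hypothetical helper, not something from the thread, and the check-then-move sequence is not atomic, so it only suits a directory nobody else is writing to.

```shell
# Hypothetical helper: rename SRC to DST only if DST does not already exist.
rename_no_clobber() {
    local src=$1 dst=$2
    # -e misses dangling symlinks, so test -L as well.
    if [ -e "$dst" ] || [ -L "$dst" ]; then
        printf 'skipped (target exists): %s -> %s\n' "$src" "$dst" >&2
        return 1
    fi
    # "--" stops option parsing, so names starting with "-" are safe.
    mv -- "$src" "$dst"
}
```

Two files whose names differ only in their Chinese characters would both map to the same ASCII name; with this guard the second rename is skipped and reported instead of silently overwriting the first file.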
From: Ben Bacarisse on 1 Feb 2010 08:31

Ben Finney <ben+unix(a)benfinney.id.au> writes:

> Ben Bacarisse <ben.usenet(a)bsb.me.uk> writes:
>
>> for f in *; do
>> mv $f $(echo "$f" | iconv -c --from=utf-8 --to=ascii)

<sigh> missing quotes round both arguments to mv:

    mv "$f" "$(echo "$f" | iconv -c --from=utf-8 --to=ascii)"

>> done
>>
>> (untested -- check before using!!) should do what you want. Of course,
>> UTF-8 is only one possible encoding for Chinese characters, so you
>> might have to change that part of the example.
>
> You will also need to consider the scenario when the removal of some
> characters results in a collision in the names.

Well, the OP will, yes.

To the OP: if you extend this idea to a more general move of a path name, take care with other effects of dropping characters. Components of the path can turn into . or .. (this can happen even with the loop above) or might be dropped altogether.

--
Ben.
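The warning about names collapsing to nothing, `.` or `..` can be turned into a small check that runs before any rename. This is a sketch only; `is_usable_name` is a hypothetical helper, and it covers just the three degenerate results named in the post above.

```shell
# Hypothetical helper: succeed only if NAME is safe to use as a rename target.
# Rejects the degenerate results that dropping characters can produce:
# an empty name, "." and "..".
is_usable_name() {
    case $1 in
        ''|.|..) return 1 ;;
    esac
    return 0
}
```

For example, a file named entirely in Chinese characters would be reduced to the empty string, and a name consisting of a dot followed by Chinese characters would be reduced to `.`; in both cases `is_usable_name` fails and the rename can be skipped.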