C# 4.0 - How to convert saved text file encoding to UTF-8?
I recently saved a text file on my computer, and when I opened it again I saw strings like:
"˜ÌÇí ÍÑÝã ÚÌíÈå¿". Now I want to know whether it is possible to convert it back to the original text (UTF-8).
I tried this code, but it doesn't work:
    string tempStr = "˜ÌÇí ÍÑÝã ÚÌíÈå¿";
    Encoding ansi = Encoding.GetEncoding(1256);
    byte[] ansiBytes = ansi.GetBytes(tempStr);
    byte[] utf8Bytes = Encoding.Convert(ansi, Encoding.UTF8, ansiBytes);
    string utf8String = Encoding.UTF8.GetString(utf8Bytes);
You can use something like:
    string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr));
The string wasn't really decoded... its bytes were simply "enlarged" to char, like:
    byte[] bytes = ...
    char[] chars = new char[bytes.Length];
    for (int i = 0; i < bytes.Length; i++) { chars[i] = (char)bytes[i]; }
    string str = new string(chars);
Now... this transformation is the same one done by the codepage ISO-8859-1. So I could have done the reverse by hand, or I could have used that codepage to do it for me, and I selected the second one.
    Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr)
This gave me the original byte[].
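As a minimal sketch of the equivalence described above: decoding with ISO-8859-1 gives the same string as widening each byte to a char, so encoding with ISO-8859-1 gets the bytes back. The sample bytes and the class name are arbitrary and chosen only for illustration.

```csharp
using System;
using System.Text;

class WideningDemo
{
    static void Main()
    {
        // Arbitrary sample bytes, chosen only for illustration.
        byte[] bytes = { 0x41, 0xCC, 0xC7, 0xED, 0xFF };

        // Manual "enlarging": each byte becomes the char with the same numeric value.
        char[] chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
        {
            chars[i] = (char)bytes[i];
        }
        string widened = new string(chars);

        // ISO-8859-1 maps bytes 0x00-0xFF straight onto U+0000-U+00FF.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string decoded = latin1.GetString(bytes);

        // Encoding again reverses the widening and returns the original bytes.
        byte[] roundTrip = latin1.GetBytes(decoded);

        Console.WriteLine(widened == decoded);               // True
        Console.WriteLine(BitConverter.ToString(roundTrip)); // 41-CC-C7-ED-FF
    }
}
```

The round trip is only lossless for characters in the range U+0000 to U+00FF. Anything outside that range cannot be mapped back to its original byte by ISO-8859-1.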
Then I did some tests, and it seems that the text in the beginning wasn't UTF-8; it was in codepage 1256, the Arabic codepage. So I used:
    string str = Encoding.GetEncoding(1256).GetString(...);
The only thing is that ˜ doesn't seem to be part of the original string.
There is another possibility:
    string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding(1252).GetBytes(tempStr));
Codepage 1252 is the codepage used in the USA and in a big part of Europe. If you have your Windows configured for English, there is a good chance it uses 1252 as the default codepage. The result will be a little different from using ISO-8859-1.
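The difference most likely comes down to the leading ˜ (U+02DC). Windows-1252 maps it to byte 0x98, while ISO-8859-1 has no such character, so that byte is lost on the way back. The sketch below builds both candidates and saves the 1252-based one as UTF-8. It assumes the file really was written in codepage 1256 and misread as 1252. The output file name `repaired.txt` and the class name are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

class RepairDemo
{
    static void Main()
    {
        // Assumption: this targets .NET Framework. On .NET Core / .NET 5+,
        // codepages 1252 and 1256 have to be registered first:
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string tempStr = "˜ÌÇí ÍÑÝã ÚÌíÈå¿";
        Encoding arabic = Encoding.GetEncoding(1256);

        // Candidate 1: recover the bytes through ISO-8859-1 (cannot represent ˜).
        string viaLatin1 = arabic.GetString(
            Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr));

        // Candidate 2: recover the bytes through Windows-1252 (˜ becomes 0x98).
        string via1252 = arabic.GetString(
            Encoding.GetEncoding(1252).GetBytes(tempStr));

        // The two candidates should differ only in their first character.
        Console.WriteLine(viaLatin1 == via1252);

        // Hypothetical output path; the repaired text is saved as UTF-8.
        File.WriteAllText("repaired.txt", via1252, Encoding.UTF8);
    }
}
```

Which candidate is correct depends on how the file was originally misread, so it is worth checking the repaired text by eye before overwriting anything.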