C# 4.0 - How to convert saved text file encoding to UTF8?
I recently saved a text file on my computer, and when I opened it again I saw strings like:
"˜ÌÇí ÍÑÝã ÚÌíÈå¿"
Now I want to know whether it is possible to convert it back to the original text (UTF-8).
I tried this code, but it doesn't work:
string tempStr = "˜ÌÇí ÍÑÝã ÚÌíÈå¿";
Encoding ansi = Encoding.GetEncoding(1256);
byte[] ansiBytes = ansi.GetBytes(tempStr);
byte[] utf8Bytes = Encoding.Convert(ansi, Encoding.UTF8, ansiBytes);
string utf8String = Encoding.UTF8.GetString(utf8Bytes);
You can use something like:
string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr));
The string wasn't really decoded. Its bytes were simply "enlarged" to chars, like:
byte[] bytes = ...
char[] chars = new char[bytes.Length];
for (int i = 0; i < bytes.Length; i++)
{
    // C# has no implicit byte-to-char conversion, so the cast is required.
    chars[i] = (char)bytes[i];
}
string str = new string(chars);
Now, this transformation is the same one that codepage ISO-8859-1 performs. So I could have done the reverse by hand, or I could have used the codepage to do it for me. I selected the second one:
Encoding.GetEncoding("iso-8859-1").GetBytes(tempStr)
This gave me the original byte[].
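To check the claim that widening each byte to a char is the same as decoding with ISO-8859-1, here is a minimal sketch. The class and method names (`WidenDemo`, `Widen`) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text;

public static class WidenDemo
{
    // Hypothetical helper: widen each byte to the char with the same numeric value.
    public static string Widen(byte[] bytes)
    {
        char[] chars = new char[bytes.Length];
        for (int i = 0; i < bytes.Length; i++)
        {
            chars[i] = (char)bytes[i];
        }
        return new string(chars);
    }

    public static void Main()
    {
        // Every possible byte value, 0x00 through 0xFF.
        byte[] sample = new byte[256];
        for (int i = 0; i < sample.Length; i++)
        {
            sample[i] = (byte)i;
        }

        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string widened = Widen(sample);

        // Widening and ISO-8859-1 decoding should produce the same string.
        Console.WriteLine(widened == latin1.GetString(sample));

        // Encoding with ISO-8859-1 should give the original bytes back.
        byte[] roundTrip = latin1.GetBytes(widened);
        Console.WriteLine(Convert.ToBase64String(roundTrip) == Convert.ToBase64String(sample));
    }
}
```

Because ISO-8859-1 maps all 256 byte values one-to-one onto the first 256 code points, the round trip loses nothing as long as the garbled string really was produced this way.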
Then I did some tests, and it seems the text wasn't UTF-8 in the beginning; it was in codepage 1256, the Arabic codepage. So I did:
string str = Encoding.GetEncoding(1256).GetString(...);
The only thing is that the ˜ doesn't seem to be part of the original string.
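The two steps (get the raw bytes back, then decode them with the real codepage) can be wrapped in one helper. This is only a sketch: `Repair` is a hypothetical name, and the sample text is an arbitrary Arabic word rather than the asker's data. The `RegisterProvider` call assumes .NET Core / .NET 5 or later, where codepage 1256 is not available by default; on .NET Framework 4.0 that line does not exist and should be removed.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Hypothetical helper: undo a wrong decode by re-encoding with the
    // encoding that was wrongly applied, then decoding with the real one.
    public static string Repair(string garbled, Encoding wronglyUsed, Encoding actual)
    {
        byte[] originalBytes = wronglyUsed.GetBytes(garbled);
        return actual.GetString(originalBytes);
    }

    public static void Main()
    {
        // Assumption: modern .NET. Remove this line on .NET Framework 4.0.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Encoding cp1256 = Encoding.GetEncoding(1256);

        // Simulate the damage: codepage 1256 bytes misread as ISO-8859-1.
        string original = "\u0633\u0644\u0627\u0645"; // Arabic word "salam"
        string garbled = latin1.GetString(cp1256.GetBytes(original));

        string repaired = Repair(garbled, latin1, cp1256);
        Console.WriteLine(repaired == original);
    }
}
```

Once the string is repaired in memory, writing it out with `File.WriteAllText(path, repaired, Encoding.UTF8)` would store it as UTF-8.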
There is another possibility:
string str = Encoding.GetEncoding(1256).GetString(Encoding.GetEncoding(1252).GetBytes(tempStr));
Codepage 1252 is the codepage used in the USA and in a big part of Europe. If your Windows is configured for English, there is a good chance that it uses 1252 as the default codepage. The result is a little different from the one you get with ISO-8859-1.
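One place where the two codepages differ is the ˜ character (U+02DC). In codepage 1252 it is byte 0x98, but it has no mapping in ISO-8859-1, so the ISO-8859-1 encoder substitutes a fallback character. In codepage 1256, byte 0x98 is the letter ک, so the leading ˜ may well be a real character of the original text. The sketch below compares the two routes on a two-character sample; the class and method names are hypothetical, and the `RegisterProvider` call assumes .NET Core / .NET 5 or later (remove it on .NET Framework 4.0).

```csharp
using System;
using System.Text;

public static class CodepageDiff
{
    // Hypothetical helper: recover bytes with `wronglyUsed`, decode as codepage 1256.
    public static string RepairAs1256(string garbled, Encoding wronglyUsed)
    {
        return Encoding.GetEncoding(1256).GetString(wronglyUsed.GetBytes(garbled));
    }

    public static void Main()
    {
        // Assumption: modern .NET. Remove this line on .NET Framework 4.0.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // First two characters of the garbled text: U+02DC (small tilde), U+00CC.
        string garbled = "\u02DC\u00CC";

        // Codepage 1252 maps U+02DC to byte 0x98.
        Console.WriteLine(BitConverter.ToString(cp1252.GetBytes(garbled)));

        // ISO-8859-1 cannot represent U+02DC, so the first byte is a fallback value.
        Console.WriteLine(BitConverter.ToString(latin1.GetBytes(garbled)));

        // The two routes therefore disagree on the first repaired character.
        Console.WriteLine(RepairAs1256(garbled, cp1252) == RepairAs1256(garbled, latin1));
    }
}
```

If the file was opened on a machine whose default codepage is 1252, the 1252 route is the one that keeps the first letter intact.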