Converting a hex char string to a Unicode string (Python)
I have a string of Unicode ordinals (in hex form), like so:
\u063a\u064a\u0646\u064a\u0627
It's the Unicode representation of the Arabic string غينيا
(taken from an Arabic lorem ipsum generator).
I want to convert this Unicode hex string to غينيا.
I tried print u'%s' % "\u063a\u064a\u0646\u064a\u0627"
(as pointed out here), but it returns the hex format, not the symbols. print word.replace("\u","\\u")
doesn't do the job either. What should I do?
I'm not entirely sure which question you want answered, so I'll cover both cases that I can see.
Case 1: You want to output the Arabic string from your code, using the unicode literal syntax. In that case, you should prefix the string literal with u, and you'll be right as rain:
s = u"\u063a\u064a\u0646\u064a\u0627"
print(s)
This is the same as
print u'%s' % s
except shorter. In this case, formatting an otherwise empty string with your already formed string doesn't make sense, because it changes nothing. In other words, u'%s' % s == s.
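For readers on Python 3, here is a minimal sketch of the same point. It assumes Python 3.3 or later, where every str literal is already Unicode and the u prefix is accepted but redundant; the names same_with_prefix and formatted are only illustrative.

```python
# Python 3 sketch: a plain string literal already interprets \u escapes.
s = "\u063a\u064a\u0646\u064a\u0627"

# The u prefix is still accepted (Python 3.3+) and produces the same string.
same_with_prefix = u"\u063a\u064a\u0646\u064a\u0627"

# Formatting an otherwise empty template with s changes nothing.
formatted = "%s" % s

print(s)
print(s == same_with_prefix)
print(formatted == s)
```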
Case 2: You have an escaped string from some other source that you want to evaluate as a unicode string. This is kind of what it looks like you're trying to do with print u'%s' %. It can be done with
import ast
s = r"\u063a\u064a\u0646\u064a\u0627"
print ast.literal_eval("u'{}'".format(s))
Note that, unlike eval, this is safe: literal_eval doesn't allow anything like a function call. Also see that s here is an r-prefixed string, so the backslashes aren't escaping anything; they are literally backslash characters.
Both pieces of code correctly output
غينيا
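The Case 2 idea can also be sketched for Python 3. This is an assumption-laden variant, not part of the original answer: it drops the u prefix inside the evaluated literal, and it shows the unicode_escape codec as a possible alternative. The codec route assumes the escaped text is pure ASCII, as it is here. The names escaped, decoded and decoded_codec are illustrative.

```python
import ast

# Raw string: the backslashes are literal characters, not escapes.
escaped = r"\u063a\u064a\u0646\u064a\u0627"

# Option 1: wrap the text in quotes and let literal_eval parse it
# as a string literal, which interprets the \u escapes.
decoded = ast.literal_eval("'{}'".format(escaped))

# Option 2 (assumes ASCII-only input): round-trip through bytes
# and decode with the unicode_escape codec.
decoded_codec = escaped.encode("ascii").decode("unicode_escape")

print(decoded)
print(decoded == decoded_codec)
```

Like the original snippet, option 1 assumes the escaped text contains no quote characters of its own, since those would break the literal being built.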
Some elaboration on print u'%s' % s from Case 1: it behaves differently on the escaped string from Case 2, because if the string has already been escaped, it won't be evaluated like a unicode literal during formatting. This is because Python builds unicode objects out of unicode literal-like expressions (such as s) when they are first evaluated. If the string has already been escaped, it is kind of out of reach of normal string operations, so you have to use literal_eval to evaluate it again in order to print the string properly. When you run
print u'%s' % s
the output is
\u063a\u064a\u0646\u064a\u0627
Note that this isn't the representation of a unicode object; it is literally an ASCII string made of backslashes and ordinary characters.
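The difference between the two kinds of string is easy to check directly. The following is a small Python 3 sketch (the names literal and escaped are illustrative) comparing a string whose escapes were interpreted with one whose backslashes are literal:

```python
# Escapes interpreted when the literal is evaluated: 5 Arabic characters.
literal = "\u063a\u064a\u0646\u064a\u0627"

# Raw string: 5 groups of 6 ASCII characters (backslash, u, 4 hex digits).
escaped = r"\u063a\u064a\u0646\u064a\u0627"

print(len(literal))               # 5
print(len(escaped))               # 30
print(escaped[0] == "\\")         # True: the first character is a backslash
print("%s" % escaped == escaped)  # True: formatting does not re-evaluate escapes
```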