[Experience Sharing] Python encoding conversion experiments

Posted on 2018-8-6 12:10:08
  Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56)
  [GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> print ord('A')
  65
  >>>
  ...
  >>> a = {"a":"1","b","2"}
  File "<stdin>", line 1
  a = {"a":"1","b","2"}
  ^
  SyntaxError: invalid syntax
  >>> a = {"a":"1","b":"2"}
  >>> str(a)
  "{'a': '1', 'b': '2'}"
  >>> print a
  {'a': '1', 'b': '2'}
  >>> print type(a)
  <type 'dict'>
  >>> print type(str(a))
  <type 'str'>
  >>> b = [1,2,3]
  >>> print type(b)
  <type 'list'>
  >>> print type(str(b))
  <type 'str'>
  >>> str(b)
  '[1, 2, 3]'
  >>> b.__class__
  <type 'list'>
  >>> str(b).__class__
  <type 'str'>
  >>> isinstance(a, str)
  False
  >>> isinstance(a, dict)
  True
  >>> isinstance(a, unicode)
  False
  >>> isinstance(a, utf-8)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'utf' is not defined
  >>> isinstance(a, 'utf-8')
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  TypeError: isinstance() arg 2 must be a class, type, or tuple of classes and types
  >>> isinstance(a, type)
  False
  >>> isinstance(a, unicode)
  False
  >>> isinstance(a, unicode)
  False
  >>> import chardet
  >>> chardet.detect(a)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 30, in detect
  u.feed(aBuf)
  File "/usr/lib/python2.6/site-packages/chardet/universaldetector.py", line 74, in feed
  if aBuf[:3] == codecs.BOM:
  TypeError: unhashable type
  >>> chardet.detect(str(a))
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> chardet.detect(str(b))
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> c = ["我","是"]
  >>> chardet.detect(str(c))
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> print c
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> c.encode('unicode')
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'list' object has no attribute 'encode'
  >>> str(c).encode('unicode')
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  LookupError: unknown encoding: unicode
  >>> str(c).encode('utf-8')
  "['\\xe6\\x88\\x91', '\\xe6\\x98\\xaf']"
  >>> d = str(c)
  >>> chardet.detect(d)
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> chardet.detect(c)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 30, in detect
  u.feed(aBuf)
  File "/usr/lib/python2.6/site-packages/chardet/universaldetector.py", line 108, in feed
  if self._highBitDetector.search(aBuf):
  TypeError: expected string or buffer
  >>> chardet.detect(d)
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> print d
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print dc
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'dc' is not defined
  >>> print c
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print d.decode('ascii')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(d.decode('ascii'))
  <type 'unicode'>
  >>> print d.decode('ascii')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> chardet.detect(c.decode('ascii')
  ... )
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'list' object has no attribute 'decode'
  >>> chardet.detect(d.decode('ascii'))
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 25, in detect
  raise ValueError('Expected a bytes object, not a unicode object')
  ValueError: Expected a bytes object, not a unicode object
  >>> type(d)
  <type 'str'>
  >>> print type(d.decode('ascii'))
  <type 'unicode'>
  >>>  print d.decode('ascii')
  File "<stdin>", line 1
  print d.decode('ascii')
  ^
  IndentationError: unexpected indent
  >>> print d.decode('ascii')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print d.decode('ascii').encode('utf-8')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print d.decode('ascii').encode('utf-8')[0]
  [
  >>> print d.decode('ascii')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> e = d.decode('ascii')
  >>> print e
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> type(e)
  <type 'unicode'>
  >>> f = e.encode('utf-8')
  >>> f
  "['\\xe6\\x88\\x91', '\\xe6\\x98\\xaf']"
  >>> print f
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> type(f)
  <type 'str'>
  >>> print f.decode("unicode_escape")
  ['', 'ˉ']
  >>> print f.encode("raw_unicode_escape")
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print f.encode("raw_unicode_escape").decode('utf-8')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print b
  [1, 2, 3]
  >>> print c
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(c)
  <type 'list'>
  >>> print type(d)
  <type 'str'>
  >>> print d
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> import syss
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  ImportError: No module named syss
  >>> import sys
  >>> reload(sys)
  <module 'sys' (built-in)>
  >>> sys.setdefaultencoding('utf-8')
  >>> print d
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(c)
  <type 'list'>
  >>> print type(d)
  <type 'str'>
  >>> cc = ["我","是"]
  >>> print cc
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(cc)
  <type 'list'>
  >>> dd = str(cc)
  >>> pirnt dd
  File "<stdin>", line 1
  pirnt dd
  ^
  SyntaxError: invalid syntax
  >>> print dd
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(dd)
  <type 'str'>
  >>> chardet.detect(d)
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> chardet.detect(dd)
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> sys.defaultencoding()
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencoding'
  >>> sys.defaultencoding
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencoding'
  >>> sys.defaultencode
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencode'
  >>> sys.defaultencode()
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencode'
  >>> sys.defaultencoding()
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencoding'
  >>> sys.defaultencode
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  AttributeError: 'module' object has no attribute 'defaultencode'
  >>> q = '中国'
  >>> type(q)
  <type 'str'>
  >>> chardet.detect(q0
  ... )
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'q0' is not defined
  >>> chardet.detect(q)
  {'confidence': 0.75249999999999995, 'encoding': 'utf-8'}
  >>> p = ['中国', '复兴']
  >>> chardet.detect(p)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 30, in detect
  u.feed(aBuf)
  File "/usr/lib/python2.6/site-packages/chardet/universaldetector.py", line 108, in feed
  if self._highBitDetector.search(aBuf):
  TypeError: expected string or buffer
  >>> chardet.detect(str(p))
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> print type(dd)
  <type 'str'>
  >>> print dd.decode('unicode_escape')
  ['', 'ˉ']
  >>> print type(dd.decode('unicode_escape'))
  <type 'unicode'>
  >>> dd
  "['\\xe6\\x88\\x91', '\\xe6\\x98\\xaf']"
  >>> print dd
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print dd.encode('raw_unicode_escape')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print type(dd.encode('raw_unicode_escape'))
  <type 'str'>
  >>> print type(dd.encode('raw_unicode_escape').decode('utf-8'))
  <type 'unicode'>
  >>> print type(dd.encode('raw_unicode_escape').decode('utf-8')
  ... )
  <type 'unicode'>
  >>> print dd
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print dd, type(dd)
  ['\xe6\x88\x91', '\xe6\x98\xaf'] <type 'str'>
  >>> print dd.encode('raw_unicode_escape'), type(dd.encode('raw_unicode_escape'))
  ['\xe6\x88\x91', '\xe6\x98\xaf'] <type 'str'>
  >>> print dd.decode('utf-8'), type(dd.decode('utf-8')
  ... )
  ['\xe6\x88\x91', '\xe6\x98\xaf'] <type 'unicode'>
  >>> print dd.decode('utf-8')
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print dd
  ['\xe6\x88\x91', '\xe6\x98\xaf']
  >>> print ee
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'ee' is not defined
  >>> ee = u"dd"
  >>> ee = u"['\xe6\x88\x91', '\xe6\x98\xaf']"
  >>> print ee
  ['', 'ˉ']
  >>> ee
  u"['\xe6\x88\x91', '\xe6\x98\xaf']"
  >>> ee = [u'中国', u'复兴']
  >>> type(ee)
  <type 'list'>
  >>> print ee
  [u'\u4e2d\u56fd', u'\u590d\u5174']
  >>> print str(ee)
  [u'\u4e2d\u56fd', u'\u590d\u5174']
  >>> printee
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'printee' is not defined
  >>> print ee
  [u'\u4e2d\u56fd', u'\u590d\u5174']
  >>> print json.dumps(ee).decode('unicode_escape')
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'json' is not defined
  >>> import json
  >>> print json.dumps(ee).decode('unicode_escape')
  ["中国", "复兴"]
  >>> print str(ee).decode('unicode_escape')
  [u'中国', u'复兴']
  >>> x = '中国'
  >>> print x
  中国
  >>> x
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> type(x)
  <type 'str'>
  >>> chardet.detect(x)
  {'confidence': 0.75249999999999995, 'encoding': 'utf-8'}
  >>> y = x.decode('utf-8')
  >>> y
  u'\u4e2d\u56fd'
  >>> print y
  中国
  >>> chardet.detect(y)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 25, in detect
  raise ValueError('Expected a bytes object, not a unicode object')
  ValueError: Expected a bytes object, not a unicode object
  >>> x
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> x = '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> print x
  中国
  >>> x
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> x = u'\xe4\xb8\xad\xe5\x9b\xbd'
  >>> print x
  -
  >>> x.decode('utf-8')
  u'\xe4\xb8\xad\xe5\x9b\xbd'
  >>> print x.decode('utf-8')
  -
  >>> chardet.detect(x)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/site-packages/chardet/__init__.py", line 25, in detect
  raise ValueError('Expected a bytes object, not a unicode object')
  ValueError: Expected a bytes object, not a unicode object
  >>> print type(x)
  <type 'unicode'>
  >>> x
  u'\xe4\xb8\xad\xe5\x9b\xbd'
  >>> pirnt x
  File "<stdin>", line 1
  pirnt x
  ^
  SyntaxError: invalid syntax
  >>> print x
  -
  >>> print x.encode('raw_unicode_escape')
  中国
  >>> y = x.encode('raw_unicode_escape')
  >>> y
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> type y
  File "<stdin>", line 1
  type y
  ^
  SyntaxError: invalid syntax
  >>> type(y)
  <type 'str'>
  >>> print y
  中国
  >>> chardet.detect(y)
  {'confidence': 0.75249999999999995, 'encoding': 'utf-8'}
  >>> z = y.encode('utf-8')
  >>> print z
  中国
  >>> z
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> y
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> type(z)
  <type 'str'>
  >>> type(y)
  <type 'str'>
  >>> chardet.detect(y)
  {'confidence': 0.75249999999999995, 'encoding': 'utf-8'}
  >>> y
  '\xe4\xb8\xad\xe5\x9b\xbd'
  >>> z = y.encode('utf-8')
  >>> z = y.decode('utf-8')
  >>> z
  u'\u4e2d\u56fd'
  >>> print z
  中国
  >>> type(z)
  <type 'unicode'>
  >>> x
  u'\xe4\xb8\xad\xe5\x9b\xbd'
  >>> f='\u53eb\u6211'
  >>> print f
  \u53eb\u6211
  >>> f
  '\\u53eb\\u6211'
  >>> type(f)
  <type 'str'>
  >>> chardet.detect(f)
  {'confidence': 1.0, 'encoding': 'ascii'}
  >>> f.decode('ascii')
  u'\\u53eb\\u6211'
  >>> print f.decode('ascii')
  \u53eb\u6211
  >>> f.decode('unicode_escape')
  u'\u53eb\u6211'
  >>> print f.decode('unicode_escape')
  叫我
  >>> sys.getdefaultencoding()
  'utf-8'
  >>> dd = { 'name': u'功夫熊猫' }
  >>> print dd
  {'name': u'\u529f\u592b\u718a\u732b'}
  >>> dd
  {'name': u'\u529f\u592b\u718a\u732b'}
  >>> dd2 = { 'name': '功夫熊猫' }
  >>> dd2
  {'name': '\xe5\x8a\x9f\xe5\xa4\xab\xe7\x86\x8a\xe7\x8c\xab'}
  >>> print simplejson.dumps(dd, ensure_ascii=False)
  Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  NameError: name 'simplejson' is not defined
  >>> print json.dumps(dd, ensure_ascii=False)
  {"name": "功夫熊猫"}
  >>> print json.dumps(dd2, ensure_ascii=False)
  {"name": "功夫熊猫"}
  >>> print dd2
  {'name': '\xe5\x8a\x9f\xe5\xa4\xab\xe7\x86\x8a\xe7\x8c\xab'}
  >>>
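
To sum up the conversions that actually worked in the session above, here is a minimal sketch. It is not part of the original session: it is written for Python 3 (where `bytes` plays the role of the Python 2 `str` and `str` plays the role of `unicode`), and `decode_items` is a hypothetical helper name introduced only for illustration. It shows why `str()` on a list only yields an ASCII repr, and how decoding each element, using `unicode_escape`, or passing `ensure_ascii=False` to `json.dumps` gets the readable text back.

```python
import json

# Python 3 naming: `bytes` here corresponds to Python 2 `str`,
# and `str` here corresponds to Python 2 `unicode`.

def decode_items(items, encoding="utf-8"):
    """Hypothetical helper: decode every bytes element, pass others through."""
    return [item.decode(encoding) if isinstance(item, bytes) else item
            for item in items]

# Same bytes the session shows for ["我", "是"].
c = [b"\xe6\x88\x91", b"\xe6\x98\xaf"]

# repr() of a list escapes its elements, so the result is pure ASCII.
# This is why chardet reported 'ascii' for str(c) in the session.
as_repr = repr(c)
assert all(ord(ch) < 128 for ch in as_repr)

# Decoding element by element recovers the text.
decoded = decode_items(c)
assert decoded == ["我", "是"]

# Literal backslash-u escapes stored as ASCII bytes need unicode_escape.
escaped = b"\\u53eb\\u6211"
assert escaped.decode("unicode_escape") == "叫我"

# json.dumps escapes non-ASCII by default; ensure_ascii=False keeps it readable.
record = {"name": "功夫熊猫"}
assert json.dumps(record) == '{"name": "\\u529f\\u592b\\u718a\\u732b"}'
assert json.dumps(record, ensure_ascii=False) == '{"name": "功夫熊猫"}'
```

The point of the design is to decode at the element level and leave the container alone: once a list has been turned into its repr string, the original bytes are only present as escape text, and no `encode`/`decode` call on that string will turn them back into characters.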

Post URL: https://www.iyunv.com/thread-547610-1-1.html