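A Brief Introduction to the JSON Module in Python

Updated: 2015-04-08 09:33:29 Submitted by: goldensun

This article gives a brief introduction to the JSON module in Python, including the basics of converting Python data to JSON format. Readers who need this may find it a useful reference.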

(1) What is JSON:

JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent, but it uses conventions familiar from the C family of languages (including C, C++, C#, Java, JavaScript, Perl, Python, and others). These properties make JSON an ideal data-interchange language.

JSON is built on two structures:

A collection of name/value pairs. In various languages, this is realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

An ordered list of values. In most languages, this is realized as an array.

These are common data structures. Most modern programming languages support them in one form or another, which makes it possible to exchange a data format between programming languages that are built on the same structures.
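As a small illustration of these two structures, the sketch below parses a JSON object that contains a JSON array, using Python's json module. It assumes Python 3 syntax, and the sample text is made up for the example.

```python
import json

# A JSON object (name/value pairs) that contains a JSON array (ordered values).
text = '{"name": "Aidan", "scores": [90, 85, 77]}'

parsed = json.loads(text)

# The object is decoded to a dict and the array to a list.
print(type(parsed).__name__)
print(type(parsed["scores"]).__name__)
```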

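(2) The Python JSON module

The json module has been part of the standard library since Python 2.6, so no separate download is needed. The module refers to serialization and deserialization as encoding and decoding, respectively. Encoding converts a Python object into a JSON string; decoding converts a JSON-formatted string into a Python object. To use the json module, import it first: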

import json

1. Handling simple data types

The Python json module can handle simple data types directly (str, unicode, int, float, list, tuple, dict). The json.dumps() method returns a str object. During encoding, Python's native types are converted to JSON types; the conversion table is as follows:

[Figure: conversion table from Python types to JSON types]
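The conversions can also be observed directly by encoding one value of each type. This is a minimal sketch in Python 3 syntax (so there is no separate unicode or long type, unlike the Python 2 used in the rest of this article); the sample values are arbitrary.

```python
import json

# One sample value for each Python type that json.dumps handles directly.
samples = [
    {"k": "v"},   # dict  -> JSON object
    [1, 2],       # list  -> JSON array
    (1, 2),       # tuple -> JSON array
    "text",       # str   -> JSON string
    42,           # int   -> JSON number
    3.5,          # float -> JSON number
    True,         # True  -> true
    None,         # None  -> null
]

encoded = [json.dumps(value) for value in samples]
for value, text in zip(samples, encoded):
    print(type(value).__name__, "->", text)
```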

json.dumps offers many useful parameters. Commonly used ones include sort_keys (sorts the keys of dict objects; by default a dict is stored unordered), separators, and indent. The dumps method is defined as:

json.dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding="utf-8", default=None, sort_keys=False, **kw)

A plain call to json.dumps is enough to encode simple data types, for example:
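The two similarly named functions are easy to mix up: json.dumps returns the encoded text as a string, while json.dump takes an additional file-like argument (fp) and writes the text to it. A minimal sketch in Python 3 syntax, with io.StringIO standing in for a real file (an assumption made only to keep the example self-contained):

```python
import io
import json

data = {"a": 1, "b": [1, 2, 3]}

# dumps returns the encoded JSON text as a string.
as_string = json.dumps(data)

# dump writes the same JSON text to a file-like object instead.
buffer = io.StringIO()
json.dump(data, buffer)

print(as_string)
print(buffer.getvalue() == as_string)
```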

obj = [[1,2,3],123,123.123,'abc',{'key1':(1,2,3),'key2':(4,5,6)}] 
encodedjson = json.dumps(obj) 
print 'the original list:\n',obj 
print 'length of obj is:',len(repr(obj))
print 'repr(obj),replace whiteblank with *:\n', repr(obj).replace(' ','*') 
print 'json encoded,replace whiteblank with *:\n',encodedjson.replace(' ','*')

Output: (Python's default item separator is ', ' rather than ',', so the members of a list are separated by a space whether the list is converted to a string or to JSON)

the original list: 
[[1, 2, 3], 123, 123.123, 'abc', {'key2': (4, 5, 6), 'key1': (1, 2, 3)}] 
length of obj is: 72
repr(obj),replace whiteblank with *: 
[[1,*2,*3],*123,*123.123,*'abc',*{'key2':*(4,*5,*6),*'key1':*(1,*2,*3)}] 
json encoded,replace whiteblank with *: 
[[1,*2,*3],*123,*123.123,*"abc",*{"key2":*[4,*5,*6],*"key1":*[1,*2,*3]}] 

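Next we decode encodedjson to get the original data back, using the json.loads() function. loads returns the original object, but some type conversions still take place: in the example above, 'abc' becomes a unicode object and the tuples come back as lists. Note that the keys of an object in a JSON string must be enclosed in double quotes (""), otherwise json.loads() cannot parse it. The conversion table from JSON types to Python types is as follows: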

[Figure: conversion table from JSON types to Python types]

decodejson = json.loads(encodedjson) 
print 'the type of decodeed obj from json:', type(decodejson) 
print 'the obj is:\n',decodejson 
print 'length of decoded obj is:',len(repr(decodejson))

Output:

the type of decodeed obj from json: <type 'list'> 
the obj is: 
[[1, 2, 3], 123, 123.123, u'abc', {u'key2': [4, 5, 6], u'key1': [1, 2, 3]}] 
length of decoded obj is: 75 # 3 more than the original obj, from the three unicode 'u' prefixes
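The double-quote requirement can be checked with a short experiment. The sketch below uses Python 3 syntax and two made-up input strings: one valid JSON text, and one written with single quotes the way Python's repr() would print a dict.

```python
import json

# Valid JSON: keys (and strings) use double quotes.
good = '{"key1": [1, 2, 3]}'
# Not valid JSON: single quotes, as Python's repr() would produce.
bad = "{'key1': [1, 2, 3]}"

decoded = json.loads(good)

try:
    json.loads(bad)
    bad_parsed = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    bad_parsed = False

print(decoded)
print(bad_parsed)
```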

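The sort_keys option makes stored data easier to inspect and makes it possible to compare JSON output. In the example below, data1 and data2 hold the same data, but because dicts are stored unordered, their encoded strings cannot be compared reliably unless the keys are sorted.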

data1 = {'b':789,'c':456,'a':123} 
data2 = {'a':123,'b':789,'c':456} 
d1 = json.dumps(data1,sort_keys=True) 
d2 = json.dumps(data2) 
d3 = json.dumps(data2,sort_keys=True) 
print 'sorted data1(d1):',d1 
print 'unsorted data2(d2):',d2 
print 'sorted data2(d3):',d3 
print 'd1==d2?:',d1==d2 
print 'd1==d3?:',d1==d3

Output:

sorted data1(d1): {"a": 123, "b": 789, "c": 456} 
unsorted data2(d2): {"a": 123, "c": 456, "b": 789} 
sorted data2(d3): {"a": 123, "b": 789, "c": 456} 
d1==d2?: False 
d1==d3?: True
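The same comparison can be sketched in Python 3 syntax. One assumption to note: on Python 3.7 and later a dict keeps insertion order rather than an arbitrary order, so the unsorted strings below differ because the two dicts were written in different orders, even though the dicts themselves compare equal.

```python
import json

data1 = {"b": 789, "c": 456, "a": 123}
data2 = {"a": 123, "b": 789, "c": 456}

plain1 = json.dumps(data1)
plain2 = json.dumps(data2)
sorted1 = json.dumps(data1, sort_keys=True)
sorted2 = json.dumps(data2, sort_keys=True)

print(data1 == data2)       # the dicts hold the same data
print(plain1 == plain2)     # unsorted text depends on key order
print(sorted1 == sorted2)   # sorted text is directly comparable
```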

The indent parameter controls indentation. It makes the stored data more elegant and more readable by padding it with extra whitespace. When the text is decoded with json.loads(), that padding is discarded.

data = {'b':789,'c':456,'a':123} 
d1 = json.dumps(data,sort_keys=True,indent=4) 
print 'data len is:',len(repr(data)) 
print '4 indented data:\n',d1 
d2 = json.loads(d1) 
print 'decoded DATA:', repr(d2) 
print 'len of decoded DATA:',len(repr(d2))

Output: (loads discards the indent padding that dumps added)

data len is: 30 
4 indented data: 
{
    "a": 123,
    "b": 789,
    "c": 456
}
decoded DATA: {u'a': 123, u'c': 456, u'b': 789} 
len of decoded DATA: 33
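A round trip makes the point explicit: indentation changes only the text, not the decoded value. A minimal sketch in Python 3 syntax, reusing the same sample data:

```python
import json

data = {"b": 789, "c": 456, "a": 123}

pretty = json.dumps(data, sort_keys=True, indent=4)
compact = json.dumps(data, sort_keys=True)

# The indented text is longer, but both texts decode to the same dict.
print(len(pretty) > len(compact))
print(json.loads(pretty) == json.loads(compact) == data)
```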

JSON exists mainly as a data-exchange format, and needless whitespace wastes bandwidth, so it is sometimes worth compacting the data. The separators parameter serves this purpose. It takes a tuple of the strings used as separators; in effect it replaces Python's default separators (', ', ': ') with (',', ':').

data = {'b':789,'c':456,'a':123}
print 'DATA:', repr(data)
print 'repr(data)       :', len(repr(data))
print 'dumps(data)      :', len(json.dumps(data))
print 'dumps(data, indent=4) :', len(json.dumps(data, indent=4))
print 'dumps(data, separators):', len(json.dumps(data, separators=(',',':')))

Output:

DATA: {'a': 123, 'c': 456, 'b': 789}
repr(data)       : 30
dumps(data)      : 30
dumps(data, indent=4) : 46
dumps(data, separators): 25

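Another useful dumps parameter is skipkeys, which defaults to False. When dumps encodes a dict, every key must be a basic type (str, unicode, int, long, float, bool, or None); any other key type raises a TypeError. Setting skipkeys to True quietly filters out the invalid keys instead.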

data = {'b':789,'c':456,(1,2):123} 
print'original data:',repr(data) 
print 'json encoded',json.dumps(data,skipkeys=True)

Output:

original data: {(1, 2): 123, 'c': 456, 'b': 789} 
json encoded {"c": 456, "b": 789}
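To see both behaviours side by side, the sketch below (Python 3 syntax, same sample data) first encodes without skipkeys and catches the resulting TypeError, then encodes again with skipkeys=True.

```python
import json

data = {"b": 789, "c": 456, (1, 2): 123}

# Without skipkeys, the tuple key makes the encoder raise TypeError.
try:
    json.dumps(data)
    raised = False
except TypeError:
    raised = True

# With skipkeys=True, the entry with the tuple key is silently dropped.
filtered = json.dumps(data, skipkeys=True)

print(raised)
print(filtered)
```

Because the dropped entry leaves no trace in the output, skipkeys is probably best reserved for cases where losing those entries is acceptable.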

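2. Handling custom data types with JSON

The json module can handle not only Python's built-in types but also our own custom data types, and handling custom objects is a very common need.

If we pass an instance of a custom class (such as the Person class in the code below) straight to json.dumps, an error is raised, because json does not support this conversion automatically. From the JSON/Python type conversion tables mentioned above, we can see that the JSON object type corresponds to dict, so we need to convert our custom type to a dict before encoding it. There are two ways to do this.

Method 1: write the conversion functions yourself

Convert between the custom object type and dict. Encoding: define a function object2dict() that stores the object's module name, class name, and __dict__ in a dictionary and returns it. Decoding: define dict2object() to extract the module name, class name, and arguments, then create and return a new object. Pass the encoding function to json.dumps() through the default parameter, and pass the decoding function to json.loads() through object_hook.

Method 2: subclass JSONEncoder and JSONDecoder and override the relevant methods

The JSONEncoder class does the encoding, mainly through its default method, which we can override. JSONDecoder can be customized in a similar way, here by passing an object_hook to its constructor.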

#handling private data type 
#define class 
class Person(object): 
  def __init__(self,name,age): 
    self.name = name 
    self.age = age 
  def __repr__(self): 
    return 'Person Object name : %s , age : %d' % (self.name,self.age) 
    
    
#define transfer functions 
def object2dict(obj): 
  #convert object to a dict 
  d = {'__class__':obj.__class__.__name__, '__module__':obj.__module__} 
  d.update(obj.__dict__) 
  return d 
   
def dict2object(d): 
  #convert dict to object 
  if'__class__' in d: 
    class_name = d.pop('__class__') 
    module_name = d.pop('__module__') 
    module = __import__(module_name) 
    print 'the module is:', module 
    class_ = getattr(module,class_name) 
    args = dict((key.encode('ascii'), value) for key, value in d.items()) #get args 
    print 'the atrribute:', repr(args) 
    inst = class_(**args) #create new instance 
  else: 
    inst = d 
  return inst 
#recreate the default method 
class LocalEncoder(json.JSONEncoder): 
  def default(self,obj): 
    #convert object to a dict 
    d = {'__class__':obj.__class__.__name__, '__module__':obj.__module__} 
    d.update(obj.__dict__) 
    return d 
   
class LocalDecoder(json.JSONDecoder): 
  def __init__(self): 
    json.JSONDecoder.__init__(self,object_hook = self.dict2object) 
  def dict2object(self, d): 
    #convert dict to object 
    if'__class__' in d: 
      class_name = d.pop('__class__') 
      module_name = d.pop('__module__') 
      module = __import__(module_name) 
      class_ = getattr(module,class_name) 
      args = dict((key.encode('ascii'), value) for key, value in d.items()) #get args 
      inst = class_(**args) #create new instance 
    else: 
      inst = d 
    return inst 
#test function 
if __name__ == '__main__': 
  p = Person('Aidan',22) 
  print p 
  # json.dumps(p)  # would raise TypeError
  d = object2dict(p) 
  print 'method-json encode:', d 
   
  o = dict2object(d) 
  print 'the decoded obj type: %s, obj:%s' % (type(o),repr(o)) 
   
  dump = json.dumps(p,default=object2dict) 
  print 'dumps(default = object2dict):',dump 
  load = json.loads(dump,object_hook = dict2object) 
  print 'loads(object_hook = dict2object):',load 
  d = LocalEncoder().encode(p) 
  o = LocalDecoder().decode(d) 
   
  print 'recereated encode method: ',d 
  print 'recereated decode method: ',type(o),o

Output:

Person Object name : Aidan , age : 22 
method-json encode: {'age': 22, '__module__': '__main__', '__class__': 'Person', 'name': 'Aidan'} 
the module is: <module '__main__' from 'D:/Project/Python/study_json'> 
the atrribute: {'age': 22, 'name': 'Aidan'} 
the decoded obj type: <class '__main__.Person'>, obj:Person Object name : Aidan , age : 22 
dumps(default = object2dict): {"age": 22, "__module__": "__main__", "__class__": "Person", "name": "Aidan"} 
the module is: <module '__main__' from 'D:/Project/Python/study_json'> 
the atrribute: {'age': 22, 'name': u'Aidan'} 
loads(object_hook = dict2object): Person Object name : Aidan , age : 22 
recereated encode method: {"age": 22, "__module__": "__main__", "__class__": "Person", "name": "Aidan"} 
recereated decode method: <class '__main__.Person'> Person Object name : Aidan , age : 22
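For readers on Python 3, the same default/object_hook idea can be sketched as follows. This is not the article's code: the names KNOWN_CLASSES, to_dict, and from_dict are hypothetical, and the explicit class registry is a swapped-in alternative to looking classes up with __import__, chosen so that only listed classes can be rebuilt from incoming JSON.

```python
import json


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


# Hypothetical whitelist of classes that may be rebuilt from JSON.
KNOWN_CLASSES = {"Person": Person}


def to_dict(obj):
    """default hook: called only for objects json cannot encode itself."""
    class_name = type(obj).__name__
    if class_name in KNOWN_CLASSES:
        d = {"__class__": class_name}
        d.update(vars(obj))
        return d
    raise TypeError("cannot encode %r" % (obj,))


def from_dict(d):
    """object_hook: called for every decoded JSON object."""
    class_name = d.get("__class__")
    if class_name in KNOWN_CLASSES:
        args = {k: v for k, v in d.items() if k != "__class__"}
        return KNOWN_CLASSES[class_name](**args)
    return d


text = json.dumps(Person("Aidan", 22), default=to_dict, sort_keys=True)
restored = json.loads(text, object_hook=from_dict)

print(text)
print(type(restored).__name__, restored.name, restored.age)
```

Unlike dict2object above, from_dict does not modify the dict it receives and needs no key.encode('ascii') step, since keyword argument names are ordinary str objects in Python 3.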
