Processing JSON Data with Python and Importing It into Neo4j

Updated: 2024-11-04 15:31:18  Author: engchina

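Introduction

JSON is a common format in data processing and analysis. Neo4j is a high-performance graph database that can store and query complex networks of relationships. This article walks through a Python script to show in detail how to process JSON data and import it into a Neo4j database.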

Code Structure Overview

First, let's look at the overall structure of the code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time     :2022/9/13 10:03
# @File     :handler_person_data.py
# @Description: Process JSON data

import json
import os

from common import constant
from common.conn_neo4j import ConnNeo4j

# Build the path to the data file
data_path = os.path.join(constant.DATA_DIR, "data-json.json")
# Read the contents of the data file (the with block closes the file afterwards)
with open(data_path, 'r', encoding='utf-8') as f:
    data = json.load(f)
print("Number of people:", len(data))

# Connect to the Neo4j server
neo4j = ConnNeo4j()
# Iterate over the data
for item in data:
    # Copy 中文名 (Chinese name) into a name property; the original key is kept as well
    item['name'] = item['中文名']
    # Schools the person graduated from (畢業于); removed from the node properties
    school = item.pop('畢業于', [])

    # Works (作品); removed from the node properties
    works = item.pop('作品', [])

    # Related people (相關人物); removed from the node properties
    relate_persons = item.pop('相關人物', {})

    print(item)
    # Create the person node
    neo4j.create_node("人物", item)
    # Create the school nodes and the person-school relationships
    neo4j.create_node_relations("人物", item, "學校", school, "畢業于", {'type': '畢業于'}, False)
    # Create the work nodes and the person-work relationships
    neo4j.create_node_relations("人物", item, "作品", works, "創作", {'type': '創作'}, False)
    # Create the related people and the social relationships between people
    for key, tmp_value in relate_persons.items():
        tmp_rel_type = key
        if key in ['兒子', '女兒', '父親', '母親']:
            # Parent and child
            neo4j.create_node_relations("人物", item, "人物", tmp_value, tmp_rel_type, {'type': '親子'}, False)
        elif key in ['孫子', '孫女', '爺爺', '奶奶']:
            # Grandparent and grandchild
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '祖孫'}, False)
        elif key in ['哥哥', '妹妹', '弟弟', '姐姐']:
            # Siblings
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '兄弟姐妹'}, False)
        elif key in ['丈夫', '妻子']:
            # Spouses
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '夫妻'}, False)
        elif key in ['女婿', '兒媳']:
            # Son-in-law or daughter-in-law
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '婿媳'}, False)
        elif key in ['學生', '老師']:
            # Teacher and student
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '師生'}, False)
        else:
            # Anything else
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '其他'}, False)

Code Walkthrough

1. Import the required libraries

import json
import os

from common import constant
from common.conn_neo4j import ConnNeo4j

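json: handles JSON-formatted data.

os: handles file paths.

constant: constants imported from the project's common package, presumably including the data directory.

ConnNeo4j: the Neo4j connection class imported from the common.conn_neo4j module. Like constant, it is project-local code whose source is not shown in this article.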

2. Define the data file path

# Build the path to the data file
data_path = os.path.join(constant.DATA_DIR, "data-json.json")

data_path: the path to the JSON file that holds the data.
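The article does not show common/constant.py, so the following is only a minimal sketch of how a data directory constant and the path join could fit together. The directory location (a data folder under the current working directory) and the build_data_path helper are assumptions made for illustration.

```python
import os

# Assumption: DATA_DIR plays the role of constant.DATA_DIR, which the article
# does not show. Here it is simply a "data" folder under the working directory.
DATA_DIR = os.path.join(os.getcwd(), "data")


def build_data_path(filename: str, data_dir: str = DATA_DIR) -> str:
    """Join the data directory and a file name into a single path."""
    return os.path.join(data_dir, filename)


data_path = build_data_path("data-json.json")
print(data_path)
# Checking for the file first gives a clearer error than a failed open()
print("exists:", os.path.exists(data_path))
```

Using os.path.join rather than string concatenation keeps the path separator correct on both Windows and Unix-like systems.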

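3. Read the JSON file contents

# Read the contents of the data file (the with block closes the file afterwards)
with open(data_path, 'r', encoding='utf-8') as f:
    data = json.load(f)
print("Number of people:", len(data))

Uses json.load() to parse the JSON file and store the result in the data variable.

Prints the number of people in the data.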
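The article does not show what data-json.json contains. Judging from how the loop later uses each record, one plausible shape is a JSON array of objects, each with a 中文名 field plus optional 畢業于 (list), 作品 (list) and 相關人物 (object mapping a relation to names) fields. The sketch below assumes that shape; the sample record uses placeholder values and the load_people helper is hypothetical.

```python
import json

# Hypothetical sample: the structure is inferred from the article's loop,
# and the values are placeholders, not real data.
SAMPLE_JSON = """
[
  {
    "中文名": "張三",
    "畢業于": ["某大學"],
    "作品": ["某作品"],
    "相關人物": {"兒子": ["張四"], "老師": ["李五"]}
  }
]
"""


def load_people(text: str) -> list:
    """Parse JSON text and check that it is a list of objects with a 中文名 key."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of person records")
    for index, item in enumerate(data):
        if not isinstance(item, dict) or "中文名" not in item:
            raise ValueError(f"record {index} has no 中文名 field")
    return data


people = load_people(SAMPLE_JSON)
print("Number of people:", len(people))
```

Validating up front means a malformed record fails with a clear message before anything is written to the database, rather than as a KeyError halfway through the import.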

4. Connect to the Neo4j server

# Connect to the Neo4j server
neo4j = ConnNeo4j()

Creates a ConnNeo4j object, which is used to connect to the Neo4j database.
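The source of ConnNeo4j is not included in the article. As a rough sketch only, a wrapper with the same two method names could be built on the official neo4j Python driver as shown below. The class name ConnNeo4jSketch, the default URI and credentials, the choice to MERGE nodes by their name property, and the decision to ignore the final boolean argument are all assumptions, not the author's implementation.

```python
from typing import Any, Dict, Tuple


def quote(identifier: str) -> str:
    """Backtick-quote a label or relationship type (these cannot be query parameters)."""
    return "`" + identifier.replace("`", "``") + "`"


def build_node_query(label: str, props: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Cypher that creates the node once (matched by name) and sets its properties."""
    query = f"MERGE (n:{quote(label)} {{name: $name}}) SET n += $props"
    return query, {"name": props["name"], "props": props}


def build_relation_query(from_label: str, to_label: str, rel_type: str) -> str:
    """Cypher that creates both end nodes if needed and links them."""
    return (
        f"MERGE (a:{quote(from_label)} {{name: $from_name}}) "
        f"MERGE (b:{quote(to_label)} {{name: $to_name}}) "
        f"MERGE (a)-[r:{quote(rel_type)}]->(b) "
        "SET r += $rel_props"
    )


class ConnNeo4jSketch:
    """Hypothetical stand-in for the article's ConnNeo4j class."""

    def __init__(self, uri="bolt://localhost:7687", user="neo4j", password="password"):
        # Imported here so the query builders above work without the driver installed.
        from neo4j import GraphDatabase
        self._driver = GraphDatabase.driver(uri, auth=(user, password))

    def close(self):
        self._driver.close()

    def create_node(self, label, props):
        query, params = build_node_query(label, props)
        with self._driver.session() as session:
            session.run(query, params)

    def create_node_relations(self, from_label, from_props, to_label, to_names,
                              rel_type, rel_props, flag=False):
        # The article does not explain the final boolean, so it is ignored here.
        if isinstance(to_names, str):
            to_names = [to_names]
        query = build_relation_query(from_label, to_label, rel_type)
        with self._driver.session() as session:
            for to_name in to_names:
                session.run(query, {
                    "from_name": from_props["name"],
                    "to_name": to_name,
                    "rel_props": rel_props,
                })
```

Labels and relationship types are interpolated into the query text because Cypher does not accept them as parameters, which is why they are backtick-quoted first. All values go through parameters. MERGE is used instead of CREATE so that running the import twice does not duplicate nodes.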

5. Iterate over the data and process it

# Iterate over the data
for item in data:
    # Copy 中文名 (Chinese name) into a name property; the original key is kept as well
    item['name'] = item['中文名']
    # Schools the person graduated from (畢業于); removed from the node properties
    school = item.pop('畢業于', [])

    # Works (作品); removed from the node properties
    works = item.pop('作品', [])

    # Related people (相關人物); removed from the node properties
    relate_persons = item.pop('相關人物', {})

    print(item)
    # Create the person node
    neo4j.create_node("人物", item)
    # Create the school nodes and the person-school relationships
    neo4j.create_node_relations("人物", item, "學校", school, "畢業于", {'type': '畢業于'}, False)
    # Create the work nodes and the person-work relationships
    neo4j.create_node_relations("人物", item, "作品", works, "創作", {'type': '創作'}, False)
    # Create the related people and the social relationships between people
    for key, tmp_value in relate_persons.items():
        tmp_rel_type = key
        if key in ['兒子', '女兒', '父親', '母親']:
            # Parent and child
            neo4j.create_node_relations("人物", item, "人物", tmp_value, tmp_rel_type, {'type': '親子'}, False)
        elif key in ['孫子', '孫女', '爺爺', '奶奶']:
            # Grandparent and grandchild
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '祖孫'}, False)
        elif key in ['哥哥', '妹妹', '弟弟', '姐姐']:
            # Siblings
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '兄弟姐妹'}, False)
        elif key in ['丈夫', '妻子']:
            # Spouses
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '夫妻'}, False)
        elif key in ['女婿', '兒媳']:
            # Son-in-law or daughter-in-law
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '婿媳'}, False)
        elif key in ['學生', '老師']:
            # Teacher and student
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '師生'}, False)
        else:
            # Anything else
            neo4j.create_node_relations('人物', item, '人物', tmp_value, tmp_rel_type, {'type': '其他'}, False)

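Iterates over every JSON object in data.

Copies the value of the 中文名 field into a new name key. The original 中文名 key stays on the object, so this is a copy rather than a rename.

Extracts the 畢業于 (graduated from), 作品 (works) and 相關人物 (related people) fields and removes them from the JSON object, so they are not stored as properties of the person node.

Prints the processed JSON object.

Calls neo4j.create_node() to create the person node.

Calls neo4j.create_node_relations() to create the school, work and related-person nodes and the relationships that link them to the person. For related people, the relation key (for example 兒子) becomes the relationship type, and a broader category (for example 親子) is stored in the relationship's type property.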
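The field extraction and the if/elif chain described above can also be written in a table-driven way. The sketch below is an alternative arrangement of the same logic, not the author's code: split_person, relation_category and RELATION_GROUPS are hypothetical names, and the sample record uses placeholder values.

```python
# Category -> relation keys, copied from the branches of the article's if/elif chain.
RELATION_GROUPS = {
    "親子": ["兒子", "女兒", "父親", "母親"],
    "祖孫": ["孫子", "孫女", "爺爺", "奶奶"],
    "兄弟姐妹": ["哥哥", "妹妹", "弟弟", "姐姐"],
    "夫妻": ["丈夫", "妻子"],
    "婿媳": ["女婿", "兒媳"],
    "師生": ["學生", "老師"],
}

# Inverted lookup: relation key -> category.
RELATION_CATEGORY = {
    key: category for category, keys in RELATION_GROUPS.items() for key in keys
}


def relation_category(key: str) -> str:
    """Return the category for a relation key, falling back to 其他 (other)."""
    return RELATION_CATEGORY.get(key, "其他")


def split_person(item: dict):
    """Separate node properties from the fields that become relationships."""
    node = dict(item)  # shallow copy, so the caller's record is left untouched
    node["name"] = node["中文名"]
    school = node.pop("畢業于", [])
    works = node.pop("作品", [])
    related = node.pop("相關人物", {})
    return node, school, works, related


# Placeholder record for illustration only.
record = {"中文名": "張三", "作品": ["某作品"], "相關人物": {"兒子": ["張四"], "朋友": ["李五"]}}
node, school, works, related = split_person(record)
for key, names in related.items():
    print(key, names, relation_category(key))
```

With this arrangement, the loop body needs only one create_node_relations call per relation, passing {'type': relation_category(key)} as the relationship properties, and a new category is added by editing the table rather than adding a branch.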

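Summary

This code shows how to extract data from a JSON file and import it into a Neo4j database. The process consists of reading the JSON file, processing the data, and creating nodes and relationships. Hopefully this article helps you understand how to process JSON data and import it into Neo4j.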
