I want to convert a JSON file I created to a SQLite database.

My intention is to decide later which data container and entry point is best: JSON (data entry via a text editor) or SQLite (data entry via spreadsheet-like GUIs such as SQLiteStudio).

My JSON file looks like this (it contains traffic data from some intersections in my city):

...
"2011-12-17 16:00": {
    "local": "Av. Protásio Alves; esquina Ramiro Barcelos",
    "coord": "-30.036916,-51.208093",
    "sentido": "bairro-centro",
    "veiculos": "automotores",
    "modalidade": "semaforo 50-15",
    "regime": "típico",
    "pistas": "2+c",
    "medicoes": [
        [32, 50],
        [40, 50],
        [29, 50],
        [32, 50],
        [35, 50]
        ]
    },
"2011-12-19 08:38": {
    "local": "R. Fernandes Vieira; esquina Protásio Alves",
    "coord": "-30.035535,-51.211079",
    "sentido": "único",
    "veiculos": "automotores",
    "modalidade": "semáforo 30-70",
    "regime": "típico",
    "pistas": "3",
    "medicoes": [
        [23, 30],
        [32, 30],
        [33, 30],
        [32, 30]
        ]
    }
...

And I have created a nice database with a one-to-many relation using these lines of Python code:

import sqlite3

db = sqlite3.connect("fluxos.sqlite")
c = db.cursor()

c.execute('''create table medicoes
         (timestamp text primary key,
          local text,
          coord text,
          sentido text,
          veiculos text,
          modalidade text,
          pistas text)''')

c.execute('''create table valores
         (id integer primary key,
          quantidade integer,
          tempo integer,
          foreign key (id) references medicoes(timestamp))''')

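But the problem is that when I was about to insert the rows with the actual data, using something like c.execute("insert into medicoes values(?,?,?,?,?,?,?)" % keys), I realized that, since the dict loaded from the JSON file has no particular order, it does not map properly onto the column order of the database.

So I ask: which strategy/method should I use to programmatically read the keys from each "block" in the JSON file (in this case "local", "coord", "sentido", "veiculos", "modalidade", "regime", "pistas" and "medicoes"), create the database with the columns in that same order, and then insert the rows with the proper values?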

I have fair experience with Python but am just beginning with SQL, so I would like some advice on good practices, not necessarily a ready-made recipe.

Recommended answer

You have this Python code:

c.execute("insert into medicoes values(?,?,?,?,?,?,?)" % keys)

which I think should be

c.execute("insert into medicoes values (?,?,?,?,?,?,?)", keys)

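because the % operator expects the string on its left to contain formatting codes, and this string has none. The ? marks are placeholders that sqlite3 fills in from the second argument to execute().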
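To see the difference concretely, here is a small sketch. It is written for Python 3 and uses an in-memory database with a hypothetical two-column table `t` that is not part of your schema. The `%` version fails in Python's string formatting before SQLite is ever involved, while passing the tuple as the second argument lets sqlite3 bind the values:

```python
import sqlite3

db = sqlite3.connect(":memory:")  # throwaway in-memory database
db.execute("create table t (a text, b text)")  # hypothetical demo table

keys = ("x", "y")

# '%' is Python string formatting: the string has no %s codes, so it raises.
try:
    db.execute("insert into t values (?,?)" % keys)
    formatting_error = None
except TypeError as exc:
    formatting_error = exc

# Passing keys as the second argument binds them to the '?' placeholders.
db.execute("insert into t values (?,?)", keys)
rows = db.execute("select a, b from t").fetchall()
```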

Now all you need to make this work is for keys to be a tuple (or list) containing the values for the new row of the medicoes table, in the same order as the table's columns. Consider the following Python code:

import json

traffic = json.load(open('xxx.json'))

columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']
for timestamp, data in traffic.iteritems():
    keys = (timestamp,) + tuple(data[c] for c in columns)
    print str(keys)

When I run this with your sample data, I get:

(u'2011-12-19 08:38', u'R. Fernandes Vieira; esquina Prot\xe1sio Alves', u'-30.035535,-51.211079', u'\xfanico', u'automotores', u'sem\xe1foro 30-70', u'3')
(u'2011-12-17 16:00', u'Av. Prot\xe1sio Alves; esquina Ramiro Barcelos', u'-30.036916,-51.208093', u'bairro-centro', u'automotores', u'semaforo 50-15', u'2+c')

which would seem to be the tuples you require.

You could add the necessary sqlite code with something like this:

import json
import sqlite3

traffic = json.load(open('xxx.json'))
db = sqlite3.connect("fluxos.sqlite")

query = "insert into medicoes values (?,?,?,?,?,?,?)"
columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']
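for timestamp, data in traffic.iteritems():
    # timestamp first, then the values in the same order as the table columns
    keys = (timestamp,) + tuple(data[c] for c in columns)
    c = db.cursor()
    c.execute(query, keys)
    c.close()

db.commit()  # without a commit the inserted rows are not saved
db.close()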
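If you would rather not depend on positional order at all, one alternative is sqlite3's named placeholders, where each `:name` in the query is looked up in a dict. The following is only a minimal sketch, assuming Python 3 (so `items()` instead of `iteritems()`) and an in-memory database, with one record copied from your sample. The `record` variable and the way the query string is assembled are my own scaffolding:

```python
import sqlite3

db = sqlite3.connect(":memory:")  # in-memory stand-in for fluxos.sqlite
db.execute("""create table medicoes
              (timestamp text primary key, local text, coord text,
               sentido text, veiculos text, modalidade text, pistas text)""")

# One entry from the sample data, inlined so the sketch is self-contained.
traffic = {
    "2011-12-19 08:38": {
        "local": "R. Fernandes Vieira; esquina Protásio Alves",
        "coord": "-30.035535,-51.211079",
        "sentido": "único",
        "veiculos": "automotores",
        "modalidade": "semáforo 30-70",
        "regime": "típico",
        "pistas": "3",
        "medicoes": [[23, 30], [32, 30], [33, 30], [32, 30]],
    }
}

columns = ["local", "coord", "sentido", "veiculos", "modalidade", "pistas"]
# Column names and :placeholders are generated from the same list,
# so they cannot drift out of step with each other.
query = "insert into medicoes (timestamp, {0}) values (:timestamp, {1})".format(
    ", ".join(columns), ", ".join(":" + c for c in columns))

with db:  # commits on success, rolls back on error
    for timestamp, data in traffic.items():
        record = {c: data[c] for c in columns}  # drops regime and medicoes
        record["timestamp"] = timestamp
        db.execute(query, record)

row = db.execute("select timestamp, local, pistas from medicoes").fetchone()
```

Because the values are matched by name, the order of the keys in the dict no longer matters.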

Edit: If you don't want to hard-code the list of columns, you could do something like this:

import json

traffic = json.load(open('xxx.json'))

someitem = traffic.itervalues().next()
columns = list(someitem.keys())
print columns

When I run this it prints:

[u'medicoes', u'veiculos', u'coord', u'modalidade', u'sentido', u'local', u'pistas', u'regime']

You could use it with something like this:

import json
import sqlite3

db = sqlite3.connect('fluxos.sqlite')
traffic = json.load(open('xxx.json'))

someitem = traffic.itervalues().next()
columns = list(someitem.keys())
columns.remove('medicoes')
columns.remove('regime')

query = "insert into medicoes (timestamp,{0}) values (?{1})"
query = query.format(",".join(columns), ",?" * len(columns))
print query

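for timestamp, data in traffic.iteritems():
    keys = (timestamp,) + tuple(data[c] for c in columns)
    c = db.cursor()
    c.execute(query, keys)  # pass the values to bind to the placeholders
    c.close()

db.commit()  # without a commit the inserted rows are not saved
db.close()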

When I try this with your sample data, the query printed by this code is:

insert into medicoes (timestamp,veiculos,coord,modalidade,sentido,local,pistas) values (?,?,?,?,?,?,?)
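The code so far only fills the medicoes table. For the nested "medicoes" lists, one possible approach is a child table with its own foreign-key column: in your schema `id` is both the integer primary key and the foreign key, so it cannot point many rows at one timestamp. The sketch below is only an illustration, assuming Python 3, an in-memory database, a reduced parent table, and a hypothetical column name `medicao_ts`. I am also assuming each pair means (quantidade, tempo), following your column order:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("pragma foreign_keys = on")  # SQLite does not enforce FKs by default
db.execute("create table medicoes (timestamp text primary key, local text)")
db.execute("""create table valores
              (id integer primary key,
               medicao_ts text,          -- hypothetical FK column name
               quantidade integer,
               tempo integer,
               foreign key (medicao_ts) references medicoes(timestamp))""")

# One entry from the sample data, reduced to the fields used here.
traffic = {
    "2011-12-19 08:38": {
        "local": "R. Fernandes Vieira; esquina Protásio Alves",
        "medicoes": [[23, 30], [32, 30], [33, 30], [32, 30]],
    }
}

with db:  # commits on success, rolls back on error
    for timestamp, data in traffic.items():
        db.execute("insert into medicoes values (?, ?)",
                   (timestamp, data["local"]))
        # One child row per [quantidade, tempo] pair (assumed meaning).
        db.executemany(
            "insert into valores (medicao_ts, quantidade, tempo) values (?, ?, ?)",
            [(timestamp, q, t) for q, t in data["medicoes"]])

count = db.execute("select count(*) from valores").fetchone()[0]
total = db.execute("select sum(quantidade) from valores").fetchone()[0]
```

Leaving `id` out of the insert lets SQLite assign it automatically, so each measurement gets its own row while still pointing back at its parent timestamp.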
