Here is the full table (the code to create it is at the end):
query = """
SELECT * FROM businesses
"""
df = pd.read_sql_query(query, conn)
df
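I want a single SQL query that finds the top two most popular cities, counting only owners who have at least 2 shops. The expected output is London and Leicester.
So far I know how to group by city to find the top two cities: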
# Top 2 popular cities
query1 = """
SELECT city, COUNT(*) as frequency
FROM businesses
GROUP BY city
ORDER BY frequency DESC
LIMIT 2
"""
df = pd.read_sql_query(query1, conn)
df
And how to filter for owners who have at least 2 shops:
# Owners who have at least 2 shops
query2 = """
SELECT owner, COUNT(*) AS count
FROM businesses
GROUP BY owner
HAVING count >= 2
ORDER BY count DESC
"""
df = pd.read_sql_query(query2, conn)
df
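But I don't know how to combine query 1 and query 2 into a single query.
I'd really appreciate any help with this. I'm trying to teach myself SQL :)
Here is the code to create the database, if you'd like to follow along: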
# import libraries
import pandas as pd
import sqlite3
# create database
conn = sqlite3.connect("my_db.db")
# create table
cur = conn.cursor()
cur.execute("DROP TABLE IF EXISTS businesses;")
query = """
CREATE TABLE IF NOT EXISTS businesses (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
city TEXT NOT NULL,
owner TEXT NOT NULL
)
"""
cur.execute(query)
conn.commit()
# add rows to table
query = """
INSERT INTO businesses
(id, name, city, owner)
VALUES
(1, 'Shop A', 'London', 'Tom'),
(2, 'Shop B', 'London', 'Tom'),
(3, 'Shop C', 'London', 'Tom'),
(4, 'Shop D', 'Luton', 'Alice'),
(5, 'Shop E', 'Leeds', 'Jenny'),
(6, 'Shop F', 'Leicester', 'James'),
(7, 'Shop G', 'Leicester', 'James'),
(8, 'Shop H', 'Leicester', 'Emma'),
(9, 'Shop I', 'Leicester', 'Emma'),
(10, 'Shop J', 'Liverpool', 'James'),
(11, 'Shop K', 'Liverpool', 'James'),
(12, 'Shop L', 'Liverpool', 'George'),
(13, 'Shop M', 'Shefield', 'Mary'),
(14, 'Shop N', 'Shefield', 'Mary'),
(15, 'Shop O', 'Cambridge', 'Oliver'),
(16, 'Shop P', 'Manchester', 'Harry')
"""
cur.execute(query)
conn.commit()
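One possible way to combine query 1 and query 2 is to use query 2 as a subquery in a `WHERE owner IN (...)` filter and then apply the grouping from query 1 to the remaining rows. This is only a rough sketch, not a definitive answer. It assumes that "popular" means the number of shops in a city that belong to owners with at least 2 shops overall. The in-memory database, the `combined_query` and `top_cities` names, and the extra `city` tie-breaker in `ORDER BY` are my own additions for illustration.

```python
import sqlite3

# Rebuild the sample table in memory so this sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE businesses ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "city TEXT NOT NULL, owner TEXT NOT NULL)"
)
rows = [
    (1, "Shop A", "London", "Tom"),
    (2, "Shop B", "London", "Tom"),
    (3, "Shop C", "London", "Tom"),
    (4, "Shop D", "Luton", "Alice"),
    (5, "Shop E", "Leeds", "Jenny"),
    (6, "Shop F", "Leicester", "James"),
    (7, "Shop G", "Leicester", "James"),
    (8, "Shop H", "Leicester", "Emma"),
    (9, "Shop I", "Leicester", "Emma"),
    (10, "Shop J", "Liverpool", "James"),
    (11, "Shop K", "Liverpool", "James"),
    (12, "Shop L", "Liverpool", "George"),
    (13, "Shop M", "Shefield", "Mary"),
    (14, "Shop N", "Shefield", "Mary"),
    (15, "Shop O", "Cambridge", "Oliver"),
    (16, "Shop P", "Manchester", "Harry"),
]
conn.executemany("INSERT INTO businesses VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Inner query: owners with at least 2 shops (same idea as query 2).
# Outer query: count the remaining shops per city (same idea as query 1).
combined_query = """
SELECT city, COUNT(*) AS frequency
FROM businesses
WHERE owner IN (
    SELECT owner
    FROM businesses
    GROUP BY owner
    HAVING COUNT(*) >= 2
)
GROUP BY city
ORDER BY frequency DESC, city  -- city is an assumed tie-breaker
LIMIT 2
"""
top_cities = conn.execute(combined_query).fetchall()
print(top_cities)
```

On the sample data this keeps only the shops owned by Tom, James, Emma and Mary, so Leicester (4 shops) and London (3 shops) come out on top. Passing `combined_query` to `pd.read_sql_query(combined_query, conn)`, as in the snippets above, should return the same rows as a DataFrame.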