SQL PROJECT CLASSIC MODELS DATABASE¶
The project revolves around solving critical business queries using a comprehensive MySQL sample database schema. This schema includes essential tables like Customers, Products, ProductLines, Orders, OrderDetails, Payments, Employees, and Offices. By harnessing the power of SQL, we aim to extract valuable insights, facilitate data-driven decision-making, and enhance overall business performance.
The MySQL sample database schema consists of the following tables:
- Customers: stores customer’s data.
- Products: stores a list of scale model cars.
- ProductLines: stores a list of product line categories.
- Orders: stores sales orders placed by customers.
- OrderDetails: stores sales order line items for each sales order.
- Payments: stores payments made by customers based on their accounts.
- Employees: stores all employee information as well as the organization structure such as who reports to whom.
- Offices: stores sales office data.
In case you want to use this schema use following credentials:
- hostname: relational.fit.cvut.cz
- port: 3306
- username: guest
- password: relational
In [ ]:
# Import library
import mysql.connector
from mysql.connector import Error
from sqlalchemy import create_engine
from sqlalchemy.orm import aliased
import pandas as pd
# Connect to MySQL Database example from
engine = create_engine("mysql+mysqlconnector://guest:relational@relational.fit.cvut.cz/classicmodels")
connection = engine.connect()
How many unique customers does Classic Models have ?¶
In [ ]:
query= (""" SELECT COUNT(DISTINCT(customerName))
FROM customers
""")
sql_df = pd.read_sql(query,con=engine)
sql_df.columns = ['Unique_Customers']
sql_df
Out[ ]:
Unique_Customers | |
---|---|
0 | 122 |
In [ ]:
query= (""" SELECT customerNumber, customerName , city, country , creditLimit
FROM customers
WHERE creditLimit > 100000
""")
sql_df = pd.read_sql(query,con=engine)
print(f"{len(sql_df)} customers have a credit limit higher than 100.000.")
sql_df
25 customers have a credit limit higher than 100.000.
Out[ ]:
customerNumber | customerName | city | country | creditLimit | |
---|---|---|---|---|---|
0 | 114 | Australian Collectors, Co. | Melbourne | Australia | 117300.0 |
1 | 119 | La Rochelle Gifts | Nantes | France | 118200.0 |
2 | 124 | Mini Gifts Distributors Ltd. | San Rafael | USA | 210500.0 |
3 | 131 | Land of Toys Inc. | NYC | USA | 114900.0 |
4 | 141 | Euro+ Shopping Channel | Madrid | Spain | 227600.0 |
5 | 146 | Saveley & Henriot, Co. | Lyon | France | 123900.0 |
6 | 148 | Dragon Souveniers, Ltd. | Singapore | Singapore | 103800.0 |
7 | 151 | Muscle Machine Inc | NYC | USA | 138500.0 |
8 | 157 | Diecast Classics Inc. | Allentown | USA | 100600.0 |
9 | 187 | AV Stores, Co. | Manchester | UK | 136800.0 |
10 | 227 | Heintze Collectables | Århus | Denmark | 120800.0 |
11 | 239 | Collectable Mini Designs Co. | San Diego | USA | 105000.0 |
12 | 249 | Amica Models & Co. | Torino | Italy | 113000.0 |
13 | 259 | Toms Spezialitäten, Ltd | Köln | Germany | 120400.0 |
14 | 276 | Anna's Decorations, Ltd | North Sydney | Australia | 107800.0 |
15 | 278 | Rovelli Gifts | Bergamo | Italy | 119600.0 |
16 | 286 | Marta's Replicas Co. | Cambridge | USA | 123700.0 |
17 | 298 | Vida Sport, Ltd | Genève | Switzerland | 141300.0 |
18 | 319 | Mini Classics | White Plains | USA | 102700.0 |
19 | 321 | Corporate Gift Ideas Co. | San Francisco | USA | 105000.0 |
20 | 363 | Online Diecast Creations Co. | Nashua | USA | 114200.0 |
21 | 386 | L'ordine Souveniers | Reggio Emilia | Italy | 121400.0 |
22 | 448 | Scandinavian Gift Ideas | Bräcke | Sweden | 116400.0 |
23 | 458 | Corrida Auto Replicas, Ltd | Madrid | Spain | 104600.0 |
24 | 496 | Kelly's Gift Shop | Auckland | New Zealand | 110000.0 |
What are the customers with phone number from Singapure (Code +65)¶
In [ ]:
query= (""" SELECT *
FROM customers
WHERE CAST(phone as UNSIGNED) = +65
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
customerNumber | customerName | contactLastName | contactFirstName | phone | addressLine1 | addressLine2 | city | state | postalCode | country | salesRepEmployeeNumber | creditLimit | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 148 | Dragon Souveniers, Ltd. | Natividad | Eric | +65 221 7555 | Bronz Sok. | Bronz Apt. 3/6 Tesvikiye | Singapore | None | 079903 | Singapore | 1621 | 103800.0 |
1 | 166 | Handji Gifts& Co | Victorino | Wendy | +65 224 1555 | 106 Linden Road Sandown | 2nd Floor | Singapore | None | 069045 | Singapore | 1612 | 97900.0 |
What is the total value of payments for each customer?¶
In [ ]:
query= (""" SELECT customerName, SUM(amount) as totalPayment
FROM payments
INNER JOIN customers
USING (customerNumber)
GROUP BY customerName
ORDER BY totalPayment DESC
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
customerName | totalPayment | |
---|---|---|
0 | Euro+ Shopping Channel | 715738.98 |
1 | Mini Gifts Distributors Ltd. | 584188.24 |
2 | Australian Collectors, Co. | 180585.07 |
3 | Muscle Machine Inc | 177913.95 |
4 | Dragon Souveniers, Ltd. | 156251.03 |
... | ... | ... |
93 | Royale Belge | 29217.18 |
94 | Frau da Collezione | 25358.32 |
95 | Atelier graphique | 22314.36 |
96 | Auto-Moto Classics Inc. | 21554.26 |
97 | Boards & Toys Co. | 7918.60 |
98 rows × 2 columns
How much has each product line sold? Include total sales , quianty of products and averag per order¶
In [ ]:
pd.set_option('display.max_colwidth', None)# Display full description pandas dataframe
query= (""" SELECT productLine, ROUND(SUM(quantityOrdered * priceEach),2) AS sales ,
SUM(quantityOrdered) AS quantity, ROUND(AVG(quantityOrdered * priceEach),2) as avgpersale,
textdescription
FROM products
INNER JOIN orderdetails
USING (productCode)
INNER JOIN productlines
USING (productLine)
GROUP BY productLine,textdescription
ORDER BY sales DESC
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
productLine | sales | quantity | avgpersale | textdescription | |
---|---|---|---|---|---|
0 | Classic Cars | 3853922.49 | 35582.0 | 3815.76 | Attention car enthusiasts: Make your wildest car ownership dreams come true. Whether you are looking for classic muscle cars, dream sports cars or movie-inspired miniatures, you will find great choices in this category. These replicas feature superb attention to detail and craftsmanship and offer features such as working steering system, opening forward compartment, opening rear trunk with removable spare wheel, 4-wheel independent spring suspension, and so on. The models range in size from 1:10 to 1:24 scale and include numerous limited edition and several out-of-production vehicles. All models include a certificate of authenticity from their manufacturers and come fully assembled and ready for display in the home or office. |
1 | Vintage Cars | 1797559.63 | 22933.0 | 2736.01 | Our Vintage Car models realistically portray automobiles produced from the early 1900s through the 1940s. Materials used include Bakelite, diecast, plastic and wood. Most of the replicas are in the 1:18 and 1:24 scale sizes, which provide the optimum in detail and accuracy. Prices range from $30.00 up to $180.00 for some special limited edition replicas. All models include a certificate of authenticity from their manufacturers and come fully assembled and ready for display in the home or office. |
2 | Motorcycles | 1121426.12 | 12778.0 | 3123.75 | Our motorcycles are state of the art replicas of classic as well as contemporary motorcycle legends such as Harley Davidson, Ducati and Vespa. Models contain stunning details such as official logos, rotating wheels, working kickstand, front suspension, gear-shift lever, footbrake lever, and drive chain. Materials used include diecast and plastic. The models range in size from 1:10 to 1:50 scale and include numerous limited edition and several out-of-production vehicles. All models come fully assembled and ready for display in the home or office. Most include a certificate of authenticity. |
3 | Trucks and Buses | 1024113.57 | 11001.0 | 3325.04 | The Truck and Bus models are realistic replicas of buses and specialized trucks produced from the early 1920s to present. The models range in size from 1:12 to 1:50 scale and include numerous limited edition and several out-of-production vehicles. Materials used include tin, diecast and plastic. All models include a certificate of authenticity from their manufacturers and are a perfect ornament for the home and office. |
4 | Planes | 954637.54 | 11872.0 | 2841.18 | Unique, diecast airplane and helicopter replicas suitable for collections, as well as home, office or classroom decorations. Models contain stunning details such as official logos and insignias, rotating jet engines and propellers, retractable wheels, and so on. Most come fully assembled and with a certificate of authenticity from their manufacturers. |
5 | Ships | 663998.34 | 8532.0 | 2710.20 | The perfect holiday or anniversary gift for executives, clients, friends, and family. These handcrafted model ships are unique, stunning works of art that will be treasured for generations! They come fully assembled and ready for display in the home or office. We guarantee the highest quality, and best value. |
6 | Trains | 188532.92 | 2818.0 | 2327.57 | Model trains are a rewarding hobby for enthusiasts of all ages. Whether you're looking for collectible wooden trains, electric streetcars or locomotives, you'll find a number of great choices for any budget within this category. The interactive aspect of trains makes toy trains perfect for young children. The wooden train sets are ideal for children under the age of 5. |
What are the top 10 products sold?¶
In [ ]:
query= (""" SELECT productName , productLine, productScale, ROUND(SUM(quantityOrdered * priceEach),1) as sales
FROM orderdetails
INNER JOIN products
USING (productCode)
GROUP BY productName , productLine, productScale
ORDER BY sales DESC
LIMIT 10
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
productName | productLine | productScale | sales | |
---|---|---|---|---|
0 | 1992 Ferrari 360 Spider red | Classic Cars | 1:18 | 276840.0 |
1 | 2001 Ferrari Enzo | Classic Cars | 1:12 | 190755.9 |
2 | 1952 Alpine Renault 1300 | Classic Cars | 1:10 | 190018.0 |
3 | 2003 Harley-Davidson Eagle Drag Bike | Motorcycles | 1:10 | 170686.0 |
4 | 1968 Ford Mustang | Classic Cars | 1:12 | 161531.5 |
5 | 1969 Ford Falcon | Classic Cars | 1:12 | 152543.0 |
6 | 1980s Black Hawk Helicopter | Planes | 1:18 | 144959.9 |
7 | 1998 Chrysler Plymouth Prowler | Classic Cars | 1:18 | 142530.6 |
8 | 1917 Grand Touring Sedan | Vintage Cars | 1:18 | 140535.6 |
9 | 2002 Suzuki XREO | Motorcycles | 1:12 | 135767.0 |
How many days does it take from the time the order is placed to the time it is shipped?¶
In [ ]:
query= (""" SELECT AVG(DATEDIFF(shippedDate,orderDate)) as avgDays
FROM orders
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
avgDays | |
---|---|
0 | 3.7564 |
What is the total sales per order status?¶
In [ ]:
query= (""" SELECT status,SUM(quantityOrdered * priceEach) AS totalSales,SUM(quantityOrdered) as Quantity
FROM orders
INNER JOIN orderdetails
USING (orderNumber)
GROUP BY status
ORDER BY totalSales DESC
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
status | totalSales | Quantity | |
---|---|---|---|
0 | Shipped | 8865094.64 | 97141.0 |
1 | Cancelled | 238854.18 | 2634.0 |
2 | On Hold | 169575.61 | 1994.0 |
3 | In Process | 135271.52 | 1490.0 |
4 | Resolved | 134235.88 | 1660.0 |
5 | Disputed | 61158.78 | 597.0 |
What are the top 5 Customers with more cancelled orders ?¶
In [ ]:
query= (""" SELECT customerName,status,COUNT(orderNumber) as cancelledOrders,ROUND(SUM(quantityOrdered * priceEach),2) AS totalSales
FROM orders
INNER JOIN customers
USING (customerNumber)
INNER JOIN orderdetails
USING (orderNumber)
GROUP BY customerName,status
HAVING status = "Cancelled"
ORDER BY cancelledOrders DESC
LIMIT 5
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
customerName | status | cancelledOrders | totalSales | |
---|---|---|---|---|
0 | Euro+ Shopping Channel | Cancelled | 16 | 47065.36 |
1 | Scandinavian Gift Ideas | Cancelled | 16 | 44167.09 |
2 | UK Collectables, Ltd. | Cancelled | 14 | 45443.54 |
3 | Land of Toys Inc. | Cancelled | 14 | 41445.21 |
4 | GiftsForHim.com | Cancelled | 10 | 37769.38 |
In [ ]:
query= (""" SELECT orderNumber,customerName, requiredDate, shippedDate ,
CASE
WHEN DATEDIFF(shippedDate,requiredDate) >= 0 then 'Delayed'
ELSE 'Ontime'
END as statusDiff
FROM orders
INNER JOIN customers
USING (customerNumber)
HAVING statusDiff = "Delayed"
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
orderNumber | customerName | requiredDate | shippedDate | statusDiff | |
---|---|---|---|---|---|
0 | 10121 | Reims Collectables | 2003-05-13 | 2003-05-13 | Delayed |
1 | 10160 | Men 'R' US Retailers, Ltd. | 2003-10-17 | 2003-10-17 | Delayed |
2 | 10165 | Dragon Souveniers, Ltd. | 2003-10-31 | 2003-12-26 | Delayed |
3 | 10240 | Osaka Souveniers Co. | 2004-04-20 | 2004-04-20 | Delayed |
4 | 10251 | Tekni Collectables Inc. | 2004-05-24 | 2004-05-24 | Delayed |
5 | 10331 | Motor Mint Distributors Inc. | 2004-11-23 | 2004-11-23 | Delayed |
6 | 10339 | Tokyo Collectables, Ltd | 2004-11-30 | 2004-11-30 | Delayed |
7 | 10358 | Euro+ Shopping Channel | 2004-12-16 | 2004-12-16 | Delayed |
8 | 10417 | Euro+ Shopping Channel | 2005-05-19 | 2005-05-19 | Delayed |
What are the top 10 highest order ?¶
In [ ]:
query= (""" SELECT customerName,orderDate,orderNumber,SUM(quantityOrdered * priceEach) AS totalSales
FROM orders
INNER JOIN orderdetails
USING (orderNumber)
INNER JOIN customers
USING (customerNumber)
GROUP BY orderNumber,orderDate,customerName
ORDER BY totalSales DESC
LIMIT 10
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
customerName | orderDate | orderNumber | totalSales | |
---|---|---|---|---|
0 | Dragon Souveniers, Ltd. | 2003-10-22 | 10165 | 67392.85 |
1 | Vida Sport, Ltd | 2004-08-30 | 10287 | 61402.00 |
2 | Toms Spezialitäten, Ltd | 2004-10-16 | 10310 | 61234.67 |
3 | Euro+ Shopping Channel | 2004-01-16 | 10212 | 59830.55 |
4 | Diecast Collectables | 2003-12-09 | 10207 | 59265.14 |
5 | Muscle Machine Inc | 2003-06-03 | 10127 | 58841.35 |
6 | Muscle Machine Inc | 2003-12-02 | 10204 | 58793.53 |
7 | Corrida Auto Replicas, Ltd | 2003-05-28 | 10126 | 57131.92 |
8 | Collectable Mini Designs Co. | 2004-02-19 | 10222 | 56822.65 |
9 | Mini Gifts Distributors Ltd. | 2003-08-08 | 10142 | 56052.56 |
Who is the employee with more sales ?¶
In [ ]:
query= (""" SELECT CONCAT(firstName," ",lastName) AS fullname,
offices.city,offices.phone,ROUND(SUM(quantityOrdered * priceEach),2) AS totalSales
FROM employees
INNER JOIN offices
USING (officecode)
INNER JOIN customers
ON employees.employeeNumber=customers.salesRepEmployeeNumber
INNER JOIN orders
USING (customerNumber)
INNER JOIN orderdetails
USING (orderNumber)
GROUP BY fullname, offices.city,offices.phone
ORDER BY totalSales DESC
""")
sql_df = pd.read_sql(query,con=engine)
sql_df
Out[ ]:
fullname | city | phone | totalSales | |
---|---|---|---|---|
0 | Gerard Hernandez | Paris | +33 14 723 4404 | 1258577.81 |
1 | Leslie Jennings | San Francisco | +1 650 219 4782 | 1081530.54 |
2 | Pamela Castillo | Paris | +33 14 723 4404 | 868220.55 |
3 | Larry Bott | London | +44 20 7877 2041 | 732096.79 |
4 | Barry Jones | London | +44 20 7877 2041 | 704853.91 |
5 | George Vanauf | NYC | +1 212 555 3000 | 669377.05 |
6 | Peter Marsh | Sydney | +61 2 9264 2451 | 584593.76 |
7 | Loui Bondur | Paris | +33 14 723 4404 | 569485.75 |
8 | Andy Fixter | Sydney | +61 2 9264 2451 | 562582.59 |
9 | Steve Patterson | Boston | +1 215 837 0825 | 505875.42 |
10 | Foon Yue Tseng | NYC | +1 212 555 3000 | 488212.67 |
11 | Mami Nishi | Tokyo | +81 33 224 5000 | 457110.07 |
12 | Martin Gerard | Paris | +33 14 723 4404 | 387477.47 |
13 | Julie Firrelli | Boston | +1 215 837 0825 | 386663.20 |
14 | Leslie Thompson | San Francisco | +1 650 219 4782 | 347533.03 |