Database Design and Analysis. Use the Accidents_2016.csv file below to implement the following steps for your database project.

I need an explanation for this MySQL question to help me study.

  1. Create a MySQL schema named “accidents.”
  2. Within the accidents schema, create a table named accidents_2016 with the following columns:
    • accident_index as a varchar(13),
    • accident_severity as an int
  3. Within the accidents schema, create a table named vehicles_2016 with the following columns:
    • accident_index as a varchar(13),
    • vehicle_type as a varchar(10)
  4. Within the accidents schema, create a table named vehicle_type with the following columns:
    • vcode int,
    • vtype as a varchar(100)
  5. Next, you will load the data for the three tables.
    • Load the accidents data. Note that @dummy is a placeholder for a column in the .csv file that you want to ignore during the load.
load data local infile '…\data\Accidents_2016.csv'
into table accidents_2016
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
ignore 1 lines
(@col1, @dummy, @dummy, @dummy, @dummy, @dummy, @col2
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
)
set accident_index=@col1,accident_severity=@col2;
    • Load the vehicle data.
load data local infile '…\data\Vehicles_2016.csv'
into table vehicles_2016
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
ignore 1 lines
(@col1, @dummy, @dummy, @col2
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
,@dummy, @dummy, @dummy, @dummy, @dummy
)
set accident_index=@col1,vehicle_type=@col2;
    • Load the vehicle type data.
load data local infile '…\data\vehicle_type.csv'
into table vehicle_type
fields terminated by ','
enclosed by '"'
lines terminated by '\n'
ignore 1 lines;
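
The long @dummy lists in the load statements above are easy to miscount by hand. As a minimal sketch, the helper below generates such a list from the total number of CSV fields and the positions to keep. The function name build_column_list and its parameters are hypothetical, and the field counts (32 and 24) are simply the number of entries in the lists above, not verified against the CSV files.

```python
def build_column_list(total_columns, keep):
    """Return the parenthesized user-variable list for a LOAD DATA statement.

    total_columns: number of fields in each CSV row.
    keep: dict mapping a 1-based field position to a user variable name.
    """
    names = []
    for position in range(1, total_columns + 1):
        # Fields we want get their own variable; every other field goes to @dummy.
        names.append("@" + keep.get(position, "dummy"))
    return "(" + ", ".join(names) + ")"


# Same shape as the accidents load above: 32 entries, fields 1 and 7 kept.
accidents_columns = build_column_list(32, {1: "col1", 7: "col2"})
# Same shape as the vehicles load above: 24 entries, fields 1 and 4 kept.
vehicles_columns = build_column_list(24, {1: "col1", 4: "col2"})

print(accidents_columns)
print(vehicles_columns)
```

The printed text can be pasted between "ignore 1 lines" and the "set" clause of the corresponding statement.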
  6. After the data are loaded, you will perform the analysis. First, find the average accident severity and the number of accidents for vehicles of type motorcycle. Note the performance of your query. Your query may run so slowly that MySQL aborts it before it completes.
  7. Improve Query Performance
    • Look at the explain tool output and save the results to a graphic file.
    • From the explain results, how many rows have to be read per join?
    • Add an index named “accident_index” of type “index” on the accident_index column in the accidents_2016 table and another index named “accident_index” of type “index” on the accident_index column in the vehicles_2016 table.
alter table accidents_2016
add index accident_index (accident_index asc);
alter table vehicles_2016
add index accident_index (accident_index asc);

After adding the indices, rerun the query explanation tool and determine the number of rows to be read per join.
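
The motorcycle analysis query and the before/after comparison of the query plan can be tried out on a toy dataset without a MySQL server. The sketch below makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the rows are invented for illustration, the query is one possible way to write the analysis (the LIKE pattern is borrowed from the Python script later in this assignment), and the two indexes get different names because SQLite, unlike MySQL, requires index names to be unique across tables. SQLite's EXPLAIN QUERY PLAN output also differs from MySQL's explain output and does not report rows read per join, so it only shows whether an index is used.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL "accidents" schema.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE accidents_2016 (accident_index VARCHAR(13), accident_severity INT);
    CREATE TABLE vehicles_2016  (accident_index VARCHAR(13), vehicle_type VARCHAR(10));
    CREATE TABLE vehicle_type   (vcode INT, vtype VARCHAR(100));
""")

# Hypothetical toy rows, not taken from the real CSV files.
cur.executemany("INSERT INTO accidents_2016 VALUES (?, ?)",
                [("A1", 3), ("A2", 2), ("A3", 1), ("A4", 3)])
cur.executemany("INSERT INTO vehicles_2016 VALUES (?, ?)",
                [("A1", "3"), ("A2", "3"), ("A3", "9"), ("A4", "3")])
cur.executemany("INSERT INTO vehicle_type VALUES (?, ?)",
                [(3, "Motorcycle 125cc and under"), (9, "Car")])

# COUNT(*) counts joined vehicle rows; an accident involving two
# motorcycles would therefore be counted twice.
ANALYSIS_SQL = """
    SELECT AVG(a.accident_severity), COUNT(*)
    FROM accidents_2016 AS a
    JOIN vehicles_2016 AS v ON a.accident_index = v.accident_index
    JOIN vehicle_type AS t ON v.vehicle_type = t.vcode
    WHERE t.vtype LIKE '%otorcycle%'
"""


def show_plan(label):
    """Print SQLite's query plan for the analysis query."""
    print(label)
    for row in cur.execute("EXPLAIN QUERY PLAN " + ANALYSIS_SQL):
        print("   ", row[-1])  # the last column holds the plan description


show_plan("Plan without indexes:")
avg_before, count_before = cur.execute(ANALYSIS_SQL).fetchone()

# Counterparts of the two "alter table ... add index" statements.
cur.execute("CREATE INDEX accidents_accident_index "
            "ON accidents_2016 (accident_index ASC)")
cur.execute("CREATE INDEX vehicles_accident_index "
            "ON vehicles_2016 (accident_index ASC)")

show_plan("Plan with indexes:")
avg_after, count_after = cur.execute(ANALYSIS_SQL).fetchone()
print("average severity:", avg_after, "rows counted:", count_after)
conn.close()
```

The indexes change how the rows are located, not which rows match, so the query result is the same before and after. On the real MySQL tables, the number to compare is the rows estimate that the explain tool reports for each joined table.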

  8. Find the median accident severity.

MySQL does not have a built-in median function, so to find the median accident severity you will have to write a Python script.

  • You’ll need to install Python and the PyMySQL module.
  • Install Python version 2.7 or 3.4 from www.python.org.

To install the PyMySQL module, run the following command in a Windows command prompt after Python has been installed:

python -m pip install  --index-url=https://pypi.python.org/simple/ --trusted-host pypi.python.org PyMySQL

  • Create an accident median table:

create table accident_medians
(
       vtype varchar(100),
       severity int
);
  • Run the following Python script:
import pymysql

# Connect to the accidents schema (replace the masked credentials).
myConnection = pymysql.connect(host='localhost', user='****', passwd='****', db='accidents')
cur = myConnection.cursor()

# Collect every vehicle type whose name mentions a motorcycle.
cur.execute('SELECT vtype FROM vehicle_type WHERE vtype LIKE "%otorcycle%";')
cycleList = cur.fetchall()

# Severities for one vehicle type, sorted so the median can be read by position.
selectSQL = ('''
    SELECT t.vtype, a.accident_severity
    FROM accidents_2016 AS a
    JOIN vehicles_2016 AS v ON a.accident_index = v.accident_index
    JOIN vehicle_type AS t ON v.vehicle_type = t.vcode
    WHERE t.vtype LIKE %s
    ORDER BY a.accident_severity;''')
insertSQL = ('''INSERT INTO accident_medians VALUES (%s, %s);''')

for cycle in cycleList:
    cur.execute(selectSQL, (cycle[0],))
    accidents = cur.fetchall()
    if not accidents:
        continue  # no accidents for this vehicle type, so no median
    quotient, remainder = divmod(len(accidents), 2)
    if remainder:
        # Odd number of rows: the median is the middle row.
        med_sev = accidents[quotient][1]
    else:
        # Even number of rows: average the two middle rows.
        med_sev = (accidents[quotient - 1][1] + accidents[quotient][1]) / 2
    print('Finding median for', cycle[0])
    cur.execute(insertSQL, (cycle[0], med_sev))

myConnection.commit()
myConnection.close()
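
The positional median logic in the script can be checked on its own, without a database. The sketch below is only an illustration: the function name median_of_sorted and the sample lists are hypothetical, and the result is compared against statistics.median from the Python 3 standard library.

```python
import statistics


def median_of_sorted(values):
    """Return the median of an already sorted, non-empty list of numbers."""
    quotient, remainder = divmod(len(values), 2)
    if remainder:
        # Odd length: the single middle element.
        return values[quotient]
    # Even length: mean of the two elements around the midpoint.
    return (values[quotient - 1] + values[quotient]) / 2


odd_case = [1, 2, 3, 3, 3]   # five sorted severities
even_case = [1, 2, 3, 3]     # four sorted severities

print(median_of_sorted(odd_case), statistics.median(odd_case))
print(median_of_sorted(even_case), statistics.median(even_case))
```

For an even number of rows the median can be fractional. Because the severity column in accident_medians is declared as int, such a value would not be stored exactly.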

Write each query you used in Steps 1 – 8 in a text file. If a query produced a result set, list the first ten rows of that result set after the query.