Skip to content

Re-Exam MT4007 Feb 16, 2024

READ THE Overview BEFORE BEGINING

Overview

The exam should be handed in just as you have done during the homework. Create a sub-directory in your github repository and name it exam. To complete the exam the code and the final .md file should be uploaded to your repo. Finally, an issue should be raised when you hand in, no later than 19.00. You will automatically fail if you hand in later than that. Make sure that your answers are short and concise.

All the data required for the exam can be found in the data repo under the sub-directory exam_data.

Permitted Tools

You are NOT allowed to collaborate with others.

All the packages taught in the course are permitted. If you use exotic packages you might need to explain the methods you have used orally.

Tools such as Chat-GPT are allowed as long as you understand what has been used. Copy-pasting things that you do not understand is NOT allowed. Be careful using Chat-GPT in the theoretical part, it provides faulty answers on these questions.

If I suspect that you have blatantly copied answers from the internet without understanding, you will have to perform an oral exam.

Help

I will be available on Zoom for questions during the following times:

  • 14.30-15.00
  • 16.30-17.00

Besides these times you can reach me via email.

Grading

To pass the exam you need at least 5 points in the theoretical part and 10 points in practical part. Otherwise that the grading is as follows:

GradeFEDCBA
Points0-1415-1718-2021-2324-2627-30

Theoretical Part

1. SQL (5p)

Imagine a database containing a table Products with columns ProductId, Name, Price Category. Write an SQL query to calculate the average grade for each product category. Explain each part of your query thoroughly.

2. REST APIs (5p)

How is a typical REST request structured? Provide an example of a request to a server for user authentication (Log in). Explain each part of the request.

Practical Part

In this part you will be tested on the practical tools you have learnt in this course. You will be asked to analyse, wrangle and visualise data. Note that the requested figures do NOT have to be exactly match.

If data entry errors(missing-values, duplicates and so on) arise you can remove them from the data unless the question specifically ask for you to deal with them. Make sure to explain what you have done.

3. COVID-19 (10p)

Your task is to study the mortality rate of COVID-19 in the world. In this scenario we do not have access to data so we need to turn to internet to find the relevant data.

  • Make a GET request to the following API https://mt4007-ht23.github.io/data/covid.json. Which contains COVID-19 data for 2022. The data is in JSON format. Generate the following table. (3p)

If you are unable to get the data using REST, visit the link and copy the data manually into a .json file.

country-isoconfirmeddeaths
AFG2083247872
AGO1051841931
ALB3341353596
AND47820165
ARE10490242348
  • To make a meaningful analysis we need population data across the world. Webscrape the following link for population data. The year 2022 is enough. Furthermore, generate a similar table as below. (4p)

    If you are unable to webscrape the data, you can find the file population-2022.csv containing the same data.

Country NameCountry Code2022
ArubaABW107697
AfghanistanAFG40683694
AngolaAGO34907095
AlbaniaALB2812325
AndorraAND77469
  • Visualise the total number of cases per 100 000 inhabitant and per country. Choose a suitable plot. (3p)

4. Data Storage (10p)

We now need to store the data we have retrieved in the previous assignment. For this assignment, you should complete each subtask using only SQL.

  • Create 3 tables, COUNTRY-ISO-MAP, COVID-CASES, POPULATION-DATA. Choose suitable variables for each table and link them in a reasonable way. (4p)

  • Populate the tables with the data. (2p)

  • Retrieve data using SQL and generate a plot that illustrates the effect of covid in each country. (4p)