Homework 5
Instructions
In this assigment, you will practice your skills in both SQL and Regex. The scenario involves user activity logs and user comments. The goal is to perform data analysis tasks using SQL queries and extract relevant information from user comments using Regex.
Make sure of the following:
- Your solutions should be in form of a report in .md format.
- You have documented your procedure properly.
- Your answers are clear and concise.
When you are finished push you results to Github and raise an issue, just as you have done in previous homeworks. To pass the homework you will have to complete the assigments below and also finish the peer-review.
Feel free to contact me if anything is unclear.
SQL
The file user_actions.db
in the data repo contains an SQLite database. In the database there is a table named user_actions
. Analyse the table and solve the following tasks.
- Retrieve the usernames of all users who have performed the "signup" action.
- Find the total number of log entries for each user. Display the user_id, username, and the count of log entries.
- Identify users who have both logged in (action = 'login') and signed up (action = 'signup') on the same day. Display the user_id and username.
Each task should be solved only using SQL.
Regex
The file comments.txt
in the data repo contains lines of text, each representing a user comment. Users sometimes include tags in their comments using the format "#tag". Analyse the file and solve the following tasks.
- Write a regular expression to extract all hashtags from a given comment. For example, applying the regex to comment 1 should return ["#programming", "#tips"].
- Create a regular expression to find comments that mention both "#programming" and "#python". Apply the regex to comment 2 and check if it matches.
- Using your regular expression, extract all unique hashtags from the entire text file. (*)
The last task is optional (*)!
Each task should be solved only using regex.
Good Luck!