diff --git a/_config.yml b/_config.yml index ea7d07e6d..5519f5ad2 100644 --- a/_config.yml +++ b/_config.yml @@ -8,22 +8,22 @@ plugins: #------------------------------- # General Site Settings -title: Johnny Hopkins -description: "Hi I'm Johnny, and I'm a Data Scientist. My portfolio focuses on interesting projects I've recently undertaken, with a strong emphasis on business impact. Please visit my Github & LinkedIn pages (or download my Resume) by using the links below!" +title: Aikaterini Tsaousi +description: "Hi I'm Kat and I'm a Data Scientist. My portfolio focuses on interesting projects I've recently undertaken. Please visit my Github & LinkedIn pages (or download my Resume) by using the links below!" baseurl: "" # the subpath of your site, e.g. /blog url: "" # the base hostname & protocol for your site, e.g. http://example.com #------------------------------- # About Section subtitle: Data Science Portfolio -location: "London, UK" -resume_url: /docs/resume.docx -avatar_image: /img/profile_picture.JPG +location: "Bristol, UK" +resume_url: /docs/Draft_CV_Jan_2026.pdf +avatar_image: /img/SoY_Profile_Photo.jpg #------------------------------- # Contact links -linkedln: "https://linkedln.com/#" # Add your linkedln handle -github: "https://github.com/#" # Add your github handle +linkedln: "https://www.linkedin.com/in/aikaterini-tsaousi/" # Add your linkedln handle +github: "https://github.com/Kat-Bristol" # Add your github handle paginate: 6 paginate_path: "/page/:num" @@ -52,3 +52,12 @@ defaults: # - vendor/gems/ # - vendor/ruby/ + + + + + + + + + diff --git a/_posts/2021-06-25-coffee-and-python.md b/_posts/2021-06-25-coffee-and-python.md new file mode 100644 index 000000000..ae691a496 --- /dev/null +++ b/_posts/2021-06-25-coffee-and-python.md @@ -0,0 +1,77 @@ +--- +layout: post +title: Coffee & Python +image: "/posts/coffee_python.jpg" +tags: [Python, Coffee] +--- + +# My first project +## is all about +### how 
much +#### I LOVE +##### Python & Coffee! + +--- + +Firstly, I love Python so much, here is some code! + +``` +my_love_for_python = 0 +my_python_knowledge = 0 + +for day in lifetime: + my_love_for_python += 1 + my_python_knowledge += 1 +``` + +Just so you really see how much I love Python, here is some code BUT with some colours for keywords & functionality! + +```python +my_love_for_python = 0 +my_python_knowledge = 0 + +for day in lifetime: + my_love_for_python += 1 + my_python_knowledge += 1 +``` + +Here is an **unordered list** showing some things I love about Python + +* For my work + * Data Analysis + * Data Visualisation + * Machine Learning +* For fun + * Deep Learning + * Computer Vision + * Projects about coffee + +Here is an _ordered list_ showing some things I love about coffee + +1. The smell + 1. Especially in the morning, but also at all times of the day! +2. The taste +3. The fact I can run the 100m in approx. 9 seconds after having 4 cups in quick succession + +I love Python & Coffee so much, here is that picture from the top of my project AGAIN, but this time, in the BODY of my project! + +![alt text](/img/posts/coffee_python.jpg "Coffee & Python - I love them!") + +The above image is just linked to the actual file in my Github, but I could also link to images online, using the URL! + +A line break, like this one below - helps me make sense of what I'm reading, especially when I've had so much coffee that my vision goes a little blurry + +--- + +I could also add things to my project like links, tables, quotes, and HTML blocks - but I'm starting to get a cracking headache. Must be coffee time. 
+ +--- diff --git a/_posts/2021-06-09-Finding-Prime-Numbers-With-Python.md b/_posts/2026-01-30-Finding-Prime-Numbers-With-Python.md similarity index 95% rename from _posts/2021-06-09-Finding-Prime-Numbers-With-Python.md rename to _posts/2026-01-30-Finding-Prime-Numbers-With-Python.md index 31d5defa6..469082c95 100644 --- a/_posts/2021-06-09-Finding-Prime-Numbers-With-Python.md +++ b/_posts/2026-01-30-Finding-Prime-Numbers-With-Python.md @@ -215,3 +215,13 @@ number_range.remove(prime) However, because we have to sort the list for each iteration of the loop in order to get the minimum value, it's slightly slower than what we saw with pop()! + + +In summary, this is a Python algorithm that calculates prime numbers within a user-defined range using set operations and a sieve-style elimination approach. The algorithm was later adapted to run interactively in the browser using JavaScript. You can have a go and test it here: + + +Find Prime Numbers + + diff --git a/_posts/2026-02-01-Earthquakes-Map-Tableau.md b/_posts/2026-02-01-Earthquakes-Map-Tableau.md new file mode 100644 index 000000000..97079aed9 --- /dev/null +++ b/_posts/2026-02-01-Earthquakes-Map-Tableau.md @@ -0,0 +1,17 @@ +--- +layout: post +title: Earthquake Tracking Dashboard Using Tableau +image: "/posts/Map.png" +tags: [Tableau, Data Viz] +--- + +This is a Tableau dashboard that tracks global earthquake activity over a 30-day period. You can use the slider to select the magnitude of the earthquakes shown and/or the drop-down menus to select the days and countries you want to include. Alternatively, you can click on the link below to see an example of the dashboard together with some data analyses. 
+ + + + + +🌍 Earthquake Data Dashboard (Tableau) + diff --git a/_posts/2026-02-04-Guessing-Number-Game.md b/_posts/2026-02-04-Guessing-Number-Game.md new file mode 100644 index 000000000..f8369d95b --- /dev/null +++ b/_posts/2026-02-04-Guessing-Number-Game.md @@ -0,0 +1,63 @@ +--- +layout: post +title: Python Guessing Number Game +image: "/posts/games_guess_the_number.png" +tags: [Python] +--- + +Here is the Python Script for the game + +--- + + + + +import random + +random.randint(1, 10) + +LOWER_BOUND = 0 + +UPPER_BOUND = 100 + +GUESS_LIMIT = 5 + +GUESS_COUNTER = 0 + + +CORRECT_NUMBER = random.randint(LOWER_BOUND, UPPER_BOUND) + +print(f'Try guessing the number that I am thinking. It is between {LOWER_BOUND} and {UPPER_BOUND}. ' + f'Good luck, you have {GUESS_LIMIT} guesses') + +while True: + try: + user_guess = int(input('What is your guess?? ')) + except ValueError: + print('Please enter a valid number') + continue + + if not (LOWER_BOUND <= user_guess <= UPPER_BOUND): + print(f'Your guess is out of range! Try a guess between {LOWER_BOUND} and {UPPER_BOUND}') + continue + + GUESS_COUNTER += 1 + remaining_guesses = GUESS_LIMIT - GUESS_COUNTER + + if user_guess == CORRECT_NUMBER: + print(f'Wow, you got it in {GUESS_COUNTER} guesses - well done!') + break + elif user_guess < CORRECT_NUMBER: + print(f'Your guess is too low, try again! Guesses remaining: {remaining_guesses}') + else: + print(f'Your guess is too high, try again! Guesses remaining: {remaining_guesses}') + + if remaining_guesses == 0: + print(f"Sorry, you're out of guesses. 
The number you were after was {CORRECT_NUMBER}") + break +--- + + + +Play the Number Guessing Game + diff --git a/_posts/2026-02-07-AB-Testing.md b/_posts/2026-02-07-AB-Testing.md new file mode 100644 index 000000000..29c477f1f --- /dev/null +++ b/_posts/2026-02-07-AB-Testing.md @@ -0,0 +1,110 @@ +--- +layout: post +title: Assessing Campaign Performance +image: "/posts/Performance_Marketing_mod.jpg" +tags: [A/B Testing, Chi-Square, Stats] +--- + +"A/B test" is the term used in business for a hypothesis test. An A/B test is essentially a randomised experiment containing two groups, A and B, that receive different experiences. Within an A/B test, we look to understand and measure the response of each group. Marketing campaign performance analysis measures the effectiveness of marketing initiatives against defined goals. + + +--- + +Example of a Marketing Campaign Pilot: + + - Group 1 – customers received Mailer 1 (a basic, cheaper version) inviting them to sign up to the loyalty scheme + - Group 2 – customers received Mailer 2 (a colourful, more expensive version) inviting them to sign up to the loyalty scheme + - Group 3 – control group: customers received no mailer at all (but could still sign up to the loyalty scheme via the main menu) + +The marketing team is certain that customers who received a mailer (i.e. Groups 1 or 2) signed up to the company's loyalty scheme more often than the control group, but is unsure whether the quality of the mailer they received made a significant difference. + +We can answer this question using hypothesis testing, in particular using the chi-square test for independence. 
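To make the mechanics of the chi-square test for independence concrete, here is a minimal hand-rolled sketch using only the standard library. It is not the approach used in this post's script (which relies on scipy); the counts are the ones that appear in this post's signup-rate arithmetic, the function names are illustrative, and the p-value shortcut is an assumption that only holds for a test with exactly 1 degree of freedom (a 2x2 table).

```python
import math

# Observed counts taken from the signup-rate arithmetic in this post:
# rows = Mailer 1, Mailer 2; columns = did not sign up, signed up.
observed = [[252, 123],
            [209, 127]]


def chi_square_independence(table):
    """Return (statistic, dof, expected) for a contingency table, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, expected


def p_value_one_dof(statistic):
    """Upper-tail probability of a chi-square variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


chi2_stat, dof, expected = chi_square_independence(observed)
p_value = p_value_one_dof(chi2_stat)  # valid only because dof == 1 for a 2x2 table
print(chi2_stat, dof, p_value)
```

The larger the gap between observed and expected counts, the larger the statistic and the smaller the p-value. For tables bigger than 2x2 the shortcut in `p_value_one_dof` no longer applies, and a library routine is the safer choice.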
+ +--- + + + +##### IMPORT REQUIRED PACKAGES +``` +import pandas as pd +from scipy.stats import chi2_contingency as cc +from scipy.stats import chi2 +``` + +##### IMPORT DATA (ensure the spreadsheet is located in the same directory as this Python Script File) +``` +campaign_data = pd.read_excel('grocery_database.xlsx', sheet_name = 'campaign_data') +``` + + + +##### FILTER THE DATA (i.e. take out all rows in the Control group using the .loc method) +``` +campaign_data = campaign_data.loc[campaign_data['mailer_type'] != 'Control'] +``` + + + +##### SUMMARISE TO GET OUR OBSERVED FREQUENCIES using .crosstab() method +``` +observed_values = pd.crosstab(campaign_data['mailer_type'], campaign_data['signup_flag']).values +observed_values = pd.crosstab(campaign_data["mailer_type"], campaign_data["signup_flag"]) +print(observed_values) + +mailer1_signup_rate = 123 / (252 + 123) +mailer2_signup_rate = 127 / (209 + 127) +print(mailer1_signup_rate, mailer2_signup_rate) +``` + + + +##### STATE HYPOTHESES & SET ACCEPTANCE CRITERIA +``` +null_hypothesis = 'there is no relationship between mailer type and signup rate. They are independent.' +alternate_hypothesis = 'there is a relationship between mailer type and signup rate. They are NOT independent.' 
+acceptance_criteria = 0.05 +``` + + + +##### CALCULATE EXPECTED FREQUENCIES & CHI SQUARE STATISTIC +``` +chi2_statistic, p_value, dof, expected_values = cc(observed_values, correction = False) +print(chi2_statistic, p_value) +``` + + +##### FIND THE CRITICAL VALUE FOR THE TEST using the percentage point function +``` +critical_value = chi2.ppf(1 - acceptance_criteria, dof) +print(critical_value) +``` + + +##### PRINT THE RESULTS/CONCLUSION (Chi Square Statistic) +``` +if chi2_statistic >= critical_value: + print(f'As our chi-square statistic of {chi2_statistic} is HIGHER than our critical value of {critical_value}, we REJECT the null hypothesis and conclude that {alternate_hypothesis}') +else: + print(f'As our chi-square statistic of {chi2_statistic} is LOWER than our critical value of {critical_value}, we ACCEPT the null hypothesis and conclude that {null_hypothesis}') +``` + + +##### PRINT THE RESULTS/CONCLUSION (p-value) +``` +if p_value <= acceptance_criteria: + print(f'As our p-value of {p_value} is LOWER than our acceptance criteria of {acceptance_criteria}, we REJECT the null hypothesis and conclude that {alternate_hypothesis}') +else: + print(f'As our p-value of {p_value} is HIGHER than our acceptance criteria of {acceptance_criteria}, we ACCEPT the null hypothesis and conclude that {null_hypothesis}') +``` +--- + + +#### Output in our console (after the code is run): +``` +As our chi-square statistic of 1.9414468614812481 is LOWER than our critical value of 3.841458820694124, we ACCEPT the null hypothesis and conclude that there is no relationship between mailer type and signup rate. They are independent. +As our p-value of 0.16351152223398197 is HIGHER than our acceptance criteria of 0.05, we ACCEPT the null hypothesis and conclude that there is no relationship between mailer type and signup rate. They are independent. +``` +--- + +#### *Business Insight: The Marketing Team can safely utilise the basic (cheaper) mailer as a means to increase signups. 
Using the colourful, more expensive mailer would result in unnecessary costs/expenses for the company.* diff --git a/_posts/2026-02-10-Image-Procesing-NumPy.md b/_posts/2026-02-10-Image-Procesing-NumPy.md new file mode 100644 index 000000000..0c09b4714 --- /dev/null +++ b/_posts/2026-02-10-Image-Procesing-NumPy.md @@ -0,0 +1,115 @@ +--- +layout: post +title: Image Processing Using NumPy +image: "/posts/camaro.jpg" +tags: [NumPy, Image Processing] +--- + +Digital images can be represented as arrays. In the first part of this post, I am going to use NumPy to crop an image (horizontally and/or vertically). Grayscale images can be thought of as a collection of pixels on a 2D grid. Colour images can be thought of as a similar pixel collection, except that each pixel now holds three values: the Red, Green and Blue (RGB) intensities. In the second part of this post, I am going to create 3 separate copies of the image [each keeping only the red, green or blue channel of the original], which will then be stacked to create a poster. + +--- + +import numpy as np + +from skimage import io # the io module allows us to read in an image + +import matplotlib.pyplot as plt + +### provided our image is located in the same working directory as this script + +camaro = io.imread('camaro.jpg') +print(camaro) # image is actually an array + +camaro.shape # a coloured image is a 3d array + +plt.imshow(camaro) # looking at the pane 'Plots' we can see the image +plt.show() + + +### SLICING of the array will result in cropping the image +#NOTE: the 3rd axis holds the colour channels + +cropped = camaro[:, :, :] # no cropping here since we have not specified any slice to crop +plt.imshow(cropped) +plt.show() + +cropped = camaro[0:500, :, :] # crop horizontally only [KEEP the slice specified, i.e. pixels 0 to 500] +plt.imshow(cropped) +plt.show() + +cropped = camaro[:, 0:400, :] # crop vertically only [KEEP the slice specified, i.e. 
pixels 0 to 400] +plt.imshow(cropped) +plt.show() + +cropped = camaro[350:1100, 200:1400, :] # Crop Vertically & Horizontally [KEEP the car only!] +plt.imshow(cropped) +plt.show() + + +#### SAVE THE CROPPED IMAGE TO THE SAME WORKING DIRECTORY +io.imsave('camaro_cropped.jpg', cropped) + +Here is the final cropped image created! +![alt text](/img/posts/camaro_cropped.jpg "Image-Processing-NumPy") + + +### We can also FLIP our image using the start-stop-step logic. Using -1 as the step will flip the image + +h_flip = camaro[::-1, :, :] # flip the image along an imaginary horizontal axis +plt.imshow(h_flip) +plt.show() + +io.imsave('camaro_h_flip.jpg', h_flip) # saves the image to our working directory + +v_flip = camaro[:, ::-1, :] # flip the image along an imaginary vertical axis [hint - see the plants' position!] +plt.imshow(v_flip) +plt.show() + +io.imsave('camaro_v_flip.jpg', v_flip) # saves the image to our working directory + +Here is the horizontally flipped image created! + +![alt text](/img/posts/camaro_h_flip.jpg "Image-Processing-NumPy") + + + +### Colour Channels (RGB) - By zeroing 2 out of 3 channels every time we can see the 1 remaining channel (R, G or B) + +### Create a new array with exactly the same dimensions as the original image + +### create a new array which only has zero values in it (same shape as our original image) +red_array = np.zeros(camaro.shape, dtype = 'uint8') +red_array[:, :, 0] = camaro[:, :, 0] # input the RED pixel values only, from the original image +plt.imshow(red_array) + +### create a new 2nd array which only has zero values in it (same shape as our original image) + +green_array = np.zeros(camaro.shape, dtype = 'uint8') +green_array[:, :, 1] = camaro[:, :, 1] # input the GREEN pixel values only, from the original image +plt.imshow(green_array) + +### create a new 3rd array which only has zero values in it (same shape as our original image) + +blue_array = np.zeros(camaro.shape, dtype = 'uint8') +blue_array[:, :, 2] = camaro 
[:, :, 2] # input the BLUE pixel values only, from the original image +plt.imshow(blue_array) + + +### STACK the 3 newly created arrays to create a poster image + + +v_poster = np.vstack((red_array, green_array, blue_array)) # vertical stack --> portrait poster +plt.imshow(v_poster) + +h_poster = np.hstack((red_array, green_array, blue_array)) # horizontal stack --> landscape poster +plt.imshow(h_poster) + +io.imsave('camaro_h_poster.jpg', h_poster) + + +Here is the final poster created! + +![alt text](/img/posts/camaro_h_poster.jpg "Image-Processing-NumPy") + + + diff --git a/_posts/2026-03-01-ML.md b/_posts/2026-03-01-ML.md new file mode 100644 index 000000000..796b41f89 --- /dev/null +++ b/_posts/2026-03-01-ML.md @@ -0,0 +1,71 @@ +--- +layout: post +title: "Predictions for Targeted Marketing" +image: "/posts/ROI_ML.png" +tags: [Machine-Learning, Classification] +--- + + +A machine learning project to predict which customers are most likely to sign up to a service, enabling a business to run targeted and cost-effective marketing campaigns and increase return on investment (ROI). + +--- + +Business Problem + +A supermarket ran a marketing campaign promoting its home delivery service, targeting its entire customer base. While this approach generated signups, it was costly and inefficient because many customers had a very low likelihood of converting. + +The goal of this project was to determine whether machine learning could identify the customers who are most likely to sign up, allowing the business to run more targeted and cost-effective marketing campaigns in the future. + +--- + +Data + +The dataset contained three months of historical customer data collected from the previous campaign. Customer attributes included variables such as: + +- Distance from the store +- Number of transactions +- Customer purchasing behaviour +- Other engagement metrics + +These features were used to model the probability that a customer would sign up for the home delivery service. 
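As a rough illustration of how features can be turned into a signup probability, the sketch below uses a logistic function, which is the building block of the Logistic Regression model mentioned in this post. Everything specific here is an assumption made for illustration: the feature names, the weights, the intercept and the two example customers are hypothetical and are not values from this project.

```python
import math


def signup_probability(features, weights, intercept):
    """Logistic model: squash a weighted sum of features into a probability between 0 and 1."""
    linear_score = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-linear_score))


# Hypothetical coefficients, invented purely for illustration.
weights = {"distance_from_store": -0.8, "transaction_count": 0.05}
intercept = -1.0

# Hypothetical customers with the same transaction count but different distances.
near_customer = {"distance_from_store": 0.5, "transaction_count": 40}
far_customer = {"distance_from_store": 4.0, "transaction_count": 40}

p_near = signup_probability(near_customer, weights, intercept)
p_far = signup_probability(far_customer, weights, intercept)
print(p_near, p_far)
```

In a real project the weights would be learned from the historical campaign data rather than typed in by hand; the point of the sketch is only that each customer ends up with a score between 0 and 1 that can be ranked.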
+ +--- + +Approach + +The project followed a typical end-to-end data science workflow: + +1. Data cleaning and preprocessing +2. Exploratory data analysis (EDA) to identify key patterns +3. Feature engineering to improve predictive power +4. Training several ML classification models (Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbours) to predict the probability of customer signup +5. Evaluating model performance to identify the best predictor of customer signup probability, using Accuracy, Precision / Recall and ROC-AUC + +--- + +Business Impact + +The final model assigns a probability score to each customer, indicating their likelihood of signing up for the delivery service. + +This enables the supermarket to: + +- Prioritise high-probability customers +- Run targeted marketing campaigns +- Reduce wasted marketing spend +- Improve conversion rates and campaign ROI + +--- + +Key Takeaway + +This project demonstrates how machine learning can transform customer data into actionable insights, helping businesses make smarter, data-driven decisions. 
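The evaluation metrics named under Approach (Accuracy, Precision / Recall, ROC-AUC) can be sketched in plain Python as follows. This is a minimal illustration rather than the project's evaluation code: the labels, the predicted probabilities and the 0.5 threshold are made-up toy values, and the helper names are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels where 1 = signed up."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def roc_auc(y_true, scores):
    """Chance that a random positive is scored above a random negative (ties count as half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration: true signup flags and predicted probabilities.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.05]

# Turn probabilities into yes/no predictions with an assumed 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(accuracy, precision, recall, auc)
```

Accuracy, precision and recall depend on the chosen threshold, whereas ROC-AUC only looks at how well the scores rank signups above non-signups, which is why it suits a use case built around prioritising high-probability customers.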
diff --git a/_posts/Earthquakes_Map.png b/_posts/Earthquakes_Map.png new file mode 100644 index 000000000..96f707074 Binary files /dev/null and b/_posts/Earthquakes_Map.png differ diff --git a/_posts/Eartquakes_Map.png b/_posts/Eartquakes_Map.png new file mode 100644 index 000000000..96f707074 Binary files /dev/null and b/_posts/Eartquakes_Map.png differ diff --git a/_posts/Map.png b/_posts/Map.png new file mode 100644 index 000000000..96f707074 Binary files /dev/null and b/_posts/Map.png differ diff --git a/_posts/coffee_python.jpg b/_posts/coffee_python.jpg new file mode 100644 index 000000000..02d7aff37 Binary files /dev/null and b/_posts/coffee_python.jpg differ diff --git a/docs/Draft_CV_Jan_2026.pdf b/docs/Draft_CV_Jan_2026.pdf new file mode 100644 index 000000000..9a46bde41 Binary files /dev/null and b/docs/Draft_CV_Jan_2026.pdf differ diff --git a/img/Eartquakes_Map.png b/img/Eartquakes_Map.png new file mode 100644 index 000000000..96f707074 Binary files /dev/null and b/img/Eartquakes_Map.png differ diff --git a/img/ROI_ML.png b/img/ROI_ML.png new file mode 100644 index 000000000..42639840d Binary files /dev/null and b/img/ROI_ML.png differ diff --git a/img/SoY_Profile_Photo b/img/SoY_Profile_Photo new file mode 100644 index 000000000..ba9b17d95 Binary files /dev/null and b/img/SoY_Profile_Photo differ diff --git a/img/SoY_Profile_Photo.jpg b/img/SoY_Profile_Photo.jpg new file mode 100644 index 000000000..ba9b17d95 Binary files /dev/null and b/img/SoY_Profile_Photo.jpg differ diff --git a/img/coffee_python.jpg b/img/coffee_python.jpg new file mode 100644 index 000000000..02d7aff37 Binary files /dev/null and b/img/coffee_python.jpg differ diff --git a/img/posts/Map.png b/img/posts/Map.png new file mode 100644 index 000000000..96f707074 Binary files /dev/null and b/img/posts/Map.png differ diff --git a/img/posts/Performance_Marketing.jpg b/img/posts/Performance_Marketing.jpg new file mode 100644 index 000000000..ff4f1845b Binary files /dev/null and 
b/img/posts/Performance_Marketing.jpg differ diff --git a/img/posts/Performance_Marketing_mod.jpg b/img/posts/Performance_Marketing_mod.jpg new file mode 100644 index 000000000..eea77a7c0 Binary files /dev/null and b/img/posts/Performance_Marketing_mod.jpg differ diff --git a/img/posts/ROI_ML.png b/img/posts/ROI_ML.png new file mode 100644 index 000000000..42639840d Binary files /dev/null and b/img/posts/ROI_ML.png differ diff --git a/img/posts/camaro.jpg b/img/posts/camaro.jpg new file mode 100644 index 000000000..07e41815b Binary files /dev/null and b/img/posts/camaro.jpg differ diff --git a/img/posts/camaro_cropped.jpg b/img/posts/camaro_cropped.jpg new file mode 100644 index 000000000..27e384136 Binary files /dev/null and b/img/posts/camaro_cropped.jpg differ diff --git a/img/posts/camaro_h_flip.jpg b/img/posts/camaro_h_flip.jpg new file mode 100644 index 000000000..1d3720d2a Binary files /dev/null and b/img/posts/camaro_h_flip.jpg differ diff --git a/img/posts/camaro_h_poster.jpg b/img/posts/camaro_h_poster.jpg new file mode 100644 index 000000000..aa268be4f Binary files /dev/null and b/img/posts/camaro_h_poster.jpg differ diff --git a/img/posts/games_guess_the_number.png b/img/posts/games_guess_the_number.png new file mode 100644 index 000000000..a09ab4a5b Binary files /dev/null and b/img/posts/games_guess_the_number.png differ