Bonjour! I'm
Pragyan Jyoti Dutta
And I'm a Data Scientist:)

About Me

Hello, I'm Pragyan Jyoti Dutta, a multifaceted Data Scientist, Data Engineer, Data Analyst, and Machine Learning Engineer.

I am currently pursuing a Master’s in Data Science and Artificial Intelligence at the University of Liverpool. Together with a Data Science Diploma from the Indian Institute of Technology Madras and a Bachelor's in Physics, this gives me a grounding that spans the full range of data-centric roles. My experience as a Data Engineering Intern at DaaS and a Machine Learning Research Intern at Spartificial has given me a blend of skills in machine learning, complex data analysis, and robust data engineering practices. I have led and contributed to projects in domains including healthcare and astronomy, and have published in IEEE Xplore.


My approach integrates the precision of a Data Analyst, the system-level thinking of a Data Engineer, the innovation of a Machine Learning Engineer, and the holistic insight of a Data Scientist. This versatility allows me to seamlessly navigate and connect different facets of data work, from developing full-stack ecosystems for advanced legal research to applying sophisticated Machine Learning techniques for astronomical classification. Driven by a passion for problem-solving and innovation, I am constantly seeking opportunities to explore and push the boundaries of AI and data science. My commitment extends beyond professional work into contributions in open-source projects and thought leadership in the field through my writings.

My Interests:

Writing

Music

Football

Powerlifting

GET MY RESUME

My Works

Let me walk you through some of my work pieces!

Source Code
Uber Analytics ETL Project

A Data Engineering Project built on Google Cloud Platform and Mage Data Tool to analyze data from Uber.

GCP BigQuery Mage DataTool ETL Tools ComputeEngine LookerStudio
Source Code
Employee Churn Model Predictor

A predictive tool that estimates the employee churn rate and provides insights to help companies retain their employees for longer periods.

GCP BigQuery PyCaret AutoML Python LookerStudio
Source Code
Live Demo
Grocify

Grocify is a community-focused e-commerce grocery store web app aimed at serving the community in the best possible way.

HTML CSS BOOTSTRAP JS PYTHON SQLite
Source Code
Movie Sentiment Reviews

This project aims to predict the sentiment of movie reviews using state-of-the-art machine learning models.

scikit-learn Python matplotlib regex Transformers
Source Code
Pulsar Star Classification

A research project aimed at finding the better ordering of preprocessing steps for the HTRU dataset: dimensionality reduction followed by resampling, or the reverse. Machine Learning and Deep Learning models are then applied to classify pulsar stars.

Dimensionality Reduction Resampling Techniques Neural Networks
Source Code
Live Demo
Pragyan's Portfolio

This is my portfolio site

JavaScript HTML CSS
Source Code
Facebook Ad Analysis

Delved deep into ad campaigns launched on Facebook to understand their dynamics and extract actionable insights that further business objectives.

Jupyter Notebooks matplotlib plotly numpy
Source Code
Indian Premier League EDA

Performed Exploratory Data Analysis on data extracted from the Indian Premier League to find key insights from the data.

Jupyter Notebooks matplotlib plotly numpy
View Dashboard
NETFLIX Analytics Dashboard

A Dashboard made in Tableau showcasing the various Analytical stats of Netflix

Tableau Visualisation Presentation UI/UX
View Dashboard
British Airways Dashboard

A Dashboard made in Tableau showcasing the various Analytical stats of British Airways ranging from most travelled places to best performing service

Tableau Visualisation Presentation UI/UX
View Dashboard
King County, Washington House Sales Dashboard

A Dashboard made in Tableau showcasing the various Analytical stats of King County, Washington House Sales

Tableau Visualisation Presentation UI/UX
View Dashboard
Patients Analytics in PowerBI

A Dashboard made in PowerBI showcasing the various Analytical stats of a hospital, with key fields being the number of in-patients and out-patients.

PowerBI Visualisation Presentation UI/UX

My skills

Here are some of the programming languages and technologies I'm learning and trying to become an expert at!

Experience

Aug 2022 - Feb 2023

Data Engineering Intern

DaaS

As a Data Engineering Intern at Developer As a Service (DaaS), I mined data from unstructured, pixelated images of legal documents using OCR and regex models, then engineered the data for further modeling. Working with my team, and drawing on strong communication and leadership skills, I helped create a full end-to-end ecosystem for advanced legal research services, leading to an 80% increase in legal document processing efficiency within the firm.

  • Python
  • GCP Vertex AI
  • Data Mining
  • Regular Expressions
  • OCR Technology
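
The OCR-plus-regex step described above can be illustrated with a minimal sketch. The field names, the two patterns, and the sample text are all hypothetical (the actual document formats are not described here), and the day-month-year date order is an assumption:

```python
import re

# Hypothetical patterns for illustration only; real legal documents
# would need patterns tuned to their own layouts.
CASE_NO = re.compile(r"\bCase\s+No\.?\s*(\d+)\s*/\s*(\d{4})\b", re.IGNORECASE)
DATE = re.compile(r"\b(\d{1,2})[./-](\d{1,2})[./-](\d{4})\b")


def normalize_ocr(text: str) -> str:
    """Collapse the irregular whitespace and line breaks common in OCR output."""
    return re.sub(r"\s+", " ", text).strip()


def extract_fields(text: str) -> dict:
    """Return the case numbers and dates found in one OCR'd page."""
    clean = normalize_ocr(text)
    cases = [f"{num}/{year}" for num, year in CASE_NO.findall(clean)]
    # Assumes day-month-year order; dates are zero-padded for consistency.
    dates = [f"{int(d):02d}-{int(m):02d}-{y}" for d, m, y in DATE.findall(clean)]
    return {"case_numbers": cases, "dates": dates}


# Made-up sample resembling noisy OCR output.
sample = "CASE  No. 123 /\n2019   heard on 5.11.2019\nand adjourned to 12-01-2020"
fields = extract_fields(sample)
print(fields)
```

Normalizing whitespace before matching keeps the patterns short, since OCR engines often split a single field across lines.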
Jun 2022 - Sept 2022

Machine Learning Research Intern

Spartificial

During my time with Spartificial, I trained in RNNs, Neural Networks, TensorFlow, OpenCV, Image Segmentation, and related topics. My research compared two preprocessing orders for an imbalanced Pulsar Star dataset: Resampling before Dimensionality Reduction, and Dimensionality Reduction before Resampling. In both cases, Machine Learning models and Neural Networks were then used to classify Pulsar Stars from observational data.

  • TensorFlow
  • RNN
  • Neural Networks
  • OpenCV
  • Image Segmentation
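
To show why the order of the two preprocessing steps can matter, here is a small standard-library sketch. It is not the method used in the research: random oversampling stands in for the resampling techniques, keeping the highest-variance columns stands in for dimensionality reduction, and the data is a made-up binary toy set.

```python
import random
from statistics import pvariance


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until both classes match (binary labels assumed)."""
    rng = random.Random(seed)
    minority = min(set(labels), key=labels.count)
    pool = [r for r, y in zip(rows, labels) if y == minority]
    deficit = len(labels) - 2 * len(pool)  # majority count minus minority count
    extra = [rng.choice(pool) for _ in range(deficit)]
    return rows + extra, labels + [minority] * deficit


def top_variance_features(rows, k):
    """Indices of the k highest-variance columns (a simple stand-in for PCA)."""
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    return sorted(sorted(range(n_cols), key=lambda j: -variances[j])[:k])


def project(rows, cols):
    """Keep only the selected columns of every row."""
    return [[r[j] for j in cols] for r in rows]


def resample_then_reduce(rows, labels, k):
    rows2, labels2 = oversample(rows, labels)
    cols = top_variance_features(rows2, k)
    return project(rows2, cols), labels2, cols


def reduce_then_resample(rows, labels, k):
    cols = top_variance_features(rows, k)
    rows2, labels2 = oversample(project(rows, cols), labels)
    return rows2, labels2, cols


# Toy data: column 0 varies inside the majority class, column 1 separates the classes.
rows = [[0, 0], [4, 0], [0, 0], [4, 0], [0, 0], [4, 0], [1, 4], [3, 4]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

_, labels_a, cols_a = resample_then_reduce(rows, labels, k=1)
_, labels_b, cols_b = reduce_then_resample(rows, labels, k=1)
print(cols_a, cols_b)  # the two orders keep different columns on this toy data
```

On this toy set, balancing the classes first raises the variance of the class-separating column, so it is the one retained; reducing first discards it before resampling ever runs.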
Sept 2021 - Oct 2021

Data Science Intern

The Sparks Foundation

This Internship introduced me to the basics of Data Science and allowed me to build projects on a variety of datasets.

  • Python
  • Exploratory Data Analysis
  • Data Cleaning
  • Market Analysis
Aug 2021 - Oct 2021

Summer Internship

Naxxatra Sciences and Collaborative Research

I worked on analyzing the forest cover of India and the land usage pattern of the last decade using Python, aiding in policy adjustments.

  • Python
  • Data Analysis
  • Research
  • Collaborative Work
  • Scientific Analysis

Education

2025

MSc. in Data Science and Artificial Intelligence

University of Liverpool, United Kingdom

CGPA: On course for a 2:1 degree

Relevant Modules: Applied Artificial Intelligence, Data Mining and Visualisation, Machine Learning and Bioinspired Optimisation, Computational Intelligence

Societies: Data Science Society, Strength & Conditioning Club

2023

Online Diploma in Data Science

Indian Institute of Technology Madras, India

CGPA: 7.75/10

Relevant Modules: Machine Learning Foundations, Machine Learning Techniques, Machine Learning Practice, App Development 1, App Development 2, Tools for Data Science, Business Data Management, Business Analytics, Computational Thinking, Programming in Python, Statistics for AI, Mathematics for AI

Societies: Jamsetji TATA Society for Innovation and Entrepreneurship (JITSIE), WYZ - The IITM Quiz Club

2023

BSc. in Physics

Tezpur University, India

CGPA: 8.06/10

Relevant Modules: Quantum Mechanics, Thermodynamics, Mathematical Physics

Dissertation: Modelling Radiative Transfer for the Coalsack Region

2020

Higher Secondary

Delhi Public School Digboi, India

Percentage: 95.67%

Publications

Here are a few pieces of work that I have published during my academic journey!


Health risk detection through web app using machine learning

Health is one of the important aspects of human life, and without proper health, the functioning of the human body is very tough. Diseases are one of the biggest problems that humans need to fight. This publication aims to help people detect diseases at an early stage using a web app.

Published at: IEEE Xplore

Date: 18 July 2022

Conference: 2022 2nd International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE)

Read Publication

Certifications

Have a look at some of my Certifications in the section below!

Certification Link
Generative AI using Large Language Model

Prompt Engineering Transformers Generative AI Large Language Models
Certification Link
Matillion Data Productivity Cloud, Foundations Training Course

Data Engineering ETL Pipelines ELT Pipelines Data Productivity
Certification Link
Building a Data Warehouse using Matillion Training Course

Data Warehousing Matillion DPC Project Management
Certification Link
HackerRank SQL(Basic) Certification

Basic Select Basic Joins Basic Aggregators
Certification Link
SQL(Intermediate) Certification

Data Querying Advanced Joins Aggregation Advanced Select Sub Querying
Certification Link
Deep Learning with PyTorch : Object Localization

PyTorch Object Localization Tensorflow Keras
Certification Link
Python for Data Science, AI and Development

Python scikit-learn classification Regression
Certification Link
Mathematics for Machine Learning: Multivariate Calculus

Calculus Linear Algebra Probability Statistics
Certification Link
Tools For Data Science

Web Scraping libraries Data Cleaning processes Visualization tools

Check Me On

Wanna know about my profile on other platforms? Just tap on the icons below to have a glance at them!

Contact me

Message me
Get in Touch

If you've got a project in mind, why not get in touch? Let's work together. You can send me a message by filling in this form.

Name
Pragyan Jyoti Dutta
Address
2 Crown Station Place, Liverpool-L7 3LA, Merseyside, United Kingdom