Hello, I am
Viet-Thai Nguyen

Welcome to my data playground

🇫🇷 Évoli - FLE Agent

Claude API Flask Supabase fpdf2 Render WAT Framework

Évoli is an AI agent for French learners. Pick a CEFR level (A1–C1) and a topic: the agent autonomously sources a real, current French article, generates 5 comprehension questions, builds a curated vocabulary section with French and English definitions, and scores your answers instantly. Subscribers receive a fresh, personalised, styled 4-page exercise by email every Monday, fully automated.

Built with Python and Flask, powered by the Anthropic Claude API for on-demand content generation. PDF rendering is handled in-memory with fpdf2, subscriber data is stored in Supabase (PostgreSQL), and the weekly pipeline is orchestrated by GitHub Actions. Deployed on Render following the WAT framework (Workflows → Agent → Tools).

Free hosting: the app may take about 30 seconds to wake up if idle.

Music Source Separation
& Karaoke

Deep Learning Audio Segmentation MLOps Web Dev

MMaVVie is an application with 2 main features: Audio Separation and Karaoke. The core technology is a music source separation model built on the U-Net architecture. The Streamlit web app calls the model through FastAPI endpoints. External APIs such as Whisper and GeniusLyrics are also used for the Karaoke feature.

Kindle Book Reviews
ML application

NLP Sentiment Analysis MLOps Data Validation Web Dev

Built an NLP model for sentiment analysis of Amazon Kindle book reviews. Deployed the interface as a Streamlit app that lets users interact with the model through FastAPI and PostgreSQL. Created Airflow jobs to ingest raw data and run batch predictions on the validated data. The pipeline is monitored with a Grafana dashboard.

Starbucks Promotional Offers Analysis

Machine Learning EDA Blogging

Raised questions about Starbucks' marketing strategy and analyzed it to find the best promotional offer, training a few prediction models and tuning their hyperparameters with Python and Scikit-learn. The EDA, preprocessing, and modelling steps are explained in a Medium post.

Disaster Response
Pipeline

Data Engineering ETL Machine Learning Web Dev

To help organizations respond to disaster events faster and more accurately, I used Python to build an ETL pipeline that processes the raw data, an ML pipeline that classifies messages into categories, and a Flask application for entering a new message and visualizing the data.

AWS Data Warehouse
Pipeline

Data Engineering ETL AWS

Built an ETL pipeline for the Sparkify database hosted on Redshift: it loads data from S3 into staging tables on Redshift, then executes SQL statements to create the analytics tables.
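The two stages described above could be sketched roughly as follows. This is only an illustration, not the project's actual code: the table names, column names, S3 path, IAM role ARN, region, and the JSON source format are all assumptions made for the example. The helper functions are hypothetical and only build the SQL text, which would then be executed against the cluster by a database client.

```python
from typing import Sequence


def build_copy_statement(table: str, s3_path: str, iam_role_arn: str,
                         region: str = "us-west-2") -> str:
    """Build a Redshift COPY statement that loads files from S3 into a staging table.

    Assumption: the source files are JSON and can be mapped with 'auto'.
    The values are interpolated directly, so they must come from trusted config.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS JSON 'auto'\n"
        f"REGION '{region}';"
    )


def build_insert_statement(target: str, columns: Sequence[str], staging: str) -> str:
    """Build an INSERT ... SELECT that fills an analytics table from a staging table."""
    cols = ", ".join(columns)
    return (
        f"INSERT INTO {target} ({cols})\n"
        f"SELECT DISTINCT {cols}\n"
        f"FROM {staging};"
    )


if __name__ == "__main__":
    # Hypothetical names, used purely for illustration.
    print(build_copy_statement(
        "staging_events",
        "s3://example-bucket/log_data",
        "arn:aws:iam::123456789012:role/example-redshift-role",
    ))
    print(build_insert_statement("users", ["user_id", "first_name"], "staging_events"))
```

Keeping the load step (COPY into staging) separate from the transform step (INSERT ... SELECT into analytics tables) means each stage can be rerun and checked on its own.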

Tax Identification Number Crawling

Web Scraping Data Wrangling

Used Selenium to crawl web data for more than 2000 companies (Tax Identification Number, Company Name, and Address) and wrote the results to an Excel file with the Openpyxl package.

Alternator Market Analytics

Business Intelligence Data Cleaning

Cleaned over 120k raw data records, extracted valuable categories of the alternator market in Vietnam, and visualized the results to gain multidimensional insights using Python and Power BI.

ERP Item Code
Automation

ERP Automation

Standardized naming rules for 12k company products managed in Microsoft Dynamics 365 and created a Python automation tool that lets IT admins import, edit, and manage product information.