About

Paul Cho

Data Science Lead

Skills

Machine Learning & AI

Python, PyTorch, Sentence Transformers, LLM Orchestration, Embeddings & Vector Search, NLP, RAG

Data & Infrastructure

SQL, PostgreSQL, Supabase, pgvector, Pandas, NumPy, ETL Pipelines

Web & Product

TypeScript, Next.js, React, Tailwind CSS, Vercel, Sanity CMS

Tools & Workflow

Git, Docker, Google Cloud, Gemini API, REST APIs, CI/CD

Now

Updated March 27, 2026

Building Araverus — a financial news intelligence platform that aggregates coverage across publications and clusters articles into developing narratives using ML embeddings and LLM-powered analysis.

Day to day, I work across the full stack: designing embedding pipelines for semantic article matching, tuning retrieval thresholds on real production data, and building the Next.js frontend that surfaces these insights to users. The entire system runs autonomously on a daily schedule for about $11/month.

I'm currently focused on improving the narrative threading system: replacing heuristic-based clustering with an LLM judge that understands editorial context, not just vector similarity.

Outside of Araverus, I'm exploring retrieval-augmented generation patterns and how small, specialized models can replace expensive API calls in production ML pipelines.