Built a Python-based, containerized ETL pipeline that processes large-scale Common Crawl WET files into AI-ready datasets.
All data processing stages (ingestion, filtering, toxicity screening, deduplication, normalization, and tokenization) are implemented
in modular Python scripts, each running in its own Docker container on AWS Fargate via AWS Batch. Infrastructure, including VPC
networking, IAM roles, S3 storage, Batch compute environments, and Step Functions orchestration, is fully provisioned and managed with
Terraform for repeatable deployments.
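As an illustration of how modular stages like these can be chained, here is a minimal sketch covering only normalization, length filtering, and exact deduplication. It is not the pipeline's actual code: the function names, the MIN_CHARS threshold, and the stage ordering (normalizing first so whitespace variants hash identically) are assumptions made for this example.

```python
import hashlib
import re
import unicodedata
from typing import Iterable, Iterator, List

# Hypothetical threshold; the real pipeline's filtering rules are not shown here.
MIN_CHARS = 20


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def keep(text: str) -> bool:
    """Length filter: drop documents shorter than MIN_CHARS characters."""
    return len(text) >= MIN_CHARS


def deduplicate(docs: Iterable[str]) -> Iterator[str]:
    """Exact deduplication: yield each document once, keyed by SHA-256 digest."""
    seen = set()
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield doc


def run_stages(raw_docs: Iterable[str]) -> List[str]:
    """Chain the stages lazily: normalize, then filter, then deduplicate."""
    normalized = (normalize(d) for d in raw_docs)
    filtered = (d for d in normalized if keep(d))
    return list(deduplicate(filtered))


if __name__ == "__main__":
    sample = [
        "Hello   world, this is a test page.",
        "Hello world, this is a test page.",
        "short",
        "Another\ndocument with enough text.",
    ]
    for doc in run_stages(sample):
        print(doc)
```

In a containerized setup, each of these functions would live in its own script and read from and write to S3 between stages; the in-memory chaining above is only for readability.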
A web chat application for AI conversation and mental health support, with real-time messaging over WebSockets. The app uses the OpenAI API for custom AI chat functionality and Stripe for payment and subscription handling. Built with HTML, CSS, JavaScript, React, Node.js, and Express.js, with JWT for authentication, MySQL for structured data, and MongoDB for chat history storage.
Watch the web chat app in action with a video that highlights key features and functionality.