Fine-Tuning Large Language Models with a Production-Grade Pipeline


Introduction - Solving cloud resources and reproducibility for LLMs. A few weeks ago, I wrote a post about the challenges of training large ML models, in particular the need for more computing power and the complexity of managing cloud resources, and the difficulty of keeping track of ML experiments and reproducing results. There I proposed a solution to these problems: using SkyPilot and DVC to manage cloud resources and track experiments, respectively....

September 8, 2023
ML experiments in the cloud with SkyPilot and DVC


Introduction. One of the things that makes machine learning hard is that you have to run a lot of experiments. You have to try different models, different data sets, different hyperparameters, different features. And each experiment can take a long time to run, especially if you’re working on deep learning problems. You can’t just run them on your laptop or desktop. You need more computing power, and you need it fast....

August 10, 2023

Week 1: Kick-starting an ML project

Slides 🖼️ Week 1: ML project lifecycle and MLOps best practices. Learning objectives: understand the core philosophy behind MLOps ideas; apply best practices for establishing ML project structure and dependency management; manage project dependencies with pip and virtualenv; version datasets with DVC. Project Introduction: Problem Description and Dataset. This dataset contains 10,000 records, each of which corresponds to a different user of the bank. The target is Exited, a binary variable that describes whether the user decided to leave the bank....

Week 2: ML Pipelines, Reproducibility and Experimentation

Slides 🖼️ Week 2: ML Pipelines, Reproducibility and Experimentation. Learning objectives: refactor a Jupyter notebook into a reproducible ML pipeline; version the artifacts of an ML pipeline in remote storage; iterate over a large number of ML experiments in a disciplined way. Steps: Refactor the Jupyter notebook into a DVC pipeline (docs: https://dvc.org/doc/start/data-pipelines). Create the following files to read parameter values from a file. params.yaml:
    base:
      project: bank_customer_churn
      raw_data_dir: data/raw
      countries:
        - France
        - Spain
      feat_cols:
        - CreditScore
        - Age
        - Tenure
        - Balance
        - NumOfProducts
        - HasCrCard
        - IsActiveMember
        - EstimatedSalary
      targ_col: Exited
      random_state: 42
    data_split:
      test_size: 0....
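As a rough illustration of how a pipeline stage might consume these parameters, here is a minimal Python sketch. It assumes params.yaml has already been parsed into a dictionary (for example with PyYAML's yaml.safe_load), that the listed keys sit under base, and that the raw data has a Geography column holding the country. The select_rows helper and the sample records are hypothetical and are not taken from the course code.

```python
from typing import Any, Dict, List

# Parameters as they might look once params.yaml has been parsed
# (e.g. with yaml.safe_load). Only keys visible in the excerpt are used;
# feat_cols is shortened and the truncated data_split section is omitted.
PARAMS: Dict[str, Any] = {
    "base": {
        "countries": ["France", "Spain"],
        "feat_cols": ["CreditScore", "Age"],
        "targ_col": "Exited",
    }
}

# Made-up records for illustration only; "Geography" is an assumed column name.
SAMPLE: List[Dict[str, Any]] = [
    {"Geography": "France", "CreditScore": 600, "Age": 40, "Tenure": 3, "Exited": 0},
    {"Geography": "Germany", "CreditScore": 650, "Age": 35, "Tenure": 5, "Exited": 1},
    {"Geography": "Spain", "CreditScore": 700, "Age": 50, "Tenure": 1, "Exited": 1},
]


def select_rows(
    records: List[Dict[str, Any]], params: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Keep rows from the configured countries and only the configured columns."""
    base = params["base"]
    keep = base["feat_cols"] + [base["targ_col"]]
    return [
        {col: row[col] for col in keep}
        for row in records
        if row["Geography"] in base["countries"]
    ]


if __name__ == "__main__":
    for row in select_rows(SAMPLE, PARAMS):
        print(row)
```

Keeping the country filter and column names in the parameter file rather than in the code means an experiment can change them without touching the stage itself.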

Week 3: CI/CD for ML and ML-based Web API

Slides 🖼️ Week 3: CI/CD for ML. Learning objectives: learn the basics of CI/CD; leverage the power of CI/CD tools for ML projects with CML; integrate an ML model into the FastAPI framework; build and test a Docker container running a web API service; deploy the resulting Docker container to the cloud. Steps: Introduction to GitHub Actions and CML (Introduction to GitHub Actions; Introduction to CML). CI/CD: Automatic reporting for model-related changes. Add PERSONAL_ACCESS_TOKEN, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to GH secrets: https://docs....

Week 4: Monitoring for ML Projects

Slides 🖼️ Week 4: Data Drift Monitoring for ML Projects. Learning objectives: distinguish between application monitoring and ML monitoring; use the Alibi Detect framework to detect data drift. Steps: Introduction to Data Drift Monitoring (What’s data drift and why do we need to monitor for it? Intro to Alibi Detect). Add Churn_Modelling_Germany.csv to data/more_data/. Add a /more_data entry to data/.gitignore. Create and explore notebooks/DriftDetection.ipynb. Incorporate drift detection into the DVC pipeline: create src/stages/drift_detector....
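To make the idea of data drift more concrete, here is a small standard-library Python sketch. It does not use Alibi Detect and is not the course's drift_detector stage; it only illustrates one common underlying idea, a two-sample Kolmogorov-Smirnov comparison between a reference sample of a feature and a newer sample. The function names, the alpha default, and the toy numbers are assumptions for illustration, and the critical value uses the usual large-sample approximation.

```python
import math
from bisect import bisect_right
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Largest gap between the empirical CDFs of the two samples."""
    ref, cur = sorted(reference), sorted(current)
    d = 0.0
    # The supremum of |F_ref - F_cur| is reached at one of the sample points.
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        d = max(d, abs(f_ref - f_cur))
    return d


def drift_detected(
    reference: Sequence[float], current: Sequence[float], alpha: float = 0.05
) -> bool:
    """Flag drift when the KS statistic exceeds the approximate critical value."""
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


if __name__ == "__main__":
    # Toy feature values, invented for illustration only.
    old_values = [1, 2, 3, 4, 5]
    new_values = [11, 12, 13, 14, 15]
    print(ks_statistic(old_values, new_values), drift_detected(old_values, new_values))
```

A check like this runs per feature; a library such as Alibi Detect additionally handles multiple features and corrections for multiple testing, which this sketch leaves out.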