

Machine Learning Production Systems


Machine Learning Production Systems - Description
Using machine learning for products, services, and critical business processes is quite different from using ML in an academic or research setting, especially for recent ML graduates and those moving from research to a commercial environment. Whether you currently work to create products and services that use ML, or would like to in the future, this practical book gives you a broad view of the entire field.

Authors Robert Crowe, Hannes Hapke, Emily Caveness, and Di Zhu help you identify topics that you can dive into more deeply, along with reference materials and tutorials that teach you the details. You'll learn the state of the art of machine learning engineering, including a wide range of topics such as modeling, deployment, and MLOps. You'll learn the basics and advanced aspects to understand the production ML lifecycle.

This book provides four in-depth sections that cover all aspects of machine learning engineering:

Data: collecting, labeling, validating, automating, and preprocessing data; feature engineering and selection; data journey and storage
Modeling: high-performance modeling; model resource management techniques; model analysis and interpretability; neural architecture search
Deployment: model serving patterns and infrastructure for ML models and LLMs; management and delivery; monitoring and logging
Productionalizing: ML pipelines; classifying unstructured texts and images; genAI model pipelines

Table of Contents:

Foreword
Preface
Who Should Read This Book
Why We Wrote This Book
Navigating This Book
Conventions Used in This Book
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
Robert
Hannes
Emily
Di
1. Introduction to Machine Learning Production Systems
What Is Production Machine Learning?
Benefits of Machine Learning Pipelines
Focus on Developing New Models, Not on Maintaining Existing Models
Prevention of Bugs
Creation of Records for Debugging and Reproducing Results
Standardization
The Business Case for ML Pipelines
When to Use Machine Learning Pipelines
Steps in a Machine Learning Pipeline
Data Ingestion and Data Versioning
Data Validation
Feature Engineering
Model Training and Model Tuning
Model Analysis
Model Deployment
Looking Ahead
2. Collecting, Labeling, and Validating Data
Important Considerations in Data Collection
Responsible Data Collection
Labeling Data: Data Changes and Drift in Production ML
Labeling Data: Direct Labeling and Human Labeling
Validating Data: Detecting Data Issues
Validating Data: TensorFlow Data Validation
Skew Detection with TFDV
Types of Skew
Example: Spotting Imbalanced Datasets with TensorFlow Data Validation
Conclusion
3. Feature Engineering and Feature Selection
Introduction to Feature Engineering
Preprocessing Operations
Feature Engineering Techniques
Normalizing and Standardizing
Bucketizing
Feature Crosses
Dimensionality and Embeddings
Visualization
Feature Transformation at Scale
Choose a Framework That Scales Well
Avoid Training/Serving Skew
Consider Instance-Level Versus Full-Pass Transformations
Using TensorFlow Transform
Analyzers
Code Example
Feature Selection
Feature Spaces
Feature Selection Overview
Filter Methods
Wrapper Methods
Forward selection
Backward elimination
Recursive feature elimination
Code example
Embedded Methods
Feature and Example Selection for LLMs and GenAI
Example: Using TF Transform to Tokenize Text
Benefits of Using TF Transform
Alternatives to TF Transform
Conclusion
4. Data Journey and Data Storage
Data Journey
ML Metadata
Using a Schema
Schema Development
Schema Environments
Changes Across Datasets
Enterprise Data Storage
Feature Stores
Metadata
Precomputed features
Time travel
Data Warehouses
Data Lakes
Conclusion
5. Advanced Labeling, Augmentation, and Data Preprocessing
Advanced Labeling
Semi-Supervised Labeling
Label propagation
Sampling techniques
Active Learning
Margin sampling
Other sampling techniques
Weak Supervision
Advanced Labeling Review
Data Augmentation
Example: CIFAR-10
Other Augmentation Techniques
Data Augmentation Review
Preprocessing Time Series Data: An Example
Windowing
Sampling
Conclusion
6. Model Resource Management Techniques
Dimensionality Reduction: Dimensionality Effect on Performance
Example: Word Embedding Using Keras
Curse of Dimensionality
Adding Dimensions Increases Feature Space Volume
Dimensionality Reduction
Three approaches
Algorithmic dimensionality reduction
Principal component analysis
Quantization and Pruning
Mobile, IoT, Edge, and Similar Use Cases
Quantization
Benefits and process of quantization
MobileNets
Post-training quantization
Quantization-aware training
Comparing results
Example: Quantizing models with TF Lite
Optimizing Your TensorFlow Model with TF Lite
Optimization Options
Pruning
The Lottery Ticket Hypothesis
Pruning in TensorFlow
Knowledge Distillation
Teacher and Student Networks
Knowledge Distillation Techniques
TMKD: Distilling Knowledge for a Q&A Task
Increasing Robustness by Distilling EfficientNets
Conclusion
7. High-Performance Modeling
Distributed Training
Data Parallelism
Synchronous versus asynchronous training
Distribution awareness
tf.distribute: Distributed training in TensorFlow
OneDeviceStrategy
MirroredStrategy
ParameterServerStrategy
Fault tolerance
Efficient Input Pipelines
Input Pipeline Basics
Input Pipeline Patterns: Improving Efficiency
Optimizing Your Input Pipeline with TensorFlow Data
Prefetching
Parallelizing data transformation
Caching
Training Large Models: The Rise of Giant Neural Nets and Parallelism
Potential Solutions and Their Shortcomings
Gradient accumulation
Swapping
Parallelism, revisited in the context of giant neural nets
Pipeline Parallelism to the Rescue?
Conclusion
8. Model Analysis
Analyzing Model Performance
Black-Box Evaluation
Performance Metrics and Optimization Objectives
Advanced Model Analysis
TensorFlow Model Analysis
The Learning Interpretability Tool
Advanced Model Debugging
Benchmark Models
Sensitivity Analysis
Random attacks
Partial dependence plots
Vulnerability to attacks
Measuring model vulnerability
Hardening your models
Residual Analysis
Model Remediation
Discrimination Remediation
Fairness
Fairness Evaluation
True/false positive/negative rates
Accuracy and AUC
Fairness Considerations
Continuous Evaluation and Monitoring
Conclusion
9. Interpretability
Explainable AI
Model Interpretation Methods
Method Categories
Intrinsic or post hoc?
Model specific or model agnostic?
Local or global?
Intrinsically Interpretable Models
Feature importance
Lattice models
Model-Agnostic Methods
Partial dependence plots
Permutation feature importance
Local Interpretable Model-Agnostic Explanations
Shapley Values
The SHAP Library
Testing Concept Activation Vectors
AI Explanations
Integrated gradients
XRAI
Example: Exploring Model Sensitivity with SHAP
Regression Models
Natural Language Processing Models
Conclusion
10. Neural Architecture Search
Hyperparameter Tuning
Introduction to AutoML
Key Components of NAS
Search Spaces
Macro search space
Micro search space
Search Strategies
Performance Estimation Strategies
Simple approach to performance estimation
More efficient performance estimation
AutoML in the Cloud
Amazon SageMaker Autopilot
Microsoft Azure Automated Machine Learning
Google Cloud AutoML
Using AutoML
Generative AI and AutoML
Conclusion
11. Introduction to Model Serving
Model Training
Model Prediction
Latency
Throughput
Cost
Resources and Requirements for Serving Models
Cost and Complexity
Accelerators
Feeding the Beast
Model Deployments
Data Center Deployments
Mobile and Distributed Deployments
Model Servers
Managed Services
Conclusion
12. Model Serving Patterns
Batch Inference
Batch Throughput
Batch Inference Use Cases
Product recommendations
Sentiment analysis
Demand forecasting
ETL for Distributed Batch and Stream Processing Systems
Introduction to Real-Time Inference
Synchronous Delivery of Real-Time Predictions
Asynchronous Delivery of Real-Time Predictions
Optimizing Real-Time Inference
Real-Time Inference Use Cases
Serving Model Ensembles
Ensemble Topologies
Example Ensemble
Ensemble Serving Considerations
Model Routers: Ensembles in GenAI
Data Preprocessing and Postprocessing in Real Time
Training Transformations Versus Serving Transformations
Windowing
Options for Preprocessing
Enter TensorFlow Transform
Postprocessing
Inference at the Edge and at the Browser
Challenges
Balancing energy consumption with processing power
Performing model retraining and updates
Securing the user data
Model Deployments via Containers
Training on the Device
Federated Learning
Runtime Interoperability
Inference in Web Browsers
Conclusion
13. Model Serving Infrastructure
Model Servers
TensorFlow Serving
Servables
Servable versions
Models
Loaders
Sources
Aspired versions
Managers
Core
NVIDIA Triton Inference Server
TorchServe
Building Scalable Infrastructure
Containerization
Traditional Deployment Era
Virtualized Deployment Era
Container Deployment Era
The Docker Containerization Framework
Docker daemon
Docker client
Docker registry
Docker objects
Docker image
Docker container
Container Orchestration
Kubernetes
Kubernetes components
Containers on clouds
Kubeflow
Reliability and Availability Through Redundancy
Observability
High Availability
Automated Deployments
Hardware Accelerators
GPUs
TPUs
Conclusion
14. Model Serving Examples
Example: Deploying TensorFlow Models with TensorFlow Serving
Exporting Keras Models for TF Serving
Setting Up TF Serving with Docker
Basic Configuration of TF Serving
Making Model Prediction Requests with REST
Making Model Prediction Requests with gRPC
Getting Predictions from Classification and Regression Models
Using Payloads
Getting Model Metadata from TF Serving
Making Batch Inference Requests
Example: Profiling TF Serving Inferences with TF Profiler
Prerequisites
TensorBoard Setup
Model Profile
Example: Basic TorchServe Setup
Installing the TorchServe Dependencies
Exporting Your Model for TorchServe
Setting Up TorchServe
Request handlers
TorchServe configuration
Making Model Prediction Requests
Making Batch Inference Requests
Setting batch configuration via config.properties
Setting batch configuration via REST request
Conclusion
15. Model Management and Delivery
Experiment Tracking
Experimenting in Notebooks
Experimenting Overall
Not just one big file
Tracking runtime parameters
Tools for Experiment Tracking and Versioning
TensorBoard
Tools for organizing experiment results
Introduction to MLOps
Data Scientists Versus Software Engineers
ML Engineers
ML in Products and Services
MLOps
MLOps Methodology
MLOps Level 0
MLOps Level 1
MLOps Level 2
Components of an Orchestrated Workflow
Three Types of Custom Components
Python Function-Based Components
Container-Based Components
Fully Custom Components
TFX Deep Dive
TFX SDK
Intermediate Representation
Runtime
Implementing an ML Pipeline Using TFX Components
Advanced Features of TFX
Component dependency
Data dependency
Task dependency
Importer
Conditional execution
Managing Model Versions
Approaches to Versioning Models
Versioning proposal
Arbitrary grouping
Black-box functional model
Pipeline execution versioning
Model Lineage
Model Registries
Continuous Integration and Continuous Deployment
Continuous Integration
Continuous Delivery
Progressive Delivery
Blue/Green Deployment
Canary Deployment
Live Experimentation
A/B testing
Multi-armed bandits
Contextual bandits
Conclusion
16. Model Monitoring and Logging
The Importance of Monitoring
Observability in Machine Learning
What Should You Monitor?
Custom Alerting in TFX
Logging
Distributed Tracing
Monitoring for Model Decay
Data Drift and Concept Drift
Model Decay Detection
Supervised Monitoring Techniques
Statistical process control
Sequential analysis
Error distribution monitoring
Unsupervised Monitoring Techniques
Clustering
Feature distribution monitoring
Model-dependent monitoring
Mitigating Model Decay
Retraining Your Model
When to Retrain
Automated Retraining
Conclusion
17. Privacy and Legal Requirements
Why Is Data Privacy Important?
What Data Needs to Be Kept Private?
Harms
Only Collect What You Need
GenAI Data Scraped from the Web and Other Sources
Legal Requirements
The GDPR and the CCPA
The GDPR's Right to Be Forgotten
Pseudonymization and Anonymization
Differential Privacy
Local and Global DP
Epsilon-Delta DP
Applying Differential Privacy to ML
Differentially Private Stochastic Gradient Descent
Private Aggregation of Teacher Ensembles
Confidential and Private Collaborative Learning
TensorFlow Privacy Example
Federated Learning
Encrypted ML
Conclusion
18. Orchestrating Machine Learning Pipelines
An Introduction to Pipeline Orchestration
Why Pipeline Orchestration?
Directed Acyclic Graphs
Pipeline Orchestration with TFX
Interactive TFX Pipelines
Converting Your Interactive Pipeline for Production
Orchestrating TFX Pipelines with Apache Beam
Orchestrating TFX Pipelines with Kubeflow Pipelines
Introduction to Kubeflow Pipelines
Installation and Initial Setup
Accessing Kubeflow Pipelines
The Workflow from TFX to Kubeflow
OpFunc Functions
Orchestrating Kubeflow Pipelines
Google Cloud Vertex Pipelines
Setting Up Google Cloud and Vertex Pipelines
Setting Up a Google Cloud Service Account
Orchestrating Pipelines with Vertex Pipelines
Executing Vertex Pipelines
Choosing Your Orchestrator
Interactive TFX
Apache Beam
Kubeflow Pipelines
Google Cloud Vertex Pipelines
Alternatives to TFX
Conclusion
19. Advanced TFX
Advanced Pipeline Practices
Configure Your Components
Import Artifacts
Use Resolver Node
Execute a Conditional Pipeline
Export TF Lite Models
Warm-Starting Model Training
Use Exit Handlers
Trigger Messages from TFX
Custom TFX Components: Architecture and Use Cases
Architecture of TFX Components
Use Cases of Custom Components
Using Function-Based Custom Components
Writing a Custom Component from Scratch
Defining Component Specifications
Defining Component Channels
Writing the Custom Executor
Writing the Custom Driver
Assembling the Custom Component
Using Our Basic Custom Component
Implementation Review
Reusing Existing Components
Creating Container-Based Custom Components
Which Custom Component Is Right for You?
TFX-Addons
Conclusion
20. ML Pipelines for Computer Vision Problems
Our Data
Our Model
Custom Ingestion Component
Data Preprocessing
Exporting the Model
Our Pipeline
Data Ingestion
Data Preprocessing
Model Training
Model Evaluation
Model Export
Putting It All Together
Executing on Apache Beam
Executing on Vertex Pipelines
Model Deployment with TensorFlow Serving
Conclusion
21. ML Pipelines for Natural Language Processing
Our Data
Our Model
Ingestion Component
Data Preprocessing
Putting the Pipeline Together
Executing the Pipeline
Model Deployment with Google Cloud Vertex
Registering Your ML Model
Creating a New Model Endpoint
Deploying Your ML Model
Requesting Predictions from the Deployed Model
Cleaning Up Your Deployed Model
Conclusion
22. Generative AI
Generative Models
GenAI Model Types
Agents and Copilots
Pretraining
Pretraining Datasets
Embeddings
Self-Supervised Training with Masks
Fine-Tuning
Fine-Tuning Versus Transfer Learning
Fine-Tuning Datasets
Fine-Tuning Considerations for Production
Fine-Tuning Versus Model APIs
Parameter-Efficient Fine-Tuning
LoRA
S-LoRA
Human Alignment
Reinforcement Learning from Human Feedback
Reinforcement Learning from AI Feedback
Direct Preference Optimization
Prompting
Chaining
Retrieval Augmented Generation
ReAct
Evaluation
Evaluation Techniques
Benchmarking Across Models
LMOps
GenAI Attacks
Jailbreaks
Prompt Injection
Responsible GenAI
Design for Responsibility
Conduct Adversarial Testing
Constitutional AI
Conclusion
23. The Future of Machine Learning Production Systems and Next Steps
Let's Think in Terms of ML Systems, Not ML Models
Bringing ML Systems Closer to Domain Experts
Privacy Has Never Been More Important
Conclusion
Index