Why Use Pipelines?
Pipelines help prevent data leakage. When the whole pipeline is passed to a cross-validation routine, every transformation (such as scaling or encoding) is fitted only on the training folds, so no statistics from the validation fold reach the model.
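To make the leakage concrete, here is a minimal sketch that contrasts a scaler fitted on all rows with one refitted on the training rows of each fold. The synthetic data and the names `leaky_scaler`, `fold_scaler`, and `fold_means` are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

# Synthetic features (illustrative data, not from a real dataset)
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))

# Leaky: the scaler sees every row, including rows later used for validation
leaky_scaler = StandardScaler().fit(X)

# Leak-free: the scaler is refitted on the training rows of each fold
fold_means = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(X):
    fold_scaler = StandardScaler().fit(X[train_idx])
    fold_means.append(fold_scaler.mean_)
    # Validation rows are only transformed, never used for fitting
    X_val_scaled = fold_scaler.transform(X[val_idx])

print("Mean from all rows:       ", leaky_scaler.mean_)
print("Mean from fold 1 training:", fold_means[0])
```

The two printed means differ slightly because the per-fold scaler never sees the validation rows. A pipeline performs this per-fold refitting automatically when it is cross-validated as a single estimator.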
Pipeline Example
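from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Chain preprocessing and the model into a single estimator
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression())
])

# Fits the scaler on X_train, transforms X_train, then fits the classifier
pipeline.fit(X_train, y_train)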
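The leakage protection described earlier applies when the pipeline itself is handed to a cross-validation helper. The following is a minimal sketch using `cross_val_score`. The synthetic dataset from `make_classification` and the choice of five folds are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data (illustrative only)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression()),
])

# For each of the 5 folds, a fresh copy of the pipeline is fitted on the
# training portion only, then scored on the held-out portion
scores = cross_val_score(pipeline, X, y, cv=5)
print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

Passing the pipeline, rather than pre-scaled data, is what keeps the scaler from seeing the held-out fold.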