sumnplot.discretisation.QuantileDiscretiser
- class sumnplot.discretisation.QuantileDiscretiser(variable, quantiles=(0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9, 1.0))[source]
Bases: sumnplot.discretisation.Discretiser
Quantile discretisation.
This transformer uses cut points defined by quantiles of the given variable.
Note that this transformer supports weighted quantiles.
- Parameters
variable (str) – Column to discretise in X, when the transform method is called.
quantiles (tuple, default = (0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)) – Quantiles defining the cut points to bucket variable at.
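The signature displays values such as 0.30000000000000004 where the parameter description lists 0.3. The two describe the same deciles; the long form is ordinary binary floating-point rounding. The sketch below reproduces the displayed tuple under the assumption that the deciles are generated by scaling integers by 0.1, which is one plausible origin and not confirmed library code. The names default_quantiles and rounded_quantiles are illustrative only.

```python
# Assumption: the default deciles come from multiplying 0..10 by 0.1.
# This is a plausible origin of the displayed values, not the library's confirmed code.
default_quantiles = tuple(i * 0.1 for i in range(11))

# Rounding to one decimal place recovers the values listed in the parameter description.
rounded_quantiles = tuple(round(q, 1) for q in default_quantiles)

print(default_quantiles)
print(rounded_quantiles)
```

Passing an explicit tuple such as (0, 0.25, 0.5, 0.75, 1) avoids the question entirely.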
- __init__(variable, quantiles=(0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6000000000000001, 0.7000000000000001, 0.8, 0.9, 1.0))[source]
Methods
__init__(variable[, quantiles])
fit(X[, y, sample_weight]) – Calculate cut points on the input data X.
fit_transform(X[, y]) – Fit to data, then transform it.
get_params([deep]) – Get parameters for this estimator.
set_params(**params) – Set the parameters of this estimator.
transform(X) – Cut variable in X at cut_points.
- fit(X, y=None, sample_weight=None)[source]
Calculate cut points on the input data X.
Cut points are (potentially weighted) quantiles specified when initialising the transformer.
- Parameters
X (pd.DataFrame) – DataFrame containing column to discretise. This column is defined by the variable attribute.
y (pd.Series, default = None) – Response variable. Not used. Only implemented for compatibility with scikit-learn.
sample_weight (pd.Series or np.ndarray, default = None) – Optional, sample weights for each record in X.
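To illustrate what a weighted quantile means here, the following is a minimal sketch using only the standard library. It assumes an inverted-CDF definition: the quantile at level q is the smallest value whose cumulative weight reaches q times the total weight. The interpolation scheme the transformer actually uses is not stated in this page, and the function name weighted_quantiles is hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def weighted_quantiles(values, quantiles, sample_weight=None):
    """Hypothetical helper: inverted-CDF weighted quantiles (an assumed definition)."""
    if sample_weight is None:
        # Without weights, every record counts equally.
        sample_weight = [1.0] * len(values)

    # Sort values and carry their weights along.
    pairs = sorted(zip(values, sample_weight))
    sorted_values = [v for v, _ in pairs]
    cumulative = list(accumulate(w for _, w in pairs))
    total = cumulative[-1]

    cut_points = []
    for q in quantiles:
        # First position where the cumulative weight reaches q * total.
        idx = bisect_left(cumulative, q * total)
        cut_points.append(sorted_values[min(idx, len(sorted_values) - 1)])
    return cut_points


unweighted = weighted_quantiles([1, 2, 3, 4], (0, 0.5, 1))
weighted = weighted_quantiles([1, 2, 3, 4], (0, 0.5, 1), sample_weight=[1, 1, 1, 5])
```

With a heavy weight on the largest value, the median cut point moves from 2 to 4, which is the effect sample_weight is intended to have on the fitted cut points.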
- fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.
- Returns
X_new – Transformed array.
- Return type
ndarray of shape (n_samples, n_features_new)
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
**params (dict) – Estimator parameters.
- Returns
self – Estimator instance.
- Return type
estimator instance
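The <component>__<parameter> form can be seen with any scikit-learn Pipeline. The sketch below uses StandardScaler as a stand-in component rather than this transformer; the step name "scaler" is an arbitrary choice for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A one-step pipeline; "scaler" is an arbitrary step name chosen for this example.
pipe = Pipeline([("scaler", StandardScaler())])

# set_params returns the estimator itself, so the call can be chained.
returned = pipe.set_params(scaler__with_mean=False)

# deep=True exposes nested parameters under <component>__<parameter> keys.
deep_params = pipe.get_params(deep=True)
# deep=False lists only the pipeline's own parameters.
shallow_params = pipe.get_params(deep=False)
```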
- transform(X)
Cut variable in X at cut_points. This method uses the pd.cut function.
A specific null category is added to the cut output.
- Parameters
X (pd.DataFrame) – DataFrame containing column to discretise. This column is defined by the variable attribute.
- Returns
variable_cut – Discretised variable.
- Return type
pd.Series
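The behaviour described for transform can be approximated with plain pandas. This is a sketch under stated assumptions, not the library's implementation: the cut points are made up, include_lowest=True is an assumed setting, and the label "Null" for the null category is hypothetical.

```python
import numpy as np
import pandas as pd

# Example column with one missing value.
values = pd.Series([1.0, 2.5, 4.0, np.nan, 5.0], name="a")

# Hypothetical cut points, standing in for those calculated by fit.
cut_points = [1.0, 2.5, 5.0]

# Bucket the values; include_lowest=True (an assumption) keeps the minimum in the first bin.
variable_cut = pd.cut(values, bins=cut_points, include_lowest=True)

# Add an explicit category for missing values; the label "Null" is an assumption.
variable_cut = variable_cut.cat.add_categories("Null").fillna("Null")
```

After this step, every record falls into a named bucket, including the one whose value was missing.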