
Question about GetAge()  



As GetAge() is applied to all numerical_features, could someone advise on how it is able to correctly identify the YearBuilt column to apply the GetAge() function in the pipeline, instead of applying the transformation to all numerical columns? Thanks in advance!

class GetAge(BaseEstimator, TransformerMixin):
    """Custom Transformer: Calculate age (years only) relative to current year. Note that
    the col values will be replaced but the original col name remains. When the transformer is
    used in a pipeline, this is not an issue as the names are not used. However, if the data
    from the pipeline is to be converted back to a DataFrame, then the col name change should
    be done to reflect the correct data content."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        current_year = int(
        X.apply(lambda x: current_year - x)
        """TASK: Replace the 'YearBuilt' column values with the calculated age (subtract the
        original values from the current year)."""

        return X


preprocess = make_column_transformer(
    (GetAge(), numerical_features),
    (StandardScaler(), numerical_features),
    (OneHotEncoder(), categorical_features),
)

Hey! Which lesson is this from? Some extra context will be useful in letting me help you.

This is from the exercise in AI4I-5 SUP-3 Regression
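
For anyone landing on this thread: as posted, GetAge() has no way of picking out YearBuilt by itself. A column transformer hands each transformer exactly the columns listed next to it, so X.apply(...) inside transform() would run on every column in numerical_features. The transformers in a column transformer also run independently on the original data and their outputs are placed side by side, so listing GetAge() and StandardScaler() on the same columns does not chain them. Below is a minimal sketch of one possible fix, not the course's official solution: give GetAge() only the year column and chain it with the scaler in a small pipeline. The toy DataFrame, the LotArea and Street columns, and the names year_features and other_numerical_features are made up for illustration, and date.today().year is just one assumed way of getting the current year.

```python
from datetime import date

import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


class GetAge(BaseEstimator, TransformerMixin):
    """Replace year values with an age in years relative to the current year."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # Assumption: the current year is taken from the system clock.
        current_year = date.today().year
        # Every column passed in is converted, so the caller decides
        # which columns this transformer sees.
        return X.apply(lambda col: current_year - col)


# Hypothetical toy data; only the YearBuilt name comes from the thread.
df = pd.DataFrame({
    "YearBuilt": [1990, 2005, 2015],
    "LotArea": [8450.0, 9600.0, 11250.0],
    "Street": ["Pave", "Grvl", "Pave"],
})

year_features = ["YearBuilt"]             # hypothetical name
other_numerical_features = ["LotArea"]    # hypothetical name
categorical_features = ["Street"]

preprocess = make_column_transformer(
    # Age is computed first, then scaled, and only for the year column.
    (make_pipeline(GetAge(), StandardScaler()), year_features),
    (StandardScaler(), other_numerical_features),
    (OneHotEncoder(), categorical_features),
)

features = preprocess.fit_transform(df)
print(features.shape)
```

With this layout YearBuilt is turned into an age and then scaled, the remaining numerical columns are only scaled, and no column is emitted twice. An alternative is to keep passing all numerical columns and have transform() modify only X["YearBuilt"], but that hard-codes the column name inside the transformer.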

