Encountered "AttributeError" when running "train_test_split(preprocessed_data, output_var, ..." after "..RandomOverSampler.."

0

Hi,

Good day. I need some help.

I encountered an "AttributeError" when running "x_train_res, y_train_res = over_sample.fit_resample(x_train, y_train.ravel())" in

SUP-4: Classification / Handling Data Imbalance / Exercise / Balancing the data.

The preceding code (the preprocess assignment, preprocess.fit_transform, and the RandomOverSampler construction) seems OK and did not raise any error.

The code is included below. Any idea what the cause is?

~~~~~~~~~~~~~~~~~~~~~

preprocess = ColumnTransformer(
    transformers=[
        ('standardscaler', StandardScaler(), num_features),
        ('onehotencoder', OneHotEncoder(), cat_features)
    ],
    remainder='passthrough'
)

preprocessed_data = preprocess.fit_transform(input_data)

x_train, x_test, y_train, y_test = train_test_split(preprocessed_data, output_var, test_size=0.3, random_state=42)

1 over_sample = RandomOverSampler(random_state=0)
----> 2 x_train_res, y_train_res = over_sample.fit_resample(x_train, y_train.ravel())
3 print("After OverSampling, counts of label '1': {}".format(sum(y_train_res==1)))
4 print("After OverSampling, counts of label '0': {} n".format(sum(y_train_res==0)))

C:\ProgramData\Anaconda3\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
75 check_classification_targets(y)
76 arrays_transformer = ArraysTransformer(X, y)
---> 77 X, y, binarize_y = self._check_X_y(X, y)
78
79 self.sampling_strategy_ = check_sampling_strategy(

C:\ProgramData\Anaconda3\lib\site-packages\imblearn\over_sampling\_random_over_sampler.py in _check_X_y(self, X, y)
77 def _check_X_y(self, X, y):
78 y, binarize_y = check_target_type(y, indicate_one_vs_all=True)
---> 79 X, y = self._validate_data(
80 X, y, reset=True, accept_sparse=["csr", "csc"], dtype=None,
81 force_all_finite=False,

AttributeError: 'RandomOverSampler' object has no attribute '_validate_data'

3 Answers
1

Hi

I encountered exactly the same problem.

I have Anaconda on my personal laptop. Without thinking, I went to the Anaconda prompt and ran:

conda install -c conda-forge imbalanced-learn

There was no error message from this, but I got the same AttributeError as you when I ran the cell in the notebook.

Then I noticed that this package was for the Anaconda Cloud platform. I am not a technical person, but I thought I should use this at the prompt instead:

pip install -U imbalanced-learn

This time, the cell ran successfully without the AttributeError.

I am not sure what I did wrong initially, whether I should uninstall what was installed the first time round, or how that would be done. Perhaps the technical friends amongst us can advise. Thanks, ym.
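On the question of which install the notebook is actually using, here is a minimal sketch. It assumes the notebook kernel may be running a different Python from the one that the conda or pip command at the prompt acted on. It prints the interpreter behind the kernel and the file each package would be imported from, and it builds (without running) a pip command tied to that same interpreter. The helper name module_location is made up for illustration and is not part of any library.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# The Python interpreter that this notebook kernel (or script) is running on.
print("interpreter:", sys.executable)

# Where each package would be loaded from; None means this interpreter cannot see it.
for name in ("sklearn", "imblearn"):
    print(name, "->", module_location(name))

# A pip command bound to this exact interpreter, so an upgrade would land in the
# same environment the notebook imports from. Built here but NOT executed.
upgrade_cmd = [sys.executable, "-m", "pip", "install", "-U", "imbalanced-learn"]
print(" ".join(upgrade_cmd))
```

If the path printed for imblearn sits under a different environment than expected, that could explain why an install appeared to succeed while the notebook still failed. After any upgrade, the kernel needs a restart before the new version is picked up.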

 

 
0

Hi, 

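I'm also facing the same problem. If you are using Jupyter Notebook via Anaconda, the issue is with the installed version of scikit-learn: the traceback shows imbalanced-learn calling "_validate_data", a method it expects to inherit from scikit-learn's base estimator, so the scikit-learn in the environment is older than the installed imbalanced-learn expects. The imbalanced-learn team on GitHub is aware of this problem:

https://github.com/scikit-learn-contrib/imbalanced-learn/issues

Unfortunately, the solutions I can think of are quite messy, and the best way is probably to wait for Anaconda to update its scikit-learn package.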
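To check whether a version mismatch of this kind applies to a given setup, here is a small sketch using only the standard library (Python 3.8 or later). The 0.23 threshold is an assumption about which scikit-learn release first provides _validate_data; please verify it against the scikit-learn and imbalanced-learn release notes. The helpers installed_version, version_tuple and sklearn_too_old are hypothetical names, not part of either library.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# ASSUMPTION: scikit-learn 0.23 is the first release that provides _validate_data.
ASSUMED_MIN_SKLEARN = (0, 23)


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn '0.22.1' into (0, 22, 1); stops at the first non-numeric component."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def sklearn_too_old(sklearn_version):
    """True if the given scikit-learn version is below the assumed minimum."""
    return version_tuple(sklearn_version) < ASSUMED_MIN_SKLEARN


# Report what this interpreter has installed (None means not installed here).
for package in ("scikit-learn", "imbalanced-learn"):
    print(package, installed_version(package))

found = installed_version("scikit-learn")
if found is not None and sklearn_too_old(found):
    print("scikit-learn looks older than the assumed minimum; upgrading it may fix the error")
```

Run this in the same notebook that raises the error, since a terminal may be pointing at a different environment.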

0

Hi,

Thank you.

Will try.

Regards
