<Ensemble> Voting

1. Voting Regression

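- As the name suggests, the final result is decided by a vote among several models (for regression, the "vote" is the average of the models' predictions)

- Voting combines models built with different algorithms

- Bagging, by contrast, uses a single algorithm trained on different sample combinations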

1.1. Step

- Each model must be defined as a (name, estimator) tuple

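from sklearn.ensemble import VotingRegressor

# Each entry is a (name, estimator) tuple.
# The estimators (linear_reg, ridge, ...) are assumed to be defined beforehand.
single_models = [
    ('linear_reg', linear_reg),
    ('ridge', ridge),
    ('lasso', lasso),
    ('elasticnet_pipeline', elasticnet_pipeline),
    ('poly_pipeline', poly_pipeline)
]

# n_jobs=-1 fits the base models in parallel on all available cores
voting_regressor = VotingRegressor(single_models, n_jobs=-1)
voting_regressor.fit(x_train, y_train)
voting_pred = voting_regressor.predict(x_test)
# mse_eval is a user-defined evaluation helper, not part of scikit-learn
mse_eval('Voting Ensemble', voting_pred, y_test)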
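
The snippet above depends on models and data defined elsewhere. As a minimal self-contained sketch, the version below uses synthetic data from `make_regression` and three freshly created linear models (both are assumptions for illustration, not the post's actual setup) and checks that, with no weights, the ensemble prediction is simply the mean of the base models' predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Synthetic data stands in for the post's x_train / y_train (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each model is passed as a (name, estimator) tuple.
models = [
    ('linear_reg', LinearRegression()),
    ('ridge', Ridge(alpha=1.0)),
    ('lasso', Lasso(alpha=0.1)),
]

voting_regressor = VotingRegressor(models)
voting_regressor.fit(x_train, y_train)
voting_pred = voting_regressor.predict(x_test)

# Average the fitted base models' predictions by hand for comparison.
manual_pred = np.mean(
    [est.predict(x_test) for est in voting_regressor.estimators_], axis=0
)
print(np.allclose(voting_pred, manual_pred))
```

Passing a `weights` list to `VotingRegressor` would turn the plain mean into a weighted average.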

2. Voting Classifier (Voting Classification)

- The classifier has one important parameter

- voting = {'hard', 'soft'}

- hard method:

Takes the class predicted by the majority of the models
(E.g.) If the predicted classes are (1, 0, 0, 1, 1), class 1 gets 3 votes and class 0 gets 2 votes, so the final prediction is 1
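
The majority rule in this example can be sketched in plain Python. `hard_vote` is a hypothetical helper written for illustration, not a scikit-learn function, and its tie-breaking (first label encountered) may differ from the library's.

```python
from collections import Counter

def hard_vote(predictions):
    """Return the class label predicted by the most models."""
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label

votes = (1, 0, 0, 1, 1)  # one predicted class per model
print(Counter(votes))    # Counter({1: 3, 0: 2})
print(hard_vote(votes))  # 1
```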

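- soft method:

Averages each class's predicted probabilities across the models and takes the class with the highest average
(E.g.) If the predicted probabilities for class 0 are (0.4, 0.9, 0.9, 0.4, 0.4) and those for class 1 are (0.6, 0.1, 0.1, 0.6, 0.6), then
the final probability of class 0 is (0.4+0.9+0.9+0.4+0.4) / 5 = 0.6
the final probability of class 1 is (0.6+0.1+0.1+0.6+0.6) / 5 = 0.4
so the final prediction is class 0, even though a hard vote over the same five models (3 votes for class 1, 2 votes for class 0) would pick class 1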
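
The averaging in this example can be checked with a few lines of plain Python. `soft_vote` is a hypothetical helper for illustration only; each row holds one model's probabilities for (class 0, class 1).

```python
def soft_vote(probabilities):
    """Average per-class probabilities over models and pick the best class."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    averages = [
        sum(row[c] for row in probabilities) / n_models
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=lambda c: averages[c])
    return averages, best_class

# One row per model: (P(class 0), P(class 1)), taken from the example above.
probs = [
    (0.4, 0.6),
    (0.9, 0.1),
    (0.9, 0.1),
    (0.4, 0.6),
    (0.4, 0.6),
]
averages, best_class = soft_vote(probs)
print([round(a, 2) for a in averages])  # [0.6, 0.4]
print(best_class)                       # 0
```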

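$\quad\rightarrow$ the soft method is generally preferred, since it takes each model's confidence into account (every base model must be able to output class probabilities)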

2.1. Step

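from sklearn.ensemble import VotingClassifier

# models: a list of (name, classifier) tuples, assumed to be defined beforehand
# voting='hard' uses majority voting; set voting='soft' to average probabilities
vc = VotingClassifier(models, voting='hard')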
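
For a runnable comparison of the two modes, here is a minimal sketch. The synthetic dataset and the three base classifiers are assumptions chosen for illustration; the post does not say which models go into `models`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=0)

# All three classifiers support predict_proba, which soft voting requires.
models = [
    ('logistic', LogisticRegression(max_iter=1000)),
    ('tree', DecisionTreeClassifier(max_depth=3, random_state=0)),
    ('nb', GaussianNB()),
]

# VotingClassifier clones the estimators, so the same list can be reused.
vc_hard = VotingClassifier(models, voting='hard').fit(x_train, y_train)
vc_soft = VotingClassifier(models, voting='soft').fit(x_train, y_train)

hard_pred = vc_hard.predict(x_test)
soft_pred = vc_soft.predict(x_test)

# Soft voting: the ensemble probability is the mean of the base models' probabilities.
soft_proba = vc_soft.predict_proba(x_test)
manual_proba = np.mean(
    [est.predict_proba(x_test) for est in vc_soft.estimators_], axis=0
)
print(np.allclose(soft_proba, manual_proba))
print(vc_hard.score(x_test, y_test), vc_soft.score(x_test, y_test))
```

Which mode scores higher depends on the data and the base models, so it is worth evaluating both.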
