Boosting Predictive Power: Random Forest and Gradient Boosted Trees in Ensemble Learning

Authors

  • Bashar Alhajahmad, Computer Engineering, Faculty of Engineering and Architecture, Siirt University, Turkey
  • Musa Ataş, Computer Engineering, Faculty of Engineering and Architecture, Siirt University, Turkey

DOI:

https://doi.org/10.59287/as-proceedings.133

Keywords:

Random Forest, Gradient Boosted Trees, KNIME, Ensemble Learning, Machine Learning

Abstract

Ensemble learning is a powerful concept in supervised machine learning, emphasizing the combination of multiple base learners, or "inducers", to enhance predictive performance. This study explores the effectiveness of two ensemble algorithms, Random Forest and Gradient Boosted Trees, and their influence on predictive outcomes. The significance of ensemble methods lies in their ability to mitigate common challenges in machine learning. First, they address overfitting, which occurs when a model fits the training data perfectly but fails on unseen data; by averaging diverse hypotheses, ensembles reduce the risk of selecting an incorrect one and improve overall predictive performance. Second, ensemble methods provide computational advantages by avoiding local optima. Third, they enhance representation by expanding the search space in which the best hypothesis is sought, allowing more accurate modeling of complex relationships within the data. This study leverages two distinct datasets: the Mushrooms dataset and the CO2 Emission by Vehicles dataset. The former is used for the classification task, while the latter, containing information on CO2 emissions from vehicles, is used for the regression task with the same algorithms. The results demonstrate outstanding performance from both Random Forest and Gradient Boosted Trees. In the classification task, both algorithms achieved perfect accuracy, while in the regression task they showed remarkable explanatory power, with R-squared values of 1 for Random Forest and 0.995 for Gradient Boosted Trees. These findings emphasize the potential of ensemble learning to improve predictive accuracy and model performance.
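
As a rough illustration of the setup described in the abstract, the sketch below reproduces the two experiments with scikit-learn's Random Forest and Gradient Boosted Trees estimators. The study itself was carried out in KNIME; the scikit-learn models, synthetic data, and default hyperparameters here are illustrative assumptions standing in for the Mushrooms and CO2 Emission by Vehicles datasets, so the scores will not match those reported in the paper.

# Minimal sketch of the two ensemble-learning experiments (assumption:
# scikit-learn stands in for the KNIME workflow used in the study).
# Synthetic data replaces the Mushrooms and CO2 Emission by Vehicles
# datasets so the example runs without external files.
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import (
    RandomForestClassifier,
    RandomForestRegressor,
    GradientBoostingClassifier,
    GradientBoostingRegressor,
)
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, r2_score

# Classification task (the Mushrooms dataset in the study).
Xc, yc = make_classification(n_samples=2000, n_features=20, random_state=0)
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(Xc, yc, test_size=0.3, random_state=0)
for name, clf in [
    ("Random Forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("Gradient Boosted Trees", GradientBoostingClassifier(random_state=0)),
]:
    clf.fit(Xc_tr, yc_tr)
    print(f"{name} accuracy: {accuracy_score(yc_te, clf.predict(Xc_te)):.3f}")

# Regression task (the CO2 Emission by Vehicles dataset in the study).
Xr, yr = make_regression(n_samples=2000, n_features=10, noise=5.0, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, test_size=0.3, random_state=0)
for name, reg in [
    ("Random Forest", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("Gradient Boosted Trees", GradientBoostingRegressor(random_state=0)),
]:
    reg.fit(Xr_tr, yr_tr)
    print(f"{name} R-squared: {r2_score(yr_te, reg.predict(Xr_te)):.3f}")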

Published

2023-04-01

How to Cite

Alhajahmad, B., & Ataş, M. (2023). Boosting Predictive Power: Random Forest and Gradient Boosted Trees in Ensemble Learning. AS-Proceedings, 1(2), 106–111. https://doi.org/10.59287/as-proceedings.133
