Bagging Decision Trees
Random Forest is an ensemble learning method that improves on bagging by reducing the correlation between its decision trees. It builds many decorrelated trees and combines their predictions, by averaging for regression or by majority vote for classification.
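To see why lower correlation helps, consider the standard variance formula for an average of correlated estimates. The symbols are assumptions introduced here, not taken from the text: B trees T_b(x), each with variance σ², and every pair of trees having correlation ρ.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

Adding more trees shrinks only the second term. The first term, ρσ², remains however large B becomes, so the only way to reduce it is to make the trees less correlated.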
Bagging trees reduces variance but not bias. Random forest improves on the variance reduction of bagging by lowering the correlation between trees, which it achieves by randomly selecting the input features considered at each split:
1) For each tree: draw a bootstrap sample of the training data.
2) For each split: pick a random sample of the features and choose the best split among them.
The result is a large collection of de-correlated trees whose predictions are averaged.
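The two steps above can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: it assumes NumPy and scikit-learn are available, the helper names fit_forest and predict_forest are hypothetical, and the data is synthetic. The per-split feature sampling is delegated to the max_features option of scikit-learn's DecisionTreeClassifier.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_trees=25, seed=0):
    """Fit n_trees decision trees, each on its own bootstrap sample."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    trees = []
    for _ in range(n_trees):
        # Step 1: bootstrap sample (n_samples rows drawn with replacement).
        idx = rng.integers(0, n_samples, size=n_samples)
        # Step 2: max_features="sqrt" makes the tree look at only a random
        # subset of the features each time it searches for a split.
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees


def predict_forest(trees, X):
    """Average the per-tree class probabilities, then take the top class.

    Assumes every bootstrap sample contained every class, so that all
    trees report probabilities for the same set of classes.
    """
    proba = np.mean([tree.predict_proba(X) for tree in trees], axis=0)
    return trees[0].classes_[np.argmax(proba, axis=1)]


# Synthetic data: 200 rows for training, 100 held out for evaluation.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)
trees = fit_forest(X[:200], y[:200])
pred = predict_forest(trees, X[200:])
accuracy = float(np.mean(pred == y[200:]))
print(f"hold-out accuracy: {accuracy:.2f}")
```

In practice scikit-learn's RandomForestClassifier wraps the same idea. The hand-rolled loop is shown only to make the two sources of randomness visible.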
Key Characteristics
- Creates a "forest" of many decision trees
- Built on bagging principles
- Generates predictions by averaging the results of decorrelated trees
- Often performs comparably to boosting while being simpler to train and tune
When a tree is grown, each node chooses its split from a random subset of the features rather than from all of them, which keeps the trees in the forest diverse.
Impurity and Information Gain
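The splitting criterion at each node aims to minimize impurity (or, equivalently, maximize homogeneity) in the resulting subsets. Common impurity measures are Gini impurity and entropy. When entropy is used, the reduction in impurity from the parent node to its children is known in information theory as Information Gain.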
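As a small worked example, impurity and information gain can be computed by hand for a candidate split. The function names and toy labels below are illustrative assumptions. The formulas are the usual definitions of Gini impurity, entropy, and the size-weighted impurity decrease.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n) for count in Counter(labels).values()
    )


def information_gain(parent, left, right, impurity=entropy):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


parent = ["a"] * 4 + ["b"] * 4

# A split that separates the classes perfectly removes all uncertainty.
perfect = information_gain(parent, ["a"] * 4, ["b"] * 4)

# A split whose children have the same class mix as the parent gains nothing.
useless = information_gain(parent, ["a", "a", "b", "b"], ["a", "a", "b", "b"])

print(f"gini(parent) = {gini(parent):.2f}")  # 0.50
print(f"perfect split gain = {perfect:.2f}")  # 1.00
print(f"useless split gain = {useless:.2f}")  # 0.00
```

A tree learner evaluates many candidate splits this way and keeps the one with the largest gain.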
Extra-Trees
Extra-Trees (Extremely Randomized Trees) takes randomization a step further. Instead of searching for the optimal threshold for each candidate feature, it draws a random threshold for each one and selects the best among these random splits. Because the threshold search is the most expensive part of growing a tree, Extra-Trees is usually much faster to train than a standard Random Forest, which searches for the optimal threshold at each node.
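In scikit-learn the two approaches share the same interface, so they are easy to compare side by side. The sketch below assumes scikit-learn is installed and uses synthetic data. The dataset sizes and n_estimators value are arbitrary choices for illustration, and no particular scores are implied. Note that ExtraTreesClassifier does not bootstrap by default (bootstrap=False), so each tree sees the whole training set.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification problem.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=0
)

models = {
    # Searches for the best threshold on each candidate feature.
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    # Draws one random threshold per candidate feature, keeps the best.
    "extra_trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    # Mean accuracy over 5 cross-validation folds.
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {scores[name]:.3f}")
```

Which of the two generalizes better depends on the data, so comparing them with cross-validation is the usual way to decide.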
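Feature Importance
A key advantage of Random Forest is that it makes feature importance easy to measure. In Scikit-Learn, a feature's importance is the impurity reduction achieved by the nodes that split on it, weighted by the number of training samples that reach each node and averaged over all trees in the forest. The scores are normalized to sum to 1 and are available in the feature_importances_ attribute after training.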
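A short sketch of reading these scores, assuming scikit-learn is installed. The iris dataset is chosen here only as a convenient example and is not mentioned in the original text.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(iris.data, iris.target)

# feature_importances_ holds one impurity-based score per input feature.
importances = dict(zip(iris.feature_names, forest.feature_importances_))

# Print the features from most to least important.
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Impurity-based scores are computed from the training data only, so they are best read as a quick ranking rather than a precise measurement.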
Applications
One practical application is visualizing feature importance, for example by displaying the importance of each pixel as a heat map in an image classification task.
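A minimal sketch of the pixel-wise idea, assuming scikit-learn and NumPy are installed. The 8x8 digits dataset bundled with scikit-learn stands in for a real image task here. That choice is an assumption made for illustration, not something the original text specifies.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

# Each image is 8x8 pixels, flattened into 64 features.
digits = load_digits()
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(digits.data, digits.target)

# One importance score per pixel, reshaped back into the image grid.
heatmap = forest.feature_importances_.reshape(8, 8)

# Plain-text rendering of the heat map, one image row per line.
for row in heatmap:
    print(" ".join(f"{value:.3f}" for value in row))

# Location of the single most important pixel as (row, column).
top_row, top_col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
print(f"most important pixel: row {top_row}, column {top_col}")
```

If matplotlib is available, passing heatmap to matplotlib.pyplot.imshow gives the graphical version of the same picture.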