# Classification Algorithms - Random Forest

## The Random Forest Algorithm

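• Step 1 - First, draw random samples from the given dataset (bootstrap samples, drawn with replacement).

• Step 2 - Next, the algorithm builds a decision tree for each sample and obtains a prediction from each tree.

• Step 3 - The predictions from all the trees are then put to a vote.

• Step 4 - Finally, the prediction with the most votes is selected as the final prediction.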
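The four steps can be sketched in plain Python. This is only a minimal illustration, not how scikit-learn implements it: one-split "decision stumps" stand in for full decision trees, the random feature selection that real random forests perform at each split is left out, and the helper names (`bootstrap_sample`, `fit_stump`, `fit_forest`, `predict_forest`) and the toy data are made up for this example.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]


def majority_label(rows):
    """Most common class label among rows (the label is the last element)."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def fit_stump(rows):
    """Step 2 (simplified): fit a one-split tree with the fewest training errors."""
    best = None
    n_features = len(rows[0]) - 1
    for f in range(n_features):
        for threshold in sorted({row[f] for row in rows}):
            left = [r for r in rows if r[f] <= threshold]
            right = [r for r in rows if r[f] > threshold]
            if not left or not right:
                continue
            left_label = majority_label(left)
            right_label = majority_label(right)
            errors = sum(r[-1] != left_label for r in left)
            errors += sum(r[-1] != right_label for r in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no valid split exists: always predict the majority label
        label = majority_label(rows)
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, x):
    """Prediction of a single stump for one feature vector x."""
    f, threshold, left_label, right_label = stump
    return left_label if x[f] <= threshold else right_label


def fit_forest(rows, n_trees, seed=0):
    """Steps 1 and 2: one stump per bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Steps 3 and 4: collect one vote per tree, return the most common label."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


# Made-up toy data: (feature_0, feature_1, label)
toy_rows = [
    (1.0, 1.2, "a"), (1.1, 0.9, "a"), (0.8, 1.0, "a"), (1.3, 1.1, "a"),
    (3.0, 3.2, "b"), (3.1, 2.9, "b"), (2.8, 3.0, "b"), (3.3, 3.1, "b"),
]
forest = fit_forest(toy_rows, n_trees=25)
print(predict_forest(forest, (0.5, 0.5)))
print(predict_forest(forest, (4.0, 4.0)))
```

A single stump trained on an unlucky bootstrap sample can be wrong, which is why the vote across many trees matters: the final label depends on the majority, not on any one tree.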

## Implementation in Python

```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```

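```python
# Load the Iris dataset from the UCI repository. The file has no header row,
# so the column names are supplied explicitly.
path = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
headernames = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv(path, names=headernames)

# Show the first five rows.
dataset.head()
```

```
   sepal-length  sepal-width  petal-length  petal-width        Class
0           5.1          3.5           1.4          0.2  Iris-setosa
1           4.9          3.0           1.4          0.2  Iris-setosa
2           4.7          3.2           1.3          0.2  Iris-setosa
3           4.6          3.1           1.5          0.2  Iris-setosa
4           5.0          3.6           1.4          0.2  Iris-setosa
```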

```python
# Features are all columns except the last; the class label is column 4.
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values
```

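```python
from sklearn.model_selection import train_test_split

# Hold out 30% of the rows for testing. No random_state is set,
# so the split (and therefore the results) will vary between runs.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
```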

```python
from sklearn.ensemble import RandomForestClassifier

# Train a forest of 50 decision trees.
classifier = RandomForestClassifier(n_estimators=50)
classifier.fit(X_train, y_train)

# Predict the class of each test sample.
y_pred = classifier.predict(X_test)
```

```python
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:")
print(result1)
result2 = accuracy_score(y_test, y_pred)
print("Accuracy:", result2)
```

```
Confusion Matrix:
[[14  0  0]
 [ 0 18  1]
 [ 0  0 12]]
Classification Report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        14
Iris-versicolor       1.00      0.95      0.97        19
 Iris-virginica       0.92      1.00      0.96        12

      micro avg       0.98      0.98      0.98        45
      macro avg       0.97      0.98      0.98        45
   weighted avg       0.98      0.98      0.98        45

Accuracy: 0.9777777777777777
```
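To see where the numbers in the report come from, the per-class precision and recall and the overall accuracy can be recomputed by hand from the confusion matrix. The sketch below copies the matrix from the output above and assumes scikit-learn's convention that rows are true classes and columns are predicted classes; the variable names are only illustrative.

```python
# Confusion matrix copied from the output above
# (rows = true class, columns = predicted class).
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
cm = [
    [14, 0, 0],
    [0, 18, 1],
    [0, 0, 12],
]

total = sum(sum(row) for row in cm)               # number of test samples
correct = sum(cm[i][i] for i in range(len(cm)))   # diagonal = correct predictions
accuracy = correct / total

metrics = {}
for i, label in enumerate(labels):
    true_positives = cm[i][i]
    predicted_as_i = sum(row[i] for row in cm)    # column sum
    actually_i = sum(cm[i])                       # row sum (the "support")
    precision = true_positives / predicted_as_i
    recall = true_positives / actually_i
    metrics[label] = (precision, recall)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} support={actually_i}")

print(f"accuracy={accuracy:.4f}")
```

The single misclassified sample is a true Iris-versicolor predicted as Iris-virginica, which lowers the recall of Iris-versicolor (18 of 19) and the precision of Iris-virginica (12 of 13), while 44 of 45 test samples are classified correctly overall.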
