我通过RandomForestClassifier构建了一个随机森林,并绘制了决策树.参数"值"(用红色箭头表示)是什么意思?为什么[]中的两个数字之和不等于"样本"的数量?我看到了其他一些例子,[]中的两个数字之和等于"样本"的数量.为什么我的情况不是这样?
df = pd.read_csv("Dataset.csv")
df.drop(['Flow ID', 'Inbound'], axis=1, inplace=True)
df.replace([np.inf, -np.inf], np.nan, inplace=True)
df.dropna(inplace = True)
df.Label[df.Label == 'BENIGN'] = 0
df.Label[df.Label == 'DrDoS_LDAP'] = 1
Y = df["Label"].values
Y = Y.astype('int')
X = df.drop(labels = ["Label"], axis=1)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.5)
model = RandomForestClassifier(n_estimators = 20)
model.fit(X_train, Y_train)
Accuracy = model.score(X_test, Y_test)
for i in range(len(model.estimators_)):
fig = plt.figure(figsize=(15,15))
tree.plot_tree(model.estimators_[i], feature_names = df.columns, class_names = ['Benign', 'DDoS'])
plt.savefig('.\\TheForest\\T'+str(i))