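Since a single array contains multiple points, they have to be clustered. As Rahul mentioned, K-means is a good fit for this job.
One question remains: how do you know how many ROIs are present? In other words, how many clusters do you divide your points into?
I used the silhouette method, which is explained in detail in this post and is available in scikit-learn.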
Code:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
# array of points/coordinates
r = np.array(ROI)
# this range can be increased if many ROIs are present
range_n_clusters = [2, 3, 4, 5]
# list to store silhouette score for each cluster
silhouette_avg = []
for num_clusters in range_n_clusters:
    kmeans = KMeans(n_clusters=num_clusters)
    kmeans.fit(r)
    cluster_labels = kmeans.labels_
    silhouette_avg.append(silhouette_score(r, cluster_labels))
plt.xlabel('Values of K')
plt.ylabel('Silhouette score')
plt.title('Silhouette analysis For Optimal k')
plt.plot(range_n_clusters,silhouette_avg,'bx-')
plt.show()
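Select the number of clusters with the highest silhouette score to get the best clustering result. Based on the plot above, the optimal number of clusters is 2.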
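Instead of reading the best value off the plot, it can also be picked programmatically. This is only a minimal sketch, assuming the scores were collected in the same order as the candidate values; `best_num_clusters` is a hypothetical helper name (not part of scikit-learn), and the scores in the usage lines are made-up placeholders, not measured results:

```python
def best_num_clusters(candidates, scores):
    """Return the candidate k whose silhouette score is highest.

    Hypothetical helper: `candidates` and `scores` are parallel lists,
    such as range_n_clusters and silhouette_avg from the loop above.
    """
    if not candidates or len(candidates) != len(scores):
        raise ValueError("candidates and scores must be non-empty and the same length")
    # Compare (k, score) pairs by score; on a tie the earlier candidate wins
    best_k, _ = max(zip(candidates, scores), key=lambda pair: pair[1])
    return best_k


# Placeholder scores for illustration only, not measured values
example_scores = [0.81, 0.62, 0.55, 0.47]
print(best_num_clusters([2, 3, 4, 5], example_scores))
```

With the real data you would call it as `best_num_clusters(range_n_clusters, silhouette_avg)` and pass the result to `KMeans(n_clusters=...)`.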
# K-Means model with 2 clusters
final_kmeans = KMeans(n_clusters=2)
final_kmeans.fit(r)
# points and labels to separate lists
r_list = r.tolist()
labels = final_kmeans.predict(r).tolist()
# number of unique clusters
num_clusters = np.unique(final_kmeans.predict(r)).tolist()
# sample mask for demonstration
mask = np.zeros((300, 250,1), dtype=np.uint8)
# select points by their cluster labels and draw them
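for clus in num_clusters:
    # collect the points that belong to the current cluster
    points = []
    for i, j in zip(r_list, labels):
        if j == clus:
            points.append(i)
    # fillConvexPoly expects 32-bit integer coordinates
    mask = cv2.fillConvexPoly(mask, np.array(points, dtype=np.int32), 255)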
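Hope this gives you an idea. You can optimize it further. Keep in mind the maximum number of ROIs your dataset may contain. You can also try other clustering algorithms, or look for other ways to obtain the optimal number of clusters.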
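As one example of another clustering algorithm, DBSCAN does not need the number of clusters up front, so the ROI count falls out of the result. This is just a sketch on synthetic points standing in for your ROI array; the `eps` and `min_samples` values are assumptions picked for this toy data and would need tuning for real coordinates:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for the ROI points: two well-separated groups
points = np.array([
    [10, 10], [12, 14], [15, 11], [11, 16],
    [200, 200], [204, 198], [198, 205], [202, 203],
])

# eps and min_samples are assumed values; they need tuning for real data
db = DBSCAN(eps=20, min_samples=3).fit(points)
labels = db.labels_

# Label -1 marks noise points, so it is excluded from the cluster count
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(n_clusters)
```

The trade-off is that you tune a distance threshold instead of a cluster count, which may or may not be easier depending on how far apart your ROIs are.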