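Hello everyone,
I'm a Python beginner without much of a foundation QQ
I have a large set of longitude/latitude coordinates that I want to run KNN on,
but I'd like to add a distance limit (for example, two points are only clustered together if they are less than 1000 m apart).
Following approaches I found online, this is what I have so far.
Suppose I have ten longitude/latitude pairs in this array: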
import numpy as np

# Each row is [longitude, latitude] in degrees
data_point = np.array([[120.228986, 22.92753],
                       [120.222007, 22.9854525],
                       [120.21645, 22.99625],
                       [120.221625, 22.99833],
                       [120.1566975, 22.987169],
                       [120.281875, 23.106358],
                       [120.314682, 23.319719],
                       [120.219985, 22.998485],
                       [120.215055, 22.99942],
                       [120.20783, 22.999415]])
def np_getDistance(data, data_point, i, j):
    # Ellipsoid parameters: equatorial radius, polar radius (metres), flattening
    ra = 6378140
    rb = 6356755
    flatten = 0.003353
    # Column 0 is longitude, column 1 is latitude (degrees)
    midLatA = data_point[i, 1]
    midLonA = data_point[i, 0]
    midLatB = data_point[j, 1]
    midLonB = data_point[j, 0]
    radLatA = np.radians(midLatA)
    radLonA = np.radians(midLonA)
    radLatB = np.radians(midLatB)
    radLonB = np.radians(midLonB)
    # Reduced latitudes
    pA = np.arctan(rb / ra * np.tan(radLatA))
    pB = np.arctan(rb / ra * np.tan(radLatB))
    # Central angle between the two points
    x = np.arccos(np.sin(pA) * np.sin(pB) +
                  np.cos(pA) * np.cos(pB) * np.cos(radLonA - radLonB))
    # Flattening correction terms
    c1 = ((np.sin(x) - x) * np.power(np.sin(pA) + np.sin(pB), 2)
          / np.power(np.cos(x / 2), 2))
    c2 = ((np.sin(x) + x) * np.power(np.sin(pA) - np.sin(pB), 2)
          / np.power(np.sin(x / 2), 2))
    dr = flatten / 8 * (c1 - c2)
    # The 0.001 factor scales the result from metres to kilometres
    distance = 0.001 * ra * (x + dr)
    return distance
for i in range(dataLen):
    knn = KNN(i, K, data_point, tree)  # compute the K nearest neighbours of point i
    for j in knn:
        if l[i] != l[j]:
            if clusterSim(c[l[i]], c[l[j]], data, alpha) <= 1:
                if np_getDistance(data, data_point, l[i], l[j]) <= 1000:
                    merge(c, l[i], l[j], l)
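However, the output values simply disappear =口=
Is there a problem with the calculation, is the code written incorrectly, or am I indexing the array wrongly?
Without the np_getDistance check, the code runs fine.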
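A few things in the snippet may be worth checking, though without the rest of the code this is only a guess. First, np_getDistance is called with l[i] and l[j], which look like cluster labels rather than point indices, so it may be reading the wrong rows of data_point. Second, the function multiplies by 0.001 and so returns kilometres, while the threshold 1000 is presumably meant as metres. Third, when the two points coincide, x is 0 and the c2 term becomes 0/0, which gives NaN, and NaN <= 1000 is always False. The sketch below is one possible guarded variant under those assumptions. The name lambert_distance_m is hypothetical, the constants are copied from the post, and the central angle is computed with a haversine-style expression (a swapped-in but mathematically equivalent form) so that identical points give exactly 0.

```python
import numpy as np

# Ellipsoid constants copied from the post (radii in metres).
RA = 6378140.0
RB = 6356755.0
FLATTEN = 0.003353


def lambert_distance_m(lon_a, lat_a, lon_b, lat_b):
    """Distance in metres between two scalar lon/lat points given in degrees.

    Hypothetical helper: same correction formula as np_getDistance, with a
    guard for coincident points. Near-antipodal points are not handled.
    """
    # Reduced latitudes, as in the original function.
    p_a = np.arctan(RB / RA * np.tan(np.radians(lat_a)))
    p_b = np.arctan(RB / RA * np.tan(np.radians(lat_b)))
    d_lon = np.radians(lon_b - lon_a)
    # Central angle via the haversine form, which is exact for identical points.
    a = (np.sin((p_b - p_a) / 2) ** 2
         + np.cos(p_a) * np.cos(p_b) * np.sin(d_lon / 2) ** 2)
    x = 2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    if x == 0.0:
        # Coincident points: the correction term would be 0/0, so return 0.
        return 0.0
    c1 = (np.sin(x) - x) * (np.sin(p_a) + np.sin(p_b)) ** 2 / np.cos(x / 2) ** 2
    c2 = (np.sin(x) + x) * (np.sin(p_a) - np.sin(p_b)) ** 2 / np.sin(x / 2) ** 2
    dr = FLATTEN / 8 * (c1 - c2)
    return float(RA * (x + dr))  # metres: no 0.001 factor here


# Two of the points from the post, as (lon, lat) in degrees.
p2 = (120.21645, 22.99625)
p3 = (120.221625, 22.99833)
d_same = lambert_distance_m(*p2, *p2)
d_near = lambert_distance_m(*p2, *p3)
close_enough = d_near <= 1000  # threshold in metres
```

Inside the loop, the call would then take the point indices, for example lambert_distance_m(*data_point[i], *data_point[j]) <= 1000, assuming i and j index rows of data_point.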
Alternatively, is there a more efficient way to compute pairwise distances between longitude/latitude points stored in an np.array?
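One possible approach to the pairwise question is a vectorized haversine distance using NumPy broadcasting. This is a minimal sketch, not the method from the post: it swaps the ellipsoid correction for a spherical approximation, and the function name pairwise_haversine_m and the mean Earth radius value are assumptions chosen for illustration. For a 1000 m threshold the spherical error (a fraction of a percent) is probably acceptable.

```python
import numpy as np

# Mean Earth radius in metres (assumed value for the spherical approximation).
EARTH_RADIUS_M = 6371008.8


def pairwise_haversine_m(lonlat_deg):
    """Return an (n, n) matrix of great-circle distances in metres.

    lonlat_deg is an (n, 2) array with rows of [longitude, latitude] in degrees.
    """
    lonlat = np.radians(np.asarray(lonlat_deg, dtype=float))
    lon = lonlat[:, 0]
    lat = lonlat[:, 1]
    # Broadcasting builds all pairwise differences at once.
    d_lon = lon[:, None] - lon[None, :]
    d_lat = lat[:, None] - lat[None, :]
    a = (np.sin(d_lat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(d_lon / 2) ** 2)
    # Clip guards against rounding slightly outside [0, 1].
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


# The ten [longitude, latitude] rows from the post.
data_point = np.array([[120.228986, 22.92753],
                       [120.222007, 22.9854525],
                       [120.21645, 22.99625],
                       [120.221625, 22.99833],
                       [120.1566975, 22.987169],
                       [120.281875, 23.106358],
                       [120.314682, 23.319719],
                       [120.219985, 22.998485],
                       [120.215055, 22.99942],
                       [120.20783, 22.999415]])

dist = pairwise_haversine_m(data_point)
within_1km = dist <= 1000.0                    # boolean mask of pairs within 1000 m
neighbours_of_2 = np.flatnonzero(within_1km[2])  # indices within 1000 m of point 2
```

The full matrix needs memory proportional to n squared, so for a very large set of points a tree-based radius search may scale better. scikit-learn's BallTree with metric="haversine" (which expects [latitude, longitude] in radians) and its query_radius method is one option that appears to fit.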
Thanks in advance for any advice QQ