duplicated points in ADS

From: Goreaud Francois (francois.goreaud@CLERMONT.cemagref.fr)
Date: Wed Jul 19 2000 - 10:52:32 MET DST

Dear friends,

Leonardo A. Saravia raises an interesting problem concerning duplicated
points in the analysis of spatial structure:

>Some time ago you sent me the C source for the computation of Ripley's K
>functions, I discovered a small problem, your code do not check for
>duplicated points. Duplicated points occur when somebody makes a mistake
>entering field data and when you generate confidence intervals. In the
>case spurious significant values at the first distance interval would be
>found and in the second one the confidence band will be wider for the
>smaller distances. I confirmed this problem with ADE-4 too.
>best regards,
>Leonardo A. Saravia - Programa de Invest. en Ecologia Matematica
>Universidad Nacional De Lujan - C.C. 221 - (6700) - Lujan - Argentina
>Te: 54 2323 423171/421030 Fax: 1 (801) 409-3991

Here are a few elements of an answer, in English. The debate of course
remains open.

1) As Leonardo mentions, duplicated points (i.e. two points with the same
co-ordinates in a data set) can occur when somebody makes a mistake
entering field data, and automatic detection of duplicated points could
help to correct the data set.
BUT duplicated points can also occur because two (or more) individuals are
very close to each other, at a distance smaller than the precision of the
co-ordinate measurements. This may be a very rare event where trees are
concerned (although I have seen a few such cases with coppice), but it is
not that rare with smaller individuals.
IN THIS CASE, there is no reason why we should refuse to compute the K(r)
function with such duplicated points, or delete some of these points. The
values obtained DO characterise the particular structure of the pattern
(for instance, aggregation at short distances).

2) Moreover, the K(r) function and other similar functions are defined for
all (homogeneous and isotropic) point processes, with or without duplicated
points, and no problem occurs either in the definition or in the computation
of the functions when points are duplicated. There is no theoretical
reason why we should not compute the K(r) function in this case.
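
To illustrate the point, here is a minimal sketch in Python. It is not the
ADS or ADE-4 code: the function name naive_k is hypothetical, and the
estimator deliberately omits any edge correction. It only shows that a pair
of duplicated points is counted as a pair at distance 0, so the computation
goes through and the estimate at small r is raised.

```python
import math

def naive_k(points, r, area):
    """Naive estimate of Ripley's K at distance r (no edge correction)."""
    n = len(points)
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            # Duplicated points give a distance of exactly 0, which is <= r.
            if math.hypot(dx, dy) <= r:
                close_pairs += 1
    # K_hat(r) = area * (number of ordered pairs within r) / (n * (n - 1))
    return area * close_pairs / (n * (n - 1))

# Hypothetical data: the first two points share the same co-ordinates.
pts = [(1.0, 1.0), (1.0, 1.0), (4.0, 5.0), (8.0, 2.0)]
k_small = naive_k(pts, 0.5, area=100.0)
print("K_hat(0.5) =", k_small, " Poisson expectation pi*r^2 =", math.pi * 0.5 ** 2)
```

With only four points this is of course not a meaningful analysis; the
example is just meant to show where the duplicated pair enters the sum.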

IN THE END, Raphael Pelissier and I decided not to change the computation,
so that it is still possible to compute K(r) with duplicated points, but to
inform the user of the possible presence of such points. The user is then
free to correct the data set or not.
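
As a rough sketch of what "informing the user" could look like (this is an
illustration in Python, not the actual implementation; the names
find_duplicates and report_duplicates are hypothetical), one can count how
many points share each pair of co-ordinates and print a warning, while
leaving the data set untouched:

```python
from collections import Counter

def find_duplicates(points):
    """Return {co-ordinates: count} for every location used more than once."""
    counts = Counter(points)
    return {p: c for p, c in counts.items() if c > 1}

def report_duplicates(points):
    """Warn about duplicated points; the data set itself is not modified."""
    duplicates = find_duplicates(points)
    for p, c in duplicates.items():
        print(f"warning: {c} points share the co-ordinates {p}")
    return duplicates

# Hypothetical data set, with points stored as (x, y) tuples.
data = [(1.0, 1.0), (2.5, 3.0), (1.0, 1.0), (7.0, 4.5)]
found = report_duplicates(data)
```

The comparison is exact, which is appropriate here because the question is
whether two records carry identical co-ordinates at the recorded precision.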

3) As far as the computation of the confidence interval is concerned, the
problem is similar: there is no reason why we should delete duplicated
points. In a real Poisson pattern duplicated points can occur. Moreover,
because we truncate the co-ordinates to a fixed precision in our Poisson
pattern simulation, simulated points that are very close to each other (but
still distinct) will end up with the same co-ordinates. As we want to
simulate a confidence interval for the Poisson hypothesis, we must keep this
possibility. In fact, if you delete duplicated points in the Monte Carlo
simulation, you create a bias and thus obtain a confidence interval that
no longer corresponds to a Poisson pattern, but to a hard-core Poisson
pattern. That is why the result is different.
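
The effect of truncation can be sketched as follows. This is an illustrative
Python simulation under assumed settings (200 points in a 10 x 10 square,
co-ordinates truncated to 0.1, 50 repetitions), not our actual simulation
code; simulate_pattern and coincident_pairs are hypothetical names. It
compares the number of coincident pairs when duplicates are kept and when
they are deleted.

```python
import random

def simulate_pattern(n, side, precision, rng, drop_duplicates=False):
    """n uniform random points in a square, truncated to `precision`."""
    pts = []
    for _ in range(n):
        # Truncate each co-ordinate down to a multiple of `precision`.
        x = int(rng.random() * side / precision) * precision
        y = int(rng.random() * side / precision) * precision
        pts.append((x, y))
    if drop_duplicates:
        # Keep a single point per location: no two points can then coincide.
        pts = list(dict.fromkeys(pts))
    return pts

def coincident_pairs(pts):
    """Number of unordered pairs of points with identical co-ordinates."""
    return sum(1 for i in range(len(pts))
               for j in range(i + 1, len(pts)) if pts[i] == pts[j])

rng = random.Random(2000)  # fixed seed, so that the run is reproducible
kept = [coincident_pairs(simulate_pattern(200, 10.0, 0.1, rng))
        for _ in range(50)]
dropped = [coincident_pairs(simulate_pattern(200, 10.0, 0.1, rng, True))
           for _ in range(50)]
print("mean coincident pairs, duplicates kept:   ", sum(kept) / len(kept))
print("mean coincident pairs, duplicates deleted:", sum(dropped) / len(dropped))
```

When duplicates are deleted, the count is zero by construction: every
simulated pattern then has a minimum distance of one precision step between
points, which is the hard-core behaviour described above.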

THEREFORE, we do not change the computation of the confidence interval.

Best regards,

François Goreaud
24 avenue des Landais - BP 50085
63172 AUBIERE CEDEX 1 - France
Tel. : 33 (0)
Fax : 33 (0)
email : francois.goreaud@clermont.cemagref.fr



This archive was generated by hypermail 2b30 : Mon Feb 12 2001 - 09:24:56 MET