# An efficient outlier removal method for scattered point cloud data


## Abstract

Outlier removal is a fundamental data processing task for ensuring the quality of scanned point cloud data (PCD), which is becoming increasingly important in industrial applications and reverse engineering. Acquired scanned PCD is usually noisy, sparse and temporally incoherent, so processing scanned data is typically an ill-posed problem. In this paper, we present a simple and effective method that trims noisy points using two geometric characteristics: the local density information and the deviation from the local fitting plane. The local density based method serves as a preprocessing step that removes sparse outliers and isolated outliers. Non-isolated outliers are handled by a local projection method, which places those points onto the object surface, with the deviation of a point from its local fitting plane serving as the criterion. The experimental results demonstrate that the method removes noisy points from various man-made objects containing complex outliers.

## Introduction

When an object with complex geometry and varying surface reflectance is scanned, the collected point cloud may contain extensive outliers, which are inevitable by-products of 3D scanning [1–3]. As illustrated in Fig 1, occlusion and sensor imperfections readily produce outliers and noise in the PCD. The resulting point clouds are thus often noisy, and this inevitably destroys fine details. Outlier points are usually unorganized, noisy, sparse, and inconsistent in local point density, and they exhibit geometrical discontinuities and arbitrary surface shapes with sharp features [4]. Sparse and dense outliers are particularly problematic for applications of scanned point clouds, especially 3D shape analysis [5], object modeling [6] and object recognition [7]. Therefore, removing outliers from scattered point cloud data is the main focus of this paper.

Manual outlier removal is time consuming and relies on the operator's experience, so an automatic outlier removal method is highly desirable. However, automatic and effective removal of outliers is challenging, since the true shape of the scanned object is unavailable and any estimate of the object surface is inaccurate in the presence of extensive outliers.

In this paper, outliers are classified into three categories: sparse outliers, isolated outliers and non-isolated outliers. We propose an automatic method to remove them based on local density and local projection. Sparse outliers are easily detected by their low local point density. Although isolated outliers often form clusters with high point density, their local density drops when the local neighborhood is large enough. Both types can therefore be detected and removed by the local density based method. Non-isolated outliers, however, lie close to the model, so we project them locally onto the original object through the local fitting plane. Unlike other denoising methods, this step does not remove noisy points but projects them onto the local fitting plane to make the model more regular. Combining the two methods yields a denoised model, and our experiments show that the approach is effective for denoising point cloud models.

## Related work

Acquired scanned PCD is usually polluted by noise owing to the scanner system's inherent error and the vibration of the aircraft carrying it. In this research, the purpose of outlier removal is to identify and remove outliers efficiently in scanned PCD. Outliers can be removed by applying a spatial depth-pass filter to the 3D point data [8]. Existing research on this problem falls into two categories: discontinuous operator-based methods and surface fitting-based methods.

### Discontinuous operators-based method

Wang et al. [9] utilized a distance-based deviation factor to detect sparse outliers and then detected small outlier clusters using region growing. Rusu et al. [10] proposed an efficient approach to detect sparse outliers, which correspond to low point densities. In practice, however, the local density of scanned PCD for good surface points can be non-uniform and incomplete. Chenot [11] proposed a new method to handle aberrant outliers across a wider range of blind source separation instances. Based on sparse signal modeling, it exploits an alternating reweighted minimization technique that yields a robust estimate of the sources and the mixing matrix while simultaneously removing the spurious outliers. Shao et al. [12] reconstructed dense depth maps from sparse point clouds and used them to remove points that are in significant visibility conflict and to augment the input point cloud. Similarly, a free space constraint was used to clean up depth maps in [13]. Unfortunately, non-isolated outlier clusters were not considered. Methods based on robust descriptors and wavelet transforms are also effective in reducing noisy points. Zhang et al. [14] proposed a distance-based method to detect outliers. Tola et al. [15] used a robust descriptor for large-scale multi-view stereo matching in order to reduce the number of outliers in the computed point cloud. The parameterization-free projection operator [16] produces a resampled point cloud by projecting points onto a multivariate median, which makes it more robust to noise and able to detect outliers. By taking the point density into account, the method was extended to deal with sharp features [17] and a high level of non-uniformity [18].

### Surface fitting-based method

Carsten et al. [19] presented a new method for anisotropic fairing of a point sampled surface using an anisotropic geometric mean curvature flow. Desbrun et al. [20] developed methods to rapidly remove rough features from irregularly triangulated data intended to portray a smooth surface. The main task is to remove undesirable noise and uneven edges while retaining desirable geometric features. Zeng et al. [21] combined Moving Least Squares surface fitting with the Lagrange operator to implement point cloud filtering. Zheng et al. [22] proposed a point cloud filtering method based on variable-radius circles and B-spline fitting; its filtering precision is reported to be 1 to 5 times that of traditional methods, and it can be applied to urban, mountain and forest scenes. Weyrich et al. [23] proposed three novel criteria to detect outliers: the plane fitting criterion, the mini-ball criterion and the nearest-neighbor reciprocity criterion. Shao et al. [24] presented a novel outlier removal method that is capable of fitting ellipses in real time under a high outlier rate.

All the methods mentioned above are easy to implement. The outlier removal methods based on discontinuous operators generally focus on a certain type of outlier and are inapplicable to other types. They are robust in detecting sparse outliers and removing small clusters of outliers, but points on edges also have a high proportion of unidirectional neighbors and may be falsely detected. Non-isolated outliers are usually ignored. Although methods based on surface fitting can deal with non-isolated outliers, they are too complicated and too time consuming to apply in practice, and they also require continuity of the data.

In this paper, we give two algorithms that respond to the various types of outliers. The local density based method handles isolated outlier clusters and sparse outliers, and the local projection based method trims non-isolated outliers. At the same time, adopting the two methods inevitably incurs a small overhead.

## Overview

The input to our method is a raw scan of a 3D object or real scene, represented as an unorganized point cloud. Scanned data collected from reality are often noisy, uncertain and incomplete. Many denoising methods exist, yet few can successfully deal with all types of noisy points. To achieve this goal, we propose a novel outlier removal method based on two visual characteristics. After analyzing the different kinds of noisy points, we devise a dedicated method for each. Fig 2 gives an overview of the proposed algorithm and its processing steps. Our algorithm essentially consists of a local density based method and a local projection based method.

Our denoising algorithm removes outliers from a set of input points $\{\mathbf{p}_i \mid i = 1, 2, \ldots, N\}$ by analyzing the different categories of outliers and their distribution.

Outlier distribution analysis. A method that considers the outlier distribution and the distance distribution is used to detect those outliers that are easily found in the object.

Isolated outlier cluster and sparse outlier removal. A local density based method is proposed to detect and remove the isolated outlier clusters and sparse outliers.

Detailed non-isolated outlier removal. To remove the remaining outliers from the detected shapes, we use as a criterion the deviation of a point from the plane fitted to its neighborhood (which we call the local fitting plane). According to this deviation, we project the noisy points onto the local fitting plane to trim the model.

For the original data with various outliers in Fig 2(A), the method based on local density information is used to remove isolated outlier clusters (in Fig 2(B)) and sparse outlier (in Fig 2(C)). After the two steps, the data would be further trimmed (in Fig 2(D)) by the local projection based method.

## Sparse outlier and isolated outliers removal

### Outlier distribution analysis

In this work, our outlier removal method is developed to effectively identify sparse outlier, isolated outlier clusters, and non-isolated outlier clusters in scanned point clouds, demonstrated in Fig 3.

Sparse outliers are erroneous measurement points with low local point density.

Isolated outliers have high local point density and are relatively separated from the scanned data.

Non-isolated outliers are attached to the scanned surface and cannot be easily separated.

In Fig 3, the outliers marked by the red box are sparse outliers, whose point density is lower than that of other points. The isolated outlier clusters are marked by the blue box. Both kinds of noisy points are removed according to the density information. The non-isolated outliers in the green box lie near the model. We do not remove these noisy points but project them onto the local fitting plane. Using different strategies for different kinds of noisy points, we achieve the final denoising.

### Local density analysis

In this section, we detail the density estimation applied to remove the isolated outlier clusters and sparse outlier. The scattered point cloud is denoted as $\mathbf{P} = \{\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \ldots, \mathbf{p}_N\}$; finding the optimal neighborhood of each point is important for computing the local covariance matrix of each point. The distribution of mobile laser point clouds has variable point densities because of occlusion, varying scanning angles, and varying distances to the laser scanner. Let the *k* nearest neighbor points of $\mathbf{p}_i$ be $KNN(\mathbf{p}_i)$, i.e. $\mathbf{Q} = \{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \ldots, \mathbf{q}_k\}$. Our algorithm removes inconsistent points from point cloud $\mathbf{P}$ by analyzing their geometric information and density.

To determine the density information, each point in the scanned data is examined over its *k* nearest neighbors. The local density is obtained from the average distance of $\mathbf{p}_i$ to its *k* nearest neighbors $\mathbf{q}_j$ ($j = 1, \ldots, k$). The average distance of $\mathbf{p}_i$ is defined as Eq 1:

$$\overline{\mathbf{d}}_i = \frac{1}{k}\sum_{j=1}^{k} \mathbf{dist}(\mathbf{p}_i, \mathbf{q}_j) \tag{1}$$

where $j = 1, 2, \ldots, k$ and $\mathbf{dist}(\mathbf{p}_i, \mathbf{q}_j)$ is the Euclidean distance between $\mathbf{p}_i$ and $\mathbf{q}_j$. The local density function $\mathbf{LD}(\mathbf{p}_i)$ of $\mathbf{p}_i$ is defined as Eq 2, where *k* is the number of nearest neighbors and $\overline{\mathbf{d}}_i$ is the average distance between $\mathbf{p}_i$ and $\mathbf{q}_j$. The probability that a point is an outlier is defined as Eq 3:

$$pro(\mathbf{p}_i) = 1 - \mathbf{LD}(\mathbf{p}_i) \tag{3}$$

with $pro(\mathbf{p}_i) \in [0, 1]$. The greater the value of $pro(\mathbf{p}_i)$, the more likely the point is an outlier.

We then decide whether the point $\mathbf{p}_i$ is kept based on $pro(\mathbf{p}_i)$. We retain the point $\mathbf{p}_i$ if it satisfies the condition in Eq 4:

$$pro(\mathbf{p}_i) < \delta \tag{4}$$

During testing, we found that no single threshold *δ* suits all models, because different models are scanned under different settings. In practice, we choose *δ* as a fixed ratio of $\overline{\mathbf{d}}_i$, i.e. $\delta = 0.1 \cdot \overline{\mathbf{d}}_i$.
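As a concrete illustration of the average neighbor distance in Eq 1, the following minimal sketch computes it with a brute-force neighbor search. The function name `mean_knn_distance` and the toy point list are illustrative assumptions, not part of the original method description, and a spatial index would normally replace the exhaustive search on real scans.

```python
import math


def mean_knn_distance(points, k):
    """Mean Euclidean distance from each point to its k nearest neighbours (Eq 1)."""
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")
    means = []
    for i, p in enumerate(points):
        # Brute force: distances from p_i to every other point, ascending.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        means.append(sum(dists[:k]) / k)
    return means


# Toy data (assumption): four points with unit spacing plus one distant point.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
         (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
d_bar = mean_knn_distance(cloud, k=2)
print(d_bar)  # [1.5, 1.0, 1.0, 1.5, 7.5]
```

The distant point has a much larger average distance than the others, which is the signal the density criterion relies on.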

The pseudocode of our sparse outlier and isolated outlier removal algorithm is given in Algorithm 1. By setting a proper threshold *δ* on the outlier probability and removing the points that exceed it, the model discards the isolated outlier clusters and sparse outliers.

Fig 4 illustrates the effect of *δ* in a simulation study, showing the denoised point cloud for different values of *δ*. When *δ* = 0.0025, most of the points in the bear model are deleted. When *δ* = 0.0075, the outlier removal method has no effect.

**Algorithm 1** Sparse outlier and isolated outlier removal algorithm

1: **Input**: three-dimensional scanned PCD $\mathbf{P}$ with various outliers

2: **Output**: sparse outlier and isolated outlier removal results

3: **for** all points $\mathbf{p}_i$ in $\mathbf{P}$ **do**

4: search the *k* nearest neighbors of $\mathbf{p}_i$, i.e. $KNN(\mathbf{p}_i)$

5: calculate the local density $\mathbf{LD}(\mathbf{p}_i)$ of $\mathbf{p}_i$

6: compute the probability $pro(\mathbf{p}_i)$

7: **end for**

8: sort the $pro(\mathbf{p}_i)$ in ascending order

9: compare the first *N* values of $pro(\mathbf{p}_i)$

10: **for** *i* = 1 : *N*

11: **if** $pro(\mathbf{p}_i) > \delta$

12: **then** delete $\mathbf{p}_i$ as an outlier

13: **end for**
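The steps of Algorithm 1 can be sketched as follows. This is a minimal sketch under stated assumptions: the density function `hypothetical_density` is a stand-in chosen only so that the example runs (it is not the authors' local density function), and the function names, the toy grid and the threshold value are likewise illustrative.

```python
import math


def hypothetical_density(d_bar):
    # ASSUMPTION: placeholder for the local density LD; maps a mean
    # neighbour distance to (0, 1]. Not the formula used in the paper.
    return 1.0 / (1.0 + d_bar)


def remove_sparse_and_isolated(points, k, delta, density=hypothetical_density):
    """Keep the points whose outlier probability pro = 1 - LD is below delta."""
    kept = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        d_bar = sum(dists[:k]) / k       # mean distance to the k nearest neighbours
        pro = 1.0 - density(d_bar)       # outlier probability
        if pro < delta:                  # retain condition; otherwise the point is dropped
            kept.append(p)
    return kept


# Toy data (assumption): a dense 3 x 3 planar grid plus one far-away point.
grid = [(0.1 * x, 0.1 * y, 0.0) for x in range(3) for y in range(3)]
cloud = grid + [(5.0, 5.0, 5.0)]
kept = remove_sparse_and_isolated(cloud, k=3, delta=0.5)
print(len(cloud), len(kept))  # 10 9
```

The sorting in steps 8 and 9 is omitted here because the threshold test does not depend on the order in which points are visited.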

## Non-isolated outlier removal

A non-isolated outlier lies very close to the surface of the object, so it is difficult to remove: attempts to delete it often cause errors or even remove original points of the object. We therefore propose a method that converts these outliers into object surface points. In this section, we illustrate how to project non-isolated outliers onto their corresponding fitted plane. Fitting a plane in the local neighborhood, which we call the local fitting plane, is the crux of the local projection based method.

### Local fitting plane

A plane is parameterized by its normal vector $\mathbf{n} = \{n_x, n_y, n_z\}$ and any given point on the plane. In order to obtain the local fitting plane, we first compute its normal vector.

Generally, principal component analysis (PCA) is performed by computing eigenvalues and eigenvectors to estimate the normal vector of the plane. A minimal ellipsoid is obtained by enclosing the *k* nearest neighbors of a point $\mathbf{p}_i$. Let $\overline{\mathbf{p}}$ be the centroid and *M* be the 3 × 3 covariance matrix defined as Eq 5, where $\mathbf{p}_i \in \mathbf{P}$, and $\lambda_0$, $\lambda_1$ and $\lambda_2$ are the eigenvalues of *M* with $\lambda_0 \le \lambda_1 \le \lambda_2$. In essence, $\lambda_0$, $\lambda_1$ and $\lambda_2$ respectively characterize the lengths of the three semi-principal axes of the ellipsoid in 3D. The eigenvector of the smallest eigenvalue approximates the normal vector at the vertex $\mathbf{p}_i$. $\overline{\mathbf{p}}$ is the center point of the *k* nearest neighbors of point $\mathbf{p}_i$. The local fitting plane $\mathbf{L}_i$ can be represented accordingly.
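The PCA-based plane fit described above can be sketched in pure Python as follows. Several details are assumptions made for illustration: the covariance is normalized by the number of neighbors, the eigen-decomposition uses classical Jacobi rotations (the source does not say which solver is used), and all function names are hypothetical.

```python
import math


def covariance(neighbors):
    """Centroid and 3x3 covariance matrix (normalised by k, an assumption)."""
    k = len(neighbors)
    centroid = [sum(p[a] for p in neighbors) / k for a in range(3)]
    m = [[sum((p[a] - centroid[a]) * (p[b] - centroid[b]) for p in neighbors) / k
          for b in range(3)] for a in range(3)]
    return centroid, m


def jacobi_eigen(m, max_rotations=50, tol=1e-12):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric 3x3 matrix."""
    a = [row[:] for row in m]
    v = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(max_rotations):
        # Pivot on the largest off-diagonal entry.
        p, q = max(((0, 1), (0, 2), (1, 2)), key=lambda ij: abs(a[ij[0]][ij[1]]))
        if abs(a[p][q]) < tol:
            break
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for r in range(3):  # A <- A J (columns p and q)
            arp, arq = a[r][p], a[r][q]
            a[r][p], a[r][q] = c * arp - s * arq, s * arp + c * arq
        for r in range(3):  # A <- J^T A (rows p and q)
            apr, aqr = a[p][r], a[q][r]
            a[p][r], a[q][r] = c * apr - s * aqr, s * apr + c * aqr
        for r in range(3):  # V <- V J accumulates the eigenvectors
            vrp, vrq = v[r][p], v[r][q]
            v[r][p], v[r][q] = c * vrp - s * vrq, s * vrp + c * vrq
    return [a[i][i] for i in range(3)], v


def fit_local_plane(neighbors):
    """Return (centroid, unit normal); the normal belongs to the smallest eigenvalue."""
    centroid, m = covariance(neighbors)
    values, vectors = jacobi_eigen(m)
    i_min = min(range(3), key=lambda i: values[i])
    normal = [vectors[r][i_min] for r in range(3)]
    length = math.sqrt(sum(x * x for x in normal))
    return centroid, [x / length for x in normal]


# Toy neighbourhood (assumption): six points lying on the plane x + y + z = 0.
patch = [(1, -1, 0), (0, 1, -1), (-1, 0, 1), (2, -1, -1), (1, 1, -2), (-1, -1, 2)]
centroid, normal = fit_local_plane(patch)
```

For this patch the recovered normal is parallel to (1, 1, 1) up to sign, since PCA cannot determine the orientation of the normal.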

### Detailed non-isolated outlier removal

In this section, the target is to project those outlier points onto the local plane $\mathbf{L}_i$. Fig 5 displays the local fitting plane $\mathbf{L}_i$ of point $\mathbf{p}_i$ and the process of moving the neighboring point $\mathbf{q}_j$ onto the local plane.

The normal vector $\mathbf{n} = \{n_x, n_y, n_z\}$, with $\|\mathbf{n}\| = 1$, and a given point $\mathbf{p}_0$ parameterize the plane $\mathbf{L}_i$. The deviation of $\mathbf{q}_j$ from the plane $\mathbf{L}_i$ is given by Eq 6:

$$dis_{p_i q_j} = (\mathbf{q}_j - \mathbf{p}_0) \cdot \mathbf{n} \tag{6}$$

The projection of point $\mathbf{q}_j$ onto the corresponding fitted plane pushes $\mathbf{q}_j$ along the opposite direction of the normal vector $\mathbf{n}$. The projected point $\mathbf{q}_j'$ is defined as Eq 7:

$$\mathbf{q}_j' = \mathbf{q}_j - dis_{p_i q_j} \cdot \mathbf{n} \tag{7}$$
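A small worked example of Eqs 6 and 7 may help. The sketch below assumes a unit normal and uses hypothetical function names and toy coordinates; it only illustrates the signed deviation and the projection step.

```python
def signed_deviation(q, p0, n):
    """Eq 6: signed distance of q from the plane through p0 with unit normal n."""
    return sum((q[a] - p0[a]) * n[a] for a in range(3))


def project_onto_plane(q, p0, n):
    """Eq 7: move q against the normal by its signed deviation."""
    d = signed_deviation(q, p0, n)
    return tuple(q[a] - d * n[a] for a in range(3))


# Toy plane (assumption): z = 1 with normal pointing along +z.
p0 = (0.0, 0.0, 1.0)
n = (0.0, 0.0, 1.0)
q = (2.0, -3.0, 1.5)

q_proj = project_onto_plane(q, p0, n)
print(signed_deviation(q, p0, n))  # 0.5
print(q_proj)                      # (2.0, -3.0, 1.0)
```

After projection the deviation of the point from the plane is zero, while its position within the plane is unchanged.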

**Algorithm 2** Non-isolated outlier removal algorithm

1: **Input**: three-dimensional scanned point cloud data with various outliers

2: **Output**: non-isolated outlier removal

3: **for** all points $\mathbf{p}_i$ ($i = 1, 2, \ldots, N$) in $\mathbf{P}$ **do**

4: search the *k* nearest neighbors $\mathbf{q}_j$ ($j = 1, 2, \ldots, k$)

5: fit a local plane $\mathbf{L}_i$ for $\mathbf{p}_i$ and $\mathbf{q}_j$

6: compute the deviation $dis_{p_i q_j}$ of $\mathbf{p}_i$, $\mathbf{q}_j$ from plane $\mathbf{L}_i$

7: project $\mathbf{q}_j$ onto the corresponding fitted plane $\mathbf{L}_i$

8: the projected point $\mathbf{q}_j'$ is the new coordinate of $\mathbf{q}_j$

9: **end for**

The key step is to fix the local fitting plane, and the quality of the fitted plane depends on the value of *k* (the number of nearest neighbors). A larger *k* may deform the model, whereas a smaller *k* may yield an invalid fitting plane and hence an invalid projection. Fig 6 shows the results with different *k*.

## Experimental results

We experimentally evaluate our method primarily using scanned point clouds that contain various categories of noise. The datasets used in our experiments include:

A gallery of models (Fig 7) chosen from the Princeton shape database, to which various levels of Gaussian noise are added.

Several real indoor scene data sets.

Datasets from [6], including the DRAGON, TORCH, and STATUE models.

We first describe the experimental setting of our method and then demonstrate the results. We also summarize the performance of our algorithm in terms of accuracy and completeness errors as well as runtime.

Experimental setting. Our method is implemented in C++ and runs on a desktop PC with an Intel i7-6700 CPU (quad core, 3.4 GHz) and an AMD Radeon R5 340X graphics card.

### Results on 3D models

We apply our algorithm to six 3D models: chair, table, bird, monster, bear, and Nail. Fig 7 shows the experimental results of our method for those models. Fig 8(A) shows the table model, which contains all kinds of outliers. After the local density based processing, most isolated outliers are deleted, as shown in Fig 8(B) and 8(C). The final data in Fig 8(D) have been trimmed by the local projection based method. The results show that our method is effective for denoising point cloud data.

Figs 9–13 show the experimental results for the chair, bird, monster and bear models. In addition, to test the robustness of our method, we apply it to two scanned real-life indoor scenes with 146044 and 153218 points, respectively. These real-life scenes contain a variety of cluttered objects and more outliers. Figs 14 and 15 show the two original scenes and the outlier removal results.

### Comparison against other denoiser

We compare our outlier removal method with two alternative methods in the Point Cloud Library: a radius-based method and a statistics-based method. We evaluate the quality of our outlier removal system by running it on the six test models and the test scenes S1 and S2; the final point clouds after removing the different outliers are shown in Figs 16–23. As these figures show, when only sparse outliers and isolated outliers are present, both alternative methods obtain reasonably good removal results. When non-isolated outliers exist, however, the quality of their results decreases significantly. In contrast, our method produces good results even for cluttered indoor scenes.

Fig 24 plots the running time of our method for all the 3D models in the paper. For comparison, we also plot the results of the radius-based method and the statistics-based method. The statistics-based method takes less time than ours, but the quality of its outlier removal results is poor. Fig 25 plots the running time of our method for the two indoor scenes S1 and S2. Our method takes less time than the other two methods when the number of points increases significantly. Our outlier removal method is robust, making it particularly effective for removing different kinds of outliers.

In Figs 26–28, we compare our method with the point cloud denoising method in [6]. The fine details (such as smooth boundaries) are retained for the DRAGON model in Fig 28(C). For the TORCH model, our method removes the noise without losing any data points, whereas the method in [6] leads to missing data.

### Performance analysis

We summarize the performance of our algorithm in terms of accuracy and completeness errors as well as running time. Table 1 reports timing and related statistics of our algorithm on different models. The columns are, from left to right: the model, the number of original points $N$, the number of points after removing the isolated outlier clusters $N_i$, the number of points after removing the sparse outliers $N_s$, the total time $t$ taken by the method, and the ratio of detected noise $P_c = 1 - \frac{N_s}{N}$.

### Table 1

model | $N$ | $N_i$ | $N_s$ | $t$ | $P_c$ |
---|---|---|---|---|---|
table | 15845 | 15433 | 14968 | 0.05 | 0.055 |
chair | 17623 | 17113 | 16565 | 0.06 | 0.060 |
bird | 13180 | 12742 | 12470 | 0.04 | 0.054 |
monster | 28246 | 27510 | 26588 | 0.08 | 0.059 |
bear | 15365 | 14931 | 14461 | 0.05 | 0.059 |
Nail | 15009 | 14655 | 14507 | 0.05 | 0.033 |
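The ratio of detected noise in Table 1 can be recomputed from the point counts in the same table. The short sketch below does so; the dictionary layout and the function name `detected_ratio` are illustrative assumptions, while the numbers are copied from Table 1.

```python
# (original point count N, count after sparse outlier removal N_s) from Table 1
rows = {
    "table": (15845, 14968),
    "chair": (17623, 16565),
    "bird": (13180, 12470),
    "monster": (28246, 26588),
    "bear": (15365, 14461),
    "Nail": (15009, 14507),
}


def detected_ratio(n_original, n_after_sparse):
    """Ratio of detected noise: P_c = 1 - N_s / N."""
    return 1.0 - n_after_sparse / n_original


ratios = {name: round(detected_ratio(n, ns), 3) for name, (n, ns) in rows.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Rounded to three decimals, the recomputed values agree with the last column of Table 1.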

To assess the results more quantitatively, we measured the deviation of the reconstructed meshes from the ground truth. We evaluate the accuracy and completeness of each mesh according to the metrics used in [25]. We measured errors in terms of accuracy (in world units) and completeness (in percent), using an accuracy threshold of 90% and a completeness threshold of 0.005 world units. Table 2 shows the accuracy and completeness of the 3D models after applying our method. Tables 3 and 4 list the timings and parameter settings of our method.

### Table 2

model | Accuracy | Completeness |
---|---|---|
table | 0.004197 | 93.4% |
chair | 0.00303 | 100% |
bird | 0.004293 | 93.6% |
monster | 0.002769 | 97.3% |
bear | 0.002205 | 100% |
Nail | 0.005783 | 87.3% |
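To make the two columns of Table 2 concrete, the sketch below shows one common way such metrics are computed on point sets. It assumes that accuracy is the distance within which 90% of the result points lie from the ground truth, and that completeness is the percentage of ground-truth points with a result point within 0.005 world units; the exact protocol of [25] is not reproduced here, and the function names and toy data are hypothetical.

```python
import math


def nearest_distance(p, cloud):
    """Distance from p to its closest point in cloud (brute force)."""
    return min(math.dist(p, q) for q in cloud)


def accuracy(result, ground_truth, fraction=0.9):
    """Smallest d such that at least `fraction` of result points are within d."""
    dists = sorted(nearest_distance(p, ground_truth) for p in result)
    index = max(0, math.ceil(fraction * len(dists)) - 1)
    return dists[index]


def completeness(result, ground_truth, tolerance=0.005):
    """Percentage of ground-truth points with a result point within tolerance."""
    covered = sum(1 for g in ground_truth
                  if nearest_distance(g, result) <= tolerance)
    return 100.0 * covered / len(ground_truth)


# Toy data (assumption): ten ground-truth points on a line; the result misses
# the last one and sits 0.001 above the others.
ground_truth = [(i / 100, 0.0, 0.0) for i in range(10)]
result = [(i / 100, 0.0, 0.001) for i in range(9)]

acc = accuracy(result, ground_truth)
comp = completeness(result, ground_truth)
print(comp)  # 90.0
```

Under these definitions a lower accuracy value and a higher completeness percentage both indicate a better result.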

### Table 3

Model | Original Data Number | After Removal | Time (s) | k | δ |
---|---|---|---|---|---|
TORCH | 3604726 | 2946940 | 2421.391 | 50 | 0.05 |
DRAGON | 5601652 | 4815690 | 514.392 | 400 | 0.1 |
STATUE | 5849643 | 4767875 | 789.507 | 100 | 0.0005 |

### Table 4

Model | k (sparse outlier and isolated outlier removal) | $\delta_i$ | k (non-isolated outlier removal) |
---|---|---|---|
table | 80 | 0.005 | 10 |
chair | 75 | 0.004 | 10 |
bird | 70 | 0.003 | 10 |
monster | 50 | 0.0015 | 10 |
bear | 50 | 0.0035 | 10 |
Nail | 50 | 0.0015 | 10 |

We give the complexity of the two key algorithmic components. The complexity is *O*(*N* log *N*) for the local density based method, with *N* being the number of points in an object, and *O*(*N* log *N*) for the local projection based method.
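A bound of this order is what one would expect if each neighbor query is answered through a spatial index rather than by exhaustive search. The source does not state which structure is used, so the k-d tree below is an assumption offered purely as an illustration; the function names are hypothetical. Re-sorting at every level keeps the sketch short at the cost of a slower build than an optimized implementation.

```python
import heapq
import math


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def knn(tree, query, k):
    """Return the k points nearest to query, closest first."""
    heap = []  # max-heap of the best k so far, stored as (-distance, point)

    def visit(node):
        if node is None:
            return
        point, axis, left, right = node
        d = math.dist(point, query)
        if len(heap) < k:
            heapq.heappush(heap, (-d, point))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, point))
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # Cross the splitting plane only if it could hide a closer point.
        if len(heap) < k or abs(diff) < -heap[0][0]:
            visit(far)

    visit(tree)
    return [p for _, p in sorted(heap, key=lambda item: -item[0])]


# Toy data (assumption): a 4 x 4 x 4 integer lattice.
lattice = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
tree = build_kdtree(lattice)
query = (0.1, 0.2, 0.3)
nearest = knn(tree, query, 3)
print(nearest)  # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```

With a balanced tree and a fixed *k*, each query visits on average a number of nodes that grows logarithmically with *N*, so querying all *N* points stays within the stated order.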

### Limitation

As our method relies on the point distribution and needs to calculate the local density of points, it might fail when very dense outliers exist. When the outliers are dense, removing them may deform the object. To address this problem, we plan to adopt a learning-based strategy that can distinguish outliers from original points.

## Conclusion

In this paper, a robust method is presented to effectively remove isolated outliers, sparse outliers, and non-isolated outliers from scanned objects in point clouds. The method builds on two fundamental cues: the local density and the deviation from the local fitting plane. The local density based method removes sparse outliers and isolated outliers. The deviation of a point from the local fitting plane serves as the criterion to reduce non-isolated outliers. Experimental results demonstrate the ability to remove complex outliers from various man-made objects. As demonstrated, the presented method achieves robust results in removing three types of outliers while preserving distinct geometric features such as sharp edges in a scanned point cloud.

## Acknowledgments

The author would like to thank the anonymous reviewers for their valuable comments.

## Funding Statement

This work was supported in part by the National Natural Science Foundation of China (6187012372 and 61472319) (XN, YW); in part by China Postdoctoral Science Foundation (2014M552469); in part by Education Department of Shaanxi Province Key Laboratory Project (17JS099) (XN); in part by Shaanxi Postdoctoral Science Foundation (434015014) (XN); and in part by Shaanxi Natural Science Foundation (2017JQ6023) (XN).

## Data Availability

The data sets and the evaluation protocol are publicly available as described in the Experimental results section; the 3D model datasets are at http://segeval.cs.princeton.edu/. Indoor data are available in the Supporting Information files. The comparison experiment is implemented on the dataset at http://igl.ethz.ch/projects/noise-rem/.

## References

**Public Library of Science**