The logic of `io.las.read_las_with_laspy()` may not meet the las data specification.

Open nokonoko1203 opened this issue 2 years ago • 3 comments

Hello. Thanks for the nice library! I think I may have found a bug, could you please check?

Describe the bug: The logic of io.las.read_las_with_laspy() may not conform to the LAS data specification, because the header offsets are not applied to the xyz coordinates. https://github.com/daavoo/pyntcloud/blob/c9dcf59eacbec33de0279899a43fe73c5c094b09/pyntcloud/io/las.py#L46

To Reproduce: follow these steps to reproduce the behavior.

  • Download point cloud data (.las) of Kakegawa Castle.
    • https://www.geospatial.jp/ckan/dataset/kakegawacastle/resource/61d02b61-3a44-4a5b-b263-814c6aa23551
    • Note that the download is a 2 GB zip file that expands to 5 GB.
    • The file name is in Japanese, so be careful with the character encoding.
    • For background on Kakegawa Castle, see https://en.wikipedia.org/wiki/Kakegawa_Castle
  • Rename the file if you like. Here, the file name is KakegawaCastle.las.
  • Run the following code to get the xyz coordinates.
    • The result contains about 192 million points (192,366,079 rows).
from pyntcloud import PyntCloud
cloud = PyntCloud.from_file("./KakegawaCastle.las")
cloud.points
# x y z intensity bit_fields raw_classification scan_angle_rank user_data point_source_id red green blue
# 0 37.910053 71.114777 28.936932 513 0 1 0 0 29 138 122 127
# 1 37.690052 75.975777 28.918930 2309 0 1 0 0 29 15 5 14
# 2 38.465054 71.277779 33.523930 64149 0 1 0 0 29 44 15 35
# 3 32.406052 78.586777 30.808931 19758 0 1 0 0 29 99 54 59
# 4 30.372051 86.346779 30.809931 257 0 1 0 0 29 107 56 55
# ...	...	...	...	...	...	...	...	...	...	...	...	...
# 192366074 151.807999 172.604996 17.660999 50886 0 1 0 0 29 198 198 190
# 192366075 152.425003 173.162994 16.458000 25186 0 1 0 0 29 101 96 96
# 192366076 152.126007 172.781998 16.620001 30840 0 1 0 0 29 121 120 116
# 192366077 152.085007 172.682999 17.497000 40863 0 1 0 0 29 166 157 146
# 192366078 151.832993 173.360001 16.886000 31868 0 1 0 0 29 132 121 115
# 192366079 rows × 12 columns
  • Here, the first value in column x is 37.910053.
  • Running pdal info on the same file reports the first point as follows.
    • pdal info: https://pdal.io/apps/info.html
% pdal info ./KakegawaCastle.las -p 0
{
  "file_size": 5001518281,
  "filename": "KakegawaCastle.las",
  "now": "2022-06-14T09:39:43+0900",
  "pdal_version": "2.4.0 (git-version: Release)",
  "points":
  {
    "point":
    {
      "Blue": 32640,
      "Classification": 1,
      "EdgeOfFlightLine": 0,
      "Green": 31365,
      "Intensity": 513,
      "NumberOfReturns": 0,
      "PointId": 0,
      "PointSourceId": 29,
      "Red": 35445,
      "ReturnNumber": 0,
      "ScanAngleRank": 0,
      "ScanDirectionFlag": 0,
      "UserData": 0,
      "X": -44490.84295,
      "Y": -135781.1752,
      "Z": 54.58493098
    }
  },
  "reader": "readers.las"
}
  • The value of X is -44490.84295, which differs from the value output by pyntcloud!
  • The same value can be computed with laspy from the raw integer, the scale, and the offset:
import laspy
las = laspy.read("./KakegawaCastle.las")
header = las.header

# first x point value: 531578298
x_point = las.X[0]

# x scale: 7.131602618438667e-08 (about 0.00000007)
x_scale = header.x_scale

# x offset: -44528.753
x_offset = header.x_offset

# x_coordinate output from above variables: -44490.842948180776
real_coordinate_x = (x_point * x_scale) + x_offset
  • Interpreted in EPSG:6676, the coordinate computed with laspy is indeed at Kakegawa Castle!
    • https://www.google.co.jp/maps/place/34%C2%B046'30.3%22N+138%C2%B000'50.1%22E/@34.775077,138.0117243,17z/data=!3m1!4b1!4m5!3m4!1s0x0:0x2ce21e9ef0b19341!8m2!3d34.775077!4d138.013913?hl=ja
  • However, the logic in read_las_with_laspy() at the following link applies the scale but does not add the offset values: https://github.com/daavoo/pyntcloud/blob/c9dcf59eacbec33de0279899a43fe73c5c094b09/pyntcloud/io/las.py#L55

Expected behavior: The xyz coordinates in the DataFrame take the header offset values into account.
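
The expected conversion can be sketched as follows, using the values reported above for the first point. This is only an illustration of raw * scale + offset; the function name to_real_coordinate is hypothetical and is not part of pyntcloud or laspy.

```python
def to_real_coordinate(raw_value: int, scale: float, offset: float) -> float:
    """Convert a raw LAS integer coordinate to a real-world coordinate."""
    return raw_value * scale + offset


# Values reported in this issue for the first point of KakegawaCastle.las.
x_raw = 531578298
x_scale = 7.131602618438667e-08
x_offset = -44528.753

# Scale only: close to the 37.910053 that pyntcloud returns.
scaled_only = x_raw * x_scale

# Scale and offset: close to the -44490.84295 that pdal reports.
real_x = to_real_coordinate(x_raw, x_scale, x_offset)

print(scaled_only, real_x)
```

The same formula applies to y and z with their own scale and offset values from the header.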

Screenshots: None.

Desktop (please complete the following information):

  • OS: macOS Monterey v12.4
  • Browser: Not used.
  • Version:
❯ conda -V
conda 4.12.0
❯ conda list | grep pyntcloud
pyntcloud 0.3.0 pyhd8ed1ab_0 conda-forge

Additional context: If the above looks OK, shall I create a pull request?

nokonoko1203, Jun 14 '22 01:06

Hola @nokonoko1203 ! Thanks for reporting.

> If the above context looks OK, shall I create a PullRequest?

It looks OK to me; don't hesitate to open the PR.

daavoo, Jun 14 '22 08:06

@daavoo Thanks for checking! I believe I followed the documentation (https://github.com/daavoo/pyntcloud/blob/c9dcf59eacbec33de0279899a43fe73c5c094b09/docs/contributing.rst) to install, but pytest reports 12 errors during test collection.

Are you aware of this problem?

I have redacted personal information, but these are the complete steps I ran.

% cd ~
% git clone https://github.com/daavoo/pyntcloud.git
% conda create -n pyntcloud python=3.7
% conda activate pyntcloud
% pip install -e pyntcloud
% pip install numba flake8 pytest
% cd pyntcloud
% pytest -v
================================================================================================= test session starts ==================================================================================================
platform darwin -- Python 3.7.12, pytest-7.1.2, pluggy-1.0.0 -- ~opt/anaconda3/envs/pyntcloud/bin/python3.7
cachedir: .pytest_cache
rootdir: ~/pyntcloud
collected 148 items / 12 errors  

... # more error log

_________________________________________________________________________ ERROR collecting tests/unit/structures/test_voxelgrid_structures.py __________________________________________________________________________
import file mismatch:
imported module 'test_voxelgrid_structures' has this __file__ attribute:
  
~/pyntcloud/tests/integration/structures/test_voxelgrid_structures.py
which is not the same as the test file we want to collect:

~/pyntcloud/tests/unit/structures/test_voxelgrid_structures.py
HINT: remove __pycache__ / .pyc files and/or use a unique basename for your test file modules
=============================================================================================== short test summary info ================================================================================================
ERROR tests/unit/filters/test_kdtree_filters.py
ERROR tests/unit/filters/test_xyz_filters.py
ERROR tests/unit/samplers/test_mesh_samplers.py
ERROR tests/unit/samplers/test_points_samplers.py
ERROR tests/unit/samplers/test_voxelgrid_samplers.py
ERROR tests/unit/scalar_fields/test_eigenvalues_scalar_fields.py
ERROR tests/unit/scalar_fields/test_k_neighbors_scalar_fields.py
ERROR tests/unit/scalar_fields/test_normals_scalar_fields.py
ERROR tests/unit/scalar_fields/test_rgb_scalar_fields.py
ERROR tests/unit/scalar_fields/test_voxlegrid_scalar_fields.py
ERROR tests/unit/scalar_fields/test_xyz_scalar_fields.py
ERROR tests/unit/structures/test_voxelgrid_structures.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 12 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
================================================================================================== 12 errors in 0.43s ==================================================================================================         
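
The log above shows the same basename, test_voxelgrid_structures.py, under both tests/unit and tests/integration, and pytest's HINT suggests using unique basenames. A minimal sketch for listing such collisions, assuming duplicate basenames (rather than stale __pycache__ files) are the cause here; the helper name find_duplicate_test_basenames is hypothetical:

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_test_basenames(root):
    """Map each test file basename found more than once under root to its paths."""
    by_name = defaultdict(list)
    # Group every test_*.py file below root by its file name alone.
    for path in sorted(Path(root).rglob("test_*.py")):
        by_name[path.name].append(path)
    # Keep only the names that occur in more than one directory.
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

Calling find_duplicate_test_basenames("tests") from the repository root should list each colliding basename with the paths that share it.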

nokonoko1203, Jun 14 '22 12:06

However, the test errors do not affect the part I am modifying, so I have created a PR. Please take a look: https://github.com/daavoo/pyntcloud/pull/335

nokonoko1203, Jun 15 '22 06:06