singapore-postal-codes

Data Accuracy Issues

Open joelkarunungan opened this issue 7 years ago • 2 comments

If you are downloading the data and assume it is accurate because it comes from government sources, lower your expectations and be aware of the following issues:

  1. The postal code information in buildings.json is full of spam locations, especially banks. ATMs are not buildings, and neither are the banks' temporary sales sites. How did this information even end up there?
  2. Data values are 99% in all caps. This is problematic since the actual building spelling nuances are not respected. iSuite vs ISUITE, MacDonald vs MACDONALD, etc.
  3. A lot of typographical errors.
  4. A lot of incomplete building names. A lot of the official building names are not provided.

Overall, very low quality considering it's supposed to be scrubbed and well-maintained government information.

Postal codes should be cross-referenced with the actual building names from another, cleaner source, maybe URA SPACE?
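As a rough illustration of flagging the suspect entries described above before cross-referencing, here is a minimal Python sketch. It assumes buildings.json is a list of records that each carry a `BUILDING` field; the keyword pattern, the function name `split_suspect`, and the sample records are hypothetical and would need tuning against the real data.

```python
import re

# Hypothetical keyword pattern for entries that are unlikely to be real buildings.
SUSPECT_PATTERN = re.compile(r"\b(ATM|SALES)\b", re.IGNORECASE)

def split_suspect(records):
    """Split records into (kept, suspect) based on the BUILDING name."""
    kept, suspect = [], []
    for rec in records:
        name = rec.get("BUILDING") or ""
        if SUSPECT_PATTERN.search(name):
            suspect.append(rec)
        else:
            kept.append(rec)
    return kept, suspect

# Made-up sample records, not taken from the dataset.
sample = [
    {"POSTAL": "000001", "BUILDING": "EXAMPLE BANK ATM"},
    {"POSTAL": "000002", "BUILDING": "EXAMPLE TOWER"},
]
kept, suspect = split_suspect(sample)
```

Keeping the flagged records in a separate list, rather than dropping them, makes it easier to review false positives by hand.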

@xkjyeah thanks for this, it is very helpful. Just want others to be aware of the issues.

Sep 01 '18 05:09 joelkarunungan

Yes it's embarrassing. I would have hoped their vendors were more up to scratch in providing clean data, but this is the best we've got.

I would suggest using it mainly as a source of latitude/longitude information, but even that may be of limited use (e.g. for vehicle pickups you generally want to identify the entrance to a building, not the centre of a building).

I would suggest sending feedback at https://docs.onemap.sg/#contact-us

Sep 01 '18 10:09 xkjyeah

Will do.

Besides the coordinates, the BLK_NO and ROAD_NAME parameters are also useful. It appears that their values can be trusted. For example, in autofill for address fields, when the user provides POSTAL, the underlying system can use the OneMap data to autofill the BLK_NO and ROAD_NAME parameters.

The BUILDING parameter definitely can't be trusted and will have to be filled in by the user.
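The autofill idea above could be sketched roughly as follows in Python. This assumes the data is available as a list of records keyed by `POSTAL`, `BLK_NO`, `ROAD_NAME` and `BUILDING`; the function names and the sample record are hypothetical, and the building name is deliberately left blank for the user to fill in.

```python
def build_postal_index(records):
    """Map each postal code to the first record seen for it."""
    index = {}
    for rec in records:
        postal = rec.get("POSTAL")
        if postal and postal not in index:
            index[postal] = rec
    return index

def autofill_address(index, postal):
    """Return block and road for a postal code, or None if it is unknown."""
    rec = index.get(postal.strip())
    if rec is None:
        return None
    return {
        "block": rec.get("BLK_NO", ""),
        "road": rec.get("ROAD_NAME", ""),
        "building": "",  # not taken from the data; the user supplies it
    }

# Made-up sample record, not taken from the dataset.
sample_records = [
    {"POSTAL": "000001", "BLK_NO": "1", "ROAD_NAME": "EXAMPLE ROAD",
     "BUILDING": "EXAMPLE BANK ATM"},
]
index = build_postal_index(sample_records)
filled = autofill_address(index, " 000001 ")
```

With real data, the records would be loaded once (for example with `json.load`) and the index reused for every lookup.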

Sep 01 '18 15:09 joelkarunungan