tsv file issue

Open sukuchha opened this issue 2 years ago • 15 comments

It's not a valid GeoJSON file, so I cannot open it in any GIS software.

Is there a simple way to convert the downloaded TSV file into a valid GeoJSON file?

sukuchha avatar Dec 28 '22 09:12 sukuchha

Thanks sukuchha - I noticed this too. I only downloaded the Europe and Caribbean files, but both were TSV rather than GeoJSON.

alasdairrae avatar Dec 28 '22 09:12 alasdairrae

It just requires a bit more processing, but maxolasersquad replied with information in another comment.

alasdairrae avatar Dec 28 '22 18:12 alasdairrae

see: https://gist.github.com/johnwbryant/06b504e2cfb4044c5216a1627ccc6180

jeffcsauer avatar Dec 30 '22 17:12 jeffcsauer

Thanks Jeff 👍👍

alasdairrae avatar Dec 30 '22 17:12 alasdairrae

For what it's worth I've written some code that splits the region files into GeoPackage (gpkg) country files here.

I've also had a go at hosting some of these on GitHub here under the transport-network directory, but I'm likely to have to remove them and find somewhere else to host them, as I'm 1296% over my Large File Storage limit...

anisotropi4 avatar Jan 01 '23 15:01 anisotropi4

Is there a reason for offering the data as TSV, which needs processing before it can be imported into GIS software? Even in terms of size, neither GeoJSON nor TSV is a good choice, since neither is compressed at all. Wouldn't the SHP format be better for ease of use?

mkmdivy avatar Jan 02 '23 11:01 mkmdivy

@mkmdivy FWIW I would suggest GeoPKG over SHP as it holds all data in a single file, unlike SHP

Equally, if the files were in CSV/WKT with a header of key,WKT, then standard GIS libraries like GDAL would accept them as input. For example:

$ cat test.csv
key,WKT
GBR,"LINESTRING (-1.388108 53.194664, -1.386799 53.195011, -1.386917 53.195332, -1.38711 53.195686)"
GBR,"LINESTRING (-1.131978 52.654494, -1.131592 52.654181, -1.131538 52.654149)"
GBR,"LINESTRING (-2.118162 53.399298, -2.117958 53.398754, -2.118044 53.398454)"
GBR,"LINESTRING (-1.821392 53.804871, -1.820995 53.805213)"
GBR,"LINESTRING (-0.435398 53.786043, -0.436192 53.785916, -0.435977 53.785466, -0.435151 53.785548)"

This is a valid GIS file:

$ ogrinfo CSV:test.csv -al -so
INFO: Open of `CSV:test.csv'
  using driver `CSV' successful.
Layer name: test
Geometry: Unknown (any)
Feature Count: 5
Extent: (-2.118162, 52.654149) - (-0.435151, 53.805213)
Layer SRS WKT:
(unknown)
key: String (0.0)
WKT: String (0.0)

To convert the key/WKT file to whatever format is required, you can then use ogr2ogr, for example to GeoJSON:

$ ogr2ogr -f GeoJSON test.json CSV:test.csv -oo KEEP_GEOM_COLUMNS=NO -nln GBR -s_srs EPSG:4326 -t_srs EPSG:4326
$ cat test.json
{"type": "FeatureCollection", "name": "GBR", "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
"features": [
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.388108, 53.194664 ], [ -1.386799, 53.195011 ], [ -1.386917, 53.195332 ], [ -1.38711, 53.195686 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.131978, 52.654494 ], [ -1.131592, 52.654181 ], [ -1.131538, 52.654149 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -2.118162, 53.399298 ], [ -2.117958, 53.398754 ], [ -2.118044, 53.398454 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -1.821392, 53.804871 ], [ -1.820995, 53.805213 ] ] } },
{ "type": "Feature", "properties": { "key": "GBR" }, "geometry": { "type": "LineString", "coordinates": [ [ -0.435398, 53.786043 ], [ -0.436192, 53.785916 ], [ -0.435977, 53.785466 ], [ -0.435151, 53.785548 ] ] } }
]}

Or similarly for GeoPackage:

$ ogr2ogr -f GPKG test.gpkg CSV:test.csv -oo KEEP_GEOM_COLUMNS=NO -nln GBR -s_srs EPSG:4326 -t_srs EPSG:4326
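
For anyone who wants to get from the downloaded TSV to a key,WKT file like test.csv, here is a minimal Python sketch using only the standard library. It assumes the TSV has no header row, that each row is a country code, a tab, and a GeoJSON LineString, and that the second column may be either a bare geometry or a Feature wrapping one. The function names linestring_to_wkt and tsv_to_wkt_csv are made up for illustration.

```python
import csv
import json


def linestring_to_wkt(geojson_text):
    """Turn a GeoJSON LineString (bare geometry or Feature) into WKT."""
    obj = json.loads(geojson_text)
    # Assumption: the column may hold a Feature wrapping the geometry.
    geometry = obj["geometry"] if obj.get("type") == "Feature" else obj
    if geometry["type"] != "LineString":
        raise ValueError("expected a LineString, got " + geometry["type"])
    # Keep only x and y; any third (elevation) value is dropped.
    points = ", ".join(f"{x} {y}" for x, y, *_ in geometry["coordinates"])
    return f"LINESTRING ({points})"


def tsv_to_wkt_csv(tsv_path, csv_path):
    """Stream a two-column TSV into a key,WKT CSV, one row at a time."""
    count = 0
    with open(tsv_path, encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["key", "WKT"])
        for line in src:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            key, geojson_text = line.split("\t", 1)
            writer.writerow([key, linestring_to_wkt(geojson_text)])
            count += 1
    return count
```

Because the file is read line by line, memory use should stay flat even for the larger region files. The resulting CSV can then be passed to the ogr2ogr commands above.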

anisotropi4 avatar Jan 02 '23 13:01 anisotropi4

Dear anisotropi, can you suggest a software package for running these commands in Windows PowerShell?

krassakis avatar Jan 02 '23 18:01 krassakis

@krassakis as I use a Debian-based version of Linux, I don't really know what to suggest. A quick internet search for "install ogr2ogr on windows" gives this here. The ogrinfo and ogr2ogr software is part of the GDAL library (https://gdal.org/index.html). My current Road Detection project is here: https://github.com/anisotropi4/robin

anisotropi4 avatar Jan 02 '23 19:01 anisotropi4

Thank you very much.

I have the following problem: have I typed the input and output paths correctly?

Thank you in advance.

image

krassakis avatar Jan 03 '23 09:01 krassakis

Here is another tool to work with this data: https://github.com/rabenojha/microsoft-road-data

ramcandrews avatar Jan 04 '23 07:01 ramcandrews

@ramcandrews I have raised an issue with the rabenojha tool, as it worked with the Caribbean Islands dataset but crashed with a memory error on the Europe-Full dataset.

anisotropi4 avatar Jan 04 '23 10:01 anisotropi4

it worked with the Caribbean Islands dataset but crashed with a memory error on the Europe-Full dataset

I shouldn't recommend things before I try them. Sorry!

ramcandrews avatar Jan 04 '23 11:01 ramcandrews

image

Is there an easy way to download GRC.zip? It asks me to first download a Git LFS executable.

krassakis avatar Jan 05 '23 21:01 krassakis

@krassakis I was told last week that I had exceeded my GitHub LFS allocation by 1296%, so I suspect the files have been deleted. I am looking at alternative hosting, but that won't be quick. If you would like a copy, it might be quicker to contact me via social media and I'll see what I can do.

anisotropi4 avatar Jan 05 '23 22:01 anisotropi4

The data format is a TSV file (tab-separated values) with 2 columns: CountryCode (the alpha-3 code of the country where the geometry is located) and a GeoJSON linestring.

There is simply too much data to store in a single GeoJSON file. One simple way to create a combined GeoJSON file is to wrap the linestrings like this (it can all be done manually in Notepad++):

{"type":"FeatureCollection","features":[ linestring1 , linestring2 , ... linestringN ]}
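
The same wrapping can be scripted rather than done by hand. Below is a rough Python sketch that streams the TSV and writes the FeatureCollection wrapper around each row. It assumes there is no header row and that the second column is either a bare GeoJSON geometry or a Feature; the function name tsv_to_feature_collection and the choice to store the country code in a "CountryCode" property are illustrative, not something the repo prescribes.

```python
import json


def tsv_to_feature_collection(tsv_path, geojson_path):
    """Stream TSV rows into a single GeoJSON FeatureCollection file."""
    count = 0
    with open(tsv_path, encoding="utf-8") as src, \
         open(geojson_path, "w", encoding="utf-8") as dst:
        dst.write('{"type":"FeatureCollection","features":[\n')
        for line in src:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            code, geojson_text = line.split("\t", 1)
            obj = json.loads(geojson_text)
            # Assumption: rows may be Features or bare geometries.
            if obj.get("type") == "Feature":
                geometry = obj["geometry"]
                properties = dict(obj.get("properties") or {})
            else:
                geometry, properties = obj, {}
            properties["CountryCode"] = code
            feature = {"type": "Feature",
                       "properties": properties,
                       "geometry": geometry}
            if count:
                dst.write(",\n")  # comma between features, none after the last
            dst.write(json.dumps(feature))
            count += 1
        dst.write("\n]}\n")
    return count
```

Only one row is held in memory at a time, so the script itself should cope with large inputs, although the GIS software opening the result may still struggle with a very large GeoJSON file.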

Here is an example I made for the Cayman Islands using the above method, with visualization from geojson.io: image

Otherwise, data from this repo can easily be converted into any other format with simple Python or any other programming language (you can ask your favorite generative AI to write a script for you). CountryCode can be used to select a single country, so you can try out the data without running into out-of-memory problems.
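
As one possible illustration of the CountryCode filtering, here is a short Python sketch that first counts rows per country and then copies a single country's rows into a smaller TSV. It assumes the country code is the first tab-separated field on every line; the function names count_rows_by_country and extract_country are hypothetical.

```python
from collections import Counter


def count_rows_by_country(tsv_path):
    """Count how many rows each country code has, reading line by line."""
    counts = Counter()
    with open(tsv_path, encoding="utf-8") as src:
        for line in src:
            code, sep, _ = line.partition("\t")
            if sep:  # ignore lines without a tab
                counts[code] += 1
    return counts


def extract_country(tsv_path, out_path, country_code):
    """Copy only the rows for one country into a smaller TSV file."""
    kept = 0
    prefix = country_code + "\t"
    with open(tsv_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.startswith(prefix):
                dst.write(line)
                kept += 1
    return kept
```

For example, extract_country("Caribbean.tsv", "CYM.tsv", "CYM") would write just the Cayman Islands rows, where "Caribbean.tsv" stands in for whichever region file was downloaded.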