PBFOsmStreamTarget not writing valid data
The following function reads any osm.pbf file and writes the filtered output to destinationFileName:
/// <summary>
/// Method which will read a file and strip out any tags which are not railway=
/// </summary>
/// <param name="souceFileName">Where to read</param>
/// <param name="destinationFileName">Where to write</param>
internal static void FilterForRailway(string souceFileName, string destinationFileName)
{
    //Read an inputfile
    using (var sourceFile = File.OpenRead(souceFileName))
    {
        //Use a PBFOsmStreamSource to read the nodes.
        var source = new OsmSharp.Streams.PBFOsmStreamSource(sourceFile);
        //Create an output file
        using (var outputFile = File.OpenWrite(destinationFileName))
        {
            //Create the writer
            var target = new OsmSharp.Streams.PBFOsmStreamTarget(outputFile);
            //Initialize it
            target.Initialize();
            //Loop all nodes in the source
            foreach (var node in source)
            {
                //If the node doesn't have a railway attribute then continue
                if(false == node.Tags.ContainsKey("railway"))
                {
                    continue;
                }
                //handle writing of way, relation or node
                if(node is OsmSharp.Way way)
                {
                    Console.WriteLine($"Writing Way {way.ToString()}");
                    target.AddWay(way);
                }
                else if (node is OsmSharp.Relation relation)
                {
                    Console.WriteLine($"Writing Relation {relation.ToString()}");
                    target.AddRelation(relation);
                }
                else if (node is OsmSharp.Node osmNode)
                {
                    Console.WriteLine($"Writing Node {osmNode.ToString()}");
                    target.AddNode(osmNode);
                }
                else
                {
                    Console.WriteLine($"Writing Node from Unknown: {node.ToString()}");
                    target.AddNode(new OsmSharp.Node()
                    {
                        Id = node.Id,
                        ChangeSetId = node.ChangeSetId,
                        //Latitude = 0,
                        //Longitude = 0,
                        Tags = node.Tags,
                        TimeStamp = node.TimeStamp,
                        UserId = node.UserId,
                        UserName = node.UserName,
                        Version = node.Version,
                        Visible = node.Visible,
                        //Type = node.Type
                    });
                }
                target.Flush();
            }
        }
    }
}
I get a resulting osm.pbf file, but it cannot be parsed by other readers such as QGIS.
Did I do something wrong in my function, and if so, what?
Thank you for your time!
I tried something like this, but the result seems to be the same; it just takes longer:
var filtered = from element in source
               where element.Type == OsmSharp.OsmGeoType.Node ||
                     (element.Type == OsmSharp.OsmGeoType.Way && (element.Tags.ContainsKey("railway")))
               select element;
var complete = filtered.ToComplete();
I did not know QGIS could open osm.pbf files. Are you sure you are not talking about mvt (Mapbox vector tiles), which are also encoded using protobuf?
Yes, QGIS can open osm.pbf files and show a visual representation before importing with ogr2ogr, although I am looking at https://github.com/OsmSharp/sqlserver-dataprovider to replace that as well.
No, I am not talking about mvt. I have a planet.osm.pbf file and I want results similar to what the following command would provide:
osmium tags-filter -o planet-rail.osm.pbf planet.osm.pbf nw/railway
I would like to achieve this in code to simplify my process for creating a routerdb, so it can be updated when required.
When I use var complete = filtered.ToComplete(); from the above example, the resulting file is almost 10 times as large as the original. I don't think I need it, as I loop over all elements in the original example anyway.
I downloaded pa.osm.pbf from http://download.geofabrik.de/north-america/us/pennsylvania-latest.osm.pbf
Then I ran the above function on the file, expecting a smaller file containing only the railway tags. What I got was a file that is over 1 GB and still growing.
Is there a problem in my logic above?
It is possible QGIS doesn't support the uncompressed variety of the PBF files; try adding compress=true here:
https://github.com/OsmSharp/core/blob/develop/src/OsmSharp/Streams/PBFOsmStreamTarget.cs#L57
This value isn't true by default because changing it would be a breaking change in OsmSharp. When I release v7 it will be true by default.
I think your filter also includes all the nodes, not just those that are part of a railway. To restrict it you need two passes. In the first pass, take all railway ways and index their nodes in a hashset. In the second pass, return all nodes whose ids are in the hashset, plus all railway ways. That should give you all the railway data and only the railway data.
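To illustrate the two passes described above, here is a minimal sketch that deliberately avoids the OsmSharp API. `Element`, `ElementType`, and `RailwayFilter` are hypothetical stand-ins invented for this example, and `readAll` is an assumed callback that re-reads the source from the start each time it is invoked. The reason a single pass is not enough is that PBF extracts typically store nodes before ways, so when a node is read it is not yet known whether a railway way will reference it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for OSM elements; these are NOT OsmSharp types.
public enum ElementType { Node, Way }

public sealed record Element(
    ElementType Type,
    long Id,
    IReadOnlyDictionary<string, string> Tags,
    IReadOnlyList<long> NodeRefs);

public static class RailwayFilter
{
    private static bool IsRailwayWay(Element e) =>
        e.Type == ElementType.Way && e.Tags.ContainsKey("railway");

    // 'readAll' is invoked twice, mirroring two passes over the same file.
    public static List<Element> Filter(Func<IEnumerable<Element>> readAll)
    {
        // Pass 1: collect the ids of every node referenced by a railway way.
        var nodeIds = new HashSet<long>();
        foreach (var way in readAll().Where(IsRailwayWay))
        {
            nodeIds.UnionWith(way.NodeRefs);
        }

        // Pass 2: keep the referenced nodes and the railway ways, in source order.
        return readAll()
            .Where(e => (e.Type == ElementType.Node && nodeIds.Contains(e.Id))
                        || IsRailwayWay(e))
            .ToList();
    }
}
```

With a real reader, the same shape would presumably apply: enumerate the source once to fill the set from each railway way's node references, rewind it, then enumerate again and write only the matching elements. Note that the set is keyed on the node ids that the ways reference, not on the ids of nodes that carry a railway tag themselves.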
I will give it a try, thank you!
/// <summary>
/// Method which will read a file and strip out any tags which are not railway=
/// </summary>
/// <param name="souceFileName">Where to read</param>
/// <param name="destinationFileName">Where to write</param>
internal static void FilterForRailway(string souceFileName, string destinationFileName)
{
    //Read an inputfile
    using (var sourceFile = File.OpenRead(souceFileName))
    {
        //Use a PBFOsmStreamSource to read the nodes.
        var source = new OsmSharp.Streams.PBFOsmStreamSource(sourceFile);
        //Create an output file
        using (var outputFile = File.OpenWrite(destinationFileName))
        {
            //Create the writer with compressed data
            var target = new OsmSharp.Streams.PBFOsmStreamTarget(outputFile, true);
            //Initialize it
            target.Initialize();
            //First pass you take all railway ways [nodes], index all nodes in a hashset.
            var filtered = source.Where(element => element.Type == OsmSharp.OsmGeoType.Node && element.Tags.ContainsKey("railway"));
            //Create the HashSet
            HashSet<long?> index = new HashSet<long?>(filtered.Select(g => g.Id));
            //Second pass you return all nodes with ids in the hashset and all railway ways.
            filtered = source.Where(element => index.Contains(element.Id) || element.Type == OsmSharp.OsmGeoType.Way && element.Tags.ContainsKey("railway"));
            //Loop all nodes in the source
            foreach (var node in filtered.ToComplete())
            {
                //handle writing of way, relation or node
                if (node is OsmSharp.Way way)
                {
                    Console.WriteLine($"Writing Way {way.ToString()}");
                    target.AddWay(way);
                }
                else if (node is OsmSharp.Relation relation)
                {
                    Console.WriteLine($"Writing Relation {relation.ToString()}");
                    target.AddRelation(relation);
                }
                else if (node is OsmSharp.Node osmNode)
                {
                    Console.WriteLine($"Writing Node {osmNode.ToString()}");
                    target.AddNode(osmNode);
                }
                else
                {
                    Console.WriteLine($"Not Node from Unknown: {node.ToString()}");
                    continue;
                }
                target.Flush();
            }
        }
    }
}
Still not working but the file is smaller :)
Also why do I have to make 2 passes?
Couldn't I just do:
//Combine first and 2nd pass
var filtered = source.Where(element => (element.Type == OsmSharp.OsmGeoType.Node && element.Tags.ContainsKey("railway")) || element.Type == OsmSharp.OsmGeoType.Way && element.Tags.ContainsKey("railway"));
Furthermore, I honestly don't care whether QGIS can read the file (although it would help). The big problem is that I can't build a routerDb from the resulting osm.pbf :( It can't resolve any route positions.
I am basically just trying to achieve what osmium does with the command mentioned above:
osmium tags-filter -o planet-rail.osm.pbf planet.osm.pbf nw/railway
var filtered = from element in source
               where element.Type == OsmSharp.OsmGeoType.Node ||
                     (element.Type == OsmSharp.OsmGeoType.Way && (element.Tags.ContainsKey("railway")))
               select element;
This still results in a file that is larger than the original, even with compress = true (almost 4 times as large).
This is strange to me, as it should only be writing a subset of the data.
Several other attempts have produced smaller files, but none of them seem to be valid:
foreach (var node in filtered.ToComplete())
{
    //handle writing of way, relation or node
    if (node is OsmSharp.Way way)
    {
        Console.WriteLine($"Writing Way {way.ToString()}");
        target.AddWay(way);
    }
    else if (node is OsmSharp.Complete.CompleteWay completeWay)
    {
        Console.WriteLine($"Writing Complete Way {completeWay.ToString()}");
        var simple = completeWay.ToSimple() as OsmSharp.Way;
        target.AddWay(simple);
    }
    else if (node is OsmSharp.Relation relation)
    {
        Console.WriteLine($"Writing Relation {relation.ToString()}");
        target.AddRelation(relation);
    }
    else if (node is OsmSharp.Complete.CompleteRelation completeRelation)
    {
        Console.WriteLine($"Writing Complete Relation {completeRelation.ToString()}");
        var simple = completeRelation.ToSimple() as OsmSharp.Relation;
        target.AddRelation(simple);
    }
    else if (node is OsmSharp.Node osmNode)
    {
        Console.WriteLine($"Writing Node {osmNode.ToString()}");
        target.AddNode(osmNode);
    }
}
By "not valid" I mean that I cannot generate a routerDb from them that resolves anything, nor can I open them with QGIS.
Even tools like osmconvert report:
osmconvert Error: block raw size expected at: 0x1A.
Just as an FYI, the example given at Sample.CompleteStream does not produce a file that loads in QGIS either, and tools like osmconvert report the same error: block raw size expected at: 0x1A.
Can you make a simple reproducible test, then I can have a look.
You can just use your CompleteStream example or any of the other examples you provide. Ignore my functions for now if needed.
- Load the source PBF in QGIS first and verify that it loads
- Run the sample on the source and write out a new PBF
- Attempt to load the new PBF in QGIS, or use a tool such as osmconvert, and you will receive the error cited above: block raw size expected at: 0x1A.
My team has tested this on pretty much all the examples provided and they all exhibit the same issue after being written out.
@xivk @juliusfriedman #132 the same error
Unconfirmed but probably fixed by fixing #132, feel free to reopen if not the case.