Unclear instructions how to build address.db from openaddress data

Open vrozental opened this issue 5 years ago • 2 comments

The instructions say:

./interpolate oa address.db street.db < /data/oa/nz/countrywide.csv

How do I build a worldwide address.db? Should one iterate over all CSV files in /data/oa? There are 57 country directories, but only 28 countrywide.csv files.

Please add detailed instructions for building a worldwide address.db. Thank you!

vrozental, Jul 21 '19 20:07

Hello @vrozental,

here's an example of how to build address.db and street.db for a single country (here: Luxembourg) without installing anything except Docker. Please let me know if this works for you. If it does, I can also provide an example for worldwide databases.

If you have a Dockerfile with this content:

FROM golang:1.13-alpine AS builder

# install pbf converters
WORKDIR /pelias
RUN apk add git wget gcc musl-dev && \
    go get github.com/missinglink/pbf && \
    git clone https://github.com/pelias/pbf2json.git

# download datasets for Luxembourg
WORKDIR /pelias/osm
# Source: https://download.geofabrik.de/europe.html
RUN wget https://download.geofabrik.de/europe/luxembourg-latest.osm.pbf

WORKDIR /pelias/openaddresses
# Source: http://results.openaddresses.io/?runs=all#runs
RUN wget https://data.openaddresses.io/runs/677497/lu/countrywide.zip && \
    unzip countrywide.zip && rm countrywide.zip

# extract polylines from openstreetmap data
WORKDIR /pelias/polylines
RUN pbf streets /pelias/osm/luxembourg-latest.osm.pbf > /pelias/polylines/luxembourg-latest.osm.0sv

FROM pelias/interpolation

COPY --from=builder /pelias /pelias
WORKDIR /code/pelias/interpolation

ENV BUILDDIR /data/builddir
ENV WORKINGDIR /data/workingdir
ENV POLYLINE_FILE /pelias/polylines/luxembourg-latest.osm.0sv
ENV OAPATH /pelias/openaddresses
ENV PBF2JSON_FILE /pelias/osm/luxembourg-latest.osm.pbf
ENV PBF2JSON_BIN /pelias/pbf2json/build/pbf2json.linux-x64

# run script that converts input data into address.db and street.db
CMD [ "./interpolate", "build"]

You can build the image with the following command (the trailing dot is the build context, i.e. the directory containing the Dockerfile):

docker build -t interpolation-data-generation .

Afterwards, create the output directory for the generated files and run the ./interpolate build script like this:

mkdir -p /tmp/interpolation/builddir
docker run -v /tmp/interpolation:/data -ti interpolation-data-generation

On my machine, the run takes about five minutes and the result looks like this:

tree /tmp/interpolation/builddir
/tmp/interpolation/builddir
├── address.db
├── conflate_oa.err
├── conflate_oa.out
├── conflate_oa.skip
├── conflate_osm.err
├── conflate_osm.out
├── polyline.err
├── polyline.out
├── street.db
├── tmp
│   └── leveldb
│       ├── 000002.log
│       ├── 000003.ldb
│       ├── CURRENT
│       ├── LOCK
│       ├── LOG
│       └── MANIFEST-000000
├── vertices.err
├── vertices.out
└── vertices.skip
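
To confirm that a run actually produced both databases, a small check like the following may help. This is only a sketch: check_build is a hypothetical helper name, not part of pelias/interpolation, and it only tests that address.db and street.db exist and are non-empty. It does not inspect their contents.

```shell
#!/bin/sh
# Hypothetical helper (not part of pelias/interpolation):
# reports whether address.db and street.db exist and are non-empty
# in the given build directory. Returns 0 only if both are present.
check_build() {
  builddir="$1"
  status=0
  for db in address.db street.db; do
    if [ -s "$builddir/$db" ]; then
      echo "ok: $db"
    else
      echo "missing or empty: $db"
      status=1
    fi
  done
  return "$status"
}

# Example call for the directory used above:
#   check_build /tmp/interpolation/builddir
```

If a database is reported missing, the .err files in the same directory are the first place to look.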

arne-cl, Sep 24 '19 15:09

There is a script, https://github.com/pelias/interpolation/blob/master/script/concat_oa.sh, which can combine multiple OA files into a single CSV stream for you.
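
For illustration, the general idea of merging many CSV files into one stream can be sketched as below. This is not the contents of concat_oa.sh. The function name concat_csv is made up for this example, and it assumes that every CSV file under the directory starts with the same header line.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the project's concat_oa.sh.
# Walks a directory tree, prints the header line of the first CSV once,
# then prints the data rows (everything after line 1) of every CSV.
# Assumption: all files share an identical header line.
concat_csv() {
  dir="$1"
  first=1
  find "$dir" -type f -name '*.csv' | sort | while IFS= read -r file; do
    if [ "$first" -eq 1 ]; then
      awk 'NR == 1' "$file"   # header, emitted only once
      first=0
    fi
    awk 'NR > 1' "$file"      # data rows; awk also restores a missing final newline
  done
}

# Possible usage, feeding the stream to the command from the question
# via a pipe instead of a file redirect:
#   concat_csv /data/oa | ./interpolate oa address.db street.db
```

Because the function walks the whole tree, it also picks up country directories that contain regional files rather than a single countrywide.csv.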

missinglink, Sep 26 '19 14:09