couchdb-best-practices

What is the optimal or recommended way to store hierarchical data in CouchDB?

Open jofomah opened this issue 10 years ago • 2 comments

The approach should make it possible to do the following in one request:

  • Get all descendants of a node. E.g. given a State Id, retrieve all locations under the state (State Zones, LGAs, Wards and Health Facilities); given a Ward Id or LGA Id, return everything under it (Wards and Health Facilities); and given a State Id, return only a given level under it, e.g. all Wards, all LGAs or all Health Facilities.
  • Get each node's ancestors in one query.
  • Run other kinds of deeply nested queries.

jofomah, Jun 24 '15 10:06

Possible Answer:

I think I have found something that can work on CouchDB. I looked into the patterns used to store hierarchical data in relational databases:

  • Adjacency list: won't work for me because I can't query deeply nested trees.
  • Path enumeration: this will work unless there is a downside that I am not seeing. In this case, I will store each document's ancestors as an array property of the document. This is similar to @jo's suggestion to emit ([country, national level, state], doc). The only difference is that I won't store each ancestor as a separate string property of the document (e.g. doc.stateId, doc.wardId, doc.lgaId) but as an array of ancestors, so I can use administrative levels that are not tied to any one country's administrative level names.

But in this case, I will emit each element of the ancestors array as a key:

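function(doc) {
  // Skip documents that have no ancestors array
  if (!doc.ancestors) return;
  // Emit one row per ancestor so the doc can be found by any ancestor id
  doc.ancestors.forEach(function(ancestorId) {
    emit(ancestorId, null);
  });
}

// Example document
var healthFacility = {

  "ancestors": ["NG", "NW", "KN"]
}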

If I now query with key = stateId, where stateId = 'KN', I will get all sub-levels that have 'KN' as one of their ancestors, which the adjacency list I am currently using could not give me.
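To make the behaviour of the per-ancestor emit above concrete, here is a minimal sketch that mimics it in plain JavaScript, outside CouchDB. The sample documents, their `_id` values, and the helpers `buildIndex` and `queryByKey` are hypothetical stand-ins for the view index and a `key=` query; only the `ancestors` field comes from the thread.

```javascript
// Hypothetical sample documents; only the `ancestors` field is from the thread.
const docs = [
  { _id: "hf-1", ancestors: ["NG", "NW", "KN"] },
  { _id: "hf-2", ancestors: ["NG", "NW", "KD"] },
  { _id: "ward-1", ancestors: ["NG", "NW", "KN"] },
  { _id: "state-kn", ancestors: ["NG", "NW"] },
];

// Stand-in for the view index: collects the rows a map function emits.
function buildIndex(mapFn, allDocs) {
  const rows = [];
  for (const doc of allDocs) {
    const emit = (key, value) =>
      rows.push({ id: doc._id, key, value: value === undefined ? null : value });
    mapFn(doc, emit);
  }
  return rows;
}

// Same logic as the per-ancestor map function, with emit passed in explicitly.
function mapByAncestor(doc, emit) {
  (doc.ancestors || []).forEach((ancestorId) => emit(ancestorId, null));
}

// Rough equivalent of querying the view with a single key.
function queryByKey(rows, key) {
  return rows.filter((row) => row.key === key).map((row) => row.id);
}

const index = buildIndex(mapByAncestor, docs);
console.log(queryByKey(index, "KN")); // ids of docs that list "KN" as an ancestor
```

Because every ancestor id becomes its own row, a lookup by a single id such as 'KN' or 'NW' finds descendants at any depth, at the cost of one index row per ancestor per document.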

jofomah, Jun 27 '15 10:06

This is a good solution. I recommend emitting the whole ancestors array:

function(doc) {
  // Key is the full ancestor path, e.g. ["NG", "NW", "KN"]
  if (doc.ancestors) {
    emit(doc.ancestors, null);
  }
}

This has two advantages over using one view per level:

  • ids do not have to be unique, because the key is the whole path
  • it requires less disk space

But also a disadvantage:

  • you have to know the full path for querying

If you now want to get a list of all docs in state KN, you query for the full path:

{
  "startkey": ["NG", "NW", "KN"],
  "endkey": ["NG", "NW", "KN", {}]
}
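As a rough sketch of how these parameters could be sent over HTTP: view query keys are passed as JSON-encoded values in the query string of the view URL. The server address, the database name `locations`, the design document `hierarchy`, the view name `by_ancestors`, and the helper `viewUrl` are all hypothetical; substitute your own.

```javascript
// Hypothetical server, database, design document and view names.
const base = "http://localhost:5984/locations/_design/hierarchy/_view/by_ancestors";

// Build a view URL; every parameter value is JSON-encoded, then URL-encoded.
function viewUrl(baseUrl, params) {
  const query = new URLSearchParams();
  for (const [name, value] of Object.entries(params)) {
    query.set(name, JSON.stringify(value));
  }
  return `${baseUrl}?${query.toString()}`;
}

const url = viewUrl(base, {
  startkey: ["NG", "NW", "KN"],
  endkey: ["NG", "NW", "KN", {}],
});
console.log(url);
```

The resulting URL can be fetched with any HTTP client.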

The same view returns all documents for a higher level:

{
  "startkey": ["NG", "NW"],
  "endkey": ["NG", "NW", {}]
}
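Why the `{}` at the end of `endkey` works: in CouchDB's view collation, arrays are compared element by element, a shorter array sorts before any array that extends it, and objects sort after strings. The range therefore covers every key that begins with the given path. The sketch below imitates that ordering in plain JavaScript for keys made of strings only. It is a simplification (CouchDB compares strings with ICU collation, approximated here by plain string comparison), and the sample rows and function names are hypothetical.

```javascript
// Compare two key elements: strings among themselves, {} after any string.
function compareElement(a, b) {
  const aIsObj = typeof a === "object";
  const bIsObj = typeof b === "object";
  if (aIsObj || bIsObj) return Number(aIsObj) - Number(bIsObj);
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compare two array keys element by element.
function compareKeys(a, b) {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const c = compareElement(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length; // a shorter array sorts before its extensions
}

// Rough equivalent of a startkey/endkey view query (inclusive on both ends).
function queryRange(rows, startkey, endkey) {
  return rows
    .filter((r) => compareKeys(r.key, startkey) >= 0 && compareKeys(r.key, endkey) <= 0)
    .sort((x, y) => compareKeys(x.key, y.key))
    .map((r) => r.id);
}

// Hypothetical index rows keyed by the full ancestors array.
const rows = [
  { id: "state-kn", key: ["NG", "NW"] },
  { id: "lga-1", key: ["NG", "NW", "KN"] },
  { id: "hf-1", key: ["NG", "NW", "KN", "lga-1"] },
  { id: "hf-2", key: ["NG", "NW", "KD"] },
  { id: "hf-3", key: ["NG", "SW", "LA"] },
];

const underKN = queryRange(rows, ["NG", "NW", "KN"], ["NG", "NW", "KN", {}]);
const underNW = queryRange(rows, ["NG", "NW"], ["NG", "NW", {}]);
console.log(underKN, underNW);
```

Note that the range is a prefix match on the path, which is why the full path down to the node of interest has to be known when querying.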

jo, Jun 29 '15 09:06