Add sqlite version of the corpus
Make a SQLite version of the corpus. The Rabble routers team at PLOS is interested in this!
@eseiver: I've been talking with Simon about this, and I now have plenty of leads to start my research. The only thing I need from you is the source of the XML files. Do they come from Rhino? I was told that Rhino XML carries more information because it includes the subject areas, which can be used as a search parameter. I can see the subjects in the XML, so I presume these files come from Rhino. Can you confirm?
I'm not sure which system Rhino is, but I'm pulling them from either content-repo or the journal pages directly. Those two XML forms are equivalent.
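For reference, here is a rough sketch of how the subject areas could be pulled out of the article XML and made searchable in SQLite. It is only an illustration, not the corpus's actual loader: it assumes the files follow the usual JATS layout (`article-id` with `pub-id-type="doi"`, and `subject` elements under `article-categories`), and the table names, function names, sample document and DOI are all made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; the DOI and subjects are placeholders.
SAMPLE_XML = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1371/journal.example.0000001</article-id>
      <article-categories>
        <subj-group subj-group-type="Discipline-v3">
          <subject>Biology and life sciences</subject>
          <subj-group><subject>Genetics</subject></subj-group>
        </subj-group>
      </article-categories>
    </article-meta>
  </front>
</article>"""


def extract_record(xml_text):
    """Return (doi, sorted unique subject names) for one article."""
    root = ET.fromstring(xml_text)
    doi = root.findtext(".//article-id[@pub-id-type='doi']")
    # Assumption: every <subject> element, at any nesting depth, is a subject area.
    subjects = sorted(
        {s.text.strip() for s in root.iter("subject") if s.text and s.text.strip()}
    )
    return doi, subjects


def build_db(conn, xml_docs):
    """Create the (hypothetical) schema and load each article's subjects."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS article (doi TEXT PRIMARY KEY);
        CREATE TABLE IF NOT EXISTS article_subject (
            doi TEXT NOT NULL REFERENCES article(doi),
            subject TEXT NOT NULL,
            PRIMARY KEY (doi, subject)
        );
        """
    )
    for xml_text in xml_docs:
        doi, subjects = extract_record(xml_text)
        conn.execute("INSERT OR IGNORE INTO article (doi) VALUES (?)", (doi,))
        conn.executemany(
            "INSERT OR IGNORE INTO article_subject (doi, subject) VALUES (?, ?)",
            [(doi, subject) for subject in subjects],
        )
    conn.commit()


def dois_for_subject(conn, subject):
    """Use a subject area as the search parameter."""
    rows = conn.execute(
        "SELECT doi FROM article_subject WHERE subject = ? ORDER BY doi",
        (subject,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_db(conn, [SAMPLE_XML])
print(dois_for_subject(conn, "Genetics"))
```

Keeping subjects in their own table means one article can carry many subject areas, and a subject search stays a simple indexed lookup. If the content-repo XML turns out to tag subjects differently, only `extract_record` would need to change.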