mbslave

Fixed small bugs in Solr import

Open buremba opened this issue 12 years ago • 2 comments

Fixed bugs caused by the schema change. Also fixed a memory problem when exporting recording data for Solr: the recording table has more than 10 million rows, and running the SQL query without LIMIT can exhaust memory.

buremba avatar Jun 02 '12 00:06 buremba

Sorry for not merging this earlier. I only noticed it while I was looking into getting the Solr export working again. I'm not sure I understand the memory problem. What is allocating that much memory? The Python code is all based on iterators, so it should only process one row at a time.

lalinsky avatar Oct 08 '12 12:10 lalinsky

The memory problem is not caused by Python. SELECT r.gid, rn.name, an.name FROM recording r JOIN track_name rn ON r.name = rn.id JOIN artist_credit ac ON r.artist_credit = ac.id JOIN artist_name an ON ac.name = an.id returns more than 10 million rows, and PostgreSQL consumes ~1.5 GB of memory for this query. If the server does not have enough memory to run it, PostgreSQL kills the process it created for the query and throws an exception. That means the query returns an empty result, and when that happens Python can't create the index file for recordings.
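To illustrate the idea of bounding each query with LIMIT, here is a minimal sketch. It is not the code from this pull request: the helper name `iter_in_batches`, the batch size, and the simplified single-table schema are assumptions made for illustration, and an in-memory SQLite database stands in for PostgreSQL so the example runs on its own.

```python
import sqlite3


def iter_in_batches(conn, query, batch_size=1000):
    """Yield rows of `query`, fetching at most `batch_size` rows per statement.

    `query` is assumed to end with an ORDER BY clause so that consecutive
    LIMIT/OFFSET windows neither overlap nor skip rows.
    """
    offset = 0
    while True:
        cur = conn.execute(f"{query} LIMIT ? OFFSET ?", (batch_size, offset))
        rows = cur.fetchall()
        if not rows:
            break
        yield from rows
        offset += len(rows)


# Hypothetical stand-in data: a tiny "recording" table instead of 10M+ rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recording (id INTEGER PRIMARY KEY, gid TEXT)")
conn.executemany(
    "INSERT INTO recording (gid) VALUES (?)",
    [(f"gid-{i}",) for i in range(2500)],
)

rows = list(
    iter_in_batches(conn, "SELECT id, gid FROM recording ORDER BY id", batch_size=1000)
)
print(len(rows))
```

Each statement returns at most `batch_size` rows, so no single query has to produce the whole result at once. OFFSET-based paging gets slower as the offset grows, so on a very large table filtering on the last seen key (for example `WHERE id > ?`) may be a better fit.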

buremba avatar Oct 08 '12 19:10 buremba