
Add agriculture production data

Open ravinepal opened this issue 8 years ago • 20 comments

The data is available here: https://github.com/Code4Nepal/data/tree/master/datasets/agriculture

Source: Ministry of Agriculture Development, Nepal (PDF)

Here's a guide on how to visualize data on NepalMap.

ravinepal avatar Oct 29 '17 15:10 ravinepal

Please note that the data sets sometimes include things other than districts, like "C.REGION". Non-district data should be removed when importing.

Please also note that a district with no value may legitimately be given zero for a data point, but only when we are confident that the district really has none of the thing being counted. Where we are not confident, using zero for the missing districts may NOT be appropriate. For example, the tea data has an "other" category. It would clearly be incorrect to give every unlisted district zero tea production, because the statistics tell us that tea is produced in "other" districts. I am not sure how we should handle this case.
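
A minimal sketch of the two points above, in Python. The column names `district` and `value`, the sample district names, and the `zero_fill` flag are assumptions for illustration and are not taken from the NepalMap code base. Rows whose name is not a known district (such as "C.REGION") are dropped, and districts absent from the source stay `None` unless the caller explicitly decides that zero is justified.

```python
from typing import Dict, Iterable, Mapping, Optional, Set


def clean_district_rows(
    rows: Iterable[Mapping[str, str]],
    known_districts: Set[str],
    zero_fill: bool = False,
) -> Dict[str, Optional[float]]:
    """Keep only district rows and make missing districts explicit."""
    values: Dict[str, Optional[float]] = {}
    for row in rows:
        name = row["district"].strip()
        # Skip aggregates such as "C.REGION" that are not districts.
        if name not in known_districts:
            continue
        values[name] = float(row["value"])

    # Missing districts become None (unknown) by default, not zero.
    filler = 0.0 if zero_fill else None
    for name in known_districts:
        values.setdefault(name, filler)
    return values


# Hypothetical sample input: one district row and one regional aggregate.
rows = [
    {"district": "Ilam", "value": "10"},
    {"district": "C.REGION", "value": "99"},
]
cleaned = clean_district_rows(rows, {"Ilam", "Jhapa"})
```

This does not resolve the "other" category problem. An "other" row would also be dropped here, which is one more reason to leave `zero_fill` off for data sets like tea.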

cliftonmcintosh avatar Oct 29 '17 16:10 cliftonmcintosh

Heya, I am interested in contributing to the project. Could you tell me exactly what help you need? Do you need to create a database, or stream the data through the main program?

I would love to be active in your project : )

wizofe avatar Oct 29 '17 19:10 wizofe

I am attempting a solution for #214, Add agriculture production data. I will submit a pull request for approval when successful. Thank you.

bbulpett avatar Oct 29 '17 21:10 bbulpett

There is currently no agriculture section in NepalMap. With the first agriculture data integration, the agriculture section will need to be created, much like the existing sections on Demographics, Forest and Land Use, Disasters, etc.

cliftonmcintosh avatar Oct 29 '17 21:10 cliftonmcintosh

@wizofe and @bbulpett

There are several data sets for agriculture. Please submit one pull request per data set. Please also consider "claiming" a specific data set so that other people will know what is already being worked on.

cliftonmcintosh avatar Oct 29 '17 21:10 cliftonmcintosh

As I mentioned earlier, some data sets may not be complete because we lack data for some districts. They should not be included without further analysis. Here is a list of data sets that appear to be complete enough to work on:

Please note these should be considered valid data sets only if they have data for all 75 districts.

Here is a list of data sets that appear to be incomplete and, in my opinion, should not be worked on without further evaluation.

  • Horses and Asses Population -- there is no data for some districts, and I believe it is highly unlikely that the missing districts have no horses (or asses)
  • Rabbit population -- missing some districts
  • Tea -- has an "other" category that needs sorting out
  • Water and fish production -- has some incomplete data for some districts
  • Yaks -- zeroes in some districts are likely to be legitimate, but we should take a closer look
  • Coffee -- has an "other" category that needs sorting out
  • Cotton -- only a few districts, and need to verify if all the others have no cotton
  • Jute -- only a few districts, and need to verify if all the others have no jute

Some of these may be valid but we should verify that the lack of data really means that the missing districts have zero of those things.
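
One way to triage a data set against the criteria above is a small coverage check, sketched here in Python. The function names and the sample district names are hypothetical. In practice `all_districts` would hold the full set of 75 district names.

```python
from typing import Iterable, List, Set, Tuple


def coverage_report(
    present: Iterable[str], all_districts: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing districts, names that are not districts)."""
    present_set = {name.strip() for name in present}
    missing = sorted(all_districts - present_set)
    unexpected = sorted(present_set - all_districts)
    return missing, unexpected


def is_complete(present: Iterable[str], all_districts: Set[str]) -> bool:
    """True only when every district is present and nothing else is."""
    missing, unexpected = coverage_report(present, all_districts)
    return not missing and not unexpected
```

A data set with an "other" row shows up in the second list, and one with gaps shows up in the first, so both kinds of problem named in the list are flagged for a closer look rather than silently imported.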

cliftonmcintosh avatar Oct 29 '17 21:10 cliftonmcintosh

Thank you @cliftonmcintosh for the explanation. Setting up my dev environment now. Will begin by adding Agriculture section. I will then start work on the Egg production data set.

bbulpett avatar Oct 29 '17 21:10 bbulpett

@cliftonmcintosh @bbulpett I am going to do the Milk Animals and Milk Production.

wizofe avatar Oct 29 '17 22:10 wizofe

Thanks, @wizofe and @bbulpett

cliftonmcintosh avatar Oct 29 '17 22:10 cliftonmcintosh

@bbulpett and @wizofe

It is perfectly fine to submit the work in steps. For example, you could submit a PR with just the SQL files for the statistics in your data sets. Like this one for forests.

Also please note that your data set may contain more than one data point, and each one would require its own integration into NepalMap. For example, the egg production data set is probably two separate data sets, one for number of laying animals by type (chicken versus duck) and another for eggs laid by type (chicken eggs versus duck eggs). So there would be an egg-laying animal table and an eggs table. The data on milk animals looks similar. It's likely there will be two tables, one for the type of animals, another for the amount of milk.

If you choose one to start with, please choose the actual eggs and the actual milk.
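
The split described above can be sketched in Python as follows. The column names (`laying_hen`, `laying_duck`, `hen_egg`, `duck_egg`) are assumptions for illustration; the actual headers in the egg production file may differ. Each source row is turned into records for two separate tables, one for laying animals and one for eggs.

```python
from typing import Iterable, List, Mapping, Tuple

# Hypothetical source column -> category label, one mapping per target table.
ANIMAL_COLUMNS = {"laying_hen": "hen", "laying_duck": "duck"}
EGG_COLUMNS = {"hen_egg": "hen", "duck_egg": "duck"}

Record = Tuple[str, str, int]  # (district, type, count)


def split_egg_dataset(
    rows: Iterable[Mapping[str, str]],
) -> Tuple[List[Record], List[Record]]:
    """Split one wide row per district into animal and egg records."""
    animals: List[Record] = []
    eggs: List[Record] = []
    for row in rows:
        district = row["district"]
        for column, kind in ANIMAL_COLUMNS.items():
            animals.append((district, kind, int(row[column])))
        for column, kind in EGG_COLUMNS.items():
            eggs.append((district, kind, int(row[column])))
    return animals, eggs
```

Each returned list could then be written out as its own SQL file, which also makes it easy to submit the eggs table first and the laying-animal table in a later PR.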

cliftonmcintosh avatar Oct 29 '17 23:10 cliftonmcintosh

@nikeshbalami

What is the unit for milk in the milk data? Litres?

cliftonmcintosh avatar Oct 29 '17 23:10 cliftonmcintosh

Hi @cliftonmcintosh, the unit is Mt.

nikeshbalami avatar Oct 30 '17 01:10 nikeshbalami

@nikeshbalami

What is an "Mt."?

cliftonmcintosh avatar Oct 30 '17 02:10 cliftonmcintosh

It's a "Metric Ton (Mt.)" @cliftonmcintosh

nikeshbalami avatar Oct 30 '17 02:10 nikeshbalami

Thanks

cliftonmcintosh avatar Oct 30 '17 02:10 cliftonmcintosh

Hi @cliftonmcintosh I took the liberty of adding the meat production data in https://github.com/Code4Nepal/nepalmap_app/pull/217 since it wasn't claimed by anyone else.

Bezzy1999 avatar Oct 31 '17 08:10 Bezzy1999

@nikeshbalami and @ravinepal

I'm working on the egg data, and it seems like it must be incorrect. The number of hens and ducks is much, much higher than the number of eggs laid. For example, there are over 12 million laying hens but only about 1.3 million hen eggs laid. That means there is only one egg for every ten hens. That seems crazy. There's no way anyone would have ten hens and only expect one egg a year out of those ten hens. I grew up with chickens, and if we were in that situation, we would just kill them all and eat them. Can you help me understand the data? Is it just messed up?

cliftonmcintosh avatar Jun 16 '18 22:06 cliftonmcintosh

@nikeshbalami and @ravinepal Here is the problem: the egg numbers are in thousands, so "25" means "25000".

See page 48 of the PDF report

[Image: eggs-by-thousand]
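
A short Python sketch of the unit correction and the sanity check that exposed it. The function names are hypothetical, and the figures used in the note below are the rounded ones quoted in this thread, not exact values from the report.

```python
# The report gives egg figures in thousands (see page 48 of the PDF).
EGG_UNIT_MULTIPLIER = 1000


def to_actual_count(reported: float) -> float:
    """Convert a reported figure in thousands to an actual count."""
    return reported * EGG_UNIT_MULTIPLIER


def eggs_per_hen(reported_eggs: float, hens: float) -> float:
    """Eggs per laying hen after applying the unit correction."""
    if hens <= 0:
        raise ValueError("hens must be positive")
    return to_actual_count(reported_eggs) / hens
```

With about 1.3 million as the reported egg figure and about 12 million laying hens, the corrected ratio is roughly 108 eggs per hen instead of about one egg per ten hens. A ratio check like this is a cheap way to catch a missing unit before importing.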

cliftonmcintosh avatar Jun 16 '18 23:06 cliftonmcintosh

Thanks @cliftonmcintosh, and so sorry. I forgot to add the "Unit" to all the datasets, which created this problem. I will take care of it from now on while scraping data.

nikeshbalami avatar Jun 17 '18 02:06 nikeshbalami

@nikeshbalami

No worries. Thanks for the response.

cliftonmcintosh avatar Jun 17 '18 14:06 cliftonmcintosh