
Cross Section Identifier Problem

Open KardelenCicek opened this issue 2 years ago • 1 comments

I have panel data for many countries and many years. When I choose the country column as the cross-sectional identifier, ExPanD doesn't recognize the countries and I can't produce country-level graphs. I only see graphs based on years. I also get the following error and warnings:

Warning: Error in : Column name Years must not be duplicated.
Use .name_repair to specify repair.
Caused by error in stop_vctrs():
! Names must be unique.
x These names are duplicated:
  • "Years" at locations 1 and 2.
  202: <Anonymous>
geom_smooth() using method = 'loess' and formula 'y ~ x'
Warning: Removed 1 rows containing missing values (position_stack).
Warning: guides(<scale> = FALSE) is deprecated. Please use guides(<scale> = "none") instead.

KardelenCicek, Mar 15 '22 20:03

Hi there, and sorry for the delay. This seems to be related to the structure of your data. Are you providing a data frame to ExPanD() or are you uploading data? If you are still having issues and you provide a data frame to ExPanD(), could you please post the structure of your data frame (str(df))? Alternatively, I would need access to your data to see what is going on.

If this is not feasible, maybe comparing the structure of your data with the structure of the package's example World Bank data could give you an idea about how to proceed. You can call the raw version of this data with ExPanD as follows:

library(ExPanDaR)
ExPanD(worldbank, cs_id = "country", ts_id = "year")
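
Before comparing against the example data, it may help to check your own data frame for the condition the error message names: a repeated column name. The following is a minimal base R sketch on a hypothetical toy data frame (the columns Years, Country and gdp are assumptions made up for illustration, not your actual data). It lists duplicated column names, drops the repeats, and checks that each country-year pair identifies exactly one row.

```r
# Hypothetical toy data that mimics the reported problem:
# the column name "Years" appears at locations 1 and 2.
df <- data.frame(
  Years   = c(2000, 2001, 2000, 2001),
  Years   = c(2000, 2001, 2000, 2001),
  Country = c("A", "A", "B", "B"),
  gdp     = c(1.0, 1.1, 2.0, 2.1),
  check.names = FALSE  # keep the duplicated name instead of letting R repair it
)

# Inspect the structure (this is the output requested above)
str(df)

# Column names that occur more than once
dup_names <- names(df)[duplicated(names(df))]
print(dup_names)

# Keep only the first occurrence of each column name
df_clean <- df[, !duplicated(names(df)), drop = FALSE]

# Each (cross-section, time) pair should identify exactly one row
n_dup_obs <- sum(duplicated(df_clean[, c("Country", "Years")]))
print(n_dup_obs)
```

If dup_names is empty for your data, the duplication may arise elsewhere, for example from how the identifiers are selected, so treat this only as a first diagnostic step.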

You will see that, by default, ExPanD does not offer graphing by the cross-section, because in most real-life cases there are too many cross-sectional units for a meaningful visualisation. You can, however, add an additional country identifier to your data frame and then prepare graphs along that one. In the example above, you can create visuals along iso3c, for example, but, as mentioned, these are not really informative because there are too many countries in the sample. Aggregating the data by region is often more informative in these cases.
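
The region-level aggregation suggested above could be sketched as follows. This is an illustration on a hypothetical toy panel (the columns country, region, year, gdp and pop are made up and are not the package's World Bank data), and it averages each numeric variable within region and year, which is only one of several possible aggregation choices.

```r
# Hypothetical toy panel: four countries in two regions, two years
df <- data.frame(
  country = rep(c("A", "B", "C", "D"), each = 2),
  region  = rep(c("North", "North", "South", "South"), each = 2),
  year    = rep(c(2000, 2001), times = 4),
  gdp     = c(1, 2, 3, 4, 5, 6, 7, 8),
  pop     = c(10, 11, 20, 21, 30, 31, 40, 41)
)

# Average each numeric variable within region and year
df_region <- aggregate(cbind(gdp, pop) ~ region + year, data = df, FUN = mean)
print(df_region)

# The aggregated panel can then be explored with region as the
# cross-sectional identifier (needs ExPanDaR and an interactive session):
# library(ExPanDaR)
# ExPanD(df_region, cs_id = "region", ts_id = "year")
```

Whether a mean, a sum or a weighted average is appropriate depends on the variable, so adjust FUN to your data.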

Good luck!

Joachim

joachim-gassen, Apr 09 '22 15:04