ENH: Restore support for reading Stata 104 format dta files, and add support for 103

Open cmjcharlton opened this issue 1 year ago • 3 comments

  • [x] closes #58554
  • [x] Tests added and passed if fixing a bug or adding a new feature
  • [x] All code checks passed.
  • [ ] Added type annotations to new arguments/methods/functions.
  • [x] Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

I wasn't sure whether to label this as a bug fix or an enhancement: pandas supported this format previously, but it last worked a long time ago.

I have also included support for format 103, since for the purposes of reading it is identical to 104 (103 does not support data of byte type).
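For anyone wanting to check which format a given file uses before trying to read it, here is a minimal sketch. It is not the code in this PR. It assumes that older dta files (such as 103 and 104) store the format number in their first byte, and that files from format 117 onward begin with an XML-like `<stata_dta>` header containing a `<release>` tag. The helper name `peek_dta_version` is hypothetical.

```python
import io
import re


def peek_dta_version(stream):
    """Return the dta format number found at the start of a binary stream.

    Hypothetical helper for illustration; not part of pandas.
    """
    head = stream.read(64)
    if not head:
        raise ValueError("empty stream")
    if head.startswith(b"<stata_dta>"):
        # Assumption: formats 117 and later name the release in an XML-like header.
        match = re.search(rb"<release>(\d+)</release>", head)
        if match is None:
            raise ValueError("no <release> tag found in header")
        return int(match.group(1))
    # Assumption: older formats store the format number in the first byte.
    return head[0]


# In-memory stand-ins for file contents (only the leading bytes matter here).
old_style = io.BytesIO(bytes([104, 2, 1, 0]))
new_style = io.BytesIO(b"<stata_dta><header><release>117</release>")
print(peek_dta_version(old_style), peek_dta_version(new_style))
```

With a real file, the same helper could be given a handle opened with `open(path, "rb")`, and the file itself would then be read with `pandas.read_stata(path)`.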

cmjcharlton avatar May 03 '24 18:05 cmjcharlton

cc @bashtage

jbrockmendel avatar May 03 '24 21:05 jbrockmendel

Seems pretty simple. Are the dta files produced by Stata or something else?

bashtage avatar May 08 '24 18:05 bashtage

The new test files are produced with a program that I wrote based on the published specifications, but I test them with Stata (and occasionally a hex editor to be sure). I have also tested with a variety of historic files available from the Stata Technical Bulletin and the Stata Journal.

cmjcharlton avatar May 08 '24 19:05 cmjcharlton

Thanks @cmjcharlton

mroeschke avatar Jun 03 '24 18:06 mroeschke