
Written dta file with version=15 not readable in Stata 17 Basic Edition

Open ofajardo opened this issue 3 years ago • 1 comments

When a dta file is written with version=15 and then opened in Stata 17, the user gets the error:

This .dta file format was created by Stata/MP and has more variables than your Stata can handle.

The file can be opened, however, if the version is set to 14.

Looks kind of similar to this

Original report

ofajardo avatar Dec 02 '21 11:12 ofajardo

FWIW, this actually appears to be correct behaviour. Stata itself uses a different version numbering scheme. Current Stata versions use format 118 for most files, and 119 when there are more than 32,767 variables.

These are mapped to version = 14 and 15 respectively in pyreadstat (we do the same in haven), even though both are current formats; the difference is that only Stata/MP supports more than 32,767 variables. I guess Stata enforces the restriction on the number of variables by checking the file format rather than the actual number of variables in the file.
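As a rough sketch of a workaround on the writing side, the version could be picked from the number of columns, so that format 119 is only requested when the data actually needs it. The helper names `choose_dta_version` and `write_dta_compat` are made up for illustration and are not part of pyreadstat; the only library call assumed is `pyreadstat.write_dta` with its `version` argument.

```python
# Mapping described above: pyreadstat "version" -> Stata .dta format number.
VERSION_TO_FORMAT = {14: 118, 15: 119}

# Format 118 is used for datasets with up to 32,767 variables.
MAX_VARS_FORMAT_118 = 32_767


def choose_dta_version(n_variables: int) -> int:
    """Return 14 (format 118) unless the dataset needs format 119."""
    return 14 if n_variables <= MAX_VARS_FORMAT_118 else 15


def write_dta_compat(df, path):
    """Hypothetical wrapper: write df with the lowest version that fits."""
    import pyreadstat  # imported here so the helper above works without it

    version = choose_dta_version(df.shape[1])
    pyreadstat.write_dta(df, path, version=version)
    return version
```

With this, a typical data frame would be written as format 118, which non-MP editions of Stata 14 and later should be able to open.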

From the spec for reference:

The format of .dta files has changed over time. Stata 17 writes what are known as .dta format-118 files and can read all formats of files that have ever been released. The recent history of .dta formats is

   Format    Current as of
   ---------------------------------------
     119     Stata 15 - 17 (when dataset has more than 32,767 variables)
     118     Stata 14 - 17
     117     Stata 13 
     116     internal; never released
     115     Stata 12
     114     Stata 10
     113     Stata  8
   ---------------------------------------
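
To check which format a given file was actually written in, something like the following might help. It is only a sketch, based on my reading of the .dta layout: formats 117 and later begin with a `<stata_dta><header><release>NNN</release>` tag, while older formats store the format number in the first byte. The function name `dta_format` is an assumption for illustration.

```python
import re

# Formats 117+ start with an XML-like header that names the release.
_RELEASE_TAG = re.compile(rb"<stata_dta><header><release>(\d{3})</release>")


def dta_format(header: bytes) -> int:
    """Return the .dta format number given the first bytes of a file."""
    match = _RELEASE_TAG.match(header)
    if match:
        return int(match.group(1))
    if not header:
        raise ValueError("empty header")
    # Older formats (115 and below) keep the format number in byte 0.
    return header[0]
```

Reading the first 64 bytes or so of a file (`open(path, "rb").read(64)`) and passing them in should be enough for either layout.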

gorcha avatar Mar 08 '22 05:03 gorcha