ietoolkit
ietoolkit copied to clipboard
iebaltab rewrite kb
Please provide comments in this PR
Ok I have looked through this now and it looks hugely improved, great work! I just have one last question, which is, how can I request a mean and a variance statistic for the pairwise comparisons at the same time (ie, beta
and p
)? That is the only thing I would want to do that I can't figure out and there appears to be room in the table for that where there are currently blank spaces.
For example, I run:
iebaltab pop medage, grpvar(region) savex("/users/bbdaniels/desktop/test.xlsx") replace stats(pair( beta ))
And get:

But this fails, so it is not clear how to fill the blanks (as in K5):
. iebaltab pop medage, grpvar(region) savex("/users/bbdaniels/desktop/test.xlsx") replace stats(pair( beta se))
The [pair(beta se)] in [stats(pair( beta se))] is not formatted properly. Make sure that each test name is follewed by a single stat
inside a ().
invalid syntax
r(198);
how can I request a mean and a variance statistic for the pairwise comparisons at the same time (ie,
beta
andp
)?
This is not supported yet. This would be neat but somewhere I had to draw the line to to get it out the door at some point. But how I set up the stats()
option had this in mind so that it would later not be too much work to add.
Here is issue to track this if you want to add some more comments.
I will not hold this up further! That seems like a useful and straightforward feature to add next. If you are happy with the functionality then it is good from my side!
Writing down thoughts as I test this:
- [x] The
browse
option should be abbreviated tobr
- [x] Significance levels should read 0.01, 0.05 and 0.1 instead of 1%, 5%, 10%
- [x] Why does the csv file have quotation marks around the numbers? Can we remove it? There are also spaces around the numbers.
- [ ] The sign of the difference in means is not changing when we change the base group using option
control()
- [x] add option to make "Total" the first column in the table
- [x] make option
onerow
the default when the number of observations is not changing - [x] use of option
format()
returns error "too few quotes" - [x] (FOR THE FUTURE) when using the option
stats()
for the pairwise comparison, one may want to show a combination of stats. in particular, one may want to show the beta + se, or the diff + p-value. I think this was a particular request we received and may be useful to think about how to implement it
Why does the csv file have quotation marks around the numbers? Can we remove it? There are also spaces around the numbers.
Cells with labels used for row or col headers using ,
needs to be enclosed in ""
to not break the comma delimitator. I am not adding ""
manually but telling Stata to add it to all values when exporting. It will be some work to do this manually so that it only applies to data points that are labels. What is a scenario where this is an actual issue? If there is a good reason we can work on adding ""
manually.
add option to make "Total" the first column in the table
How important? Can this be pushed to a later version? I am keen to get this version out, then we can add new features that we want.
make option
onerow
the default when the number of observations is not changing
I vote no for backward compatibility. We have broken backward compatibility in this version for some cases where it is a really good case for it. I do not think this is as important.
(FOR THE FUTURE) when using the option stats() for the pairwise comparison, one may want to show a combination of stats. in particular, one may want to show the beta + se, or the diff + p-value. I think this was a particular request we received and may be useful to think about how to implement it
Absolutely. And the stats()
option is set up with that in mind, but Ben and I decided it did not fit in the scope of this version. It is tracked here: https://github.com/worldbank/ietoolkit/issues/293
The
browse
option should be abbreviated tobr
Done in 10bcb20e95d679156f04968974e885449664b6a8
Significance levels should read 0.01, 0.05 and 0.1 instead of 1%, 5%, 10%
Done in fc167cc6a7b176f5e3717308d99c2aa032d387bf
add option to make "Total" the first column in the table
How important? Can this be pushed to a later version? I am keen to get this version out, then we can add new features that we want.
Yeah, this can come later. It's just a bit counter intuitive, but nothing major. Moved it to issue #298
make option
onerow
the default when the number of observations is not changingI vote no for backward compatibility. We have broken backward compatibility in this version for some cases where it is a really good case for it. I do not think this is as important.
My vote would be to do it now. If we're already breaking backward compatibility, better to make all changes we want to that will break it now. That's a weak preference, though. @bbdaniels , what do you think?
Why does the csv file have quotation marks around the numbers? Can we remove it? There are also spaces around the numbers.
Cells with labels used for row or col headers using
,
needs to be enclosed in""
to not break the comma delimitator. I am not adding""
manually but telling Stata to add it to all values when exporting. It will be some work to do this manually so that it only applies to data points that are labels. What is a scenario where this is an actual issue? If there is a good reason we can work on adding""
manually.
It's not a big issue at all. I just find it harder to read the changes on GitHub. But it's super small and not worth spending a lot of time.
use of option
format()
returns error "too few quotes" Fixed in 161309cc88d61bc72e07cbb1576e0592f5091ee2
make option
onerow
the default when the number of observations is not changingI vote no for backward compatibility. We have broken backward compatibility in this version for some cases where it is a really good case for it. I do not think this is as important.
My vote would be to do it now. If we're already breaking backward compatibility, better to make all changes we want to that will break it now. That's a weak preference, though. @bbdaniels , what do you think?
I am always in favor of breaking compatibility for big functionality upgrades at major versions! But I don't think this is supposed to be a major version -- primarily a rewrite so that we can do that in the future? So in this case I would say keep things backward compatible and don't have a major version, then let's carry over a list of compatibility-breaking functionality and syntax changes for the next version.
add option to make "Total" the first column in the table
How important? Can this be pushed to a later version? I am keen to get this version out, then we can add new features that we want.
Yeah, this can come later. It's just a bit counter intuitive, but nothing major. Moved it to issue #298
Fixed in 851a9c25888ef1711ec875c694531887432473e9
make option onerow the default when the number of observations is not changing
Wont implement due to this argument made on Slack:
I am still not convinced this is the best way to got. Lets say all my columns has the same N. With the new default I would only have my Ns on a single row at the bottom. If then something changes to my data so one data point is missing. Then I would not get an error or a warning, instead the structure of my table would tacitly change. Is it only me who sees that as concerning or confusion at best. I still vote for keeping it as it is. As in the current default, the layout of the table does not change as the data change. And by adding the option
onerow
you add a requirement that the data should be in a certain way (i.e. N the same in a column across all rows). For those that know their data will always meet this condition they can runiebaltab
withonerow
always there, just like all of us who know we want to replace the already existing file runiebaltab
withreplace
.
From looking at the compiled latex file and comparing with the current SSC version, all the estimates are consistent, but the display still has some issues. v6.4.pdf v6.3.pdf
- [x] significance stars are not being exported, even when the
stars
option is explicitly called - [x] can we leave the double bars at the bottom of the table?
- [x] can we also leave the extra spacing between variables? I find that easier to read
- [x] SEs are not in parenthesis
- [x] can we add a horizontal line before number of obs if using one row?
- [x] can we add an option to not show the number of obs for the pairwise tests?
SEs are not in parenthesis
Fixed in 3cef80d80d81b57704b10062cb2065e7b9c07e4c
significance stars are not being exported, even when the stars option is explicitly called
There is no option called stars
, you should have received an error in the new version with that option. Anyways, there was a bug suppressing the stars in the tex tables, that is fixed in 3657647f1d44ca529a037178669f47cc67f396ec
can we leave the double bars at the bottom of the table? can we add a horizontal line before number of obs if using one row?
fixed in c6418bf4735943d8acd2502422f246a58e7ce604 - but lests discuss if there should be one or two single lines when both f-test and onerow are used
can we also leave the extra spacing between variables? I find that easier to read
fixed in 0c5b7562d5cf0c88105c1cc32e742062021d4d85 - I also took the liberty to deprecate the option texvspace(string)
that customize that space, but for every row. The whole point of returning results in matrices is for users to customize their tex tables outside iebaltab
.
can we add an option to not show the number of obs for the pairwise tests?
Suggest adding to an issue to get this version out
significance stars are not being exported, even when the stars option is explicitly called
There is no option called
stars
, you should have received an error in the new version with that option. Anyways, there was a bug suppressing the stars in the tex tables, that is fixed in 3657647
Sorry, I meant the starlevels()
option