sumhdfe
Summary and diagnostic information for evaluating within-fixed-effect variation.
Sumhdfe is a Stata package that produces summary and diagnostic information of linear fixed effect models.
You can use sumhdfe to:
- Check the frequency of fixed effects
- Check the number of groups that have no variation within fixed effects
- Check the residual within-fixed-effect variation of the regression variables
- Generate publication-ready Word and LaTeX tables for all fixed-effects diagnostics
Sumhdfe is currently in beta and we welcome comments and suggestions in the issues tab!
For a discussion of the issues that sumhdfe addresses, see deHaan (2021).
Similarly, if you find these diagnostics to be useful, please cite:
deHaan, Ed. (2021). Using and Interpreting Fixed Effects Models.
Available at SSRN: https://ssrn.com/abstract=3699777.
Authors
Table of contents
- Installation
- Usage & Features
- Pending Items
- Questions?
Installing sumhdfe
Sumhdfe is an extension to reghdfe and requires version 6+ of reghdfe and ftools to work. To generate .rtf files you also need to have rtfutil installed.
To install sumhdfe and its dependencies, follow the steps below:
* Uninstall any old versions of ftools, reghdfe, sumhdfe
cap ado uninstall ftools
cap ado uninstall reghdfe
cap ado uninstall sumhdfe
* Install the most recent version of ftools, reghdfe, and sumhdfe
net install ftools, from("https://raw.githubusercontent.com/sergiocorreia/ftools/master/src/")
net install reghdfe, from("https://raw.githubusercontent.com/sergiocorreia/reghdfe/master/src/")
net install sumhdfe, from("https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/src/")
* To generate rtf files you also need to install rtfutil
ssc install rtfutil
Note: sumhdfe does not work with reghdfe version 5, which is the version installed when running ssc install reghdfe.
Make sure to use the commands above to install reghdfe version 6.
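To confirm which versions ended up on your system after installation, a quick check along these lines may help. It relies only on Stata's built-in which command, which prints the location of a command's ado-file together with any version header the file carries. The output format depends on how each package writes its header.

```stata
* Print the file location and version header of each installed package.
* which exits with an error if a command is not installed.
which ftools
which reghdfe
which sumhdfe
```

If the reghdfe line reports a 5.x version, rerun the installation commands above.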
Usage & Features
Example usage
Sumhdfe can be used in one of two ways:
- As a postestimation command following reghdfe
- As a standalone command
Post-estimation version
First run reghdfe and then run sumhdfe. A simple example is shown below; see the Stata help file for additional examples.
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
reghdfe y x1 x2 , a(firm year)
sumhdfe
Standalone version
Run sumhdfe directly.
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
sumhdfe y x1 x2 , a(firm year)
Features
The sumhdfe command will provide four panels by default:
- Panel A - summary statistics for the full sample
- Panel B - summary statistics for the fixed effects
- Panel C - groups without any within-fixed-effect variation
- Panel D - variation lost (absorbed) due to fixed effects
Additionally, sumhdfe can provide:
- A histogram that shows the frequencies of observations within a fixed effect group
- Publication ready tables
Panel A - Summary statistics
Summary statistics for the sample used in reghdfe.
Example:
[Screenshot: Panel A output]
Notes:
- It can be customized in the same way as estat summarize
- N includes singletons, so it differs from the N shown in the reghdfe output
- When using the panels(str) option, this panel can be selected using the sum acronym: panels(sum)
Panel B - Summary statistics for fixed effects
Summary statistics for the fixed effects themselves.
Example:
[Screenshot: Panel B output]
Notes:
- Interpretation of the above example:
- There are 189 unique firms within the firm fixed effects, 28 of which are singletons (i.e., appear just once). An individual firm has between 1 and 8 observations.
- There are 39 unique years within the year fixed effects, 8 of which are singletons.
- Iterating across both firm and year eliminates 2 more "joint singletons," for a total of 38 singletons eliminated from the reghdfe output.
- When using the panels(str) option, this panel can be selected using the fe acronym: panels(fe)
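To see what the single-fixed-effect counts in Panel B correspond to, the sketch below reproduces them by hand on a small made-up panel. The data and the variable names n_firm and firm_tag are illustrative assumptions, not part of sumhdfe, and the code does not perform the iterative search for joint singletons described above.

```stata
* Toy data: a hypothetical firm-year panel (not the sumhdfe demo data)
clear
input firm year
1 2001
1 2002
2 2001
3 2002
3 2003
3 2004
end

* Number of observations in each firm
bysort firm: gen n_firm = _N

* Flag one observation per firm so that each firm is counted once
egen byte firm_tag = tag(firm)

* Unique firms (3 in this toy data)
count if firm_tag

* Singleton firms, i.e., firms that appear exactly once (1 here: firm 2)
count if firm_tag & n_firm == 1

* Minimum and maximum number of observations per firm (1 and 3 here)
summarize n_firm if firm_tag
```

The same steps with year in place of firm give the counts for the year fixed effect.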
Panel C - Groups without any within fixed effect variation
Panel C quantifies how often each variable is constant within a given fixed effect group (such as within a given firm). These observations can have unexpected effects on regression coefficients and, if numerous, should be carefully evaluated.
Example:
[Screenshot: Panel C output]
Notes:
- Interpretation of the above example:
- Variable x1 has (623-38=) 585 observations excluding singletons.
- Within the non-singleton data, 58 firms have no variation in x1; i.e., each firm has the same x1 in all years. Those 58 firms relate to 217 observations.
- x1 is constant within 4 years, relating to 28 observations.
- When using the panels(str) option, this panel can be selected using the zero acronym: panels(zero)
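The kind of count reported in Panel C can be approximated by hand, as in the following sketch on made-up data. The variable names x1_min, x1_max, x1_const, and firm_tag are hypothetical, and the toy panel contains no singletons, so no observations need to be dropped first. How sumhdfe itself computes the panel may differ.

```stata
* Toy data: a hypothetical balanced panel of 3 firms over 2 years
clear
input firm year x1
1 2001 5
1 2002 5
2 2001 3
2 2002 7
3 2001 4
3 2002 4
end

* x1 is constant within a firm when its within-firm minimum equals its maximum
bysort firm: egen x1_min = min(x1)
bysort firm: egen x1_max = max(x1)
gen byte x1_const = (x1_min == x1_max)

* Flag one observation per firm so that each firm is counted once
egen byte firm_tag = tag(firm)

* Firms with no within-firm variation in x1 (2 here: firms 1 and 3)
count if firm_tag & x1_const

* Observations belonging to those firms (4 here)
count if x1_const
```

Comparing the minimum and maximum avoids relying on a computed standard deviation being exactly zero.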
Panel D - Variation lost (absorbed) due to fixed effects
Panel D shows how much variation in each variable is lost (or absorbed) due to the fixed effects, in terms of both standard deviations and r-squared.
Example:
[Screenshot: Panel D output]
Notes:
- Interpretation of the above example:
- The standard deviation of x1 is 79.7 in the pooled sample (as also shown in Panel A), but the within-fixed-effect standard deviation of x1 is 22.7. Thus, the within-fixed-effect standard deviation of x1 is roughly 28.4% of the pooled standard deviation.
- In terms of r-squared, the firm fixed effects explain roughly 87% of the variation in x1 while the year fixed effects explain roughly 13%. Combined, the fixed effects explain 92.4% of the variation in x1.
- Technical note: the r-squared is relative to the sample including singletons, for which the r-squared is mechanically equal to 100%.
- When using the panels(str) option, this panel can be selected using the rss acronym: panels(rss)
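A rough manual version of the standard-deviation comparison in Panel D is sketched below. It uses the demo data and the reghdfe residual option that appear elsewhere in this README. The scalar names sd_pooled and sd_within are illustrative, and the numbers need not match Panel D exactly, since sumhdfe may use a different sample or degrees-of-freedom adjustment.

```stata
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear

* Pooled standard deviation of x1 over all non-missing observations
quietly summarize x1
scalar sd_pooled = r(sd)

* Residualize x1 on the firm and year fixed effects;
* the resid option stores the residuals in _reghdfe_resid
quietly reghdfe x1, a(firm year) resid

* Standard deviation of the residuals (missing for dropped singletons)
quietly summarize _reghdfe_resid
scalar sd_within = r(sd)

* Share of the pooled standard deviation that remains within fixed effects
display "pooled SD    = " sd_pooled
display "within-FE SD = " sd_within
display "ratio        = " sd_within / sd_pooled
```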
Histogram
The histogram(#) option tabulates the frequencies of observations within a fixed effect grouping.
Example:
The command sumhdfe, histogram(1) shows the frequencies of observations for the first fixed effect grouping listed within a(firm year), i.e., firm. You can also specify the fixed effect name, as in sumhdfe, histogram(year).
[Screenshot: histogram output]
Publication ready tables
All panels can be exported to a publication-ready RTF or LaTeX table. The RTF table can be used in Word or Excel (by copying the contents to an Excel sheet).
To export the tables:
- First run sumhdfe
- Run the sumhdfe_export command
  - You can optionally specify the panels you want using "panels(a b c d)"
  - For the export help file, run help sumhdfe_export
  - The filename you pass to sumhdfe_export determines the output format: use .rtf or .tex
Example 1: RTF
reghdfe y x1 x2, a(firm year)
sumhdfe
sumhdfe_export using table.rtf, panels(a b c d)
You can open the .rtf file using Word and you can copy the table to Excel as well.
Example 2: LaTeX
reghdfe y x1 x2, a(firm year)
sumhdfe
sumhdfe_export using table.tex, panels(a b c d) standalone
You can render the .tex file using your preferred LaTeX editor (e.g., Overleaf).
Additional options
For additional examples and options, see the Stata help files with help sumhdfe and help sumhdfe_export.
Pending Items
- ~~Allow for easy export of each table to Word/Excel/LaTeX~~
- Full walkthrough with real-world example
- Add an option to visually compare the pooled- and within-fixed-effect variation in a variable. In the meantime, it can be manually done as follows:
use "https://raw.githubusercontent.com/ed-dehaan/sumhdfe/master/sumhdfe_demo_data.dta", clear
qui: reghdfe y x1 x2, a(firm year)
qui: reghdfe x1 if e(sample), a(firm year) resid
twoway (histogram x1, fcolor(green%75) lcolor(none)) (histogram _reghdfe_resid, ///
fcolor(navy%70) lcolor(none)), legend(on order(1 "x1" 2 "within-FE x1"))
Questions and bug reports
If you have questions or experience problems please use the issues tab of this repository.
Known bugs:
- ~~The RTF file might have blank pages in the beginning or end if only a selection of panels are returned.~~