Data Science in Stata 16: Frames, Lasso, and Python Integration | David Jacho-Chavez


Stata (StataCorp 2019) is one of the most widely used software packages for data analysis, statistics, and model fitting, used by economists, public policy researchers, epidemiologists, and others. Version 16, released in June 2019, includes an up-to-date methodological library and user-friendly implementations of various cutting-edge techniques. The new release introduces several changes and additions (see https://www.stata.com/new-in-stata/), including lasso, multiple data sets in memory, meta-analysis, choice models, Python integration, Bayesian multiple chains, panel-data extended regression models, sample-size analysis for confidence intervals, panel-data mixed logit, nonlinear dynamic stochastic general equilibrium (DSGE) models, and numerical integration. This review covers the most salient innovations in Stata 16, the first release to include an implementation of machine-learning tools. We consider three innovations: (1) multiple data sets in memory, (2) lasso for causal inference, and (3) Python integration. Each of the following three sections describes one of these innovations, and the last section presents our final thoughts and conclusions.

Journal of Statistical Software, (98), pp. 1-9