Simulating Data with a Known Correlation Structure in Stata

Monte Carlo simulations are most commonly used to understand the properties of a particular statistic, such as the mean, or of an estimator, such as maximum likelihood (ML) regression.

The principle is straightforward. Create a data set with a known correlation or covariance structure. Then add in some random error, and estimate your statistic or model.

Replicate this process 1,000 or 10,000 times – collecting the relevant information from each trial – and you’ll have a nice sampling distribution with which to evaluate the properties of your model or statistic.

The replication can be accomplished easily enough with a -forvalues- loop.

In this article, you’ll find out how to accomplish the other part of the task: creating a data set with a known correlation structure.
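As a rough preview, here is a minimal sketch of the whole procedure, not necessarily the approach the full article takes. It assumes two standard normal variables, a target correlation of 0.5, samples of 100 observations, and 1,000 replications; all of those values, along with the variable names `x`, `y`, and `r`, are arbitrary choices made for illustration. It uses Stata's built-in -drawnorm- to generate the data and -postfile- to collect the sample correlation from each trial.

```stata
clear all
set seed 12345

* Target correlation matrix (0.5 is an illustrative value, not from the article)
matrix C = (1, 0.5 \ 0.5, 1)

* Set up a results file that will hold one sample correlation per trial
tempname sim
tempfile results
postfile `sim' r using `results'

forvalues i = 1/1000 {
    quietly {
        * Start each trial with an empty data set
        drop _all

        * Draw 100 observations of x and y with correlation structure C
        drawnorm x y, n(100) corr(C)

        * Estimate the statistic of interest and store it
        correlate x y
        post `sim' (r(rho))
    }
}
postclose `sim'

* Load the 1,000 estimates and summarize their sampling distribution
use `results', clear
summarize r
```

If the simulation behaves as intended, the mean of `r` should land close to the target of 0.5, and its standard deviation gives a feel for the sampling variability at this sample size.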


Data Analytics and the Three-Headed Monster

A recent theme in the blogosphere centers on how newcomers can get into the field of data science and statistical analysis. What are the necessary qualifications? And how can you go about getting those skills?

Unfortunately, the answers to these questions seem to present a quandary that was eloquently summed up by a comment I read on another blog whose name I have forgotten (perhaps it was Chandoo):

You need experience to get a job as an analyst. But the only way to get experience is to work in a job as an analyst.

Employers today are asking for more from all of their employees. And data analysts are no exception. In fact, the pressure to produce more with less is pushing many employers to merge business functions across smaller workforces.

For the data scientist and, more importantly, the aspiring marketing researcher or business intelligence analyst, there is a three-headed monster to contend with. Each head represents a different role that you will need to fulfill in your career.