Simulating Data with a Known Correlation Structure in Stata

Monte Carlo simulations are most commonly used to understand the properties of a particular statistic, such as the mean, or of an estimator, such as maximum likelihood (ML) regression.

The principle is straightforward. Create a data set with a known correlation or covariance structure. Then add in some random error, and estimate your statistic or model.

Replicate this process 1,000 or 10,000 times – collecting the relevant information from each trial – and you’ll have a nice sampling distribution with which to evaluate the properties of your model or statistic.

The replication can be accomplished easily enough with a -forvalues- loop.
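
As a rough illustration of that replication step, here is a minimal sketch of a -forvalues- loop that collects one number per trial with -postfile-. The statistic (a sample mean), the sample size of 100, the seed, and the names sim, results, and xbar are all assumptions made for this example, not details from the full article.

```stata
* Sketch only: sampling distribution of a sample mean over 1,000 trials.
* sim, results, and xbar are hypothetical names chosen for illustration.
clear all
set seed 12345

tempname sim
tempfile results
postfile `sim' rep xbar using `results'

forvalues i = 1/1000 {
    quietly {
        drop _all
        set obs 100
        generate x = rnormal(0, 1)    // simulated data for this trial
        summarize x
        post `sim' (`i') (r(mean))    // save the trial number and its mean
    }
}
postclose `sim'

* One observation per trial: this is the simulated sampling distribution.
use `results', clear
summarize xbar
```

Posting each result to a separate file keeps the loop from overwriting its own output when the simulated data are dropped at the start of every trial.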

In this article, you’ll find out how to accomplish the other part of the task: creating a data set with a known correlation structure.
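
One common way to do this in Stata is the -drawnorm- command, which draws normally distributed variables from a correlation or covariance matrix you supply. The sketch below is not necessarily the approach the full article takes; the matrix name C, the variable names x and y, the target correlation of 0.5, and the seed are assumptions chosen for illustration.

```stata
* Sketch only: two standard normal variables with a target correlation of 0.5.
* C, x, and y are hypothetical names; the numeric values are arbitrary.
clear
set seed 2024

matrix C = (1, 0.5 \ 0.5, 1)       // target correlation matrix
drawnorm x y, n(1000) corr(C)      // draw 1,000 observations

* The sample correlation should be close to, but not exactly, 0.5.
correlate x y
```

The gap between the sample correlation and the target is sampling error, which is the quantity a Monte Carlo study is designed to examine.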

How to Preserve Missing Values with Stata’s Collapse Command

You are a code-writing machine.

That 3-day project you started this morning might actually be completed by the end of the day.

As your fingers fly across the keyboard, you think you can hear Stata singing your praises softly in the background.

Then IT happens…

Your program stops working right. The data begin looking like something from one of Lord Voldemort’s nightmares.

Your finely-tuned debugging skills kick in, and you track down the problem. That -collapse- command you issued a while back did something rather odd. It replaced all of the missing values in your data set with zeros!

But that’s not at all what you wanted! You wanted those to be missing values, not zeros.

Yep, we’ve all been there. Even the most seasoned Stata users get bit by this quirk every once in a while.

In this article, I show three ways Stata can treat missing values when using the -collapse- command and the sum() function.
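
To make the quirk concrete, here is a small sketch with a toy data set in which one group contains nothing but missing values. It shows one possible workaround, counting the nonmissing values alongside the sum, which may or may not be among the three approaches in the full article. The variable names group, x, sum_x, and n_x are assumptions made for this example.

```stata
* Sketch only: group 2 has no nonmissing values of x.
clear
input group x
1 10
1 .
2 .
2 .
end

* sum() gives 0 for group 2; count() records how many nonmissing
* values actually went into each sum.
collapse (sum) sum_x = x (count) n_x = x, by(group)

* Put the missing value back wherever nothing was summed.
replace sum_x = . if n_x == 0
list
```

Note that group 1 keeps its sum of 10 even though one of its values is missing; only groups with no nonmissing values at all are set back to missing.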

Adding Code Snippets to Your WordPress Posts

From time to time, I’ll be including code snippets of various programming languages in my posts. And I thought you might find it interesting to know how these are being created.

There are a few different methods for creating and highlighting code snippets. We can refer to them as the <code> method, the <pre> method, and the plugin method.

And in fact, I used some special codes to write the <code> and <pre> in the previous sentence…but we’ll get to that in just a minute. Let’s start with the main methods for introducing code snippets.

Data Analytics and the Three-Headed Monster

A recent theme in the blogosphere centers on how newcomers can get into the field of data science and statistical analysis. What are the necessary qualifications? And how can you go about getting those skills?

Unfortunately, the answers to these questions seem to present a quandary that was eloquently summed up by a comment I read on another blog whose name I have forgotten (perhaps it was Chandoo):

You need experience to get a job as an analyst. But the only way to get experience is to work in a job as an analyst.

Employers today are asking for more from all of their employees. And data analysts are no exception. In fact, the pressure to produce more with less is pushing many employers to merge business functions across smaller workforces.

For the data scientist and, more importantly, the aspiring marketing researcher or business intelligence analyst, there is a three-headed monster to contend with. Each head represents a different role that you will need to fulfill in your career.