How to Preserve Missing Values with Stata’s Collapse Command

You are a code-writing machine.

That 3-day project you started this morning might actually be completed by the end of the day.

As your fingers fly across the keyboard, you think you can hear Stata singing your praise softly in the background.

Then IT happens…

Your program stops working right. The data begin looking like something from one of Lord Voldemort’s nightmares.

Your finely-tuned debugging skills kick in, and you track down the problem. That -collapse- command you issued a while back did something rather odd. It replaced all of the missing values in your data set with zeros!

But that’s not at all what you wanted! You wanted those to be missing values, not zeros.

Yep, we’ve all been there. Even the most seasoned Stata users get bitten by this quirk every once in a while.

In this article, I show three ways Stata can treat missing values when using the -collapse- command and the sum() function.


How to Call R from Stata

When it comes to data analysis, if you’re anything like me, you probably work across several different platforms. Depending on your analytical needs, you might get basic descriptives from Excel but use programs like Stata and R for more complex routines.

One of the frustrations that go with this form of data science is the need to transfer data from one program to another.

It’s straightforward to export data in .csv format and then import the data in a different program. But you may lose some important formatting, such as variable and value labels in the data set.

Programs such as Stat Transfer make it easy to convert data from one program format to another. But as with the .csv export, it takes valuable time to convert and transfer the data. And you end up with multiple copies of the same data set cluttering up your machine.

Wouldn’t it be way easier if you could just call one data analysis program from inside another? As a Stata user, I’ve often wished I could perform a quick analysis in R without having to go through all of this effort.

In this article, I’ll show you a method for writing your R code, running R, feeding it data, returning R output in a text file, and returning any changes in your dataset to Stata…all while working in Stata’s native environment. I’m doing this on a PC, so Mac users will need to forgive me.

Adding Code Snippets to Your WordPress Posts

From time to time, I’ll be including code snippets of various programming languages in my posts. And I thought you might find it interesting to know how these are being created.

There are a few different methods for creating and highlighting code snippets. We can refer to them as the <code> method, the <pre> method, and the plugin method.

And in fact, I used some special codes to write the <code> and <pre> in the previous sentence…but we’ll get to that in just a minute. Let’s start with the main methods for introducing code snippets.


Data Analytics and the Three-Headed Monster

A recent theme in the blogosphere centers on how newcomers can get into the field of data science and statistical analysis. What are the necessary qualifications? And how can you go about getting those skills?

Unfortunately, the answers to these questions seem to present a quandary that was eloquently summed up by a comment I read on another blog, the name of which I seem to have forgotten (perhaps it was Chandoo):

You need experience to get a job as an analyst. But the only way to get experience is to work in a job as an analyst.

Employers today are asking for more from all of their employees. And data analysts are no exception. In fact, the pressure to produce more with less is pushing many employers to merge business functions across smaller workforces.

For the data scientist and, more importantly, the aspiring marketing researcher or business intelligence analyst, there is a three-headed monster to contend with. Each head represents a different role that you will need to fulfill in your career.