Expressing Pi in Your Favorite Statistical Software

To celebrate Pi Day, and provide some (hopefully) useful knowledge, I’ll show you how to represent Pi in your favorite statistical packages.

If your favorite isn’t on the list, I’m sorry…I can only do so much.

One thing to keep in mind about these examples is that most software packages use floating point arithmetic (FPA).

I won’t get into exactly what this means in this post. Just know that FPA will generally result in some rounding errors with highly precise numbers (i.e. lots of decimal places). However, below 16 decimal places, you can be reasonably assured that these packages return the same values.
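
To make the rounding concrete, here is a minimal sketch in Python (used purely for illustration; it is not one of the packages covered below). It assumes the usual 64-bit double, prints the exact value that is actually stored for pi, and counts how many decimal places agree with a reference string of pi's digits that I typed in by hand. The helper name matching_decimal_places is my own, not part of any library.

```python
import math
from decimal import Decimal

# Reference digits of pi (first 30 decimal places), entered as a string
# so that no floating point rounding is involved.
PI_REFERENCE = "3.141592653589793238462643383279"


def matching_decimal_places(value: str, reference: str = PI_REFERENCE) -> int:
    """Count how many leading decimal places of `value` agree with `reference`."""
    value_decimals = value.split(".")[1]
    reference_decimals = reference.split(".")[1]
    count = 0
    for ours, true_digit in zip(value_decimals, reference_decimals):
        if ours != true_digit:
            break
        count += 1
    return count


# Decimal(float) converts the stored double exactly, without extra rounding.
stored_pi = str(Decimal(math.pi))
print(stored_pi)
print(matching_decimal_places(stored_pi))
```

The stored value agrees with pi for the first 15 decimal places and drifts after that, which is why the packages below tend to agree with each other up to about that point.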

Excel

=PI()

Note this function does not have any arguments. The value returned is accurate to 14 decimal places.

R

>pi

This returns pi to 6 decimal places. If you need more precision, you can get up to 15 decimal places with the following code (16 significant digits in total, counting the leading 3):

>options(digits=16)
>pi

The digits option can go as high as 22, but R stores pi as a double-precision floating point number, so the printed value is only accurate up to 15 decimal places (see http://www.joyofpi.com/pi.html).

For greater precision, I recommend using the Rmpfr package. I set it to 256-bit precision, and achieved accuracy up to 75 decimal places.
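
As a rough sanity check on these figures, a binary significand of p bits carries about p × log10(2) significant decimal digits. The Python sketch below (again, only for illustration) applies that rule of thumb to a standard double, assuming the IEEE 754 format with its 53-bit significand, and to the 256-bit setting mentioned above. The function name decimal_digits is my own.

```python
import math


def decimal_digits(bits: int) -> float:
    """Approximate number of significant decimal digits carried by `bits` binary digits."""
    return bits * math.log10(2)


for bits in (53, 256):
    print(f"{bits} bits is roughly {decimal_digits(bits):.2f} significant decimal digits")
```

That works out to just under 16 significant digits for a double (one before the decimal point for pi, leaving 15 after it) and about 77 for 256 bits, which is in line with the 75 decimal places I saw with Rmpfr.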

Stata

. di c(pi)
or
. di _pi

As with R, the default precision is 6 decimal places. If you need to increase the precision, you can format the constant for up to 16 decimal places.

. di %19.0g _pi

SAS

I know less about the nuances of representing pi in SAS, but my reading of the SAS documentation suggests that pi can be stored with precision above 16 decimal places.

The basic code is:

data _null_;
  pi=constant('pi');
  put pi=;
run;

SPSS

This may be the worst package to use for representing pi, as IBM still has not included pi as a system constant in the program. Instead, we get to make use of our knowledge of trigonometry (did you just cringe? I did.)…

If you dig back far enough in your memory, you might recall that the tangent of pi/4 equals 1. Using the inverse tangent function (the arctangent), you can create a variable to represent pi:

compute  pi = 4*ARTAN(1).
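
The same identity can be checked in any language that has an arctangent function. As a quick sketch in Python (not SPSS syntax, just an illustration), this compares 4*arctan(1) against the built-in constant:

```python
import math

# Rebuild pi from the identity tan(pi/4) = 1, i.e. pi = 4 * arctan(1).
pi_from_arctan = 4 * math.atan(1)

# Compare against the library's own constant.
difference = abs(pi_from_arctan - math.pi)
print(pi_from_arctan)
print(difference)
```

Any difference should be at or below the level of floating point rounding, so the arctangent trick loses nothing compared with a built-in constant.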

Hope you find this interesting and useful…Happy Pi Day!

How to Call R from Stata

When it comes to data analysis, if you’re anything like me, you probably work across several different platforms. Depending on your analytical needs, you might get basic descriptives from Excel, but use programs like Stata and R for more complex routines.

One of the frustrations that go with this form of data science is the need to transfer data from one program to another.

It’s straightforward to export data in .csv format, and then import the data in a different program. But you may lose some important formatting such as variable and value labels in the data set.

Programs such as Stat/Transfer make it easy to convert data from one program format to another. But as with the .csv export, it takes valuable time to convert and transfer the data. And you end up with multiple copies of the same data set cluttering up your machine.

Wouldn’t it be way easier if you could just call one data analysis program from inside another? As a Stata user, I’ve often wished I could perform a quick analysis in R without having to go through all of this effort.

In this article, I’ll show you a method for writing your R code, running R, feeding it data, returning R output in a text file, and returning any changes in your dataset to Stata…all while working in Stata’s native environment. I’m doing this on a PC, so Mac users will need to forgive me.