Using Stata's -reshape- Command with Data Files

Many on-line data sources arrange their spreadsheets in ways that are poorly suited to analysis in statistical packages. Stata's -reshape- command is a powerful tool for rearranging such data into an analyzable format. This tutorial works through examples of "reshaping" datasets from a commonly used data source. It is not a comprehensive review of the -reshape- command; think of it as an introduction to the command and an illustration of its usefulness. We assume that you have already converted the dataset to Stata format using StatTransfer, the -insheet- command, or other means. Even if you do not plan to do your analysis in Stata, you can reshape the data there and then use StatTransfer to convert the reshaped file for your statistics package of choice. Depending on your data and what you need to do, Stata's -stack- or -xpose- command may be more appropriate.


Reshaping from "Wide" to "Long":

The examples used here come from the on-line version of the World Bank's World Development Indicators (WDI), which typically displays data in the "wide" format. For data files that contain information across time, this means that each column holds one year of data and each row holds one country; if the file contains multiple variables, each row is instead a country-variable combination.

Example:

Country    1981    1982    1983
Austria      25      10       8
Belgium      14      10      17
Denmark       2      18       4

For analysis in statistical packages, the data generally need to be "reshaped" into the "long" format, in which each row is a country-year and each column is a variable of interest.

Example:

Country    Year    Variable
Austria    1981          25
Austria    1982          10
Austria    1983           8
Belgium    1981          14
Belgium    1982          10
Belgium    1983          17
Denmark    1981           2
Denmark    1982          18
Denmark    1983           4
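
As a minimal sketch of the wide-to-long step, the small wide table above can be typed into Stata by hand and reshaped. The variable names country, yr1981 through yr1983, year, and value are assumptions made for illustration. Stata variable names cannot begin with a digit, so the year columns of an imported file will carry some prefix, and the actual prefix depends on how the file was brought into Stata.

```stata
* Illustrative sketch only: enter the small wide table by hand
clear
input str10 country yr1981 yr1982 yr1983
"Austria" 25 10  8
"Belgium" 14 10 17
"Denmark"  2 18  4
end

* "yr" is the assumed stub shared by the year columns;
* i() names the row identifier, j() names the new year variable
reshape long yr, i(country) j(year)

* After reshaping, the stub variable holds the data values
rename yr value

list, sepby(country)
```

If the year columns in your own file use a different prefix, substitute that prefix for yr in the -reshape- line.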

In this introduction to using the -reshape- command, we will look at two examples: one where we have a data file with only one variable of interest, and one where we have a data file with multiple variables of interest:

Reshaping Data With One Variable of Interest

Reshaping Data With Many Variables of Interest
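
For a file with several variables of interest, one possible approach (a sketch, not necessarily the one followed in the linked example) is to reshape long on the year columns and then reshape wide on the variable identifier. The data values and the names series, gdp, and pop below are made up for illustration.

```stata
* Illustrative sketch only: each row is a country-variable combination
clear
input str10 country str4 series yr1981 yr1982
"Austria" "gdp" 25 10
"Austria" "pop"  7  8
"Belgium" "gdp" 14 10
"Belgium" "pop"  9 10
end

* Step 1: move years from columns to rows
* (one row per country-series-year)
reshape long yr, i(country series) j(year)

* Step 2: move series from rows to columns
* (one row per country-year; "string" is needed because series is a string)
reshape wide yr, i(country year) j(series) string

* The resulting variables are named yrgdp and yrpop
list, sepby(country)
```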

Again, this guide is not a comprehensive, top-to-bottom review of the -reshape- command. It is more an introduction to the command and how it might be used with a particular resource.


Other Guides for Reshaping Data in Stata:

"FAQ: Problems With Reshape," by the Stata Corporation

"Reshape World Development Indicators for Stata Analysis," by Data and Statistical Services, Princeton University

"Reshaping Panel Data Using Excel and Stata," by Moonhawk Kim at the University of Colorado, Boulder

"Reshaping a Data File," UNC Carolina Population Center

"Reshaping Data Long to Wide," UCLA Academic Technology Services

"Reshaping Data Wide to Long," UCLA Academic Technology Services





Page adapted from Electronic Data Center, Emory University Libraries
Original text by Amy Yuen