
Many on-line data sources arrange their data in spreadsheet layouts that are poorly suited to analysis in statistical packages. Stata's -reshape- command lets you rearrange such data into a form that is easier to analyze. This tutorial shows how to use the command to "reshape" datasets into analyzable formats, working with data from a commonly used source. It is not a comprehensive review of -reshape-; think of it as an introduction to the command and an illustration of its usefulness.
We assume that you have already converted the dataset to Stata format using StatTransfer, the -insheet- command, or some other means. Even if you do not plan to do your analysis in Stata, you can reshape the data there and then use StatTransfer to convert the reshaped file for your statistics package of choice. Depending on your data and what you need to do, Stata's -stack- or -xpose- command may be more appropriate.
The examples here come from the on-line version of the World Bank's World Development Indicators (WDI), which typically displays data in the "wide" format.
For data files that contain information across time, this means that the columns denote years of data, while each
row is a country; if a file contains multiple variables, each row is instead a country-variable combination.
Example:
| Country | 1981 | 1982 | 1983 |
|---------|------|------|------|
| Austria | 25 | 10 | 8 |
| Belgium | 14 | 10 | 17 |
| Denmark | 2 | 18 | 4 |
For analysis in statistical packages, the data must be "reshaped" into the "long" format, in which each
row is a country-year and the columns are the variables of interest.
Example:
| Country | Year | Variable |
|---------|------|----------|
| Austria | 1981 | 25 |
| Austria | 1982 | 10 |
| Austria | 1983 | 8 |
| Belgium | 1981 | 14 |
| Belgium | 1982 | 10 |
| Belgium | 1983 | 17 |
| Denmark | 1981 | 2 |
| Denmark | 1982 | 18 |
| Denmark | 1983 | 4 |
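As a rough sketch of the wide-to-long conversion illustrated above, the following types in the values from the wide example table and reshapes them. The variable names are assumptions made for illustration: Stata variable names cannot begin with a digit, so the year columns are given a hypothetical "y" prefix (y1981, y1982, y1983), and the final name "value" is likewise arbitrary. A real WDI file will probably use different names.

```stata
* Illustrative sketch only: enter the small wide example by hand.
* The "y" prefix on the year columns is an assumption, not WDI's naming.
clear
input str10 country y1981 y1982 y1983
"Austria" 25 10  8
"Belgium" 14 10 17
"Denmark"  2 18  4
end

* "y" is the stub shared by the year columns; i() names the variable
* that identifies each wide row; j() names the new variable that will
* hold the numeric suffixes (1981, 1982, 1983).
reshape long y, i(country) j(year)

* Give the reshaped column a more descriptive (arbitrary) name.
rename y value

* One row per country-year.
list, sepby(country)
```

After the -reshape long- step the dataset has nine observations, one for each country-year, with the three variables country, year, and value.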
In this introduction to the -reshape- command, we will look at two examples:
one in which the data file has only one variable of interest, and one in which
it has multiple variables of interest:
Reshaping Data With One Variable of Interest
Reshaping Data With Many Variables of Interest
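For the many-variable case, one possible approach (a sketch under stated assumptions, not necessarily the method the linked example follows) is to reshape twice: first long over the years, then wide over the variable names. The data values, the series names "gdp" and "pop", and the "y" prefix on the year columns below are all made up for illustration.

```stata
* Illustrative sketch only: made-up data in which each row is a
* country-variable combination and each column is a year.
clear
input str10 country str6 series y1981 y1982
"Austria" "gdp" 25 10
"Austria" "pop"  7  8
"Belgium" "gdp" 14 10
"Belgium" "pop"  9  9
end

* Step 1: move years from columns to rows
* (one row per country-series-year).
reshape long y, i(country series) j(year)

* Step 2: move the series from rows to columns
* (one row per country-year). The "string" option is needed
* because the j() variable holds text rather than numbers.
reshape wide y, i(country year) j(series) string

* The new columns are named ygdp and ypop; drop the stub prefix.
rename (ygdp ypop) (gdp pop)

list, sepby(country)
```

The second step only works if the series values are usable as parts of Stata variable names, so long descriptive labels would need to be shortened or recoded first.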
Again, this guide is not a comprehensive, top-to-bottom review of the -reshape- command. It is an introduction to the command and to how it might be used with one particular resource.
"FAQ: Problems With Reshape," by the Stata Corporation
"Reshape World Development Indicators for Stata Analysis," by Data and Statistical Services, Princeton University
"Reshaping Panel Data Using Excel and Stata," by Moonhawk Kim at the University of Colorado, Boulder
"Reshaping a Data File," UNC Carolina Population Center
"Reshaping Data Long to Wide," UCLA Academic Technology Services
"Reshaping Data Wide to Long," UCLA Academic Technology Services