Data Cleaning Using SPSS and SAS: A Practical Guide to Improving Data Quality is an introduction to data cleaning for both scholarly and practicing researchers. Intended for users of both SPSS and SAS, the book provides a historical overview of events that shaped survey research, a summary of the survey research life cycle, and a discussion of the benefits, practice, and application of data cleaning. These approaches will be useful for researchers, both professionals and students, in education, sociology, political science, and social psychology.
It is generally agreed that 60 to 80 percent of any data or statistical project consists of data cleaning, and yet many professionals do not have the skills needed to clean and manage their data successfully. As a result, time is lost and accuracy is compromised. This book encourages researchers to allocate the time necessary to become truly knowledgeable about cleaning data. It clearly and succinctly explains why data cleaning is needed and what it involves, neither of which is always obvious. Finally, the book encourages researchers to think more carefully about the practice of cleaning data as part of data quality, especially in the area of survey research. By covering both SPSS and SAS, the book can offer solutions that apply to virtually any research project.
Topics covered include working with unique identifiers, fixing duplicate records, compensating for the absence of respondent data, dealing with out-of-range values, determining when to keep or delete incomplete survey responses, finding and editing inconsistent (out-of-range) responses, addressing issues of missing data, strategies for cleaning multiple response items, and understanding and repairing problems of acquiescence bias. The book includes an SPSS Data Dictionary and a SAS Data Dictionary.
Contents:
1: Data Cleaning and Your Research
2: Survey Research Life Cycle and Sources of Error
3: Overview of Data Cleaning
4: Working with Unique ID’s
5: Dealing With Duplicate Records
6: When There’s No Respondent Data
7: Out-of-Range Values (Numeric and Character)
8: Missing Values
9: Inconsistent Responses (two variables): Strategies
10: Multiple Response Items (Check all that apply)
11: Acquiescence Bias
Appendix A: SPSS Data Dictionary for Fitness Survey Dataset (formats, labels)
Appendix B: SAS Data Dictionary for Fitness Survey Dataset (formats, labels)
Appendix C: SPSS Validation Procedure (new in SPSS Version 20)