haven::read_sav() is my favourite way of loading in SPSS files. In my experience, it rarely has any problems and is generally fast enough; it is also part of the tidyverse. There are other alternatives such as sjlabelled::read_spss() and foreign::read.spss(), but haven is my personal recommendation. You can pick a favourite and keep the others in your back pocket.
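As a minimal sketch of what loading looks like, the example below writes a tiny made-up labelled dataset to a temporary .sav file and reads it back with haven::read_sav(), so that it runs without any download. The data, the variable name `sex`, and the temporary path are assumptions for illustration only; with the real dataset you would pass the path of the downloaded file instead.

```r
library(haven)

# Hypothetical stand-in for a survey file: one labelled variable
demo_data <- data.frame(id = 1:4)
demo_data$sex <- haven::labelled(
  c(1, 2, 2, 1),
  labels = c(Male = 1, Female = 2),
  label = "Sex of respondent"
)

# Write to a temporary .sav file, then load it the way a real file would be loaded
sav_path <- tempfile(fileext = ".sav")
haven::write_sav(demo_data, sav_path)
demo_loaded <- haven::read_sav(sav_path)

# Variable and value labels are stored as attributes on each column
attr(demo_loaded$sex, "label")
attr(demo_loaded$sex, "labels")
```

Because the labels travel with the column as attributes, they can be inspected at any point after loading.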
You can download the SAV file from the ARDA site here.
Since all I really needed was an open, simple, and accessible SPSS / .sav dataset with variable and value labels that I could use for examples, I simply went online and found the first dataset that ticked these boxes. It's not the most exciting: it's the 1991 General Social Survey, which is a nationally representative sample of adults in the United States.
Let us first load in all the packages that we'll use in this post. For clarity, I will still make the package-source of the functions explicit (e.g. labelled::val_label()) so it's easy to see where each function comes from. One of these packages, surveytoolbox, is my own and available on GitHub only; if you're interested, you can install it by running devtools::install_github("martinctc/surveytoolbox").

    library(surveytoolbox) # install with devtools::install_github("martinctc/surveytoolbox")
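To illustrate the package-explicit calling style mentioned above, here is a small sketch using functions from the labelled package on a made-up variable. The vector `q1`, its value labels, and its variable label are assumptions for illustration and do not come from the GSS file.

```r
library(labelled)

# Hypothetical agreement question with made-up responses
q1 <- labelled::labelled(
  c(1, 2, 5, 4),
  labels = c(
    "Strongly disagree" = 1,
    "Disagree" = 2,
    "Agree" = 4,
    "Strongly agree" = 5
  ),
  label = "Coding R is one of my hobbies"
)

# The pkg::function() form makes the source of each function visible
labelled::var_label(q1)     # the variable label (question wording)
labelled::val_labels(q1)    # all value labels as a named vector
labelled::val_label(q1, 5)  # the label attached to a single value
```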
Of course, another big reason was my own ignorance of all the different methods and packages available out there at the time, which would have otherwise made a lot of this easier! This post provides a tour of the various functions (from different packages) that I wish I had known at the time.

Despite the title, it's not just about SPSS: there are plenty of other formats (e.g. SAS files) out there which carry variable and value labels, but I think this title is justified because:

- Most people starting out on survey data analysis will tend to first come across SPSS files (.sav).
- SPSS is still one of the most popular data formats for survey data.
- It's an SPSS file that I will use as a demo in this post, and the importing functions which I will briefly go through are SPSS-specific.
My experience was that the base data frame in R does not easily lend itself to working with these labels. Additional work is therefore necessary in order to turn the analysis into the neat contingency tables that you typically get from other specialist survey analysis software, like SPSS or Q. Here's an example (with completely made up numbers) of what I would typically need to produce as an output:

    # A tibble: 3 x 3
    #   `Q10 Top 2 Box Agree`                  `R Users Segmen~ `Python Users Seg~
    # 1 Coding R is one of my hobbies          88.1%            60.0%
    # 2 I don't like to spend time in front ~  40.5%            39.1%
    # 3 I would be inclined to quit my job i~  70.1%            40.5%
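A table of this shape can be sketched with dplyr and tidyr. This is only one possible approach under stated assumptions, not necessarily how the output above was produced: the responses are randomly generated, the column names `q10_1` and `q10_2` are hypothetical, the second statement is a placeholder, and "top 2 box" is taken to mean a score of 4 or 5 on a 1 to 5 agreement scale.

```r
library(dplyr)
library(tidyr)

set.seed(1)

# Made-up responses on a 1-5 agreement scale for two hypothetical segments
responses <- tibble::tibble(
  segment = rep(c("R Users", "Python Users"), each = 50),
  q10_1 = sample(1:5, 100, replace = TRUE),
  q10_2 = sample(1:5, 100, replace = TRUE)
)

# Lookup from variable name to statement wording (stands in for variable labels)
statement_labels <- c(
  q10_1 = "Coding R is one of my hobbies",
  q10_2 = "Statement B (placeholder wording)"
)

top2_table <- responses %>%
  tidyr::pivot_longer(c(q10_1, q10_2), names_to = "variable", values_to = "score") %>%
  dplyr::group_by(variable, segment) %>%
  # Share of respondents answering 4 or 5
  dplyr::summarise(top2 = mean(score >= 4), .groups = "drop") %>%
  dplyr::mutate(
    statement = unname(statement_labels[variable]),
    top2 = sprintf("%.1f%%", 100 * top2)
  ) %>%
  dplyr::select(statement, segment, top2) %>%
  tidyr::pivot_wider(names_from = segment, values_from = top2)

top2_table
```

The lookup vector is doing the job that variable labels would do in a labelled dataset, which is why keeping labels attached to the data saves this kind of manual bookkeeping.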
Since a significant proportion of my typical analysis projects involves survey data, I'm always on the lookout for new and better ways to improve my R analysis workflows for surveys. Funnily enough, when I first started out using R a couple of years ago, I didn't think R was at all intuitive or easy to work with for survey data. One of the big reasons for this “pain” was survey labels.

Survey data generally cannot be analysed independently of the variable labels (e.g. What is your gender?) and value labels (e.g. 1 = Male, 2 = Female, 3 = Other, …), which is true in the case of categorical variables. Even for ordinal Likert scale variables such as “On a scale of 1 to 10, how much do you agree with…”, the meaning of the value is highly dependent on the nuanced wording of the agree-disagree statement. Respondents with a different classification within the survey (e.g. “full-time employees” vs “retirees”) may also have answered a statement that is worded slightly differently, but their responses are reflected using a single variable in the data: for instance, employees may be asked about their satisfaction with their current employer in the survey, and retirees asked about their previous employer. In my talk at the EARL conference last year, I also discussed a specific type of trade-off agreement question where any interpretation of the data is particularly sensitive to the value labels.
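To make the distinction between variable labels and value labels concrete, here is a minimal sketch using haven::labelled() with the gender example from the text. The response values are made up for illustration.

```r
library(haven)

# Made-up responses to the gender question, stored as numeric codes
gender <- haven::labelled(
  c(1, 2, 2, 3, 1),
  labels = c(Male = 1, Female = 2, Other = 3),
  label = "What is your gender?"
)

# The raw codes alone say nothing about what was asked or answered
table(unclass(gender))

# The variable label carries the question wording
attr(gender, "label")

# The value labels map each code to its meaning
gender_factor <- haven::as_factor(gender)
table(gender_factor)
```

Comparing the two tables shows the point: the counts are identical, but only the second one can be interpreted without the questionnaire in hand.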