To RIPLS or Not to RIPLS: That is Only Part of the Question

By Connie Schmitz, PhD, and Barbara Brandt, PhD

We live in two inter-related worlds of interprofessional education and collaborative practice (IPECP): we implement and evaluate the University of Minnesota IPECP program, 1Health, across 21 schools and programs on three campuses, and we work in the National Center for Interprofessional Practice and Education.  We constantly grapple with “on the ground” challenges and with national issues that bubble up in the center, which gives us a distinctive perspective on IPECP.  A recent editorial by Mahler et al. (JIC, 2015) provides a cautionary tale about a popular instrument for measuring attitudes about IPECP.  The instrument in question is the Readiness for Interprofessional Learning Scale (RIPLS), and the editorial authors argue convincingly that the evidence for its validity is weak.

Rapid adoption and dissemination of promising tools, especially in emerging fields, is not uncommon.   Developed in 1999, the RIPLS represents a thoughtful, early attempt to fill a massive void.  By using it over time, the field has learned some things about the challenge of assessing IPECP.  After administering RIPLS for five years, we recently discontinued it here, at the University of Minnesota, because we weren’t confident that it produced valid responses among students who had no previous exposure to interprofessional education and barely knew their own professions, much less others.  We were concerned about the nature of most of its items, which encourage students to respond in ways that are socially expected or desired.  Over time, we learned that its scores proved stubbornly insensitive to course improvements and to pre-post change.

 

Validity is the Question

Unfortunately, these and other problems cited in the editorial about the RIPLS (i.e., unstable reliability estimates at subscale levels and unstable factor structure) can probably be said of other measurement tools as well (especially if they were replicated as often as the RIPLS!).  Some of these problems may be attributed to the instruments themselves.  Some may be due to differences among the populations being surveyed, the different course contexts involved, or the timing and modes of administration.  Some may be due to our unformed theories about what we need to measure (as the editorial writers point out).  And some may be due to the assumptions we bring to the table in trying to ascertain “validity.”  

By themselves, tools are neither valid nor invalid.  In validity studies, we collect scores from an instrument and analyze them to see whether or not they confirm our assumptions about what we believe the scores to mean.  Many researchers studying IPECP turn to a particular statistical technique, exploratory factor analysis, as a way to detect the presence of distinct constructs being measured by items in an instrument.  Confirmatory factor analysis (i.e., expecting the factor structure emerging from the responses of one population of respondents to be replicated across time, place, and other populations) has less support as a means to build theory or ascertain validity, at least among some communities of statisticians (Norman & Streiner, 2003).

One of the original, if not primary, goals of factor analysis is to understand the extent to which an underlying latent factor (or factors) can explain response patterns in a set of data.  Often an explicit goal of factor analysis is to reduce the number of variables (or items within an instrument) down to one or two overall traits, abilities, or attitudes (Brown, 1976).  Running, throwing a ball, and leaping over tall buildings, for example, probably all reflect underlying athletic ability.  Instruments with a single factor accounting for a large proportion of variance in responses lend themselves well to summary scoring (i.e., using some sort of total score).  This can be helpful for further validity testing as well as learner assessment and course evaluation.
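
To make the idea of a single dominant factor concrete, here is a minimal Python sketch; it is an illustration only and is not drawn from the RIPLS literature.  It simulates three hypothetical items (running, throwing, leaping) that all depend on one latent ability, then estimates the share of total variance captured by the largest eigenvalue of their correlation matrix.  Strictly speaking, this is a principal-component stand-in for a one-factor solution, not a full factor analysis.  The loadings of 0.8, the sample size, the seed, and all function names are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(items):
    """Item-by-item correlation matrix; `items` is a list of score lists."""
    k = len(items)
    return [[pearson(items[i], items[j]) for j in range(k)] for i in range(k)]


def largest_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix via power iteration."""
    k = len(matrix)
    v = [1.0] * k
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eig = math.sqrt(sum(x * x for x in w))
        v = [x / eig for x in w]
    return eig


def simulate_items(n_respondents, loadings, seed=0):
    """Simulate standardized item scores driven by one latent ability.

    Each score is loading * ability + noise, scaled to unit variance.
    The loadings are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    items = [[] for _ in loadings]
    for _ in range(n_respondents):
        ability = rng.gauss(0, 1)
        for idx, loading in enumerate(loadings):
            noise = rng.gauss(0, 1)
            items[idx].append(loading * ability + math.sqrt(1 - loading ** 2) * noise)
    return items


# Hypothetical items: running, throwing, leaping (assumed loadings of 0.8).
items = simulate_items(2000, [0.8, 0.8, 0.8])
corr = correlation_matrix(items)

# Share of total variance captured by the first component.
share = largest_eigenvalue(corr) / len(corr)
print(round(share, 2))
```

Under these assumed loadings, the expected share is about 0.76 (the first eigenvalue is 1 + 2 × 0.64 = 2.28 out of a total of 3); sampling noise will move a simulated value slightly.  A share this large is the kind of pattern that makes a total score defensible.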

So here is an interesting question: are the constructs we often try to measure in IPECP truly distinct?  It could be argued that many (e.g., communication, coordination, collaboration, understanding roles and responsibilities, beliefs in teamwork) are closely linked.  High inter-correlations among items measuring these constructs seem nearly inevitable.  That’s because when we teach IPECP to students, we usually teach these constructs together as a whole; they form a gestalt – even a world view or belief system.  It would not be surprising, therefore, if the data generated by many of our tools suggest only one or two latent factors.  This may be especially true with data from self-report instruments that measure attitudes.  This leaves us in a bit of a quandary, in terms of our validity assumptions.  We may need to question the basis for expecting stable factor scores from one administration to another if the constructs being measured in a tool are highly inter-correlated to begin with.
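
The following Python sketch illustrates the quandary under stated assumptions; it is not an analysis of any real IPECP instrument.  It builds the population correlation matrix for six hypothetical items, three per construct, and compares the two largest eigenvalues when the two constructs are weakly correlated (0.3) versus highly correlated (0.9).  The loading of 0.8, the construct correlations, the function names, and the use of the common "eigenvalue greater than 1" rule of thumb for counting factors are all assumptions chosen for illustration.

```python
import math


def build_correlation_matrix(items_per_construct, loading, construct_corr):
    """Population correlations for two constructs with equal item loadings.

    Items on the same construct correlate at loading**2; items on
    different constructs correlate at loading**2 * construct_corr.
    """
    k = 2 * items_per_construct
    within = loading ** 2
    between = within * construct_corr
    matrix = []
    for i in range(k):
        row = []
        for j in range(k):
            if i == j:
                row.append(1.0)
            elif (i < items_per_construct) == (j < items_per_construct):
                row.append(within)
            else:
                row.append(between)
        matrix.append(row)
    return matrix


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector via power iteration."""
    k = len(matrix)
    # Asymmetric start vector so it is not orthogonal to the target vector.
    v = [float(i + 1) for i in range(k)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eig = math.sqrt(sum(x * x for x in w))
        v = [x / eig for x in w]
    return eig, v


def top_two_eigenvalues(matrix):
    """Two largest eigenvalues, using deflation after the first."""
    first, v = top_eigenpair(matrix)
    k = len(matrix)
    deflated = [[matrix[i][j] - first * v[i] * v[j] for j in range(k)]
                for i in range(k)]
    second, _ = top_eigenpair(deflated)
    return first, second


# Hypothetical designs: same items, different correlation between constructs.
low = top_two_eigenvalues(build_correlation_matrix(3, 0.8, 0.3))
high = top_two_eigenvalues(build_correlation_matrix(3, 0.8, 0.9))

# Count of the two leading eigenvalues that exceed 1 (rule-of-thumb factor count).
factors_low = sum(1 for e in low if e > 1.0)
factors_high = sum(1 for e in high if e > 1.0)

print([round(e, 2) for e in low], factors_low)    # about [2.86, 1.7] and 2
print([round(e, 2) for e in high], factors_high)  # about [4.01, 0.55] and 1
```

With weakly correlated constructs, two eigenvalues exceed 1 and a two-factor reading is plausible.  With highly correlated constructs, the second eigenvalue drops well below 1, so the same six items look like a single factor, and small sampling differences between administrations could easily change how a weak second factor appears.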

 

Role of the National Center

Our purpose is not to argue for the RIPLS, but to use the RIPLS and its widespread use as an example of the learning pains involved with validity research.  The need for high-quality instruments still exists, but so does the need to build assessment capacity for the field.  We have learned through the flood of requests to the National Center that many people do not understand the measurement field.  They contact us looking for instruments that will serve as what we now call “magic bullets.”  To respond to this need, in March 2015 we published a monograph on the nature of validity and the considerations involved with selecting measurement tools (Schmitz & Cullen, 2015).  This primer, Evaluating Interprofessional Education and Collaborative Practice:  What Should I Consider When Selecting a Tool? (https://nexusipe.org/evaluating-ipecp), guides readers on what to look for when selecting a tool, the importance of defining one’s purpose of assessment, and steps to take when appraising validity.

In June 2015, we engaged Dr. Eduardo Salas, Professor of Psychology at Rice University and a well-known international expert on teamwork, together with the Human Resources Research Organization, a national consulting firm specializing in personnel management, education research and evaluation (https://www.humrro.org/corpsite/about).  Their charge is to create an online toolkit and practical guide for the selection, adaptation, or creation of teamwork assessment tools.  As readers know, “teamwork” represents one of the most needed and important dimensions of IPECP evaluation.

Last but not least, this fall we are starting a major redesign of our measurement instrument collection located on www.nexusipe.org.  The original intent of the National Center curation was to make existing measurement instruments available to people, along with the current literature behind them.  We sought out each author and developer and discussed the creation and use of the instruments from their perspective.  We approached the task from an “open source,” community resource exchange perspective, which encourages self-submissions and feedback from users.

Because of the tremendous growth of IPECP in the United States in the last two years, we have come to recognize that the collection is now outdated and in need of more rigorous standards of review.  Over the next year, we will work to build an expert-curated collection of peer-reviewed instruments, guided by a managing editor and advisory board.  The site will include critical reviews of instruments meeting a high standard of inclusion.  Authors submitting reviews of recommended instruments will be expected to synthesize the literature about the validity and utility of an instrument, and to use their expert judgment to offer practical recommendations regarding the instrument’s use and the interpretation of its results.

This may result in a smaller collection of instruments, at least initially.  In updating the collection, however, we will broaden the search strategy to find high quality instruments from health services research and other disciplines.  Even so, “small” is also not necessarily “bad.”  In order to best support research on IPECP, we believe the field needs to avoid the proliferation of new tools, administered to small samples at single sites.  (For a field preaching “collaboration,” researchers and practitioners need to collaborate!)  We need to figure out how to make a long-term investment in validity studies of a relatively small, select group of “best” instruments.  This requires funding; an expert work group to identify “best” instruments; and a network of experienced researchers and practitioners to administer tools and share data.   

We are excited by these challenges and opportunities, and invite your comments.

References

Brown, FG.  Principles of Educational and Psychological Testing, 2nd ed.  New York City, NY: Holt, Rinehart and Winston, 1976.

Mahler, C, Berger, S, & Reeves, S.  The Readiness for Interprofessional Learning Scale (RIPLS): A problematic evaluative scale for the interprofessional field.  JIC 2015;29(4):289-291.

Norman, GR & Streiner, DL.  PDQ Statistics, 3rd ed.  Hamilton (Ontario): BC Decker, Inc., 2003.

Schmitz, CC & Cullen, MJ. Evaluating Interprofessional Education and Collaborative Practice: What Should I Consider When Selecting a Measuring Tool?  Minneapolis (MN): University of Minnesota, Academic Health Center, 2015, https://nexusipe.org/evaluating-ipecp.
