FINESSE Facilitation: What Are Best Practices for Qualitative Assessment Analysis?

Properly using opinion-based information to improve decisions is a function of survey design, analysis, and administration.

Feb 23, 2023

FINESSE is a communication approach where the burden of effective communication is on the sender. Facilitators should understand opinion-based surveys if they are to bring participants to solutions that are created, understood, and accepted by all.

Opinion-based data is the foundation of qualitative assessments. Qualitative assessments are used in various applications, including asset management, risk management, human reliability analysis, and customer surveys. The usefulness of any qualitative assessment is a function of design, analysis, and administration.

This article provides tips for improving qualitative assessment analysis. Facilitators develop and use qualitative assessments in their work, so they should understand how to analyze them as they seek to bring a group of participants to solutions that are created, understood, and accepted by all.

A Long History with Many Forms

The modern scientific use and evaluation of opinion-based data can be traced to the Western Hemisphere in the late 1800s. Educators and psychologists were seeking to quantify their clinical observations of human behavior. A similar movement was underway in the fields of natural science and statistics.

Many good practices for qualitative assessments that were developed during this time period are still applicable today. From hotel stays to dining experiences to equipment condition, opinion-based qualitative assessments are used successfully every day in a variety of fields.

Rensis Likert is credited with creating one of the first ordinal data instruments, which employs the 5-point scales currently used in most opinion-based surveys. There are five major qualitative measurement scales: Likert, ranking, Thurstone, Guttman, and semantic differential.

Ordinal Data

According to the Handbook of Human Factors Testing and Evaluation, “Obviously, questionnaire data are not a ratio measurement scale like meters or feet. Instead, questionnaire data are often ordinal measures, for example, very acceptable is better than effective. Under the best cases, where response scales and descriptor sets are based on normative data (as in the scales and descriptors recommended earlier), questionnaire data will approximate an interval scale. At their worst, questionnaires based on scales and descriptors with unknown properties represent qualitative, or categorical scales.”

Converting Ordinal Data to Interval Data

Ultimately, the goal is for the scale's construction to mimic an interval scale, which provides the greatest analysis flexibility. The handbook cautions, “Questionnaire data, however, still should not be subjected to statistical analyses without careful consideration of the way in which the data are distributed.”

Interval scales indicate the same magnitude of difference between items, but there is no absolute zero point. In other words, the difference between very likely and likely (a 5 and a 4) is the same as between possible and unlikely (a 3 and a 2). Using interval scales is much easier in theory than in practice.

Likert’s Qualitative Assessment Analysis Recommendations

Likert used two primary methods. The Sigma Method was based on classic statistics for central tendency and variance. The Simpler Method, as Likert called it, was based on non-parametric methods and visualization.

Likert’s primary purpose was not to create a new method. Rather, he wanted to compare his approach with the emerging practice of parametric statistics and with Thurstone’s more complex methodology for measuring opinion-based attitudes. Likert verified survey reliability, or repeatability, using the Spearman-Brown methodology, a split-half approach that, in a simple form, compares the results from odd and even data sets.
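The odd-even comparison can be illustrated with a minimal sketch. It assumes each respondent's answers are split into odd-numbered and even-numbered questions, the two half-scores are correlated, and the Spearman-Brown formula then estimates the reliability of the full-length survey. The response data and function names are hypothetical, and this is not a reconstruction of Likert's own calculations.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(responses):
    """Return (half correlation, Spearman-Brown estimate) for rows of item scores."""
    odd_totals = [sum(row[0::2]) for row in responses]   # questions 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in responses]  # questions 2, 4, 6, ...
    r_half = pearson(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)

# Made-up data: five respondents, four questions on a 5-point scale.
responses = [
    [5, 4, 5, 4],
    [4, 4, 3, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [1, 2, 1, 1],
]
r_half, r_full = split_half_reliability(responses)
print(f"half-test correlation: {r_half:.2f}, full-length estimate: {r_full:.2f}")
```

A full-length estimate close to 1 suggests the two halves of the survey are telling the same story.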

The Simpler Method yielded essentially the same results as either the Sigma Method or Thurstone’s method. Likert concluded that opinions and social attitudes are best analyzed by “clustering.” Likert cited the need to “cut through the statistical confusion” that existed even in his time.

Likert’s analysis recommendations are summarized as follows:

Plot the data. Examine for visual inconsistencies.
Report the most representative clustered value (central tendency).
Report the range of responses or, better, the value that most responses were above or below.
Compare the clustered values (medians) between data sets to evaluate reliability (repeatability) and validity.

Modern References

Several modern-era sources of good practices are available. The practices provided here draw on the author's own experience and on sources closely related to the field of reliability engineering, including the Handbook of Human Factors Testing and Evaluation, the Institute for Defense Analysis, and the International Handbook of Survey Methodology.

Plot the frequency of responses in a table or a bar chart to obtain a quick and simple visualization of the results. Identify anomalies like:
1. The tendency for all questions to be skewed in one direction.
2. Bimodal distributions, where there are two different clusters of responses.
Provide a measure of central tendency, either the median or the mode. The mean is not appropriate for ordinal data.
Provide an indicator of range. This is usually the total range but could be based on quartiles.
Provide a measure of variability. The Handbook of Human Factors Testing and Evaluation recommends the 80th percentile. By way of example, a statement on variability may read something like “80 percent of the ratings fell at effective or better.” This is referred to as the 80-50 rule when combined with the median.
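The practices above can be sketched in a few lines of Python. This is a minimal illustration, assuming responses are coded 1 through 5; the scale labels and the ratings are made up, and the 80 percent statement is computed as the highest scale point that at least 80 percent of ratings meet or exceed, which is one reading of the rule.

```python
from collections import Counter
from statistics import median_low, multimode

# Hypothetical 5-point scale labels, lowest (1) to highest (5).
LABELS = ["very ineffective", "ineffective", "borderline", "effective", "very effective"]

def summarize_ordinal(ratings, n_points=5):
    """Summarize ordinal ratings coded 1..n_points without using the mean."""
    n = len(ratings)
    counts = Counter(ratings)
    frequency = {point: counts.get(point, 0) for point in range(1, n_points + 1)}
    # Highest scale point that at least 80 percent of ratings meet or exceed.
    # Integer arithmetic (5 * count >= 4 * n) avoids floating-point edge cases.
    floor_80 = max(
        point for point in range(1, n_points + 1)
        if 5 * sum(r >= point for r in ratings) >= 4 * n
    )
    return {
        "frequency": frequency,
        "median": median_low(ratings),   # always an actual scale point
        "modes": multimode(ratings),     # more than one mode may hint at bimodality
        "range": (min(ratings), max(ratings)),
        "floor_80": floor_80,
    }

ratings = [4, 5, 4, 3, 4, 5, 2, 4, 4, 5]  # made-up responses
summary = summarize_ordinal(ratings)

# Text bar chart of the response frequencies.
for point, count in summary["frequency"].items():
    print(f"{LABELS[point - 1]:<17} {'#' * count}")
print(f"Median: {LABELS[summary['median'] - 1]}")
print(f"80 percent of the ratings fell at {LABELS[summary['floor_80'] - 1]} or better")
```

The low median is used here so that the reported value is always a real point on the scale rather than an average of two adjacent points.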

Potential for Crazy Math

If the Likert scales are constructed as interval scales, and the respondents understand and follow the scaling logic, the data can be analyzed with a wider range of non-parametric and parametric statistics. These include:

Non-parametric tests (differences between the medians of comparable groups)

Mann‐Whitney U test
Wilcoxon signed‐rank test
Kruskal‐Wallis test
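As an illustration of the first test in this list, the sketch below computes the Mann-Whitney U statistic for two independent groups of ratings using only the Python standard library. It assumes tied ratings receive the average of their ranks and uses a normal approximation with a tie correction and no continuity correction, which is rough for small samples. The function names and data are hypothetical; in practice an established statistics package would normally be used instead.

```python
from collections import Counter
from math import sqrt
from statistics import NormalDist

def midranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; assign their average.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of

def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-sided p-value) for two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    n = n1 + n2
    pooled = list(group_a) + list(group_b)
    rank_of = midranks(pooled)
    rank_sum_a = sum(rank_of[v] for v in group_a)
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u = min(u_a, n1 * n2 - u_a)
    # Normal approximation with tie correction (no continuity correction).
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u, 1.0  # every rating identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / sqrt(variance)
    return u, min(1.0, 2 * NormalDist().cdf(z))

# Made-up ratings from two groups on a 5-point scale.
site_a = [4, 5, 4, 3, 5, 4]
site_b = [2, 3, 3, 1, 2, 4]
u_stat, p_value = mann_whitney_u(site_a, site_b)
print(f"U = {u_stat}, approximate p = {p_value:.3f}")
```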

Parametric tests (interval data mimicking ratio/continuous data)

Mean
Standard Deviation
Analysis of Variance (ANOVA)

Summed scores across multiple Likert questions (interval data mimicking ratio/continuous data), which require that:

All questions use the same Likert scale.
There is a defensible approximation to an interval scale.
All items measure a single latent variable (the variable is not measured directly, but the added values are indicators of it).
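A summed score can be sketched as follows. This minimal example only checks the first condition, that every answer falls on the same scale; whether the scale approximates an interval scale and whether the questions share a single latent variable remain judgment calls that code cannot settle. The function name, scale bounds, and data are assumptions for illustration.

```python
from statistics import mean, stdev

def summed_scores(responses, scale_min=1, scale_max=5):
    """Sum each respondent's answers after checking they share one scale."""
    n_items = len(responses[0])
    totals = []
    for row in responses:
        if len(row) != n_items:
            raise ValueError("every respondent must answer the same questions")
        if any(not (scale_min <= value <= scale_max) for value in row):
            raise ValueError("answer falls outside the shared Likert scale")
        totals.append(sum(row))
    return totals

# Made-up data: four respondents, three questions on a 5-point scale.
responses = [
    [4, 5, 4],
    [3, 3, 4],
    [2, 2, 1],
    [5, 4, 5],
]
totals = summed_scores(responses)
print(f"totals: {totals}")
print(f"mean: {mean(totals):.2f}, standard deviation: {stdev(totals):.2f}")
```

The mean and standard deviation are reported here only because the summed scores are being treated as approximately interval data under the conditions listed above.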

Additional Observations

Qualitative assessments are cost-effective and highly flexible tools. However, using Likert-type questionnaires as a principal evaluation test must be accompanied by the appropriate analysis. Keeping it simple and consistent with good practices for ordinal data is the best approach for facilitators and most technically trained professionals.

Facilitating with FINESSE

There are greater opportunities than at any time in our history to leverage different types and combinations of data and utilize this information to improve decision quality. All data potentially leads to knowledge, and knowledge can lead to greater understanding.

The usefulness of any qualitative assessment is a function of design, analysis, and administration. This article explored aspects and good practices of survey analysis.

Are you Facilitating with FINESSE?

References

Rensis Likert, R.S. Woodruff, editor, Archives of Psychology, Columbia University, New York, Volume XXII, Nos. 146-146, 1932-1933, pp.4-43.

Handbook of Human Factors Testing and Evaluation, 2nd edition, edited by S.G. Charlton and T.G. O’Brien, publishers Lawrence Erlbaum Associates, Inc., 2002.

Institute of Defense Analysis, “ICH Q9 Briefing Pack II”, July 2006.

European Association of Methodology (EAM), International Handbook of Survey Methodology, edited by E.D. de Leeuw, J.J. Hox, and D.A. Dillman, 2008.

J.D. Solomon, Daniel Vallero, and Kathryn Benson, “Evaluating Risk: A Revisit of the Scales, Measurement Theory, and Statistical Analysis Controversy,” Proceedings of the 2017 International Reliability and Maintainability Symposium.

JD Solomon Inc provides solutions for facilitation, asset management, and program development at the nexus of facilities, infrastructure, and the environment. Founded by JD Solomon, Communicating with FINESSE is the community of technical professionals dedicated to being highly effective communicators and facilitators.