Editor Note: This academic essay is available at ResearchGate
Nearly three decades ago, I started working in contact centres, after several years in industrial/labourer roles, and was asked to undertake a series of unfamiliar assessments.
I’d experienced tests to prove my role-based skills – such as interpreting warehouse documents – but I’d never been asked about word preferences or to agree/disagree with statements that felt in no way connected to the role of answering retail service calls.
These assessments are popular: 75-80% of FTSE500 organisations utilise personality tests to predict performance and for talent acquisition. Multiple non-commercial tests are available online, where users can “reveal who you really are” and “find your strengths” (CIPD, 2021; see also Independent, 2019; Salako, 2021; Truity, n.d.).
I was informed that my training would be tailored to my reported preferences. Perhaps this was its intention: rather than being a deciding datum in employment decisions, the assessment might be used to adapt teaching methods (Moyle & Hackston, 2018; Randall et al., 2017).
I remember worrying that I might be giving incorrect answers.
Surely self-reported surveys can be faked? Perhaps I could manipulate answers to reflect the type of candidate I thought they wanted to employ (Carter, 2016; Furnham, 1990; Martin et al., 2002)?
I arrived on time, relaxed and prepared for questions. Would my answers be different if I were delayed in traffic, stressed, caught off-guard, or more nervous in nature?
If I were financially desperate for the job (compared to just browsing), would that change my responses?
If I’d been recommended by another employee and wanted to leave a good impression of them, would my responses reflect positively?
With the popularity of personality testing, might I have had previous experience/practice in completing and faking results – and how would an assessor know?
These challenges go to the heart of personality testing. What are we testing for? Is personality something inherent within us (trait) or a reaction to our environment (state)? Does testing at different times or under different conditions change results? Do different types of tests identify the same personality?
If FTSE500 organisations are utilising these assessments, does that suggest rigour, reliability and validity, or have they been hoodwinked into a myth of what personality tests provide?
The test I completed in 1996 was the proprietary Myers-Briggs Type Indicator® (MBTI®), which groups respondents into one of sixteen “personalities” based on four dichotomies associated with Jungian psychology (Maltby et al., 2022, pp.54-56; see also 16personalities, n.d.; Mattoon & Hinshaw, 2003; Myers & Briggs Foundation, n.d.): ISTJ, ISFJ, INFJ, INTJ, ISTP, ISFP, INFP, INTP, ESTP, ESFP, ENFP, ENTP, ESTJ, ESFJ, ENFJ and ENTJ.
Completed c2-million times each year through a c95-, 144- or 244-question self-reported survey, MBTI® was developed by Isabel Myers and her mother, Katharine Cook Briggs, as a means to simplify C.G. Jung’s work and to understand individual differences – specifically between Isabel and her husband, Clarence – in the belief that such a psychological instrument might prove useful in World War II recruitment. However, development took longer than anticipated and assessments weren’t available until 1962 (CAPT, n.d.; Myers-Briggs Company, n.d.).
Given the prevalence of its organisational usage, we can ask a simple question: Does MBTI® measure what it claims to measure?
MBTI® focuses on patterns of traits, characterising an individual along four dimensions and assigning them to one of sixteen types (Funder, 2007).
A trait is a dimension of personality that categorises people on a spectrum, based on the level to which they manifest that dimension.
A type is a collection of traits, grouped on the basis of observed specific or situational/habitual responses (Burger, 2008; Maltby et al., 2022, p.175-17; Moyle & Hackston, 2018).
MBTI® outputs one of 16 four-letter combinations (see above), based on four dichotomies:
Introversion/Extraversion identifies where an individual gains/loses their energy. Extroverts prefer external stimuli (e.g., socialising, active involvement, music etc.). Introverts find these same stimuli reduce their personal energy, and prefer ideas, pictures, memories and reactions that are internalised (Myers & Briggs Foundation, n.d.).
Despite MBTI® being based on Jung’s work, introversion and extraversion are defined differently in Jungian psychology and are not directly related to energy. For Jung, extraverts were outgoing, candid and quick to form attachments, whereas introverts were defensive, hesitant, reflective and mistrustful (Maltby et al., 2022, p.56).
Sensing/Intuiting is related to processing information. Do we focus on the core information or do we prefer to interpret and add meaning?
Thinking/Feeling considers how we utilise that information in decision-making processes. Do we consider the logic and consistency of situations or do we first look at people and special circumstances?
Judging/Perceiving is the external manner in which we demonstrate that decision-making process. Do we prefer to get things decided or are we open to new information and options (Myers & Briggs Foundation, n.d.)?
Although variations exist, MBTI® statements are generally arranged on a 5- or 7-point Likert scale between “strongly disagree” and “strongly agree”. Alternatives include word association variants, and a recent study analysed published social media content, identifying that “people’s personality traits could be effectively predicted using social media profiles, their use of language, and their [sic] behavioral patterns.” (Li, 2021).
These statements typically ask respondents to rate their agreement with short descriptions of everyday social, information-gathering and decision-making preferences.
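Although the published items and scoring keys are proprietary, a minimal Python sketch can illustrate the general mechanism: Likert responses are summed and a cut-off converts that sum into one of two letters. Everything below – the item wording, the keying and the cut-off – is invented for illustration and is not actual MBTI® content.

```python
# A rough sketch (not actual MBTI® items or scoring rules): aggregating
# Likert responses into a single dichotomy letter. Items, keying and the
# cut-off below are invented for illustration.

# Hypothetical items keyed towards Extraversion (+1) or Introversion (-1).
ITEMS = [
    ("I gain energy from group discussions.", +1),
    ("I prefer to reflect alone before sharing an idea.", -1),
    ("I start conversations with people I have just met.", +1),
]

def score_ei(responses):
    """responses: 1-5 Likert answers (1 = strongly disagree, 5 = strongly
    agree), aligned with ITEMS. Returns a single letter, 'E' or 'I'."""
    total = 0
    for (_, keyed), answer in zip(ITEMS, responses):
        total += keyed * (answer - 3)      # centre the 1-5 scale on its midpoint
    return "E" if total >= 0 else "I"      # one cut-off collapses the whole scale

print(score_ei([5, 2, 4]))  # -> 'E' for this illustrative answer pattern
```

The relevant design point is that a continuous total is collapsed into a binary letter at a single cut-off, a simplification revisited in the validity and reliability discussion below.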
On completing the 1996 self-assessment, my resulting ENTP and its description seemed valid, despite my realisation that the small, simplified number of outcomes was comparable to horoscopes; indeed, some have attempted to link MBTI® to astrological signs (Esteves, 2022).
The “Debater” personality is quick-thinking, knowledgeable and energetic, but can be argumentative, insensitive and favour abstract ideas over getting things done (16personalities, n.d.). The ENTP profile indicates that I broadly prefer Extraversion, Intuition, Thinking and Perceiving.
On review of these preference descriptors, some aspects seem vague. Readers could apply these statements to themselves and, if the respondent is agreeable (a personality dimension not directly expressed in MBTI®[1]), there is a heightened susceptibility to the Barnum or Forer effect: the tendency to believe that generally flattering and sufficiently vague personality statements apply specifically to oneself (Dickson & Kelly, 1985; Poškus, 2014; Shtulman, 2015; VandenBos, 2007).
This effect has been linked with self-reported computer-based tests since the mid-to-late 1980s, when multifactor assessments could first be conducted digitally. However, not all validity issues were due to the Barnum effect; some were caused by subjective interpretations of questions (Guastello et al., 1989).
[1] Though MBTI® and NEO PI-R (Big Five) are not directly correlated at each dichotomy, there are correlations with: Agreeableness/TF; Extraversion/EI; Conscientiousness/JP; Openness/SN; and Neuroticism/EI (though to a smaller magnitude) (Furnham et al., 2003).
Validity is concerned with whether a measure is measuring what it claims (Maltby et al., 2022, p.663). All self-reported tests are reliant on honesty and objectivity, and MBTI® has been criticised for how easily it can be faked. Certain roles – and, in western environments, certain specific MBTI® results – can be considered more favourable, and this social pressure could influence respondents to answer inauthentically (Carter, 2016; McPeek et al., 2011; Moyle & Hackston, 2018). Moyle & Hackston suggest that using questionnaire data as one datum in a suite of components, supported by trained practitioners, could minimise fakeability concerns.
How an assessee interprets questions/statements, or the context in which they’re asked, can impact validity. MBTI® has been translated into multiple languages, but we must be cognisant of bias introduced to assessment approaches through the overwhelming focus on western, educated, industrialised, rich, democratic (WEIRD) cohorts (Schulz et al., 2018; see also Henrich et al., 2010; Lundgren et al., 2019; Muthukrishna et al., 2020).
Sutin et al. (2020) also found that responses to “I try to go to work or school even when I’m not feeling well” changed during the COVID-19 pandemic. Historically, this was an indicator of conscientiousness (linked to J/P in MBTI®), with a favourable score representing a higher level of the trait. Pandemic macroenvironmental and societal pressures changed how the item was interpreted and, as the assessment was not updated, it produced misleading results.
If we consider one of the four dimensions within MBTI®, it would be reasonable to expect the scores on that dimension to form a bimodal distribution, with two distinct local maxima (see below).
Pittenger (2005; see also Bess & Harvey, 2002) notes the conspicuous absence of this pattern and opines that the high frequency of mid-point scores produces a single continuous distribution across the dimension, with no evidence that an extrovert type is qualitatively different from an introvert type.
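A short simulation on assumed (randomly generated) data makes the point concrete: even when scores come from a single unimodal distribution, imposing a cut-off still sorts everyone into two “types”, with a sizeable share of respondents sitting close enough to the cut-off that trivially different answers would change their letter.

```python
# Simulated illustration of Pittenger's critique: scores drawn from ONE
# continuous, unimodal distribution still get labelled 'E' or 'I' once a
# cut-off is imposed, and many respondents sit right next to that cut-off.
import random

random.seed(1)
scores = [random.gauss(0, 1) for _ in range(10_000)]   # a single bell curve

types = ["E" if s >= 0 else "I" for s in scores]
near_cutoff = sum(abs(s) < 0.25 for s in scores) / len(scores)

print(f"'E' share: {types.count('E') / len(types):.2%}")        # roughly half
print(f"within 0.25 SD of the cut-off: {near_cutoff:.2%}")      # roughly a fifth
# No second peak is needed to produce two "types"; the dichotomy is an
# artefact of the cut-off, not evidence of qualitatively distinct groups.
```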
Reliability is concerned with the consistency of measurement at different points in time (Howitt & Cramer, 2014, p.306). MBTI® has received criticism for its test-retest reliability: Pittenger (2005) argued that people assessed twice often receive different type results, despite personality traits being considered stable and cross-situationally consistent (Banicki, 2017).
I have completed MBTI® assessments on 20+ occasions, always reporting ENTP. As my career includes the use of MBTI® assessments, it might be that I understand the dimension each question is referencing, and am being disingenuous in my responses or perhaps interpreting instructions differently (Kubinger, 2002; Mahar et al., 1995; Martin et al., 2002).
Even without this insider view, respondents can answer questions in a way that favours themselves, endorsing desirable and rejecting undesirable traits (Monaro et al., 2021). It can be argued that western societies tend to prefer extroverted employees, that social pressures might encourage retest fakery, and that respondents may view the descriptors as deeper and more meaningful than is objectively accurate (Caldwell & Burger, 1997; Stein & Swan, 2018).
A second consideration in reliability is standard deviation: the extent to which dataset values deviate from the mean. If deviation is large and retest reliability is low, differences between individual scores become less meaningful unless they, too, are large (APA Dictionary, n.d.; Pittenger, 1993). Pittenger (2005) later reported retest correlations that fall short of expectations, with a low of r(38) = .48 for the TF scale and r(38) = .73 for the EI scale when retested over 14 months.
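For readers less familiar with these statistics, the sketch below uses simulated data to show how a test-retest correlation (r) is calculated and why even a moderately high r on a continuous scale can still flip the reported letter for respondents near the cut-off; the sample size and noise level are assumptions chosen for illustration, not Pittenger’s figures.

```python
# Simulated illustration (not a re-analysis of Pittenger's data): how a
# test-retest correlation is computed, and how measurement noise around a
# cut-off flips the reported type letter for a non-trivial share of people.
import random
from statistics import correlation  # Pearson's r, Python 3.10+

random.seed(2)
true_scores = [random.gauss(0, 1) for _ in range(1_000)]
test = [t + random.gauss(0, 0.5) for t in true_scores]     # assumed noise level
retest = [t + random.gauss(0, 0.5) for t in true_scores]

r = correlation(test, retest)
flipped = sum((a >= 0) != (b >= 0) for a, b in zip(test, retest)) / len(test)

print(f"test-retest r = {r:.2f}")                       # around .8 with this noise
print(f"type letter changed on retest: {flipped:.1%}")  # roughly one in five
```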
Rather than measuring individual traits of personality on a spectrum or scale, MBTI® claims to group multiple traits together into a more general type. This simplification means any output is also grouped and simplified, losing the nuance within each trait’s spectrum.
Using the Myers-Briggs Type Indicator as our test case, our initial question of “how accurate are personality tests?” remains unanswered. Considering the validity and reliability evidence, MBTI® theory doesn’t represent a robust or suitable framework for studying personality (Stein & Swan, 2019). The Myers-Briggs organisation also states that the tool isn’t intended as a personality assessment or to predict performance, but for self-awareness and the awareness of others (Hayes, 2014). It’s easily faked, has low retest reliability and its validity is questionable.
MBTI® isn’t a personality model. But, if its limitations are appreciated, it can be both harmless and potentially useful as a heuristic for classifying people’s general tendencies. However, many organisations utilise MBTI® as if it were objective, and use its findings in HR decisions around placement and development. This is the core challenge with MBTI®: it’s a tool with its own mythology, which many organisations interpret as both ideology and science – claiming to measure something it does not, while being utilised to inform talent attraction (in both role-advert creation and candidate selection), retention, and development strategies (Burnett, 2013; CIPD, n.d.; Essig, 2014; Mahar et al., 2006).
The insight provided by MBTI® can prove useful in identifying an individual’s preferences. One can appreciate that Sensors’ natural preference is for structure, facts and truths. When assembling flat-packed furniture, they may arrange the pieces in order, check the box contents and read the instructions. Conversely, Intuitors might picture the finished product, then work towards that abstract goal. Neither is the best or worst way to achieve the task. However, Sensors might look too structured, rigid and inflexible in their approach to Intuitors; Intuitors may appear laissez-faire and disorganised to Sensors.
Therefore, the value in the MBTI® approach lies not in the 16 “personalities” that are grouped post-assessment, but in understanding each of the continua and how they relate to one another.
Does MBTI® measure personality accurately? No. Does it still have a place in today’s organisations? Yes – provided it’s not used in isolation, as the deciding factor in decision making, or as a measurement of personality.