Can thematic analysis be used in quantitative research?

Yes. While thematic analysis is rooted in qualitative methodology, researchers regularly apply it within quantitative and mixed-methods designs. This guide covers how, when, and why you might quantify themes, along with tools that make it practical at scale.

The short answer is yes, with important caveats. Thematic analysis is primarily a qualitative method. It was developed to identify, organize, and interpret patterns of meaning within qualitative data such as interview transcripts, open-ended survey responses, and field notes. But researchers across disciplines have found productive ways to bring thematic analysis into quantitative and mixed-methods frameworks. The key is understanding what changes when you move themes from the qualitative world into a quantitative one, and what gets lost in the translation.

When researchers talk about using thematic analysis in quantitative research, they usually mean one of three things. First, they may be counting theme frequencies after qualitative coding to generate numerical data. Second, they may be using thematic analysis as the qualitative component of a mixed-methods study. Third, they may be applying structured content analysis techniques that overlap with thematic analysis but produce quantifiable outputs from the start. Each of these approaches has different strengths, and each requires a different level of methodological justification.

Understanding thematic analysis in research

Thematic analysis, as defined by Virginia Braun and Victoria Clarke in their influential 2006 paper, is a method for identifying, analyzing, and reporting patterns (themes) within data. It is flexible, accessible, and not tied to any single theoretical framework. This flexibility is one of the reasons it has become one of the most widely used qualitative methods across psychology, education, health sciences, business research, and the social sciences more broadly.

Braun and Clarke outlined a six-phase process: familiarizing yourself with the data, generating initial codes, searching for themes, reviewing themes, defining and naming themes, and producing the report. This process is iterative rather than linear. Researchers move back and forth between phases as their understanding deepens. The method emphasizes the researcher's active role in constructing themes from the data, not simply discovering them as if they were pre-existing objects waiting to be found.

The qualitative foundation of thematic analysis

What makes thematic analysis inherently qualitative is its focus on meaning. A theme captures something important about the data in relation to the research question. It represents a pattern of shared meaning, organized around a central concept. Braun and Clarke have been explicit that themes are not determined by frequency alone. A theme that appears in only a small number of data items can still be analytically significant if it captures something important about the phenomenon under study.

This matters because the most common way researchers try to "quantify" thematic analysis is by counting how often themes occur. While frequency counting can be informative, it does not capture the richness, context, or interpretive depth that makes thematic analysis valuable in the first place. A theme mentioned once by a participant who provides deep, detailed insight may be more analytically important than a theme mentioned briefly by twenty participants. Qualitative thematic analysis preserves that distinction. Purely quantitative approaches risk flattening it.

Braun and Clarke themselves have pushed back against what they call "small q" approaches to qualitative research, where qualitative methods are treated as a preliminary step to generate categories for quantitative analysis. They argue that this misunderstands the epistemological basis of thematic analysis and reduces a rich interpretive method to a simple sorting exercise. Researchers who want to use thematic analysis in quantitative contexts should be aware of this critique and account for it in their methodological justification.

How researchers apply thematic analysis to quantitative data

Despite the qualitative roots of thematic analysis, there are legitimate and well-established ways to use thematic outputs within quantitative research designs. Researchers have been doing this for decades, and the approaches have become more refined over time. The key is being transparent about what you are doing and why, so that reviewers and readers can evaluate your methodological choices.

Frequency counting of themes

The most straightforward approach is to conduct a standard qualitative thematic analysis first, then count how often each theme appears across data items. This produces frequency data that can be reported in tables, bar charts, or descriptive statistics. For example, if you interview 40 participants about their experience with a healthcare program and identify six themes, you might report that Theme A appeared in 32 out of 40 interviews (80%), Theme B in 24 (60%), and so on.
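As a rough illustration of the counting step, here is a minimal Python sketch. The participant IDs, theme names, and the `theme_prevalence` helper are all hypothetical, and the sketch assumes each interview has already been coded by a human analyst into a set of themes.

```python
from collections import Counter

# Hypothetical coded data: interview ID -> set of themes the analyst assigned.
coded_interviews = {
    "P01": {"access", "cost"},
    "P02": {"access"},
    "P03": {"cost", "trust"},
    "P04": {"access", "trust"},
    "P05": {"access"},
}

def theme_prevalence(coded):
    """Return {theme: (count, percent)}, counting each interview at most once per theme."""
    n = len(coded)
    counts = Counter(theme for themes in coded.values() for theme in themes)
    return {theme: (count, round(100 * count / n, 1))
            for theme, count in counts.most_common()}

prevalence = theme_prevalence(coded_interviews)
for theme, (count, pct) in prevalence.items():
    print(f"{theme}: {count}/{len(coded_interviews)} interviews ({pct}%)")
```

Storing themes as a set per interview means a theme raised five times in one interview still counts once, which matches the "appeared in 32 out of 40 interviews" style of reporting.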

This approach is common in health services research, education, and organizational studies where stakeholders want to understand not just what themes exist but how prevalent they are. It works well when you have a moderately large qualitative sample (typically 30 or more participants) and want to give readers a sense of the relative importance of different themes across the dataset.

The limitation is that prevalence does not equal importance. A theme that appears in every interview may be relatively surface-level, while a theme that appears in only a few interviews may represent a critical insight. Good researchers report frequencies alongside qualitative descriptions that preserve the depth and context of each theme.

Code occurrence matrices

A code occurrence matrix (sometimes called a code-by-document matrix or a cross-tabulation matrix) is a structured table that maps which codes and themes appear in which data items. Rows represent data sources (individual interviews, documents, or cases), and columns represent codes or themes. Cells contain binary values (present/absent) or frequency counts.
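One way to picture this structure is the following Python sketch, which builds a binary matrix from coded data. The source names, theme labels, and `build_occurrence_matrix` function are illustrative assumptions, not the output format of any particular software package.

```python
def build_occurrence_matrix(coded_sources, themes):
    """Return {source: [0/1 per theme]} with columns in the order given by `themes`."""
    return {
        source: [1 if theme in assigned else 0 for theme in themes]
        for source, assigned in coded_sources.items()
    }

# Hypothetical coding results: data source -> themes assigned by the analyst.
coded_sources = {
    "interview_01": {"workload", "autonomy"},
    "interview_02": {"recognition"},
    "interview_03": {"workload", "recognition"},
}
themes = ["workload", "autonomy", "recognition"]

matrix = build_occurrence_matrix(coded_sources, themes)
# Column totals give the number of sources in which each theme is present.
column_totals = [sum(row[i] for row in matrix.values()) for i in range(len(themes))]

print("source".ljust(14), *themes)
for source, row in matrix.items():
    print(source.ljust(14), *row)
```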

This matrix format allows researchers to apply quantitative techniques to qualitative coding outputs. You can calculate inter-rater reliability statistics (such as Cohen's kappa) to assess coding consistency between multiple coders. You can identify patterns of co-occurrence, where certain themes tend to appear together in the same data items. You can compare theme distributions across participant subgroups using statistical tests.
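The co-occurrence idea can be sketched in a few lines of Python. This is a simplified illustration that counts how many data items contain each pair of themes. The theme names and the `theme_cooccurrence` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

def theme_cooccurrence(coded):
    """Count, for every pair of themes, the number of data items containing both."""
    pairs = Counter()
    for themes in coded.values():
        # Sorting makes ("cost", "trust") and ("trust", "cost") the same key.
        for pair in combinations(sorted(themes), 2):
            pairs[pair] += 1
    return pairs

# Hypothetical coded data: data item -> themes present.
coded = {
    "I1": {"cost", "trust"},
    "I2": {"cost", "trust", "access"},
    "I3": {"access"},
    "I4": {"cost", "access"},
}

cooccurrence = theme_cooccurrence(coded)
for (first, second), count in cooccurrence.most_common():
    print(f"{first} + {second}: {count} items")
```

Raw pair counts favor themes that are common overall, so in practice you may want to read them next to each theme's own frequency.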

Code occurrence matrices are particularly useful in team-based research projects where multiple analysts are coding data and need a systematic way to compare and reconcile their work. They also support transparency by providing a clear audit trail of how themes map to raw data.
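When two analysts have coded the same items, Cohen's kappa compares their observed agreement with the agreement expected by chance. The sketch below computes it directly from the standard formula for a single theme. The two coders' decisions are made-up data, and the `cohens_kappa` function is an illustration rather than a replacement for a vetted statistics library.

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' labels on the same items, in the same order."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    labels = set(coder_a) | set(coder_b)
    # Chance agreement: product of each coder's marginal proportion, summed over labels.
    expected = sum((coder_a.count(label) / n) * (coder_b.count(label) / n)
                   for label in labels)
    if expected == 1.0:
        return 1.0  # both coders used one identical label for every item
    return (observed - expected) / (1 - expected)

# Hypothetical decisions for one theme across ten excerpts (1 = present, 0 = absent).
coder_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]

kappa = cohens_kappa(coder_a, coder_b)
print(f"Cohen's kappa: {kappa:.2f}")
```

Here the coders agree on 8 of 10 excerpts, but because half of that agreement would be expected by chance, kappa comes out noticeably lower than the raw 80% agreement.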

Chi-square tests and statistical comparisons on coded categories

Once qualitative codes have been converted into categorical variables, researchers can apply standard statistical tests. Chi-square tests are commonly used to determine whether theme distributions differ significantly across groups. For instance, you might compare whether male and female participants mention a particular theme at significantly different rates, or whether participants from different organizational levels raise different concerns.
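To make the mechanics concrete, here is a small Python sketch of a chi-square test of independence on a 2x2 table (group by theme present or absent). The counts are invented, the `chi_square_2x2` helper is hypothetical, and the sketch applies no continuity correction. It also assumes every row and column total is nonzero and that expected counts are large enough for the chi-square approximation to be reasonable.

```python
import math

def chi_square_2x2(table):
    """table = [[a, b], [c, d]]: rows are groups, columns are theme present / absent."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: the theme appears in 18 of 25 interviews in group A
# and in 9 of 25 interviews in group B.
observed_counts = [[18, 7], [9, 16]]
chi2, p_value = chi_square_2x2(observed_counts)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

For these made-up counts the statistic is about 6.5 with a p-value of about 0.01. With small qualitative samples, an exact test may be a safer choice than the chi-square approximation.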

Other statistical approaches include logistic regression (predicting the presence or absence of a theme based on participant characteristics), cluster analysis (grouping participants based on their theme profiles), and correspondence analysis (visualizing relationships between themes and participant categories in a two-dimensional space). These methods treat the outputs of qualitative coding as categorical data and apply quantitative tools to identify patterns that might not be visible through qualitative interpretation alone.
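As one possible starting point for grouping participants by theme profile, the sketch below computes pairwise Jaccard distances between participants' theme sets. Jaccard distance is only one of several reasonable choices for binary data and is an assumption here, as are the participant IDs and theme names. The resulting distances could then be passed to whichever clustering procedure you prefer.

```python
from itertools import combinations

def jaccard_distance(profile_a, profile_b):
    """0.0 means identical theme sets, 1.0 means no themes in common."""
    union = profile_a | profile_b
    if not union:
        return 0.0  # two empty profiles are treated as identical
    return 1 - len(profile_a & profile_b) / len(union)

# Hypothetical theme profiles: participant -> themes coded in their interview.
profiles = {
    "P01": {"cost", "access"},
    "P02": {"cost", "access", "trust"},
    "P03": {"stigma"},
}

distances = {
    (a, b): jaccard_distance(profiles[a], profiles[b])
    for a, b in combinations(sorted(profiles), 2)
}
for (a, b), distance in distances.items():
    print(f"{a} vs {b}: {distance:.2f}")
```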

Content analysis overlap

Thematic analysis and quantitative content analysis share significant methodological overlap, which is one reason the boundary between "qualitative thematic analysis" and "quantitative content analysis" can feel blurry. Quantitative content analysis, developed by researchers like Klaus Krippendorff and Kimberly Neuendorf, is explicitly designed to produce numerical data from textual sources. It emphasizes coding reliability, systematic sampling, and statistical analysis of code frequencies.

Some researchers use thematic analysis to develop their coding framework (identifying the categories and themes that matter) and then switch to a content analysis approach for systematic coding and quantification. This hybrid approach combines the interpretive depth of thematic analysis with the rigor and replicability of content analysis. It is especially common in media studies, communication research, and policy analysis.

Mixed-methods approaches to thematic analysis

Mixed-methods research explicitly combines qualitative and quantitative data collection and analysis within a single study or program of research. Thematic analysis frequently serves as the qualitative component in mixed-methods designs, and these designs provide the most methodologically robust framework for bringing qualitative themes into contact with quantitative data.

Sequential explanatory designs

In a sequential explanatory design, researchers collect and analyze quantitative data first, then use qualitative data (analyzed through thematic analysis) to explain or elaborate on the quantitative findings. For example, a survey might reveal that employee satisfaction differs significantly across departments. Follow-up interviews analyzed through thematic analysis could reveal why: perhaps one department has a toxic management style while another benefits from strong peer mentoring. The themes explain the numbers.

Sequential exploratory designs

In a sequential exploratory design, qualitative data comes first. Researchers conduct interviews, analyze them thematically, and use the resulting themes to develop survey instruments, scales, or coding frameworks for a subsequent quantitative phase. This is one of the most common ways thematic analysis feeds into quantitative research. The themes generated in the qualitative phase become the variables measured in the quantitative phase.

For example, a researcher studying barriers to telemedicine adoption might conduct 20 interviews, identify themes like "technology anxiety," "privacy concerns," and "provider relationship quality," and then develop a survey with items measuring each theme. The survey is then administered to a large sample and analyzed quantitatively. The thematic analysis provided the conceptual foundation for the quantitative instrument.

Concurrent designs

In concurrent or convergent designs, qualitative and quantitative data are collected simultaneously and analyzed separately, then the results are compared and integrated. A researcher might administer a structured survey and conduct semi-structured interviews with the same participants. The survey data is analyzed statistically while the interview data is analyzed thematically. Integration happens at the interpretation stage, where the researcher examines whether the quantitative patterns and qualitative themes converge, diverge, or complement each other.

This approach is common in evaluation research, public health, and education. It allows researchers to triangulate findings across methods, strengthening the overall evidence base. When quantitative results and qualitative themes point in the same direction, confidence in the findings increases. When they diverge, the divergence itself becomes a productive finding that requires further investigation.

When quantitative thematic analysis works well

Not every research context benefits from quantifying themes. But there are situations where it adds genuine value and is methodologically defensible.

Large qualitative samples

When you have qualitative data from 50, 100, or more participants, purely qualitative reporting of themes becomes challenging. Readers cannot hold that many individual perspectives in mind. Frequency data helps by providing a high-level map of the thematic landscape. It tells readers which themes are widespread and which are concentrated in particular subgroups. Combined with representative quotes and thick description, frequency data enhances rather than undermines the qualitative analysis.

Pre-existing codebooks and frameworks

When researchers use a deductive or template analysis approach, starting with predefined codes derived from theory or prior research, quantification is more natural. The codes are already structured as categories, and counting their occurrences across data items is a straightforward extension. This is common in applied research fields like implementation science, where researchers use established frameworks (such as the Consolidated Framework for Implementation Research) as coding templates and then report how often each construct appears across study sites.
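A deductive workflow of this kind might be sketched as follows in Python. The construct labels are an illustrative subset chosen for the example, not a complete or authoritative framework listing, and the site names and `construct_counts_by_site` helper are hypothetical.

```python
from collections import Counter

# Illustrative subset of construct labels; a real study would load its full codebook.
CODEBOOK = {"inner_setting", "outer_setting", "process"}

def construct_counts_by_site(coded_segments, codebook):
    """Count coded segments per construct for each site, rejecting unknown codes."""
    counts = {}
    for site, code in coded_segments:
        if code not in codebook:
            raise ValueError(f"{code!r} is not in the codebook")
        counts.setdefault(site, Counter())[code] += 1
    return counts

# Hypothetical coded segments as (site, construct) pairs.
coded_segments = [
    ("site_a", "inner_setting"),
    ("site_a", "inner_setting"),
    ("site_a", "process"),
    ("site_b", "outer_setting"),
    ("site_b", "inner_setting"),
]

site_counts = construct_counts_by_site(coded_segments, CODEBOOK)
for site, counter in site_counts.items():
    print(site, dict(counter))
```

Rejecting codes that are not in the codebook keeps the analysis strictly deductive. If you allow new codes to emerge, you would relax that check and record the additions.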

Comparative studies across groups

When the research question explicitly involves comparing groups (different organizations, demographic categories, intervention conditions, or time periods), quantifying themes allows for systematic comparison. You can show that certain themes are more prevalent in one group than another, and you can test whether those differences are statistically significant. This supports claims about group-level patterns that would be difficult to make based on qualitative narrative alone.
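Before running any significance test, it usually helps to tabulate theme prevalence per group. A minimal Python sketch, using made-up records and a hypothetical `prevalence_by_group` helper, could look like this:

```python
from collections import defaultdict

def prevalence_by_group(records, theme):
    """Return {group: share of that group's data items in which `theme` is present}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, themes in records:
        totals[group] += 1
        if theme in themes:
            hits[group] += 1
    return {group: hits[group] / totals[group] for group in totals}

# Hypothetical records: (participant group, themes coded in that interview).
records = [
    ("urban", {"access", "cost"}),
    ("urban", {"cost"}),
    ("urban", {"access"}),
    ("urban", {"access", "trust"}),
    ("rural", {"access"}),
    ("rural", {"trust"}),
    ("rural", {"cost"}),
    ("rural", {"cost", "trust"}),
]

access_rates = prevalence_by_group(records, "access")
for group, rate in access_rates.items():
    print(f"{group}: {rate:.0%} of interviews mention 'access'")
```

Proportions are used rather than raw counts so that groups of different sizes can be compared directly.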

Stakeholder audiences that expect numbers

In applied research contexts like program evaluation, policy analysis, and market research, audiences often expect quantitative evidence. Funding agencies, policymakers, and organizational leaders may find thematic frequencies more actionable than purely narrative reports. Quantifying themes can make qualitative findings more accessible and persuasive to these audiences without sacrificing the underlying interpretive work.

Limitations of quantifying themes

Researchers should be honest about what is lost when themes become numbers. The limitations are real, and pretending otherwise weakens the methodological foundation of any study.

Loss of context and nuance

When a theme is reduced to a count, the richness of the data behind it disappears. A frequency table showing that "work-life balance" appeared in 75% of interviews tells you nothing about the specific struggles participants described, the emotional weight of their accounts, or the contradictions within their experiences. The theme label becomes a container that obscures the complexity it was originally designed to capture.

Good mixed-methods researchers mitigate this by presenting frequency data alongside qualitative descriptions, extended quotes, and case examples. The numbers provide an overview; the qualitative material provides the depth. Presenting numbers alone is a methodological mistake.

Researcher interpretation challenges

Thematic analysis involves interpretation at every stage. Two researchers analyzing the same dataset may identify different themes, depending on their theoretical orientation, research question, and analytic lens. This is not a weakness of qualitative research. It is a feature. But it becomes a problem when you treat themes as objective categories and count them as if they were fixed entities.

Quantifying themes can create a false sense of precision. A number like "Theme X appeared in 63% of interviews" implies that Theme X is a stable, unambiguous category that different coders would identify consistently. In practice, the boundaries of themes are often fuzzy, and reasonable analysts might code the same passage differently. Reporting inter-rater reliability statistics helps, but it does not fully resolve the interpretive nature of qualitative coding.
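One way to see how much uncertainty sits behind a figure like 63% is to put an interval around it. The Python sketch below computes a Wilson score interval for a proportion. The 19-of-30 example and the `wilson_interval` helper are assumptions for illustration. The interval treats the interviews as if they were a random sample, which purposive qualitative samples often are not, and it reflects sampling uncertainty only, not disagreement about how a passage should be coded.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width

# Hypothetical result: Theme X coded in 19 of 30 interviews (about 63%).
low, high = wilson_interval(19, 30)
print(f"Observed {19 / 30:.0%}; interval from {low:.1%} to {high:.1%}")
```

With only 30 interviews, the interval runs from the mid-40s to the high 70s in percentage terms, which is a useful reminder that a single reported percentage can look more exact than the data support.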

Methodological critiques

Some qualitative researchers argue that quantifying themes fundamentally misrepresents what thematic analysis does. Braun and Clarke have warned against treating thematic analysis as a "coding reliability" approach, where the goal is to produce consistent, countable codes rather than interpretive themes. They distinguish between "Big Q" qualitative research (which embraces subjectivity and interpretation) and "small q" approaches (which try to apply quantitative logic to qualitative data).

Researchers who quantify themes should be prepared to defend their methodological choices. This means being explicit about your epistemological position, explaining why quantification serves your research question, and acknowledging the limitations of treating interpretive themes as categorical variables. A strong methods section addresses these concerns directly rather than ignoring them.

Tools for thematic analysis in 2026

The practical challenge of thematic analysis, whether qualitative or quantitative, often comes down to the sheer volume of work involved. Coding transcripts manually is time-intensive. Building code occurrence matrices by hand is tedious. Comparing theme distributions across subgroups requires careful data management. This is where software tools become essential, and where AI-assisted analysis has changed what is possible.

Traditional qualitative data analysis software like NVivo and ATLAS.ti provides robust coding and retrieval features but requires significant manual effort. Researchers still need to read every transcript, assign every code, and review every theme. For small studies with 10 or 15 interviews, this is manageable. For larger datasets with 50 or more data items, the manual workload can become a bottleneck that extends timelines by weeks or months.

AI-powered tools have introduced new possibilities for researchers conducting thematic analysis at scale. Speak is one platform that combines transcription, qualitative coding, and AI-assisted analysis in a single environment. For researchers working with interview recordings or focus group audio, Speak handles transcription with speaker labels and then supports thematic coding directly within the platform. AI Chat allows researchers to query their coded data across multiple transcripts, asking questions like "What did participants over 40 say about technology adoption?" or "Compare themes between the intervention and control groups."

For researchers who need both qualitative depth and quantitative outputs, AI-assisted coding can accelerate the process of building code occurrence matrices, calculating theme frequencies, and identifying patterns across large datasets. The researcher still drives the interpretive work, deciding what counts as a theme and what it means. But the mechanical steps of coding, counting, and organizing become significantly faster.

Tools like Speak also support qualitative coding workflows where multiple team members can code the same data, compare their coding, and calculate inter-rater reliability. This is essential for studies that aim to quantify themes, since the credibility of frequency data depends on coding consistency. AI agents can further automate repetitive analysis tasks, letting researchers focus on the interpretive decisions that require human judgment.

How Speak helps with thematic analysis

For researchers conducting thematic analysis across large datasets, AI tools can accelerate the coding process while preserving the interpretive depth that makes thematic analysis valuable. Here is how Speak supports both qualitative and quantitative thematic workflows.

AI-assisted qualitative coding

Upload transcripts and use AI to suggest initial codes based on your data. Review, refine, and organize codes into themes without starting from a blank page. The AI handles pattern detection across large volumes while you maintain control over what constitutes a meaningful theme.

Cross-study theme detection

When you have dozens or hundreds of transcripts, identifying themes that span the full dataset manually takes weeks. Speak surfaces patterns across your entire data library, helping you spot recurring themes, outlier perspectives, and connections between subgroups that you might miss in transcript-by-transcript reading.

Automated transcription for interviews

Every thematic analysis project starts with text. Speak transcribes audio and video recordings with speaker labels and high accuracy, turning your interview recordings into analysis-ready transcripts. Multiple transcription engines let you choose the best option for your language and audio quality.

Sentiment analysis alongside themes

Go beyond coding what participants said and capture how they said it. Speak's NLP analytics detect sentiment, emotional tone, and emphasis within your data. Layer sentiment data on top of your thematic coding to understand which themes carry positive, negative, or mixed associations across your sample.

AI Chat for exploring coded data

Ask natural language questions across your coded dataset. "What reasons did participants give for leaving the program?" or "Compare how urban and rural participants described access barriers." AI Chat queries your full data library using Claude, Gemini, or GPT models, so you can explore your themes interactively.

Export for publications and reports

Export transcripts, coded segments, theme summaries, and frequency tables in formats ready for academic publications, evaluation reports, or stakeholder presentations. Speak supports Word, CSV, PDF, and SRT exports so your thematic analysis outputs integrate smoothly into your reporting workflow.

Researchers trust Speak for qualitative analysis

★★★★★ 4.9 on G2

"We went from weeks of qualitative analysis to one day. Easy to use, easy to onboard, and the support has been incredible."

Connor H., Data Analyst, G2 review

"Great accuracy, multilingual support, and insightful analysis. The integrations with Google and Zapier make it easy to streamline everything."

Volker B., CEO, G2 review

"I used to spend 30 to 45 minutes transcribing notes. Now it's done in seconds, and I finish my write-up in a few minutes."

Ted H., Business Owner, G2 review

"I use Speak for French and English meetings of up to two hours. It saves time and improves the accuracy of my reports."

Francois L., Financial Advisor, G2 review

"It joins meetings, records, documents, and summarizes. I don't miss important points, and it saves me a huge amount of time."

Ercan T., Business Development, G2 review

"It's easy to use, and I can actually connect with the team behind the product. It's valuable to talk to a real person."

Markus B., Medical Director, G2 review

Frequently asked questions

Common questions about using thematic analysis in quantitative and mixed-methods research.

Can thematic analysis be used in quantitative research?

Yes. While thematic analysis is fundamentally a qualitative method, researchers can quantify thematic outputs by counting theme frequencies, building code occurrence matrices, and applying statistical tests to coded categories. It is also commonly used as the qualitative component in mixed-methods research designs. The key is being transparent about how and why you are quantifying themes, and acknowledging the limitations of converting interpretive themes into numerical data.

What is the difference between qualitative and quantitative thematic analysis?

Qualitative thematic analysis focuses on identifying and interpreting patterns of meaning within data. The emphasis is on depth, context, and the researcher's interpretive engagement with the material. Quantitative approaches to thematic analysis involve counting how often themes occur, comparing theme distributions across groups, and applying statistical tests to coded categories. Qualitative analysis asks "what does this mean?" while quantitative analysis asks "how often does this occur and does the frequency vary across groups?"

How do you quantify themes in thematic analysis?

The most common approach is to conduct a qualitative thematic analysis first, then count how many data items (interviews, responses, documents) contain each theme. You can report these as raw counts, percentages, or proportions. More advanced approaches include building code occurrence matrices, calculating co-occurrence patterns between themes, and applying chi-square tests or logistic regression to test whether theme distributions differ across participant groups.

What is a code occurrence matrix?

A code occurrence matrix is a structured table where rows represent data sources (such as individual interviews or documents) and columns represent codes or themes. Each cell indicates whether a particular code is present in a particular data source, either as a binary value (present or absent) or a frequency count. This matrix format enables quantitative analysis of qualitative coding, including inter-rater reliability calculations, co-occurrence analysis, and statistical comparisons across participant subgroups.

Can AI help with thematic analysis?

Yes. AI tools can assist with several stages of thematic analysis, including transcription, initial code suggestion, pattern detection across large datasets, and data querying. Platforms like Speak use AI to help researchers identify potential codes and themes more quickly, then allow the researcher to review, refine, and finalize the coding. AI is especially valuable for large datasets where manual coding would take weeks. The researcher still controls the interpretive decisions, but AI handles much of the mechanical work.

What software supports thematic analysis?

Traditional qualitative data analysis tools include NVivo, ATLAS.ti, and MAXQDA. These provide robust coding, retrieval, and visualization features. Newer AI-powered platforms like Speak combine transcription, coding, sentiment analysis, and AI-assisted querying in one environment. Speak is particularly suited for researchers who need to move between qualitative coding and quantitative outputs, since it supports both theme identification and frequency analysis across large datasets.

Is thematic analysis a valid method for mixed-methods research?

Absolutely. Thematic analysis is one of the most commonly used qualitative methods in mixed-methods designs. It works well in sequential exploratory designs (where qualitative themes inform a subsequent quantitative phase), sequential explanatory designs (where qualitative analysis explains quantitative findings), and concurrent designs (where qualitative and quantitative data are collected simultaneously and integrated during interpretation). Its flexibility and accessibility make it a natural fit for mixed-methods work.

How does Speak handle thematic analysis?

Speak provides an integrated environment for the full thematic analysis workflow. Start by uploading audio or video recordings, which Speak transcribes with speaker labels. Code your transcripts directly within the platform using manual coding or AI-assisted code suggestions. Use AI Chat to query your data across multiple transcripts, asking questions about specific themes, participant subgroups, or patterns in your data. Export coded segments, frequency tables, and theme summaries for your publications or reports. Speak supports both qualitative depth and quantitative outputs in a single platform.

Start your thematic analysis with Speak

Upload your interviews, code your data with AI assistance, and build the evidence base your research needs. Transcription, qualitative coding, sentiment analysis, and AI Chat included in every plan.

Get started on your own

Create a free account, upload your first recordings, and start coding themes with AI assistance. Get transcripts, qualitative coding tools, and AI Chat during your 7-day trial.

Work with our team

Running a large-scale qualitative study? We help research teams set up coding workflows, configure analysis pipelines, and build custom reporting. Book a consult to get started.