Frequency Data Analysis: Chi Square Test Example

Figure: 8 steps to compute the chi-square value.

This article explains frequency data analysis in simple terms for beginners. It presents a clear example of how to analyze frequency data gathered from surveys.

When you are tasked with gathering information about people’s preferences, you often end up with data in the form of frequencies. How can you analyze such data to make it meaningful?

In this article, I’ll walk you through an easy‑to‑understand example and show you the specific statistical test you can use to make sense of it.

How to Analyze Frequency Data in Surveys

Have you ever filled out a survey that asked you questions like:

  • “Are you male or female?”
  • “Which cellphone brand do you use?”

When the answers come back, researchers count how many people chose each option. Those counts are what we call frequency data.

For example, if 150 male students say they prefer Nokia, that’s a frequency. If 340 female students say the same, that’s another frequency.
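To make the idea of "counting answers" concrete, here is a minimal Python sketch of how raw responses become frequencies. The six responses are made up for illustration only; they are not the survey data used later in this article.

```python
from collections import Counter

# Hypothetical raw survey responses: one (gender, brand) pair per respondent.
responses = [
    ("Male", "Nokia"),
    ("Female", "Nokia"),
    ("Female", "Samsung"),
    ("Male", "iPhone"),
    ("Female", "Nokia"),
    ("Male", "Nokia"),
]

# Counter tallies how many times each (gender, brand) pair occurs.
frequencies = Counter(responses)

for (gender, brand), count in sorted(frequencies.items()):
    print(f"{gender:<6} {brand:<8} {count}")
```

Each printed count is one frequency, the same kind of number that fills the cells of the table in the example that follows.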

So, what do we do with that kind of data? And how do we test if there’s any real relationship between the answers?

Why Chi Square Is Perfect for Surveys

To analyze data like our cellphone preference example, statistics offers a test called the chi-square test (χ²). You don’t need to be scared of the name. Think of it as a way to check:

“Is there a meaningful link between these two categorical variables (here, gender and preferred cellphone brand), or is it just random chance?”

It’s like asking:

  • “Does gender affect which cellphone brand students prefer?”
  • “Do voters from different regions tend to support different candidates?”
  • “Do people with different jobs prefer different coffee shops?”

Whenever your research question involves counts (how many people chose what) and categories (male/female, Nokia/Samsung/iPhone) — that’s when chi square steps in.

Organizing Data for Frequency Data Analysis

To clearly present the data we have gathered, it is both common and practical to display the distribution of frequency data in a table. Here’s our case scenario to make the discussion on cellphone preference more specific.

Imagine a store owner wants to know which cellphone brands are popular among students. He surveys 1,000 students and gets this data:

| Gender | Nokia | Samsung | iPhone |
|--------|-------|---------|--------|
| Male   | 150   | 240     | 120    |
| Female | 340   | 100     | 50     |

At first glance, the store owner might think:

  • “Wow, it looks like female students really love Nokia.”

But here’s the thing: our eyes can be fooled by raw numbers.

Maybe there are simply more female students in the survey overall — which could explain the big Nokia count.

That’s why we need chi square. It helps answer the question:

Is gender really linked to cellphone brand preference, or is this just a coincidence?
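A quick way to check the store owner’s first impression is to look at group sizes and at each brand’s share within each gender. The sketch below does that for the survey counts above; the variable names are my own choice for illustration.

```python
# Observed counts from the store owner's survey (rows: gender, columns: brand).
brands = ["Nokia", "Samsung", "iPhone"]
observed = {
    "Male": [150, 240, 120],
    "Female": [340, 100, 50],
}

shares = {}
for gender, counts in observed.items():
    total = sum(counts)
    # Share of each brand within this gender, as a percentage.
    shares[gender] = [round(100 * c / total, 1) for c in counts]
    print(gender, "total:", total, dict(zip(brands, shares[gender])))
```

In this sample the two groups are close in size (510 males and 490 females), yet the within-gender shares look quite different. Percentages alone cannot tell us whether a gap like that could arise by chance, which is the question chi square is designed to answer.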

How to Analyze Frequency Data using Chi Square

The chi square test works by comparing what you actually observed in your survey with what you would expect if there were no relationship between the variables.

First, you tally the actual counts (the observed data) in a table, like how many males prefer Nokia or how many females prefer Samsung. Next, you calculate the expected counts — the numbers you would see if gender and cellphone preference were completely unrelated. Finally, the chi square formula adds up all the differences between what you observed and what you expected, adjusting for the size of each expected count. The larger the difference, the more likely it is that the variables are truly related.

Let’s make it clearer by giving you a step-by-step guide on how to extract meaning from our example.

Step-by-Step Chi Square Guide

Step 1: State the Hypotheses

  • Null Hypothesis (H₀): Gender and cellphone brand preference are independent. (In other words, gender does NOT influence which cellphone brand students prefer.)
  • Alternative Hypothesis (H₁): Gender and cellphone brand preference are associated. (In other words, gender DOES influence cellphone brand preference.)

Step 2: Present the Observed Data (O)

This is the data you collected in your survey:

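| Gender       | Nokia | Samsung | iPhone | Row Total |
|--------------|-------|---------|--------|-----------|
| Male         | 150   | 240     | 120    | 510       |
| Female       | 340   | 100     | 50     | 490       |
| Column Total | 490   | 340     | 170    | 1000      |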
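If you keep the observed counts in a script rather than on paper, the row and column totals can be computed instead of typed in. This is a small sketch, assuming the table is stored as a list of rows.

```python
# Observed counts; columns are Nokia, Samsung, iPhone.
observed = [
    [150, 240, 120],  # Male
    [340, 100, 50],   # Female
]

row_totals = [sum(row) for row in observed]
# zip(*observed) turns rows into columns so each column can be summed.
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

print(row_totals)   # [510, 490]
print(col_totals)   # [490, 340, 170]
print(grand_total)  # 1000
```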

Step 3: Compute the Expected Counts (E)

The expected frequency for each cell assumes there is no relationship between gender and cellphone brand.

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Formula for the expected value in chi-square.

Let’s compute for each cell:

  • Male–Nokia: (510×490)/1000=249.9
  • Male–Samsung: (510×340)/1000=173.4
  • Male–iPhone: (510×170)/1000=86.7
  • Female–Nokia: (490×490)/1000=240.1
  • Female–Samsung: (490×340)/1000=166.6
  • Female–iPhone: (490×170)/1000=83.3
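The six calculations above all follow one pattern, so they can be sketched as a short loop. This is just an illustration of the arithmetic, with the table stored as a list of rows.

```python
# Observed counts; columns are Nokia, Samsung, iPhone.
observed = [
    [150, 240, 120],  # Male
    [340, 100, 50],   # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for a cell = (row total x column total) / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for label, row in zip(["Male", "Female"], expected):
    print(label, [round(e, 1) for e in row])
```

A handy sanity check: the expected counts in each row should add up to the same row total as the observed counts.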

Step 4: Apply the Chi Square Formula

The chi square formula to apply in our example is:

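$$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2} + \cdots + \frac{(O_n - E_n)^2}{E_n}$$
Extended formula of chi-square.

or the simpler chi-square formula

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Chi-square formula in symbols.

Don’t worry about how the formula looks. Computing the values by substitution is simple and easy.

Where O = observed frequency, E = expected frequency, and ∑ is summation (meaning, sum up all the values obtained from the computation (O–E)²/E). In the extended form, the subscripts 1 to n simply number the cells of the table; our example has n = 6 cells.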

Now calculate for each cell:

  • Male–Nokia: (150–249.9)²/249.9 = 39.9
  • Male–Samsung: (240–173.4)²/173.4 = 25.6
  • Male–iPhone: (120–86.7)²/86.7 = 12.8
  • Female–Nokia: (340–240.1)²/240.1 = 41.6
  • Female–Samsung: (100–166.6)²/166.6 = 26.6
  • Female–iPhone: (50–83.3)²/83.3 = 13.3

Add them all up:

χ² = 39.9 + 25.6 + 12.8 + 41.6 + 26.6 + 13.3 = 159.8
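The same sum can be sketched in a few lines of Python. Note that the code keeps full precision in every cell instead of rounding each one first, so its total (about 159.8) may differ slightly in the last digit from a hand calculation.

```python
# Observed counts; columns are Nokia, Samsung, iPhone.
observed = [
    [150, 240, 120],  # Male
    [340, 100, 50],   # Female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
    for o_row, e_row in zip(observed, expected)
]

# The chi-square statistic is the sum of all cell contributions.
chi_square = sum(sum(row) for row in contributions)

for label, row in zip(["Male", "Female"], contributions):
    print(label, [round(x, 1) for x in row])
print("chi-square:", round(chi_square, 1))
```

Printing the per-cell contributions also shows which cells pull the total up the most.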

Step 5: Determine Degrees of Freedom (df)

df=(r–1)(c–1)

Where r = number of rows (2 genders), c = number of columns (3 brands).

df = (2–1)(3–1) = 2
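As a small sketch, the degrees of freedom can be read straight off the shape of the observed table. The function name here is my own; the one thing to watch is that only the category rows and columns are counted, not the totals.

```python
def degrees_of_freedom(table):
    """Return (r - 1)(c - 1) for a table given as a list of rows."""
    rows = len(table)
    cols = len(table[0])
    return (rows - 1) * (cols - 1)

# Observed counts only: no "Row Total" column and no "Column Total" row.
observed = [
    [150, 240, 120],  # Male
    [340, 100, 50],   # Female
]

print(degrees_of_freedom(observed))  # 2
```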

Step 6: Find the p-value

Look up χ² = 159.8 with df = 2 in a chi-square distribution table (a portion of the table appears below), or use software such as SPSS, R, or Excel. The p-value is less than 0.0001, far below the usual 0.05 threshold and even the stricter 0.01 threshold.

Chi-square distribution (critical values)

| df | Critical value (α = 0.05) | Critical value (α = 0.01) |
|----|---------------------------|---------------------------|
| 1  | 3.841                     | 6.635                     |
| 2  | 5.991                     | 9.210                     |
| 3  | 7.815                     | 11.345                    |
| 4  | 9.488                     | 13.277                    |
| 5  | 11.070                    | 15.086                    |

For the chi-square value to be significant, it has to be higher than the critical value in the chi-square distribution table. For df = 2, the critical values are 5.991 at α = 0.05 and 9.210 at α = 0.01.
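For the special case of df = 2, the p-value has a simple closed form: the upper-tail probability of the chi-square distribution is exp(−χ²/2). The sketch below relies on that shortcut, so it applies to df = 2 only; for other degrees of freedom you would normally use statistical software (for example, SciPy’s `scipy.stats.chi2.sf`). The function names are my own.

```python
import math

def chi2_p_value_df2(chi_square):
    """Upper-tail probability P(X > chi_square) for df = 2 only."""
    return math.exp(-chi_square / 2)

def chi2_critical_df2(alpha):
    """Critical value for df = 2 only: the point with upper-tail area alpha."""
    return -2 * math.log(alpha)

chi_square = 159.8  # approximate statistic from Step 4

p_value = chi2_p_value_df2(chi_square)
print("p < 0.0001:", p_value < 0.0001)

# These reproduce the df = 2 row of the critical value table.
print(round(chi2_critical_df2(0.05), 3))
print(round(chi2_critical_df2(0.01), 3))
```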

If you don’t want to go through the nitty-gritty of the computation in this age of AI, you may use an online chi-square test calculator. I laid out the process so that you can fully understand how the numbers were arrived at.

You may also view this video tutorial on how to compute chi-square using Excel.

Step 7: Make a Decision

Since p < 0.05, we reject the null hypothesis.

This means gender and cellphone brand preference are NOT independent — there is a statistically significant association.
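The decision rule can also be sketched with the critical value approach: reject the null hypothesis when the computed chi-square exceeds the critical value for your df and α. The sketch below copies the critical values from the table in Step 6, so it only covers df 1 to 5 and α of 0.05 or 0.01; the function names are assumptions of mine, not a standard library API.

```python
# Critical values copied from the chi-square table in Step 6 (df 1 to 5).
CRITICAL_VALUES = {
    1: {0.05: 3.841, 0.01: 6.635},
    2: {0.05: 5.991, 0.01: 9.210},
    3: {0.05: 7.815, 0.01: 11.345},
    4: {0.05: 9.488, 0.01: 13.277},
    5: {0.05: 11.070, 0.01: 15.086},
}

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over every cell of the observed table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            total += (o - e) ** 2 / e
    return total

def decide(observed, alpha=0.05):
    """Return the statistic, df, critical value, and whether to reject H0."""
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    critical = CRITICAL_VALUES[df][alpha]  # KeyError outside the small table
    chi_square = chi_square_statistic(observed)
    return {
        "chi_square": chi_square,
        "df": df,
        "critical": critical,
        "reject_h0": chi_square > critical,
    }

survey = [
    [150, 240, 120],  # Male: Nokia, Samsung, iPhone
    [340, 100, 50],   # Female: Nokia, Samsung, iPhone
]
print(decide(survey, alpha=0.05))
```

Comparing against the critical value and comparing the p-value against α lead to the same decision; they are two views of the same test.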

Step 8: Explain Why the Result Is Significant

The computed chi-square value (159.8) is extremely large compared to the critical value for df = 2 (which is about 5.99 at α = 0.05).

This means the difference between what we observed (e.g., many females choosing Nokia) and what we expected (if gender didn’t matter) is too big to be due to chance alone.

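In simple terms:
Among the surveyed students, cellphone brand choice is clearly associated with gender. Keep in mind that the test shows an association; on its own, it does not prove that gender causes the preference.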

Conclusion

In summary, analyzing survey results often means working with frequency data, and the Chi-square test is one of the simplest yet most powerful tools for this job. By comparing what you observe in your table with what you would expect if there were no relationship between the variables, Chi-square helps you see whether patterns in the data are meaningful or just random chance. As shown in our example on gender and cellphone preference, the test provides an evidence-based way to draw valid conclusions from categorical data — a skill every researcher should master.


