A number of people have tried asking questions or leaving comments on my blog post, “More Attacks on Vitamin A.” Due to a software glitch we are currently working out, none of these comments are posting.
While we are working on the software glitch, I would like to use this blog post to answer a couple of questions that have focused on clarifications about the data from the study we discussed in the last blog post.
A number of people have asked me for more specific information about how the risk of colorectal cancer varied with different intakes of vitamin A or with different blood levels of vitamin A. For example:
Keeping vitamin D levels high, how was lower and higher vitamin A intake correlated with risk?
I’ll cut to the chase: we have no idea and cannot tell from this study.
The more detailed explanation follows.
You can see the table here.
Mark Twain and others attributed a phrase to the nineteenth-century British Prime Minister Benjamin Disraeli, “There are lies, damn lies, and statistics.” Statistics never lie, but it’s easy to misuse statistics to support a lie or a weak argument, and we should keep that in mind as we analyze the table.
If we look at the table (you could open it in a new window from the link and continue reading if you would like), the people with high levels of vitamin D are shown on the right. In that column, there are three rows, the first for low intakes of vitamin A and the last for high intakes of vitamin A. Numbers within the cells indicate decreased risk of colorectal cancer if they are less than one and increased risk if they are more than one.
There is a catch, however. The numbers represent the risk of getting colorectal cancer among the 2500 people in the study. There are six billion people in the world. The purpose of this study is to draw a conclusion about how nutrition affects the risk of cancer among everyone, or at least among a large population whose members have characteristics similar to those of the 2500 people in the study.
In any study, there is sampling error. Therefore, the authors performed statistical tests to show us what we can infer from the study sample to the greater population. Listed next to each number there is a set of two numbers in parentheses. We can be 95% confident that the risk in the general population lies somewhere between the two numbers.
For example, in the upper right corner, we see that among people enrolled in the study, those with low intake of retinol and high blood levels of vitamin D had a 27% decreased risk of colorectal cancer compared to people with intermediate levels of both vitamins. We can only infer from this, with 95% confidence, that the decrease in risk among people with the same nutritional characteristics in the general population is between 14% and 38%. If both of the numbers in parentheses are less than one, we can be 95% confident that the decreased risk is found in the general population; if both are larger than one, we can be 95% confident there is an increased risk in the general population; if one number lies below one and the other lies above one, there could be either a decrease, increase, or no difference in risk in the general population. We would say such a result is “not statistically significant.”
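For readers who find it easier to follow a rule when it is written out step by step, here is a minimal Python sketch of how to read one cell of such a table. The function names are my own invention for illustration, and the numbers 0.73, 0.62, and 0.86 are simply back-calculated from the percentages quoted above (a 27% decrease, somewhere between 14% and 38%); they are not copied from the table itself.

```python
def interpret_interval(lower, upper):
    """Classify a 95% confidence interval for a relative risk.

    A relative risk of 1.0 means "no difference in risk", so the
    question is whether the whole interval sits on one side of 1.0.
    (Hypothetical helper, written only to illustrate the rule.)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < 1.0:
        return "decreased risk"
    if lower > 1.0:
        return "increased risk"
    # The interval straddles (or touches) 1.0.
    return "not statistically significant"


def percent_change(relative_risk):
    """Convert a relative risk into a whole-number percent change in risk."""
    return round((relative_risk - 1.0) * 100)


# Illustrative values back-calculated from the text, not read from the table.
estimate, lower, upper = 0.73, 0.62, 0.86

print(percent_change(estimate))          # change in risk within the study sample
print(percent_change(upper), percent_change(lower))  # bounds on that change
print(interpret_interval(lower, upper))  # which of the three cases applies
```

An interval such as 0.80 to 1.20 would fall into the third case, because it leaves room for a decrease, an increase, or no difference at all.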
But there is another catch. When you investigate multiple hypotheses with a single set of statistics, your chance of generating a false result that appears “statistically significant” greatly increases. If we want to actually compare specific cells within the table, and thus determine how people fared at specific levels of vitamin D or at specific intakes of vitamin A, we need to determine precisely how many questions we are going to ask and perform an additional statistical test to adjust our levels of statistical significance. The authors did not perform statistical tests to compare all the different cells to one another, probably because they would need a study of much greater size in order to have the statistical power to generate legitimate answers to those questions.
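To get a feel for how quickly this problem grows, here is a rough Python sketch. It rests on assumptions that are mine rather than the study's: that the table has nine cells (three rows by three columns), that we wanted to compare every cell with every other cell, and that the tests are independent of one another, which real comparisons within one table would not be. The Bonferroni adjustment shown is just one common and conservative way to tighten the threshold; I am not suggesting it is what the authors would have used.

```python
import math


def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across num_tests tests.

    Assumes every test is independent and every null hypothesis is true,
    which is a simplification made only for illustration.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance threshold under the Bonferroni adjustment."""
    return alpha / num_tests


# Hypothetical scenario: a table of 9 cells, each compared with every other.
num_cells = 9
num_comparisons = math.comb(num_cells, 2)  # number of distinct pairs of cells

print(num_comparisons)
print(familywise_error_rate(0.05, num_comparisons))
print(bonferroni_threshold(0.05, num_comparisons))
```

Under these assumptions there are 36 possible comparisons, and the chance of at least one spurious “significant” result is roughly 84% rather than 5%. Demanding a much stricter threshold for each comparison is what drives up the study size needed to detect anything.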
What the authors did perform was a test showing there is a significant “interaction effect” between vitamin D levels and vitamin A intakes. We can conclude that the relationship between vitamin D and colorectal cancer declines as vitamin A intakes increase, but we can’t go further than that. We could, but then our statistics would fall into the category of “lies and damn lies.”
To recap, we can infer from this study that among people with high vitamin A intakes, low vitamin D levels are less likely to be associated with an increase in the risk of cancer and high vitamin D levels are less likely to be associated with a decrease in the risk. We cannot infer that vitamin D levels cause the increase or decrease in risk, nor can we infer that vitamin A intakes cause the association or lack of association between cancer incidence and blood levels of vitamin D. In sum, this study is useful for justifying experiments that can tell us what the effects of vitamins A and D are on the formation and promotion of cancers in the digestive tract, but nothing more than that.
Read more about the author, Chris Masterjohn, PhD, here.
LeonRover says
Congratulations on stating how statistical inference works in research.
It is refreshing to see a correct description of the conclusions to be drawn from a study which in effect adds the phrase “your mileage may differ”.
Christopher Masterjohn says
Thanks Leon!
Chris
Will says
nice work.