Around meta-analysis (16): meta-data, metadata, and more meta confusion
Part of COSSEE’s ongoing meta-analysis series, this post focuses on terminology and conceptual clarity.
This post is inspired by Coralie’s recent blog post, “Meta-analysis terminology can be confusing”, in which she untangles a range of commonly used, misused, and confused terms in meta-analysis, such as subgroup analysis, moderator analysis, meta-regression, fixed-effect vs fixed-effects models, and multivariate.
These certainly warrant clarification. But what about the terminology for the underlying data? Could that be just as confusing?
What is “meta-data”?
There are many definitions of meta-data (or metadata), but most describe it as “the information that defines and describes data” (ABS). Since information is also a form of data, meta-data itself can have meta-data, which can have more meta-data, and so on. Conversely, a dataset can include meta-data, which itself may include even deeper layers of meta-data. This creates a kind of conceptual circularity that adds to the confusion, especially in the context of meta-analysis and systematic reviews of all sorts.
Does meta-analysis use meta-data?
Yes, but not always in the way people expect.
It is common to assume that meta-data simply refers to the dataset compiled and analyzed in a meta-analysis, especially since both terms contain the prefix “meta” and deal with data from primary studies. As a result, when researchers are asked to share both their data and meta-data, they often upload only the dataset itself.
However, in this context, meta-data refers specifically to the description of the dataset: a detailed explanation of the variables, their definitions, units, data structure, and so on. But this description may also contain information that can itself be considered meta-data, which adds another layer of confusion.
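To make the distinction concrete, here is a minimal sketch of a dataset next to its meta-data. All variable names, values, and definitions are hypothetical and chosen only for illustration; they do not come from any particular meta-analysis.

```python
# The "data": a hypothetical meta-analytic dataset, one row per effect size.
data = [
    {"study_id": "S01", "yi": 0.42, "vi": 0.031, "temperature": 21.5},
    {"study_id": "S02", "yi": -0.10, "vi": 0.054, "temperature": 18.0},
]

# The "meta-data": a description of every variable in that dataset
# (definition, unit, type). The entries are illustrative assumptions.
metadata = {
    "study_id": {
        "definition": "Identifier of the primary study",
        "unit": None,
        "type": "string",
    },
    "yi": {
        "definition": "Effect size (standardised mean difference)",
        "unit": "unitless",
        "type": "number",
    },
    "vi": {
        "definition": "Sampling variance of the effect size",
        "unit": "unitless",
        "type": "number",
    },
    "temperature": {
        "definition": "Mean experimental temperature reported in the study",
        "unit": "degrees Celsius",
        "type": "number",
    },
}

# Every column in the data has a matching description in the meta-data.
described = set(metadata) == set(data[0])
```

Sharing only `data` would leave readers guessing what `yi`, `vi`, or `temperature` mean. Sharing `metadata` alongside it is what a request for "data and meta-data" is asking for.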

[Diagram: Visualising layers of meta-data in meta-analyses]
What counts as data or meta-data depends on the context. In a primary empirical study, the data might consist of field or lab measurements of things, humans, or systems, typically stored in a spreadsheet, while the meta-data includes descriptions of the variables in that spreadsheet (black parts of the diagram above).
But once a primary study is published or shared, it gains another layer of meta-data: title, abstract, publication date, author names, affiliations, and so on. This is the kind of meta-data librarians and other information specialists work with (green parts of the diagram above).
In a secondary study, such as a meta-analysis or systematic review, you typically compile not only data from primary studies (selected results and their descriptors), but also some of their meta-data (for example, study-level characteristics such as study reference, title, authors, journal, DOI, and so on), and then also generate new data for your synthesis, such as recalculated effect sizes. The resulting dataset is therefore a layered mix of data and meta-data from different sources and levels.
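One way to keep these layers visible is to tag each column of the compiled dataset with the layer it comes from. The sketch below does that; the column names and layer labels are hypothetical assumptions, not a standard vocabulary.

```python
# Hypothetical columns of a compiled meta-analysis dataset,
# each tagged with the layer it originates from.
COLUMN_LAYERS = {
    "doi": "primary-study meta-data",
    "authors": "primary-study meta-data",
    "journal": "primary-study meta-data",
    "mean_treatment": "primary-study data",
    "mean_control": "primary-study data",
    "effect_size": "synthesis data",
    "sampling_variance": "synthesis data",
}


def columns_by_layer(column_layers):
    """Group column names by the layer they were tagged with."""
    grouped = {}
    for column, layer in column_layers.items():
        grouped.setdefault(layer, []).append(column)
    return grouped


layers = columns_by_layer(COLUMN_LAYERS)
for layer, columns in layers.items():
    print(f"{layer}: {', '.join(columns)}")
```

A tag like this can itself be stored as one more field in the description of the dataset, which makes the mix of sources explicit for anyone reusing it.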
What to do in practice
In practice, for a meta-analysis and for other secondary studies, use terminology consistently in the context of your study: call your dataset “data”, and the description of your dataset “meta-data” (the purple/plum parts of the diagram above, not the pink ones). You can still acknowledge that your data contains some meta-data from the underlying primary studies, such as information describing the publications.
Why it matters
This conceptual complexity, together with inconsistent use of terminology, may partly explain why appropriate meta-data is often missing or poorly documented in shared datasets from meta-analyses and various types of systematic reviews. When people are asked to share meta-data but think this means only their dataset (data), they often share the dataset without descriptions of all variables (meta-data).
Without complete and well-structured meta-data (descriptions of data), it becomes difficult to interpret the dataset, let alone reuse it or reproduce the analyses. Transparent and clear meta-data is crucial for making meta-analyses truly open and reusable.
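A simple check before sharing can catch the most common gap: variables in the dataset that have no description. The sketch below is one possible way to do this, assuming the meta-data is kept as a mapping from variable name to a record with a `definition` field. The function name, variable names, and structure are all hypothetical.

```python
def undocumented_variables(column_names, metadata):
    """Return the columns that lack a non-empty definition in the meta-data."""
    missing = []
    for name in column_names:
        entry = metadata.get(name) or {}
        definition = entry.get("definition") or ""
        if not definition.strip():
            missing.append(name)
    return missing


# Hypothetical dataset columns and an incomplete description of them.
columns = ["study_id", "yi", "vi", "temperature"]
metadata = {
    "study_id": {"definition": "Identifier of the primary study"},
    "yi": {"definition": "Effect size (standardised mean difference)"},
    "vi": {"definition": ""},  # present but empty
    # "temperature" is not described at all
}

missing = undocumented_variables(columns, metadata)
print(missing)  # ['vi', 'temperature']
```

An empty result would mean every variable has at least some description. It does not show that the descriptions are accurate or complete, which still needs a human read-through.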
Note
You can find earlier blog posts from my Around meta-analysis series archived on my personal website.
Useful context