This is not a rant about either the strengths or limitations of RCTs. If that is what floats your boat, then you are in for a treat – there are a ton of posts and papers out there discussing the nitty-gritty of the validity and efficacy of RCT evidence (see the “randomistas” Duflo, Banerjee, Angrist and Pischke, and Kremer, and for criticisms see Deaton, Cartwright, Pritchett, and Unlearning Economics).
My bugbear is merely with the phrase “Gold Standard”. What does it mean? When I hear “Gold Standard”, I envision something like this:
Who assigned the Gold medal? In most of the literature, “Gold Standard” is put in quotation marks, yet there is never a reference to who handed out this standard. As far as I am aware, there is no ratings agency for evidence, at least in the sphere of social policy, giving out stickers to different research techniques. Or is there an FDA of social policy that decides on hard and fast rules for judging evidence?
I see the appeal of using “Gold Standard” as a classifier. It conveys credibility without going so far as to call RCTs the “best” research method (although the association between gold and winning means that many people will substitute “Best” for “Gold Standard” while reading). But “Gold Standard” erroneously implies consensus, and that there is an authority conferring standards. It also implies that other research methods are inferior and have been given silver or bronze standards. The conclusion often drawn is that RCTs are the “Gold Standard”, whatever the context.
But in trying to think of alternatives to “Gold Standard” –
- “one of the best”
- “a strong competitor”
- “a mighty fine research technique”
- “darn good”
- “exceptional in some contexts and inappropriate in others”
– you soon realise how difficult it is to succinctly describe and apply a standard to research methods.
Also, “Gold Standard” always reminds me of a gold standard exchange system, something that is pretty much unanimously rejected now (except for some nuts Republicans who think a pretty metal is the answer to all our problems). This is not something that I want brought to mind when trying to think about rigorous and reliable evidence.
I will leave you with the advice of Rachel Glennerster, who wrote my favorite book on randomized evaluations and has written on the Generalizability Puzzle of evidence. She has also just announced that she will be leaving J-PAL, where she advised that the training of investigators drop the use of “Gold Standard” as a descriptor of RCTs, to become Chief Economist at DFID. Her approach is to remember that there are multiple strategies for gathering evidence and that it is always best to:
“Choose the tool that fits the question”