Citations of:

Methodology in Practice: Statistical Misspecification Testing

Philosophy of Science 71 (5):1007-1025 (2004)

Severe testing as a basic concept in a Neyman–Pearson philosophy of induction. Deborah G. Mayo & Aris Spanos - 2006 - British Journal for the Philosophy of Science 57 (2):323-357. Despite the widespread use of key concepts of the Neyman–Pearson (N–P) statistical paradigm—type I and II errors, significance levels, power, confidence levels—they have been the subject of philosophical controversy and debate for over 60 years. Both current and long-standing problems of N–P tests stem from unclarity and confusion, even among N–P adherents, as to how a test's (pre-data) error probabilities are to be used for (post-data) inductive inference as opposed to inductive behavior. We argue that the relevance of error probabilities (...) 69 citations
Conceptual challenges for interpretable machine learning. David S. Watson - 2022 - Synthese 200 (2):1-33. As machine learning has gradually entered into ever more sectors of public and private life, there has been a growing demand for algorithmic explainability. How can we make the predictions of complex statistical models more intelligible to end users? A subdiscipline of computer science known as interpretable machine learning (IML) has emerged to address this urgent question. Numerous influential methods have been proposed, from local linear approximations to rule lists and counterfactuals. In this article, I highlight three conceptual challenges that (...) 11 citations
Detection of unfaithfulness and robust causal inference. Jiji Zhang & Peter Spirtes - 2008 - Minds and Machines 18 (2):239-271. Much of the recent work on the epistemology of causation has centered on two assumptions, known as the Causal Markov Condition and the Causal Faithfulness Condition. Philosophical discussions of the latter condition have exhibited situations in which it is likely to fail. This paper studies the Causal Faithfulness Condition as a conjunction of weaker conditions. We show that some of the weaker conjuncts can be empirically tested, and hence do not have to be assumed a priori. Our results lead to (...) 33 citations
Computer simulation through an error-statistical lens. Wendy S. Parker - 2008 - Synthese 163 (3):371-384. After showing how Deborah Mayo’s error-statistical philosophy of science might be applied to address important questions about the evidential status of computer simulation results, I argue that an error-statistical perspective offers an interesting new way of thinking about computer simulation models and has the potential to significantly improve the practice of simulation model evaluation. Though intended primarily as a contribution to the epistemology of simulation, the analysis also serves to fill in details of Mayo’s epistemology of experiment. 27 citations
A Categorical Solution to the Grue Paradox. Tatsuya Yoshii & Jun Otsuka - forthcoming - British Journal for the Philosophy of Science. 1 citation
The safe, the sensitive, and the severely tested: a unified account. Georgi Gardiner & Brian Zaharatos - 2022 - Synthese 200 (5):1-33. This essay presents a unified account of safety, sensitivity, and severe testing. S’s belief is safe iff, roughly, S could not easily have falsely believed p, and S’s belief is sensitive iff were p false S would not believe p. These two conditions are typically viewed as rivals but, we argue, they instead play symbiotic roles. Safety and sensitivity are both valuable epistemic conditions, and the relevant alternatives framework provides the scaffolding for their mutually supportive roles. The relevant alternatives condition (...) 3 citations
Who Should Be Afraid of the Jeffreys-Lindley Paradox? Aris Spanos - 2013 - Philosophy of Science 80 (1):73-93. The article revisits the large n problem as it relates to the Jeffreys-Lindley paradox to compare the frequentist, Bayesian, and likelihoodist approaches to inference and evidence. It is argued that what is fallacious is to interpret a rejection of the null hypothesis as providing the same evidence for a particular alternative, irrespective of n; this is an example of the fallacy of rejection. Moreover, the Bayesian and likelihoodist approaches are shown to be susceptible to the fallacy of acceptance. The key difference is that (...) 12 citations
How (not) to measure replication. Samuel C. Fletcher - 2021 - European Journal for Philosophy of Science 11 (2):1-27. The replicability crisis refers to the apparent failures to replicate both important and typical positive experimental claims in psychological science and biomedicine, failures which have gained increasing attention in the past decade. In order to provide evidence that there is a replicability crisis in the first place, scientists have developed various measures of replication that help quantify or “count” whether one study replicates another. In this nontechnical essay, I critically examine five types of replication measures used in the landmark article (...) 3 citations
Some surprising facts about surprising facts. D. Mayo - 2014 - Studies in History and Philosophy of Science Part A 45:79-86. A common intuition about evidence is that if data x have been used to construct a hypothesis H, then x should not be used again in support of H. It is no surprise that x fits H, if H was deliberately constructed to accord with x. The question of when and why we should avoid such “double-counting” continues to be debated in philosophy and statistics. It arises as a prohibition against data mining, hunting for significance, tuning on the signal, and (...) 9 citations
Error statistical modeling and inference: Where methodology meets ontology. Aris Spanos & Deborah G. Mayo - 2015 - Synthese 192 (11):3533-3555. In empirical modeling, an important desideratum for deeming theoretical entities and processes as real is that they be reproducible in a statistical sense. Current-day crises regarding replicability in science intertwine with the question of how statistical methods link data to statistical and substantive theories and models. Different answers to this question have important methodological consequences for inference, which are intertwined with a contrast between the ontological commitments of the two types of models. The key to untangling them is (...) 7 citations
Error-statistical elimination of alternative hypotheses. Kent Staley - 2008 - Synthese 163 (3):397-408. I consider the error-statistical account as both a theory of evidence and as a theory of inference. I seek to show how inferences regarding the truth of hypotheses can be upheld by avoiding a certain kind of alternative hypothesis problem. In addition to the testing of assumptions behind the experimental model, I discuss the role of judgments of implausibility. A benefit of my analysis is that it reveals a continuity in the application of error-statistical assessment to low-level empirical hypotheses and (...) 9 citations
How to discount double-counting when it counts: Some clarifications. Deborah G. Mayo - 2008 - British Journal for the Philosophy of Science 59 (4):857-879. The issues of double-counting, use-constructing, and selection effects have long been the subject of debate in the philosophical as well as statistical literature. I have argued that it is the severity, stringency, or probativeness of the test—or lack of it—that should determine if a double-use of data is admissible. Hitchcock and Sober ([2004]) question whether this 'severity criterion' can perform its intended job. I argue that their criticisms stem from a flawed interpretation of the severity criterion. Taking their criticism as (...) 11 citations
On the Jeffreys-Lindley Paradox. Christian P. Robert - 2014 - Philosophy of Science 81 (2):216-232. This article discusses the dual interpretation of the Jeffreys-Lindley paradox associated with Bayesian posterior probabilities and Bayes factors, both as a differentiation between frequentist and Bayesian statistics and as a pointer to the difficulty of using improper priors while testing. I stress the considerable impact of this paradox on the foundations of both classical and Bayesian statistics. While assessing existing resolutions of the paradox, I focus on a critical viewpoint of the paradox discussed by Spanos in Philosophy of Science. 7 citations
Strategies for securing evidence through model criticism. Kent W. Staley - 2012 - European Journal for Philosophy of Science 2 (1):21-43. Some accounts of evidence regard it as an objective relationship holding between data and hypotheses, perhaps mediated by a testing procedure. Mayo’s error-statistical theory of evidence is an example of such an approach. Such a view leaves open the question of when an epistemic agent is justified in drawing an inference from such data to a hypothesis. Using Mayo’s account as an illustration, I propose a framework for addressing the justification question via a relativized notion, which I designate security, (...) 8 citations
Is frequentist testing vulnerable to the base-rate fallacy? Aris Spanos - 2010 - Philosophy of Science 77 (4):565-583. This article calls into question the charge that frequentist testing is susceptible to the base-rate fallacy. It is argued that the apparent similarity between examples like the Harvard Medical School test and frequentist testing is highly misleading. A closer scrutiny reveals that such examples have none of the basic features of a proper frequentist test, such as legitimate data, hypotheses, test statistics, and sampling distributions. Indeed, the relevant error probabilities are replaced with the false positive/negative rates that constitute deductive calculations (...) 8 citations
Curve Fitting, the Reliability of Inductive Inference, and the Error‐Statistical Approach. Aris Spanos - 2007 - Philosophy of Science 74 (5):1046-1066. The main aim of this paper is to revisit the curve fitting problem using the reliability of inductive inference as a primary criterion for the ‘fittest’ curve. Viewed from this perspective, it is argued that a crucial concern with the current framework for addressing the curve fitting problem is, on the one hand, the undue influence of the mathematical approximation perspective, and on the other, the insufficient attention paid to the statistical modeling aspects of the problem. Using goodness-of-fit as the (...) 8 citations
Internalist and externalist aspects of justification in scientific inquiry. Kent Staley & Aaron Cobb - 2011 - Synthese 182 (3):475-492. While epistemic justification is a central concern for both contemporary epistemology and philosophy of science, debates in contemporary epistemology about the nature of epistemic justification have not been discussed extensively by philosophers of science. As a step toward a coherent account of scientific justification that is informed by, and sheds light on, justificatory practices in the sciences, this paper examines one of these debates—the internalist-externalist debate—from the perspective of objective accounts of scientific evidence. In particular, we focus on Deborah Mayo’s (...) 5 citations
A frequentist interpretation of probability for model-based inductive inference. Aris Spanos - 2013 - Synthese 190 (9):1555-1585. The main objective of the paper is to propose a frequentist interpretation of probability in the context of model-based induction, anchored on the Strong Law of Large Numbers (SLLN) and justifiable on empirical grounds. It is argued that the prevailing views in philosophy of science concerning induction and the frequentist interpretation of probability are unduly influenced by enumerative induction, and the von Mises rendering, both of which are at odds with frequentist model-based induction that dominates current practice. The differences between (...) 3 citations
Philosophical Scrutiny of Evidence of Risks: From Bioethics to Bioevidence. Deborah G. Mayo & Aris Spanos - 2006 - Philosophy of Science 73 (5):803-816. We argue that a responsible analysis of today's evidence-based risk assessments and risk debates in biology demands a critical or metascientific scrutiny of the uncertainties, assumptions, and threats of error along the manifold steps in risk analysis. Without an accompanying methodological critique, neither sensitivity to social and ethical values, nor conceptual clarification alone, suffices. In this view, restricting the invitation for philosophical involvement to those wearing a "bioethicist" label precludes the vitally important role philosophers of science may be able to (...) 4 citations
The discovery of argon: A case for learning from data? Aris Spanos - 2010 - Philosophy of Science 77 (3):359-380. Rayleigh and Ramsay discovered the inert gas argon in the atmospheric air in 1895 using a carefully designed sequence of experiments guided by an informal statistical analysis of the resulting data. The primary objective of this article is to revisit this remarkable historical episode in order to make a case that the error‐statistical perspective can be used to bring out and systematize (not to reconstruct) these scientists' resourceful ways and strategies for detecting and eliminating error, as well as dealing with (...) 3 citations
Evidence and Justification in Groups with Conflicting Background Beliefs. Kent W. Staley - 2010 - Episteme 7 (3):232-247. Some prominent accounts of scientific evidence treat evidence as an unrelativized concept. But whether belief in a hypothesis is justified seems relative to the epistemic situation of the believer. The issue becomes yet more complicated in the context of group epistemic agents, for then one confronts the problem of relativizing to an epistemic situation that may include conflicting beliefs. As a step toward resolution of these difficulties, an ideal of justification is here proposed that incorporates both an unrelativized evidence requirement (...) 3 citations
Science without (parametric) models: the case of bootstrap resampling. Jan Sprenger - 2011 - Synthese 180 (1):65-76. Scientific and statistical inferences build heavily on explicit, parametric models, and often with good reasons. However, the limited scope of parametric models and the increasing complexity of the studied systems in modern science raise the risk of model misspecification. Therefore, I examine alternative, data-based inference techniques, such as bootstrap resampling. I argue that their neglect in the philosophical literature is unjustified: they suit some contexts of inquiry much better and use a more direct approach to scientific inference. Moreover, they make (...) 2 citations
The error statistical philosopher as normative naturalist. Deborah Mayo & Jean Miller - 2008 - Synthese 163 (3):305-314. We argue for a naturalistic account for appraising scientific methods that carries non-trivial normative force. We develop our approach by comparison with Laudan’s (American Philosophical Quarterly 24:19–31, 1987, Philosophy of Science 57:20–33, 1990) “normative naturalism” based on correlating means (various scientific methods) with ends (e.g., reliability). We argue that such a meta-methodology based on means–ends correlations is unreliable and cannot achieve its normative goals. We suggest another approach for meta-methodology based on a conglomeration of tools and strategies (from statistical modeling, (...) 2 citations
Extrapolating from experiments, confidently. Donal Khosrowi - 2023 - European Journal for Philosophy of Science 13 (2):1-28. Extrapolating causal effects from experiments to novel populations is a common practice in evidence-based policy, development economics and other social science areas. Drawing on experimental evidence of policy effectiveness, analysts aim to predict the effects of policies in new populations, which might differ importantly from experimental populations. Existing approaches have made progress in articulating the sorts of similarities one needs to assume to enable such inferences. It is also recognized, however, that many of these assumptions will remain surrounded by significant uncertainty in (...)
Model Verification and the Likelihood Principle. Samuel C. Fletcher - unknown. The likelihood principle is typically understood as a constraint on any measure of evidence arising from a statistical experiment. It is not sufficiently often noted, however, that the LP assumes that the probability model giving rise to a particular concrete data set must be statistically adequate—it must “fit” the data sufficiently. In practice, though, scientists must make modeling assumptions whose adequacy can nevertheless then be verified using statistical tests. My present concern is to consider whether the LP applies to these (...)
Statistical significance and its critics: practicing damaging science, or damaging scientific practice? Deborah G. Mayo & David Hand - 2022 - Synthese 200 (3):1-33. While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science. (...)
Tackling Duhemian Problems: An Alternative to Skepticism of Neuroimaging in Philosophy of Cognitive Science. M. Emrah Aktunç - 2014 - Review of Philosophy and Psychology 5 (4):449-464. Duhem’s problem arises especially in scientific contexts where the tools and procedures of measurement and analysis are numerous and complex. Several philosophers of cognitive science have cited its manifestations in fMRI as grounds for skepticism regarding the epistemic value of neuroimaging. To address these Duhemian arguments for skepticism, I offer an alternative approach based on Deborah Mayo’s error-statistical account in which Duhem's problem is more fruitfully approached in terms of error probabilities. This is illustrated in examples such as the use (...) 1 citation
How experimental algorithmics can benefit from Mayo’s extensions to Neyman–Pearson theory of testing. Thomas Bartz-Beielstein - 2008 - Synthese 163 (3):385-396. Although theoretical results for several algorithms in many application domains were presented during the last decades, not all algorithms can be analyzed fully theoretically. Experimentation is necessary. The analysis of algorithms should follow the same principles and standards of other empirical sciences. This article focuses on stochastic search algorithms, such as evolutionary algorithms or particle swarm optimization. Stochastic search algorithms tackle hard real-world optimization problems, e.g., problems from chemical engineering, airfoil optimization, or bioinformatics, where classical methods from mathematical optimization fail. (...)
Two ways to rule out error: Severity and security. Kent Staley - unknown. I contrast two modes of error-elimination relevant to evaluating evidence in accounts that emphasize frequentist reliability. The contrast corresponds to that between the use of a reliable inference procedure and the critical scrutiny of a procedure with regard to its reliability, in light of what is and is not known about the setting in which the procedure is used. I propose a notion of security as a category of evidential assessment for the latter. In statistical settings, robustness theory and (...)
Severity and Trustworthy Evidence: Foundational Problems versus Misuses of Frequentist Testing. Aris Spanos - 2022 - Philosophy of Science 89 (2):378-397. For model-based frequentist statistics, based on a parametric statistical model $\mathcal{M}_\theta$, the trustworthiness of the ensuing evidence depends crucially on the validity of the probabilistic assumptions comprising $\mathcal{M}_\theta$, the optimality of the inference procedures employed, and the adequateness of the sample size to learn from data by securing –. It is argued that the criticism of the postdata severity evaluation of testing results based on a small n by Rochefort-Maranda is meritless because it conflates [a] misuses (...)
Objective evidence and rules of strategy: Achinstein on method: Peter Achinstein: Evidence and method: Scientific strategies of Isaac Newton and James Clerk Maxwell. Oxford and New York: Oxford University Press, 2013, 177pp, $24.95 HB. William L. Harper, Kent W. Staley, Henk W. de Regt & Peter Achinstein - 2014 - Metascience 23 (3):413-442.
Revisiting Haavelmo's structural econometrics: bridging the gap between theory and data. Aris Spanos - 2015 - Journal of Economic Methodology 22 (2):171-196. The objective of the paper is threefold. First, to argue that some of Haavelmo's methodological ideas and insights have been neglected because they are largely at odds with the traditional perspective that views empirical modeling in economics as an exercise in curve-fitting. Second, to make a case that this neglect has contributed to the unreliability of empirical evidence in economics that is largely due to statistical misspecification. The latter affects the reliability of inference by inducing discrepancies between the actual and (...)
Securing reliable evidence. Kent W. Staley - unknown. Evidence claims depend on fallible assumptions. Three strategies for making true evidence claims in spite of this fallibility are strengthening the support for those assumptions, weakening conclusions, and using multiple independent tests to produce robust evidence. Reliability itself, understood in frequentist terms, does not explain the usefulness of all three strategies; robustness, in particular, sometimes functions in a way that is not well-characterized in terms of reliability. I argue that, in addition to reliability, the security of evidence claims is (...)
Science is judgement, not only calculation: a reply to Aris Spanos’s review of The cult of statistical significance. Stephen T. Ziliak & Deirdre Nansen McCloskey - 2008 - Erasmus Journal for Philosophy and Economics 1 (1):165-170.
Rejoinder error in economics. Towards a more evidence-based methodology, Julian Reiss, Routledge, 2007, XXIV + 246 pages. [REVIEW] Julian Reiss - 2009 - Economics and Philosophy 25 (2):210-215. 1 citation
