Effectiveness of physical activity interventions for improving depression, anxiety and distress: an overview of systematic reviews

Introduction

Mental health disorders are among the leading causes of the global health-related burden, with substantial individual and societal costs.1 2 In 2019, one in eight people (970 million) worldwide were affected by a mental health disorder3 and almost one in two (44%) will experience a mental health disorder in their lifetime.4 The annual global costs of mental health disorders have been estimated at $2.5 trillion (USD), which is projected to increase to $6 trillion (USD) by 2030.5 Depression is the leading cause of mental health-related disease burden,6 while anxiety is the most prevalent mental health disorder.3 Additionally, the COVID-19 pandemic has been associated with increased rates of psychological distress, with prevalence ranging between 35% and 38% worldwide.7–9

The role of lifestyle management approaches, such as exercise, sleep hygiene and a healthy diet, varies between clinical practice guidelines in different countries. In US clinical guidelines,10 psychotherapy or pharmacotherapy is recommended as the initial treatment approach, with lifestyle approaches considered as ‘complementary alternative treatments’ where psychotherapy and pharmacotherapy are ‘ineffective or unacceptable’. In other countries such as Australia, lifestyle management is recommended as the first-line treatment approach,11 12 though in practice, pharmacotherapy is often provided first.

There have been hundreds of research trials examining the effects of physical activity (PA) on depression, anxiety and psychological distress, many of which suggest that PA may have similar effects to psychotherapy and pharmacotherapy (and with numerous advantages over psychotherapy and pharmacotherapy, in terms of cost, side-effects and ancillary health benefits).13–18 Despite the evidence for the benefits of PA, it has not been widely adopted therapeutically. Patient resistance, the difficulty of prescribing and monitoring PA in clinical settings, as well as the huge volume of largely incommensurable studies, have probably impeded a wider take-up in practice.13 14 17

Meta-reviews are systematic reviews of systematic reviews, offering a way of synthesising a vast evidence base. While there have been several meta-reviews of PA for depression, anxiety and psychological distress,17 19–24 they have focused on specific population subgroups, particular conditions (eg, depression only) or on particular forms of PA. We set out to undertake the most comprehensive synthesis to date of evidence regarding the effects of all modes of PA on symptoms of depression, anxiety and psychological distress in adult populations.

Methods

Protocol and registration

The protocol for this systematic umbrella review was prospectively registered on PROSPERO and results are reported according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)25 guidelines.

Selection criteria and search strategy

The population, intervention, comparison, outcomes and study type (PICOS) framework was used to develop the inclusion criteria as follows: population: any adult population (aged ≥18 years); intervention: interventions designed to increase PA. The following definition of PA was used: ‘any bodily movement produced by the contraction of skeletal muscles that results in a substantial increase in caloric requirements over resting energy expenditure’.26 Reviews were eligible irrespective of PA modality, supervision, delivery (eg, in-person or online) or dose (frequency, intensity and duration). Reviews were ineligible if they included any randomised controlled trials (RCTs) of non-PA interventions, if PA was combined with another intervention (eg, diet) or if they evaluated single bouts of acute exercise. Comparator: reviews were eligible if ≥75% of the included RCTs involved either usual care, waitlist, nothing, an equal attention intervention or a lower/lesser PA intervention (eg, a supervised exercise intervention vs printed PA materials). During study selection, it became apparent that the comparator inclusion/exclusion criteria needed elaboration. After careful consideration and discussion, we decided to exclude reviews where >25% of component RCTs compared PA to pharmaceutical interventions or compared two types of equal dose exercise (eg, resistance vs aerobic exercise) without a non-PA comparison, since the inclusion of such reviews would limit our ability to evaluate the effectiveness of PA per se. Outcomes: any self-report or clinician-rated assessment of depression, anxiety or psychological distress symptoms. Study type: systematic reviews with meta-analyses of RCTs only, which included meta-analyses of the outcomes of interest.
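The comparator criterion amounts to a proportion check over a review's component RCTs. The following is a minimal sketch of that check, assuming each RCT has been hand-coded with a single comparator label; the label names and the function `review_meets_comparator_rule` are hypothetical illustrations and are not taken from the review protocol.

```python
# Illustrative only: these label names are assumptions, not the authors' coding scheme.
ELIGIBLE_COMPARATORS = {
    "usual_care",
    "waitlist",
    "nothing",
    "equal_attention",
    "lesser_pa",
}


def review_meets_comparator_rule(comparators, threshold=0.75):
    """Return True if at least `threshold` of a review's RCTs use an eligible comparator."""
    if not comparators:
        return False  # a review with no coded RCTs cannot be assessed
    eligible = sum(1 for label in comparators if label in ELIGIBLE_COMPARATORS)
    return eligible / len(comparators) >= threshold
```

Under this reading, a review with three usual-care trials and one pharmacotherapy-controlled trial sits exactly on the 75% boundary and remains eligible.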

Twelve databases were searched (CINAHL, Cochrane, Embase, MEDLINE, Emcare, ProQuest Health and Medical Complete, ProQuest Nursing and Allied Health Source, PsycINFO, Scopus, SPORTDiscus, EBSCOhost and Web of Science) using subject heading, keyword and Medical Subject Headings (MeSH) term searches for ‘systematic review’, ‘meta-analysis’, ‘physical activity’, ‘exercise’, ‘anxiety’, ‘depression’ and ‘psychological distress’ (see online supplemental eTable 1 for the full search strategy). Database searches were limited to peer-reviewed journal articles published in the English language from inception to 1 January 2022.

Data management and extraction

Search results were imported into EndNote V.x9 (Clarivate, Philadelphia) where duplicates were removed, then exported into Covidence (Veritas Health Innovation, Melbourne, Australia). Title/abstract and full-text screening, data extraction and risk of bias scoring were completed in duplicate by two independent reviewers (BS and AM, AW, CEMS, DD, EE, EO, KS, RC, RV or TF), with disagreements resolved by team discussion.

Data were extracted in duplicate by two independent reviewers (BS and AM, AW, CEMS, DD, EE, EO, KS, RC, RV or TF) using a standardised extraction form,27 28 and discrepancies were resolved by team discussion. The risk of bias of the included reviews was assessed by two independent reviewers (BS and AM, AW, CEMS, DD, EE, EO, KS, RC, RV or TF) in duplicate using the A MeaSurement Tool to Assess systematic Reviews (AMSTAR-2) tool.29 The AMSTAR-2 tool involves 16 items, with each item scored as yes, partial yes or no. Seven items are considered ‘critical’ and nine ‘non-critical’.29 The critical domains are protocol registration, adequacy of search strategy, justification for excluding individual studies, risk of bias assessment, appropriateness of meta-analysis methods, use of risk of bias during interpretation and assessment of publication bias. Reviews were rated as ‘high confidence’ (0 critical weakness and <3 non-critical weaknesses), ‘moderate’ (one critical weakness and <3 non-critical weaknesses), ‘low’ (>1 critical weakness and <3 non-critical weaknesses) or ‘critically low’ (>1 critical weakness and ≥3 non-critical weaknesses).29
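The rating thresholds above can be written as a small decision rule. This is a sketch only, encoding the thresholds exactly as worded in the text; the function name `amstar2_confidence` is hypothetical. Note that the thresholds as written do not cover reviews with 0 or 1 critical weaknesses and ≥3 non-critical weaknesses, so the sketch returns ‘unspecified’ for those cases rather than guessing a rating.

```python
def amstar2_confidence(critical_weaknesses, noncritical_weaknesses):
    """Map weakness counts to the confidence rating described in the text."""
    few_noncritical = noncritical_weaknesses < 3
    if critical_weaknesses == 0 and few_noncritical:
        return "high"
    if critical_weaknesses == 1 and few_noncritical:
        return "moderate"
    if critical_weaknesses > 1 and few_noncritical:
        return "low"
    if critical_weaknesses > 1:
        return "critically low"
    # 0 or 1 critical weakness with >=3 non-critical weaknesses is not
    # covered by the thresholds as written, so no rating is assigned here.
    return "unspecified"
```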

Umbrella review synthesis methods

The overlap in component RCTs that were included across all eligible reviews was assessed using the Corrected Covered Area (CCA) method.30 A CCA of 100% indicates that every review included in our umbrella review comprised the same component RCTs, while a CCA of 0% indicates that every review in our umbrella review included entirely unique RCTs. The following cut-offs were used to quantify the CCA: 0%–5%=‘slight overlap’; 6%–10%=‘moderate’; 11%–15%=‘high’ and >15%=‘very high’ overlap.30 Publication bias was assessed by creating a funnel plot and observing the presence of asymmetries or missing sections.31
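For readers unfamiliar with the CCA, it is commonly computed as (N − r)/(r × c − r), where N is the total number of RCT inclusions across reviews (repeats counted), r is the number of unique RCTs and c is the number of reviews. The sketch below illustrates that calculation and the cut-offs quoted above. The input structure (a mapping from review to a set of RCT identifiers) and the function names are assumptions for illustration, and the handling of values that fall between the quoted bands (eg, 5.5%) is also an assumption.

```python
def corrected_covered_area(reviews):
    """Return the CCA as a percentage.

    `reviews` maps a review identifier to the set of component RCT
    identifiers it includes (hypothetical input structure).
    """
    c = len(reviews)                                  # number of reviews
    unique_rcts = set().union(*reviews.values()) if reviews else set()
    r = len(unique_rcts)                              # number of unique RCTs
    n = sum(len(rcts) for rcts in reviews.values())   # inclusions, repeats counted
    if c < 2 or r == 0:
        raise ValueError("CCA needs at least two reviews and one RCT")
    return 100 * (n - r) / (r * c - r)


def overlap_category(cca_percent):
    """Apply the cut-offs quoted in the text (band boundaries assumed)."""
    if cca_percent <= 5:
        return "slight"
    if cca_percent <= 10:
        return "moderate"
    if cca_percent <= 15:
        return "high"
    return "very high"
```

With this definition, reviews that share no RCTs give 0% and reviews that all include exactly the same RCTs give 100%, matching the interpretation given in the text.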

Meta-analysis results from each review were presented using forest plots. Separate forest plots were created for meta-analyses reporting standardised (eg, standardised mean difference, SMD) and unstandardised effect sizes (eg, mean difference). For meta-analyses that reported standardised effect sizes, we undertook subgroup analyses for clinical status and intervention characteristics. Meta-analysis results were summarised using medians and IQRs.
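As an illustration of this summary step, the sketch below takes one pooled effect size per meta-analysis and returns the median with the first and third quartiles. The quartile method is not stated in the text, so the ‘inclusive’ method used here is an assumption, and the example values are hypothetical rather than data from the included reviews.

```python
from statistics import median, quantiles


def summarise_effects(effect_sizes):
    """Return (median, first quartile, third quartile) of pooled effect sizes."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = quantiles(effect_sizes, n=4, method="inclusive")
    return median(effect_sizes), q1, q3


# Hypothetical SMDs, one per meta-analysis; not data from the included reviews.
example_smds = [-0.8, -0.6, -0.4, -0.3, -0.1]
mid, q1, q3 = summarise_effects(example_smds)
print(f"median SMD={mid:.2f}, IQR={q1:.2f} to {q3:.2f}")
```

Summarising with medians and IQRs, rather than re-pooling, avoids giving extra weight to RCTs that appear in more than one review.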

The Oxford Centre for Evidence-Based Medicine levels of evidence and grades for recommendations32 were used to classify the overall level of evidence as grade A: consistent level 1 studies (ie, systematic reviews of RCTs or individual RCTs); B: consistent level 2 (ie, systematic reviews of cohort studies or individual cohort studies) or level 3 studies (ie, systematic reviews of case–control studies or individual case–control studies) or extrapolations from level 1 studies; C: level 4 studies (ie, case series) or extrapolations from level 2 or 3 studies or D: level 5 (ie, expert opinion without explicit critical appraisal) evidence or troublingly inconsistent or inconclusive studies of any level.32

Results

Of the 1280 records identified, 97 were eligible. They included 1039 unique (component) RCTs and the CCA was 0.6%, indicating slight overlap (see online supplemental eFigure 1 for PRISMA flowchart, including reasons for exclusions). Evaluation of funnel plots indicated no evidence of publication bias (online supplemental eFigure 2).

An overview of all reviews’ characteristics is shown in online supplemental eTable 2. There was a total of >128 119 participants (n=133 did not report the number of participants). Mean participant age ranged from 29 to 86 (median=55) years, and most reviews (n=83, 86%) involved female and male participants. An overview of all populations and PA modalities is shown in table 1. Fifteen reviews specifically involved individuals with depression33–41 and three involved individuals with anxiety.42–44 Most reviews involved various PA modes (n=70) and most (n=77) had a critically low AMSTAR-2 score (low: n=10; high: n=10, online supplemental eTable 3).

Table 1

Overview of all populations, conditions and physical activity modes of the included reviews

Meta-analysis results: depression

Results from 72 meta-analyses based on SMD (n=875 component RCTs, >62 040 participants) showed a medium effect in favour of PA for reducing depression and depressive symptoms (median SMD=−0.43, IQR=−0.66 to –0.27, figure 1).

Figure 1

Results of meta-analyses that assessed symptoms of depression using standardised mean differences (negative values represent a reduction in symptoms).

MD effect sizes for each instrument were: Profile of Mood States: −7.68 (one review), Beck Depression Inventory: −5.53 (IQR=−6.24 to –4.81), Edinburgh Postnatal Depression Scale: −2.97 (IQR=−3.49 to –2.44), self-rating scale: −3.99 (one review), Brief Symptom Inventory 18: −3.02 (one review), Centre for Epidemiological Studies Depression: −0.36 (IQR=−1.25 to 0.02), Montgomery-Asberg Depression Rating Scale: −1.80 and Hospital Anxiety and Depression Scale: −1.26 (IQR=−1.41 to –1.18, online supplemental eFigure 3 and online supplemental eTable 4).

Grade of recommendation: (A) Consistent level 1 studies.

Anxiety

Results from 28 meta-analyses using SMD (171 component RCTs, >10 952 participants) showed a medium effect of PA for reducing anxiety (median SMD=−0.42, IQR=−0.66 to –0.26, figure 2).

Figure 2

Results of meta-analyses that assessed symptoms of anxiety using standardised mean differences (negative values represent a reduction in symptoms).

MD effect sizes for each instrument were: State-Trait Anxiety Inventory: −3.61 (IQR=−6.01 to –1.66), Brief Symptom Inventory-18: −5.45 (one review), self-rating scale: −4.57 (one review) and Hospital Anxiety and Depression Scale: −1.26 (IQR=−1.26 to –0.79, online supplemental eTable 4 and online supplemental eFigure 5).

Grade of recommendation: (A) Consistent level 1 studies.

Psychological distress

One systematic review45 reported SMD results for psychological distress (six component RCTs, 508 participants), while another systematic review46 reported MD results (one component RCT, 39 participants). Results showed a medium effect in favour of PA, compared with usual care (SMD=−0.60, 95% CI −0.78 to –0.42). For MD, findings showed no significant effect (MD=−0.30, 95% CI −5.55 to 4.95, one review, one component RCT, 39 participants).

Grade of recommendation: (B) Consistent level 2 or 3 studies or extrapolations from level 1 studies.

Subgroup analyses: clinical status

Depression

Seventeen reviews provided data on patients with cancer,45 47–62 and 16 on people with depression or depressive symptoms.10 33 39 63–75 PA was effective in reducing depressive symptoms across all conditions (median SMD range: –0.85 (kidney disease) to –0.16 (cardiovascular disease)). The largest effects were found in kidney disease, HIV, chronic obstructive pulmonary disease, generally healthy adults and individuals diagnosed with depression (table 2).

Table 2

Summary data on the effects of physical activity interventions on depression for a range of clinical conditions, including the number of reviews, studies and participants covered; and the 25th percentile, median and 75th percentile for standardised mean differences

Depression

Three reviews42 50 78 presented analyses on session duration (online supplemental content 17). Long (≥60 min, SMD=–0.57, IQR –0.85 to –0.35) and medium (30–60 min, SMD=–0.60, IQR –0.78 to –0.41) session durations had similar benefits. The sole study of short sessions (<30 min) had an SMD of 0.01 (online supplemental eFigure 15).

Discussion

This is the first ever study to compile the extensive base of evidence regarding the effects of PA on depression, anxiety and psychological distress. We identified 97 systematic reviews, reporting the findings of 1039 unique RCTs, involving 128 119 participants. Findings suggest that PA interventions are effective in improving symptoms of depression and anxiety. Improvements were observed across all clinical populations, though the magnitude of effect varied across different clinical populations. The greatest benefits were seen in people with depression, pregnant and postpartum women, apparently healthy individuals and individuals diagnosed with HIV or kidney disease. All PA modes were effective, and higher intensity exercise was associated with greater improvements for depression and anxiety. Longer duration interventions had smaller effects compared with short and mid-duration, though the longest duration interventions still had positive effects.

PA was effective at reducing depression and anxiety across all clinical conditions, though the magnitude of the benefit varied between clinical groups. The larger effect sizes observed in clinical populations may reflect that these populations experience above-average symptoms of depression and anxiety and have low PA levels, and, therefore, have a greater scope for improvement compared with non-clinical populations.17

All PA modes were beneficial, including aerobic, resistance, mixed-mode exercise and yoga. It is likely that the beneficial effects of PA on depression and anxiety are due to a combination of various psychological, neurophysiological and social mechanisms.87 Different modes of PA stimulate different physiological88 and psychosocial effects,88–90 and this was supported by our findings (eg, resistance exercise had the largest effects on depression, while yoga and other mind–body exercises were most effective for reducing anxiety). Furthermore, our findings showed that moderate-intensity and high-intensity PA modes were more effective than lower intensities. PA improves depression through various neuromolecular mechanisms including increased expression of neurotrophic factors, increased availability of serotonin and norepinephrine, regulation of hypothalamic–pituitary–adrenal axis activity and reduced systemic inflammation.91 92 Therefore, low-intensity PA may be insufficient for stimulating the neurological and hormonal changes that are associated with larger improvements in depression and anxiety.87 Overall, our findings add further support to public health guidelines, which recommend multimodal, moderate and vigorous intensity PA.

Our finding that longer duration interventions were less effective than shorter interventions may seem counterintuitive. It is possible that this finding reflects a decline in adherence with longer interventions. Furthermore, due to a lack of blinding of participants in PA trials, participants may have expected to have improved symptoms. It is possible that after experiencing short-term improvements in depression or anxiety, the expectancy effect may diminish over longer periods of time. An alternative explanation is that the longer interventions might not provide sufficient progression of PA dose, leading to a reduction in their effectiveness. Furthermore, it was somewhat surprising that interventions with a smaller weekly duration demonstrated larger effects than those with a higher weekly duration. This is the opposite of the dose–benefit relationship observed for exercise and physical health outcomes.93 It is possible that shorter duration interventions are easier for participants to comply with, whereas longer weekly duration interventions are more burdensome, which may reduce the psychological benefits. It is a useful message that interventions do not need to provide high doses of PA for improvements in depression.

The key strength of this study is that it is the first umbrella review to evaluate the effects of all types of PA on depression, anxiety and psychological distress in all adult populations. We included only the highest level of evidence (meta-analyses of RCTs) and applied stringent criteria regarding the design of the component RCTs to ensure that effects could be confidently attributed to PA rather than other intervention components. Additionally, there was only slight overlap in the component RCTs, increasing our confidence in the findings.

A limitation of the review is that most evidence focused on mild-to-moderate depression, with fewer reviews addressing anxiety and psychological distress, preventing us from reaching firm conclusions in the subgroup analyses for these outcomes. Furthermore, most (n=77) of the included reviews were rated as ‘critically low’, based on the AMSTAR-2 quality rating.

Clinical implications

PA is effective for managing symptoms of depression and anxiety across numerous populations, including the general population, people with mental illnesses and various other clinical populations. While the benefit of exercise for depression and anxiety is generally recognised, it is often overlooked in the management of these conditions. Furthermore, many people with depression and anxiety have comorbidities, and PA is beneficial for their mental health and disease management. This underscores the need for PA to be a mainstay approach for managing depression and anxiety.

All modes of PA are effective, with moderate-to-high intensities more effective than low intensity. Larger benefits are achieved from shorter interventions, which has implications for health service delivery costs: benefits can be obtained following short-term interventions, and intensive long-term interventions are not necessarily required to achieve therapeutic benefit. The effect size reductions in symptoms of depression (−0.43) and anxiety (−0.42) are comparable to or slightly greater than the effects observed for psychotherapy and pharmacotherapy (SMD range=−0.22 to −0.37).94–97 Future research to understand the relative effectiveness of PA compared with (and in combination with) other treatments is needed to confirm these findings.

In conclusion, PA is effective for improving depression and anxiety across a very wide range of populations. All PA modes are effective, and higher intensity is associated with greater benefit. The findings from this umbrella review underscore the need for PA, including structured exercise interventions, as a mainstay approach for managing depression and anxiety.

What is already known

Previous research trials suggest that physical activity may have similar effects to psychotherapy and pharmacotherapy for patients with depression, anxiety or psychological distress.
Studies have evaluated different forms of physical activity, in varying dosages, in different population subgroups, and using different comparator groups, making it difficult for clinicians to understand the body of evidence for physical activity in the management of mental health disorders.

What are the new findings

Results showed that physical activity is effective for reducing mild-to-moderate symptoms of depression, anxiety and psychological distress (median effect size range=−0.42 to –0.60), compared with usual care across all populations.
Our findings underscore the important role of physical activity in the management of mild-to-moderate symptoms of depression, anxiety and psychological distress.
