Taxonomic and functional diversity metrics across site and over time for water quality invertebrate communities: Models and comparison with long-term data and species-level data
We calculated taxonomic and functional diversity metrics representing freshwater invertebrate communities across sites and over time. We also examined different community subsets: native and non-native species, and insects and EPT taxa (Ephemeroptera, Plecoptera, Trichoptera, that is, mayflies, stoneflies, caddisflies, grouped as an indicator of water quality56).
Different sites are included in different moving windows, an important caveat of the moving-window analysis. Supplementary Table 6 lists the number of sites per window in each country. Although we accounted for the heterogeneity of site distribution across studies and countries within years, models cannot correct for the changing number of sampled sites across years. We cannot fully discount the possibility that biases in the characteristics of sites sampled across time affected trajectory results. We conducted two moving-window analyses to investigate this, the first limited to long-term data only and the second limited to sites with species-level resolution. The initial analysis only included sites with 20 sampling years between 1990 and 2020 and not the start years of 1990 and 1991. This analysis included 308 sites from 8 countries. The second analysis included sites with species-level taxonomic data and windows from 1994 to the present day. 717 sites from 14 countries were included in the moving-window analysis. The models were basically the same as our moving-window analyses. The decline in taxon richness over time was found by these alternatives moving-window analyses.
The responses of biodiversity metrics to the climate were assessed, in addition to the upstream land cover and dam impact score. Cropland cover changed by a small amount each year and urban cover changed by 2% per year, as most sites exhibited low variation. To examine relationships between environmental drivers and biodiversity trends, we modelled trend estimates using an LMM, incorporating trend errors as for the calculation of the overall trend, including all predictor variables as fixed effects, and study identity and country as random effects.
Moving Window Trend Estimation Using a Linear Mixed-Effects Model. II. Effect of the Effect of Data Type and Study Heterogeneity
The form of the model was: brm(MovingWindowTrend|se(sd_trend_estimate) ~ year, data = moving_window_trends, iter = 5000, inits = 0, chains = 4, prior = c(set_prior(“normal(0,3)”, class = “Intercept”)), control = list(adapt_delta = 0.90, max_treedepth = 12)).
To test for an overall linear change in the trajectory of moving-window trends, we modelled the effect of year on moving-window trend estimates using brms80.
For each response metric, we calculated the proportion of the posterior distribution of the mean trend estimate (that is, the overall LMM intercept) above or below zero, that is, the probability of an increasing or decreasing mean trend.
We ran linear mixed-effects models (LMM) in the brms package to synthesize site-level data and estimate overall mean trends. The LMM included site-level trend estimates as the response, and an overall intercept and two random effects (country and study identity) as predictors. Random effects accounted for data heterogeneity because of different numbers of sites in different countries. We assumed normal errors when looking at site-level trends. Site-level trends were combined in a meta-analysis model to estimate the mean trend across studies, including the uncertainty (represented by the s.d.) of the trend estimates, using brms80.
Source: The recovery of European freshwater biodiversity has come to a halt
Community niche space analysis of functional diversity and hydrological predictors: 1992-2020 data analysis and climatic forecasts using the TerraClimate dataset
We analysed functional diversity separately for each site by calculating six distance-based metrics chosen to describe multiple facets of community niche space and to align with taxonomic diversity metrics: functional richness, functional redundancy, functional evenness, functional turnover, functional divergence and Rao’s quadratic entropy (definitions and citations are provided in Supplementary Table 4). The metrics were calculated using the dbFD function in the R package. We used six axes in our calculations of functional richness and divergence. To enable calculation of functional turnover, we calculated community-weighted means of each functional trait category weighted by taxa abundance, then calculated turnover of the community-weighted means using the R package codyn, as for taxonomic temporal turnover57,68. We calculated abundance-weighted functional redundancy using the uniqueness function in the R package adiv69. The community diversity was divided into Simpson diversity and functional homogeneity was calculated as 1 U. The trait input matrix was based on Euclidean distances bound between 0 and 1 and the tolerance threshold was 10−8.
The first year of sampling is used to count the year in the temporal autocorrelation term as a count. The models accounted for any residual temporal autocorrelation using an ar(1) term and included the day of year as an additional predictor when there was a variation in sampling dates at a site.
The proportion of landcover categories in each subcatchment is calculated using the time series 81 from 1992 to the present. Land cover data for almost all of the sites analysed was available. We computed the entire upstream catchment for each point occurrence using the r.water.outlet module and calculated the percentage cover of each land cover category within this area. Each sample year the percentage of the upstream area averaged across the sites was calculated to arrive at the areas of cropland and urban land.
We extracted monthly climatic predictors from the TerraClimate dataset79 for 1967–2020, which covered all sites and years. The mean monthly climatic value was calculated for each site, as well as the sampling month. Cumulative annual precipitation and maximum monthly temperature are calculated for each period preceding the average sampling month at each site. Trend values in precipitation and maximum temperature over the period covered by each time series were calculated using Bayesian models fitted using the R package brms80. The models used to calculate site-level biodiversity metric trends were the same as the models used to estimate the coefficient of a continuous year effect. The TerraClimate dataset is associated with uncertainties in areas of complex terrain, but our large number of sampling sites, relatively good station coverage and the low physiographical complexity of most site locations should have minimized error in our analyses.
Source: The recovery of European freshwater biodiversity has come to a halt
Identification of Non-Native Species in the 90m Stream Network using the Hydrography Digital Elevation Model with GRASS GIS77
The high-resolution Hydrography90m stream network is created using the MERIT Hydro75 digital elevation model. To achieve a high spatial accuracy, we used an upstream contributing area of 0.05 km2 as the stream channel initialization threshold using the r.watershed and r.stream.extract modules in GRASS GIS77. We next calculated the subcatchments for each segment of the stream network, that is, the area contributing laterally to a given stream reach between two nodes, using the r.basins module. Spatial inaccuracy of either the DEM or the coordinates caused coordinates indicating a site to not always show up in the delineated stream network. In order to ensure that point occurrences would match the DEM-derived stream network and network, we moved all points to the corresponding stream segments using the v.net module within the given subcatchment. The fish swims at the same time as the crow flies, and so we calculated the network distance (km) from each point. When sites were not connected through the network, the distance was set to NA.
In total, we identified 61 non-native species. We did not allow for reliable identification of non native species because we limited the initial analysis to only 1,299 sites and the coarse family resolution made it impossible to identify non native species. Estimates of trends in non-native species richness and abundance were restricted to the 898 (of 1,299) sites at which non-native species were recorded. The two most abundant non-native species were the New Zealand mud snail, Potamopyrgus antipodarum (≥1 individual present in ≥1 year at 81% of sites) and the North American bladder snail, Physella acuta (34% of sites).
We had to fill in the gaps because of missing trait data. First, when trait data were not available at the original identification level (15.9% trait coverage across taxa), we used genus-level trait data, resulting in 48.2% coverage. Genus-level trait data are sufficient to represent most interspecific variation among freshwater invertebrates, and taxon responses to environmental variability. Next, when genus-level trait data were not available for taxa identified to genus, we replaced missing values in trait modalities with the median of trait profiles of all species within a genus from the full taxon list, resulting in 61.3% coverage. For taxa identified to family level with no available data for a trait, we replaced the missing values in trait sources with the median values of profiles of all genera within a family resulting in 90.5% coverage across all taxa. The lack of accurate phylogenies, low trait coverage at the species level, and mixedresolution across sampling sites prevented the use of othergap-fill approaches, but taxonomic aggregation meshes well with expert trait assignments.
Our compiled time series represent different stream types and stream orders from a large geographical area of Europe. Detailed information on the purpose of the data that was collected is unavailable for some time series. The data was collected from studies that met our criteria. As these data were collected from sites exposed to varying and unquantified levels of anthropogenic impacts, we cannot rule out biases arising from unequal representation of sites exposed to different impact levels from severely impacted to least impacted.
Between 13 and 516 taxa were sampled per site across all sampling years. 34% of sites were identified to species, 30% to mixed levels and 28% to family. There were 2,648 taxa from 959 genera,212 families and 47 groups recorded. We list time-series locations, durations and characteristics in Supplementary Table 2 and list the number of sites sampled per year and country in Supplementary Table 3.