Data Democracy for the Health and Wellbeing of Black Communities

Tags: community-research, policy-impact, data-analysis, in-progress
This project combines stop-and-search data with Black communities’ lived experience to explore mental health impacts and racial disproportionality, using a new “Data Democracy” approach to make research more inclusive and actionable.
Author: Just Knowledge Team
Published: March 1, 2024

The Challenge

Black British people are disproportionately subjected to stop and search and bear a heavier burden of health inequities, yet official data often fail to capture the full story. Research on stop and search has historically overlooked community perspectives, limiting understanding of its effects on mental health and social wellbeing.

Our Approach

The project combines high-quality administrative data with lived experience interpretation through a new method called ‘Data Democracy’. This approach empowers Black communities to actively participate in interpreting data, addressing epistemic injustice and challenging the assumptions embedded in conventional UK research.

Key steps include:

  • Developing an R package (“policedatR”) that analyses over one million stop-and-search records, making it easier for researchers to explore racial disproportionality, links to health outcomes and many other topics. The package is available to use.
  • Engaging communities to interpret findings, ensuring lived experience shapes both analysis and conclusions.

Early Insights & Progress

  • The R package allows users to easily acquire data on stop and search across time and geography, and to quickly produce statistics and visualisations from the data. We think that this could radically accelerate research on stop and search. A journal article exploring the package’s capabilities is currently under review.

  • Community participation is under way, especially with stop-and-search monitoring groups, and has already begun to surface insights that challenge conventional assumptions in policing and health research.

Below are some examples of how the package can be put to use to better understand disproportionality in stop and search.

Disproportionality: All ethnicities across England and Wales

This plot shows how much more or less likely people of different ethnicities are to be stopped and searched compared to White people across England and Wales. The data cover the period April 2024 to April 2025.

Code For Nerds
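# Load packages first so that %>%, mutate() and case_when() are
# available to everything below.
library(dplyr)
library(forcats)
library(plotly)

# disp_black, disp_asian, disp_mixed and disp_other are not defined in this
# snippet; they are assumed to be disproportionality tables created earlier
# in the analysis. The helpers jk_colours_primary and my_plotly() used
# further down are likewise defined elsewhere in the project.

# Keep the risk ratio columns and label each table with its ethnicity
disp_black_rr <- disp_black %>%
  dplyr::select(rr, ci_low, ci_upp, p, warning) %>%
  dplyr::mutate(
    ethnicity = "black", .before = rr
  )

disp_asian_rr <- disp_asian %>%
  dplyr::select(rr, ci_low, ci_upp, p, warning) %>%
  dplyr::mutate(
    ethnicity = "asian", .before = rr
  )

disp_mixed_rr <- disp_mixed %>%
  dplyr::select(rr, ci_low, ci_upp, p, warning) %>%
  dplyr::mutate(
    ethnicity = "mixed", .before = rr
  )

disp_other_rr <- disp_other %>%
  dplyr::select(rr, ci_low, ci_upp, p, warning) %>%
  dplyr::mutate(
    ethnicity = "other", .before = rr
  )

# Stack the four tables into one
disp_all <- rbind(disp_black_rr,
                  disp_asian_rr,
                  disp_mixed_rr,
                  disp_other_rr)

# Format p-values for display
disp_all <- disp_all %>%
  mutate(
    p_clean = case_when(p < .001 ~ "< .001",
                        TRUE ~ paste0("= ", p))
  )

# make sure ordering is correct (like fct_reorder in ggplot)
disp_all <- disp_all %>%
  mutate(
    ethnicity = stringr::str_to_title(ethnicity),
    ethnicity = forcats::fct_reorder(ethnicity, rr)
  )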

plot_ly(
  data = disp_all,
  x = ~rr,
  y = ~ethnicity,
  type = "scatter",
  mode = "markers",
  marker = list(size = 8, color = "black"),
  error_x = ~list(
    type = "data",
    symmetric = FALSE,
    array = ci_upp - rr,       # upper error
    arrayminus = rr - ci_low,  # lower error
    thickness = 1.5,
    width = 5,
    color = jk_colours_primary[3]
  ),
  text = ~paste0(
    "<b>", ethnicity, "</b><br>",
    "RR: ", round(rr, 2), "<br>",
    "95% CI: [", round(ci_low, 2), ", ", round(ci_upp, 2), "]"
  ),
  hoverinfo = "text"
) %>%
  # add labels (like geom_text)
  # add_text(
  #   x = ~rr + 0.2,
  #   y = ~ethnicity,
  #   text = ~round(rr, 2),
  #   textfont = list(size = 14),
  #   showlegend = FALSE
  # ) %>%
  layout(
    xaxis = list(
      title = "Relative risk ratio",
      range = c(0, 3.5),
      tickvals = seq(0, 3, 0.5)
    ),
    yaxis = list(
      title = "",
      categoryorder = "array",
      categoryarray = levels(disp_all$ethnicity)
    ),
    shapes = list(
      list( # vertical red dashed line
        type = "line",
        x0 = 1, x1 = 1,
        y0 = -0.5, y1 = length(unique(disp_all$ethnicity)) + 0.5,
        line = list(color = "red", dash = "dash")
      )
    ),
    annotations = list(
      list(
        x = 1.1,
        y = max(as.numeric(disp_all$ethnicity)) - 0.5,  # near the top
        text = "more likely to be stopped →<br>than White people",
        showarrow = FALSE,
        xanchor = "left",   # like hjust
        yanchor = "bottom", # like vjust
        font = list(size = 14, color = "gray50")
      ),
      list(
        x = 0.95,
        y = max(as.numeric(disp_all$ethnicity)) - 0.5,  # near the top
        text = "\u2190 less likely to be stopped<br>than White people",
        showarrow = FALSE,
        xanchor = "right",   # like hjust
        yanchor = "bottom", # like vjust
        font = list(size = 14, color = "gray50")
      )
    )
  ) |>
  my_plotly()
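
For readers unfamiliar with the statistic, the relative risk ratio shown in the plot can be sketched in a few lines of base R. This is only an illustration: the counts are invented, the helper `risk_ratio()` is hypothetical and not part of policedatR, and the confidence interval uses the common log-scale approximation, which may differ from the method the package applies.

```r
# Hypothetical helper (not from policedatR): risk ratio of a group
# relative to a reference group, with a confidence interval computed
# on the log scale.
risk_ratio <- function(stops_group, pop_group, stops_ref, pop_ref,
                       level = 0.95) {
  rate_group <- stops_group / pop_group  # stop rate in the group
  rate_ref   <- stops_ref / pop_ref      # stop rate in the reference group
  rr <- rate_group / rate_ref

  # Approximate standard error of log(rr)
  se_log <- sqrt(1 / stops_group - 1 / pop_group +
                 1 / stops_ref - 1 / pop_ref)
  z <- qnorm(1 - (1 - level) / 2)

  data.frame(
    rr     = rr,
    ci_low = exp(log(rr) - z * se_log),
    ci_upp = exp(log(rr) + z * se_log)
  )
}

# Invented counts: 300 stops among 10,000 Black residents and
# 1,000 stops among 100,000 White residents.
example_rr <- risk_ratio(300, 10000, 1000, 100000)
print(round(example_rr, 2))
```

With these invented counts the stop rate is 3% in one group and 1% in the reference group, so the risk ratio is 3. A value above 1 falls to the right of the red dashed line in the plot, and the interval bounds correspond to the ends of each error bar.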

Black-White Disproportionality: By Police Force Area

This plot shows how much more or less likely a Black person is to be stopped and searched than a White person in each Police Force Area in England and Wales. The data cover the period April 2024 to April 2025.

Code For Nerds
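disp_black_flagged <- disp_black %>%
  mutate(
    # flag = 1 where the estimate carries a warning
    flag = factor(dplyr::case_when(!is.na(warning) ~ 1,
                                   TRUE ~ 0)),
    # non_sig = 1 where the estimate is not significant at the 5% level
    non_sig = factor(dplyr::case_when(p > .05 ~ 1,
                                      TRUE ~ 0))
  ) %>%
  # order police force areas by risk ratio so the plot reads low to high
  mutate(
    pfa22nm = forcats::fct_reorder(pfa22nm, rr)
  ) %>%
  # drop any grouping carried over from disp_black
  ungroup()
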


plot_ly(
  data = disp_black_flagged,
  x = ~rr, 
  y = ~pfa22nm,
  type = "scatter",
  mode = "markers",
  marker = list(size = 8, color = "black"),
  error_x = ~list(
    type = "data",
    symmetric = FALSE,
    array = ci_upp - rr,       # upper error
    arrayminus = rr - ci_low,  # lower error
    thickness = 1.5,
    width = 5,
    color = jk_colours_primary[3]
  ),
  text = ~paste0(
    "<b>", pfa22nm, "</b><br>",
    "RR: ", round(rr, 2), "<br>",
    "95% CI: [", round(ci_low, 2), ", ", round(ci_upp, 2), "]"
  ),
  hoverinfo = "text"
) %>%
  # Add numeric labels (like geom_text)
  # add_text(
  #   x = ~ci_upp + 5, 
  #   y = ~pfa22nm,
  #   text = ~round(rr, 2),
  #   textposition = "middle right",
  #   showlegend = FALSE,
  #   textfont = list(size = 14)
  # ) %>%
  layout(
    xaxis = list(
      title = "Relative Risk Ratio",
      range = c(0, 40),
      tickvals = seq(0, 40, 5)
    ),
    yaxis = list(
      title = "Police Force Area",
      categoryorder = "array",
      categoryarray = levels(disp_black_flagged$pfa22nm)  # match ggplot order
    ),
    shapes = list(
      list(
        type = "line",
        x0 = 1, x1 = 1,
        y0 = -0.5, y1 = length(unique(disp_black_flagged$pfa22nm)) + 0.5,
        line = list(color = "red", dash = "dash")
      )
    ),
    annotations = list(
      list(
        x = 1.5, 
        y = length(unique(disp_black_flagged$pfa22nm)) + 0.5,
        text = "more likely to be stopped →<br>than White people",
        showarrow = FALSE,
        xanchor = "left",
        yanchor = "bottom",
        font = list(size = 16, color = "gray50")
      )
    )
  ) |> 
  my_plotly()
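
The `flag` and `non_sig` columns created in this code are not used in the plot itself, but they can help when reading it. The sketch below uses invented rows with the same column names as `disp_black` to list areas whose estimates deserve caution. The meaning of the `warning` column is an assumption here: it is taken to be a note attached to estimates that need care, for example because counts are small.

```r
# Invented rows using the same column names as disp_black.
# The area names, estimates and warning text are made up.
toy <- data.frame(
  pfa22nm = c("Area A", "Area B", "Area C"),
  rr      = c(4.2, 1.3, 9.8),
  ci_low  = c(3.6, 0.8, 2.1),
  ci_upp  = c(4.9, 2.1, 45.7),
  p       = c(0.0001, 0.29, 0.004),
  warning = c(NA, NA, "small counts"),
  stringsAsFactors = FALSE
)

toy$non_sig <- toy$p > 0.05          # cannot be distinguished from RR = 1
toy$flag    <- !is.na(toy$warning)   # estimate carries a warning

# Areas whose estimate should be read with caution
caution <- toy[toy$non_sig | toy$flag,
               c("pfa22nm", "rr", "ci_low", "ci_upp")]
print(caution)
```

In this toy table, Area B is listed because its interval spans 1, and Area C because its estimate carries a warning and its interval is very wide despite the large risk ratio.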

Black-White Disproportionality: Small areas within a London Borough

This map shows Black-White disproportionality in stop and search for small areas in Haringey from April 2022 to April 2025. Areas shaded more purple have higher disproportionality. We are working with communities in Haringey to co-interpret these data and use them to advocate for change.

Code For Nerds
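library(dplyr)
library(sf)
library(leaflet)

# disp, haringey_msoa_geom, ice_swatch and my_leaflet() are not defined in
# this snippet; they are assumed to be created elsewhere in the project.

# Mark estimates that carry a warning with an asterisk
disp <- disp %>%
  mutate(
    label = case_when(!is.na(warning) ~ "*",
                      TRUE ~ "")
  )

# Attach the estimates to the area boundaries
disp_sf <- haringey_msoa_geom %>%
  right_join(disp, by = c("msoa21cd","msoa21nm"))

# Make sure the joined table is an sf object
disp_sf <- st_as_sf(disp_sf)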

# Create labels for popup
disp_sf <- disp_sf %>%
  mutate(
    popup_label = paste0(
      "<strong>", msoa21nm, "</strong><br>",
      "RR: ", round(rr,2), "<br>",
      "95% CI: [", round(ci_low,2), "-", round(ci_upp,2), "]"
    )
  )

pal <- colorNumeric(palette = ice_swatch, domain = disp_sf$rr)

# Create leaflet map
leaflet(disp_sf) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    fillColor = ~pal(rr),   # color by rr
    fillOpacity = 0.7,
    weight = 1,
    color = "black",
    popup = ~lapply(popup_label, htmltools::HTML),
    label = ~lapply(popup_label, htmltools::HTML),
    highlightOptions = highlightOptions(
      weight = 2,
      color = "black",
      bringToFront = TRUE
    )
  ) %>%
  # addLabelOnlyMarkers(
  #   data = subset(disp_sf, label == "*"),  # only show red asterisks
  #   lng = ~st_coordinates(st_centroid(geometry))[,1],
  #   lat = ~st_coordinates(st_centroid(geometry))[,2],
  #   label = ~label,
  #   labelOptions = labelOptions(
  #     noHide = TRUE,
  #     direction = "top",
  #     textOnly = TRUE,
  #     style = list(
  #       "color" = "red",
  #       "font-size" = "16px",
  #       "font-weight" = "bold"
  #     )
  #   )
  # ) %>%
  addLegend(
    pal = pal,
    values = disp_sf$rr,
    title = "Risk Ratio",
    opacity = 0.7
  ) |> 
  my_leaflet()

Next Steps & Future Updates

  • We’re continuously upgrading policedatR and always welcome suggestions for improvements or opportunities to collaborate.
  • We’re working to develop new measures to quantify disparities in stop and search.
  • We will publish initial research findings on stop-and-search and mental health outcomes.
  • We will further embed the Data Democracy approach to create a replicable model for participatory research.
  • We will share updates and resources as the project progresses.
