Methodology: Ukraine War Analytics Data Collection

How we collect, process, and present conflict data for the Russia-Ukraine war.

1. Data Collection

Our data is aggregated from multiple open-source intelligence (OSINT) providers. Each source uses a different methodology:

ACLED Event Data

Events are coded by trained researchers from news reports, social media, and local sources. Each event is geolocated and categorized by type.

Oryx Equipment Losses

Every loss is verified with photographic or video evidence. Items are categorized as destroyed, damaged, abandoned, or captured.

Official Sources

Government statistics from Ukraine, international organizations, and verified news agencies provide additional data points.

Data Pipeline Overview

Pipeline stages: Raw Sources (ACLED · Oryx · UN · Gov) → Collection (daily cron, fetch-all.ts) → Processing (normalize, deduplicate) → Validation (validate-data.ts, QA checks) → Publication (static build, 25,980 pages)

Automated daily at 06:00 UTC via GitHub Actions cron; data freshness ≤ 24 hours.

Raw Sources

ACLED, Oryx, UNHCR, NASA FIRMS, World Bank

Collection

TypeScript fetch scripts run on GitHub Actions

Processing

Normalize, deduplicate, geolocate, aggregate

Validation

Schema checks, outlier detection, QA rules

Publication

Next.js static build → Vercel CDN delivery
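
The Validation stage listed above (schema checks, outlier detection, QA rules) can be pictured as functions that take normalized records and return a list of problems. The sketch below is an illustration only and does not reproduce validate-data.ts: the record shape, the field names, and the median-multiple outlier rule with its default factor are all assumptions.

```typescript
// Hypothetical record shape; the real schema is not documented on this page.
interface DailyRecord {
  date: string;       // ISO date, e.g. "2024-05-01"
  oblast: string;     // administrative region name
  eventCount: number; // events recorded for that day and region
}

interface ValidationIssue {
  index: number;  // position of the offending record
  reason: string; // short description of the failed check
}

const ISO_DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

// Schema check: every field must be present and well-formed.
function checkSchema(records: DailyRecord[]): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  records.forEach((rec, index) => {
    if (!ISO_DATE_PATTERN.test(rec.date)) {
      issues.push({ index, reason: "date is not YYYY-MM-DD" });
    }
    if (rec.oblast.trim() === "") {
      issues.push({ index, reason: "oblast is empty" });
    }
    if (rec.eventCount < 0 || Math.floor(rec.eventCount) !== rec.eventCount) {
      issues.push({ index, reason: "eventCount is not a non-negative integer" });
    }
  });
  return issues;
}

// Outlier check: flag counts above `factor` times the median count.
// The median-multiple rule and the default factor are assumptions.
function checkOutliers(records: DailyRecord[], factor: number = 5): ValidationIssue[] {
  const sorted = records.map((r) => r.eventCount).sort((a, b) => a - b);
  if (sorted.length === 0) return [];
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  const lower = sorted[mid - 1] ?? upper;
  const median = sorted.length % 2 === 1 ? upper : (lower + upper) / 2;
  // Guard against a zero median so quiet periods do not flag every record.
  const limit = Math.max(median, 1) * factor;
  const issues: ValidationIssue[] = [];
  records.forEach((rec, index) => {
    if (rec.eventCount > limit) {
      issues.push({ index, reason: "eventCount exceeds outlier limit" });
    }
  });
  return issues;
}
```

A build step could run both checks and stop publication when either returns a non-empty list. Whether the real pipeline fails hard or only logs a warning is not stated here.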

2. Data Processing

Raw data undergoes several processing steps before presentation:

  • ✓ Normalization: Data from different sources is standardized to common formats and categories.
  • ✓ Deduplication: Events reported by multiple sources are merged to avoid double-counting.
  • ✓ Geolocation: Events are mapped to administrative regions (oblasts) for geographic analysis.
  • ✓ Aggregation: Daily, weekly, and monthly summaries are calculated from individual events.
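
As a rough illustration of how normalization, deduplication, and aggregation fit together, here is a minimal sketch. The raw field names (`src`, `event_date`, `admin1`, `place`, `type`), the unified `ConflictEvent` shape, and the exact-match deduplication key are hypothetical. Real cross-source matching is likely fuzzier (coordinates, time windows, text similarity) than the simple key used here.

```typescript
// Hypothetical source-specific record; field names are invented for this example.
interface RawSourceRecord {
  src: string;
  event_date: string; // may carry a time component
  admin1: string;     // oblast name as reported by the source
  place: string;      // settlement name
  type: string;       // source-specific event label
}

// Hypothetical common shape that all sources are mapped onto.
interface ConflictEvent {
  source: string;
  date: string;     // ISO date, YYYY-MM-DD
  oblast: string;
  location: string;
  eventType: string;
}

// Normalization: map one raw record onto the common shape.
function normalizeRecord(raw: RawSourceRecord): ConflictEvent {
  return {
    source: raw.src,
    date: raw.event_date.slice(0, 10),
    oblast: raw.admin1.trim(),
    location: raw.place.trim(),
    eventType: raw.type.trim().toLowerCase(),
  };
}

// Deduplication: keep the first event for each (date, location, type) key.
// Assumption: an exact key match means two sources reported the same event.
function deduplicate(events: ConflictEvent[]): ConflictEvent[] {
  const seen: Record<string, boolean> = {};
  return events.filter((e) => {
    const key = [e.date, e.location.toLowerCase(), e.eventType].join("|");
    if (seen[key]) return false;
    seen[key] = true;
    return true;
  });
}

// Aggregation: count events per day.
function countByDay(events: ConflictEvent[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of events) {
    counts[e.date] = (counts[e.date] ?? 0) + 1;
  }
  return counts;
}
```

Weekly and monthly summaries could be built the same way by swapping the date key for a week or month key. Geolocation is left out because it depends on boundary data not described here.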

3. Event Classification

Events are classified using the ACLED taxonomy:

Battles

Armed clashes between organized military groups. Includes ground combat, territorial changes, and military offensives.

Explosions/Remote Violence

Artillery, missiles, airstrikes, and drone attacks. Events where violence is delivered from a distance.

🎯

Violence Against Civilians

Attacks targeting civilian populations, including war crimes, forced disappearances, and deliberate attacks on infrastructure.

Strategic Developments

Significant non-violent events including treaties, territorial agreements, and major political developments.
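
One way to keep these four categories consistent in code is a closed union type plus a parser for incoming labels. This is a sketch under assumptions, not the site's actual code: the names `EVENT_CATEGORIES`, `parseCategory`, and `isViolent` are hypothetical, and labels outside the four categories listed above simply return `null`.

```typescript
// The four categories described above, spelled as on this page.
const EVENT_CATEGORIES = [
  "Battles",
  "Explosions/Remote Violence",
  "Violence Against Civilians",
  "Strategic Developments",
] as const;

type EventCategory = (typeof EVENT_CATEGORIES)[number];

// Map a raw label to a known category, ignoring case and surrounding spaces.
// Returns null for anything outside the four categories.
function parseCategory(label: string): EventCategory | null {
  const wanted = label.trim().toLowerCase();
  for (const category of EVENT_CATEGORIES) {
    if (category.toLowerCase() === wanted) return category;
  }
  return null;
}

// Strategic Developments are described above as non-violent events.
function isViolent(category: EventCategory): boolean {
  return category !== "Strategic Developments";
}
```

Because `EventCategory` is a closed union, the compiler rejects a misspelled category anywhere it is used, which is the main reason to prefer it over free-form strings.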

Oryx Equipment Verification Methodology

Equipment losses are tracked by Oryx, an independent OSINT team that applies a strict visual-evidence standard:

1. Visual Evidence Required

Every single loss entry requires at least one unique photo or video frame showing the damaged or destroyed vehicle. Text reports or official statements alone are insufficient.

2. Per-Vehicle Tracking

Each vehicle is tracked individually by model, markings, and image URL. The same vehicle cannot be counted twice: duplicate images are detected and excluded.

3. Status Categories

  • Destroyed: vehicle visually confirmed as a total loss (burned out, blown up, structurally destroyed).
  • Damaged: visible damage but possibly repairable.
  • Abandoned: intact vehicle left by crew.
  • Captured: seized by the opposing side.

4. Conservative Undercount

Because losses without photographic evidence are excluded, Oryx figures are a confirmed minimum. Independent analysts estimate real losses are 20–40% higher than Oryx-tracked numbers, particularly for losses in Russian-controlled territory.
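
The counting rules above can be sketched as a tally that skips entries without evidence and entries whose image has already been seen, followed by the 20–40% uplift applied to a confirmed count. This is an illustration under assumptions: the `LossEntry` shape and function names are hypothetical, and duplicate detection here is a plain URL comparison, which is simpler than comparing the images themselves.

```typescript
type LossStatus = "destroyed" | "damaged" | "abandoned" | "captured";

// Hypothetical entry shape based on the fields named above.
interface LossEntry {
  model: string;
  status: LossStatus;
  imageUrl: string; // visual evidence; empty means no evidence
}

// Count entries per status, skipping missing evidence and repeated images.
function tallyLosses(entries: LossEntry[]): Record<LossStatus, number> {
  const tally: Record<LossStatus, number> = {
    destroyed: 0,
    damaged: 0,
    abandoned: 0,
    captured: 0,
  };
  const seenImages: Record<string, boolean> = {};
  for (const entry of entries) {
    const url = entry.imageUrl.trim();
    if (url === "" || seenImages[url]) continue;
    seenImages[url] = true;
    tally[entry.status] += 1;
  }
  return tally;
}

// Apply the 20-40% uplift quoted above to a confirmed minimum.
function estimatedRange(confirmed: number): { low: number; high: number } {
  return {
    low: Math.round(confirmed * 1.2),
    high: Math.round(confirmed * 1.4),
  };
}
```

For example, a confirmed count of 1,000 vehicles would correspond to an estimated 1,200 to 1,400 under that uplift. The range is an analyst estimate, so it should be labeled separately from the confirmed figure.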

How Casualty Figures Are Counted

Casualty counting in an active conflict is extremely difficult. We use multiple sources with different methodologies:

OHCHR Civilian Casualties

The UN Human Rights Monitoring Mission verifies each civilian casualty individually. Their figures are conservative, representing confirmed cases where there were "reasonable grounds to believe" a civilian was killed or injured. The UN states real numbers are "considerably higher."

Military Casualties

Military casualty estimates come from official Ukrainian and Western government statements and from ACLED fatality coding. All parties underreport their own losses for morale and strategic reasons. Figures should be treated as rough orders of magnitude, not precise counts.

ACLED Fatality Coding

ACLED researchers code fatality estimates from news reports, assigning best-estimate ranges to each event. These are not verified individually but represent the best available synthesis of open-source reporting.

⚠️ Key Caveats

We never publish raw official Russian or Ukrainian military casualty announcements as ground truth. All figures displayed are sourced from independent third-party organizations and are labeled as estimates where appropriate.

Limitations & Caveats

Important limitations to consider when using our data:

  • ⚠️ Underreporting: Not all events are captured. Remote areas and active combat zones have limited reporting.
  • ⚠️ Fog of War: Initial reports may be inaccurate and are updated as more information becomes available.
  • ⚠️ Verification Delays: Equipment losses require visual evidence, causing reporting delays.
  • ⚠️ Casualty Estimates: True casualty figures are unknown. Official figures from all parties should be treated with caution.
  • ⚠️ Geographic Bias: More events are reported in areas with better media access and communication infrastructure.

4. Update Schedule

  • Daily: Event data updates
  • Weekly: Summary reports
  • Continuous: Equipment verification
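
The daily schedule, together with the 06:00 UTC run and 24-hour freshness target stated in the pipeline overview, can be expressed as two small helpers. This is a minimal sketch: `nextRunAfter` and `isFresh` are hypothetical names, and the actual scheduling is done by the GitHub Actions cron trigger rather than by code like this.

```typescript
const RUN_HOUR_UTC = 6;                    // daily run at 06:00 UTC
const MAX_AGE_MS = 24 * 60 * 60 * 1000;    // freshness target: 24 hours

// Return the first scheduled 06:00 UTC run strictly after `now`.
function nextRunAfter(now: Date): Date {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), RUN_HOUR_UTC)
  );
  if (next.getTime() <= now.getTime()) {
    // Today's run has already happened; move to tomorrow (handles month ends).
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next;
}

// True when the last build is no more than 24 hours old at `now`.
function isFresh(lastBuild: Date, now: Date): boolean {
  const ageMs = now.getTime() - lastBuild.getTime();
  return ageMs >= 0 && ageMs <= MAX_AGE_MS;
}
```

A status page could use `isFresh` to show a warning when a scheduled build was missed, and `nextRunAfter` to tell readers when the next update is expected.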

Frequently Asked Questions

How is conflict data collected and processed?

We aggregate data daily from multiple OSINT providers. Raw records are fetched automatically by scheduled scripts, normalized to a consistent format, deduplicated so events reported by several sources are not double-counted, and geolocated to oblasts. The cleaned data is then aggregated into daily, weekly, and monthly summaries.

How are events classified?

Events follow the ACLED taxonomy: Battles (armed clashes between organized groups), Explosions/Remote Violence (artillery, missiles, airstrikes, drone attacks), Violence Against Civilians (attacks targeting civilians), and Strategic Developments (significant non-violent events such as treaties or territorial agreements). Using a single, established taxonomy keeps categories consistent across the whole dataset.

How are equipment losses verified?

Equipment losses follow the Oryx visual-evidence standard: every entry requires at least one unique photo or video frame of the damaged, destroyed, abandoned, or captured vehicle. Each vehicle is tracked individually and duplicates are excluded. Because losses without visual proof are not counted, the figures are a conservative confirmed minimum.

How are casualty figures counted?

Casualty figures combine several independent sources, each labeled accordingly. Civilian casualties come from UN OHCHR, which verifies each case and publishes conservative minimums. Military casualty estimates derive from ACLED fatality coding and official statements, and are treated as rough orders of magnitude rather than precise counts. We do not present raw official military announcements from either side as ground truth.

How often is the data updated?

Event data is updated daily, summary reports weekly, and equipment-loss verification continuously as new visual evidence appears. Our automated pipeline runs daily, so published data is typically no more than 24 hours behind the upstream sources.

Why might the numbers be incomplete or revised?

Conflict data is affected by the fog of war: not all events are reported, remote areas have limited coverage, and initial reports can be inaccurate. As verification progresses and upstream providers revise historical records, our figures are updated. We present figures as confirmed minimums rather than complete totals.

How do you handle propaganda or unverified claims?

We rely exclusively on third-party verified datasets that apply their own editorial standards, and we do not republish unverified claims from any party. Where a figure is an estimate or projection rather than a directly measured value, we label it clearly.