Type: Improvement
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Live Restore
Labels:
- code-quality

Epic Link:
SPM-3612
Assigned Teams:

Storage Engines

Story Points:
5
Sprint:
StorEng - Defined Pipeline

We'd like to collect data on how live restore performs in Atlas, and for this we should output a log line at the end of live restore with important metrics.

This ticket is for determining what metrics to output, how to collect them, and implementing the log line. We expect the log line to be picked up by tools such as the Atlas log ingestion service.

As an example we could consider reporting:

- Time to startup: Live Restore is intended to get customers up and running quickly, so we should report how much time it saves compared to the current process of copying the entire database from a backup.
- Read/write latencies: A ballpark figure for how much slower the database is during live restore versus normal operation.
- Live restore duration: How long it took for live restore to completely migrate all data.

This list is not exhaustive.
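
As a starting point for discussion, the end-of-restore log line could be a single JSON object so that log ingestion tools can parse it without custom rules. The sketch below is illustrative only: the function name, the event name, and every field name are hypothetical, and expressing latency as a ratio against normal operation is one assumed way to capture the "how much slower" figure. It is written in Python for brevity and is not the WiredTiger implementation.

```python
import json


def format_live_restore_summary(startup_secs, migration_secs,
                                read_latency_ratio, write_latency_ratio):
    """Build a one-line JSON summary of a completed live restore.

    All field names are hypothetical placeholders for the metrics
    proposed in this ticket.
    """
    record = {
        # Hypothetical event name that ingestion tools could filter on.
        "event": "live_restore_complete",
        # Time from process start until the database accepted operations.
        "time_to_startup_secs": round(startup_secs, 3),
        # Time until all data was migrated from the backup.
        "live_restore_duration_secs": round(migration_secs, 3),
        # Assumed definition: latency during live restore divided by
        # latency during normal operation.
        "read_latency_ratio": round(read_latency_ratio, 2),
        "write_latency_ratio": round(write_latency_ratio, 2),
    }
    # sort_keys keeps the field order stable across runs.
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(format_live_restore_summary(12.5, 3600.0, 1.8, 2.1))
```

Emitting one self-describing line at completion keeps collection simple; any further metrics agreed on in this ticket would become additional fields.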

Assignee:: [DO NOT USE] Backlog - Storage Engines Team

Reporter:: Andrew Morton

Votes:: 0

Watchers:: 1

Created:: Jan 29 2025 11:52:31 PM UTC

Updated:: Jan 31 2025 01:52:16 AM UTC
