- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
- Storage Execution
- ALL
- v8.0, v7.3, v7.2, v7.1
- 5
Running a collMod command that changes the timeseries granularity while a tenant migration or logical initial sync is in progress can cause those data migration protocols to fail with the following error:
"error":{"code":72,"codeName":"InvalidOptions","errmsg":"Invalid transition for timeseries.granularity. Can only transition from 'seconds' to 'minutes' or 'minutes' to 'hours'."}}}
The error is expected, because both tenant migration and logical initial sync apply oplog entries to inconsistent data. We need to ignore the error when the oplog application mode is kInitialSync or kUnstableRecovering, just as in SERVER-80301. The fix would be to add InvalidOptions to the collMod ignore list.
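To make the failure and the proposed fix concrete, here is a minimal Python model, not the server's C++ code. It assumes the only legal granularity transitions are the two named in the error message, and every name in it (`Mode`, `CollModError`, `COLLMOD_IGNORABLE_CODES`, `apply_coll_mod_oplog_entry`) is hypothetical. It shows a replayed collMod failing because the cloned collection is already ahead of the oplog entry, and an ignore list keyed on the oplog application mode swallowing that error.

```python
from enum import Enum


class Mode(Enum):
    # Oplog application modes named in this ticket, plus a steady-state mode.
    SECONDARY = "kSecondary"
    INITIAL_SYNC = "kInitialSync"
    UNSTABLE_RECOVERING = "kUnstableRecovering"


INVALID_OPTIONS = 72  # error code taken from the log line in this ticket

# Assumption: transitions exactly as listed in the error message.
ALLOWED_TRANSITIONS = {("seconds", "minutes"), ("minutes", "hours")}

# Hypothetical ignore list: codes tolerated when data may be inconsistent.
COLLMOD_IGNORABLE_CODES = {INVALID_OPTIONS}
TOLERANT_MODES = {Mode.INITIAL_SYNC, Mode.UNSTABLE_RECOVERING}


class CollModError(Exception):
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def coll_mod_granularity(coll, new_granularity):
    """Validate and apply a granularity change on a modelled collection."""
    current = coll["granularity"]
    if (current, new_granularity) not in ALLOWED_TRANSITIONS:
        raise CollModError(
            INVALID_OPTIONS,
            f"Invalid transition for timeseries.granularity: "
            f"{current!r} -> {new_granularity!r}",
        )
    coll["granularity"] = new_granularity


def apply_coll_mod_oplog_entry(coll, new_granularity, mode):
    """Apply a collMod oplog entry; return False if the error was ignored."""
    try:
        coll_mod_granularity(coll, new_granularity)
    except CollModError as err:
        if mode in TOLERANT_MODES and err.code in COLLMOD_IGNORABLE_CODES:
            return False  # tolerated: the data was already past this entry
        raise
    return True


# The cloner copied the collection after it reached 'hours', but the oplog
# still contains the earlier 'seconds' -> 'minutes' collMod entry.
cloned = {"granularity": "hours"}
applied = apply_coll_mod_oplog_entry(cloned, "minutes", Mode.INITIAL_SYNC)
print(applied, cloned["granularity"])
```

In this model the same entry applied in `Mode.SECONDARY` still raises, so the tolerance stays limited to the modes in which inconsistent data is expected.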
Regarding the fix, I'm not sure that catching these errors one at a time and ignoring them for the kInitialSync oplog application mode is the right approach. We may hit similar cases in the future, and waiting for build failures or production issues to surface them doesn't seem ideal. One alternative is to ignore any error raised while applying oplog entries in kInitialSync mode. However, I'm unsure whether that is safe, and I think it requires further investigation.
(Attached a repro for the initial sync case.)
EDIT (11/10/2023)
Modifying the timeseries bucket values (for example, bucketMaxSpanSeconds) during a concurrent tenant migration or logical initial sync will also cause the migration or initial sync to fail:
[j1:rs1:prim] | 2023-11-09T02:27:20.103+00:00 D1 TENANT_M 4886005 [TenantMigrationRecipientService-4] "TenantOplogApplier::_finishShutdown","attr":{"protocol":0,"migrationId":{"uuid":{"$uuid":"adc565c0-0844-4ea2-a36d-b76c4699bdfc"}},"error":"InvalidOptions: Timeseries 'bucketMaxSpanSeconds' needs to be equal or greater to transition"}