- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Atlas Streams
- Fully Compatible
- ALL
- Sprint 46, Sprint 47, Sprint 48, Sprint 49, Sprint 50, Sprint 52
This is the customer's pipeline:
[{"$source": {"connectionName": "KafkaConfluent","topic": "OutputTopic"}},{"$merge": {"into": {"connectionName": "LyricsCluster","db": "streamingvectors","coll": "lyrics"},"on": "_id","whenMatched": "merge","whenNotMatched": "insert"}}]
Root cause (see this Splunk search):
- When the customer issued the stop, there were roughly 4,163,319,725 bytes of input but only 2,670,960,838 bytes of output, leaving the sink backlogged by about 1.5 GB.
- As part of the stop, we start writing a checkpoint.
- The $source processes the checkpoint at 3/28/24 5:03:37.180 PM.
- The sink doesn't finish processing the checkpoint until 3/28/24 5:44:11.283 PM, roughly 40 minutes later.

One related issue: why did the backlog grow to 2 GB in the first place? Our code should prevent that by limiting the sink backlog to roughly 100 MB. A sketch of the intended behavior follows.
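To illustrate the ~100 MB limit mentioned above, here is a hedged sketch (not the actual Atlas Streams implementation; the class name and the cap are assumptions drawn from the description) of a byte-bounded queue between an operator and its sink. Once buffered bytes exceed the cap, the producer is told to back off, which is the mechanism that should have kept the backlog from reaching multiple gigabytes.

// Illustrative sketch only; not taken from the Atlas Streams codebase.
class BoundedByteQueue {
  constructor(maxBytes = 100 * 1024 * 1024) { // assumed ~100 MB backlog cap
    this.maxBytes = maxBytes;
    this.bufferedBytes = 0;
    this.items = [];
  }

  // Returns false when the queue is full, signalling the upstream operator
  // to stop pulling from $source until the sink drains the backlog.
  offer(doc, sizeInBytes) {
    if (this.bufferedBytes + sizeInBytes > this.maxBytes) {
      return false; // apply backpressure instead of buffering more
    }
    this.items.push({ doc, sizeInBytes });
    this.bufferedBytes += sizeInBytes;
    return true;
  }

  // Called by the sink as it flushes documents to the $merge target.
  poll() {
    const entry = this.items.shift();
    if (!entry) return null;
    this.bufferedBytes -= entry.sizeInBytes;
    return entry.doc;
  }
}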
==== Customer report ====
It has failed again with the same error in the new stream processing cluster I have created.
{
  id: '6603d8c486a1abd293b773c5',
  name: 'lyrics_destination_cluster',
  lastModified: ISODate('2024-03-27T08:28:52.546Z'),
  state: 'STARTED',
  errorMsg: '',
  workers: [ 'worker-56b79c874d-9wjr2' ],
  pipeline: [
    { '$source':
    },
    { '$merge': {
        into: { connectionName: 'LyricsCluster', db: 'streamingvectors', coll: 'lyrics' },
        on: '_id',
        whenMatched: 'merge',
        whenNotMatched: 'insert'
    } }
  ],
  lastStateChange: ISODate('2024-03-28T17:03:31.206Z')
},
The processor subscribed to the Kafka topic has stopped working. It still shows a STARTED state and I can’t stop it.