Type: Bug
Resolution: Unresolved
Priority: Critical - P2
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None

Assigned Teams: Services & Integrations
Operating System: ALL

When Evergreen hits the idle timeout for a test, it sends a SIGABRT signal to all the mongo processes (mongos/mongod).

The mongo processes then print the received signal:

[j1:s0:prim] | 2024-05-13T07:29:58.037+00:00 F  CONTROL  6384300 [S] [initandlisten] "Writing fatal message","attr":{"message":"Got signal: 3 (Quit).

They also print all the current stack traces.
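
For triage, it can help to pull the signal number and name out of such a fatal message. The following is a minimal sketch, assuming the message always has the form "Got signal: <number> (<name>)" as in the log line quoted above. `parse_fatal_signal` is a hypothetical helper, not part of resmoke or Evergreen.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the single log line quoted in this ticket (an assumption).
_SIGNAL_RE = re.compile(r"Got signal: (\d+) \(([^)]+)\)")


def parse_fatal_signal(log_line: str) -> Optional[Tuple[int, str]]:
    """Return (signal number, signal name) if the line reports a fatal signal."""
    match = _SIGNAL_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


line = (
    '[j1:s0:prim] | 2024-05-13T07:29:58.037+00:00 F  CONTROL  6384300 [S] '
    '[initandlisten] "Writing fatal message","attr":{"message":"Got signal: 3 (Quit).'
)
print(parse_fatal_signal(line))
```

For the line above, the helper yields signal number 3 with the name "Quit".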

In this scenario, Evergreen categorizes the task and tests as follows:

The tasks are marked as "Task timed out".
The tests are marked as "Failed".
The associated BFGs are marked with "Server crash" severity. I believe this is because the log analyzer finds the quit stack traces.

This is the same categorization we would apply to a real server crash. As a result, it is currently very difficult to distinguish a BFG that failed due to reaching the idle timeout from a BFG that failed due to a server crash.

To differentiate the two, I suggest that when a test times out due to reaching the idle timeout, we do the following:

The tasks should be marked as "Test timed out".
The tests should be marked as "Test timed out" as well.
The associated BFGs should be marked as "Server hang", or at least not marked as "Server crash".
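
The proposed distinction could be sketched roughly as follows. This is only an illustration under stated assumptions: `classify_failure`, `TIMEOUT_SIGNALS`, and the "Unknown" fallback label are hypothetical names, and the set of signals treated as timeout-induced (3 for SIGQUIT, as seen in the log excerpt, and 6 for SIGABRT, as named in the description) is a guess rather than confirmed harness behavior.

```python
from typing import Optional

# Signals the test harness is ASSUMED to send when the idle timeout fires.
# 3 = SIGQUIT (seen in the log excerpt), 6 = SIGABRT (named in the description).
TIMEOUT_SIGNALS = frozenset({3, 6})


def classify_failure(task_timed_out: bool, fatal_signal: Optional[int]) -> str:
    """Pick a BFG severity label from the timeout flag and the fatal signal."""
    # A timeout-induced signal after an idle timeout points at a hang, not a crash.
    if task_timed_out and fatal_signal in TIMEOUT_SIGNALS:
        return "Server hang"
    # Any other fatal signal is still treated as a genuine crash.
    if fatal_signal is not None:
        return "Server crash"
    # Hypothetical fallback label for cases this sketch does not cover.
    return "Unknown"
```

With these rules, a timed-out task whose servers logged signal 3 would be labeled "Server hang", while a fatal signal 11 (SIGSEGV) would still be labeled "Server crash" whether or not the task timed out.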

This is an example of a BFG that timed out and was wrongly marked as "Server crash".

is related to

SERVER-87332 Investigate changing resmoke to use SIGABRT instead of SIGQUIT for {T/A}SAN variants

Closed

Assignee: Unassigned

Reporter: Tommaso Tocci

Participants: Tommaso Tocci

Votes: 0

Watchers: 9

Created: May 13 2024 08:51:07 AM UTC

Updated: May 22 2024 03:18:43 AM UTC
