WiredTiger / WT-87

sync can deadlock with open cursors

    • Type: Task
    • Resolution: Done
    • WT1.0
    • Affects Version/s: None
    • Component/s: None

      Currently, both sync & close hang until they can flush all of the buffers, but, since a cursor now holds a hazard reference across calls, that can be forever.

      IIRC, you're planning to lock out the close method if there are open cursors, is that correct?

      What do you want to do about sync?

      I don't see any reason we can't lock sync out if there are open cursors, but we'd probably need to add some trickle-write functionality (a call that flushes what can easily be flushed to get writes started), for applications that use sync just to flush out dirty pages and keep the cache mostly clean.

      Or, I could change sync to write everything it can, but then ignore what it can't write (and return an error status).

      Thoughts?

      --keith
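
The second option above (write what can be written, skip the rest, report an error) can be sketched as a single pass over the cache. This is a minimal illustration, not WiredTiger's code: `struct page`, its `pinned` flag (standing in for "a cursor holds a hazard reference on this page"), and `sync_best_effort` are all hypothetical names, and `EBUSY` is an assumed choice of error status.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory page; not WiredTiger's actual page structure. */
struct page {
    bool dirty;  /* modified since it was last written */
    bool pinned; /* stand-in for "a cursor holds a hazard reference" */
};

/*
 * One best-effort pass: write every dirty page that is not pinned and
 * skip the ones that are. Returns 0 if no dirty page had to be skipped,
 * EBUSY otherwise. The number of pages written is stored in *flushedp
 * when flushedp is not NULL.
 */
static int
sync_best_effort(struct page *pages, size_t npages, size_t *flushedp)
{
    size_t i, flushed = 0, skipped = 0;

    assert(pages != NULL || npages == 0);

    for (i = 0; i < npages; i++) {
        if (!pages[i].dirty)
            continue;
        if (pages[i].pinned) {
            /* Cannot flush without waiting on the cursor: move on. */
            skipped++;
            continue;
        }
        /* A real implementation would issue the write here. */
        pages[i].dirty = false;
        flushed++;
    }

    if (flushedp != NULL)
        *flushedp = flushed;
    return (skipped == 0 ? 0 : EBUSY);
}
```

Because the pass never waits on a pinned page, it cannot hang on a cursor held by the calling thread. A caller that only wants to keep the cache mostly clean can ignore the `EBUSY` return, which is roughly the trickle-write behavior described above.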

      =========================

      Hi Keith,

      > Or, I could change sync to write everything it can, but then ignore what it can't write (and return an error status).

      That makes sense to me, possibly together with a config option that says "keep trying forever", for applications that really want sync to complete.

      We should be able to require that there are no cursors open on the tree in the same session: that seems like a bad idea?

      Michael.

      =========================

      Well, it's really open cursors in the same thread that are the problem, they'll potentially result in deadlock (cursors in other threads could deadlock, but they won't necessarily deadlock).

      It seems to me that sync should flush what it can, and simply return an error if it's unable to flush after some small number of attempts – calling sync with running cursors is problematical at best, you can only succeed if you get lucky with thread scheduling.

      OK with you?

      --keith

      =========================

      Hi Keith,

      > Well, it's really open cursors in the same thread that are the problem, they'll potentially result in deadlock (cursors in other threads could deadlock, but they won't necessarily deadlock).

      As far as I'm concerned, anyone who opens multiple sessions in a single thread is on their own in terms of self-deadlock. So I think it's reasonable to think in terms of cursors in the same session – the API is set up to make that kind of thing easy to check.

      > It seems to me that sync should flush what it can, and simply return an error if it's unable to flush after some small number of attempts – calling sync with running cursors is problematical at best, you can only succeed if you get lucky with thread scheduling.

      Sure, that sounds fine. If it also fails when there are open cursors in the same session, that's also fine by me.

      Michael.
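
The same-session check agreed on above could look roughly like the following. This is a sketch under assumptions: `struct session`, `struct cursor`, `MAX_CURSORS`, and `session_sync_check` are invented for illustration and do not reflect the real session or cursor layout; the point is only that a session can cheaply see its own cursors and fail fast.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CURSORS 8 /* arbitrary bound for this sketch */

/* Hypothetical cursor record: which tree it is on, and whether it is open. */
struct cursor {
    int tree_id;
    bool open;
};

/* Hypothetical session: owns a fixed table of cursor slots. */
struct session {
    struct cursor cursors[MAX_CURSORS];
};

/*
 * Pre-flight check for sync: return EBUSY if this session still has an
 * open cursor on the given tree, 0 if it is safe to start flushing.
 */
static int
session_sync_check(const struct session *session, int tree_id)
{
    size_t i;

    assert(session != NULL);

    for (i = 0; i < MAX_CURSORS; i++)
        if (session->cursors[i].open &&
            session->cursors[i].tree_id == tree_id)
            return EBUSY;
    return 0;
}
```

Checking only the calling session does not cover cursors opened through a second session in the same thread, which matches the position taken above that such applications are on their own.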

            Assignee:
            keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter:
            wiredtiger WiredTiger