Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-7416

Imported table requires full copy between incremental backups

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • WT10.0.1, 4.4.7, 5.0.0-rc0
    • Affects Version/s: None
    • Component/s: None
    • None
    • 8
    • Storage - Ra 2021-05-03, Storage - Ra 2021-05-17

      There is a test/format failure that is a follow up on the newly created import thread that is hit:

       [2021/04/19 10:24:19.920]     ############################################
       [2021/04/19 10:24:19.920]     t: FAILED: active_files_remove_missing/202: unlink(filename): error return: -1
       [2021/04/19 10:24:19.920]     t: process aborting
       [2021/04/19 10:24:19.920] format.sh: /data/mci/1a443d228c0dddd90209ce68fee04928/wiredtiger/test/format/RUNDIR.41/CONFIG:
       [2021/04/19 10:24:19.920]     ############################################
      

      With config:

       [2021/04/19 10:24:19.919]     ############################################
       [2021/04/19 10:24:19.919]     #  RUN PARAMETERS: V2
       [2021/04/19 10:24:19.919]     ############################################
       [2021/04/19 10:24:19.919]     assert.commit_timestamp=0
       [2021/04/19 10:24:19.919]     assert.read_timestamp=0
       [2021/04/19 10:24:19.919]     backup=1
       [2021/04/19 10:24:19.919]     backup.incremental=block
       [2021/04/19 10:24:19.919]     backup.incr_granularity=1024
       [2021/04/19 10:24:19.919]     btree.bitcnt=8
       [2021/04/19 10:24:19.919]     btree.compression=snappy
       [2021/04/19 10:24:19.919]     btree.dictionary=0
       [2021/04/19 10:24:19.919]     btree.huffman_value=0
       [2021/04/19 10:24:19.919]     btree.internal_key_truncation=1
       [2021/04/19 10:24:19.919]     btree.internal_page_max=9
       [2021/04/19 10:24:19.919]     btree.key_gap=15
       [2021/04/19 10:24:19.919]     btree.key_max=97
       [2021/04/19 10:24:19.919]     btree.key_min=19
       [2021/04/19 10:24:19.919]     btree.leaf_page_max=17
       [2021/04/19 10:24:19.919]     btree.memory_page_max=7
       [2021/04/19 10:24:19.919]     btree.prefix_compression=1
       [2021/04/19 10:24:19.919]     btree.prefix_compression_min=6
       [2021/04/19 10:24:19.919]     btree.repeat_data_pct=83
       [2021/04/19 10:24:19.919]     btree.reverse=0
       [2021/04/19 10:24:19.919]     btree.split_pct=57
       [2021/04/19 10:24:19.919]     btree.value_max=451
       [2021/04/19 10:24:19.919]     btree.value_min=5
       [2021/04/19 10:24:19.919]     cache=56
       [2021/04/19 10:24:19.919]     cache.evict_max=1
       [2021/04/19 10:24:19.919]     cache.minimum=20
       [2021/04/19 10:24:19.919]     checkpoint=on
       [2021/04/19 10:24:19.919]     checkpoint.log_size=166
       [2021/04/19 10:24:19.919]     checkpoint.wait=33
       [2021/04/19 10:24:19.919]     disk.checksum=uncompressed
       [2021/04/19 10:24:19.919]     disk.data_extend=0
       [2021/04/19 10:24:19.919]     disk.direct_io=0
       [2021/04/19 10:24:19.919]     disk.encryption=rotn-7
       [2021/04/19 10:24:19.919]     disk.firstfit=0
       [2021/04/19 10:24:19.919]     disk.mmap=1
       [2021/04/19 10:24:19.919]     disk.mmap_all=0
       [2021/04/19 10:24:19.919]     format.abort=0
       [2021/04/19 10:24:19.919]     format.independent_thread_rng=1
       [2021/04/19 10:24:19.919]     format.major_timeout=0
       [2021/04/19 10:24:19.919]     import=1
       [2021/04/19 10:24:19.919]     logging=0
       [2021/04/19 10:24:19.919]     logging.archive=0
       [2021/04/19 10:24:19.919]     logging.compression=none
       [2021/04/19 10:24:19.919]     logging.file_max=466830
       [2021/04/19 10:24:19.919]     logging.prealloc=1
       [2021/04/19 10:24:19.919]     lsm.auto_throttle=1
       [2021/04/19 10:24:19.919]     lsm.bloom=1
       [2021/04/19 10:24:19.919]     lsm.bloom_bit_count=37
       [2021/04/19 10:24:19.919]     lsm.bloom_hash_count=5
       [2021/04/19 10:24:19.919]     lsm.bloom_oldest=1
       [2021/04/19 10:24:19.919]     lsm.chunk_size=9
       [2021/04/19 10:24:19.919]     lsm.merge_max=19
       [2021/04/19 10:24:19.919]     lsm.worker_threads=4
       [2021/04/19 10:24:19.919]     ops.alter=0
       [2021/04/19 10:24:19.919]     ops.compaction=0
       [2021/04/19 10:24:19.919]     ops.hs_cursor=0
       [2021/04/19 10:24:19.919]     ops.pct.delete=17
       [2021/04/19 10:24:19.919]     ops.pct.insert=0
       [2021/04/19 10:24:19.919]     ops.pct.modify=12
       [2021/04/19 10:24:19.919]     ops.pct.read=2
       [2021/04/19 10:24:19.920]     ops.pct.write=69
       [2021/04/19 10:24:19.920]     ops.prepare=0
       [2021/04/19 10:24:19.920]     ops.random_cursor=0
       [2021/04/19 10:24:19.920]     ops.salvage=0
       [2021/04/19 10:24:19.920]     ops.truncate=1
       [2021/04/19 10:24:19.920]     ops.verify=1
       [2021/04/19 10:24:19.920]     quiet=1
       [2021/04/19 10:24:19.920]     runs=1
       [2021/04/19 10:24:19.920]     runs.in_memory=0
       [2021/04/19 10:24:19.920]     runs.ops=0
       [2021/04/19 10:24:19.920]     runs.rows=1800000
       [2021/04/19 10:24:19.920]     runs.source=file
       [2021/04/19 10:24:19.920]     runs.threads=4
       [2021/04/19 10:24:19.920]     runs.timer=11
       [2021/04/19 10:24:19.920]     runs.type=row-store
       [2021/04/19 10:24:19.920]     runs.verify_failure_dump=0
       [2021/04/19 10:24:19.920]     statistics=0
       [2021/04/19 10:24:19.920]     statistics.server=0
       [2021/04/19 10:24:19.920]     stress.aggressive_sweep=0
       [2021/04/19 10:24:19.920]     stress.checkpoint=0
       [2021/04/19 10:24:19.920]     stress.checkpoint_prepare=0
       [2021/04/19 10:24:19.920]     stress.hs_checkpoint_delay=0
       [2021/04/19 10:24:19.920]     stress.hs_search=0
       [2021/04/19 10:24:19.920]     stress.hs_sweep=1
       [2021/04/19 10:24:19.920]     stress.split_1=0
       [2021/04/19 10:24:19.920]     stress.split_2=0
       [2021/04/19 10:24:19.920]     stress.split_3=1
       [2021/04/19 10:24:19.920]     stress.split_4=1
       [2021/04/19 10:24:19.920]     stress.split_5=0
       [2021/04/19 10:24:19.920]     stress.split_6=0
       [2021/04/19 10:24:19.920]     stress.split_7=1
       [2021/04/19 10:24:19.920]     stress.split_8=0
       [2021/04/19 10:24:19.920]     transaction.frequency=100
       [2021/04/19 10:24:19.920]     transaction.isolation=snapshot
       [2021/04/19 10:24:19.920]     transaction.rollback_to_stable=0
       [2021/04/19 10:24:19.920]     transaction.timestamps=1
       [2021/04/19 10:24:19.920]     wiredtiger.config=
       [2021/04/19 10:24:19.920]     wiredtiger.rwlock=1
       [2021/04/19 10:24:19.920]     wiredtiger.leak_memory=0
       [2021/04/19 10:24:19.920]     ############################################
      

      This assert conveys that there is no file present in the filesystem when trying to delete the import table. Inside backup.c, we maintain two different file lists, one is prev, and one is cur. The function active_files_remove_missing removes all the files that are present in prev but not cur. This invariantly means that files in the prev should exist inside the filesystem.

              else if (cmp < 0) {
                  /*
                   * There is something in the prev list not in the current list. Remove it, and continue
                   * - don't advance the current list.
                   */
                  testutil_check(__wt_snprintf(
                    filename, sizeof(filename), "%s/BACKUP/%s", g.home, prev->names[prevpos]));
      #if 0
                  fprintf(stderr, "Removing file from backup: %s\n", filename);
      #endif
                  error_sys_check(unlink(filename));
                  testutil_check(__wt_snprintf(
                    filename, sizeof(filename), "%s/BACKUP.copy/%s", g.home, prev->names[prevpos]));
                  error_sys_check(unlink(filename));
      

      However through checking the files in the filesystem, the table was not present. Thus through further investigation it was found that the table was never created but added to the active file list. Through these symptoms it was found that the backup cursor would show a key with "import.wt" but when trying to iterate through the information through a incremental_backup cursor, the cursor->next() would immediately return WT_NOTFOUND.

      This problem would only occur under certain conditions:

      1. Test/format backup config is on
      2. Test/format import config is on

        1. backup.c
          24 kB
          Jie Chen
        2. test_import11.py
          3 kB
          Jie Chen

            Assignee:
            jie.chen@mongodb.com Jie Chen
            Reporter:
            jie.chen@mongodb.com Jie Chen
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: