- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: None
Tiered storage uses the WT_FILE_SYSTEM.fs_directory_list interface to list the objects in a bucket. This is done as part of __tiered_name_check so that schema operations can check for the existence of colliding objects. For example, an object may already exist in the bucket with a name corresponding to the tiered table being created. This can happen if a tiered table with the same name was dropped earlier.
Here is the interface declaration:
int (*fs_directory_list)(WT_FILE_SYSTEM *file_system, WT_SESSION *session, const char *directory, const char *prefix, char ***dirlist, uint32_t *countp);
Since S3 has a flat namespace rather than a filesystem-like tree layout, AWS simulates a file system tree by interpreting a '/' in the object name as a folder separator. It is not clear whether the directory listing implementation should follow that model and treat '/' as the separator between the "directory" and the objects ("files") contained in it.
For instance, imagine a bucket with the following objects:
blah1 blah2 dir1 dir1pre dir1prefixA dir1prefixA2 dir1/pre dir1/prefixA dir1/prefixA2
What should the following call return?
fs_directory_list(..., "dir1", "prefixA", ..);
Is it:
dir1prefixA dir1prefixA2
OR
dir1/prefixA dir1/prefixA2
__tiered_name_check sets the configured object-prefix as the directory and the name it is looking for as the prefix. To me, a directory implies that it will end with a "/", which the tiered name check function is not expecting. Maybe __tiered_name_check should not set a directory at all and should instead pass object-prefix+name as the prefix to the directory list function.
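To make the two candidate behaviours concrete, here is a rough sketch in plain C. It is not WiredTiger or S3 extension code: the helper sketch_directory_list, its add_separator flag, and the in-memory bucket_objects array are all hypothetical and exist only to model the example bucket above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bucket contents, copied from the example in this ticket. */
static const char *const bucket_objects[] = {"blah1", "blah2", "dir1", "dir1pre",
  "dir1prefixA", "dir1prefixA2", "dir1/pre", "dir1/prefixA", "dir1/prefixA2"};
#define BUCKET_OBJECT_COUNT (sizeof(bucket_objects) / sizeof(bucket_objects[0]))

/*
 * sketch_directory_list --
 *     Hypothetical helper, not part of WiredTiger. Store pointers to the matching object
 *     names in "out" (capacity "max") and return the number of matches. When add_separator
 *     is non-zero, a '/' is required between the directory and the prefix; otherwise the
 *     two strings are simply concatenated.
 */
static size_t
sketch_directory_list(
  const char *directory, const char *prefix, int add_separator, const char **out, size_t max)
{
    size_t dlen = strlen(directory), plen = strlen(prefix);
    size_t count = 0;

    for (size_t i = 0; i < BUCKET_OBJECT_COUNT && count < max; i++) {
        const char *name = bucket_objects[i];

        /* The object name must start with the directory string. */
        if (strncmp(name, directory, dlen) != 0)
            continue;
        name += dlen;

        /* Filesystem-like model: the next character must be the separator. */
        if (add_separator) {
            if (*name != '/')
                continue;
            name++;
        }

        /* Whatever follows must start with the requested prefix. */
        if (strncmp(name, prefix, plen) == 0)
            out[count++] = bucket_objects[i];
    }
    return count;
}
```

With directory "dir1" and prefix "prefixA", add_separator = 0 selects dir1prefixA and dir1prefixA2, while add_separator = 1 selects dir1/prefixA and dir1/prefixA2. Passing an empty directory and "dir1prefixA" as the prefix, as suggested above, gives the same result as the first case without relying on any separator convention.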
Note:
This issue impacts test_tiered07.py for S3. Re-enable the test once this issue has been sorted out.
cc sue.loverso
- related to: WT-8791 Give all tiered storage python tests a S3 scenario (Closed)