Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-93876

Clean up inefficient dependencies in the Build Graph

    • Type: Icon: Improvement Improvement
    • Resolution: Unresolved
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • None
    • Build

      Currently we have a lot of targets with dependencies listed that are not needed to compile the target itself, but instead listed as convenience for the targets that depend on them.

      It should be possible to run script moving the targets down in the dependency graph to the bottom-most node that requires them.

      During the conversion we've marked many `deps` with this issue. These are dependencies where building just that target with Bazel technically passes without undefined symbols (i.e. with `linkopts = ["-Wl,--no-undefined"]`) even if the dep is removed. We found it tricky to find overspecified deps after the fact which is why the current BUILD files were manually marked on a best-effort basis. Reducing the deps and flattening the link graph is highly desirable as we observed linktime differences of 50% or more for some targets that currently potentially overspecify their linked dependencies. A flattened linkgraph potentially allows for higher parallelization of the build and provides a more flexible/modular build with increased incremental cache hit rates.

      However, it's not safe and not correct to naively remove the marked dependencies in many cases. In the comments in this issue you'll find (a likely non-exhaustive list of) mentions of cases that would break.

      If you're the codeowner of a marked target it means that you're likely adding at least one object (in an archive) redundantly. Consider reviewing the dependency list and using the deps that are the lowest in the global build graph. That is, don't try to minimize the number of deps in the list. Instead, try to use the deps that come earliest in the build graph (which is usually a larger list since the earlier targets are smaller). One way to do this could look like this:

      1. Rename `deps` to `header_deps`. This way you still get all the headers during compilation, but link nothing.
      2. Add `linkopts = ["-Wl,--no-undefined"]`, to the target. This way you'll see errors for all undefined symbols.
      3. If you're missing symbol `x`, add the dependency that has the `.cpp` file which defines symbol `x` in their `srcs` attribute. This is the minimal dependency that you can add to resolve the missing symbol.
      4. Rinse and repeat for all symbols.
      5. If you have a target with multiple `.cpp` files in `srcs`, consider commenting out all except for one to reduce the visual overload from the linker errors. Doing this might also expose cases where pulling out one of the `.cpp` files into a dedicated target allows the other `.cpp` files to be built earlier in the build graph, improving incremental rebuild performance.
      6. Find the "right" balance between maintainability and performance. The theoretical optimum is one target per `.cpp` file, but that might not be feasible to maintain. On the other hand, every time a `.cpp` file in `srcs` is changed it'll trigger a rebuild of all the other `.cpp` files in that same `srcs` list. This can create noticeable parallelization bottlenecks and unnecessary load on incremental rebuilds.

            Assignee:
            Unassigned Unassigned
            Reporter:
            zack.winter@mongodb.com Zack Winter
            Votes:
            0 Vote for this issue
            Watchers:
            13 Start watching this issue

              Created:
              Updated: