digitalmars.D - Some thoughts on dub housekeeping tasks for future design work

Rikki Cattermole <alphaglosined gmail.com> writes:
Hello again!

After debugging a bug that was already fixed in ~master, a 
thought has been stuck in my mind about how dub works 
internally, which is to say extremely poorly. The metadata is 
actually pretty simple in its behavior, but it is not abstracted 
properly for the build manager. Each piece of metadata has one 
of three behaviors (although it can be in more than one 
category): it can go up, down, or nowhere (only that build).

A lot of the metadata is already abstracted into a single struct, 
BuildSettings, so that part is not an issue. What we want is to 
move to having three instances of it per (sub)package being 
built, one per direction. By doing this we can remove a 
significant amount of the busy work dub does in generating 
BuildSettings over and over, and allow a much simpler process 
when building: just pass in an array of BuildSettings and the 
build manager merges them as part of the build. Picking which 
BuildSettings apply to a particular build is the responsibility 
of the package manager, not the build manager.
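
A minimal sketch of that split in D, under stated assumptions. The 
struct here is a heavily reduced stand-in for dub's BuildSettings, 
and ``PackageSettings``, ``mergeSettings`` and the field names 
``up``/``down``/``nowhere`` are hypothetical names used only for 
illustration. The sketch also assumes "up" means towards dependents 
and "down" means towards dependencies, which the post does not spell 
out.

```d
import std.stdio : writeln;

/// Reduced stand-in for dub's BuildSettings (the real struct has many more fields).
struct BuildSettings
{
    string[] versions;
    string[] importPaths;
    string[] lflags;
}

/// Hypothetical: three instances per (sub)package, one per direction.
struct PackageSettings
{
    BuildSettings up;      // assumed: propagates to packages that depend on this one
    BuildSettings down;    // assumed: propagates to this package's dependencies
    BuildSettings nowhere; // applies to this package's own build only
}

/// Build manager side: merge whatever array it is handed, in order.
BuildSettings mergeSettings(BuildSettings[] parts)
{
    BuildSettings result;
    foreach (part; parts)
    {
        result.versions ~= part.versions;
        result.importPaths ~= part.importPaths;
        result.lflags ~= part.lflags;
    }
    return result;
}

void main()
{
    PackageSettings lib, app;
    lib.up.importPaths = ["lib/source"];   // dependents need lib's import path
    lib.nowhere.versions = ["LibInternal"];
    app.down.versions = ["UseFastPath"];   // pushed to app's dependencies
    app.nowhere.lflags = ["-L/opt/lib"];

    // Package manager side: decide which instances apply to which build.
    auto appBuild = mergeSettings([app.nowhere, lib.up]);
    auto libBuild = mergeSettings([lib.nowhere, app.down]);

    assert(appBuild.importPaths == ["lib/source"]);
    assert(appBuild.lflags == ["-L/opt/lib"]);
    assert(libBuild.versions == ["LibInternal", "UseFastPath"]);

    writeln(appBuild);
    writeln(libBuild);
}
```

A setting that belongs to more than one category would simply be 
stored in more than one of the three instances, so the merge itself 
never needs to know about direction.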

Right now a lot of this logic is intermixed with the package 
manager itself, and done very badly. It's going to be a lot of 
work to untangle and could easily break people's builds. So 
before anything structural like this can be attempted, 
everything that can be split out first, such as leaf modules, 
needs to be split out. I've identified some housekeeping tasks 
that would make this process a lot easier, or would have quite 
significant benefits both now and after such work is complete.

dub:

- Simplify ``getBestPackage`` down to one request instead of two 
(move logic to dub-registry)
- Introduce a caching mechanism for downloaded artifacts in the 
file system; it must be a class that can be swapped out at runtime
- Registered non-registry package sources must be able to be 
compiled into a JSON blob full of package versions ext. info
- Be able to consume compressed (zip) copies of package 
information (one file per package in the zip) in lieu of a registry
- Decouple the compilers, packagesuppliers, cache, platform, 
dependencyresolver and semver packages/modules, and split each out 
into its own sub package

dub-registry:

- Rewrite the cache to use dub's (new) cache mechanism
- Move the repositories package to dub and rewrite it as required 
to fit both purposes
- Use the repositories package from dub once it has been moved 
there, minus registry (must be configurable at runtime)
- Be able to produce a compressed (zip) copy of all package 
version information (one file in zip per package)

If we can do these things, and split up BuildSettings in 
preparation for directionality (up/down/nowhere) with array 
support, we might have a way to make invasive structural changes 
on the package management side and break up the behavior along a 
clearer division between build manager and package manager. But 
it's going to be slow going, and it's going to have to be 
bottom-up from the leaf modules.

One of the reasons we have to start bottom-up is that dub has 
two abstractions for metadata: recipes represent the dub files, 
and BuildSettings represent the build itself. Ideally, the 
package manager would not know about the build manager's 
metadata (although it would tell the build manager to load its 
data), and the build manager wouldn't know about the package 
manager's metadata. That isn't the case right now.

Thoughts?
Dec 11 2022
Sebastiaan Koppe <mail skoppe.eu> writes:
On Sunday, 11 December 2022 at 12:43:47 UTC, Rikki Cattermole 
wrote:
 Hello again!

 [...]

 dub-registry:
As for the dub-registry, I can only advise against having it do more things than it already does. In fact, I would suggest making it considerably dumber. Right now it takes part in dependency resolution by recursively resolving transitive dependencies. This has considerable resource requirements because it has to reconstruct relatively big JSON snippets, which thrashes a lot of memory and puts unnecessary pressure on the GC, resulting in high memory requirements for a relatively simple registry. The tradeoff is that the client has to make multiple requests; however, these can be done semi-parallel and, since they require no transformations, the responses can be streamed straight out of the database.
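
A minimal sketch in D of the client-side alternative described above, assuming a "dumb" registry that only returns a package's direct dependencies. ``resolveTransitive``, the ``Fetch`` alias and the in-memory ``fakeRegistry`` are hypothetical and not part of dub or dub-registry; version constraints are ignored entirely, and the requests are issued sequentially here.

```d
import std.stdio : writeln;

/// Hypothetical: one registry request, returning only direct dependency names.
alias Fetch = string[] delegate(string name);

/// Walks the dependency graph breadth-first, one request per package.
/// Requests within a wave do not depend on each other, so a client
/// could issue them concurrently.
string[] resolveTransitive(string root, Fetch fetch)
{
    bool[string] seen;
    seen[root] = true;
    string[] order = [root];
    string[] wave = [root];

    while (wave.length)
    {
        string[] next;
        foreach (name; wave)
        {
            foreach (dep; fetch(name))
            {
                if (dep !in seen)
                {
                    seen[dep] = true;
                    order ~= dep;
                    next ~= dep;
                }
            }
        }
        wave = next;
    }
    return order;
}

void main()
{
    // Stand-in for the registry database; package names are made up.
    string[][string] fakeRegistry = [
        "app": ["a", "b"],
        "a": ["c"],
        "b": ["c"],
    ];
    int requests;

    string[] fetch(string name)
    {
        ++requests;
        if (auto deps = name in fakeRegistry)
            return *deps;
        return null; // unknown package or no dependencies
    }

    auto order = resolveTransitive("app", &fetch);
    assert(order == ["app", "a", "b", "c"]);
    assert(requests == 4); // one request per distinct package
    writeln(order);
}
```

The registry side of this is a plain lookup per package, which is the kind of response that can be streamed without rebuilding a combined JSON document; the cost is one round trip per wave of the graph instead of a single request.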
Dec 11 2022
rikki cattermole <rikki cattermole.co.nz> writes:
Dub isn't thread-safe. It would require significant structural changes 
to get any form of parallelism going on.

But the problem is that you have to retrieve metadata from the registry, 
pass that into getBestPackage and from there perform additional 
downloads. It is absolutely insane to have to do this when the logic is 
already compiled into dub-registry!
Dec 11 2022