I'm currently updating the core architecture and WideRouter compiler capacity while simultaneously fixing a cache-related bug. As part of that work, I've dedicated a specific portion of every router to caching; the cache is intentionally device-agnostic and can be cleared on demand across an entire pod.
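In rough terms, the idea looks something like the sketch below. This is only an illustration of the design, not the actual WideRouter code; names like RouterCache and clear_pod_caches are hypothetical.

```python
# Hypothetical sketch of a device-agnostic, clearable router cache.
# RouterCache and clear_pod_caches are illustrative names, not the real WideRouter API.
from typing import Dict, Optional

import torch


class RouterCache:
    """Per-router cache keyed by name, with no assumptions about device placement."""

    def __init__(self) -> None:
        self._store: Dict[str, torch.Tensor] = {}

    def put(self, key: str, value: torch.Tensor) -> None:
        # Detach so cached entries never pin the autograd graph.
        self._store[key] = value.detach()

    def get(self, key: str, device: torch.device) -> Optional[torch.Tensor]:
        # Move the entry to whatever device the caller is on at lookup time.
        entry = self._store.get(key)
        return entry.to(device) if entry is not None else None

    def clear(self) -> None:
        self._store.clear()


def clear_pod_caches(routers) -> None:
    # Clear every router's cache in one sweep, e.g. between runs on a pod.
    for router in routers:
        router.cache.clear()
```

The key point is that entries are stored without reference to any particular device, so a pod-wide clear is just a loop over routers.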
A full ramp-up of accelerate testing is crucial very soon. I need easy access to multiple RunPod-friendly, clustered Docker, or API-accessible remote devices so I can properly integrate the debugging and testing infrastructure. Right now the debugging infrastructure on the compiler is essentially nonexistent: it simply crashes with a huge readout that doesn't help with debugging, so the only way to get the real error is to turn off compilation and run the test without it. That is fine uncompiled, but once compiled the output is difficult to read, which can eat into your time if you're unaware of the quirk. I want to make the problem invisible by catching the necessary low-level exceptions around device interactions and movements, as sketched below.
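To make that workaround concrete, here is a minimal sketch of a catch-and-fall-back wrapper. torch.compile stands in for the compiler in question, and a real version would catch the specific exceptions raised around device movement rather than a blanket Exception; this is an assumption-laden illustration, not the project's actual debugging code.

```python
# Minimal sketch of a "fall back to eager on failure" debug wrapper.
# torch.compile is a stand-in for the compiler discussed above; the real system
# would hook its own exception types around device moves instead of Exception.
import logging

import torch

log = logging.getLogger("compile_debug")


def run_with_eager_fallback(fn, *args, **kwargs):
    """Run the compiled function; on failure, rerun uncompiled for a clean traceback."""
    compiled = torch.compile(fn)
    try:
        return compiled(*args, **kwargs)
    except Exception:
        log.exception("Compiled run failed; rerunning eagerly to surface the real error.")
        # The eager rerun reproduces the bug without the compiler's noise,
        # which is the manual workaround described above, automated.
        return fn(*args, **kwargs)
```

Called as run_with_eager_fallback(step_fn, batch), the compiled path runs normally until something breaks, and the eager rerun then reproduces the failure with a readable traceback.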
If I can get direct access to a Docker system with multiple devices (CPU, GPU, interface, data, raw, tensor, NumPy, whatever), I will gladly build a full wrapper structure that interoperates directly with diffusers, PyTorch, or perhaps TensorFlow, alongside the structural system I'm currently implementing. Please share hardware; I'll gladly share my engineering and innovations. I'll focus my attention on whichever goal you want first, then move on from there.
If anyone is willing to donate hardware for a month or two, I would gladly focus my attention on that person's or organization's needs. I'm happy to share time slots and utilization days or months with others, so long as I can get the architecture working at capacity to deliver the necessary training to the required models. My only motivation is to scale these experiments to the hardware I can access, improving accuracy with deeper models while simultaneously training multiple adjacent models in a stable fashion.
This is one of my primary goals, so reach out directly at abstractpowered@outlook if you have any information. Access to stronger hardware will let me ramp up ablation tests and accuracy improvements for models that currently operate below SOTA, while opening up the same level of complexity as deep model training, with many devices each tasked with a different goal.
This is not an easy task, so it will take time. I also encourage sharing resources, ideas, and useful concepts. You can find me on Discord as
_abstract_
(display name AbstractPhila, with the popfrog display image), and in the Res4lyfe Discord, where I most often frequent the sampler madness channel. I have a fair history of chatting there and in a few other places for bouncing ideas and keeping daily notes.
