
About me


Bio #

I’m a Computing Scientist working on designing novel scalable algorithms in biomedical imaging.

Bachelor Computer Science 2012-2015 #

I started my research career after returning to university in 2012, completing a Bachelor of Science at the University of Antwerp with great distinction in June 2015, with a thesis on parallel discrete event simulation in C++.

DEVS #

Discrete event simulation is a computational and mathematical discipline that studies how discrete objects or agents, each with a discrete state, evolve while interacting with each other, often by exchanging messages. To execute such simulations in parallel, two paradigms exist:

  • ‘conservative’
  • ‘optimistic’

Conservative parallelism #

In ‘conservative’ execution the simulation runs in parallel, but each agent or object waits until all necessary messages have arrived before advancing.

If many messages are passed, especially bidirectionally, you can easily lose all performance advantages because the workload is locked in a dependency loop.

Optimistic parallelism #

Here you remove the constraints, but record state and are prepared to roll back if needed. This is very similar to what modern CPUs do: they pick the execution path that is most likely and speculate ahead. If the guess is wrong, they roll back and take the other path. If it is right most of the time, and the cost of rolling back is not too severe, large speedups result. To a large extent this speculative execution model is foundational to modern computing speeds; to see for yourself, try disabling it by enabling the strictest Meltdown- and Spectre-class mitigations.

In simulation, however, it is harder to predict what the outcome will be. If it were easy to predict, you would not need the simulation to begin with, right?

Hybrid execution #

In the above setting the simulation designer needs to pick the paradigm in advance. But what if the engine, using performance tracing, could switch? For example, suppose a simulation triggers many rollbacks and conservative execution would clearly be faster. This is what our project set out to do: change, mid-simulation, how the parallelism is executed. To maximize speed in C++ we did away with smart pointers, and thus had to manage our own memory (across threads).

While for almost any C++ project this would be overkill, we benchmarked and found that speedups of up to 100x were possible with handcrafted memory management. Note that the simulation user is never exposed to this: memory management is handled exclusively inside the engine. This fine-grained control then allows the synchronization protocol to be switched at runtime.

Master Computer Science 2015-2017 #

In my M.Sc. (University of Antwerp, 2015-2017, greatest distinction) I completed my thesis on the hybrid combination of metaheuristics and hyperheuristics for surrogate modelling of large-scale epidemiology simulators.

Hybrid meta-hyperheuristics for learning interpretable surrogate models of epidemiological simulations #

Large agent-based simulations can be invaluable for modelling the progression of epidemics and for testing what-if scenarios for nation-level decision makers. However, such simulations have a large parameter space (Q) and are computationally expensive to run. Surrogate functions, or models, are learnt models that map the parameters (or scenario) Q to an expected outcome without running the simulation. This is a supervised learning problem, with the added constraint that the model should be interpretable, in the form of a closed-form mathematical equation. In other words, it is a symbolic regression problem where you learn the simplest, most elegant formula that takes the input Q and predicts the outcome of the simulation with minimal error.

Hyperheuristics, such as genetic programming, were ideal tools for this (circa 2015), but tend to trade accuracy for generalizability. To improve their final-stage fine-tuning, in my thesis I showed that:

  • the search can be executed in parallel using MPI, with marked speedups
  • you can fine-tune with a number of metaheuristics (e.g. PSO and other swarm algorithms) without compromising the closed-form equations

Ph. D. Computing Science 2018-2024 #

In my Ph.D. (thesis defended April 2024, Simon Fraser University, accepted ‘as-is’) I focused on reconstructing interactions between subcellular organelles and proteins where, technically, they cannot be observed.

Robust, unbiased detection of objects #

In order to detect interaction, you need, at a minimum, to first detect the objects that interact (though not always). The first part of my thesis focuses on unbiased detection of objects in both single molecule localization microscopy (SMLM) and voxel-based superresolution microscopy (SRM).

Subprecision interaction detection #

Although SRM dramatically increases the precision of object detection compared to diffraction-limited microscopy, it does not always have the precision to observe the changes cellular biologists are interested in. Computing science fills the gap by reconstructing what can only be partially seen. In other words, the objective is to detect objects below the precision of the system, i.e. subprecision detection.

Spatiotemporal detection of interaction #

In practice this means most of my algorithms take in multiple channels of data, each capturing a specific protein or organelle. The outcome is then the spatiotemporal interaction, by proximity and state change, between the identified objects. This requires highly robust, fair, and balanced algorithms, ideally with provable upper and lower limits on their performance.

Scalability out of necessity (and because it’s fun to do more with less) #

Given the large scale of the data, designing scalable algorithms is key, as is parallelizing them. Most of my more recent algorithms are implemented in Julia, which allows JIT-compiled or precompiled code at near C/C++ speeds while offering high-level language features. Where possible I parallelize the algorithms, leveraging the knowledge from my earlier work on C++ discrete event simulators.

Reproducibility meets speed: recipe driven pipelines #

Critically, to ensure reproducible work, I developed a package, DataCurator, that can compose complex pipelines without code and execute them on clusters with arbitrary parallelism.

Imbalance meets extreme value theory: towards stable detection of equilibrium perturbations #

Finally, cellular function is a carefully evolved equilibrium, where even small changes can have major effects (e.g. disease). In machine learning or statistical terms, this means I typically focus on minute differences between datasets, i.e. severe ‘imbalance’ or ‘extreme value’ problems. At the same time, discovery in this setting requires reconstructing those equilibria, often from image data alone, in a self-, un-, or weakly supervised setting.

Parallel research trajectories #

During my research on the above topics I collaborated on a number of papers with the same themes: extreme imbalance, reconstruction in sparsely annotated weakly supervised data, and spatiotemporal reconstruction in different modalities. These projects enriched my own work, apart from being fun problems to solve, because they taught me to abstract problems away from the imaging modality into a mathematical problem formulation, before selecting the optimal closed-form solution or approximate learned model.

Current and future work #

My current interests extend the themes discussed above. While simulation is powerful, I aim to explore implicit functions, but in a sparse model that requires a far smaller computational (and carbon) footprint. The many corrupting factors that can bias SRM-based analysis can, I believe, be identified by implicitly learned causal factor analysis. If you wish to collaborate on these topics, feel free to reach out.

CV #

My CV can be found here.

Personal interests #

I’m an avid hiker and MTB’er, and in general prefer to spend long days in nature, especially in pristine landscapes near archeological traces of early human civilizations where possible, or near wildlife, without perturbing it. You’d be surprised how easy these are to find, if you know where to look. I love the work of the OpenStreetMap (OSM) community, and try to improve the maps where I can. At home I enjoy reading philosophy, especially from different, sometimes dormant, cultures.

About the name #

One of my interests is classical history. The name of the website refers to a what-if scenario in the late Roman Republic: what if Sertorius, and not Sulla, had won the civil war? Would the Roman Republic have survived, and how would European history look if that had happened? I endorse neither Sertorius nor Sulla, but both had very strong visions of what should become of the Roman state, though neither’s changes ended up lasting long.

Given that I frequently work with what-if perturbations to complex dynamical systems from very limited information, it’s a thought exercise I find intellectually interesting to ponder.