Great article! Thanks for the educational effort. The graphics bring a lot of clarity.
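One remark, though: the sentence "SHAP requires to train a distinct predictive model for each distinct coalition in the power set" reads as if we had to retrain the same kind of model we started with for every coalition (a new gradient boosting model each time, for example). In practice that retraining is avoided. KernelSHAP keeps the original model, queries it on sampled coalitions with the missing features filled in from background data, and fits a single weighted linear regression to those predictions. TreeSHAP doesn't train anything new either: it uses the structure of the already-trained trees to compute the values directly. Both are much faster than retraining per coalition.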
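To make the KernelSHAP point concrete, here is a rough sketch of the idea in Python. It is not the `shap` library's actual implementation: `kernel_shap`, `predict` and `background` are names I made up for illustration, it enumerates every coalition instead of sampling them (so it only works for a handful of features), and it approximates the infinite kernel weight on the empty and full coalitions with a large number. The thing to notice is that the original model is only ever called, never retrained.

```python
import itertools
import math

import numpy as np


def kernel_shap(predict, x, background):
    """Illustrative KernelSHAP-style estimate for one instance x.

    predict: function mapping an (n, m) array to n predictions
             (the already-trained model, used as a black box).
    background: (n, m) array used to fill in "absent" features.
    Returns (base_value, shap_values).
    """
    m = x.shape[0]
    masks, weights, values = [], [], []
    for size in range(m + 1):
        for subset in itertools.combinations(range(m), size):
            mask = np.zeros(m)
            mask[list(subset)] = 1.0
            # Present features come from x, absent ones from each background row.
            samples = np.where(mask == 1.0, x, background)
            masks.append(mask)
            values.append(predict(samples).mean())
            if size == 0 or size == m:
                # Assumption: a large finite weight stands in for the
                # infinite Shapley kernel weight on these two coalitions.
                weights.append(1e6)
            else:
                weights.append((m - 1) / (math.comb(m, size) * size * (m - size)))

    # One weighted linear regression: coalition mask -> model output.
    Z = np.column_stack([np.ones(len(masks)), np.array(masks)])
    sqrt_w = np.sqrt(np.array(weights))[:, None]
    y = np.array(values)[:, None]
    coef, *_ = np.linalg.lstsq(Z * sqrt_w, y * sqrt_w, rcond=None)
    return coef[0, 0], coef[1:, 0]


# Toy usage: a fixed linear function stands in for any trained model.
rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))
coefs = np.array([2.0, -1.0, 0.5])


def predict(X):
    return X @ coefs + 3.0


x = np.array([1.0, 2.0, -1.0])
base_value, phi = kernel_shap(predict, x, background)
```

For this toy linear model the regression coefficients `phi` should match `coefs * (x - background.mean(axis=0))`, and `base_value + phi.sum()` should give back the prediction for `x`. With a real gradient boosting model you would pass its prediction function as `predict` and nothing else would change.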
Excellent explanations can also be found here: https://christophm.github.io/interpretable-ml-book/shap.html