There is another major limitation of Vertex AI endpoints: the time it takes to deploy a model to an endpoint.
In my case, deploying the model to the endpoint takes between 7 and 10 minutes!
I use a Vertex AI pipeline to run the full data extraction/preprocessing/training/deployment workload. I need to create a new model every hour (to take the new production data into account), but the deployment step always takes way too much time. Because of this, I am seriously considering switching to a Compute Engine instance for serving the model.
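To put that in numbers, here is a quick back-of-the-envelope sketch. The 7 to 10 minute deploy time and the hourly cycle are from my case; the function name `deploy_overhead` is just illustrative and not part of any Vertex AI API.

```python
def deploy_overhead(deploy_minutes: float, cycle_minutes: float = 60.0) -> float:
    """Return the fraction of one retraining cycle spent waiting on deployment."""
    if cycle_minutes <= 0:
        raise ValueError("cycle_minutes must be positive")
    return deploy_minutes / cycle_minutes


# Deploy times observed in my case, against an hourly retraining cycle.
for minutes in (7, 10):
    share = deploy_overhead(minutes)
    print(f"{minutes} min deploy = {share:.1%} of the hourly cycle")
```

So roughly 12% to 17% of every hourly cycle goes to the deployment step alone, before counting extraction, preprocessing, and training.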