This video illustrates the performance I achieved for two canned datasets on the GCP Compute Engine configurations shown. I didn’t try to optimize much, but you can see my adjustments to memory and the number of processors used. The timeline shows CPU usage for each case. You can access DTC’s code on GitHub here: https://lnkd.in/gqxWbGV.
Containerization and cloud computing are making otherwise complex code accessible to more scientists and operators. These technologies are really shaking things up and commoditizing aspects of previously costly enterprises. The approximate cost to compute these two cases was less than $10. I think I could achieve downtime savings using Kubernetes Engine, but I haven’t tested that yet.
*Opinions here are solely mine and do not constitute government endorsement of specific commercial solutions.