At Komodo Health, our Data Ingestion team is responsible for quickly incorporating new data sources into our expanding Healthcare Map™. One of our engineers, Johannes Leppä, PhD, gave a talk at the DataCouncil conference in San Francisco in April about how we built our scalable data ingestion architecture and some of the lessons we learned along the way. Some of the technologies we use are Spark, Airflow, and Kubernetes.
Here is the video of that talk.
Enjoy! If you are interested in working on this and many other interesting technical problems while helping reduce the global burden of disease, we are hiring for a number of roles.