
Hello, I'm
David Asencio
Senior Data Engineer
10+ years designing scalable data platforms across healthcare. Specialized in multi-cloud architectures, high-performance ETL pipelines, and enterprise analytics that drive better patient outcomes.
About
Building Data Platforms That Matter
Senior Data Engineer with 10+ years of experience designing, building, and optimizing scalable data platforms across startup and enterprise healthcare environments. Proven expertise in developing high-performance ETL pipelines, architecting cloud-native data lakes, and processing large-scale datasets using Python, SQL, and Apache Spark.
Extensive experience working in multi-cloud environments (AWS, Azure, and GCP), leading cloud migration initiatives, and implementing modern data warehousing solutions using Snowflake and BigQuery. Strong background in healthcare data engineering, including regulatory and quality reporting transformations, with a focus on data reliability, performance tuning, and cost optimization.
Recognized for driving architectural decisions, mentoring engineering teams, and delivering enterprise-grade data solutions that improve operational efficiency and enable advanced analytics.
B.S. Computer Science
UT Austin, 2011 - 2015
10+ Years Experience
Startup to Enterprise
Multi-Cloud
AWS, Azure, GCP
Healthcare Specialist
HIPAA, FHIR, HL7, CMS
Experience
Where I've Worked
McLean, Virginia
Architected enterprise-scale multi-cloud data platform supporting Medicaid and population health analytics across AWS, Azure, and GCP.
- Designed and optimized large-scale Spark (PySpark/Scala) pipelines processing multi-terabyte healthcare datasets
- Led migration of 200+ batch ETL pipelines into a unified cloud-native data lake architecture
- Implemented CI/CD framework using GitHub Actions and Azure DevOps, reducing deployment time by 40%
- Built reusable ETL framework adopted across multiple teams, improving development velocity by 35%
- Reduced Spark job runtime by 60% through partition tuning and cluster autoscaling
- Designed near real-time ingestion pipelines using Kafka and Kinesis
Dallas, Texas
Designed and maintained end-to-end healthcare data pipelines integrating EHR, claims, lab, and operational datasets.
- Built scalable PySpark jobs for regulatory reporting and quality metrics (HEDIS transformations)
- Led migration of on-prem SQL Server workloads to Azure Data Factory and Azure Synapse
- Developed Snowflake-based dimensional models (star schema) for enterprise BI and analytics
- Optimized batch workloads, reducing average processing time by 45%
- Implemented automated data quality checks to support HIPAA-compliant data processing
Austin, Texas
Developed and maintained ETL pipelines for customer booking, pricing, and vendor datasets.
- Built Airflow DAGs to orchestrate batch workflows and improve reliability
- Designed dimensional models in Amazon Redshift to support BI reporting
- Automated data validation scripts, reducing reporting discrepancies by 30%
- Improved AWS resource utilization and reduced monthly cloud costs
Austin, Texas
Developed complex SQL queries and dashboards to track revenue, customer acquisition, churn, and operational KPIs.
- Built Tableau/Looker reports used by leadership for decision-making
- Analyzed A/B tests to evaluate product and pricing strategies
- Automated recurring reporting workflows, reducing manual effort by 25%
Projects
Featured Work

Enterprise Healthcare Analytics Platform
Architected a multi-cloud data platform supporting Medicaid and population health analytics. Processed multi-terabyte datasets across AWS, Azure, and GCP with near real-time ingestion pipelines using Kafka and Kinesis.

FHIR Analytics Dashboard
Designed and built a comprehensive FHIR (Fast Healthcare Interoperability Resources) analytics platform with geospatial mapping, demographic breakdowns, and real-time healthcare data visualization for clinical decision support.

Clinical Data Pipelines & Reporting
Built end-to-end healthcare data pipelines integrating EHR, claims, lab, and operational datasets. Migrated on-prem SQL Server workloads to Azure Data Factory and Azure Synapse, and designed star schema models in Snowflake for enterprise BI.

Hospital Management Data System
Engineered data infrastructure for a hospital management system covering patients, lab test requests, appointments, and billing. Designed dimensional models and automated data quality checks for HIPAA-compliant processing.

Healthcare Analytics Dashboards
Developed the data layer powering healthcare analytics dashboards with KPI gauges, trend analysis, and operational metrics. Built reusable ETL frameworks adopted across multiple analytics teams.

Population Health Analytics Platform
Built scalable data pipelines for population health analytics including demographics, clinical outcomes, and treatment patterns. Enabled regulatory reporting and quality metric transformations at scale.
Skills
Technologies & Expertise
Programming & Languages
Big Data & Processing
Cloud Platforms
Data Warehousing & Modeling
Healthcare Data & Compliance
DevOps & Orchestration
Contact
Let's Work Together
I'm always open to discussing new opportunities, data architecture challenges, or how to build better healthcare data platforms. Feel free to reach out.