David Asencio

Hello, I'm David Asencio

Senior Data Engineer

10+ years designing scalable data platforms across healthcare. Specialized in multi-cloud architectures, high-performance ETL pipelines, and enterprise analytics that drive better patient outcomes.

Katy, Texas

asdavid127@outlook.com

About

Building Data Platforms That Matter

Senior Data Engineer with 10+ years of experience designing, building, and optimizing scalable data platforms across startup and enterprise healthcare environments. Proven expertise in developing high-performance ETL pipelines, architecting cloud-native data lakes, and processing large-scale datasets using Python, SQL, and Apache Spark.

Extensive experience working in multi-cloud environments (AWS, Azure, and GCP), leading cloud migration initiatives, and implementing modern data warehousing solutions using Snowflake and BigQuery. Strong background in healthcare data engineering, including regulatory and quality reporting transformations, with a focus on data reliability, performance tuning, and cost optimization.

Recognized for driving architectural decisions, mentoring engineering teams, and delivering enterprise-grade data solutions that improve operational efficiency and enable advanced analytics.

B.S. Computer Science

UT Austin, 2011 - 2015

10+ Years Experience

Startup to Enterprise

Multi-Cloud

AWS, Azure, GCP

Healthcare Specialist

HIPAA, FHIR, HL7, CMS

Experience

Where I've Worked

2023 - 2026

Senior Data Engineer · Acentra Health

McLean, Virginia

Architected enterprise-scale multi-cloud data platform supporting Medicaid and population health analytics across AWS, Azure, and GCP.

  • Designed and optimized large-scale Spark (PySpark/Scala) pipelines processing multi-terabyte healthcare datasets
  • Led migration of 200+ batch ETL pipelines into a unified cloud-native data lake architecture
  • Implemented CI/CD framework using GitHub Actions and Azure DevOps, reducing deployment time by 40%
  • Built reusable ETL framework adopted across multiple teams, improving development velocity by 35%
  • Reduced Spark job runtime by 60% through partition tuning and cluster autoscaling
  • Designed near real-time ingestion pipelines using Kafka and Kinesis
Python, Scala, Spark, Snowflake, BigQuery, AWS, Azure, GCP, Kafka, Terraform, Airflow

2020 - 2023

Data Engineer · Parkland Health

Dallas, Texas

Designed and maintained end-to-end healthcare data pipelines integrating EHR, claims, lab, and operational datasets.

  • Built scalable PySpark jobs for regulatory reporting and quality metrics (HEDIS transformations)
  • Led migration of on-prem SQL Server workloads to Azure Data Factory and Azure Synapse
  • Developed Snowflake-based dimensional models (star schema) for enterprise BI and analytics
  • Optimized batch workloads, reducing average processing time by 45%
  • Implemented automated data quality checks ensuring HIPAA-compliant data processing
Python, PySpark, Snowflake, Airflow, Azure ADF, Azure Synapse, AWS, Kafka, Terraform

2017 - 2019

Junior Data Engineer · LawnStarter

Austin, Texas

Developed and maintained ETL pipelines for customer booking, pricing, and vendor datasets.

  • Built Airflow DAGs to orchestrate batch workflows and improve reliability
  • Designed dimensional models in Amazon Redshift to support BI reporting
  • Automated data validation scripts, reducing reporting discrepancies by 30%
  • Improved AWS resource utilization and reduced monthly cloud costs
Python, SQL, AWS, Redshift, Airflow, Tableau, Looker

2016

Data Analyst · LawnStarter

Austin, Texas

Developed complex SQL queries and dashboards to track revenue, customer acquisition, churn, and operational KPIs.

  • Built Tableau/Looker reports used by leadership for decision-making
  • Conducted A/B testing analysis to evaluate product and pricing strategies
  • Automated recurring reporting workflows, reducing manual effort by 25%
SQL, Tableau, Looker, Python, AWS Redshift

Projects

Featured Work

Enterprise Healthcare Analytics Platform

Architected a multi-cloud data platform supporting Medicaid and population health analytics. Processed multi-terabyte datasets across AWS, Azure, and GCP with real-time ingestion pipelines using Kafka and Kinesis.

Spark, Snowflake, BigQuery, AWS, Azure, GCP, Kafka, Terraform

FHIR Analytics Dashboard

Healthcare Data Project

Designed and built a comprehensive FHIR (Fast Healthcare Interoperability Resources) analytics platform with geospatial mapping, demographic breakdowns, and real-time healthcare data visualization for clinical decision support.

FHIR, HL7, Python, Spark, Snowflake, Tableau

Clinical Data Pipelines & Reporting

Built end-to-end healthcare data pipelines integrating EHR, claims, lab, and operational datasets. Migrated on-prem SQL Server workloads to Azure cloud and designed star schema models in Snowflake for enterprise BI.

PySpark, Azure ADF, Synapse, Snowflake, Airflow, HIPAA

Hospital Management Data System

Healthcare Data Project

Engineered data infrastructure for a hospital management system covering patients, lab test requests, appointments, and billing. Designed dimensional models and automated data quality checks for HIPAA-compliant processing.

Python, SQL, Airflow, Great Expectations, AWS, Redshift

Healthcare Analytics Dashboards

Cross-Functional Analytics

Developed the data layer powering healthcare analytics dashboards with KPI gauges, trend analysis, and operational metrics. Built reusable ETL frameworks adopted across multiple analytics teams.

Python, Spark, Redshift, Tableau, Looker, Airflow

Population Health Analytics Platform

Public Health Analytics

Built scalable data pipelines for population health analytics including demographics, clinical outcomes, and treatment patterns. Enabled regulatory reporting and quality metric transformations at scale.

PySpark, Snowflake, BigQuery, Kafka, CMS Reporting, HEDIS

Skills

Technologies & Expertise

Programming & Languages

Python, SQL (Advanced), Scala, PySpark, Bash

Big Data & Processing

Apache Spark, Databricks, AWS EMR, Azure Synapse, GCP Dataflow, Apache Kafka, AWS Kinesis

Cloud Platforms

AWS (S3, Redshift, Glue, EMR, Lambda), Azure (ADF, Synapse, ADLS), GCP (BigQuery, Cloud Storage, Dataflow)

Data Warehousing & Modeling

Snowflake, Amazon Redshift, BigQuery, Star & Snowflake Schema, Data Lake Architecture

Healthcare Data & Compliance

HL7 & FHIR Standards, HIPAA / PHI Governance, CMS Regulatory Reporting, Healthcare Claims (X12), Clinical Data Modeling, ICD Coding Concepts

DevOps & Orchestration

Apache Airflow, Azure Data Factory, Terraform (IaC), CI/CD Pipelines, Git / GitHub, Azure DevOps, Great Expectations

Contact

Let's Work Together

I'm always open to discussing new opportunities, data architecture challenges, or how to build better healthcare data platforms. Feel free to reach out.

Katy, Texas