Data Engineer – AWS & Spark (Healthcare Data)
Source Meridian · Colombia
Job description
About the role
Source Meridian is seeking a Data Engineer to design and operate an AWS‑native data platform that processes healthcare claims and tokenized identifiers. You will work with large Parquet datasets on S3, build Spark pipelines, and ensure secure, privacy‑focused data handling.
Key responsibilities
- Build and maintain Spark (PySpark/Scala) pipelines for large‑scale Parquet data on S3.
- Implement tokenization workflows, including token conversion and dataset intersection.
- Process and deliver healthcare claims datasets with accurate identity mapping.
- Orchestrate pipelines using Apache Airflow and AWS‑native tools.
- Develop reliable, testable, and observable ETL/ELT processes with retries, idempotency, and monitoring.
- Optimize performance and cost of Spark jobs, S3 layout, and Athena queries.
- Contribute to dbt models for transformations, documentation, and data quality.
- Collaborate with cross‑functional stakeholders while maintaining strict privacy and security standards.
Required profile
- 1–2 years of professional data engineering experience.
- Strong hands‑on experience with Apache Spark (PySpark or Scala).
- Proficiency with the AWS data stack (S3 and Athena) and familiarity with Glue Catalog/Lake Formation.
- Experience building pipelines with Apache Airflow.
- Excellent SQL skills and solid data‑modeling fundamentals.
- Advanced English for technical discussions and documentation.
Required skills
- Apache Spark (PySpark, Scala)
- AWS S3
- Amazon Athena
- AWS Glue Catalog (Lake Formation optional)
- Apache Airflow
- SQL
- Data modeling
- dbt (nice to have)
- Tokenization and identity resolution concepts
- AWS security (IAM, KMS, encryption)
- Running Spark on EMR or on containers
Posted 5 hours ago
Expires in 1 month