Data Engineer – AWS & Spark (Healthcare Data)
Source Meridian · Colombia
Job description
About the role
Source Meridian is seeking a Data Engineer to design and operate an AWS‑native data platform that processes healthcare claims and tokenized identifiers. You will work with large Parquet datasets on S3, build Spark pipelines, and ensure secure, privacy‑focused data handling.
Key responsibilities
- Build and maintain Spark (PySpark/Scala) pipelines for large‑scale Parquet data on S3.
- Implement tokenization workflows, including token conversion and dataset intersection.
- Process and deliver healthcare claims datasets with accurate identity mapping.
- Orchestrate pipelines using Apache Airflow and AWS‑native tools.
- Develop reliable, testable, and observable ETL/ELT processes with retries, idempotency, and monitoring.
- Optimize performance and cost of Spark jobs, S3 layout, and Athena queries.
- Contribute to dbt models for transformations, documentation, and data quality.
- Collaborate with cross‑functional stakeholders while maintaining strict privacy and security standards.
Required profile
- 1–2 years of professional data engineering experience.
- Strong hands‑on experience with Apache Spark (PySpark or Scala).
- Proficiency with the AWS data stack (S3 and Athena) and familiarity with Glue Catalog and Lake Formation.
- Experience building pipelines with Apache Airflow.
- Excellent SQL skills and solid data‑modeling fundamentals.
- Advanced English for technical discussions and documentation.
Required skills
- Apache Spark (PySpark, Scala)
- AWS S3
- Amazon Athena
- AWS Glue Catalog (Lake Formation optional)
- Apache Airflow
- SQL
- Data modeling
- dbt (nice to have)
- Tokenization and identity resolution concepts
- AWS security (IAM, KMS, encryption)
- Running Spark on EMR or Spark‑on‑containers