Role: Data Architect - GCP
📍 Location: Remote (Canada)
Duration: 6-12+ month contract
Note: Candidates must be authorized to work in Canada.
Job Description:
• Build data pipelines required for optimal extraction, anonymization, and transformation of data from a wide variety of data sources using SQL, NoSQL, and AWS ‘big data’ technologies.
• Build both streaming and batch pipelines (a minimal pipeline sketch follows this list).
• Work with stakeholders, including Product Owners, Developers, and Data Scientists, to assist with data-related technical issues and support their data infrastructure needs.
• Ensure that data is secure and separated in accordance with corporate compliance and data governance policies.
• Take ownership of existing ETL scripts; maintain them and rewrite them in modern data transformation tools whenever needed.
• Be an automation advocate for data transformation, cleaning, and reporting tools.
• You are proficient in developing software from idea to production
• You can write automated test suites in your preferred language
• You have frontend development experience with frameworks such as React.js/Angular
• You have backend development experience building and integrating with REST APIs and databases, using frameworks such as Spring (Java), Node.js (JavaScript), and Flask (Python)
• You have experience with cloud-native technologies such as Cloud Composer, Dataflow, Dataproc, BigQuery, GKE, Cloud Run, Docker, Kubernetes, and Terraform
• You have used cloud platforms such as Google Cloud/AWS for application hosting
• You have used and understand CI/CD best practices with tools such as GitHub Actions, GCP Cloud Build
• You have experience with YAML and JSON for configuration
• You are up to date on the latest trends in AI technology
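
For illustration only, a minimal sketch of the kind of batch pipeline described above, written with the Apache Beam Python SDK (listed under Great-to-haves below). The in-memory records, the user_id field, and the hash-based anonymization are hypothetical placeholders, not this team's actual method:

    # Minimal, illustrative batch pipeline sketch (Apache Beam Python SDK).
    # All records, field names, and the anonymization scheme are hypothetical.
    import hashlib

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def anonymize(record):
        # Replace a direct identifier with a one-way hash before downstream use.
        record = dict(record)
        record["user_id"] = hashlib.sha256(record["user_id"].encode()).hexdigest()
        return record

    def run():
        # Default options run locally; on GCP the same pipeline could target
        # the Dataflow runner instead.
        with beam.Pipeline(options=PipelineOptions()) as pipeline:
            (
                pipeline
                | "Read" >> beam.Create([{"user_id": "alice", "amount": 42}])  # stand-in source
                | "Anonymize" >> beam.Map(anonymize)
                | "Write" >> beam.Map(print)  # stand-in sink; e.g. BigQuery in practice
            )

    if __name__ == "__main__":
        run()

The same Beam code can serve both batch and streaming work by swapping the bounded stand-in source for an unbounded one such as Pub/Sub.
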
Great-to-haves:
• 3+ years of experience as a Data or Software Architect
• 3+ years of experience in SQL and Python
• 2+ years of experience with ELT/ETL platforms (Airflow, dbt, Apache Beam, PySpark, Airbyte); see the orchestration sketch after this list
• 2+ years of experience with BI reporting tools (Looker, Metabase, QuickSight, Power BI, Tableau)
• Extensive knowledge of the Google Cloud Platform, specifically the Google Kubernetes Engine
• Experience with GCP cloud data-related services (Dataflow, GCS, Datastream, Data Fusion, Data Application, BigQuery, Dataproc, Dataplex, Pub/Sub, Cloud SQL, Bigtable)
• Experience in the health industry is an asset
• Expertise in Python and Java
• Interest in PaLM, LLM usage, and LLMOps
• Familiarity with LangFuse or Backstage plugins or GitHub Actions
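
As a companion to the ELT/ETL platform bullet above, a minimal sketch of an Apache Airflow DAG. It assumes Airflow 2.4+, and the dag_id, schedule, and task bodies are hypothetical placeholders:

    # Minimal, illustrative Airflow DAG sketch (assumes Airflow 2.4+).
    # dag_id, schedule, and task logic are hypothetical placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        # Stand-in for pulling rows from a source system.
        print("extracting")

    def transform():
        # Stand-in for cleaning and reshaping the extracted rows.
        print("transforming")

    with DAG(
        dag_id="example_elt",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        extract_task >> transform_task  # run extract before transform

In a production DAG the Python callables would typically be replaced by provider operators for the platforms above, for example dbt runs or Dataflow job triggers.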