A decade in data engineering, now shipping generative AI in production — GPT-powered products, RAG pipelines, and vector-database knowledge bases. Based in Greater Montreal.
I'm an AI Engineer and Data Architect with over a decade of experience designing data platforms and, more recently, putting generative AI into production — GPT-powered products, RAG pipelines, semantic index layers, and vector-database knowledge bases.
My career spans the full arc of modern data engineering: from enterprise Java systems in India and Dubai, through big-data platforms at CGI, Ericsson, and Intact, to advising customers as a Solutions Architect at Cloudera, leading the search product team at Sitecore, and architecting AI-driven data platforms at Databook. Today I'm a Principal Engineer at Autodesk.
I care about systems that are reliable as well as smart — data quality, observability, and LLM-Ops are as much a part of my toolkit as embeddings and prompts. I've also led and mentored teams of data engineers, because great platforms are built by great teams.
Local-first income and expense tracker driven by a Claude tool-calling agent: chat with it, message it on WhatsApp, or snap a receipt. SQLite on your machine is the single source of truth; OCR and Google sync are opt-in.
Spearheaded a generative-AI product for strategic sales — enhancing customer engagement with LLM-driven insights.
Designed Retrieval-Augmented Generation pipelines and a semantic index layer, improving AI-driven retrieval.
Managed a Pinecone vector database as an AI knowledge base and built LLM-Ops tooling for deployment.
Led Sitecore's core Search team — AI-backed search and personalization for global e-commerce.
Architected a data lake and warehouse for a global retailer and led historical data migrations to AWS at Cloudera.
A Scala rule engine that validates streaming and batch data against rules defined in YAML.
View on GitHub: github.com/samsandeepmalik
Driving engineering excellence across data and AI initiatives as a principal individual contributor.
Built and optimized large-scale batch data pipelines for one of Canada's largest insurers.
Data engineering for telecom-scale platforms and analytics workloads.
Delivered big-data engineering solutions for enterprise clients.
Early career building enterprise systems — including onsite work for Emirates Airline in Dubai and core platforms (Master Data Management, KPI systems) with Java, Spring, and AngularJS.
Engineering — foundation in computer science and software engineering.
CCA-175 (Spark & Hadoop), Certified Kubernetes Application Developer (CKAD), and AWS serverless workshops.
Deep in the GenAI stack — LLM orchestration, agentic patterns, retrieval architectures, and evaluation.