Open to new roles

I build intelligent data & AI systems.

AI Engineer & Data Architect · Principal Engineer @ Autodesk

A decade in data engineering, now shipping generative AI in production — GPT-powered products, RAG pipelines, and vector-database knowledge bases. Based in Greater Montreal.

11+ years experience
8+ companies · 3 continents
$ cat about.md

# 01

I'm an AI Engineer and Data Architect with over a decade of experience designing data platforms and, more recently, putting generative AI into production — GPT-powered products, RAG pipelines, semantic index layers, and vector-database knowledge bases.

My career spans the full arc of modern data engineering: from enterprise Java systems in India and Dubai, through big-data platforms at CGI, Ericsson, and Intact, to advising customers as a Solutions Architect at Cloudera, leading the search product team at Sitecore, and architecting AI-driven data platforms at Databook. Today I'm a Principal Engineer at Autodesk.

I care about systems that are reliable as well as smart — data quality, observability, and LLM-Ops are as much a part of my toolkit as embeddings and prompts. I've also led and mentored teams of data engineers, because great platforms are built by great teams.

$ ls ~/projects

# 02
gpt-sales.md

GPT Sales Intelligence

Spearheaded a generative-AI product for strategic sales — enhancing customer engagement with LLM-driven insights.

GPT · GenAI
rag-index.md

RAG & Semantic Index

Designed Retrieval-Augmented Generation pipelines and a semantic index layer, improving AI-driven retrieval.

RAG · Embeddings
vector-kb.md

Vector KB & LLM-Ops

Managed a Pinecone vector database as an AI knowledge base and built LLM-Ops tooling for deployment.

Pinecone · LLM-Ops
ai-search.md

AI Search & Personalization

Led Sitecore's core Search team — AI-backed search and personalization for global e-commerce.

Search · Snowflake
data-platforms.md

Enterprise Data Platforms

Architected a data lake and warehouse for a global retailer (at Sitecore) and led historical data migrations to AWS (at Cloudera).

Data Lake · AWS
EN-RuleEngine

EN-RuleEngine

A Scala rule engine that validates streaming and batch data against rules defined in YAML.

Scala · Streaming
View on GitHub →

$ git remote → github.com/samsandeepmalik

$ git log --author=sandeep

# 03
Jul 2025 — Present · Montreal

Principal Engineer

Autodesk

Driving engineering excellence across data and AI initiatives as a principal individual contributor.

Dec 2022 — Jul 2025 · Montreal

Data Architect / Staff Engineer

Databook
  • Spearheaded a GPT solution for strategic sales products and designed production RAG pipelines.
  • Architected a semantic index layer and managed Pinecone as the AI knowledge base.
  • Built LLM-Ops, data-quality checks, and observability; led a team of data engineers on a new platform.
Mar 2022 — Dec 2022 · Montreal

Principal Data Engineer

Sitecore
  • Led the core Search product team — AI-backed search and personalization for e-commerce.
  • Architected Data Lake/Warehouse for a global retailer; built Snowflake marts and automated monitoring.
May 2021 — Mar 2022 · Montreal

Solutions Architect

Cloudera
  • Guided customers onto Cloudera Data Platform across public, private, and hybrid clouds.
  • Designed on-prem → AWS migration strategies, ran POCs, and trained customer engineering teams.
Apr 2020 — May 2021 · Toronto

Senior Data Engineer

Intact

Built and optimized large-scale batch data pipelines for one of Canada's largest insurers.

Sep 2019 — Apr 2020 · Montreal

Senior Data Engineer

Ericsson

Data engineering for telecom-scale platforms and analytics workloads.

Jul 2018 — Aug 2019 · Montreal

Senior Data Engineer

CGI

Delivered big-data engineering solutions for enterprise clients.

2014 — 2018 · India & Dubai

Software / Data Engineer

Synechron · Emirates · MothersonSumi Infotech (MIND)

Early career building enterprise systems — including onsite work for Emirates Airline in Dubai and core platforms (Master Data Management, KPI systems) with Java, Spring, and AngularJS.

$ cat education.txt

# 04
university

Manav Bharti University, Solan

Engineering — foundation in computer science and software engineering.

certifications

Big Data & Cloud

Cloudera CCA Spark and Hadoop Developer (CCA175), Certified Kubernetes Application Developer (CKAD), and AWS serverless workshops.

ongoing

Always Learning

Deep in the GenAI stack — LLM orchestration, agentic patterns, retrieval architectures, and evaluation.

$ ./say-hello.sh

Let's build something intelligent

I'm always open to conversations about AI engineering, data architecture, and interesting problems.