Harisankar.
Over ten years designing and delivering enterprise data platforms, AI-driven pipelines, and cloud-native analytics solutions across AWS, GCP, Snowflake, and on-premises infrastructure.
A decade building enterprise data platforms, and leading the teams that ship them.
I'm a data and AI engineering leader with over ten years of experience designing, building, and delivering enterprise-scale platforms for clients across telecom, IoT, biotech, retail, e-commerce, and GenAI. My work sits where architecture meets delivery: turning ambiguous, high-stakes problems into systems that hold up in production.
My technical range covers distributed processing with Spark and Kafka, the modern data stack (dbt, Great Expectations, DataHub), and multi-cloud architecture across AWS, GCP, Snowflake, and on-premises infrastructure, alongside production GenAI systems built on Retrieval-Augmented Generation, LangChain, and LLM integration.
Beyond the code, I own delivery end to end: shaping architecture, mentoring engineers, running quality audits, and keeping stakeholders aligned. Today I do this as a Senior Consultant at Thoughtworks, advising global enterprises on data platforms, GenAI, and cloud-native architecture.
AWS Solutions Architect – Associate
Red Hat Certified Engineer (RHCE)
A decade of delivery across telecom, IoT, biotech, retail, and GenAI.
Senior Consultant · Data Engineer
Advising and architecting enterprise data platforms for global retail and e-commerce clients — blending modern data stack practices (dbt, Great Expectations, DataHub) with Spark, Kafka, and multi-cloud analytics.
Projects at Thoughtworks (2)
Assortment Analytics Platform
A data-driven platform to analyse product assortment strategy using scraped data from a range of e-commerce platforms — structured for long-term flexibility and analytical scale.
- Data modelling with the Data Vault methodology for long-term scalability.
- Quarterly web-scraping pipelines feeding structured insights into BigQuery.
- Built an end-to-end framework to transfer data from BigQuery into Snowflake.
- Enabled data-driven assortment decisions through category gap analysis and white-space detection.
- Used Snowflake Cortex AI to build an agentic AI solution on top of the assortment data.
Enterprise Data Platform Modernisation
End-to-end implementation of an enterprise data platform — rebuilding pipeline reliability, data quality controls, and large-scale transformation workloads across Spark, Kafka, and Oracle.
- Data quality framework using Great Expectations integrated with DataHub.
- Developed ingestion frameworks for API, streaming, and RDBMS sources.
- Distributed Apache Spark jobs on AWS EMR; Kafka-based ingestion.
Lead · Voice & Cloud · SDE-III
Led cloud and analytics workstreams on Ayla's IoT platform, shipping telemetry pipelines, dashboards, and data products that powered connected devices for telecom and consumer clients worldwide.
Data Engineer · II
Part of Bayer Crop Science R&D — building big-data and cloud-based pipelines to centralise agricultural research data and generate insights for breeding and yield programmes.
Senior Implementation Engineer
Shipped large-scale telecom platforms — MDM systems, charging systems, and campaign management tooling — for carriers across MEA and APAC. The foundations of everything I build today started here.
Projects at 6D Technologies (3)
Notification Gateway / DWH
Transformed a national telecom operator's customer experience by migrating notifications from internal systems onto our platform, letting customers set custom notification preferences via the mobile app, CSRs and call centres. Owned requirement analysis, design, end-to-end deployment, testing and delivery.
- Partnered with demand analysts, business development managers and solution architects to design, analyse and implement customer requirements.
- Built a pub-sub model and data pipeline that collected structured and unstructured data from disparate systems into a data lake using Kafka streams.
- Collected notification trends and customer information into a DWH and analysed them to surface insights that improved business outcomes and customer experience.
- Supported an 18-node Cloudera cluster — NiFi & StreamSets for ingestion, Spark for transformation, Kafka for real-time streaming.
- Ran 5+ levels of testing onsite, including functional, regression, UAT, integration and performance testing.
- Designed a data model to manage user preferences; used MySQL, Oracle, BigQuery, Tableau, Power BI and Excel for data intelligence.
Digimate
A comprehensive messaging-service platform that integrates multiple SMS operators. Owned the design and end-to-end deployment, helping onboard 1500+ B2B/B2C customers and scale the business rapidly.
- Designed and deployed the platform end to end — integrating multiple SMS operators into a single solution.
- Helped integrate 1500+ B2B/B2C customers and expand the business rapidly.
- Developed Python / shell scripts and SQL stored procedures for project use cases.
- Addressed non-functional requirements: hardware support, load balancing and redundancy management.
- Deployed campaign management solutions onsite for leading stock-exchange and banking clients.
Unified Messaging Gateway
A messaging solution offering both prepaid and postpaid credit to B2B customers for sending SMS/USSD notifications. Responsible for testing, report design and onsite delivery.
- Delivered prepaid and postpaid credit messaging for B2B customers (SMS / USSD notifications).
- Designed reports and dashboards using Pentaho and Highcharts.
- Developed SQL procedures for processing; deployed through a CI/CD pipeline.
- Went onsite for deployment and completed UAT successfully.
Companies I've delivered for, across telecom, IoT, biotech, retail, and e-commerce.
Over the past decade I've had the opportunity to work with these organisations — building data platforms, pipelines, and analytics across their domains, both directly and through Thoughtworks consulting engagements.
Otto GmbH
Assortment Analytics Platform: Data Vault modelling, quarterly web-scraping pipelines into BigQuery, and a BigQuery-to-Snowflake transfer framework. See Featured Projects.
JCPenney
Enterprise Data Platform modernisation — data quality framework (Great Expectations + DataHub), Spark on AWS EMR, dbt ingestion, Kafka streams. See Featured Projects.
Ayla Networks
Helped build an industry-leading IoT platform — accelerating development, support, and enhancement of connected products at scale.
Bayer Crop Science
Part of the R&D data team, using big-data and cloud technologies to centralise the management of breeding and field data and extract insight from it.
Ooredoo Qatar
Developed an MDM platform collecting data from legacy systems, feeding a clean pipeline into the data warehouse for downstream analytics.
Bharti Airtel
Part of the product development team for AIRTEL DIGIMATE, a centralised campaign management platform. Owned data management and reporting.
Etisalat
Built an IoT Smart Living application for Etisalat Dubai — collecting data streams from multiple device sources for real-time analytics.
AIS Thailand
Built a charging system and designed interactive Pentaho dashboards tailored to AIS's customer and operational requirements.
Opinionated tools. Pragmatic choices.
Data & Big Data
- Apache Spark
- Kafka
- Airflow
- dbt
- Databricks
- Snowflake
- BigQuery
- AWS EMR
- DataHub
- Delta Lake
Cloud & Infra
- AWS
- Azure
- GCP
- Kubernetes
- Terraform
- Docker
- CI/CD pipelines
- On-prem & hybrid
GenAI & ML
- LLM integration
- RAG architectures
- LangChain
- Vector databases
- Prompt engineering
- Deep Learning
- ML algorithms
- Snowflake Cortex AI
LLM Training & MLOps
- Fine-tuning · LoRA / QLoRA
- Axolotl
- DeepSpeed
- FSDP
- MLflow
- Model deployment
Data Platform & Quality
- Enterprise data platforms
- Metadata management
- Great Expectations
- Hackolade (data modelling)
- Data Vault methodology
- API-driven integration
- Streaming & batch
Languages & DBs
- Python
- SQL · Advanced
- Scala
- Java
- PostgreSQL
- MongoDB
- Redis
Delivery & Leadership
- Project management
- Delivery management
- Team leadership
- Quality audits
- Stakeholder alignment
- Architecture reviews
- RFP reviews
- SOW preparation
- HLD & LLD design
Certifications
- Google Cloud Professional Data Engineer
- AWS Solutions Architect – Associate
- Red Hat Certified Engineer (RHCE)
Domains & IoT
- Retail
- E-commerce
- Telecom
- IoT platforms
- Biotech R&D
- Enterprise SaaS
- Amazon Alexa
- Google Assistant / Smart Home
Thoughts from the trenches of data engineering.
Deep dives on data platforms, GenAI systems, cloud architecture, and lessons learned shipping enterprise software. No fluff.
Beyond the API: A Data Engineer's Guide to the LLM Memory Ladder
What a data engineer learned trying to fine-tune an 8B model — the CUDA crash, the humbling, and the chain of trade-offs that decides whether a model will actually run on the hardware in front of you.
From BigQuery to Snowflake: What a Real-World Analytics Migration Actually Looks Like
The real story of migrating an analytics workload from BigQuery to Snowflake — not the clean version, but the actual trade-offs, surprises, and lessons from doing it inside a live enterprise.
Building an Enterprise Data Platform from Scratch: What I Learned Over 9 Months
Nine months, twenty thousand jobs, petabytes of data. Here's what I learned building an enterprise data platform from scratch — the architecture decisions, the human dynamics, and six lessons I'd tell myself at the start.