SENIOR DATA ENGINEER · DUBAI

Building Production Data Platforms at Scale

6+ years architecting data lakehouses, ETL frameworks, and ML systems that power enterprise supply chain decisions. 50+ pipelines. 95%+ reliability. 10,000+ SKUs.

Godson Kurishinkal - Data Engineer

WHO I AM

Building Data Infrastructure That Matters

I'm not just a data engineer—I'm the person teams call when they need data they can trust, delivered fast.

With 6+ years in data engineering across UAE companies, I've built my career on turning messy data into reliable infrastructure. Currently at Landmark Group in Dubai, I build pipelines that power warehouse and delivery operations—the data infrastructure behind analytical and business intelligence reports that drive real decisions.

My work sits between raw chaos and actionable intelligence: 10,000+ SKUs flowing through pipelines I designed, 50+ ETL jobs that run while everyone sleeps, and a 3-tier anomaly detection system that catches problems before they become crises.

The thing I'm proudest of? Reducing data delivery from 4 hours to 30 minutes. That's not a vanity metric—it's the difference between same-day decisions and playing catch-up.

🏢 Landmark Group
Dubai, UAE
🎓 IIT Madras
BS Data Science
🎯 Next Goal
Cloud Platform Engineering

REAL IMPACT

Before & After Transformations

Numbers that moved the needle for supply chain operations

Data Delivery Speed

Before: 4 hours (manual extraction & Excel processing)
After: 30 minutes (automated Polars pipelines)
87% faster delivery
🔄

Data Freshness

Before: 48 hours (next-day reporting only)
After: 2-4 hours (same-day operational decisions)
90% fresher data
🛡️

Data Quality Incidents

Before: weekly incidents (reactive firefighting)
After: monthly (proactive anomaly detection)
500+ anomalies caught
6+
Years Experience
50+
ETL Pipelines
95%+
Reliability
15+
Fact Tables

WHAT I DO

Core Specializations

End-to-end data engineering for enterprise supply chain operations

Medallion Data Lakehouse

Bronze → Silver → Gold architecture with Hive-partitioned Parquet files. Designed for scalability, data lineage tracking, and incremental processing patterns that handle 10,000+ SKUs daily.
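As a sketch of how the Hive-style partition paths compose across layers (the `hive_partition_path` helper, the `data` root, and the `load_date` partition key are illustrative stand-ins, not the production scheme):

```python
from datetime import date
from pathlib import Path

def hive_partition_path(layer: str, table: str, load_date: date,
                        root: str = "data") -> Path:
    """Build a Hive-style partitioned path for one medallion layer,
    e.g. data/bronze/orders/load_date=2025-01-15."""
    return Path(root) / layer / table / f"load_date={load_date.isoformat()}"

# Each layer gets the same table/partition layout under its own prefix
bronze = hive_partition_path("bronze", "orders", date(2025, 1, 15))
silver = hive_partition_path("silver", "orders", date(2025, 1, 15))
print(bronze.as_posix())  # data/bronze/orders/load_date=2025-01-15
```

Because the partition key is encoded in the directory name, engines like Spark, Polars, and DuckDB can prune partitions and infer `load_date` as a column without reading file contents.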

ETL Pipeline Architecture

Configuration-driven pipelines with Abstract Base Classes for extractors, transformers, and loaders. 50+ production pipelines with 95%+ reliability, supporting FULL and INCREMENTAL load patterns.
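A minimal sketch of the ABC pattern described above; the class and method names (`Extractor.extract`, etc.) and the toy concrete stages are hypothetical stand-ins for the production interfaces:

```python
from abc import ABC, abstractmethod

class Extractor(ABC):
    @abstractmethod
    def extract(self) -> list[dict]: ...

class Transformer(ABC):
    @abstractmethod
    def transform(self, rows: list[dict]) -> list[dict]: ...

class Loader(ABC):
    @abstractmethod
    def load(self, rows: list[dict]) -> int: ...

class Pipeline:
    """Wire the three stages together; concrete classes are chosen via config."""
    def __init__(self, extractor: Extractor, transformer: Transformer, loader: Loader):
        self.extractor, self.transformer, self.loader = extractor, transformer, loader

    def run(self) -> int:
        rows = self.extractor.extract()          # E
        rows = self.transformer.transform(rows)  # T
        return self.loader.load(rows)            # L

# Toy concrete stages, just to show the composition
class StaticExtractor(Extractor):
    def extract(self):
        return [{"order_id": "A1", "amount": 10}, {"order_id": None, "amount": 5}]

class DropNullIds(Transformer):
    def transform(self, rows):
        return [r for r in rows if r["order_id"] is not None]

class CountLoader(Loader):
    def load(self, rows):
        return len(rows)

loaded = Pipeline(StaticExtractor(), DropNullIds(), CountLoader()).run()  # → 1
```

Swapping a JDBC extractor for an API extractor then touches only one class, which is what makes 50+ pipelines maintainable from shared code.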

Dimensional Modeling

Star schema design with 15+ fact tables and 6+ dimension tables. Slowly Changing Dimensions (SCD Type 2), conformed dimensions, and optimized for both analytical queries and reporting.
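A simplified sketch of the SCD Type 2 mechanics on plain dict rows; the `scd2_apply` helper and its bookkeeping columns (`is_current`, `start_date`, `end_date`) are illustrative, and production runs against warehouse tables rather than Python lists:

```python
from datetime import date

def scd2_apply(dim_rows, incoming, key, tracked, today=None):
    """Expire changed current rows and append new versions (SCD Type 2)."""
    today = today or date.today()
    by_key = {r[key]: r for r in incoming}
    result, seen = [], set()
    for row in dim_rows:
        new = by_key.get(row[key])
        seen.add(row[key])
        if row.get("is_current") and new and any(row[c] != new[c] for c in tracked):
            # Close out the old version, then append the new current version
            result.append({**row, "is_current": False, "end_date": today})
            result.append({**new, "is_current": True,
                           "start_date": today, "end_date": None})
        else:
            result.append(row)  # unchanged or historical rows pass through
    for k, new in by_key.items():
        if k not in seen:  # brand-new business key
            result.append({**new, "is_current": True,
                           "start_date": today, "end_date": None})
    return result
```

The key property: history is never updated in place, so a query filtered to any past date reconstructs the dimension exactly as it was then.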

Data Quality Engineering

3-tier validation framework: schema validation → business rule checks → statistical anomaly detection. Z-score and IQR-based outlier detection that's caught 500+ issues before they reached stakeholders.
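The statistical tier can be sketched with the standard library alone; the 3σ and 1.5×IQR thresholds below are the textbook defaults, not necessarily the production tuning:

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    mean, stdev = statistics.fmean(values), statistics.stdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]

def iqr_outliers(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

daily_orders = [100, 102, 98, 101, 99, 100, 103, 500]
print(iqr_outliers(daily_orders))  # [500]
```

Running both checks matters in practice: IQR is robust when the outlier itself inflates the standard deviation (as in the example above, where 500 drags the Z-score mean and spread enough to hide itself at a 3σ cutoff).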

ML-Powered Forecasting

15+ algorithm ensemble using ADI/CV² demand pattern classification. Prophet, XGBoost, and statistical methods automatically selected based on demand characteristics for optimal accuracy.
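The ADI/CV² routing can be sketched as follows, using the standard Syntetos-Boylan cutoffs (ADI 1.32, CV² 0.49); the `classify_demand` helper is an illustrative stand-in for the production selector:

```python
def classify_demand(series: list[float]) -> str:
    """Classify a demand series by ADI (average inter-demand interval)
    and CV² (squared coefficient of variation of nonzero demand)."""
    nonzero = [x for x in series if x > 0]
    if not nonzero:
        return "no_demand"
    adi = len(series) / len(nonzero)
    mean = sum(nonzero) / len(nonzero)
    var = sum((x - mean) ** 2 for x in nonzero) / len(nonzero)
    cv2 = var / mean ** 2
    if adi < 1.32:
        return "smooth" if cv2 < 0.49 else "erratic"
    return "intermittent" if cv2 < 0.49 else "lumpy"

print(classify_demand([10, 0, 0, 10, 0, 0, 10, 0, 0]))  # intermittent
```

Each quadrant then maps to a forecasting family, e.g. smooth demand to Prophet or classical smoothing, intermittent to Croston-style methods, and erratic or lumpy to tree-based models such as XGBoost.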

RPA & Legacy Integration

5+ Selenium/PyAutoGUI bots that extract data from systems without APIs. When there's no clean way in, I build one—automating what others think can't be automated.

HOW I BUILD

Configuration-Driven Pipelines

Real code from production systems—declarative configs that enable schema validation, quality checks, and flexible load modes

config.py Python
PIPELINE = {
    "load": {"mode": "FULL"},
    "paths": {"output": "data/landing/orders"},
    
    "schema": {
        "order_id": "string",
        "amount": "decimal(10,2)",
        "country": "string",
        "created_at": "timestamp"
    },
    
    "quality": {
        "required_columns": ["order_id", "amount"],
        "not_null": ["order_id"]
    }
}
ingestion.py Python
from config import PIPELINE, DB_CONFIG

# Database tables ingestion
for table in DB_CONFIG["tables"]:
    df = (
        spark.read
        .format("jdbc")
        .option("url", DB_CONFIG["url"])
        .option("dbtable", table)
        .load()
    )

    # Validate against config schema
    validate_schema(df, PIPELINE["schema"])
    check_quality(df, PIPELINE["quality"])

    # One landing path per table; overwrite on FULL loads
    df.write.mode("overwrite").parquet(
        f"{PIPELINE['paths']['output']}/{table}"
    )
Read Config → Validate Schema → Quality Check → Load Data

✓ Schema validation  •  ✓ Quality checks  •  ✓ FULL/INCREMENTAL modes
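The `validate_schema` and `check_quality` helpers referenced in ingestion.py might look like this simplified sketch; it operates on plain dicts rather than DataFrames, and the signatures are assumptions, not the production API:

```python
def validate_schema(columns: dict[str, str], expected: dict[str, str]) -> None:
    """Fail fast if extracted columns don't match the declared config schema."""
    missing = expected.keys() - columns.keys()
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    mismatched = {c: (columns[c], t) for c, t in expected.items()
                  if columns[c] != t}
    if mismatched:
        raise ValueError(f"Type mismatches (actual, expected): {mismatched}")

def check_quality(rows: list[dict], rules: dict) -> None:
    """Apply the 'quality' block of the pipeline config to extracted rows."""
    for col in rules.get("required_columns", []):
        if any(col not in r for r in rows):
            raise ValueError(f"Required column absent: {col}")
    for col in rules.get("not_null", []):
        if any(r.get(col) is None for r in rows):
            raise ValueError(f"Nulls found in: {col}")
```

Raising instead of logging is deliberate: a pipeline that halts on a bad batch is cheaper than a dashboard built on silently corrupted data.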

TOOLS I USE

Tech Stack

Production-tested technologies for building reliable data systems

Python SQL Apache Airflow Docker Git GitHub SQL Server Polars DuckDB Azure Data Factory Databricks Apache Kafka PySpark Microsoft Fabric Terraform

WHERE I'M HEADING

2026 Roadmap

Cloud certifications, leadership growth, and enterprise-scale platforms

🎓

Certifications

  • Fabric Data Engineer (DP-700) Q1
  • Databricks DE Associate Q2
  • Databricks DE Professional Q3
💼

Portfolio

  • 5+ detailed case studies Ongoing
  • Architecture diagrams Q1
  • Open-source ETL toolkit Q2
🚀

Career

  • Senior/Staff Data Engineer Q1-Q2
  • Cloud-native data platforms Ongoing
  • Technical leadership & mentoring Ongoing

🚀 Let's Connect

Follow along as I build in public. New case studies, learnings, and projects coming regularly.