NITK • SEM 2 • COURSEWORK
Welcome

Distributed Data Management

An interactive showcase of data parsing, distributed databases, exploratory analysis, and spatial modeling.

Course Overview

📊

Distributed Operations

Managing data across tabular, relational, and document schemas using Pandas, SQLite, and MongoDB.

🔍

Exploratory Analysis

Pre-processing pipelines that clean the data, surface hidden patterns, and prepare datasets for processing at scale.

🤖

Hotspot Modeling

Locating hotspots in road accident data with K-Means clustering and kernel density estimation (KDE).

Assignments Showcase

Lab 01
Database & Analysis Pipeline

Accident Hotspot Analysis

A comprehensive lab covering data parsing, SQL and NoSQL databases, exploratory data analysis (EDA), and integrated spatial hotspot analysis.

Lab 02
Distributed Computing

Big Data with Hadoop

Scaling accident analysis to millions of records using a Docker-based Hadoop cluster and MapReduce parallel processing.
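As a rough illustration of the MapReduce pattern mentioned above, the sketch below counts accidents per district in plain Python. It runs on a single machine and only mimics the map, shuffle, and reduce stages; it is not the lab's Hadoop job. The `district` field and the sample records are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (key, 1) pair per record; `district` is a hypothetical field."""
    yield record["district"], 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample input, not taken from the lab dataset.
records = [{"district": "A"}, {"district": "B"}, {"district": "A"}]
counts = run_job(records)
```

On a real cluster the map and reduce functions would run in parallel across nodes, with the framework handling the shuffle; the logic per key stays the same.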

Theory 01
Academic Report

End-to-End Data Solutions

A comprehensive digital book covering modern distributed architectures, including the shift from Lambda to Kappa, and MLOps strategies.

Theory 02
Algorithm Survey

DDM Algorithms Showcase

An interactive survey of distributed aggregation protocols, featuring TAG, Push-Sum, and our proposed Adaptive Gossip-Hierarchy hybrid.
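For readers new to gossip-based aggregation, here is a minimal sketch of Push-Sum estimating a network-wide average. It assumes synchronous rounds, a fully connected network, and uniformly random peer selection; the function name and parameters are illustrative and are not taken from the showcase itself.

```python
import random


def push_sum(values, rounds=100, seed=0):
    """Estimate the average of `values` at every node via Push-Sum gossip."""
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]  # per-node running sum
    w = [1.0] * n                   # per-node running weight
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)    # random peer (may be the node itself)
            half_s, half_w = s[i] / 2, w[i] / 2
            # Keep one half locally and push the other half to the peer.
            new_s[i] += half_s
            new_w[i] += half_w
            new_s[j] += half_s
            new_w[j] += half_w
        s, w = new_s, new_w
    # Each node's estimate is the ratio of its sum to its weight.
    return [si / wi for si, wi in zip(s, w)]


estimates = push_sum([10, 20, 30, 40])
```

Because each node only splits and forwards its own mass, the total sum and total weight are conserved across rounds, which is why every node's ratio drifts toward the true average (25 for the sample input).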