Blog on Statistics and Data Science

Daily AQI Bulletins [2024]

The Central Pollution Control Board (CPCB) releases daily AQI data for each city. This data is analysed and presented here: Link
Here is a report based on this data: Link
Hyderabad Daily AQI Bulletins


PSL Extracts [2024]

NOAA's Physical Sciences Laboratory (PSL) hosts meteorological data from various reanalyses spanning many years. I extracted this data and visualised it.
TROPOMI NO2
GitHub


Vivekam Bot [2024]

Vivekam is a Twitter (X) bot that tweets the wisdom of Swami Vivekananda from his written works. I used to frequent the RK Math in Hyderabad and read a good deal of Swami Vivekananda. I am using this bot to re-read his works and broadcast his message of reason, love and inclusion.
GitHub


Samvidhan Bot [2024]

SamvidhanBot is a Twitter (X) bot that tweets wisdom from the Constitution of India and its makers. It was created on January 26, 2022. Later, Twitter became X, its APIs changed, and free servers were no longer available. SamvidhanBot is now hosted on my Raspberry Pi.
We always need content and newer ideas for the bot. Feel free to contribute.
GitHub | SamvidhanBot Blog


TROPOMI Extracts [2023]

TROPOMI (TROPOspheric Monitoring Instrument) is the satellite instrument on board the Copernicus Sentinel-5 Precursor (S5P) satellite. TROPOMI monitors trace gases and aerosols relevant for air quality and climate.
In this project, TROPOMI data for the pollutants NO2, SO2, HCHO and O3 is extracted via Google Earth Engine (GEE). Data is extracted for 104 airsheds (Indian cities) at 1000 m resolution and for India at 0.1-degree resolution.
TROPOMI NO2
GitHub
Plots and CSVs for 104 airsheds: NO2 | SO2 | HCHO | O3


Intelligent Data Solution - Disaster Risk Reduction (IDS-DRR) [2023]

40% of Assam experiences floods every year, and the government spends hundreds of crores to manage them. IDS-DRR aims to make flood management more evidence-driven by giving decision makers data on flood hazard, losses and damages, vulnerability and related factors. I worked predominantly on geospatial datasets, making them available for decision making.
My Talk at IndiaFOSS3.0 | Presentation for the talk | GitHub
Blog introducing the datasets
Blog on processing flood inundation maps from BHUVAN
Blog on Confirmatory Factor Analysis (CFA) to make sense of the data


Datafication of Indian court judgments [2023]

Legal researchers annotate court judgments to mark variables of interest. Working manually, they can annotate only a few judgments. I used NLP (from regex to ML models) to automate this process. About 300 judgments related to child rights have been datafied this way. Researchers use these datasets in their Empirical Legal Research (ELR).
My Talk at FOSS-U Hyderabad Meetup | My Talk at HydPy-Hyderabad | Presentation for the talk
GitHub | Medium Blog
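
To give a flavour of the regex end of that spectrum, here is a minimal sketch in Python. It is not the project's actual code: the sample text, the patterns and the `extract_variables` helper are all hypothetical, and real judgments are far less regular than this.

```python
import re
from datetime import date

# Hypothetical snippet written for illustration; not from a real judgment.
SAMPLE = (
    "Criminal Appeal No. 123 of 2019\n"
    "Date of decision: 14-03-2020\n"
    "The appellant was convicted under Section 12 of the Example Act "
    "and Section 34 of the Sample Code.\n"
)

# Assumed patterns; a real corpus would need many more variants.
CASE_NO = re.compile(
    r"(?P<kind>[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*) No\. "
    r"(?P<number>\d+) of (?P<year>\d{4})"
)
DECIDED = re.compile(r"Date of decision:\s*(\d{2})-(\d{2})-(\d{4})")
SECTION = re.compile(r"Section (\d+[A-Z]?) of the ([A-Z]\w*(?: [A-Z]\w*)*)")


def extract_variables(text):
    """Turn one judgment's text into a flat record of variables."""
    record = {"case_type": None, "case_number": None, "year": None,
              "decision_date": None, "sections": []}
    case = CASE_NO.search(text)
    if case:
        record["case_type"] = case.group("kind")
        record["case_number"] = int(case.group("number"))
        record["year"] = int(case.group("year"))
    decided = DECIDED.search(text)
    if decided:
        day, month, year = (int(part) for part in decided.groups())
        record["decision_date"] = date(year, month, day).isoformat()
    # Each hit is a (section number, statute name) pair.
    record["sections"] = SECTION.findall(text)
    return record


row = extract_variables(SAMPLE)
```

Each judgment becomes one row like `row`, so a folder of judgments becomes a table that researchers can filter and count. Where the text is too irregular for patterns like these, ML models take over.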


Data-driven environment monitoring at Green War Room - Delhi Government [2022]

Delhi's air pollution needs no introduction. I automated the environment monitoring processes in the Green War Room of the Delhi Government. This enabled the Environment Secretary to make more data-informed decisions on managing air quality in Delhi.
Green Delhi Dashboard | Medium Blog | Automating Impact
Mention of this project in the Economic Survey of Delhi


How deprived are Indian districts? [2020]

This is the first dashboard I built, for a weekend hackathon at the Indian School of Business (ISB). We analysed Mission Antyodaya 2019 data to create a "Deprivation Index" for each district. Machine Learning was not required, but we implemented basic K-Means Clustering to sound fancy. We won the hackathon.
Antyodaya Dashboard | GitHub
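
For the curious, the idea can be sketched in a few lines of Python. This is an illustration only, not the hackathon code: the districts and indicator values are made up, the index here is just the average min-max-scaled shortfall across indicators, and `kmeans_1d` is a bare-bones K-Means on that single number.

```python
def deprivation_index(rows):
    """rows maps district -> indicator values, where higher means better access.

    Returns district -> index in [0, 1]; higher means more deprived.
    """
    names = list(rows)
    n_ind = len(rows[names[0]])
    lo = [min(rows[d][j] for d in names) for j in range(n_ind)]
    hi = [max(rows[d][j] for d in names) for j in range(n_ind)]
    index = {}
    for d in names:
        gaps = []
        for j in range(n_ind):
            span = hi[j] - lo[j]
            scaled = (rows[d][j] - lo[j]) / span if span else 0.0
            gaps.append(1.0 - scaled)  # shortfall from the best district
        index[d] = sum(gaps) / n_ind
    return index


def assign(values, centroids):
    """Label each value with the position of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values]


def kmeans_1d(values, k, iters=100):
    """Plain K-Means on one dimension with a deterministic start."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(values, centroids), centroids


# Made-up districts with two made-up indicators (percent coverage).
districts = {
    "A": [95, 90], "B": [90, 85],
    "C": [60, 55], "D": [55, 60],
    "E": [20, 25], "F": [25, 20],
}
index = deprivation_index(districts)
labels, centroids = kmeans_1d(list(index.values()), k=3)
groups = dict(zip(index, labels))
```

With one index per district, sorting or fixed cut-offs would give the same grouping, which is why the clustering step was optional.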