I'm a Computer Engineering graduate from Pulchowk Campus, Tribhuvan University.
Most of my time goes into trying to understand how language models work on the inside — why a
safety-aligned model sometimes refuses perfectly harmless questions, and how the
geometry of a model's internal representations relates to what it has actually learned.
A lot of it is simply curiosity, following a question until it starts to make sense.
I care about making these tools useful for languages that are often left out. My family speaks
Maithili — around 50 million speakers, yet very little language technology exists for it —
so building a small model for it felt as personal as it was technical.
Alongside research I work on AI safety and agentic systems at Astha.ai,
and I hope to pursue a PhD in interpretability and alignment. I try to learn in the open: I write at Tatva,
share short explainers on YouTube, and keep a camera and a fondness for Mithila's stories and festivals close by.
If any of this overlaps with your own work, I'd be glad to hear from you.
Research Interests
- Mechanistic interpretability — representation trajectories, activation steering, circuits
- AI alignment & safety — over-refusals, safe deployment, agentic oversight
- Representation geometry — effective dimension, generalization, distribution shift
- Low-resource & multilingual NLP — models and benchmarks for underrepresented languages
- Agentic systems security — prompt injection, tool poisoning, Model Context Protocol (MCP)
Selected Publications
See my Google Scholar for the full list.
1. SafeConstellations: Mitigating Over-Refusals in LLMs Through Task-Aware Representation Steering.
ACL 2026 (Main Conference).
An inference-time, task-aware trajectory-shifting method that cuts over-refusals by up to 73%
with minimal utility loss and no retraining.
[paper]
2. On the Relationship Between Representation Geometry and Generalization in Deep Neural Networks.
Preprint, 2026 (sole author).
Shows that effective dimension, an unsupervised geometric metric, predicts generalization
across vision and language models (partial r = 0.75 over 52 classifiers).
[paper]
3. Geometric Phases of Mechanism Formation in Neural Networks.
GLOW 2026 — Workshop on Generalizing from Limited Resources in the Open World @ IJCAI 2026 (Poster).
Using linear probes and centered kernel alignment (CKA) across dense training checkpoints, finds that
classification mechanisms form output-layer-first and within the first ~5% of training:
output layers reach >70% probe accuracy by epoch 5 while input layers stay below 50% (Cohen's d = 3.68).
The same deep-first pattern holds in the first ~200M tokens of from-scratch LLM pretraining (GPT-2 Small,
SmolLM2-135M) and reproduces on public Pythia / OLMo-2 checkpoints — and isn't explained by gradient
magnitude (input layers receive up to 6.9× more gradient yet learn less). A temporal map of when
and where mechanisms emerge (CIFAR-10/100, decoder-only LLMs).
arXiv & code coming soon.
4. Can maiBERT Speak for Maithili?
LoResLM @ ACL 2026.
The first monolingual BERT for Maithili (~50M speakers); 87% accuracy on news classification,
outperforming MuRIL and NepBERTa.
[paper]
[model]
5. Revolutionizing Currency Security: A YOLOv8-Based Approach for Detecting Counterfeit Nepali Banknotes.
J. Bus. Econ. Stud., 2024.
[paper]
6. Machine Learning Analysis of Tirhuta Lipi.
2023.
0.97 accuracy in Tirhuta script recognition for OCR and translation of low-resource scripts.
[paper]
7. Support Vectors Are a Better Way of Text Classification for Imbalanced Data.
2023.
A robust SVC method for 100+ class text classification under severe imbalance.
[paper]
Preprints & Work in Progress
- Cross-lingual inference-time steering (devanagari-steering) —
transferring a model's Hindi competence onto sister Devanagari languages (Maithili, Nepali, Bhojpuri)
with no fine-tuning, directly extending SafeConstellations and maiBERT.
- Federated memory for AI agents (paper · server) —
long-term agent memory with transaction trails, versioning, and audit; evaluated on
LOCOMO and LongMemEval.
News
- 2026 — Geometric Phases of Mechanism Formation in Neural Networks accepted as a poster at GLOW 2026 @ IJCAI.
- 2026 — 🥉 Kaggle Competition Bronze Medal for BirdCLEF+ 2026 (rank 354 / 4084, top 8.7%). Working note submitted to LifeCLEF 2026 / CEUR-WS.
- 2026 — SafeConstellations accepted to ACL 2026 (Main).
- 2026 — maiBERT accepted to LoResLM @ ACL 2026.
- 2026 — Preprint on representation geometry and generalization released.
- 2024– — Working on AI-safety & agentic-systems research at Astha.ai (MCP-Scanner, SAFE-MCP).
Experience
- AI Researcher — Safety & Agentic Systems, Astha.ai
(2025–present) — Zero-Trust agent oversight, MCP-Scanner vulnerability platform, SAFE-MCP framework.
- AI Engineer — RAG & Infrastructure, AMNIL Technologies (2024–2025) —
guardrails, LLM-as-a-Judge evaluation, self-hosted LLM serving with vLLM.
- Data Team Lead, GradeUp Educations (2022–2024) — learning agents/chatbots,
an automated grade-evaluation system, and semantic-similarity matching.
- GAN Specialization Mentor, DeepLearning.AI (2021–present).
Selected Projects
- maiBERT — first BERT for Maithili (demo).
- SAFE-MCP — adversarial evaluation framework for MCP agent infrastructure.
- AgentGuard — Zero-Trust protocol for AI agents:
identity, policy, mTLS, audit (Python SDK + Go server).
- spiffe-core · TraT — SPIFFE-based agent identity / attestation
and Transaction Tokens for multi-agent workflows.
- sumit-mcp-server — federated memory MCP server (live on HF Spaces).
- Vibe-Coder — an agent that builds Streamlit/FastAPI apps.
- IRB Robotics Arm — open-source image-recognition robotic arm (UN SDG3).
Honors & Awards
- Winner, GritFeat AI Hackathon (2023) — SWIFT: wearable LSTM fall-detection for the elderly (79.86%).
- 1st Runner-Up, Locus Dataverse (2023) — NLP classification of imbalanced research-paper abstracts.
- 1st Runner-Up, Docsumo DataRush (2022) — abstract classification into 158 classes (SVC + TF-IDF).
- Best AI Project, DELTA 3.0 (2023) — Nepali Harvest: crop-disease prediction & harvest-time recommendations.
- Winner, IT-Meet Image Challenge (2022) — computer-vision classification of Nepali ballot-paper images.
- Winner, LogPoint Capture The Flag (2022) — binary exploitation & forensics.
Active Kaggle competitor — recent:
🥉 BirdCLEF+ 2026 (bronze medal, rank 354 / 4084),
ARC-AGI / NeuroGolf 2026 (minimal-cost ONNX networks),
and Scientific Image Forgery Detection (SAM-based).
Documents & Links
Notes & Lab Reports