AI Engineer · IIIT Nagpur 2027
Vishal Kumar

Building real AI systems, not just demos

I build production-oriented AI systems across multi-agent orchestration, RAG, LLM fine-tuning, production ML, and streaming inference. My strongest work combines research discipline with deployable software.

#1 / 823
AMD Slingshot AI Promptathon
Top 1.2%
HackerRank Orchestrate, 12,885 participants
Top 2.4%
Wunder Fund LOB Predictorum
0.9987 QWK
Clinical AI validation metric
5 live
Deployed AI/product systems
Selected work

Systems with measurable behavior.

Each project leads with the problem, then the engineering mechanism, then the proof. That is the order recruiters, founders, and research teams look for.

01

CrisisOps

Agents + RL

A procedural SRE training environment for cooperative LLM agents handling cascading microservice failures.

1,000 rollouts, 95.1% parse rate, reward improved from 0.404 to 0.445.
System: Primary and Buddy agent pair with tool use, schemas, automated judging, and reward shaping.
Why: Shows multi-agent systems, environment design, RL-style feedback loops, and deployment depth.
LangGraph · OpenEnv · GRPO · QLoRA · FastAPI
02

Triagegeist

Clinical ML

Emergency triage prediction system with hierarchical safety logic and clinical feature engineering.

OOF QWK 0.9987, ESI-1 recall 98.6%, 80K patient records.
System: Safety guardrail, 200+ engineered clinical features, and LightGBM/CatBoost/XGBoost ensemble.
Why: Signals research discipline: validation, bias awareness, high-stakes error handling, and metric literacy.
XGBoost · LightGBM · CatBoost · Production ML
03

LOB Predictorum

Streaming ML

Limit order book prediction system rebuilt after diagnosing batch-to-live degradation.

Top 2.4% globally, 0.96ms inference, 73x model size reduction.
System: Streaming-native BiGRU + TimeMixer ensemble, InfoNCE SSL pretraining, and INT8 ONNX quantization.
Why: Strong founder signal: found the real failure mode, pivoted the architecture, and optimized for deployment constraints.
PyTorch · BiGRU · TimeMixer · ONNX
04

WasteWatch AI

Civic AI

AI-native civic waste reporting platform with vision, fallback routing, voice input, and city-level workflows.

Live deployment, bilingual voice input, 12 integrations, routing for 100+ cities.
System: Gemini vision pipeline, Groq fallback, FastAPI services, voice interface, and operational API integrations.
Why: Shows product instincts: users, deployment, external APIs, reliability concerns, and civic impact.
Gemini · Groq · FastAPI · Voice AI
05

CivicPath

RAG + Voice

Non-partisan election assistant for Indian voters with multilingual guidance and ECI-aligned fact checks.

Google Cloud Run deployment, 10 Google services, 12+ Indian languages.
System: Gemini, Maps, Speech, localized retrieval, voice accessibility, and civic journey generation.
Why: Signals grounded assistant design for high-trust public information workflows.
Gemini · GCP · Maps · Speech
06

DevBoost AI

Dev tools

Agentic code analysis tool with explain, test, and fix modes for developer workflows.

Streaming responses, 12+ languages, structured output contracts.
System: React/Vite interface, FastAPI backend, file upload, export, session history, and deterministic analysis modes.
Why: Shows practical AI UX, prompt contracts, and workflow-focused product execution.
React · Vite · FastAPI · LangChain
Proof timeline

Competition signals that compound.

The important story is not one lucky ranking. It is repeated performance across prompt engineering, agents, clinical ML, and quantitative streaming inference.

May 2026
HackerRank Orchestrate

Built a grounded support triage agent using only the provided corpus, with evidence retrieval and auditable outputs.

Top 1.2% (#159 / 12,885)
2026
AMD Slingshot AI Promptathon

National prompt engineering and AI system design competition at REVA University, Bengaluru.

#1 / 823
2026
Wunder Fund LOB Predictorum

Quantitative AI challenge focused on limit order book prediction under live streaming constraints.

Top 2.4% (#120 / 4,917)
2026
ArtPark CodeForge - IISc Bangalore

Built under hackathon constraints with a focus on useful AI product execution.

1st runner-up
2026
Triagegeist Clinical AI

Clinical ML challenge with hierarchical modeling and high-stakes triage metrics.

OOF QWK 0.9987
Capabilities

A stack around applied AI systems.

The stack is grouped by the problems it solves rather than listed as a flat wall of tools.

Agents and LLM systems

Agent orchestration, retrieval, tool use, prompt contracts, evaluation, grounding, and model fallback.

LangGraph · LangChain · RAG · Gemini · Claude · Groq

Modeling and research

Clinical ML, time-series modeling, self-supervised learning, fine-tuning, ranking metrics, and ablations.

Python · PyTorch · XGBoost · LightGBM · QLoRA · ONNX

Production engineering

APIs, frontends, deployment, integrations, streaming inference, and pragmatic reliability tradeoffs.

FastAPI · React · Next.js · Docker · GCP · AWS
Additional builds

Range without diluting the core story.

These are useful secondary signals, but the primary narrative remains production AI systems.

About

Research depth plus shipping discipline.

I am a fourth-year B.Tech CSE student at IIIT Nagpur. My work sits at the intersection of agentic AI, RAG, reinforcement learning, production ML, and production engineering.

The pattern across my projects is deliberate: build a system, expose the failure modes, measure behavior, and deploy a usable version. That is why my strongest work includes live systems, competition rankings, documented model choices, and clear evaluation metrics.

I am looking for AI engineering internships, applied research roles, and founder-led teams where the work demands both technical depth and product judgment.

Build serious AI systems.

Open to internships, applied research collaborations, and teams working on agents, RAG, LLM evaluation, clinical AI, or production ML infrastructure.