Vision & Values

Current Focus

Karma-Čintana means "reflection on action" in Sanskrit. This site is dedicated to my reflections on the work I have done in pursuit of knowledge and in providing value to society at large.

I am a Computer Science student at the University of Minnesota, interested in generative AI models, distributed systems, and formal language and automata theory. I am also interested in machine learning, though mainly in how its algorithms apply to linguistics and deep learning models. I am fascinated by the nature of information, by linguistics, and by how best to process natural language in intelligent software.

I am also the President of the UMN AI Club, fostering a diverse community to explore our world's greatest frontier: simulated intelligence.

Outside of work, I love to read Indian history and am a passionate writer. I play the guitar and love tennis 🎾.

Latest Publications

How Much Labeled Data Does a GUI Agent Need? Data Scaling Analysis for Inverse Dynamics Models on macOS Workflows
Maanas Taneja, Adil Arya, Purab Shingvi, Ayman Siddique, Shardul Mehal
In Preparation, 2026. GitHub
Stealing Creator's Workflow: A Creator-Inspired Agentic Framework with Iterative Feedback Loop for Improved Scientific Short-form Generation
Jong Inn Park, Maanas Taneja, Qianwen Wang, Dongyeop Kang
arXiv:2504.18805, 2025. arXiv
Prompt Optimization as a State-Space Search Problem
Maanas Taneja
arXiv:2511.18619, 2025. arXiv GitHub
GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models
Maanas Taneja, Purab Shingvi, James Mooney
arXiv:2601.04719, 2026. arXiv GitHub

Experience

Cisco XDR Software Engineering Intern
May 2025 – Aug 2025 | Atlanta, GA
Designed an autonomous AI agent using LangGraph to process 200+ daily customer support tickets by executing diagnostic queries across large-scale data lakes. Developed a Django/PostgreSQL microservices backend with Duo SSO and a React dashboard on AWS EKS. Filed a patent for the system.
Minnesota NLP Lab Undergraduate Researcher
Jan 2025 – Present | Minneapolis, MN
Co-authored SciTalk, a multi-agent framework for scientific video generation. Designed iterative feedback loops using LLaVA-NeXT-Video 34B and built a modular pipeline with 6 GPT-4o agents to extract and ground videos in source materials.
Cisco Webex Software Engineering Intern
May 2024 – Aug 2024 | San Jose, CA
Designed and deployed an LLM-powered evaluation pipeline for the Cisco Control Hub AI Assistant, improving evaluation accuracy by 40%. Cut generation costs by 50% by introducing Redis-based caching infrastructure.
Mohenjo Daro Co-Founder
Sept 2024 – Present | San Francisco, CA
Launched a decentralized media startup aggregating underrepresented voices; scaled to 500+ monthly readers. Led end-to-end architecture (Next.js, FastAPI, PostgreSQL) and cloud infrastructure (AWS, Terraform).
CloudQuest Pvt Ltd. Software Engineering Intern
May 2023 – July 2023 | Gurugram, IN
Engineered a cross-platform video player for India's first cloud gaming service using C++, FFmpeg, OpenGL, and WebRTC. Reduced video latency by 120% and improved quality 4x by designing a custom decoder. Ported the client to WebAssembly.

Projects

ShotParser
Python, CLIP, OpenCV, Whisper, Gemini, React
Automated video analysis pipeline that segments footage into individual shots with full cinematographic metadata. Uses a custom deterministic multi-pass algorithm combining CLIP-based similarity voting, optical flow, and color histograms for scene boundary detection, avoiding neural network black boxes so that every split is traceable and debuggable. Integrates OpenAI Whisper for per-shot dialogue transcription and Gemini 2.5 Pro for visual descriptions. Outputs structured JSON with camera angle, lens type, motion, and an entity registry for consistent character/location tracking across shots. GitHub
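As a rough illustration of the color-histogram signal mentioned above, here is a minimal sketch of histogram-based cut detection. It is not ShotParser's actual algorithm: the function names, the bin count, and the threshold are assumptions made for this example, and the CLIP voting and optical flow passes are left out entirely.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Per-channel histogram of an HxWx3 uint8 frame, normalized to sum to 1."""
    per_channel = [
        np.histogram(frame[..., c], bins=bins, range=(0, 256))[0] for c in range(3)
    ]
    hist = np.concatenate(per_channel).astype(np.float64)
    return hist / hist.sum()

def detect_cuts(frames, threshold=0.5):
    """Return every index i at which frame i appears to start a new shot.

    The threshold is a hypothetical value; the L1 distance between two
    normalized histograms always lies in [0, 2].
    """
    cuts = []
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        if np.abs(cur - prev).sum() > threshold:
            cuts.append(i)
        prev = cur
    return cuts

# Synthetic clip: three dark frames followed by two bright ones.
dark = np.full((4, 4, 3), 20, dtype=np.uint8)
bright = np.full((4, 4, 3), 230, dtype=np.uint8)
frames = [dark, dark, dark, bright, bright]
cuts = detect_cuts(frames)
```

Because the decision is a plain threshold on a distance, each reported cut can be traced back to the two histograms that triggered it, which is the kind of debuggability the description emphasizes.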
GPU-Accelerated KV Cache Quantization
CUDA, C++, OpenMP
Implemented INT8 KV-cache quantization for transformer inference, demonstrating ~200× to ~1700× speedup over CPU baselines. Developed optimized CUDA kernels (Naive, Tiled, Coarsened, Vectorized) for memory-bound workloads and benchmarked across realistic LLM dimensions (up to 131k context) with strict correctness verification ($< 10^{-6}$ error). GitHub
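For readers unfamiliar with the technique, the following is a small CPU-side sketch of symmetric INT8 quantization in NumPy. It is only an illustration under assumptions: the per-row scale granularity and the function names are choices made for this example, and the project's CUDA kernels may organize scales and memory access differently.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric INT8 quantization with one float scale per row (an assumed granularity)."""
    max_abs = np.abs(x).max(axis=-1, keepdims=True)
    # Rows that are entirely zero get scale 1.0 to avoid dividing by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 127.0).astype(np.float32)
    q = np.clip(np.rint(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Map INT8 codes back to approximate float32 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
keys = rng.standard_normal((4, 64)).astype(np.float32)  # stand-in for a slice of the key cache
q, scale = quantize_int8(keys)
restored = dequantize_int8(q, scale)
max_err = float(np.abs(restored - keys).max())
```

The INT8 codes take a quarter of the float32 storage, at the cost of a rounding error of at most half a scale step per element.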
In-Memory Vector Knowledge Base
SQLite, FAISS, LangChain
A lightweight vector database built with SQLite and FAISS for agent memory and rapid prototyping. Created at Cisco to bypass infrastructure constraints, it provides a zero-dependency knowledge store supporting semantic chunking, category-based retrieval, and LLM-powered relevance filtering. Plug-and-play with LangChain, handling both in-memory and persistent storage for small to medium-scale knowledge bases. GitHub
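The idea of pairing SQLite storage with similarity search and category-based retrieval can be sketched with the standard library alone. This is not the project's code: the class name, table schema, and toy embeddings are hypothetical, and a brute-force cosine scan stands in for the FAISS index.

```python
import json
import math
import sqlite3

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical store: ':memory:' keeps data in RAM, a file path persists it."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS chunks ("
            "id INTEGER PRIMARY KEY, category TEXT, text TEXT, embedding TEXT)"
        )

    def add(self, text, embedding, category="general"):
        self.db.execute(
            "INSERT INTO chunks (category, text, embedding) VALUES (?, ?, ?)",
            (category, text, json.dumps(embedding)),
        )
        self.db.commit()

    def search(self, query, k=3, category=None):
        """Return up to k (score, text) pairs, optionally restricted to one category."""
        sql, params = "SELECT text, embedding FROM chunks", ()
        if category is not None:
            sql, params = sql + " WHERE category = ?", (category,)
        rows = self.db.execute(sql, params).fetchall()
        scored = [(cosine(query, json.loads(emb)), text) for text, emb in rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

store = TinyVectorStore()
store.add("restart the ingest service", [1.0, 0.0, 0.0], category="runbook")
store.add("rotate the API key", [0.0, 1.0, 0.0], category="runbook")
store.add("quarterly roadmap", [0.9, 0.1, 0.0], category="planning")
hits = store.search([1.0, 0.0, 0.0], k=1, category="runbook")
```

A linear scan like this is only reasonable for small collections; swapping in an approximate-nearest-neighbor index is what makes the same interface usable at medium scale.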
Media Library
C++, FFmpeg libav
A work-in-progress media manipulation API written in C-flavoured C++ using the FFmpeg libav libraries, currently capable of encoding, decoding, remuxing, and transcoding files through a simple interface. The library can also capture and process RTP stream packets for use in video streaming applications. The goal is a library for editing videos programmatically, sort of like Premiere Pro but without the UI. GitHub
Ray Tracer
C++, CUDA
A work-in-progress path tracer based on Peter Shirley's book. It currently uses a naive brute-force ray tracing algorithm to simulate Lambertian, metallic, and transparent materials, and is being ported to CUDA for large performance gains through parallelism. GitHub
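To show what "brute force" means here, the sketch below tests one ray against every sphere in the scene and keeps the nearest hit. It is written in Python for brevity rather than the project's C++, and the function names and scene layout are assumptions made for this example.

```python
import math

def hit_sphere(origin, direction, center, radius, t_min=1e-4):
    """Smallest ray parameter t > t_min at which the ray hits the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    half_b = sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = half_b * half_b - a * c
    if disc < 0:
        return None  # the ray misses the sphere entirely
    root = math.sqrt(disc)
    for t in ((-half_b - root) / a, (-half_b + root) / a):
        if t > t_min:
            return t
    return None

def closest_hit(origin, direction, spheres):
    """Brute force: test every sphere and return (t, index) of the nearest hit, or None."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best

# Two spheres along the -z axis; the second one is closer to the camera.
spheres = [((0.0, 0.0, -5.0), 1.0), ((0.0, 0.0, -2.0), 0.5)]
hit = closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), spheres)
```

Every ray performs this loop independently of every other ray, which is why the workload maps naturally onto one GPU thread per pixel or per sample.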