Harshit Dhankhar

Hey there, I am a first-year Master's student in Computer Science and Engineering (AI concentration) at the University of California San Diego (UCSD). I graduated with a Bachelor's degree in Mathematics and Computing from the Indian Institute of Technology (IIT), Patna in May 2025.

I have a keen interest in the multi-dimensional applications of AI and Reinforcement Learning and also love to research the algorithmic designs of existing methodologies.

My current research interests were shaped by my time as an undergraduate researcher, through interdisciplinary projects based primarily on Deep Learning, AI and Reinforcement Learning: optimal routing for autonomous EVs, CO2 forecasting, job scheduling using indexable policies in a Multi-Armed Bandit (MAB) setting, qubit routing, and Natural Language Processing for Machine Translation (MT) systems.

I completed my Bachelor's thesis, co-supervised by Prof. Asif Ekbal and Dr. Y.M. Tripathi. I spent the summer of 2024 working at LAMSADE, the AI lab of Université Paris Dauphine - PSL under Prof. Tristan Cazenave. Prior to this, I had collaborated with Dr. Tejas Bodas at Computer Systems Group (CSG), IIIT Hyderabad for a year, which shaped my inchoate research interests during sophomore year.

Please feel free to check out my Resume, and drop me an email at [hdhankhar@ucsd.edu] anytime to talk! :)

~ Email | Resume | Google Scholar | Github | LinkedIn ~

Sept '25	Started MS in Computer Science at UC San Diego.
Aug '25	Received my undergrad degree at the 12th convocation of IIT Patna.
June '25	Attended the Microsoft Research Academic Summit held in Bangalore, India.
May '25	Paper "Tabular and Deep Reinforcement Learning for Gittins Index" presented at WiOpt 2025 held in Linköping, Sweden.
Feb '25	Started working as AI Engineer at AnyFeast, London.
Feb '25	Got into UCSD and NYU Courant for MSCS starting Fall '25.
Dec '24	Submitted my work on leveraging quality estimation models for feedback training in MT to ACL Rolling Review.
Dec '24	Attended the Inter IIT Tech Meet 13.0 at IIT Bombay from 11th to 14th December! Got to lead a Problem Statement by Dream11.
Dec '24	My work on AI-based agents for quantum circuit routing got accepted at AAAI '25 (QC+AI).
Aug '24	Submitted my work on RL-based learning of indexable policies to AAAI.
June '24	Presented our work on learning Gittins index online at the RL4SN workshop held in Toulouse, France.
May '24	Submitted work on Deep-RL based EV routing to IEEE Transactions on Reliability.
May '24	Started my summer internship at PSL Dauphine, Paris.
Nov '23	Paper accepted at the IEEE International Conference on Big Data, Italy.
Oct '23	Organised IIT Patna's chapter of TEDx.
May '23	Started my summer internship at IIITH.
Feb '23	Won silver medal at IIT Tech Meet 11.0.
Oct '22	Joined uni's ML society as sophomore coordinator.
Aug '22	Runner-up at Smart India Hackathon (SIH), held in Visakhapatnam.
Nov '21	Started my undergrad journey at IIT Patna.

University of California San Diego
Master of Science in Computer Science and Engineering
Fall 2025 - Expected Spring 2027

Indian Institute of Technology Patna
Bachelor of Science in Mathematics and Computing
Nov '21 - Jul '25 | CPI: 8.34/10.0

Student Societies:

Senior-year Technical Secretary | Technical Council, Students' Gymkhana
Overall Organiser | TEDxIIT Patna 2023
Sophomore year coordinator (Sponsorship and Marketing) | Celesta 2022: Annual Technical Festival
Sophomore year coordinator | NJACK ML: IIT Patna's ML society
Sophomore year coordinator (Media and Public Relations) | Anwesha 2022: Annual Cultural Festival
Sophomore year coordinator (Planning and Curation) | TEDxIIT Patna 2022
Senior content correspondent | Forthright: Media Body, IIT Patna

AI Engineer | AnyFeast, London
Feb '25 - Aug '25
• Deployed a zero-touch ETL pipeline that automates recipe attribute extraction using agentic AI, optimizes local ingredient retrieval via a user-proximity-based caching algorithm, and enriches recipe instructions with a LangChain-powered RAG module.
• Implemented a recommender system using a hybrid filtering approach to personalize users' feeds, increasing user click-through rate by over 30%.

AI Lab Research Intern | LAMSADE, AI Lab @ PSL Dauphine
May '24 - Oct '24
Advisor: Prof. Tristan Cazenave
Worked on devising Monte Carlo Tree Search based methods to solve the qubit routing problem in quantum circuits. Leveraged Nested Monte Carlo Search to propose an algorithm that beats state-of-the-art methods by up to 10.55% in circuit depth. Our algorithm also shows a 37.17% runtime advantage over other tree-based solutions.

Research Intern | Computer Systems Group (CSG) @ IIITH
Dec '22 - Dec '23
Advisor: Dr. Tejas Bodas
Devised a novel two time-scale algorithm based on Q-learning to learn indexable policies for Multi-Armed Bandits (MABs) and explored its applications in job scheduling.

Research Affiliate | AI-NLP-ML Lab @ IIT Patna
Aug '23 - May '25
Advisor: Prof. Asif Ekbal
Worked on leveraging quality-estimation based models along with 'Pronoun Generation Likelihood (PGL)' to design a novel reward framework for feedback training that improves the translation quality of MT systems (EN -> DE). This work is now part of my Bachelor's thesis, co-supervised by Dr. Tripathi, and has been submitted to ACL Rolling Review (ARR).

Undergraduate Researcher | Advanced Computing Lab @ IIT Patna
July '23 - September '24
Advisor: Prof. Rajiv Misra (HOD, CSE)
Worked on various projects involving:
• Use of Attention-aware LSTM for analysing and predicting carbon emissions in SAARC countries.
• Deep-RL methods for EV charging station selection in a vehicle routing setting.
• Transformer and Deep-RL based method to predict charging prices for EVs in a V2G setting.

Tabular and Deep Reinforcement Learning for Gittins Index

Harshit Dhankhar, Dr. Tejas Bodas
WiOpt 2025
[paper] [code]

Developed a novel Q-learning based algorithm and its Deep-RL counterpart for estimating Gittins indices in Multi-Armed Bandits (MABs), surpassing state-of-the-art methods in terms of memory (~50%), runtime (~10%), and convergence. We also applied the algorithm in minimizing the mean flowtime in a pre-emptive job scheduling problem when jobs are available in batches and have an unknown service time distribution.
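For readers new to the Gittins index, here is a minimal sketch of what is being estimated. It is not the Q-learning algorithm from the paper: it uses the classical restart-in-state formulation with plain value iteration, and it assumes the arm's rewards and transition probabilities are known. The function name `gittins_indices`, the toy two-state arm and the discount factor are illustrative assumptions.

```python
def gittins_indices(rewards, transitions, beta=0.9, tol=1e-10, max_iter=100_000):
    """Gittins index of every state of a single arm (known model).

    rewards[x]        : reward for playing the arm in state x
    transitions[x][y] : probability of moving from x to y when the arm is played
    beta              : discount factor in (0, 1)
    """
    n = len(rewards)

    def play_value(x, values):
        # Value of playing once from state x and then following `values`.
        return rewards[x] + beta * sum(p * v for p, v in zip(transitions[x], values))

    indices = []
    for s in range(n):
        # Restart problem for state s: in every state, either continue
        # from the current state or restart the arm from state s.
        values = [0.0] * n
        for _ in range(max_iter):
            restart = play_value(s, values)
            new_values = [max(play_value(x, values), restart) for x in range(n)]
            gap = max(abs(a - b) for a, b in zip(new_values, values))
            values = new_values
            if gap < tol:
                break
        # Scale by (1 - beta) so the index is in per-step reward units.
        indices.append((1.0 - beta) * values[s])
    return indices


# Toy arm (assumed for illustration): state 0 pays 0 and moves to state 1,
# state 1 pays 1 forever.
toy_indices = gittins_indices([0.0, 1.0], [[0.0, 1.0], [0.0, 1.0]], beta=0.9)
```

For this toy arm the indices come out at approximately 0.9 for state 0 and 1.0 for state 1: state 0 has to wait one discounted step before the reward stream starts. A learning-based method is needed precisely when `rewards` and `transitions` are not available up front.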

Nested Qubit Routing

Harshit Dhankhar, Prof. Tristan Cazenave
AAAI 2025 (QC+AI)
[paper] [code]

Developed a Nested Monte Carlo Search (NMCS)-based router for optimising circuit depth in the qubit routing problem for quantum circuits. Our routing agent outperformed industry-standard solutions from leaders like Google and IBM by up to 10.55% in circuit depth across our experiments.

Analysis and Forecasting of Carbon Emission in SAARC Countries using Attention-based LSTM

Anil Verma, Harshit Dhankhar, Prof. Rajiv Misra, Prof. T.N. Singh, Om Prakash Dhakal
IEEE International Conference on Big Data (IEEE BigData), 2023
[paper] [code]

We drew a comparative analysis conclusively demonstrating the superior performance of A-LSTM models over baseline LSTM models when applied to the CO2 emission dataset sourced from Our World in Data (OWID) and World Bank Indicator database focused on SAARC countries. Specifically, the Attention-based LSTM model showcases a marked improvement of 57%.

Alloc8 - Dorm Room Allocation Portal [Demo] [Website]
Led the development of a portal to streamline dorm room allocation for 3000 students across 1400 rooms at IIT Patna. Designed a user-friendly interface, deployed the backend on a VPS with load balancing to manage high traffic, implemented Redis-based locks for secure room selection, and performed roommate pairing using Spherical K-Means clustering.

Quality Estimation based feedback training for improving pronoun translation [paper]
Advisor: Prof. Asif Ekbal
We developed a novel reward framework called 'ProNMT' using quality estimation models and 'pronoun generation likelihood (PGL)' to enhance English-to-German MT, particularly pronoun translation, bypassing the need for human feedback. Extensive experiments demonstrate significant gains in pronoun translation accuracy and general translation quality across multiple metrics. ProNMT offers an efficient, scalable, and context-aware approach to improving NMT systems, particularly in translating context-dependent elements like pronouns.

Optimising electric vehicle charging routes with Deep Reinforcement Learning considering driver preferences [paper] [code]
Advisor: Prof. Rajiv Misra (HOD, CSE)
Studied the use of Deep RL methods (DQN, DDPG, PPO) for prioritising minimisation of distance or cost (based on the driver's preference) in a grid-based charging station selection framework to aid the decision making of automated EVs. The environment includes various uncertainties related to charging queues, placement of cars at any time, charging costs, etc. Under review at IEEE Transactions on Reliability.

Optimizing In-Home EV Charging: Real-Time Optimization with Time Series Transformers and Policy-based DRL [paper] [code]
Advisor: Prof. Rajiv Misra (HOD, CSE)
We proposed a transformer-based network for electricity price forecasting, enhancing the scheduling of EV charging to minimize costs and maximize user satisfaction. In addition, Deep-RL methods (DQN, DDPG, PPO) were employed for learning and scheduling EV charging and discharging. Under review at Springer's Applied Intelligence Journal.

Privy Notion: A collaborative note-taking workspace [code]
In this project, I developed a full-stack collaborative text editing platform enabling real-time collaboration with Socket.IO, so that different users can log in and share a workspace. Built the frontend with Next.js and Tailwind CSS, and the backend with Next.js, PostgreSQL, and Prisma ORM. Implemented robust validation with Zod, authentication with JWT, and session management via NextAuth.

Vital Extraction from ECG Monitor Images [code]
This project automates the extraction of medical vitals from monitor images using a deep learning-based pipeline. The model processes an unlabeled monitor image to output vital signs like heart rate and oxygen levels. Our approach integrates YOLOv5 for monitor detection, InceptionV3 for classification, XGBoost for vital type classification, and EasyOCR for digit recognition. The system delivers reliable accuracy in identifying vitals, with confidence scores ranging from 0.9 to 0.97. The proposed model is scalable and can handle challenges such as variations in monitor angle, shape, and background color.

ProfessionPro: An end-to-end career recommendation system [code]
In our project "ProfessionPro," we developed an end-to-end framework for NLP-based career guidance. We parsed and pre-processed the tokens from applicant resumes and leveraged KNN to classify resumes into job domains based on skills, education, and work experience. We also built a career-based questionnaire for personalized suggestions and created a user-friendly web portal. Our project was recognized as a runner-up in SIH (Smart India Hackathon) 2022, held in Visakhapatnam, India.
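
The Alloc8 entry above mentions roommate pairing via Spherical K-Means. As a rough illustration of that clustering step only, here is a minimal pure-Python sketch: vectors are projected onto the unit sphere, assigned to the centroid with the highest cosine similarity, and centroids are re-normalised after each update. The preference vectors, the function names and the choice of `k` are assumptions for illustration, not the portal's actual features or code.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(points, k, iters=50, seed=0):
    """Cluster points by direction: cosine-similarity assignment, unit-norm centroids."""
    rng = random.Random(seed)
    unit = [normalize(p) for p in points]
    # Initialise centroids from k distinct data points.
    centroids = [list(c) for c in rng.sample(unit, k)]
    labels = [0] * len(unit)
    for _ in range(iters):
        # Assign each point to the centroid with the largest cosine similarity.
        labels = [max(range(k), key=lambda j: dot(u, centroids[j])) for u in unit]
        for j in range(k):
            members = [u for u, label in zip(unit, labels) if label == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                # Project the mean back onto the unit sphere.
                centroids[j] = normalize(mean)
    return labels, centroids


# Hypothetical 2-feature preference vectors for four students.
prefs = [[1.0, 0.1], [2.0, 0.1], [0.1, 1.0], [0.2, 3.0]]
labels, centroids = spherical_kmeans(prefs, k=2)
```

Because only direction matters, the first two students land in one cluster and the last two in the other even though their vector magnitudes differ. How pairs are then drawn from within a cluster is a separate design choice that this sketch does not cover.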

Last updated: September 2025

Note: Some of the code links may point to private GitHub repositories because the corresponding projects are still under review at a journal or conference.

This template is a modification to Jon Barron's website. It has further been modified by Rishab Khincha and Amish Mittal. Find the source code to my version here. Feel free to clone it for your own use while attributing the original author Jon Barron.