
Surya Pratap Singh

Robotics MS at UMich

About Me

I am an MS student in the Robotics Department at the University of Michigan (UMich), with a focus on the intersection of robotics, machine learning, and computer vision. My research at the FCAV lab, under the guidance of Professor Ram Vasudevan and Dr. Elena Shrestha, has allowed me to explore reinforcement learning for robot control and multimodal perception for scene understanding. My experience spans several domains, including SLAM, state estimation, and motion planning, skills I honed through projects at UMich.

Before joining UMich, I interned at the Biorobotics Lab at Carnegie Mellon University (CMU), where I collaborated with Dr. Howie Choset, Dr. Matthew Travers, and industry partners from Apple on an electronic-waste recycling project. This experience deepened both my technical expertise and my ability to communicate complex ideas while working closely with industry stakeholders.

My academic journey began at the Birla Institute of Technology & Science (BITS Pilani), where I earned a bachelor’s degree in Mechanical Engineering with a minor in Robotics and Automation. Under the mentorship of Professor Arshad Javed, I developed a solid foundation in classical control of robot manipulators and learning-based control of UAVs.

My passion extends beyond robotics into machine learning and computer vision applications in diverse fields such as healthcare, visual content generation, and 3D scene reconstruction. I am eager to leverage my interdisciplinary expertise to tackle complex challenges across various industries.

Experience

Perception, Planning, & Learning Research Assistant
June 2023 - Present, University of Michigan, Ann Arbor, MI, US
Reinforcement Learning & Simulation Research Intern
May 2021 - May 2022, Carnegie Mellon University, Pittsburgh, PA, US
Motion Planning Intern (Software)
May 2019 - Jul 2019, Indira Gandhi Centre for Atomic Research, Kalpakkam, TN, India

Highlight Projects

Multimodal Perception for Autonomous Racing

[Code (Coming Soon)] [Video]
To develop a robust multi-terrain control policy for autonomous racing, LiDAR and RGB camera inputs were used for online reinforcement learning. A lightweight UNet model improved visual sim-to-real transfer, reaching a 0.99 Dice score while segmenting the floor, opponent rover, walls, and background. A ROS nodelet manager and optimized ROS-Docker communication boosted the perception update rate from 10 Hz to 30 Hz, allowing the rover to run at up to 2.5 m/s. Offline TD-MPC reinforcement learning then produced effective policies from only 5K-10K transitions in under 15 episodes, compared to the 300K+ episodes needed for online training.
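
For a sense of how small such a segmentation network can be, here is a minimal four-class UNet sketch in PyTorch; the layer widths, depth, and input resolution are illustrative assumptions, not the actual model used on the rover.

```python
# Minimal sketch of a lightweight UNet for 4-class segmentation.
# Channel widths and depth are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, n_classes=4):  # floor, opponent rover, wall, background
        super().__init__()
        self.enc1, self.enc2 = conv_block(3, 16), conv_block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(32, 64)
        self.up2 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec2 = conv_block(64, 32)
        self.up1 = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = conv_block(32, 16)
        self.head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                    # skip connection 1
        e2 = self.enc2(self.pool(e1))                        # skip connection 2
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)                                 # per-pixel class logits

logits = TinyUNet()(torch.randn(1, 3, 128, 128))             # -> (1, 4, 128, 128)
```

A shallow encoder like this is one way to keep per-frame inference cheap enough for a 30 Hz perception loop.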

Electronic-waste Recycling

[Code] [Slides]
In collaboration with clients from Apple, an object-agnostic gripping strategy was developed for fast, accurate e-waste manipulation. I co-led the simulation team to integrate a 6-DoF parallel manipulator into MuJoCo with a waypoint-following PD controller, achieving joint-value errors under 8 mm (linear) and 0.1 rad (rotational). A numerical-optimization-based forward kinematics approach kept maximum errors within 5 mm (position) and 0.75 rad (orientation). A pseudo-Jacobian matrix enabled singularity avoidance, and a DDPG + HER deep RL algorithm implemented in the MuJoCo-powered Robosuite reached a 95.55% success rate at conveyor speeds up to 0.5 m/s.
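
As a rough illustration of the waypoint-following idea, below is a minimal joint-space PD controller sketch; the gains, tolerance, and waypoint format are illustrative assumptions, not the tuned values from the project.

```python
# Hedged sketch of a waypoint-following PD controller in joint space.
# Gains and the waypoint-advance tolerance are illustrative assumptions.
import numpy as np

def pd_waypoint_step(q, qd, waypoints, kp=80.0, kd=12.0, tol=1e-2):
    """Return joint commands tracking the current waypoint; advance when close."""
    target = waypoints[0]
    if np.linalg.norm(target - q) < tol and len(waypoints) > 1:
        waypoints.pop(0)                 # waypoint reached: move to the next one
        target = waypoints[0]
    return kp * (target - q) - kd * qd   # PD law on joint error and velocity

q, qd = np.zeros(6), np.zeros(6)
path = [np.full(6, 0.1), np.full(6, 0.2)]
u = pd_waypoint_step(q, qd, path)
```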

MBot Autonomy: Control, Perception, and Navigation

[Code] [PDF]
To develop a comprehensive robotic system for autonomous navigation in a warehouse scenario, a multi-hierarchical motion controller was implemented, achieving 2.5 cm RMS trajectory error and velocity errors of 0.025 m/s (linear) and 0.045 rad/s (angular). SLAM was deployed with 2D occupancy grid mapping and Monte Carlo localization, resulting in a 6.63 cm RMS localization error relative to gyrodometry. An A* path planner with collision avoidance and frontier exploration ensured 100% success in optimal path-finding, returning within 5 cm of the home position.
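
The planner operates on the same 2D occupancy grid that SLAM produces. Below is a minimal grid-based A* sketch with a Manhattan heuristic; the 4-connected neighborhood and unit step costs are simplifying assumptions rather than the exact MBot implementation.

```python
# Minimal A* on a 2D occupancy grid (0 = free, 1 = occupied); a sketch only.
import heapq
import numpy as np

def astar(grid, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    open_set = [(h(start), 0, start, None)]                   # (f, g, cell, parent)
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came_from:
            continue                                          # already expanded
        came_from[cur] = parent
        if cur == goal:                                       # walk parents back
            path = [cur]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):     # 4-connected moves
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]
                    and grid[nxt] == 0 and g + 1 < g_cost.get(nxt, np.inf)):
                g_cost[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, nxt, cur))
    return None                                               # no path found

grid = np.zeros((10, 10), dtype=int); grid[4, 1:9] = 1        # wall with a gap
print(astar(grid, (0, 0), (9, 9)))
```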

Autonomous Robotic Arm: Vision-Guided Manipulation

[Code] [PDF]
To enable human users to control the RX200 arm through a GUI while providing autonomy via vision-guided kinematics, the arm and RGB-D camera were integrated into the ROS2 framework. A PID controller was implemented for precise control, achieving less than 2 cm end-effector positioning error in block swapping. Automatic camera calibration with Apriltags and integration with forward and inverse kinematics ensured 100% success in click-to-grab/drop tasks within a 35 cm radius. Block color (VIBGYOR) and size detection algorithms with OpenCV allowed the arm to sort blocks with 100% accuracy.
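
The joint-level loop follows the standard discrete-time PID structure sketched below; the gains and time step are illustrative assumptions, not the values tuned for the RX200.

```python
# Hedged sketch of a discrete-time PID controller; gains and dt are
# illustrative assumptions, not the tuned RX200 values.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_err = 0.0, 0.0

    def update(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt               # accumulate error
        deriv = (err - self.prev_err) / self.dt      # finite-difference derivative
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# e.g. one controller per joint, stepped at the control rate:
pid = PID(kp=2.0, ki=0.1, kd=0.05, dt=0.01)
cmd = pid.update(setpoint=0.5, measurement=0.42)
```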

SelfDriveSuite: Vehicle Control and Scene Understanding

[Code]
This project developed algorithms for autonomous driving, including adaptive cruise control, model predictive control (MPC), and robust scene understanding. A QP-based minimum norm controller ensured a safe following distance of 1.8 times the vehicle's velocity. Linear and nonlinear MPCs achieved path tracking with RMS errors under 1 m in position, 0.5 rad in orientation, 0.5 m/s in velocity, and 1 m/s² in acceleration. For scene understanding, a 2D image classification model reached 90% accuracy on blurry images, and a 15-class scene segmentation model, trained in varied weather conditions, achieved 81.2 mIoU using UNet.
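
With a single affine safety constraint, the minimum norm (QP) controller has a closed-form solution: keep the desired input if it satisfies the constraint, otherwise project onto the constraint boundary. The sketch below assumes simple longitudinal dynamics and an illustrative barrier gain; it is not the project's exact formulation.

```python
# Hedged sketch of a minimum-norm safety filter for adaptive cruise control.
# The dynamics, the gain alpha, and the 1.8 s time gap tau are assumptions.
def safe_accel(u_des, gap, v_ego, v_lead, alpha=1.0, tau=1.8):
    """Solve min (u - u_des)^2 s.t. hdot + alpha*h >= 0 for h = gap - tau*v_ego."""
    h = gap - tau * v_ego                    # barrier: keep a 1.8*velocity gap
    # hdot = (v_lead - v_ego) - tau*u, so the constraint becomes tau*u <= b:
    b = (v_lead - v_ego) + alpha * h
    if tau * u_des <= b:
        return u_des                         # desired acceleration already safe
    return b / tau                           # project onto the constraint boundary

print(safe_accel(u_des=2.0, gap=20.0, v_ego=15.0, v_lead=14.0))   # brakes
```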

OriCon3D: Monocular 3D Object Detection

[Code] [PDF]
This project involved developing a robust and lightweight 3D object detection architecture for real-time autonomous driving using the KITTI dataset. Using MultiBin to regress 3D bounding box orientation from 2D bounding boxes with a pre-trained YOLO model, we achieved 3D IoU scores of 76.9 for Cars, 67.76 for Pedestrians, and 66.5 for Cyclists. Integrating lightweight feature extractors like MobileNet-v2 and EfficientNet-v2 reduced inference time by over 80% and improved 3D IoU by 2.4% for Cars, 1.5% for Pedestrians, and 4.5% for Cyclists, delivering superior performance compared to conventional methods.
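
MultiBin discretizes the orientation range into bins and regresses a confidence plus a (sin, cos) residual for each bin; decoding takes the most confident bin and offsets its center by the residual angle. A sketch of the decoding step is below, with the bin count as an illustrative assumption.

```python
# Hedged sketch of MultiBin orientation decoding; n_bins is illustrative.
import numpy as np

def decode_multibin(conf, residual, n_bins=2):
    """conf: (n_bins,); residual: (n_bins, 2) holding (sin, cos) per bin."""
    centers = 2 * np.pi * np.arange(n_bins) / n_bins   # bin center angles
    i = int(np.argmax(conf))                           # most confident bin
    offset = np.arctan2(residual[i, 0], residual[i, 1])
    return (centers[i] + offset) % (2 * np.pi)         # final orientation (rad)

theta = decode_multibin(np.array([0.2, 0.8]),
                        np.array([[0.0, 1.0], [0.5, 0.866]]))
```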

Twilight SLAM: Navigating Low-Light Environments

[Code] [PDF]
This project enhanced Visual SLAM accuracy by integrating image enhancement modules—Bread, Dual, EnlightenGAN, and Zero-DCE—into ORB-SLAM3 and SuperPoint-SLAM. This achieved a 15% mean increase in features extracted across various lighting conditions. Localization precision improved with an 18% reduction in RMSE between SLAM-generated and ground-truth poses, with ORB-SLAM3 outperforming SuperPoint-SLAM. EnlightenGAN with ORB-SLAM3 also reduced maximum absolute error in SLAM-generated poses by 13%, optimizing performance in low-light conditions.
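
Architecturally, the integration is a preprocessing stage: each frame is enhanced before the SLAM front end extracts features. The sketch below shows that pattern with OpenCV, using CLAHE as a simple stand-in for the learned enhancers (Bread, Dual, EnlightenGAN, Zero-DCE) that were actually evaluated.

```python
# Pattern sketch: enhance the frame, then extract features for the SLAM
# front end. CLAHE is a stand-in for the learned enhancement modules.
import cv2

clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
orb = cv2.ORB_create(nfeatures=2000)

def enhanced_features(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    enhanced = clahe.apply(gray)                     # low-light enhancement step
    keypoints, descriptors = orb.detectAndCompute(enhanced, None)
    return keypoints, descriptors                    # consumed by the front end
```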

Autonomous UAV-based Search and Rescue

[Code] [Video] [PDF]
To enhance autonomous UAV navigation in search-and-rescue (SaR) missions, the project focused on using reinforcement learning to locate and track victims until the rescue team arrives. An urban scenario was simulated in Gazebo with the AR.Drone, and a position controller was fine-tuned using MATLAB’s System Identification Toolbox, achieving less than 1 cm position error. Q-learning with function approximation reduced training time by 50% and addressed large state spaces. YOLO combined with Optical Flow was also used for real-time target tracking, showcasing effective performance in dynamic SaR environments.
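
Q-learning with linear function approximation replaces the tabular Q-table with per-action weight vectors over a state feature map, which is what makes large state spaces tractable. Below is a hedged sketch of the TD update; the feature dimension, action count, and hyperparameters are illustrative assumptions.

```python
# Hedged sketch of Q-learning with linear function approximation:
# Q(s, a) = W[a] . phi(s). Sizes and hyperparameters are assumptions.
import numpy as np

n_actions, n_features = 4, 8
W = np.zeros((n_actions, n_features))        # one weight vector per action

def td_update(phi, a, reward, phi_next, done, alpha=0.05, gamma=0.99):
    """One temporal-difference step on the weights of the taken action."""
    target = reward + (0.0 if done else gamma * np.max(W @ phi_next))
    td_error = target - W[a] @ phi
    W[a] += alpha * td_error * phi           # semi-gradient update on W[a]

phi = np.random.rand(n_features)             # phi(s): state features
td_update(phi, a=1, reward=1.0, phi_next=np.random.rand(n_features), done=False)
```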

Publications

Autonomous UAV-based Target Search, Tracking and Following using Reinforcement Learning and YoloFLOW
IEEE Robotics and Automation Letters, 2020
Ajmera, Y., Singh, S.

[ Video ] [ PDF ]

We developed a UAV-based system for search and rescue missions, integrating reinforcement learning for autonomous navigation and YOLO with Optical Flow for real-time target tracking. This approach enables the UAV to find and follow victims in cluttered environments, ensuring their locations are continually updated for swift evacuation. Extensive simulations demonstrate the system's effectiveness in urban search and rescue scenarios.

Twilight SLAM: Navigating Low-Light Environments
arXiv:2304.11310, 2023
Singh, S., Mazotti, B., Rajani, D. M., Mayilvahanan, S., Li, G., Ghaffari, M.

[ Video ] [ PDF ]

This work presents a detailed examination of low-light visual Simultaneous Localization and Mapping (SLAM) pipelines, focusing on the integration of state-of-the-art (SOTA) low-light image enhancement algorithms with standard and contemporary SLAM frameworks. The primary objective of our work is to address a pivotal question: does enhancing the illumination of visual input significantly improve localization accuracy in both semi-dark and dark environments?

OriCon3D: Effective 3D Object Detection using Orientation and Confidence
arXiv:2304.11310, 2023
Rajani, D. M., Singh, S., Swayampakula, R. K.

[ PDF ]

In this work, we propose a simple yet highly effective methodology for detecting 3D objects and precisely estimating their spatial positions from a single image. Unlike conventional frameworks that rely solely on center-point and dimension predictions, our approach leverages a deep convolutional neural network-based 3D object weighted orientation regression paradigm. These estimates are then seamlessly integrated with geometric constraints obtained from a 2D bounding box, resulting in the derivation of a complete 3D bounding box.

Energy Efficiency Enhancement of SCORBOT ER-4U Manipulator Using Topology Optimization Method
Mechanics Based Design of Structures and Machines, 2021
Srinivas, L., Aadityaa, J., Singh, S., Javed, A.

[ PDF ]

In this work, topology optimization of the upper arm and forearm of a 6-DoF Scorbot manipulator was performed under dynamic loading conditions. A motion study in SolidWorks led to a 30% reduction in peak stress and a 15% decrease in deflection. Additionally, a Lagrange-Euler dynamic model in MATLAB demonstrated a 40% increase in energy efficiency.

Experimental evaluation of topologically optimized Manipulator-link using PLC and HMI based control system
International Mechanical Engineering Congress, 2019
Srinivas, L., Singh, S., Javed, A.

[ PDF ]

In this work, a universal test setup for a 1-DoF manipulator link was developed to validate Von Mises stress values under static loading conditions. Strain data captured with LabVIEW and DAQ showed stress measurements within 1.27% of MATLAB simulations. Dynamic stress analysis of a 3-DoF TRR manipulator in MSC Adams achieved a mean error under 2% compared to simulations.

Education

University of Michigan, US
Aug 2022 - May 2024, Master of Science in Robotics
Birla Institute of Technology & Science Pilani, Hyderabad, India
Aug 2017 - Jun 2021, Bachelor of Engineering in Mechanical Engineering