Autonomous Trash Pickup and Sorting Robot
Course Instructor
Venkittaraman Pallipuram Krishnamani
Abstract
This project presents the design and deployment of a partially autonomous robot that integrates embedded systems, computer vision, and multi-actuator control to detect and interact with litter in real time. The goal is to build a cost-effective, modular, and scalable system capable of identifying and mechanically removing litter using a deep-learning-enhanced, Raspberry Pi–controlled robotic platform. While fully autonomous navigation is beyond the scope of this implementation, the system demonstrates a distributed framework with real-time perception and coordinated actuation.
The robotic system is based on a Raspberry Pi 5, which handles all on-board control functions, including isochronous servo control for robotic arm actuation and control of a mecanum wheel drive system. The Pi performs on-board, real-time video capture, streams the video to an external processing computer over HTTP, and drives a PCA9685 servo controller to actuate multi-joint end effectors. These end effectors execute predefined motion sequences defined in an external JSON configuration file, such as opening, lowering, gripping, lifting, and releasing objects. This design allows behavior adjustments without modifying the control code, emphasizing modularity.
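To illustrate the configuration-driven design, the sketch below shows one possible shape for such a JSON preset and how the Pi-side code might replay it through the PCA9685. The file name arm_presets.json, the channel assignments, the angles and delays, and the use of the Adafruit ServoKit library are illustrative assumptions rather than the project's actual files.

```python
# pickup_preset.py -- minimal sketch of replaying a JSON-defined arm preset
# Assumes the adafruit-servokit library (PCA9685 driver) and a config such as:
#
# arm_presets.json:
# {
#   "recyclable_pickup": [
#     {"channel": 0, "angle": 90,  "delay": 0.5},
#     {"channel": 1, "angle": 30,  "delay": 0.8},
#     {"channel": 0, "angle": 20,  "delay": 0.5},
#     {"channel": 1, "angle": 120, "delay": 0.8}
#   ]
# }

import json
import time

from adafruit_servokit import ServoKit  # 16-channel PCA9685 servo driver

kit = ServoKit(channels=16)

def run_preset(name, config_path="arm_presets.json"):
    """Replay a named motion sequence from the external JSON configuration."""
    with open(config_path) as f:
        presets = json.load(f)
    for step in presets[name]:
        kit.servo[step["channel"]].angle = step["angle"]
        time.sleep(step["delay"])  # let the joint settle before the next step

if __name__ == "__main__":
    run_preset("recyclable_pickup")
```

Because the sequences live entirely in the JSON file, new behaviors can be added or tuned by editing the configuration alone, which is the modularity the design aims for.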
All computer vision and AI processing is executed externally on a laptop, allowing the use of heavier models without the constraints of embedded hardware. The laptop runs a pretrained TensorFlow object detection model trained on the TACO (Trash Annotations in Context) dataset. Incoming video frames from the Pi are decoded, analyzed, and classified into TACO’s 60 waste categories, then mapped to broader classes such as Recyclable, Organic, or General Trash. The detection results, consisting of labels, confidence scores, and bounding boxes, are served as lightweight JSON via a Flask API.
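A minimal sketch of such an endpoint is shown below. The /predictions route name, the hand-picked subset of TACO labels used for the broad-class mapping, and the latest_detections buffer are assumptions for illustration; the TensorFlow inference loop that would populate the buffer is omitted.

```python
# detection_api.py -- sketch of the laptop-side Flask prediction endpoint
from flask import Flask, jsonify

app = Flask(__name__)

# Illustrative grouping of fine-grained TACO labels into broader classes;
# the real mapping would cover all 60 categories.
BROAD_CLASS = {
    "Clear plastic bottle": "Recyclable",
    "Drink can": "Recyclable",
    "Food waste": "Organic",
    "Styrofoam piece": "General Trash",
}

latest_detections = []  # updated by the TensorFlow inference loop (not shown)

@app.route("/predictions")
def predictions():
    """Serve the most recent detections as lightweight JSON for the Pi to poll."""
    return jsonify([
        {
            "label": d["label"],
            "broad_class": BROAD_CLASS.get(d["label"], "General Trash"),
            "confidence": d["confidence"],
            "bbox": d["bbox"],  # [xmin, ymin, xmax, ymax] in pixels
        }
        for d in latest_detections
    ])

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```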
The Raspberry Pi continuously queries this API to retrieve updated detection outputs and selects the appropriate servo preset or movement pattern based on the classification. For example, detecting a recyclable item triggers a specific sequence of arm motions and wheel positioning relative to the target. Although the robot does not perform autonomous navigation or environmental mapping, the system supports controlled demonstration of pickup sequences activated by AI detection. This command-response architecture is designed to extend cleanly to future autonomous versions.
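The Pi-side polling logic could look roughly like the following. The endpoint URL, the 0.6 confidence threshold, and the preset names matching the earlier configuration sketch are all assumptions for illustration.

```python
# poll_and_act.py -- sketch of the Pi-side polling-and-actuation loop
import time

import requests

API_URL = "http://<laptop-ip>:5000/predictions"  # placeholder host; see detection_api.py
CONF_THRESHOLD = 0.6  # illustrative confidence cutoff

# Assumed mapping from broad class to a preset name in arm_presets.json
PRESET_FOR_CLASS = {
    "Recyclable": "recyclable_pickup",
    "Organic": "organic_pickup",
    "General Trash": "general_pickup",
}

while True:
    try:
        detections = requests.get(API_URL, timeout=1.0).json()
    except requests.RequestException:
        time.sleep(0.5)  # laptop unreachable; retry shortly
        continue

    confident = [d for d in detections if d["confidence"] >= CONF_THRESHOLD]
    if confident:
        best = max(confident, key=lambda d: d["confidence"])
        preset = PRESET_FOR_CLASS.get(best["broad_class"])
        if preset:
            # run_preset() from the arm-control sketch would replay the JSON sequence
            print(f"Detected {best['label']} -> running preset '{preset}'")
    time.sleep(0.2)  # poll the API a few times per second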
A key contribution of the project is a robust real-time communication pipeline between the robot hardware and the remote AI processor. By decoupling perception from actuation, the system maintains flexibility, reduces computational load on the Pi, and supports future upgrades such as multi-camera input, additional sensors, or integration with ROS 2. The HTTP streaming and JSON prediction interface further support rapid debugging, remote monitoring, and modular component swapping. This architecture ensures the system can be expanded easily as more complex perception and control modules are developed.
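The video half of this interface is commonly implemented as an MJPEG stream over HTTP. The sketch below assumes a Picamera2 camera and a Flask route on the Pi; it is one plausible realization of the streaming interface described here, not the project's exact implementation.

```python
# stream_server.py -- sketch of an MJPEG video stream served from the Pi
import cv2
from flask import Flask, Response
from picamera2 import Picamera2

app = Flask(__name__)

picam2 = Picamera2()
# RGB888 yields a 3-channel array that cv2.imencode can compress directly
picam2.configure(picam2.create_video_configuration(main={"size": (640, 480), "format": "RGB888"}))
picam2.start()

def mjpeg_frames():
    """Yield JPEG-compressed frames as a multipart MJPEG stream."""
    while True:
        frame = picam2.capture_array()
        ok, jpg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpg.tobytes() + b"\r\n")

@app.route("/video")
def video():
    return Response(mjpeg_frames(), mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000, threaded=True)
```

Serving the stream over plain HTTP is what makes remote monitoring and debugging straightforward: any browser or OpenCV client on the network can view exactly what the detector sees.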
Experimental results show that the platform can perform reliable trash classification at approximately 10–18 FPS, depending on input resolution and lighting conditions. The servo-actuated pickup mechanism consistently responds to classification outputs, validating the effectiveness of the distributed perception-control system. While autonomous navigation remains a target for future development, this iteration successfully delivers the core components of a functional trash-handling robot: real-time detection, classification, communication, and mechanical interaction.