Testing AI Pedestrian Detection

I give outreach talks titled "Autonomous Vehicles" to college and high school students. We go over examples of autonomous vehicles, from flight autopilots to self-driving cars, and look at the sensors, technologies, and AI advances that will bring self-driving cars onto our roads.

After discussing the details of two recent fatal self-driving accidents, I show a simple AI (based on this code with the pre-trained model ssd_mobilenet_v2_coco_2018_03_29) for detecting and classifying objects, including pedestrians.
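To make "detected as a person" concrete, here is a minimal sketch of how one frame's detections could be reduced to a yes/no answer. It is not the code used in the talk. It assumes the detector returns parallel lists of class ids and confidence scores, as the TensorFlow Object Detection API does, and that class id 1 means "person", as in the COCO label map shipped with that API. The function name `person_detected` and the 0.5 score threshold are hypothetical choices for illustration.

```python
from typing import Sequence

# Assumption: "person" has id 1 in the COCO label map used by the
# TensorFlow Object Detection API.
PERSON_CLASS_ID = 1


def person_detected(classes: Sequence[float],
                    scores: Sequence[float],
                    min_score: float = 0.5) -> bool:
    """Return True if any detection in one frame is a confident 'person'.

    classes and scores are parallel sequences: classes[i] is the class id
    and scores[i] the confidence of the i-th detection in the frame.
    min_score is a hypothetical threshold, not the value used in the demo.
    """
    return any(int(c) == PERSON_CLASS_ID and s >= min_score
               for c, s in zip(classes, scores))
```

With this definition, a person who is found but labeled as some other class counts as "not detected as a person", just like a person who is missed entirely.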

A big question around autonomous vehicles is: Can we trust them? To develop trust, we probably want to see some evidence that the technology is behaving reasonably well. We can try to test it! Below is the testing activity we do in class.

Testing Activity

The task I gave to the students was described as follows:
  • Split up into 4 teams; each team defines up to 5 NEGATIVE test cases
    • each showing a person on video who is NOT detected as a person
    • test cases must be sufficiently different, e.g., "partially covered" counts only once
  • Document your test cases, e.g.,
    • 1. Person holds paper in front of face
    • 2. Person lifts one arm and one leg up
  • Points:
    • Each time a test fails, i.e., the person is visible (4 m in front of the camera) and not detected for >= 5 sec, the team gets one point
  • The team with the most points wins!
    • In case of a tie, the first group to reach the highest score wins
The reason for documenting the test cases before the actual test is to prevent teams from copying successful negative tests of other teams.
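
In class we judged the 5 seconds by eye, but the scoring rules above could also be checked automatically. The following is a minimal sketch under stated assumptions: each video frame has been reduced to a pair (timestamp in seconds, person detected yes/no), and for each team we know its points and the time at which it reached its final score. All function names and the data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def longest_miss_seconds(frames: List[Tuple[float, bool]]) -> float:
    """Length of the longest run of consecutive frames without a detection.

    frames holds (timestamp_seconds, person_detected) in chronological order.
    A run is measured from its first missed frame to its last missed frame,
    so it slightly underestimates the true gap between detections.
    """
    longest = 0.0
    miss_start = None
    for timestamp, detected in frames:
        if detected:
            miss_start = None  # the run of misses is over
        else:
            if miss_start is None:
                miss_start = timestamp  # a new run of misses begins
            longest = max(longest, timestamp - miss_start)
    return longest


def earns_point(frames: List[Tuple[float, bool]],
                min_miss_seconds: float = 5.0) -> bool:
    """A test case scores if the person goes undetected for >= 5 sec."""
    return longest_miss_seconds(frames) >= min_miss_seconds


def winner(results: Dict[str, Tuple[int, float]]) -> str:
    """Pick the winning team from {team: (points, time_score_was_reached)}.

    Most points wins; on a tie, the team that reached its score first wins.
    """
    return min(results, key=lambda team: (-results[team][0], results[team][1]))


# Hypothetical example: one frame per second, person never detected.
undetected_run = [(float(t), False) for t in range(7)]
point_awarded = earns_point(undetected_run)
```

The tie-break is folded into a single sort key: points are negated so that more points sort first, and the time at which the score was reached decides between equal scores.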


Here are some results of test cases performed by students from Manor High School (faces manually covered by black boxes in the images). 
For many test cases the software was doing a decent job and even detected persons in uncommon poses:

However, some students seemed to be detection-resistant even in a normal pose:


The students came up with very creative test cases, and the lighting conditions probably also helped them earn many points:

Some students even developed adversarial test cases in which they forced the AI to detect them but misclassify them:


We have to keep in mind that the AI tested here is optimized for mobile devices and clearly has to trade detection accuracy for speed. Some groups managed to obtain 5/5 points! Two weeks before this event, groups of different students reached at most 2/5. One difference between the two occasions was the lighting. In the images above, sunlight shines in from the left. In the other case, the room was artificially lit.
For our own autonomous vehicles, we rely on a combination of cameras and LIDAR sensors.