This is a capstone project for the UC Berkeley Master in Data Science program. The project team included Artem Lebedev, Nick Johnson, Lily Magliente, and Evelin Li. All authors contributed equally. Project GitHub repo, project website. Read more on the Berkeley iSchool website and in Evelyn’s Medium post.
Consumers are constantly reminded to recycle, but framing recycling as the consumer’s responsibility is misleading. Companies use packaging that could theoretically be recycled but often isn’t. Large corporations have little incentive to address this issue, so in this project we aimed to encourage consumers to vote with their dollars for brands that have reduced their environmental impact.
We built a browser extension that gives users relevant data at the most opportune moment: the time of purchase, when it has the greatest potential to influence decisions. We chose a browser extension as the channel because it reaches users while they are shopping online. The tool shows the user the level of pollution attributed to the brand they are about to buy from. This data comes from an analysis of street litter captured by users of OpenLitterMap, an app where people upload photos of litter in their communities. The slice of the dataset that we worked with contained 27,000 images of what users considered litter, with user-supplied, unmoderated tags.
We designed a pipeline of vision models to extract pollution statistics from these images. The biggest challenge was labeling the data, as OpenLitterMap does not provide annotations. Rather than hand-labeling every image, we combined object detection, clustering, pre-trained models, and human evaluation to generate labels. We started with LogoDet-3K, a pre-labeled dataset of marketing images of brand logos. These images are of much higher quality than the user-generated pictures, so rather than applying the dataset directly, we used it to train a single-class YOLO model that can detect a logo in an image. This model then generated bounding boxes for the OpenLitterMap dataset. Next, we clustered the results to group similar images and identify potential classes. After clustering, we assigned names to the clusters, manually corrected mistakes, and added missing annotations. Although the dataset was now labeled, it suffered from class imbalance, so we dropped classes with fewer than 20 example images. We then split the dataset into training, validation, and test sets and upsampled the remaining low-representation classes in the training set using augmentation techniques. This final dataset was used to train a multi-class YOLOv8 model that can identify specific brand logos. Finally, we pushed the entire OpenLitterMap (OLM) dataset through this model and used the brand frequency to derive the brand ranking shown to the end user.
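The class filtering and the final frequency-based ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the `(image_id, brand)` annotation format and the function names are assumptions made for the example, and only the 20-image threshold comes from the text.

```python
from collections import Counter, defaultdict

MIN_IMAGES = 20  # threshold mentioned in the text


def images_per_brand(annotations):
    """Count distinct images per brand from (image_id, brand) pairs."""
    images = defaultdict(set)
    for image_id, brand in annotations:
        images[brand].add(image_id)
    return {brand: len(ids) for brand, ids in images.items()}


def drop_rare_brands(annotations, min_images=MIN_IMAGES):
    """Keep only annotations whose brand appears in at least `min_images` images."""
    counts = images_per_brand(annotations)
    return [(img, brand) for img, brand in annotations if counts[brand] >= min_images]


def rank_brands(detections):
    """Return (brand, count) pairs, most frequently detected brand first."""
    return Counter(brand for _, brand in detections).most_common()
```

Here `drop_rare_brands` would run on the labeled data before the train/validation/test split, while `rank_brands` would run on the detections produced by the final model over the full dataset.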
The YOLOv8 model trained on this dataset achieved a mean Average Precision (mAP) of 68%, with performance relatively uniform across brands. Amstel and Budweiser were the two exceptions: for these brands the model misses more than 50% of all instances. This performance is acceptable for our problem because the model rarely confuses one brand for another. Some instances may be missed, but if the model says a logo is “Amstel Light”, it is never “Corona”, even though Corona is approximately 5 times more frequent in the dataset. So when the model flags Corona and Heineken as the worst polluters, we are certain the litter does not actually belong to “Miller” or “Amstel”.
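The difference between missing a logo and mistaking one brand for another roughly corresponds to recall versus precision. The sketch below computes both per brand from matched (true, predicted) pairs. The pair format, the use of `None` for a missed logo or a background detection, and the function name are assumptions for illustration, not the project's evaluation code.

```python
from collections import Counter


def per_brand_precision_recall(pairs):
    """Compute (precision, recall) per brand.

    `pairs` holds (true_brand, predicted_brand) tuples. A predicted brand of
    None means the logo was missed; a true brand of None means the model
    fired on something that was not a labeled logo.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in pairs:
        if true is not None and true == pred:
            tp[true] += 1
        else:
            if pred is not None:
                fp[pred] += 1  # wrong brand or spurious detection
            if true is not None:
                fn[true] += 1  # missed or misclassified logo
    stats = {}
    for brand in set(tp) | set(fp) | set(fn):
        predicted = tp[brand] + fp[brand]
        actual = tp[brand] + fn[brand]
        precision = tp[brand] / predicted if predicted else None
        recall = tp[brand] / actual if actual else None
        stats[brand] = (precision, recall)
    return stats
```

A brand whose logos are often missed but never mislabeled would show low recall and high precision, which is the pattern the paragraph above describes for Amstel.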
The system was built and hosted on AWS infrastructure: EC2 instances to train the models, S3 buckets and a Postgres RDS instance to store the data, and Lambda functions to interface the Chrome extension with the model and data. The final YOLO model was also hosted on a SageMaker endpoint, which allowed consumers to upload their own images and get the brand information extracted by the model. Hosting a large YOLO model and interfacing with it via HTTP requests tested the limits of Lambda functions, and the approach I proposed in this project was later accepted by Amazon in their official guidelines.
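To make the Lambda-to-SageMaker path concrete, here is a minimal sketch of a handler that forwards an uploaded image to an endpoint using boto3's `sagemaker-runtime` client. It is not the approach referred to above. The endpoint name, the base64-encoded request body, the content type, and the JSON response format are all assumptions for illustration.

```python
import base64
import json
import os

# Hypothetical endpoint name; a real deployment would set its own.
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "brand-logo-yolo")


def get_runtime_client():
    """Create the SageMaker runtime client (imported lazily so the handler can be tested with a stub)."""
    import boto3
    return boto3.client("sagemaker-runtime")


def handler(event, context, client=None):
    """Forward a base64-encoded image to the endpoint and return its detections."""
    client = client or get_runtime_client()
    image_bytes = base64.b64decode(event["body"])  # assumes a base64-encoded body
    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/x-image",  # assumed; depends on the model container
        Body=image_bytes,
    )
    # Assumes the model container returns its detections as JSON.
    detections = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(detections)}
```

Passing the client as an optional argument keeps the handler testable without AWS credentials. Lambda limits on request and response payload size are one reason large images and large models are awkward to serve this way.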
In summary, our solution has the potential to alter consumers’ behavior by providing visual information at the critical moment of purchase. The browser extension integrates LitterLog insights directly into the shopping experience, displaying a brand’s litter score at the time of purchase, when the likelihood of steering a decision toward a more environmentally friendly brand is highest. Additionally, the low cross-brand confusion offers brands assurance that they are not held accountable for the deficiencies of others.
Keywords: YOLO, hierarchical clustering, AWS SageMaker, AWS Lambda, object detection