Three pears

The last few times I bought pears from the supermarket, my wife criticized me for not buying the "right kind of pears". I asked her to describe what the "correct" pears look like, and she proceeded to carefully describe their appearance and shape. The next time, I bought the wrong kind again, so I figured this was the perfect problem to try to solve with some of the things I'm learning from the fast.ai course I'm currently following.

I thought I'd train a model to classify several different kinds of pears, so that every time I go to the store I can snap a photo of what's available and pick the right one.

Turns out classifying pears is not as easy as I thought. Here are my findings.

Underestimating the problem

I've gone through the first two lectures of the fast.ai course, and after experimenting with some of the notebooks and training a model to distinguish between turtles and tortoises, I thought I was ready for something more useful.

When the pear issue arose I quickly searched for "pear varieties" and saw this article, which depicted 5 varieties of pears, one of which was the type my wife was looking for (Concorde). So I decided to start with those 5 varieties. "I'll add more later", I thought.

I proceeded to copy the notebook from the first fast.ai lesson by Jeremy Howard and edited it to train the model on the 5 varieties of pears, confident I would get a reasonable result, only to be disappointed by an error rate hovering around 40%. Not knowing exactly what I was doing, I tried increasing the number of fine-tuning epochs and switching to another architecture, and the results got even worse. So I decided to investigate a bit.

Investigating

Initially, I had only looked at a small batch using dls.show_batch(max_n=6), but I remembered something from the second lesson: Jeremy showed how to use your trained model to find bad examples in the training data, so I did just that. I used ImageClassifierCleaner, and it turned out there was a lot of trash in the dataset. Naturally, I decided to build a dataset myself.

Reducing the scope

I asked Claude to create a script for me with the following prompt written without much thinking:

create the following python script: I pass it a query like "concorde pear photos" and it starts to download images 1 by 1, showing me the image and the file name before it saves it. I have the option to keep it or delete it.

To my honest surprise, it got it almost right on the first try. I just had to ask it to use the duckduckgo_search Python library, and it created exactly what I wanted. I skimmed the code briefly and ran it. Here's the original file.
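The core of such a script is small. Here's a minimal sketch of the idea, assuming the duckduckgo_search library; the function names and the `images` output directory are my own choices, not the original script's:

```python
# Sketch of an interactive image downloader: search, preview the name
# and URL, and decide per image whether to keep it.
import re
import urllib.request
from pathlib import Path


def safe_filename(title: str, ext: str = ".jpg") -> str:
    """Turn an image title into a filesystem-safe file name."""
    stem = re.sub(r"[^A-Za-z0-9_-]+", "_", title).strip("_") or "image"
    return stem[:60] + ext


def review_and_download(query: str, out_dir: str = "images", max_results: int = 20):
    """Show each search result and download only the ones the user approves."""
    from duckduckgo_search import DDGS  # third-party: pip install duckduckgo_search

    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for result in DDGS().images(query, max_results=max_results):
        name = safe_filename(result["title"])
        print(f"{name}  <-  {result['image']}")
        if input("keep? [y/N] ").strip().lower() == "y":
            urllib.request.urlretrieve(result["image"], out / name)
```

Running `review_and_download("concorde pear photos")` then walks through the results one by one, which is exactly the keep-or-delete workflow described in the prompt.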

So I went through 100 photos of Concorde pears and 100 photos of Bartlett pears. I uploaded them to my Kaggle notebook, sure that it would now produce better results. To my surprise, the error rate was still hovering around 30%. Again I tried turning some knobs and pushing some buttons, with no result.

For those interested here is the Kaggle notebook and the dataset I used.

Both the notebook and the dataset are messy, as I haven't had time to clean them up. I'm hoping to do so in the near future.

Finally making progress

One thing I noticed was that my set of images was a mix of photos of a few pears and photos of whole trees with lots of pears on the branches. My initial thinking was that these varied examples should work well, but after the failed training attempt I thought that focusing on photos depicting a close-up of a few pears might yield better results. And it did!

Using resnet34 and 4 epochs, I managed to get the error rate down to 8% (92% accuracy). I also found that using "pad" as the resize method helped. I think this is because "squish" distorts the shape of the pears too much, and shape is the key to correct classification.
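In fastai this is a one-line change, e.g. `item_tfms=Resize(224, method=ResizeMethod.Pad, pad_mode=PadMode.Zeros)`. To see why it matters, here is a small PIL demonstration (not fastai itself) of the two strategies; the function names are my own:

```python
# Squish vs. pad: both produce a 224x224 image, but only pad preserves
# the aspect ratio of the original photo (the pear's shape).
from PIL import Image


def squish(img: Image.Image, size: int = 224) -> Image.Image:
    # Stretch to the target square, distorting the aspect ratio.
    return img.resize((size, size))


def pad(img: Image.Image, size: int = 224) -> Image.Image:
    # Scale the longer side to `size`, then center on a black square canvas.
    scale = size / max(img.size)
    w, h = round(img.width * scale), round(img.height * scale)
    canvas = Image.new("RGB", (size, size))
    canvas.paste(img.resize((w, h)), ((size - w) // 2, (size - h) // 2))
    return canvas


tall = Image.new("RGB", (100, 200), "white")  # a tall, pear-like photo
print(squish(tall).size, pad(tall).size)      # both (224, 224)
```

With squish, the 100x200 photo is stretched to double its width; with pad, it keeps its proportions and gets black bars on the sides, so the pear's silhouette stays intact.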

The current result is a fairly good classifier that correctly distinguishes between Concorde and Bartlett pears, as long as you pass it one of those two varieties. I have it hosted on a Hugging Face Space here. Feel free to give it a go.

Me at the store using the app

Findings and next steps

This was a fascinating journey, thanks to the incredible lessons by Jeremy Howard and the fast.ai course. I did all of this within a few hours over the weekend, knowing only a minuscule amount of deep learning.

Other things I still want to try: other architectures, a larger dataset, and data augmentation. I also wonder what would happen if I converted the images to grayscale.
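The grayscale experiment fits the shape-over-color hypothesis from earlier: if shape really is the key signal, the model should still do well without color. A hedged sketch of the preprocessing (the function name is my own); converting to "L" and back to "RGB" keeps the 3-channel input that pretrained ResNets expect:

```python
# Drop color information while keeping a 3-channel image.
from PIL import Image


def drop_color(img: Image.Image) -> Image.Image:
    """Convert to grayscale, then back to RGB so pretrained models accept it."""
    return img.convert("L").convert("RGB")
```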

All in all, this was incredibly satisfying and I learned a lot. At some point I want to reproduce this in PyTorch and run it natively on my Mac instead of in Kaggle notebooks.

So stay tuned for those future posts with findings.

And yes, I finally managed to buy the correct pears for my wife using my classifier, which made this small project even more rewarding!
