Classifying Hand-Drawn Digits Live — What I learned from making this project.

Putting a twist on the good ol’ classic MNIST digit project

Evan Lin
6 min read · Jan 20, 2021

Watch the first 1 minute and 20 seconds of the video before reading further into this article.

In it, I explain my motivation for building this project and how I approached learning the prerequisites.

I’m assuming that since you’ve reached this point, you’ve watched up to the timestamp above. If you watched past 1 minute 20 seconds, you may have heard me say that the video is solely meant to showcase a demo of the code. So let me cut to the chase: I’m writing this article to explain what I learned through the challenges I faced and how I overcame them.

Without further ado, let’s jump in!

Structure

This project consists of 2 files and 3 separate empty folders.

.py files:

  • tkinterInterface.py (code for the drawing interface; this is the file you run, and it in turn executes the second file, MLcode.py)
  • MLcode.py (Code for Machine Learning computations)

Folders:

All of these folders start out empty. Each one stores a single image, which gets overwritten every time the program is run.

  • /HandWrittenImages
  • /ConvertedHandWrittenData
  • /FinalDownScaling
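Since the program expects these three folders to exist before it saves anything, a small helper can create them up front. This is a minimal sketch and not part of the original project; `ensure_folders` and its `base` argument are hypothetical names.

```python
import os

# Folder names taken from the list above
FOLDERS = ["HandWrittenImages", "ConvertedHandWrittenData", "FinalDownScaling"]

def ensure_folders(base="."):
    """Create the three working folders under `base` if they are missing."""
    paths = [os.path.join(base, name) for name in FOLDERS]
    for path in paths:
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return paths
```

Calling `ensure_folders()` once at the top of tkinterInterface.py would be one place to put it.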

Here is a link to my GitHub if you want to see the complete code.

The code below is the code in tkinterInterface.py.

# Import libraries
from tkinter import *
import pyscreenshot as ImageGrab
import os

# Window
window = Tk()
window.title('Digits Classifier')
window.geometry('500x340+800-400')

# Create the drawing tool
def paint(event):
    x1, y1 = (event.x - 12), (event.y - 12)
    x2, y2 = (event.x + 12), (event.y + 12)
    drawingBoard.create_oval(x1, y1, x2, y2, fill="white", outline="")

# Canvas box
drawingBoard = Canvas(window, width=256, height=256)
drawingBoard.pack()

def rectangle():
    drawingBoard.create_rectangle(0, 0, 256, 256, fill="black", outline="")

drawingBoard.bind("<B1-Motion>", paint)
rectangle()

# Drawing tool bottom message
message = Label(window, text="Press and Drag Left Mouse Button to draw. Do not move window.")
message.pack(side=BOTTOM)

# Delete button
def delete():
    drawingBoard.delete("all")  # remove every stroke from the canvas
    rectangle()                 # then redraw the black background

btn = Button(window, text="Clear Canvas!", fg='black', command=delete)
btn.place(x=350, y=50)
btn.pack()

# Saving the image variables
def saveImg():
    images_folder = "HandWrittenImages/"
    im = ImageGrab.grab(bbox=(929, 333, 1185, 589))  # X1, Y1, X2, Y2
    print("saved....")
    im.save(images_folder + 'image.png')
    print("clear screen and redraw...")
    os.system('python MLcode.py')

# Predict drawing button
btn = Button(window, text="Predict Drawing!", fg='black', command=saveImg)
btn.place(x=350, y=100)
btn.pack()
drawingBoard.winfo_geometry()
window.mainloop()

...and this code belongs to MLcode.py.

from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import pandas as pd
from PIL import Image
import numpy as np
import os

# Load the 8x8 digits dataset bundled with scikit-learn
digits = datasets.load_digits()
print(digits.keys())
X = digits.data
y = digits.target
print(X.shape)
print(y.shape)

# Hold out 20% of the data for testing, keeping the class proportions equal
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1, stratify=y)
model = KNeighborsClassifier(n_neighbors=7)
model.fit(X_train, y_train)
print('Training accuracy: ', model.score(X_train, y_train))
print('Testing accuracy: ', model.score(X_test, y_test))

# Image processing
# os.path.join keeps the paths portable (a bare backslash in a string starts an escape sequence)
im = np.array(Image.open(os.path.join('HandWrittenImages', 'image.png')).convert('L'))
print(type(im))
converted_path = os.path.join('ConvertedHandWrittenData', 'converted.png')
Image.fromarray(im).save(converted_path)  # save() returns None, so there is nothing to assign
load_img_rz = np.array(Image.open(converted_path).resize((8, 8)))
final_path = os.path.join('FinalDownScaling', 'Final.jpeg')
Image.fromarray(load_img_rz).save(final_path)
print("After resizing:", load_img_rz.shape)
imgToArray = Image.open(final_path)
array = np.asarray(imgToArray)
print(type(array))
print(array.shape)
print(array)
# It's going to be multi-dimensional, but now we need a 1D array
oneDArray = array.flatten()
print("I think that the number you drew is:", model.predict([oneDArray]))
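One detail in the listing above may be worth checking. `load_digits` stores each pixel as a value from 0 to 16, while an 8-bit grayscale image loaded through PIL uses 0 to 255, so the drawn digit and the training data sit on different scales. The sketch below rescales the pixels before predicting. `to_digits_scale` is a hypothetical helper, and whether it improves predictions for this project is an assumption, not something measured in the original code.

```python
import numpy as np

def to_digits_scale(pixels):
    """Map 8-bit grayscale values (0-255) onto the 0-16 range used by load_digits."""
    pixels = np.asarray(pixels, dtype=float)
    return pixels / 255.0 * 16.0

# Example: an 8x8 image with one fully white pixel
sample = np.zeros((8, 8), dtype=np.uint8)
sample[0, 0] = 255
scaled = to_digits_scale(sample).flatten()  # 64-element vector on the training scale
```

In MLcode.py, the equivalent change would be passing `to_digits_scale(array).flatten()` to `model.predict` instead of the raw flattened array.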

Tkinter

Jumping into this project, I had zero knowledge of Tkinter. I was curious to see what Python GUI libraries I could use to create my canvas drawing interface, but learning how to make the interface was really a pain in the ass. I had to do a lot of math to work out the exact pixel values so that the screenshot taken by Pyscreenshot would capture exactly the black canvas on the screen.

One thing I learned from Tkinter that was pretty interesting is that the geometry string you pass to the window sets both its size and its position, and the position offsets are measured from the edges of your screen. Looking back, I did a lot of useless math that I honestly could have avoided by picking a better Python GUI framework.
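To make that concrete with the string from this project: `window.geometry('500x340+800-400')` asks for a 500 by 340 window whose left edge is 800 pixels from the left of the screen and whose bottom edge is 400 pixels from the bottom. The sketch below simply splits such a string apart. `parse_geometry` is a hypothetical helper, and it only handles the simple `WxH±X±Y` form, not every string Tk accepts.

```python
import re

def parse_geometry(spec):
    """Split a Tk geometry string such as '500x340+800-400' into its parts."""
    match = re.fullmatch(r"(\d+)x(\d+)([+-])(\d+)([+-])(\d+)", spec)
    if match is None:
        raise ValueError(f"not a full geometry string: {spec!r}")
    width, height, x_sign, x_off, y_sign, y_off = match.groups()
    return {
        "width": int(width),
        "height": int(height),
        # '+' measures from the left/top edge of the screen, '-' from the right/bottom
        "x_from": "left" if x_sign == "+" else "right",
        "x_offset": int(x_off),
        "y_from": "top" if y_sign == "+" else "bottom",
        "y_offset": int(y_off),
    }

print(parse_geometry("500x340+800-400"))
```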

Back then, I didn’t know what was considered a “good” GUI framework, but after wrestling with Tkinter, I can gladly say that, well damn, that stuff is outdated and pretty tedious (considering that I could have picked a wayyy nicer and more customizable framework).

My greatest takeaway from Tkinter was learning how to make buttons in a GUI. I found it pretty intuitive how I could just pass a function or even a lambda as an argument to the Button() constructor. All I had to do was set the ‘command =’ parameter to the name of the function that had what I wanted to run. For example, for one of my buttons, I had a function called saveImg() that would use os.system() to execute my machine learning code file.
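The `command =` pattern described above can be sketched like this. `make_saver` and `build_window` are hypothetical names used only for illustration; the point is that Button() takes a callable with no arguments, whether that is a named function, a closure, or a lambda.

```python
def make_saver(folder, filename="image.png"):
    """Return a zero-argument callback, which is what Button(command=...) expects."""
    def save():
        path = f"{folder}/{filename}"
        print("would save to", path)
        return path
    return save

def build_window():
    # Imported here so the callback logic above can be tried without a display
    from tkinter import Tk, Button
    window = Tk()
    # Pass the function object itself, not the result of calling it
    Button(window, text="Predict Drawing!", command=make_saver("HandWrittenImages")).pack()
    Button(window, text="Say hi", command=lambda: print("hi")).pack()
    return window

# build_window().mainloop()  # uncomment to open the window
```

A common slip is writing `command=saveImg()` with parentheses, which runs the function once while the button is being created instead of on each click.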

I also found it interesting how most of the designs in a Tkinter GUI can only be made through combinations of vector shapes (which was pretty boring). But in the end, the stuff was pretty intuitive, although I had to play around with a lot of the parameters to get used to ’em.

Honestly, I wish I had done more searching in this project to figure out a way to capture the black canvas without using Pyscreenshot. It really sucks that I can’t move the window, or else it’ll screenshot whatever window is behind it or my desktop background. I mean, well shit... it works... but it can be better.
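One option that keeps Pyscreenshot but drops the hard-coded coordinates is to ask Tkinter where the canvas currently sits on the screen. This is a minimal sketch, assuming the screenshot library and Tkinter agree on pixel coordinates (display scaling can break that); `canvas_bbox` is a hypothetical helper.

```python
def canvas_bbox(root_x, root_y, width, height):
    """Return (x1, y1, x2, y2) for a widget whose top-left corner is at (root_x, root_y)."""
    return (root_x, root_y, root_x + width, root_y + height)

# Inside saveImg() this could replace the fixed bbox, using standard Tkinter widget methods:
#   bbox = canvas_bbox(drawingBoard.winfo_rootx(), drawingBoard.winfo_rooty(),
#                      drawingBoard.winfo_width(), drawingBoard.winfo_height())
#   im = ImageGrab.grab(bbox=bbox)

# With the numbers from tkinterInterface.py this gives back the hard-coded box
print(canvas_bbox(929, 333, 256, 256))
```

This only removes the "do not move window" restriction. The canvas still has to be visible and not covered by another window, and the reported widget size may include a small border, so the box can come out a few pixels larger than 256.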

Scikit-Learn

Man oh man, did I learn a lot about scikit-learn from this project! After learning about scikit-learn through countless hours of DataCamp lessons, putting my skills to use in this li’l project paid off soo much. I spent about 18 hours in total over the course of 3 days making the project from scratch, and looking back, wow, that was a lot of time. The hardest part of the code, and the most satisfying once I understood it, was the flow of data: how the screenshotted image goes from a multidimensional matrix to a 1D vector.

Looking back, I wonder why the heck I decided to use a single train_test_split. Honestly, I should have used k-fold cross-validation or even leave-one-out CV (but LOOCV is too slow and I want people with potato computers to be able to run this). Another thing I learned (a big one) was that I should have tested other types of classifiers, and I realized that arbitrarily picking 7 neighbours for my model wasn’t the best idea.

I should have explained why I chose KNN as my classifier and why I set the number of neighbors to 7. I would have done this by testing the same data with different models and using Matplotlib to visualize the results.
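A comparison along those lines could look like the sketch below, which uses k-fold cross-validation to score a few neighbor counts on the same digits data. The candidate values and the choice of 5 folds are assumptions made for illustration, not settings from the original project, and plotting is left out to keep it short.

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Mean 5-fold cross-validated accuracy for a few candidate neighbor counts
scores = {}
for k in (1, 3, 5, 7, 9):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

for k, score in scores.items():
    print(f"n_neighbors={k}: mean accuracy {score:.3f}")

best_k = max(scores, key=scores.get)
print("Best of the candidates:", best_k)
```

Swapping `KNeighborsClassifier` for another scikit-learn classifier inside the loop would give the model-to-model comparison in the same way.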

The most exciting “figure it out” moment in the entire project was when I was trying to work out the matrix conversions needed to turn a 256 by 256 image into an 8 by 8 one. Holy sweet Jesus on a motorbike, that was a challenge for me. Anyways, after watching some Khan Academy videos on matrix arithmetic and trying the stuff out on paper, I intuitively understood the concepts and tried implementing them in the code.

Good thing that it actually worked haha. I was close to giving up, but hey, I knew that I had to keep pushing myself and that I was bound to get results if I kept on trying. Don’t focus on the result. Focus on the learning process.
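For anyone who wants to see the 256 by 256 to 8 by 8 idea in plain NumPy, here is one simple way to do it: average each 32 by 32 block of pixels into a single value, then flatten the result. This is a sketch of the concept only. It is not necessarily what `Image.resize` does internally, and `block_mean_downscale` is a hypothetical name.

```python
import numpy as np

def block_mean_downscale(image, out_size=8):
    """Shrink a square 2D array by averaging equal, non-overlapping blocks."""
    image = np.asarray(image, dtype=float)
    side = image.shape[0]
    if image.shape != (side, side) or side % out_size != 0:
        raise ValueError("expected a square image whose side is a multiple of out_size")
    block = side // out_size  # 256 // 8 = 32 pixels per block side
    # Split rows and columns into (out_size, block) pairs, then average each block
    return image.reshape(out_size, block, out_size, block).mean(axis=(1, 3))

canvas = np.zeros((256, 256))
canvas[:32, :32] = 255            # paint the top-left 32x32 block white
small = block_mean_downscale(canvas)
vector = small.flatten()          # 8x8 matrix -> 64-element 1D vector
print(small.shape, vector.shape)
```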

Next steps

So yeah, I did realize that the 8 by 8 training data (each image is a vector of length 64) wasn’t the best to use, because the model was inaccurate for particular digits like 9, 8 and 4. What I should have done was train on the 28 by 28 MNIST version and NOT retrain the model every time I run the entire program. That was really stupid of me, but hey, I learned! Another way to overcome this is to straight up upload these files to the cloud and deploy the project as a web app for people to try out for themselves! Again, in the video I mentioned that I picked the small 8 by 8 digits dataset because I wanted everybody to be able to run the code, but then I realized: why not just deploy it to the cloud or my personal site?
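Avoiding a retrain on every run could be done by saving the fitted model to disk once and loading it afterwards. The sketch below uses Python's built-in pickle module; `load_or_train` and the file name `knn_digits.pkl` are hypothetical, and for brevity it fits on the whole 8 by 8 dataset rather than a train/test split.

```python
import os
import pickle

from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

def load_or_train(path="knn_digits.pkl"):
    """Load a saved model if one exists; otherwise train, save, and return it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f)
    digits = datasets.load_digits()
    model = KNeighborsClassifier(n_neighbors=7)
    model.fit(digits.data, digits.target)
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return model
```

Two caveats: only unpickle files you created yourself, and load the file with the same scikit-learn version that saved it. The saving is also small for KNN on this dataset; it would matter more for a model trained on the full 28 by 28 MNIST.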

So... that’s what I’m going to do next!

What are the next steps?

  • Learn PyTorch
  • Learn Flask
  • React/React Native
  • Next.js
  • Vercel

Hi! 👋 I’m Evan, a 16-year-old who’s interested in machine learning and positively impacting communities at scale!

If you enjoyed the article, the clap button is right below! Trust me, it’ll only take a sliver of time to click it once… and a few slivers of time for a few claps 😄. Connect with me on LinkedIn or shoot me an email if you have any questions!

Evan Lin

Innovator at The Knowledge Society (TKS). Interested in Machine Learning and Quantum Computing.