N Recursions: October 2018

16 October 2018

Installing Netbeans 8.2 on Ubuntu 16.04 or 18.04 for Python functionality

Most Python editors are either short on features, take up way too much memory, or are just a pain to install. If you want a simple editor for Python, try Geany. Note that if you use Python 3, you'll have to specify it in Geany's compile and execute commands.

If you are a Netbeans fan, you'll need Java 8 to be able to install Netbeans.

You can either install Oracle Java 8 from this PPA:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

Or install OpenJDK 8 from the default repositories:
sudo apt-get install openjdk-8*

Now open the environment file:
sudo vi /etc/environment

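Add the following line to the environment file (for Netbeans to be able to find Java). The path shown is for the Oracle install; if you installed OpenJDK instead, the directory is typically /usr/lib/jvm/java-8-openjdk-amd64 (check with ls /usr/lib/jvm):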
JAVA_HOME="/usr/lib/jvm/java-8-oracle"

Save and exit, then reload the file and check that the variable is set:

source /etc/environment
echo $JAVA_HOME

Now download the Python plugin for Netbeans 8.2. You'll get a zip file. Extract it and open Netbeans.

Go to the Tools > Plugins menu option.

Select the "Downloaded" tab.
Click "Add Plugins".
Select the folder that you just extracted from the zip file and you'll see many ".nbm" files.
Press Ctrl+A to select all of them and press "Ok".
Click the "Install" button in the bottom-left corner. Accept the license terms and click "Install".
Restart Netbeans.

Select File > New Project > Python > Python Project.
You'll see a "Manage" button which you can use to change the Python platform to Python 3 if you need to.

Click the "New" button, navigate to /usr/bin/ and you'll find a "python3" file among a lot of files. Select it and click "Ok".
Now select Python 3 and click "Make Default".
Click "Close".
Click "Finish".

That's it! Your Python 3 project is ready to run in Ubuntu with Netbeans.
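
To confirm that the project really runs on the Python 3 platform you selected, a minimal sketch like the one below can go in the project's main file. The function name describe_interpreter is only an illustrative choice, not something Netbeans generates.

```python
import sys


def describe_interpreter():
    """Return the path of the running interpreter and its major version."""
    return sys.executable, sys.version_info.major


if __name__ == "__main__":
    path, major = describe_interpreter()
    # If the platform was set up as described, the path should point at
    # the python3 binary that was selected and the major version should be 3.
    print("Interpreter:", path)
    print("Python major version:", major)
```

If the major version printed is 2, the project is most likely still using the old default platform.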

A better tutorial on the Haar features used in the Viola-Jones algorithm

One of the most confusing aspects of the Haar features used for the Viola-Jones algorithm is the black and white rectangles. It took me quite a while to figure them out, since neither Viola and Jones's research paper nor any of the tutorials explained them well. Besides, the concept shouldn't be shown as filled black and white rectangles in the first place. So here's a change:

Instead of showing Haar features like this:

Show them like this:

People need to understand intuitively that it is not the black and white rectangles that are important, but the actual pixel values within the rectangles. For good contrast, show them as yellow and red rectangles if you like.

Source of the original images: University of Oulu's website.

Why are we using those rectangles?

If you were searching for a line in an image, you'd use a mask that's shaped like a line. In the same way, when we search for a face, we can use a mask that is shaped like a face or, to reduce computation, we could just search for parts of the face that almost always have dark and bright pixels in a certain pattern. The eyes and forehead are one such example. The pixels at the eyes will almost always be dark and the pixels at the forehead will almost always be brighter than the pixels at the eyes. So the black rectangle in the figure above just says that we are looking for a rectangular region where most of the pixels will be dark. The white rectangle above it means that wherever we find such dark pixels (the eye area), the rectangular area directly above it must have plenty of bright pixels (the forehead area).

Of course there would be plenty of other places in the picture with similar bright and dark areas, but the area that is most likely to be the eyes and forehead will give the highest Haar value (calculated in the formula below). This is also why you should not only search for the eyes and forehead, but also search for the nose and lips and ensure that the features you found are in the correct positions with respect to each other. That's how you'll be able to ensure that you have located a face. So based on what kind of black and white pixel pattern you are searching for, you have to design your Haar feature (the black and white rectangles) in such a way that it results in the highest value for whatever feature you are searching for.

How to do the calculations?

Start by normalizing pixel values. If you have your grayscale image pixel values in a 2D matrix M that can hold grayscale values from 0 to 255, then divide all values in the matrix by 255 to normalize them. M will now have values ranging from 0 to 1.

Haar value = ((sum of values within white rectangle area in M) divided by (number of pixels within white rectangle)) minus ((sum of values within black rectangle area in M) divided by (number of pixels within black rectangle)).

The closer the Haar value is to 1, the more likely it is that you've found the facial feature you were trying to match. In the image above, we were trying to find areas where the darker pixels of the eye region have an area above them consisting of lighter pixels of the forehead.
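
The calculation above can be sketched in a few lines of plain Python. This is only a toy illustration under my own assumptions: the function names (normalize, rect_mean, haar_value), the (top, left, height, width) rectangle format and the 4x4 image are all made up for the example, and a real detector would use integral images rather than summing pixels directly.

```python
def normalize(pixels):
    """Scale 0..255 grayscale values to the range 0..1."""
    return [[value / 255 for value in row] for row in pixels]


def rect_mean(M, rect):
    """Average of the values of M inside rect = (top, left, height, width)."""
    top, left, height, width = rect
    total = sum(M[r][c]
                for r in range(top, top + height)
                for c in range(left, left + width))
    return total / (height * width)


def haar_value(M, white, black):
    """Mean of the white rectangle minus mean of the black rectangle."""
    return rect_mean(M, white) - rect_mean(M, black)


# Hypothetical image: two bright "forehead" rows above two dark "eye" rows.
image = [
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
M = normalize(image)

white = (0, 0, 2, 4)  # top half, where bright pixels are expected
black = (2, 0, 2, 4)  # bottom half, where dark pixels are expected
value = haar_value(M, white, black)
print(value)  # 1.0, a perfect match for this feature
```

Swapping the two rectangles on the same image gives -1.0, which is the feature telling you that the pattern is upside down at that location.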

Other Haar feature shapes

Don't worry when you see shapes like this:

It's the same concept. Simply take the sum of all pixels from both white areas in the normalized image matrix M and the sum of all pixels from both black areas in M and subtract in the same way we did earlier. This particular shape is to detect some dark diagonal feature. You can also create your own Haar feature shapes based on what facial feature you are trying to detect.
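
Here is a rough sketch of the same calculation for a feature with two white and two black areas. I'm assuming a checkerboard layout with the white areas at the top-left and bottom-right; the function names and the tiny 4x4 matrix are hypothetical and only there to show the arithmetic.

```python
def rect_sum(M, rect):
    """Sum of the values of M inside rect = (top, left, height, width)."""
    top, left, height, width = rect
    return sum(M[r][c]
               for r in range(top, top + height)
               for c in range(left, left + width))


def pixel_count(rects):
    """Total number of pixels covered by a list of rectangles."""
    return sum(height * width for (_, _, height, width) in rects)


def haar_value_multi(M, whites, blacks):
    """Mean over all white areas minus mean over all black areas."""
    white_mean = sum(rect_sum(M, r) for r in whites) / pixel_count(whites)
    black_mean = sum(rect_sum(M, r) for r in blacks) / pixel_count(blacks)
    return white_mean - black_mean


# Already-normalized toy matrix with a dark diagonal band running from
# the top-right to the bottom-left.
M = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

whites = [(0, 0, 2, 2), (2, 2, 2, 2)]  # top-left and bottom-right
blacks = [(0, 2, 2, 2), (2, 0, 2, 2)]  # top-right and bottom-left
value = haar_value_multi(M, whites, blacks)
print(value)  # 1.0
```

The only change from the two-rectangle case is that the sums and pixel counts are accumulated over a list of rectangles instead of a single one.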

To learn more about Integral images and Haar features, I recommend Balazs Holczer's tutorials. Well explained, and it's pleasantly amusing to hear him say "Lots of lots of" and the way he says "Feeeeeatures" :-)

Integral images

Haar features

14 October 2018

Skills required in the field of Machine Learning and Data Science

While doing a literature survey for an assignment, I came across this Medium post by Jeff Hale where he lists some of the skills and technologies that are most in demand during 2017 and 2018 for jobs in Machine Learning.

What a job-seeker should know is that it isn't the highest bar in the graph they should be looking out for. These graphs only show you what the industry wants. There are many job descriptions that list an array of skills but the recruit is actually made to work on something much more mundane like data cleaning or preparing presentations.

It's far more important to identify which area of Machine Learning interests you. Do your own little research or build hobby projects specific to that area of interest and see if you can integrate it with other Machine Learning paradigms.
When you look for a job, don't just look for the skill-set they require or the projects they work on. Look for what your role would be in the project and have a very careful look at the Glassdoor reviews and Indeed reviews about the company culture. During the interview, make sure you get to meet the actual people who would be supervising you and if possible, even the team you'd be working with. Past experiences have shown that the way you are treated before, during and after the interview is a very good indicator of how you'll be treated after you join the company. You can even identify potentially toxic people you'd want to avoid.

Machine Learning and Data Science technologies are here to stay. If you've equipped yourself with the necessary basic skills, you won't find yourself struggling to land a job. What you do have to worry about is whether you end up as just a soldier in an army of data gorillas or whether you build products that you enjoy building, do research that is fulfilling and make the world a better place for everyone.

13 October 2018

A better way to evaluate students and improve education

It's one thing to go through school and college because you are forced by your parents. It is another thing to do it because you really want to learn.

I've had the latter opportunity during the past one and a half years of pursuing my M.Tech in AI.
Getting back to the classroom environment after spending a decade as a professional working in the industry, one gets to view academia with a broader perspective and an understanding of how things work.

1. Poor textbooks

Wanting to revise some of the basics of integration, differentiation and partial differentiation, I looked up my old Bachelor of Engineering textbooks and was shocked at the content.
The explanation of concepts was almost non-existent. The worked-out problems were missing many steps that would help a student understand them. There was hardly any explanation about the history of the techniques. There was no explanation about how these techniques would be used in practical applications.

How to improve:

A consortium of students can evaluate textbooks to check whether they convey the concepts in a way that newbie students would be able to understand even if there was no teacher to teach the subject. Stop buying any textbooks that are not good enough.
Create a wiki which collates the best three internet sources that teach any particular concept very well.
Teachers across the globe who are best known for teaching a particular subject well could prepare course material and distribute it for free to the world (one such excellent source is Coursera).
Create a template for textbooks, which specifies how the author should introduce the topic. First, a brief history of what was lacking that led to the introduction of a new technology or technique. Then a comparison of how it fares with respect to other techniques. Then an introduction to the technique itself and finally, how the technique is used in various real-world applications.

2. Poor explanation in classrooms

I remembered that a majority of the teachers in schools and colleges were people who either didn't know the subject well enough or didn't know how to break it down into concepts that could be digested by the students easily. It isn't entirely their fault either. They themselves were a product of the same education system that didn't care if the students actually learnt anything. I've never heard of colleges conducting any screening/audition sessions for teachers to check if they had the ability to actually teach!

On a side-note, an often ignored point in classrooms is sleepiness. When you feel thirsty, you don't go and start exercising. You drink water. In the same way, when you feel sleepy, you aren't supposed to drink coffee to stay awake. You are supposed to sleep. When students feel sleepy in class, they should be allowed a 10-minute nap instead of being asked to remain awake.

How to improve:

Before hiring a teacher, conduct a session where they are asked to explain at least three different randomly chosen topics of varying difficulty. The teacher should be allowed sufficient time to prepare for the topics, but then be rigorously evaluated on their ability to convey the topic in a simple manner that students can understand easily. They can choose any method to deliver the lecture. Just spoken words, the white-board, a slideshow or even VR.
The teachers can also use a template for teaching, where they first introduce the history of the technology, compare it to other techniques, teach the actual technique and then explain how it is used in real-world applications.
Repetition helps. So it can also help to adopt a teaching technique where the teacher covers the entire syllabus in a few days, where the basics of all concepts are touched upon, and then takes up each topic one-by-one. It's a reality that a student's mind might wander during a lecture, and the second repetition can help them get back on track.
Allow sufficient breaks and nap-time when students feel sleepy. A ten-minute nap can be refreshing for them and it'd help if teachers can also take small breaks.

3. Wasted, unsafe practicals

Lab sessions were introduced so that students would get hands-on experience with whatever they learn. However, not all labs go as they are intended. They either follow a mundane list of to-do things or are just plain boring. Many schools and colleges don't use safety equipment either. I heard of a student at NTTF who lost an eye when a piece of metal shot into his eye while he was working on machining it.

How to improve:

Use safety equipment. There's no excuse.
Use half the lab time to let students practice what the lab manual stipulates, and the other half for the teacher to challenge them to try creative things (that aren't dangerous), tweaking their existing understanding to see what happens if they try something different. This curiosity is a precious childhood trait we all have, which gets crushed by years of disciplining. It helps to unleash this trait in a safe, controlled environment and observe how learning actually becomes fun.

4. Scrap the written exam

Every student learns differently and at a varied pace. You can't put everyone in a similar class with horrible teachers and expect them to actually learn.
In all those subjects in which I didn't score well, my parents, relatives and I used to think I was too dumb to learn. In later years I learnt that it was the above three points that made the subjects boring and un-learnable. The subjects were actually easy. They were very interesting too, when I looked at them after many years. Yet, at the time I studied them, they seemed horrible.
Moreover, many of those who were extremely adept at memorizing information and reproducing it with perfection in the exam hall were clueless when they were asked to generalize and creatively apply the concept. They didn't even know where to start.
A high score in an exam does not mean the student is intelligent. It means they have the capability to assimilate information and remember it for longer than others. This does not mean they would be able to apply the concept well in real-life.
So when you hire people into your organization, think about the role. If all you want are people who do what they are told, go ahead and hire those with a high GPA. If you want people who love applying concepts and building things, hire those who create their own personal hobby projects. One crucial point to note is that you shouldn't manage the latter bunch of people in the same way as you'd manage the former. The creative bunch of people deserve a lot more trust, freedom of thought and expression. If you constrain them, it's as good as having not hired them at all.

How to improve:

Either scrap the written exam altogether or create two types of exams that students can choose from. One is the typical exam where people memorize things and write them down. The other is an exam where students are given challenges to apply what they've learnt and even come up with new discoveries. Don't make goldfish climb trees. Don't crush the confidence of children by showing them a written-exam score which does not really tell them anything about their innate skills.

Perhaps the education system would only be able to change once the industry starts being more specific about the kind of people it hires. Our roads are in bad shape today because of people who don't really care about creating good roads. Many doctors can't diagnose patients well because they never really wanted to become doctors. Many engineers disregard safety and design best-practices because they never really wanted to be engineers. You can see this in every other profession: religion, politics, education, manufacturing, sales, aviation...

Isn't it time we had a system which could evaluate children for what they are best at, and allowed them to pursue that as a career interest? To allow people to pursue what they love doing and are good at doing. If anything, it'll lead to a happier, more comfortable world to live in.