30 November 2016

Hashing specific columns of CSV files

When working with Big Data, you will often find that data needs to be de-personalized before it is processed or handed over to anyone, even under an NDA, simply because it contains information that is too private: the mobile numbers, names and addresses of specific people, for example.

If the data is a CSV file that you've been able to open in Excel and the columns are well identified, all you need is a simple program that irreversibly hashes the values stored in certain columns.

Note that this is hashing, not encryption. Encrypted data can be decrypted by anyone who holds the key. With one-way hashing there is no key, and the hashed data cannot be restored to its original form.

I've created a free and open source Java program called hashCSV which will do this for you.

It's released under the MIT license, so you are free to fork the project and use it for personal or commercial purposes.
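
To give a rough idea of what hashing specific columns involves, here is a minimal sketch. It is not the hashCSV source. The class name ColumnHasher, the choice of SHA-256 and the sample row are all my assumptions for illustration, and it assumes a simple CSV with no quoted fields that contain commas.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;

// Hypothetical illustration only; not the actual hashCSV code.
public class ColumnHasher {

    // Returns the SHA-256 digest of the value as lowercase hex.
    public static String sha256Hex(String value) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    // Hashes the fields at the given zero-based column indexes and
    // passes every other field through unchanged.
    // Assumption: no quoted commas, so a plain split is enough.
    public static String hashLine(String line, Set<Integer> columnsToHash) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        for (int i = 0; i < fields.length; i++) {
            if (columnsToHash.contains(i)) {
                fields[i] = sha256Hex(fields[i]);
            }
        }
        return String.join(",", fields);
    }

    public static void main(String[] args) {
        String header = "name,mobile,city";       // made-up sample data
        String row = "Asha,9876543210,Pune";
        Set<Integer> privateColumns = Set.of(0, 1); // hash name and mobile
        System.out.println(header);
        System.out.println(hashLine(row, privateColumns));
    }
}
```

One thing to keep in mind with a plain hash like this: values from a small set, such as mobile numbers, can be guessed by hashing every possible candidate and comparing, so mixing a secret salt into each value before hashing is worth considering.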



06 November 2016

Joy of programming. From the stomach to the heart!

A long time ago, I read about the first chef Google hired. They hired him specifically because he had a passion not just for cooking great food, but also for cooking it in a way that people loved. Google says he was very good for the morale of their staff.

When I joined an enthusiastic little startup, they started providing snacks to employees in the evenings, but soon ran out of ideas about which snack to get every day. Colleagues were too shy to suggest options, and many of the favorite, really tasty snacks were obviously unhealthy in the long term. The admin guy who had to bring the snacks was also hesitant to choose them himself, and would ask me every day.

A colleague and I tried to help out with this. We got the necessary permissions from the higher-ups, created a Google form, got a few other colleagues' help in expanding the snack options and sent the form out to everyone. The results were shocking. Some of the snacks we regularly got for everyone were the most heavily down-voted. Snacks we didn't expect much of a response to were upvoted by many.

I handed the list over to Mr. Admin guy, thinking the problem was solved.

Snacker is born
The next day he came to me again, asking for a timetable because he was still unsure what to buy from the list. And hence was born the idea for Snacker: a little open source program which chooses the snack of the day from a list.

Snacker initially depended on a random function to choose the snack, but we soon realized that it wasn't enough: even with a "recentness" check, the same snack turned up quite often.
Version 2 of Snacker made use of Human Intelligence (which is, as of now, far more advanced than AI).
Snacker randomly chooses four of the least recently served snacks and displays them to Mr. Admin guy. It also points out the best snack among them. He then decides which snack to buy, based on his own memory and judgment.
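
The version 2 selection can be sketched roughly as below. This is not Snacker's actual source. The class name SnackPicker, the Snack fields, the day counter and the sample snacks are all hypothetical, and I am assuming that "best" means the snack with the most votes in the survey.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the actual Snacker code.
public class SnackPicker {

    // Assumed snack entry: votes come from the survey,
    // lastServedDay is a simple day counter (smaller = longer ago).
    public static final class Snack {
        public final String name;
        public final int votes;
        public final int lastServedDay;

        public Snack(String name, int votes, int lastServedDay) {
            this.name = name;
            this.votes = votes;
            this.lastServedDay = lastServedDay;
        }
    }

    // Shuffles first so that ties in lastServedDay break randomly
    // (List.sort is stable), then keeps the `count` least recently
    // served snacks.
    public static List<Snack> shortlist(List<Snack> snacks, int count, Random random) {
        List<Snack> candidates = new ArrayList<>(snacks);
        Collections.shuffle(candidates, random);
        candidates.sort(Comparator.comparingInt((Snack s) -> s.lastServedDay));
        int size = Math.min(count, candidates.size());
        return new ArrayList<>(candidates.subList(0, size));
    }

    // Suggests the shortlisted snack with the most votes.
    // The human still makes the final choice.
    public static Snack bestOf(List<Snack> shortlist) {
        return Collections.max(shortlist, Comparator.comparingInt((Snack s) -> s.votes));
    }

    public static void main(String[] args) {
        List<Snack> snacks = List.of(          // made-up sample data
                new Snack("Samosa", 12, 3),
                new Snack("Fruit bowl", 9, 1),
                new Snack("Sandwich", 15, 2),
                new Snack("Bhel", 7, 5),
                new Snack("Sprouts", 4, 4));

        List<Snack> options = shortlist(snacks, 4, new Random());
        for (Snack s : options) {
            System.out.println(s.name + " (votes: " + s.votes + ")");
        }
        System.out.println("Suggested: " + bestOf(options).name);
    }
}
```

The program only narrows the list and makes a suggestion. The final pick stays with the person buying the snacks, which is the whole point of version 2.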

It's a simple way to automate a routine task and to combine computing power with human intelligence. Indeed, I believe this is how the future of the world will be. We won't need to fear AI, because we will be building better computing power and metallic strength into our own bodies. The world will have meta humans. This would be necessary too, because normal humans are not meant to travel in space.

Anyway, programming is fun for me. Eating snacks is fun for everyone. With a bit of adjustment of the snacks list and improvements to Snacker, this could be one program that could find its way from the stomachs of people into their hearts :-)

Don't do the work. Make the computer do the work for you!
- Navin