
Intellectual curiosity


So I failed two job interviews this week. The reason in both cases is the same – I am not “business savvy,” or at least not savvy enough.

Of course, I can rationalize these failures by mentioning that I am really not in the business these two companies operate in, or that the interviewers did not give me enough guidance and sufficient orientation in their business models, or that it’s not realistic to expect a candidate to give a precise answer to an open-ended question, or that the people who interviewed me are educated morons, or whatever other excuses I can come up with.

Nevertheless the fact remains: none of my technical skills help me advance my career. So, why should I continue studying weird and obscure stuff, take Coursera classes and read thick books and hard-to-find papers? Nobody seems to care about what I know. As long as I can articulate how to compare proportions and explain the difference between mean and median, I am deemed to have adequate skills for a data analyst position.

And this is fine. If a company does what it needs to do by utilizing just a bare minimum of analytics, more power to them! Simple analytical solutions do not entail a business disadvantage. Quite the opposite: these people do not have to utilize heavy-duty machinery to get the job done and stay profitable.

But my experience raises the following question: am I spending my time and energy wisely?

Intellectual curiosity, which Wikipedia describes as “a term used to describe one’s desire to invest time and energy into learning more,” is widely praised and generally considered a good thing. Still, it is a curiosity, i.e. an aimless interest in knowing something, without having any definite purpose to know. The phrase “I am curious about” suggests that the things the speaker wants to know are not essential or critical. They are merely “interesting,” and the speaker can happily live without learning about them. Curiosity is not a desire or a passion; it’s an idle interest rooted in boredom.

Intellectual curiosity has been the main driving force for my studies, or at least I like to think of it this way. It turned out I was curious about the wrong things. Instead, I should have spent time reading business cases, fluffy stuff about customer relationships, marketing, management… business magazines, maybe?

And these so-called “studies” shouldn’t be driven by “curiosity” but by a clear and sober realization of what the market demands of data scientists today.

No matter how I feel about it.

Two losses…


Our team lost two people in the last month. In both cases I was not able to see them before they moved on – I was traveling. Although I may bump into them on occasion, the world is not the same now. I am going to miss my greeting “Dude!” in the morning and hearing “Duuude…” in return. Such things are contextual: they exist only in a given place and time and cannot be reproduced in other circumstances.

People tend to drift apart when they don’t communicate for long. As is often the case, we do not have lots of things to talk about apart from work-related topics, and it means we are not going to stay in touch. This is a sad reality.

Relationships developed at the workplace do not survive departures particularly well, at least in my experience. Isn’t it a curious consequence of our jobs becoming a significant part of our lives? Your best friends and people you want to be around are your officemates.

One of these guys was my boss for the last three years. During this time I saw him grow into a rare breed of manager who really cares about the whole team and each one of us. Our 1-on-1s were something I always looked forward to. They meant a lot to me, both professionally and personally.

So, what now?

We’ll see. It won’t be the same – this universal cliche is quite appropriate here. I understand that the way forward entails leaving the past behind and embracing the new reality. I must admit though that starting with a clean slate in the old place may be a tough call.

What is it they say about new wine and old wineskins?

Useful findInterval() function


Last week I worked on a seemingly simple, almost trivial problem – mapping IP addresses to countries. Free services that return full geographical location data for a given IP address are well known. Some of them have APIs that can be called programmatically. Nothing out of the ordinary; people have been doing that for a long time. My problem was that I needed to do it fast and for a big data set, effectively adding a country to the stream of IP addresses coming from an online service. And I wanted to do it in R.
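The full post is not reproduced on this page, so what follows is only a minimal sketch of how base R's `findInterval()` might be applied to this kind of lookup, not necessarily the approach the post takes. It assumes IPv4 addresses and a table of non-overlapping address ranges sorted by their starting address. The helper names `ip_to_num` and `ip_to_country`, the `ranges` table, and the country codes in it are all made up for illustration.

```r
# Convert dotted-quad IPv4 strings to numbers.
# Doubles are used because the values can exceed R's 32-bit integer range.
ip_to_num <- function(ip) {
  octets <- do.call(rbind, lapply(strsplit(ip, ".", fixed = TRUE), as.numeric))
  as.vector(octets %*% c(256^3, 256^2, 256, 1))
}

# Hypothetical lookup table: one row per address range (illustrative data only).
ranges <- data.frame(
  start   = ip_to_num(c("10.0.0.0", "20.0.0.0", "30.0.0.0")),
  end     = ip_to_num(c("10.255.255.255", "20.0.255.255", "30.0.0.255")),
  country = c("AA", "BB", "CC"),
  stringsAsFactors = FALSE
)
ranges <- ranges[order(ranges$start), ]  # findInterval() needs sorted breakpoints

ip_to_country <- function(ip, ranges) {
  x <- ip_to_num(ip)
  # Index of the last range whose start is <= x; 0 if x precedes all ranges.
  idx <- findInterval(x, ranges$start)
  idx[idx == 0] <- NA
  # An address that falls in a gap between ranges must not be matched.
  hit <- !is.na(idx) & x <= ranges$end[idx]
  ifelse(hit, ranges$country[idx], NA_character_)
}

res <- ip_to_country(c("10.1.2.3", "20.1.0.0", "5.5.5.5", "30.0.0.7"), ranges)
```

Because `findInterval()` is vectorized and searches the sorted breakpoints for each input, the whole vector of addresses is resolved in one call instead of one lookup (or one web request) per address.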


User session – what is it?


I had been aware of this problem for a long time, but always managed to circumvent it somehow. It popped up again last week. I couldn’t dodge it because another, more important problem required a solution to this old issue. It was time to tackle it head-on. The topic of today’s post is the definition of a user session.

Why I use data.table. Part 1



The R package data.table showed up in my site-library in the summer of 2013. The problem I was working on at that time can be broadly described as a binary classification task, very similar to fraud detection. It was supposed to run overnight as part of the data warehouse loading jobs. Modeling took me a while, but I ended up with a surprisingly accurate model with a handful of meaningful predictors. Everything was looking great until I realized that my model would not scale. To achieve acceptable predictive accuracy I had constructed a rather complex set of features. Regular R data frames were just too slow to build them; the transformations needed to create the features took a very long time to run even on moderately sized data. Looking for alternatives led me to data.table. Since then I have routinely used data.table in all my work.

Just give me the average!


When you hear this phrase from your boss, just give her the average. You should stop pushing for the “right” way to measure an important business metric. Do not keep trying to convince her that averages are misleading, do not tell scary stories about averages you read in textbooks, do not say that another metric is a better choice, that there are confounders we still need to account for, or any of the many other things that you think you know better than she does. She will not listen anyway. Just give her the average and consider it done.

Playing with factors


I have always struggled with R factors. What are they? How do you manipulate them? More importantly, how should you think about them? Today I finally sat down and spent a few hours playing around, trying to understand them better. The results are below.

Verdict: dangerous
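The post's own experiments are not shown on this page. As one well-known illustration of why factors deserve caution (an assumption about the kind of surprise meant here, not a reproduction of the post's results), converting a factor of number-like strings straight to numeric returns the internal level codes rather than the original values. The variable names below are made up for the example.

```r
# A factor built from number-like strings; levels are sorted as text.
f <- factor(c("10", "20", "20", "5"))

# Direct conversion returns the integer level codes, not 10, 20, 20, 5.
codes <- as.numeric(f)

# Going through character first recovers the original values.
values <- as.numeric(as.character(f))
```

The two results differ silently, with no warning, which is what makes this conversion easy to get wrong inside a longer pipeline.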