Thursday, September 14, 2017

Checking for Zero Values in Go

In Go, every type has a zero value, which is the value a variable of that type gets if it is not initialized. I had a configuration object of type map[string]interface{} and needed to check whether a value exists and is not a zero value.

Here's a small piece of code that checks for zero values:
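
A minimal sketch of such a check, assuming Go 1.13 or later (for reflect.Value.IsZero). The helper names isZero and hasValue are hypothetical, chosen for illustration:

```go
package main

import (
	"fmt"
	"reflect"
)

// isZero reports whether v holds the zero value of its dynamic type.
// A nil interface value is treated as zero too.
// (Hypothetical helper; requires Go 1.13+ for reflect.Value.IsZero.)
func isZero(v interface{}) bool {
	if v == nil {
		return true
	}
	return reflect.ValueOf(v).IsZero()
}

// hasValue reports whether key exists in cfg and maps to a non-zero value.
// (Hypothetical helper name.)
func hasValue(cfg map[string]interface{}, key string) bool {
	v, ok := cfg[key]
	return ok && !isZero(v)
}

func main() {
	cfg := map[string]interface{}{
		"host": "localhost",
		"port": 0, // zero value for int
	}
	for _, key := range []string{"host", "port", "missing"} {
		fmt.Println(key, hasValue(cfg, key))
	}
}
```

Note that with this approach a non-nil but empty map or slice does not count as a zero value, since the zero value of those types is nil.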

Saturday, July 15, 2017

Generating Power Set using Bitmap

I was asked to write a function that generates the power set of a set of items. At first I wrote a recursive algorithm, but then another approach came to mind. When you count the subsets, each item in the original set is either in a given subset or not, which gives 2^n subsets. This yes/no choice per item can be seen as a bitmask, and since we know there are 2^n subsets, we can use the numbers from 0 to 2^n-1 as bitmasks.
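
The bitmask idea can be sketched as follows. This is an illustration in Go, not necessarily the code from the original post; the function name powerSet is an assumption, and the sketch only works while 2^n fits in an int:

```go
package main

import "fmt"

// powerSet returns all subsets of items. Each number from 0 to 2^n-1 is
// used as a bitmask: bit i set means items[i] is included in the subset.
// (Hypothetical name; assumes len(items) is small enough for 1<<n to fit in an int.)
func powerSet(items []string) [][]string {
	n := len(items)
	total := 1 << uint(n) // 2^n subsets
	subsets := make([][]string, 0, total)
	for mask := 0; mask < total; mask++ {
		subset := []string{}
		for i := 0; i < n; i++ {
			if mask&(1<<uint(i)) != 0 {
				subset = append(subset, items[i])
			}
		}
		subsets = append(subsets, subset)
	}
	return subsets
}

func main() {
	for _, subset := range powerSet([]string{"a", "b", "c"}) {
		fmt.Println(subset)
	}
}
```

For three items the masks run from 0 (binary 000, the empty set) to 7 (binary 111, the full set).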

Monday, June 19, 2017

Who Touched the Code Last? (git)

Sometimes I'd like to know whom to ask about a piece of code. I've written a little Python script that shows the last people who touched a file/directory and the ones who touched it most.

Example output (on the Arrow project):
$ owners
Most: Wes McKinney (39.5%), Uwe L. Korn (15.3%), Kouhei Sutou (10.8%)
Last: Kengo Seki (31M), Uwe L. Korn (39M), Max Risuhin (23H)

So ask Wes or Uwe :)

Here's the code:

Monday, June 12, 2017

Go's append vs copy

When we'd like to concatenate slices in Go, most people reach for append. Most of the time this solution is OK; however, if you need to squeeze out more performance, copy will be better.

EDIT: As Henrik Johansson suggested, if you pre-allocate the slice you append to, it'll be fast as well.
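
The three variants can be sketched like this. The function names are hypothetical and the original post's benchmark is not reproduced here; measure with your own data before choosing one:

```go
package main

import "fmt"

// concatAppend grows the result with append, starting from a nil slice,
// so the runtime may reallocate as the slice grows.
func concatAppend(a, b []int) []int {
	var out []int
	out = append(out, a...)
	out = append(out, b...)
	return out
}

// concatCopy allocates the final size once and fills it with copy.
func concatCopy(a, b []int) []int {
	out := make([]int, len(a)+len(b))
	n := copy(out, a) // copy returns the number of elements copied
	copy(out[n:], b)
	return out
}

// concatPrealloc uses append on a slice whose capacity is reserved up front,
// so no reallocation is needed while appending.
func concatPrealloc(a, b []int) []int {
	out := make([]int, 0, len(a)+len(b))
	out = append(out, a...)
	return append(out, b...)
}

func main() {
	a, b := []int{1, 2}, []int{3, 4, 5}
	fmt.Println(concatAppend(a, b))
	fmt.Println(concatCopy(a, b))
	fmt.Println(concatPrealloc(a, b))
}
```

All three return the same elements; they differ only in how the backing array is allocated.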

Sunday, May 14, 2017

scikit-learn Compatible Pipeline Steps

A client wanted a way to create a pipeline of transformations on DataFrames. Since they already work with scikit-learn, they were familiar with Pipelines. It took very little code to create a base class for a pipeline step that will work with DataFrames.

Wednesday, March 29, 2017

Color Log Lines

There are several log handlers out there that color log lines. However, I prefer to leave the log as is, and when I want colors I pipe the log through a utility that does that. Coloring log lines by level can easily be done with a little awk.

Here's an example of how it looks:

