Saturday, January 15, 2011


I happened upon a website called Codility a while ago. It's a website where employers can test candidates for a programming job through a programming "test" system of sorts.

On their blog, they have an interesting post (which points to another thread) titled, "Is Codility Fair?" The main point of contention is whether the "Codility System" is a fair mechanism for testing the programming abilities of a candidate. If the programming test description does not give enough detail on what is required, is it the fault of the programmer for not programming something properly?

The debate is over what set of assumptions it is fair to expect every programmer to make when writing their code. For example, is handling corner cases always a fair assumption? Is writing optimal code always a fair assumption? Codility suggests these are assumptions that all good programmers make, and that a programmer who doesn't make them is a weaker programmer.

Generally speaking, I agree with Codility's sentiments. However, where is the line drawn? What assumptions are reasonable for a person to make and what aren't? Is the system set up in a manner that can actually differentiate a good programmer from a bad one? I took the Codility demo test for a whirl to see how the system works. These are my impressions and some thoughts.


I consider the test interface slightly unfair and bordering on bad. At the bottom of the programming test interface is a button called "verify". It tells you if the code you've programmed has passed or failed a set of test cases.

To me, this button strongly suggests that once your code passes verification, you have passed the test and are done. However, this is not the case. There is an additional button called "submit task". After you submit your code, it is executed against additional test cases. You are then given a grade based on the results (if this were a real test from an employer, you wouldn't see the grade).

I feel that this is a confusing interface. I believe many people will assume they've completed the assignment when in fact they have not. My first time through the demo, I implemented a cheap and quick solution (normally I go back and optimize afterwards; the first pass is just to wrap my head around the problem). When I saw that I passed the tests via the "verify" button, I went ahead and clicked "submit task", believing that I had passed the test and there was no need to continue. I wasn't aware that I would be given an additional grade. If you're taking a test from an employer, you won't even see the grades, and will just assume you're done.

The lesson learned is if you are about to take a programming test on Codility (or a similar website), go through the demo and learn about how their system works before taking a test from an employer.


The instructions are not unfair, but I have a hard time believing that the Codility System is properly set up to handle all possible interpretations and assumptions. In other words, it's a fair system, just not a good one.

As an example, suppose a question asks you to write a function called mystrlen() that measures the length of a string. Which of the snippets below handles the NULL pointer corner case correctly?

int mystrlen(const char *str) {
    if (!str) return -1;
    /* ... rest of code */
}

int mystrlen(const char *str) {
    /* ... rest of code */
}

I would argue that either of the above handles the NULL pointer corner case perfectly fine. Unless the description states specifically which one Codility wants, which one is correct? Will Codility accept either one as a solution?

Of course, one can also argue that not checking for NULL is fine as well. After all, the standard GNU libc implementation of strlen() doesn't check for it; my recollection is that most libc implementations don't. Passing NULL is defined (by POSIX?) as undefined behaviour and documented as such. Would the following be considered the work of a poor programmer?

int mystrlen(const char *str) {
    /* documentation states str behavior undefined for NULL */
    /* ... rest of code */
}

(Update: I originally screwed up the examples above by having the functions return size_t's. Those are unsigned types, so I changed them to ints. I suppose a prototype returning a signed versus unsigned type could suggest how a corner case could/should be handled, but please don't nit-pick on that point; think of the general argument in its entirety.)


After the test is completed by job candidates, what does Codility give the employer? Do they give the employer the actual code submitted by the candidate? Or do they only give the employer a grade?

One commenter said that Codility gives employers a report something like this:

Candidate Bob: 100%
Candidate Sam: 100%
Candidate Joe: 100%
Candidate Max: 98%

So it looks like Bob, Sam, and Joe will get an interview, but Max is out of luck. Those two percentage points are hardly enough to suggest that Max is a poorer programmer than the others. Given the issues I describe above under #1 and #2, I would bet that several lower-scoring programmers could be equally skilled but just made a few incorrect assumptions. They weren't bad assumptions, just not the assumptions Codility wanted.


After taking the demo test, I couldn't help but think of the famed Barometer Question. Does Codility want you to answer the question? Or do they want you to give them the answer they want? I'm leaning towards the latter. That's why in-person programming questions (or even IM sessions) are really the best way to conduct a programming interview. You can deal with all the assumptions and see how the person really programs. As with the mystrlen() example above, the candidate can explain their assumptions and elaborate on why they made them.

However, I believe Codility could be used as a general weed-out mechanism. While I do not believe a 95% truly differentiates someone from a 90%, I do believe that someone who scored a 95% is better than someone who scored a 5%. Whenever I ask programming questions during an interview, the goal is not to determine whether a candidate is a good programmer or an incredible programmer. The goal is to determine whether the candidate is a terrible programmer or a not-terrible programmer. I think that's the way employers should approach using Codility.

Update (1/28/11):

I suddenly had a thought: is Codility treating the candidate the way a customer treats a software developer/vendor? After mulling it over, I think the answer is no. With a customer, an engineer/developer has a set of requirements to meet. The requirements may not always be known ahead of time, so assumptions can be made about them, but ultimately you are coding towards those requirements. Even if you are a brilliant programmer, some portion of Codility is testing your assumption-making capabilities, not your coding or engineering ability.

Is testing assumption-making capabilities proper for evaluating a candidate? I think it's a perfectly valid judgement of a person's engineering ability. However, I don't believe it is under a Codility-style system. As I describe above, there are many valid and good assumptions; yours may simply not be the ones Codility wanted and will grade you against. Furthermore, a recruiter from a company won't see your assumption-making abilities. All they see is a score.


  1. It appears Codility does give the employer the test results, including the actual code. Presumably test-takers could embed their assumptions as comments.

  2. I recently took a couple of Codility tests as part of applying for a job. My personal experience is that over 95% of the code I write does not require black-belt skills in algorithm design. I rate other aspects, like writing code that is easy to follow and understand, and is maintainable, far more important than quickly figuring out the most efficient algorithm.

    If you're aware of the impact a bad solution can have on performance, you can always discuss different solutions with a colleague to find a good algorithm.

  3. I just sent them some (negative) feedback, here's what I sent

  4. There's a lot of talk about 'how fair is it to base a candidate's programming ability on an online algorithm test', because such tests tend to favour people with a strong math background and it's possible to prepare for them. And I do agree this kind of testing is far from perfect.

    However, if you think from Codility's perspective, they probably get thousands of applicants each year. They don't have the resources to interview them all in person, as it's costly. And yes, you can prepare for the tests, but I would argue that in real life you also need to spend extra time preparing for new projects in order to perform well (better?). And I think you can filter out candidates who are not serious enough about the job to spend that extra time preparing.

    I guess the next question is about interpreting the results. I really don't think they would hire someone who got 81% over someone who got 75%; that's why they have extra interviews/assessments. And even though you probably need a PhD in Math/CS to get 100%, if you get a very low score or 0, chances are you are not good at what you do. There are probably exceptions, but statistically I think that's true.

    Lastly, even if you don't get lucky, they now offer the chance to re-do the step you failed at every month. :)

  5. Honestly speaking, their coding challenges have nothing to do with real-life projects, except some SQL queries which are pretty straightforward, like retrieving the average salaries of employees from a salary table.