Python developer programming tests
I recently administered a programming test to a number of applicants for a Python developer job at 2degreesnetwork.com. Since we now have a new developer (hi Gustavo!), I figured I would post the test and see what kind of response I get to them.
There are two parts to the test, neither are hugely difficult, but there is enough scope in the solutions to understand how the candidate approaches a problem.
In the first part of the test I asked the candidate to look at the following code and implement the thousands_with_commas
function, which should take an integer and return a string representation of that integer with commas separating groups of three digits:
def thousands_with_commas(i): return str(i) if __name__ == '__main__': assert thousands_with_commas(1234) == '1,234' assert thousands_with_commas(123456789) == '123,456,789' assert thousands_with_commas(12) == '12'
I think there is a way of doing this with the standard library, and there is also an implementation in Django, but I was looking for a solution from scratch.
It worked quite well as an interview problem, because there is no one obvious solution and there are a few potential gotchas to tackle.
In the second part of the test, I asked the candidate to implement a function that uses a word list to return the anagrams of a given word.
I started them off with the following code:
def annograms(word): words = [w.rstrip() for w in open('WORD.LST')] raise NotImplementedError if __name__ == "__main__": print annograms("train") print '--' print annograms('drive') print '--' print annograms('python')
This part of the test gave a good indication of how much a candidate new about data structures and performance.
Feel free to post your solutions in the comments, although I suspect I've seen pretty much all variations on possible solutions!
Did I pass? ;-)
given an arbitrarily nested set of lists, that either have other tuples or a string (literal string in them, not unicode), write a function that does depth then breadth collapsing. Specifically, given-
I give them the option of having flatten as a generator/iterable instead. Further, I tell them they don't have to worry about recursion depth, and that it will never be cyclic. Beyond that, litering a few ‘,’ (just after the “e” for example) is thrown in since some folks trip up on that (why, hell if I know).
The nice thing about this request is that it pretty quickly nails down if someone can do basic recursion, even just thinking about it properly. It also nails down some basic list methods (append/extend primarily) and awareness of isinstance (or type, which is icky, but whatever). For those who know a bit better (but didn't cover this case), I usually then modify one of the elements from a list to a tuple and ask them to make the code generic enough to work with it… leading to hasattr/__iter__ awareness.
The sad thing about this question is that it's not that hard… but a surprisingly small number of candidates seem to get it (or even get close to a solution), sub 25% going by the last few months of interviewing people.
Either way, enjoy- it's a fun one to level during an interview.
Brian, something like this?
I'd be more impressed with a candidate that did that without recursion.
Brian, tell me if this works :)
Not that I'd expect any job applicant to remember that. I can't :)
So have this:
It would be so awesome if Python could one day gain the ability for yields to go all the way back to the caller - removing the embedded for loop.
if i == 1234:
return ‘1,234’
elif i == 123456789)
return ‘123,456,789’
elif i == 12
return ‘12’
Update: Fixed code formatting
TDD, I think you spotted a flaw in Test Driven Design.
Guessing y'all hacked that one out on the interpretter… my first response was “strings are iterable”, but for some goofy ass reason strings lack an __iter__ method (but do support iteration). I'm a bit curious why they don't expose __iter__ to introspection, but so it goes (python has some weird warts).
Learn something everyday I guess (fortunately the two folk out of 9 who got close enough to the answer did termination testing first). Other nice thing about this question is that it gives you a window into discussing optimization/performance, intermediate objects generated (and minimizing those), etc. Really kind of a fun, simple question to throw at candidates.
As for the folk asking for a non recursive version, look in http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/snakeoil/lists.py for native_iflatten_* functions. It shoves the stack itself off onto basically an iterable that is internally a deque of other iterables. Think chain, just prependable/appendable after the fact.
Note that those python implementations probably are slower in most cases then just doing a recursive generator- they were implemented that way to be equivalent to the c implementation (http://www.pkgcore.org/trac/pkgcore/browser/snakeoil/src/lists.c) which, at least at the time of writing, was pretty licketly split fast for our usage in pkgcore.
The first example reverses the list, and inserts a comma after every 3 items, adjusting for commas. The second solution appends items to a list, and after three items, it adds it to a larger list, then reduces that nested list into a string.
@rolando: your anagrams function at first glance will fail on word ‘loop’, w/ word list having ‘poll’. the chars used are the same, len is the same, but the # of chars to work with varies.
another variation, since I'm a bit bored atm-
That said, the sorted is a bit of hack, and probably not the fastest approach. Simple though.
Alberto
I think this reads easy, and for that reason I also like Brian Harring's straight list approach above. I definitely found a couple corners cases playing around with this one, and can see how it makes a good interview question.
For annograms, I went for an implementation that emphasizes speed, especially considering that the starting code suggests the entire word list will fit into memory. The word list is pre-processed and stored into a dict for rapid lookup using the sorted hash. I use a WordPlay class instance to hold and reuse the word dictionary.
But here is the long version I would have delivered:
Huub900's anagram answer is fantastic.
annograms() can be turned into a generator by changing ‘filter’ to ‘ifilter’
I think it's more readable and not too different than other solutions as far as performance is concerned.
This should, in theory, handle any iterable item, since it just tries to iterate, and catches the exception if it can't.
What about this?
Will
That gave rise to my first time I can recall where I've used modulo with a negative right operand. It gives just the starting indexes required. Here, Python's use of an index of -2 as len(s) - 2 happens to be unwanted, hence the max() to clamp it to 0. I've also added more test cases, including for some big numbers.
Perhaps someone may like to gather the various solutions here and test and time them?
Clearly, after putting in a comma, there's no need to try and match starting with the first or second digit after that comma. For 2 ** 7 ** 7 that may be a worthwhile saving. :-)
is still quite slow (even compiling the regexp just once)… so if speed is your concern then regexp is probably not the tool for this job (and if you're at this microptimization level probably python is IMO a questionable choice in general).
Moreover I think that should be possible to write a regexp engine that can run the first version without being O(n^2)
Regarding thousands with commas: it's so super easy, I'm surprised that only mj's solution uses recursion. Sure, it crashes on ridiculously long numbers, but on the other hand, it's not what it's supposed to be used for, right?
It would be better to follow up with questions on code maintenance rather than performance if you didn't mention performance when posing the original questions.
- Paddy.
I am learning Python at the moment and will apply for a future jobs; however, Python is a secondary language. What kind of questions would you ask a new Python developer?
Thanks
“ It's very tempting to use these ‘cool’ features when they're not absolutely necessary. It's harder to read, understand, and debug code that's using unusual features underneath. It doesn't seem that way at first (to the original author), but when revisiting the code, it tends to be more difficult than code that is longer but is straightforward. ”
I would have followed the above if I were hiring.
Just keep it simple.
You don't really need to put everything on a single line.
I loved reading your posts.
Does anyone of you know of reliable online test to evaluate a :
- Python trainee,
- Python Jr,
- Python Sr,
- Python CTO ?
The goal is to identify relevant candidates to implement / enrich OpenEdX.
Thank you all in advance for your answers.
Best,
Cyril
or in a functional style:
I've seen this test and tried to do them. Result added bellow.
First path:
Second path: