Duck Typing vs Readability
Duck typing is getting a lot of attention lately, probably due to the hype buzzing around about Ruby/Rails. My first encounter with duck typing was in Python, which is very similar to Ruby in this respect.
The first thing that scares you about duck typing is the lack of the compiler safety net. You can write nonsensical code and you have no idea until you run it. However, as you learn to take advantage of this newfound flexibility, you start seeing the static typing safety net as a straitjacket. Sure, it stops you from poking your eye out, but a lot of the time it just gets in your way. With disciplined testing, you can restore a large proportion of the safety net, and you feel comfortable again. It’s a classic tradeoff: safety/early error detection versus flexibility, and there’ll always be arguments for both sides [1].
The next thing that I grappled with was the lack of precise contracts between callers and callees. In a statically-typed language, the function prototype specifies the contract with a high level of precision. The caller knows the contract that each of the parameters must satisfy, and that the returned value(s) will fulfill. In a duck-typed language, the prototype is much less informative. Without type information in the prototype, the only way to know the contracts for sure is to analyse the body of the function. Clearly this is unacceptable, especially when dealing with a third-party API. The usual way to mitigate this loss of information in the prototype is to provide the details in the API documentation. This approach, however, suffers from two major problems:
- Verbosity: a static type (say a Java interface) is a concise way to specify type requirements, whereas a natural language description will tend to be less direct.
- Inaccuracy: there is no way to ensure the documented requirements are correct. In particular, as the code evolves there is a real danger the documentation will be left behind.
This readability problem is my biggest issue with duck typing in practice today. A commonly-suggested solution to the problem is some form of optional static type checking [2]. However, this route tends to lead us back to something like interfaces, which as I say are a pretty concise way to specify a contract. This gives away too many of the advantages of duck typing, in particular:
- Granularity: a duck-typed function places the fewest possible requirements on the passed parameters. Interfaces, on the other hand, may carry extra requirements: methods that are not required for the function in question. Although you can break interfaces down into atoms and combine them, the resulting number of interfaces would be overwhelming.
- Adaptability: related to the above, a duck-typed function can be adapted to contexts that the function author may never have considered with as little effort as possible on the part of the caller.
- Sheer convenience: there is no extra baggage required of calling code, you can just get on with the job.
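To make the granularity and adaptability points concrete, here is a minimal Python sketch. The names `count_words` and `CannedText` are invented for illustration and come from no real library. The function asks only for a `read()` method that returns a string, so it works with a standard file-like object and with an unrelated class alike.

```python
import io


def count_words(source):
    """Count whitespace-separated words in whatever source.read() returns."""
    # The only requirement placed on `source` is a read() method returning a string.
    return len(source.read().split())


class CannedText:
    """Not a file and not derived from one; it just happens to have read()."""

    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text


print(count_words(io.StringIO("duck typing in practice")))  # a real file-like object
print(count_words(CannedText("quack quack")))               # an unrelated class
```

Note that nothing in the signature of `count_words` tells a caller that `read()` is all that is needed, which is exactly the readability gap discussed here.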
So how do we get the convenience and power of duck typing without this readability problem? What we need is a concise way to communicate the requirements on function parameters, without requiring them to be manually specified. Is this really so hard? Imagine a tool that analysed the body of a function (and potentially the functions it calls) to see how the parameters were used. Such a tool could extract a lot of useful information, such as what methods are called on the parameter. On the surface, it is not even a difficult tool to write [3]. Having this information available as you write code would be a huge plus. On the caller side, you know a lot more about the contract you need to fulfill. On the callee side, you no longer need to maintain this “type” information in the function documentation.
The idea is simple enough that I’m sure it has been thought of before. I wonder then, does such a tool exist? If not, are there some killer implementation difficulties I have overlooked?
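As a rough illustration of the kind of analysis described above, the following sketch uses Python's standard `ast` module to list the attribute names used directly on each parameter of a function. The helper name `used_attributes` and the sample function `copy_stream` are made up for this example. It is a toy under stated assumptions, not a real tool: it looks at a single function only, and it does not follow aliases, calls into other functions, or string-based lookups such as `getattr`.

```python
import ast
import textwrap
from collections import defaultdict


def used_attributes(source):
    """Map each parameter of the first function in `source` to the attribute names used on it."""
    tree = ast.parse(textwrap.dedent(source))
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    params = {a.arg for a in func.args.args}
    found = defaultdict(set)
    for node in ast.walk(func):
        # Matches expressions of the form `param.name`, including `param.name(...)` calls.
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id in params):
            found[node.value.id].add(node.attr)
    return {name: sorted(attrs) for name, attrs in found.items()}


# Hypothetical function to analyse; it is parsed, never executed.
EXAMPLE = '''
def copy_stream(src, dst):
    for line in src.readlines():
        dst.write(line)
    dst.flush()
'''

print(used_attributes(EXAMPLE))
```

For `copy_stream`, the result says that `src` needs `readlines` and that `dst` needs `write` and `flush`, which is the inferred "contract" a caller would want to see. The hard parts (aliasing, transitive calls, reflection) are precisely what this sketch leaves out.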
—
[1] C++ templates, although not without their own problems, get close to a best of both worlds: flexible contracts that are statically checked.
[2] I suspect these suggestions often come from those who are more comfortable in a statically-typed world.
[3] Famous last words, I suspect.
—
This entry was posted on Wednesday, June 21st, 2006 at 3:09 am and is filed under Programming Languages, Technology.

June 21st, 2006 at 8:35 am
Hi,
I have the same concern with duck/dynamically typed languages. Sure, you can write tests to check the contracts. This will result in more test code, but you can argue that you still write less code than in a statically typed language. The problem is that people (at least in the open source community) just don’t write tests. I wanted to try the Python binding for a C library the other day, and guess what, it had _syntax_ errors. I mean version 2 of the binding lib. It referenced the wrong variable and method names (typos and copy&paste errors). It’s not that it wasn’t tested, it wasn’t ever RUN by the developer who released it on SourceForge. It’s even included in Debian :(.
OK, but my main concern is the same as yours. An explicitly typed program provides good documentation for itself. The developer has to write less, and it is more readable in a programming language than in English. Sometimes I play around with Python but it always takes a lot of time to figure out the contracts. You’ll see expressions like ‘file-like object’ in the standard library documentation. And then they explain in English that it will have open, close and read methods, but not write. (But then what arguments do they take?) And that’s the better alternative, because you will find documentation where they’ll only say that it’s file-like but won’t tell you in what way it does (or can) differ.
Computer languages are very good at formalizing rules. Contracts/interfaces are such rules. We need them for documentation; we need them for our own sake and not because of the compiler. My guess is that in the future we’ll see a mixture of the two worlds (not a very visionary guess, I know :) ). You’ll define interfaces that the passed-in object must adhere to, but the check will be done at runtime and it will be based on a fine-grained check and not simply looking at the class hierarchy. The class of the object won’t have to literally implement the interface; it will just have to be compatible with it. C++ is getting something similar in the form of concepts, though the check will be static of course.
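For readers on a current Python (3.8 or later), a runtime check of this structural kind can be sketched with `typing.Protocol` and `runtime_checkable` from the standard library. The names `Readable`, `Greeting` and `shout` are invented for the example. One caveat worth stating: the runtime `isinstance` check only confirms that the named methods are present; it does not verify their argument or return types.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Readable(Protocol):
    """Anything with a read() method; no inheritance required."""

    def read(self) -> str: ...


class Greeting:
    # Does not inherit from Readable; it is merely compatible with it.
    def read(self) -> str:
        return "hello"


def shout(source: Readable) -> str:
    # Checked at run time by looking for the method, not at the class hierarchy.
    if not isinstance(source, Readable):
        raise TypeError("source must provide read()")
    return source.read().upper()


print(shout(Greeting()))
```

The protocol doubles as documentation in the signature of `shout`, while `Greeting` never has to declare that it implements anything.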
June 21st, 2006 at 10:12 am
See this: http://en.wikipedia.org/wiki/Type_inference
Type inference works well in Haskell, speaking as someone who has a strong preference for static typing anyway.
I haven’t used Scala, but that’s a JVM compatible, Java-like language with type inference if you’d like to try it out.
June 21st, 2006 at 10:46 am
But how do you know what kind of object you have in your hands before passing it into the called function?
I think Scala’s type inference is relatively simple compared to the automated tackling of duck typing. Same for Haskell, I think. (And for the upcoming C# 3.0.) And simple type inference is nice, but grokking duck typing would be nicer.
Try searching on refactoring browsers for Smalltalk. I think I’ve seen ideas from comments along those lines.
I’ve heard of two general strategies:
1. Watch what happens at run time (during the actual program and/or during unit tests), and remember what calls what. Especially valuable when you start doing reflection with strings on the objects, and that’s common in duck land.
2. Statically analyze the code and what gets passed where and how it’s used. Like what you mention, but with the extra stuff that also needs to be done. I’ve heard it said that this is more popular/effective than option 1, but it doesn’t seem to cover the reflection case to me.
Either case seems complicated, though, and all the fancy stuff in Eclipse already seems to be a big deal (and memory/cpu intensive). If someone ever gets this into the hands of the masses, I might go back to being a duck typer.
June 21st, 2006 at 11:58 am
And to atleta, that C++ “concepts” thing sounds sort of like the ECMAScript 4 structural types (http://developer.mozilla.org/es4/proposals/structural_types_and_typing_of_initializers.html). I’m not completely sure whether I think I’ll like the idea or not.
June 21st, 2006 at 12:05 pm
atleta,
Yeah, I do look forward to seeing what they come up with in C++ re: concepts. However, the C++ standards committee moves so glacially slowly that we will be waiting some time. They are in serious danger of relegating C++ to a Cobol-like life. Never dead, but seriously out of favour for any new projects.
June 21st, 2006 at 12:06 pm
Tom,
I think we can get a lot of the way without full type inference. I am not asking for the tool to determine the type of the objects passed in and verify they meet the contract. All I really need is a way to know, as the caller, what contract I should fulfill. Let it be on my head if I fail to do so, let my tests catch it, but at least give me the information. This seems like a simple step that would get over the readability problem. Sure, there is no static enforcement, but I can deal with that.
Full type inference has interesting possibilities of its own, not least the refactoring aspect. I have not written a significant amount of code in a language that attempts serious inference: it’s something I should try out more.
June 21st, 2006 at 12:48 pm
Thanks for the clarification of your intent. I agree that could be useful even by itself.
June 21st, 2006 at 1:16 pm
Tom,
No problem, and thanks yourself for bringing type inference into the discussion. Considering what I am suggesting is a step in the type inference direction, there is probably a lot to be learned from how type-inferring compilers handle this part of the problem.
June 21st, 2006 at 4:07 pm
One item of confusion I noticed is that there are two Toms here. I’m Tom Palmer. Tom Davies is someone else. I was replying to both you and to him in my first (“Tom” only) comment.
June 21st, 2006 at 11:47 pm
The Perils of Duck Typing:
http://www.beust.com/weblog/archives/000269.html
June 22nd, 2006 at 12:54 am
Duck typing is flat out dangerous, and I think you can leave it at that. There is a benefit in terms of not dealing with “hassling” conversions, but I find that is less of an issue these days when LOC required to code is no longer directly a measure of potential productivity.
In my experience it is definitely more fun to type, but not worth the extra effort in terms of code maintenance as the project gets better. As a lazy programmer, I think IDEs should be doing my language testing and correction. I should be dealing only with the logical errors.
In practice, I find ironically that static typing allows me to think at a higher level and stay there. I know I’m not the only programmer who feels that way.
June 22nd, 2006 at 12:55 am
btw, nice ANTLR articles :)
June 22nd, 2006 at 3:05 am
Actually I agree with most of what you say, BUT there is an essential piece of information missing: WHEN? When should I type and when should I not? See my blog here:
http://yozzeff.blogspot.com/2006/03/static-typing-vs-dynamic-typing.html
June 22nd, 2006 at 9:06 pm
Duck typing is dangerous? What the hell is that supposed to mean? Duck typing is not for hardware driver implementations, yes, but do you know how they always say one shouldn’t run with scissors?
June 23rd, 2006 at 7:13 pm
I am surprised those who have mentioned Haskell have not mentioned typeclasses. In Haskell, variables have types. Those types may belong to one or more typeclasses. A typeclass is like a Java interface, but you can require more than one of them for a type, they can have default method implementations, and most importantly you can add them to existing types without having to touch the original type definition. This means that you can have a typeclass that is local to a module without bothering other modules. Some of the typeclasses from the standard library have language support, by being derivable by the compiler or having special syntax.
Built-in classes include:
Eq: provides == and /=
Ord: requires Eq, and also has comparison operators
Enum: for enumerable types
Num: requires Eq and Show (in Haskell 98), provides the most fundamental numeric support
Show: has the show method, which is very like Java’s toString()
Read: the inverse of Show, with a read method to read the type from a string. Note how dispatch to the correct implementation of read is based solely on the required *return* type, not the data in the string, which is something you cannot do in Java or Python
Monad: difficult to explain; it’s related to the ability to hide repetitive plumbing code, and allows use of the “do”-notation for syntactic sugar
See the Haskell report for more on this.
July 6th, 2006 at 5:25 am
A couple of observations:
I once worked on a Smalltalk GUI project where I was responsible for some C++ work that implemented a compatible interface for things (network and DBMS) that needed to be called by Smalltalk but for which the C-callable libraries needed a nicer wrapper. In three months the (experienced) Smalltalk developers delivered code for a user application that I would have estimated would take 9 months to develop in C++ or Java. I then switched to testing. After 6 months we were still finding bugs, most relating to the extensive use of duck typing that the developers had engaged in.
While C++ templates might seem the answer, the current state of the art leaves much to be desired. When they work they work well. The trouble comes in development, when you unwittingly pass in the wrong class and get an error message 100 lines long. Someday it will be possible for the compiler to tell you “Template Foo called from Template Bar at line 999 of the Fraz routine requires that class Baz implement a function iterator()”. Figuring this out from the error messages C++ compilers give today is almost impossible. At least Python gives you a stack crawl that gives you the information at a glance, even if it is already runtime.
September 7th, 2006 at 12:49 pm
I have a wish for a sort of duck-typing as described by the OP. I want to be able to specify a “type-scope” for variables, including parameters. The scope is specified for a name and the types that name can be treated as.
Assuming types have methods (and maybe properties, events, etc.) the scope would say, “For this type, only allow that which is common to all the types specified, or if in a type-scope (recursive case), further restrict what can be done.” In other words, take the intersection of methods (or whatever) for all the types specified, and that set is what is strongly-typed checked. Or in plainer terms, if it could work, let it work. (Casting can be done, and when that cast fails, throw a run-time exception.)
It would look something like this (I am using a C-like syntax):
a as String, List, b as float, int, Imaginary, Matrix
{
    a.Size;       // OK
    a.Reverse();  // OK
    a.ToUpper();  // Compiler error: List does not implement ToUpper.
    b.Absolute(); // OK
    b.Invert();   // Compiler error: float, int, Imaginary do not implement Invert.
    a as String
    {
        a.ToUpper(); // Now this is OK, as a is only a String
    }
    if a is String
    {
        a.ToUpper(); // This is OK too, as a is only a String
    }
    Method2(a); // This will be OK if Method2 takes one parameter of type
                // String or List
}
In effect, a limited type-inference is being done. Not only is the type considered, but so is the method or property or whatever is being invoked on/with the object. One can imagine the declaration being a list of methods (including the types) that determine if a type will pass static-type checking.
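The "intersection of methods" idea can be approximated at run time in Python, purely as an illustration of the set arithmetic involved; it is not the static check proposed above. The helper name `common_interface` is invented for this sketch, and it compares attribute names only, ignoring signatures. Python's `str` and `list` stand in for the String and List of the example, so the shared names differ from the ones shown there.

```python
def common_interface(*types):
    """Return the public attribute names shared by every type in `types`."""
    names = [
        {name for name in dir(t) if not name.startswith("_")}
        for t in types
    ]
    return set.intersection(*names)


shared = common_interface(str, list)
print(sorted(shared))       # names usable on a value that may be a str or a list
print("upper" in shared)    # upper() exists on str only, so it is excluded
```

A checker built on this idea would accept a call only when its name appears in the shared set, and would widen the set again inside a narrower scope such as the `a as String` block.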
I think I would prefer the idea of the OP, which is the same idea that I have wanted for a while, but this seems like a “more type-safe” compromise.
While I am at it, how about declarations like this: *.ToString(), *.DoesThis(), *.HasThis, etc. This would save having to implement interfaces for the sake of being able to do this.
As for my background in relation to dynamic typed languages:
I have used Python (after much C/C++ coding), and despite my fears about no static typing, I found I was able to code productively, correctly and quickly while feeling that the language was not getting in my way nearly as much as C/C++ did.
I was surprised that my mindset seemed to change, in that I quickly adapted to the lack of strong typing and somehow became more careful, such that I did not have anywhere near the number of type errors that I was predicting.
One rule that I have heard is “strong testing not strong typing”, which I think in the end must be true, but I think strong typing still is helpful for its ability to help me to try to do what I want to do.
I am now using C# and really like that language. However, I miss multiple-inheritance and especially the dynamic typing. I don’t miss as much things like changing the signature of a type. Oh, and indenting for blocks, I really liked that too.