Interview with Daniel Bachler
There are rumors about Elm 1.0 coming soon, so today we decided to interview Daniel Bachler, an engineer with commercial experience in Elm and many other languages. Daniel is a true polymath with deep engineering expertise, so we recommend checking out his site and following him on Twitter if you're interested in Elm, F#, functional programming, or computer science in general.
Can you introduce yourself to our audience?
Sure! I'm 36 years old, I live in Berlin, and I work as a software engineer for Douglas Connect in Switzerland. Together with my wife I also run a photography studio, and during the summer we shoot a lot of weddings all over Europe and sometimes beyond.
At Douglas Connect we develop solutions in the field of toxicology with the vision of reducing animal testing and replacing it with carefully guided in-vitro tests and computer models. A lot of my day-to-day work is about improving data management so we can even start building machine learning models in the first place. We use a few different languages and technologies, chief among them F#, Elm and Python.
Could you walk us through your professional milestones to give us some context?
I got into programming at 14, when I switched to a specialized school for IT that we have in Austria. We did not have a lot of theory but we were taught quite a few languages on a practical level: first C, then C++ on MS-DOS, then Visual C++ on Windows, as well as Cobol, PL/1, IBM390 Assembler, Prolog and maybe a few more. These were mostly old languages even then, but it was good to get exposed to different concepts. After school I worked with C# (which had just come out) for a long time. Together with two colleagues I built a pretty sophisticated data analysis app from scratch, and that exposed all of us to a lot of different areas: from making comfortable UX to writing DSL parsers and high performance multithreaded optimization code.
Towards the end of that period I got interested in looking for other approaches outside the object-oriented C# world that I knew well. I wanted to learn Haskell since a friend had told me many fascinating and frankly also somewhat scary things about it. I tried it but found it hard to grasp (this was 2013 or so and learning resources were more basic than they are today). Around that time I also stumbled across F# for Fun and Profit, and Scott Wlaschin's explanations of functional concepts in F# made a lot of things click for me that were then also very helpful in Haskell.
ML-like languages are on the rise today: F#, Elm, ReasonML... – how would you compare them?
I think they are all interesting languages with a lot of important similarities and shared features. Maybe the most important for me personally are algebraic datatypes – something that is sorely lacking in a lot of more mainstream languages. For example, if you think about how to model an operation status with a progress while running, a floating point result if the operation succeeds and a string error message if it fails, a language with sum types can express this so beautifully (like here in Elm syntax):
type OperationStatus = Running Int Int | Success Float | Error String
All of the ML family languages I know also have pattern matching and exhaustiveness checking, and together these are a really powerful tool for writing code that delivers on the promise of "making invalid states unrepresentable" (there are a few great talks on this phrase, by the way).
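As a minimal sketch of how pattern matching and exhaustiveness checking work together on the OperationStatus type, consider the Elm module below. The module name and the describe function are illustrative assumptions and do not come from the interview.

```elm
module OperationStatus exposing (OperationStatus(..), describe)

-- The sum type from the interview: exactly one of three shapes at a time.


type OperationStatus
    = Running Int Int
    | Success Float
    | Error String



-- Hypothetical helper: turns a status into a human-readable message.
-- The case expression must cover every constructor of OperationStatus.


describe : OperationStatus -> String
describe status =
    case status of
        Running done total ->
            "Running: " ++ String.fromInt done ++ " of " ++ String.fromInt total

        Success result ->
            "Succeeded with " ++ String.fromFloat result

        Error message ->
            "Failed: " ++ message
```

If a fourth constructor were added to OperationStatus, the Elm compiler would reject describe until the new case is handled, which is the exhaustiveness check at work.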
On the other hand, if you think about how something like this is represented in languages that don't support sum types, then you end up with one of two solutions. Either you create a record/class that can store all these pieces of information at the same time but only some of them make sense at any time (e.g. the error message and progress values are unused if the status enum indicates success). Or you resort to creating a class hierarchy for this, maybe with some abstract base class, and then things get messy as you end up pulling all kinds of concerns into those classes or end up testing on the concrete type, which kind of defeats the purpose of the class hierarchy and will easily break if you add new cases to your OperationStatus type hierarchy.
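To make the first of these two workarounds concrete, here is a rough sketch, written in Elm for comparison, of a record that stores every piece of information at once. All names here (StatusTag, OperationStatusRecord, contradictory) are hypothetical and only serve the illustration.

```elm
module StatusRecord exposing (OperationStatusRecord, StatusTag(..), contradictory)

-- A plain enum-like tag that says which fields are supposed to be meaningful.


type StatusTag
    = IsRunning
    | IsSuccess
    | IsError



-- Every field is always present, whether or not it applies to the tag.


type alias OperationStatusRecord =
    { tag : StatusTag
    , done : Int
    , total : Int
    , result : Float
    , errorMessage : String
    }



-- This value type-checks even though it claims success while also
-- carrying progress numbers and an error message.


contradictory : OperationStatusRecord
contradictory =
    { tag = IsSuccess
    , done = 3
    , total = 10
    , result = 0
    , errorMessage = "timeout"
    }
```

The compiler accepts the contradictory value, so every consumer has to remember which fields to ignore for which tag. With the sum type, such a value cannot be constructed at all.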
As for the other similarities, there is of course the similar syntax, and they have static typing, type inference, first class functions etc., but you can find some of these features in many languages nowadays. If we look at the differences, two important axes of differentiation come to mind.
One big difference between F# and OCaml on one side and Elm, Haskell, and Purescript on the other is that the former allow mutation and implicit side effects while the latter ones do not. I like the more rigid framework of pure languages and the fact that in Haskell and Purescript the type system allows you to discern between side-effectful and pure code. But there are also situations in which mutation and more imperative constructs can sometimes be more intuitive and easier to tune for performance than e.g. writing intricate folds.
Another important dimension for comparison is the complexity of their type systems. Here F# and Elm have simpler type systems, while Haskell and Purescript have features like type classes and higher kinded types, which allow them to express higher level abstractions in the type system, like the Functor typeclass that defines a generic map. In Elm you have to redefine map for every type you want it for and can't express "the set of types with a map implementation". In F# you might be able to do some weird things with generics (what parametric polymorphism is called in the .NET world) and type constraints, but nobody bothers.
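The point about Elm redefining map for every type can be sketched as follows. List.map and Maybe.map are part of Elm's core library; the Tree type, the mapTree function and the module name are assumptions made up for this example.

```elm
module MapExamples exposing (Tree(..), doubledList, doubledMaybe, doubledTree, mapTree)

-- A hypothetical container type that needs its own map function.


type Tree a
    = Leaf
    | Node (Tree a) a (Tree a)



-- Elm has no Functor type class, so this map is written by hand
-- and has to be called by its own name.


mapTree : (a -> b) -> Tree a -> Tree b
mapTree f tree =
    case tree of
        Leaf ->
            Leaf

        Node left value right ->
            Node (mapTree f left) (f value) (mapTree f right)



-- Three separate functions, one per container type.


doubledList : List Int
doubledList =
    List.map (\n -> n * 2) [ 1, 2, 3 ]


doubledMaybe : Maybe Int
doubledMaybe =
    Maybe.map (\n -> n * 2) (Just 21)


doubledTree : Tree Int
doubledTree =
    mapTree (\n -> n * 2) (Node Leaf 1 Leaf)
```

Each call site has to name the concrete map for its container. In Haskell or Purescript a single Functor instance per type would let one generic map cover all three cases.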
Haskell is an interesting and unique case in that it's the only language of the bunch that is lazy by default. This is a very interesting feature and it allows a uniform type signature for lazy and eager values. The downside is that it's sometimes hard for non-experts to reason about the time and space complexity of code.
There is also a very interesting social distinction. I think Elm especially tried very hard to be beginner friendly and rejected some abstractions to stay easier to learn. This attracts a certain kind of people, which leads to a self-fueling circle. Purescript, on the other hand, very much went into the "all the cool type features from Haskell, but without the legacy stuff" kind of direction and it, in turn, has attracted people that like thinking about and exploring higher abstractions. I think both are very valid, and I think it is great that we have such choices now so everyone can find a community they feel comfortable with (which of course might change over time).
PureScript is getting closer to 1.0 as well. What's your opinion about this language/ecosystem?
One very cool thing about Purescript is how far you can get the type system to help you, especially when compared with a significantly simpler language like Elm. This starts with simple things like Purescript allowing you to define your own instances for Ord so that you can use arbitrary data types as keys in dictionaries. Another example is that Purescript has newtypes, which are a lightweight zero-cost abstraction around simple data types that adds a lot of type safety. Purescript also comes with great metaprogramming capabilities in the form of generics (not the parametric polymorphism thing that Java/C# call generics!) and the generic deriving mechanism. This allows a lot of boring, repetitive work to be automated (compare e.g. the manual, rather tedious writing of JSON encoders/decoders in Elm with Purescript's Simple.JSON module). The latest version of Purescript also added more type level programming, so you can now implement compile time type-safe SQL string interpolation and things like that.
What I found very interesting was the change of the Eff type in the latest version of Purescript. Before 0.12, Purescript's Eff type was using the row type feature. I thought that this was very nice because just by looking at a function you could see which kinds of side effects it would have (e.g. a function could define an Eff type with labels HTTP but nothing else). But it seems that it did not turn out that well in the end: these were just (type level) string labels, and different libraries could accidentally (or intentionally) reuse the same labels, which could lead to problems. So they switched to a simpler system with 0.12 that is closer to Haskell's IO monad (so you just know whether a function is pure or effectful but no more). If I understand it correctly, you can still implement the row based system yourself on top of the new Effect but, by default, the row types are no longer used to track effects. This is an interesting case, I think, of a mental tool that sounds great in theory but was apparently not validated in practice.
The flip side of all this is that it looks to me like the Purescript ecosystem is a bit more fragmented. In Elm, since there is only one way to write applications and the type system only goes so far, most libraries are immediately compatible with each other and the learning curve for any arbitrary library is very low, even for people new to Elm. In Purescript, if you come into contact with profunctor optics/lenses for the first time, it can be very rewarding but also quite time consuming.
I hope that Purescript will continue to prosper and hope to use it more in the future.
Elm went through a lot of breaking changes in recent years. How do you see those changes now, in retrospect (the abandonment of the FRP approach in particular)?
The renouncement of FRP was an interesting case. I think it made sense since it simplified the language somewhat and it looked like the Elm Architecture was the way forward for most normal web applications anyhow. At the same time it's a bit sad because it was a very interesting approach to build Signals so deeply into the language and have a "fold from the past" as one of the core library functions. I think we haven't really found the sweet spot yet in how to deal with change over time in code and the reactive and FRP ideas warrant some more exploration, especially when combined with compile time type checking of signal graphs and/or better tooling support like we see with the marble diagrams in RxJS.
As much as I like Elm as a language and ecosystem, I do have some problems with how the language is steered and this was also part of the reason why we are moving away from Elm at Douglas Connect. My issue is less with breaking changes and more with the way improvements are handled. For example, one core selling point of Elm is that there are no runtime errors, yet there are several documented crashing bugs that have lingered for a long time and just not been fixed, often for no understandable reason.
What about Elm's progress on the backend? Are there any frameworks, SSR tools or plans you're aware of?
I stopped following the dev channels some time ago, so I can't really provide much insight here. I think server side rendering is actively being worked on but, unless a new way to do FFI in Elm comes along, I doubt that we'll see it getting much use on the backend. If you want to use an ML language there, I think Haskell, Purescript, OCaml and F# are all established choices that make a lot of sense.
You organized a number of Elm meetups and clearly have a lot of inside experience. How would you describe the Elm community? What about people coming to Elm – are most of them JS programmers or, say, Haskellers?
I think Elm has a very nice community that is very welcoming to newcomers. For a lot of newcomers, Elm is the first ML family language. There are some experienced people with extensive Haskell knowledge, but I have a feeling that quite a few of them moved on to Purescript. Maybe because once you have tasted the sweetness of typeclasses and generics you just can't go back ;)
You have experience with scientific programming. Most people would immediately think of R and Python... Which other, maybe newer, languages do you think have potential in this area?
Yes, this is a constant source of discomfort for me. R is really an awkward language, but so many libraries have been built on top of it, especially in bioinformatics, that you can't just start rewriting everything in your language of choice. What we often end up doing is having small web services written in R or Python that do one specific thing, and then writing the coordination and UI stuff in other languages (Elm and F# mostly). This way you can have the support of a good type system where you need it (in the complexity of coordinating state and complex UIs) but still use the vast scientific libraries of R and Python.
As for different languages, F# is in an interesting position here because it has some support for interfacing with Python and R, and it is available as a Jupyter kernel (interactive code notebooks) and as a first class language on, for example, the Azure notebooks. So we might explore that a bit more, maybe even writing the web service wrappers in F# and then just calling out to specific R or Python functions from F#.
Julia is another interesting language that is built as a language for scientific computing in a modern, highly parallelized world but I haven't really had a chance to dive into that.
Finally for machine learning it looks like the immensely popular TensorFlow framework for deep neural networks might adopt Swift of all languages as its statically typed alternative to Python. This could be very interesting because it would let you do static code analysis to check if e.g. your tensor sizes all match up, instead of encountering these errors at runtime.
Can you share some of your ambitious plans with us? Like "Master Category Theory" or "Write my own language" :)
In the near future I mostly want to improve scientific data management as part of my work at Douglas Connect and give back to the F# community by helping with documentation and work on tooling. With my good friend Fredrik I would like to explore Haskell and Purescript so that I would hopefully get more fluent in both. As I said above, I am also thinking about a language agnostic statically typed FP meetup or something like that to spread the love for these languages more broadly :) Further out who knows – maybe I'll get involved in the P2P technologies that are emerging as alternatives to centralized social networks but combine that with an ML family language or try to spread its use in scientific computing.
What advice would you give to people starting out in programming right now? Which goals should they pursue, and which pitfalls should they avoid?
I think Elm is a very good language to start with, as it allows you to write web applications and you learn a lot of important concepts in a very nice and clear way without too many distractions. After that I think Python is a very useful language to know, and it broadens the areas you can work in immensely: from data science to devops to web backend development. Hopefully by then you will have learned the beauty of immutability and treat mutation and side effects with the respect and care they deserve :) After that it really depends on what you want to do. I think learning Haskell is very rewarding and extends your mental horizon (or Purescript if you want to do a lot of frontend or Node stuff). Erlang/Elixir have a lot of cool ideas, even though I miss the safety of a static type system.
For software engineers who have already learned one or more languages, I think the ML family of languages is a very worthwhile target of study. If I look back at my own history of code, I can only say that I wish I had learned an ML family language directly after my first OO language (or even before). The way we were taught OO really ended up making my code much more coupled and brittle than it had to be, and for a long time. When thinking about a data model, the first question I asked was "Can this be described using an IS-A relationship?". E.g. in the status example above, I would have created an IOperationStatus and then created 3 implementations for it because "Running is an IOperationStatus". But what do they really have in common? For the purpose of expressing the functionality in code, does it help you at all that the Running, Success and Error classes all inherit the same interface?
I think mainstream OO languages are needlessly error-prone (think NullReference exception and everything being mutable by default) and have relatively awkward dimensions of abstraction: Factories, Singletons and Dependency Injection are all basically non-issues in ML family languages. Still, I think it is worthwhile learning some of these languages, if for no other reason than to be more compatible with existing jobs. (For me personally the language I can or cannot use makes quite a difference in how happy I am in a job, though I think that there are more important criteria to select jobs by.)
But it is very useful to learn ML family languages and, given the chance, we should push for languages that allow us to write better code. I think we also need more people who are curious and study what is out there that allows us to improve our discipline, but who also try to distribute their knowledge in a non-condescending way. So if you are starting out today, stay curious and be sceptical of buzzwords and "best practices" and try to search for ways to make software development better – because I think we are only getting started with software and we will need to tackle more and more complexity in the future.
Thank you, Daniel! That was a lot of interesting and valuable information to think about. We wish you the best in all your professional and personal aspirations, and hope to see you here again.