JonCoens's comments

I should have emphasized more that speed of deployment is a first-order concern. We certainly can (and do) build our code for every change, but not at the speed at which we want to be updating.

We use a monorepo for all of the benefits it has, and deploying fast business logic updates this way helps mitigate one of its downsides (particularly when you've maximally parallelized the build). I've found https://danluu.com/monorepo/ to give a quick overview of how chopping up the repo would have separate downsides.

The section about "Sticky Shared Objects" speaks directly to mutable state across code modifications, just with a Haskell-minded focus.


How much is this because of Haskell's build times in particular? Is there a sort of "target build time" that would make you more comfortable with this stuff?


I don't think coming across these problems in general is Haskell-specific. We've grown enough to bubble these issues up in this Haskell project, but would have needed to do something much sooner if this were C++.

> make you more comfortable with this stuff

Which stuff are you referring to? Overall I'd love it if all builds were significantly faster, so we contribute to upstream GHC to make it better in the ways we come across. Our platform has a deployment SLA that we strive to maintain as our "target build time".


I'm assuming you're asking what's important in writing "production" Haskell rather than "toy example" Haskell.

Ixiaus's point about mechanics more than theory certainly rings true, though we did think a lot about whether to use GADTs for the Dimension type. Overall I see this as similar to writing "production" code in other languages: going through a couple of feedback loops using real use cases, profiling to find the bottlenecks, observing how APIs are used in practice compared to intent, and bringing the service to a stable equilibrium.


Build times for this library haven't cropped up as a first-order concern. Using GHCi and `stack test` for the dev workflow has been fast enough (though it could always be better).


One of the cases we found while performance profiling was a tradeoff between memory usage and computation completion. On certain requests FXL would use too much memory and be halted by the equivalent of AllocationLimits, while Haxl would happily plow on using less memory and complete the request. When looking at many of those requests in aggregate, the end result would have more successfully completed requests but with longer response times mixed in. Completing more requests was seen as a win over the apparent decrease in throughput.


I'm an engineer on the Haxl project and am really excited to launch this today. Ask me anything!


Are you employing the ideas behind reactive programming? And can you explain the types of monads you used for what problem and why? I am writing a paper on Functional Reactive Programming and Haxl really made me curious. The paper (currently in German, but I'll translate it) proposes a new hypothesis that tries to shred FRP in general, by showing a novel way that automatically solves some of the problems that naturally occur with FRP.

I am really interested in seeing how you solve problems for distributed systems with Haxl and how query sharding is handled, etc.

I wasted a whole day looking for Haxl online a few weeks ago, just to find out that it wasn't released yet. The release really makes me happy :)


This is in a similar space to reactive programming, but isn't reactive at its core. The best explanation of the monad we developed is in the paper here: http://community.haskell.org/~simonmar/papers/haxl-icfp14.pd...

Query sharding is at the data source layer, which Haxl doesn't delve into. It's up to each data source integration with Haxl to do the appropriate routing/etc.

Hope you find it useful!


How large has the Haskell team at Facebook gotten (unless there isn't an official group and it's on a project-by-project basis)?


Haxl is the only team using Haskell in prod, so the team itself isn't all that large. With the traction we gain, though, we could grow.


Are Bryan O'Sullivan and the team from his Haskell-based startup that Facebook acquired in 2011 still there? I sat in on a class of his a while back and remember him ruefully laughing about having to use PHP now.


Bryan is still here. He's actually kicking off a Haskell class within Facebook at the moment - https://twitter.com/bos31337/status/475335956556705792


Great! He's obviously a great advocate for Haskell at Facebook.


I'm confused. Haxl is an open source library, right? So what are you using Haxl for at Facebook?


The blog post - https://code.facebook.com/posts/302060973291128/open-sourcin... - describes the initial usage - to assist the Sigma system in answering questions like "Is this content spam?".

Some background on that system, including the FXL component that Haxl replaces, is here: http://research.microsoft.com/en-us/projects/ldg/a10-stein.p...


I'm curious how this is executed.

Is it like a query engine, where you work with the entire query up-front, apply transforms and build a query plan?

Or is it more like an event loop, where you run as far as you can until the code blocks on IO, batch up and send all the pending IO requests, and run further when the tasks you're blocked on resolve?


Part of the beauty is that the actual way IO (note: in this version, IO here means 'reads from the network', almost always) is scheduled is abstracted away such that we could go with either approach w/o impacting client code.

That said, the way it currently works is more like the first. You can think of the entire Haxl run (program) as an AST that is given to the execution. It expands as much of the AST as possible (anything that's not IO), and anywhere it needs IO it enqueues those requests to be scheduled. Once it's explored as much as possible, it aggressively schedules the IO (deduping, batching, and overlapping the calls). Once it all comes back, it unblocks the AST where it can, and repeats the process.

This isn't necessarily the optimal scheduling (as you point out, unblocking each part of the tree as each result comes in might be better). It was specifically designed to make it easy to play with this kind of stuff later. Since the concurrency is entirely implicit, the implementation is entirely abstracted away.
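
To make the round-based scheduling described above concrete, here is a minimal sketch in Python (chosen only for brevity). It is not how Haxl is implemented. It assumes computations are generators that yield the list of keys they need and get the fetched values sent back, and every name in it (`run_rounds`, `batch_fetch`, `common_friends`, `friends_of_friends`, the `FRIENDS` table) is hypothetical. Each round runs every computation until it blocks, dedupes the outstanding keys, fetches them in one batch, and then unblocks whatever it can.

```python
# Toy model of round-based execution: explore until blocked, batch the
# pending fetches (deduplicated), resume, repeat. Illustrative only.

def run_rounds(computations, batch_fetch):
    cache = {}                            # key -> fetched value
    results = [None] * len(computations)
    blocked = {}                          # index -> (generator, keys awaited)
    rounds = 0

    def advance(i, gen, values):
        # Resume one computation until it finishes or needs uncached data.
        try:
            keys = gen.send(values)
            while all(k in cache for k in keys):
                keys = gen.send([cache[k] for k in keys])
            blocked[i] = (gen, keys)
        except StopIteration as done:
            results[i] = done.value

    # Sending None starts a fresh generator.
    for i, gen in enumerate(computations):
        advance(i, gen, None)

    while blocked:
        # Collect what every blocked computation wants, dropping duplicates
        # and anything already cached, then fetch it all in one batch.
        wanted = list(dict.fromkeys(
            k for _, keys in blocked.values() for k in keys if k not in cache))
        cache.update(batch_fetch(wanted))
        rounds += 1
        waiting = list(blocked.items())
        blocked.clear()
        for i, (gen, keys) in waiting:
            advance(i, gen, [cache[k] for k in keys])

    return results, rounds


# Hypothetical data source: a friends table with a batch lookup.
FRIENDS = {"alice": ["bob", "carol"], "bob": ["alice", "carol"], "carol": ["alice"]}
fetch_log = []

def batch_fetch(keys):
    fetch_log.append(list(keys))
    return {k: FRIENDS[k] for k in keys}

def common_friends(a, b):
    fa, fb = yield [a, b]                 # two independent fetches at once
    return sorted(set(fa) & set(fb))

def friends_of_friends(user):
    (friends,) = yield [user]
    lists = yield friends                 # depends on the first fetch
    return sorted({f for fl in lists for f in fl} - {user})

results, rounds = run_rounds(
    [common_friends("alice", "bob"), friends_of_friends("alice")], batch_fetch)
print(results, rounds, fetch_log)
```

In this toy run both computations ask for "alice", but it is fetched only once, and the second round only asks for the one key that is not already cached. Resuming each computation as soon as its own result arrives, rather than waiting for the whole batch, would be a change inside `run_rounds` only; the client generators would not need to change.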


Have a look at the SQLTap service written by the guys from DaWanda.com (https://github.com/paulasmuth/sqltap). It does basically exactly that for SQL queries but is implemented as a standalone Java/Scala SQL proxy server.


Are there any resources about why Facebook uses Haskell? What's your experience?


Our blog post goes into some of this: https://code.facebook.com/posts/302060973291128/open-sourcin...

Interpreted code was no longer cutting it for perf reasons, and any time you create your own language you end up reinventing the entire tool chain (debuggers, profilers, etc.). Haskell provides so much functionality in the language itself and has mature solutions to the other issues plaguing us in FXL, so it was a natural choice.


lbrandy also gives a good explanation here: https://qht.co/item?id=7874537

