Erlang/OTP News feeds

Welcome to Planet Trapexit. On this page, you will find the latest entries from all the news feeds currently in our database. If you know of feeds that would be suitable here (that is, feeds that relate to Erlang/OTP in any way), please add them here.

June 04, 2011

REALITY.SYS

Bullfighting, done differently.

Like so many other idiotic rituals, so-called “bullfighting” (better: slow bull slaughter) is a “tradition”. What is most remarkable about it, though, is the fact that supposedly enlightened Westerners apparently have no problem attending this morbid event. People (almost) like you and me sit obediently in the crowd, often flanked by their underage children, and watch a bull being slowly tortured to death while several circus clowns, so-called “toreros”, prance around it in their glittering butcher costumes.

I could now list several million reasons why so-called “bullfighting” is cruelty to animals. But that would be neither as “economical” nor nearly as effective as simply pointing to a few videos in which the animal wins.

Videos like these show the only true bullfight!

And I have to admit that I find it very satisfying to see the animal win. To every spectator, every “torero”, and above all the idiots who run after the bulls, I can only wish an acquaintance with the bull's horn. Sometimes reason simply cannot be conveyed any other way. For some of us, 200 years of Enlightenment are evidently not enough.

Click here to view the embedded video.

by admin at June 04, 2011 07:27 PM

Always this hand-washing…

In view of the “EHEC” epidemic, which has been drummed up much like SARS and swine fever, one has to seriously ask whether we are still stuck in the Middle Ages when it comes to hygiene.

Hello, people! Washing your hands properly is something you should normally have learned as a small child!

And the emphasis on washing your hands after using the toilet is more than ridiculous. It is frightening and sad, and it only testifies to our sterile way of life in this supposedly advanced world. Yes, a sterile way of life, as a label for our being disconnected from the most elementary human rituals, such as washing one's own body properly, handling other people's things properly (for example, by not pawing at goods on display without good reason!), and so on.

How embarrassing must it be for us that we have to be asked through the media to wash our hands properly and to wash vegetables and meat before eating them.

But no, the easiest thing is to blame it all on those stupid southern Europeans. Yes, exactly, it was the Spaniards, those nasty fellows. They got us into all this, and lo and behold: now we have to wash our hands!

by admin at June 04, 2011 08:23 AM

June 03, 2011

ejabberd@jabber.ru

ejabberd 2.1.8 - Fix PubSub

The ejabberd 2.1.7 release published yesterday contains a bug that breaks PubSub.

If you use ejabberd 2.1.7 and PubSub, you can find the patch and the fixed mod_pubsub.beam on the EJAB-1457 page.

Updated binary installers will be available next Monday at http://www.process-one.net/en/ejabberd/downloads

by badlop at June 03, 2011 11:39 AM

June 02, 2011

Process-one Blogs

“Google Social”: A paradox of its own

Yesterday was a sad day for the Internet, as an open and decentralized content repository.

At the same moment, Twitter launched its follow button and Google extended its +1 button to websites. Both are attempting to catch up with Facebook's success with the "like button".

Of course, it feels ridiculous for content producers and consumers to be expected to place more buttons on their pages in the hope of attracting more traffic. The escalation in the battle to embed buttons in web pages and to win users' clicks is leading nowhere.

I would rather focus, however, on Google's move, the paradox, and possibly the trap it could become for the company. I have read numerous times that Google's initiatives in the "social" area have always been failures. Even Eric Schmidt admitted at the D9 conference that he had messed up on the Google Social initiative.

In my opinion, the main reason for the failure is that social, as defined by today's market, is the complete opposite of Google's model and view of the online world.

"Like buttons" are replacements for the HTML links that authors used to place on their webpages to talk about topics they liked and point to relevant content. But whereas HTML links were open and available to search engine web spiders, the end result of the "like button" action is closed and proprietary. Facebook and Twitter are among the biggest threats to Google: they are cutting off Google's supply of its raw material, the data to analyze, both content and relations (links).

Google's reaction yesterday was to launch its own initiative, its own "like button", to gather its own data. However, in doing so Google enters the trap of the closed Internet, in which it cannot prosper.

Google's model has been to compete on algorithms, not data. They have beaten all other search engines using the same data set that anyone could use. They hire smart scientists to produce clever mathematical solutions to the world's data analysis problem. Algorithm competition is their reason to exist: organizing the world's data, not owning it.

Launching a "like button" is admitting that they were wrong in their core vision. Can the world be organized automatically by algorithm, from news to search? It is an impossible equation to solve for them without deeper changes in Google's vision. In other words, the social graph that Facebook wants to build is actually what the web is about, and should be at the heart of the open web. Copying that closed vision of the web is a mistake.

What is happening today looks like Yahoo!'s revenge (even if Yahoo! the company is now out of the game): curation is the return of directories of content, with a larger number of curators and a ranking that depends on the proximity of the curators to yourself.

Google simply cannot win with this definition of the "social web". If social is about building competing subsets of the Internet, defined by their user base, Google will decline in search, its core business. It cannot remain the entry point to the Internet for a large number of users, because it would not be able to take them to every place they might want to go on the web.

The solution for Google is to change its view of the social web, not to build its own clone of a social network. Google social should be about putting people at the heart of an open Internet, where companies keep competing on algorithms. They already have valuable tools to rely on: Google Reader can publicly share what you like to read, and Blogger can publish your thoughts. They can build "Google People" around them.

I think that Google should keep producing tools that make it easy to share publicly whatever people would like to share, from statuses to pictures, thoughts and links. This should be published in a well-defined and documented way that will lead to a more open web. To be frank, such standards are emerging in many initiatives (webfinger, salmon, foaf, ostatus, opensocial and many others). Google promotes or even designs several of those web protocols.

However, Google +1, until it leads to publicly available data on user profiles, is a move in the wrong direction for Google. Let's hope for the Internet's sake that their next move will open the data and bring the competition back to algorithms.

Credits: Photo by Blaise Alleyne

by Mickaël Rémond at June 02, 2011 12:59 PM

June 01, 2011

ejabberd@jabber.ru

ejabberd 2.1.7, 3.0.0-alpha-3 and exmpp 0.9.7 -- security release

ejabberd 2.1.7, ejabberd 3.0.0-alpha-3, and exmpp 0.9.7 have been released after a few months of development. They contain a lot of bugfixes, improvements and some new features.

If you have ejabberd running on a public server, please update it immediately: these releases contain a security fix that disables entity expansion completely to prevent the billion laughs DoS attack (CVE-2011-1753).
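To see why disabling entity expansion matters, here is a rough sketch of the arithmetic behind a billion laughs payload. The entity names, the "lol" string, and the nesting depth and fan-out are the commonly cited illustrative values and are assumptions here; they are not taken from the ejabberd or exmpp fix. The code only builds the document and computes the expanded size, it never parses the payload.

```python
def expanded_size(levels: int, fanout: int, base: str = "lol") -> int:
    """Characters produced when the top-level entity is fully expanded.

    Level 0 is the literal string `base`; each higher level is an entity
    that references the level below it `fanout` times.
    """
    return len(base) * fanout ** levels


def build_payload(levels: int, fanout: int, base: str = "lol") -> str:
    """Build the small XML document that would trigger the expansion."""
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE lolz [",
        f'  <!ENTITY lol0 "{base}">',
    ]
    for level in range(1, levels + 1):
        # Each entity is just `fanout` references to the previous one.
        refs = f"&lol{level - 1};" * fanout
        lines.append(f'  <!ENTITY lol{level} "{refs}">')
    lines += ["]>", f"<lolz>&lol{levels};</lolz>"]
    return "\n".join(lines)


if __name__ == "__main__":
    payload = build_payload(levels=9, fanout=10)
    print(len(payload), "characters on the wire")
    print(expanded_size(levels=9, fanout=10), "characters after expansion")
```

With nine levels and a fan-out of ten, a document of under a kilobyte asks the parser for three billion characters, which is why refusing to expand entities at all is a simple and robust defence.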

ejabberd 2.1.7

This release contains many bugfixes, improvements and a few new features.

A short list of changes:

read more

by badlop at June 01, 2011 05:05 PM

May 25, 2011

erlang.org RSS Feed

R14B03 released

Erlang/OTP R14B03 has been released as planned on May 25th, 2011. It is the third R14 service release.

See the release notes in the readme file.

Download the new release from the download page.

Highlights:

  • Diameter is a brand new application in this release. The application supports the Diameter protocol specified in RFC 3588 and is intended to provide an Authentication, Authorization and Accounting (AAA) framework for applications.
  • The documentation for stdlib and kernel now uses type specifications from the source modules which should guarantee that the documentation and code are consistent with regard to the type information.

May 25, 2011 04:00 PM

Erlang Factory News

Countdown to the Erlang Factory London!

There are only two weeks left to the Erlang Factory London. If you haven't registered yet, you can still do this by clicking here.

We have confirmed speakers including Robert Virding and Mike Williams (Inventors of Erlang), Tony Falco (COO of Basho Technologies), John Hughes (Inventor of QuickCheck) and Eric Merritt (Author of Erlang and OTP in Action). Many more are listed on the Speakers Page.

You are welcome to follow @erlangfactory on Twitter.

May 25, 2011 09:01 AM

May 22, 2011

Programming in the 21st Century

Constantly Create

When I wrote Flickr as a Business Simulator, I was thinking purely about making a product--photos--and getting immediate feedback from a real audience. Seeing how much effort it takes to build up a following. Learning if what you think people will like and what they actually like are the same thing.

It works just as well for learning what it's like to be in any kind of creative profession, such as an author of fiction or a recording artist.

Go look at music reviews on Amazon, and you'll see people puzzling over why a band's latest release doesn't have the spark of their earlier material, pointing out filler songs on albums, complaining about inconsistency between tracks. Sometimes the criticisms are empty, but there's often a ring of truth. There's an underlying question of why. How could a songwriter or band release material that isn't always at the pinnacle of perfection?

After years of posting photos to Flickr, I get it. I'm just going along, taking my odd photographs, when all of a sudden one resonates and breaks through and I watch the view numbers jump way up. Then I've got pressure: How can I follow that up? Sometimes I do, with a couple of winners in a row, but inevitably I can't stay at that level. Sometimes I take a break, not posting shots for a month or more, and then I lose all momentum.

When I'm at a low point, when I devolve into taking pictures of mundane subjects, pictures I know aren't good, I think about how I'm ever going to get out of that rut. Inevitably I do, though it's often a surprise when I go from a forgettable photo one day to something inspired the next.

The key for me is to keep going, to keep taking and posting photos. If I get all perfectionist then there's too much pressure, and I start second-guessing myself. If I give up when my quality drops off, then that's not solving anything. The steady progress of continual output, whether good or bad output, is part of the overall creative process.

by James Hague at May 22, 2011 06:00 AM

May 19, 2011

Damien Katz

I'm in Boston next week for a Couchbase Meetup

Get yo couch on! Sign up here: http://www.meetup.com/Boston-CouchDB/events/17374461/

damien@couchbase.com

by Damien Katz at May 19, 2011 07:03 PM

May 18, 2011

Erlang Training and Consulting
- News

18 May 2011: Radio-Electronics publishes an article on Open Telecommunications Platform, OTP for Open Communications by Francesco Cesarini

Francesco Cesarini, Technical Director of Erlang Solutions, discusses the benefits of the Erlang Open Telecom Platform and how it can be used in product development for writing software for telecommunications systems.

In the article he looks at Erlang as a programming language, what exactly OTP is, OTP as more than a telecoms platform, and the benefits of OTP. This is an article you can't afford to miss. For the full article check out the Radio-Electronics website.

May 18, 2011 04:17 PM

Erlang Factory News

Early Bird Rate - Book before 22 May to save £100

Book your place now at the Erlang Factory London 2011 at the Early Bird Rate of £495 for the conference and £1295 for the University and conference. This is a saving of £100 off the standard price of the conference! Only valid until 22 May!

Speakers so far include Kostis Sagonas, leader of the HiPE team at Uppsala University; Marcus Kern, CTO at MIG; Steve Vinoski, distributed systems expert; and Scott Lystig Fritchie, senior software engineer at Basho Technologies. Many more speakers are still to be announced.

Book now to secure this fantastic rate!

May 18, 2011 09:16 AM

Trapexit's Erlang Blog Filter

More Erlang Web Server Benchmarking

In my previous blog entry I questioned the value of most web server benchmarking, particularly as related to Erlang. Typical benchmarks are misleading, inaccurate, and poorly executed. Perhaps worse, the intent of publishing them seems to be to assert that the fastest web server (at least according to the tests performed) is of course also the best web server. You’d think the flaws of this fallacy would be so obvious that nobody would fall for it, but think again: watching the delicious “erlang” tag over the past few days revealed that the benchmarks my blog post referred to were among the most bookmarked Erlang-related pages during that timeframe.

Not surprisingly, though, it looks like I’m not the only one bothered by poor benchmarking practices. Over on his blog, Mark Nottingham just published a brilliant set of rules for HTTP load testing. It’s quite instructive to take your favorite set of published web server benchmarks and see just how many of Mark’s rules they violate.

Like I hinted last time, if you want benchmarks, you are best off by far if you run them yourself. That way, their relevance to the problems you’re addressing will be much more likely, and you can run them in a similar, or even the same, environment on which you plan to deploy. You can also gear the benchmarks to much more closely resemble your applications and the loads you require them to handle. Doing the benchmarking work yourself will give you valuable hands-on experience with the servers and frameworks you’re considering, allowing you to get a feel for important factors such as feature completeness and correctness, ease of development, flexibility, and ease of deployment and runtime management/monitoring, none of which can be gauged by someone else’s performance benchmarks. Finally, by doing your own benchmarking you can also help ensure the validity and usefulness of your results by following Mark’s load testing rules.

by steve at May 18, 2011 05:54 AM

May 16, 2011

Erlang Inside

Motivated Reasoning and Erlang vs Python vs Node

A comparison between Misultin, Mochiweb, Cowboy, Node.JS, and Tornado. Tests such as this one tend to focus on either raw parsing speed or total number of concurrent connections, and which is more important depends on the ultimate application. Or they perfectly tune the tester’s favorite framework and use a stock configuration for the other competitors. [...]

by Chad DePue at May 16, 2011 03:51 PM

Erlang in Haskell

Haskellers who want to experiment with the concepts of Erlang in Haskell can now do so with an experimental Erlang-like distributed computing framework called ‘remote’. It’s quite basic but covers concurrency, selective receive of messages, local and global registration of processes, and trapping exits. Related Microsoft Research Paper.
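For readers unfamiliar with selective receive, here is a minimal sketch of the idea in Python. The `Mailbox` class and its method names are hypothetical and exist only for illustration; they are not the API of the ‘remote’ framework or of Erlang itself, and a real receive would block rather than return `None`.

```python
from collections import deque
from typing import Any, Callable, Optional


class Mailbox:
    """Toy mailbox: messages queue up in arrival order."""

    def __init__(self) -> None:
        self._messages: deque = deque()

    def send(self, message: Any) -> None:
        self._messages.append(message)

    def receive(self, matches: Callable[[Any], bool]) -> Optional[Any]:
        """Remove and return the oldest message accepted by `matches`.

        Messages that do not match stay queued in their original order.
        Returns None when nothing matches.
        """
        for index, message in enumerate(self._messages):
            if matches(message):
                del self._messages[index]
                return message
        return None


if __name__ == "__main__":
    box = Mailbox()
    box.send(("log", "starting"))
    box.send(("exit", 1))
    # Pick out the exit signal even though a log message arrived first.
    print(box.receive(lambda m: m[0] == "exit"))
```

The point of the pattern is that a process can handle an urgent message first without losing or reordering the messages it skipped over.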

by Chad DePue at May 16, 2011 03:49 AM

May 11, 2011

Erlang Training and Consulting
- News

11 May 2011: Juan Puig has been invited to talk at The University of Coruña's Computer Science faculty's 25th Anniversary celebrations!

The University of Coruña's Computer Science faculty is turning 25, and to celebrate such an important event, talks, workshops and exhibitions will be held this Thursday, 12th May 2011, in Palexco, the main auditorium in A Coruña. We here at Erlang Solutions are proud that Juan Puig has been invited to talk at this event as one of the graduates of The University of Coruña. He will then give a more technical lecture on the Friday at the Computer Science department.

Location: Palexco auditorium, A Coruña, Spain
Time: 12th May, 12:30h
Program (in Spanish)

Juan's talk will be:
Title: Building fault-tolerant systems.
Summary: An introduction to current software and fault-tolerant systems, followed by how Erlang/OTP provides the right built-in features to build large, scalable fault-tolerant systems.

On Friday 13th, Juan will give a more technical lecture at the Computer Science faculty, in the slot of a Software Design course of the Computer Engineering degree.
Location: Room 3.3, Faculty of Computer Science, University of A Coruña, Spain
Time: 13th May, 11:30h
Title: "Design of abstract models with Erlang"
Summary: An introduction to Erlang followed by a quick overview of OTP and its behaviours, showing how to design large applications. The second part of the talk will focus on property-based testing.

May 11, 2011 05:32 PM

Scattered Thoughts

mist

I have a new secret project. It's called Mist. The goal is to make it easy to build, use, distribute and maintain small-scale p2p apps. By small-scale I mean apps where each invocation has at most a few dozen users. You would use Mist to build the next etherpad, not the next bittorrent.

The main inspiration behind Mist is Sugar. Sugar supports seamless sharing of activities (applications) across local mesh networks or the internet, even if some of the peers don't even have the activity installed. Where Sugar is focused on providing a working system here and now, I am more interested in how distributed programming can be made easier in the future. I also want to build applications that I can use myself so my initial target is Ubuntu/Gnome.

Starting from September, I'm going to be doing just enough freelance work to support full time work on Mist. I'm aware that this is a huge project and probably doomed to failure, so there are a few key principles that I hope will keep me on track. Wherever possible I will reuse existing code and protocols. Mist will be built out of small components, each of which is useful by itself. Where practical Mist will be compatible with Sugar, so that any improvements can be folded back into the Sugar codebase. Finally, Mist is a prototype. The goal will be to get something up and running to experiment with different ways of building distributed apps.

So without further ado here is a misty outline of what Mist might one day look like:

  • Seamless connectivity, serverless presence and activity discovery

    • Mostly borrowed from Sugar

    • Apps communicate over Telepathy tubes

    • Use Sugar presence service for activity discovery

    • LAN / ad-hoc networks supported by Telepathy Salut and Avahi

    • Centralized communication supported using Telepathy Gabble to talk to Jabber servers

    • Decentralized communication supported using a Telepathy plugin for erl-telehash

      • Use telehash peers for ICE

      • Use link-local XMPP to talk

      • Use telehash taps to implement Avahi compatible mDNS

    • Use pgp for authentication and end-to-end encryption

  • Securely portable applications

    • Use an existing system. Some options which will need to be researched:

      • Bitfrost is specific to the XO laptop and probably too hard to port

      • Browsers provide well-tested sandboxes and mobile code but limit access to hardware. Perhaps ChromeOS or Firefox Chromeless?

      • Java apps can possibly be installed securely. Looking at the Android source may be instructive.

  • Development tools for distributed apps

    • Want to experiment with Bloom. I have some ideas for static analysis of Bloom programs that might be fun.

    • Expose libraries/APIs for common patterns, e.g. leader election, operational transform

    • Support serverless syncing of contacts, bookmarks, mail etc using DHT with enforced reciprocal storage (pretty sure I've read a paper on this somewhere)
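As an illustration of the kind of pattern such a library might expose, here is a minimal sketch of operational transform for two concurrent text insertions. The `Insert` type, the `site` tie-breaker, and the function names are hypothetical; nothing here reflects an actual Mist, Sugar, or etherpad API, and deletions are left out entirely.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text goes
    text: str   # text to insert
    site: int   # id of the peer that produced the edit; breaks ties


def apply_insert(doc: str, op: Insert) -> str:
    """Apply a single insertion to the document."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, applied: Insert) -> Insert:
    """Rewrite `op` so it can run after the concurrent edit `applied`.

    If `applied` landed before `op` (or at the same position from a
    lower-numbered site), `op` must shift right by the inserted length.
    """
    lands_before = applied.pos < op.pos or (
        applied.pos == op.pos and applied.site < op.site
    )
    if lands_before:
        return replace(op, pos=op.pos + len(applied.text))
    return op


if __name__ == "__main__":
    doc = "abc"
    a = Insert(pos=1, text="X", site=1)
    b = Insert(pos=1, text="Y", site=2)
    # Each peer applies its own edit first, then the transformed remote one.
    at_site_1 = apply_insert(apply_insert(doc, a), transform(b, a))
    at_site_2 = apply_insert(apply_insert(doc, b), transform(a, b))
    print(at_site_1, at_site_2)
```

Both peers end up with the same document even though they applied the edits in different orders, which is the property a shared editor needs.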

The immediate plan for the next few months (during which I'm still working full time) is:

  • Get erl-telehash working (currently has no test suite and doesn't support taps, _ring/_line or rate limiting)

  • Get ICE working between telehash peers

  • Write a Telepathy backend using erl-telehash and link-local XMPP

  • Extend Empathy to use the telehash Telepathy backend

If you are wondering where Mist gets its name from - it's not the cloud.

May 11, 2011 09:40 AM

May 10, 2011

Trapexit's Erlang Blog Filter

Erlang Web Server Benchmarking

Over on his blog, Roberto Ostinelli published “A comparison between Misultin, Mochiweb, Cowboy, NodeJS and Tornadoweb.” I was going to write a reply comment there, but it got pretty long so I decided to publish it here instead. I’m going to ignore the non-Erlang web servers it discusses and focus entirely on Erlang. I’m not trying to pick on Roberto specifically here, but rather I decided to finally write something I’ve been meaning to write for a while now about Erlang web servers and benchmarking.

First, I second the request made by one commenter for including Yaws in the measurements. Roberto, if you need help with the code or setup, just let me know. If one insists on writing these kinds of benchmarks, which as you’ll learn if you read this whole entry is something I question, the least he or she could do is include Yaws since it’s the granddaddy of all Erlang web servers.

Based on the benchmark code Roberto published, I wrote the following simple Yaws module to conform to the problem statement and registered it in my Yaws configuration as a “/” appmod:

-module(yaws_bench).
-export([out/1]).

out(Arg) ->
    [{status, 200},
     {content, "text/xml",
      case yaws_api:queryvar(Arg, "value") of
          {ok, Value} ->
              ["<http_test><value>", Value, "</value></http_test>"];
          _ ->
              "<http_test><error>no value specified</error></http_test>"
      end}].

I then measured it on my Ubuntu 10.10 two-core system using Roberto’s published httperf command against the misultin and Mochiweb code he published, and found that Yaws definitely holds its own, even though it’s a full-featured web server and does not claim to be just a lightweight library offering (sometimes partial) HTTP support as some frameworks do. For some tests Yaws outperforms misultin, and for others it doesn’t. This is interesting, considering that neither Klacke nor I have made any attempts at performance improvements in Yaws recently.

Second, the benchmarks do not compare apples to apples. Both Mochiweb and Yaws, for example, produce replies that are larger in size than misultin’s replies, primarily because they both include Server and Date headers. As I’ve learned from years of helping maintain Yaws, date calculations can noticeably and surprisingly impact Erlang web server performance, yet simply leaving Date headers out isn’t an option for real-world apps since HTTP 1.1 pretty much requires them (section 13.2.3 of RFC 2616 states, “HTTP/1.1 requires origin servers to send a Date header, if possible, with every response, giving the time at which the response was generated…”). Caches use Date headers for several reasons, for example in the absence of cache-control headers to help heuristically calculate content expiration. Even ignoring the date calculation requirements, just creating and delivering larger replies due to the presence of the Server header will negatively impact any comparisons based on request/second measurements.
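The size difference is easy to quantify. The following sketch builds the same reply with and without Date and Server headers and reports the overhead; the `ExampleServer/1.0` token and the helper names are made up for illustration and do not correspond to what Yaws, Mochiweb, or misultin actually send.

```python
from email.utils import formatdate

BODY = "<http_test><value>42</value></http_test>"


def date_header(timestamp=None) -> str:
    """HTTP Date header in the RFC 1123 format, always in GMT."""
    return "Date: " + formatdate(timeval=timestamp, usegmt=True)


def build_response(body: str, extra_headers=()) -> bytes:
    """Assemble a minimal HTTP/1.1 reply as raw bytes."""
    payload = body.encode("utf-8")
    head = [
        "HTTP/1.1 200 OK",
        "Content-Type: text/xml",
        f"Content-Length: {len(payload)}",
    ]
    head.extend(extra_headers)
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + payload


if __name__ == "__main__":
    minimal = build_response(BODY)
    # The Server token below is a placeholder, not a real product string.
    full = build_response(BODY, [date_header(), "Server: ExampleServer/1.0"])
    print(len(minimal), "bytes without Date/Server")
    print(len(full), "bytes with Date/Server")
```

For a reply body this small, two extra headers are a noticeable fraction of the bytes on the wire, so servers that send them are doing measurably more work per request.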

Third, the benchmarking approach includes no application “think time.” How many real-world apps just blast request after request down a connection without any intervening time to handle replies? If the goal is to measure something akin to real-world apps, then the benchmarks should at least be using something like httperf’s --wsess option to simulate client think time. And unfortunately doing that is hard to get right for generic benchmarks, since different client apps will have different think times.
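A back-of-the-envelope model shows how much think time changes the picture. This is a simple closed-loop estimate, and the connection count, service time, and think time below are assumed numbers chosen for illustration, not measurements from any of the benchmarks discussed.

```python
def requests_per_second(connections: int,
                        service_time_s: float,
                        think_time_s: float = 0.0) -> float:
    """Closed-loop estimate of offered load.

    Each connection sends a request, waits `service_time_s` for the
    reply, pauses for `think_time_s`, and then repeats.
    """
    return connections / (service_time_s + think_time_s)


if __name__ == "__main__":
    # Assumed: 100 connections and a 5 ms round trip per request.
    print(requests_per_second(100, 0.005))        # back-to-back requests
    print(requests_per_second(100, 0.005, 1.0))   # one second of think time
```

Under these assumptions the same 100 clients generate roughly two hundred times less load once they pause for a second between requests, so a benchmark without think time says little about how many real users a server can carry.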

On a related note, what exactly is the goal of these benchmarks? To imply that faster is better? That’s unfortunately a commonly-held fallacy. Given that the blog entry states that the target is dynamic applications, then consider the fact that the performance of a real-world dynamic application is often dominated by something other than the web server — perhaps some back-end service from which page data is being fetched, for example. A real-world setup greatly concerned with performance is likely to have nginx out in front, probably with a local cache, to handle fast-path requests, shunting only those requests it can’t fulfill off to the slower back-end server. Such benchmarking games are therefore often misguided as far as real-world dynamic apps are concerned because they end up measuring something that isn’t even in the critical path in a real setup.
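The same point can be made with a little arithmetic. In this sketch the millisecond figures are assumptions picked for illustration: a request that spends 1 ms in the web server and 49 ms waiting on a back-end service.

```python
def end_to_end_speedup(server_ms: float,
                       backend_ms: float,
                       server_speedup: float) -> float:
    """Overall speedup of a request when only the web server gets faster."""
    before = server_ms + backend_ms
    after = server_ms / server_speedup + backend_ms
    return before / after


if __name__ == "__main__":
    # Assumed split: 1 ms in the web server, 49 ms in the back end.
    print(end_to_end_speedup(server_ms=1.0, backend_ms=49.0, server_speedup=2.0))
```

With that split, doubling the speed of the web server improves the end-to-end request time by about one percent, which is the sense in which the web server is often not on the critical path.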

I don’t agree with Kyle Drake’s comment on Roberto’s blog about code ugliness, since the Erlang code posted there is very clear and would look like “garbage” only to someone who doesn’t know the language. But I do agree with the sentiment, which is that for dynamic apps, what often matters is what kind of code, and how much, you have to write and maintain to support your app. Given that Erlang web servers tend to make use of the underlying Erlang/OTP facilities for HTTP parsing and socket handling, then all things considered you’re just not going to get a huge variation in performance among them, assuming they’re written halfway decently. What matters for dynamic apps are the stability of the web server/library and the programming model it offers. These are what Roberto should really be benchmarking, but of course that’s basically impossible since stability would take a long time to prove, and programming model is a matter of taste that can’t be conveniently measured using artificial benchmarking tools. This reminds me of one of my old columns on this very issue as applied to enterprise middleware, entitled “The Performance Presumption” (PDF); the short version is that people often measure performance simply because performance is relatively easy to measure. The lesson is that you shouldn’t rely on generic benchmarks, but rather you should take the time to create specific benchmarks that mimic the app you want to develop, and base your decisions on the results of that exercise.

On top of all that, I don’t really understand the desire to keep writing new Erlang web frameworks for performance reasons. As I stated earlier, if a framework uses Erlang’s built-in packet decoding and socket handling, it won’t perform a great deal better than any other Erlang web framework. OTOH, if someone writes a new framework with the hope of providing a really nice new programming model — webmachine is a fantastic example of this — then they shouldn’t be “proving” how good the programming model is by trying to show how fast it is. Ever seen webmachine being advertised via performance benchmarks? Neither have I.

Let’s face it, the Erlang web development community isn’t large enough to support numerous web servers and frameworks. I’m sure some will disagree, but publishing artificial benchmarks designed to “prove” which is best IMO results mostly in just fragmenting the community. If you really have an itch to write a fast Erlang web server, you’d help the community much more by contributing to an existing one, including the Erlang inets web server included in Erlang/OTP and now powering the Erlang website. For Yaws, Klacke and I often take patches and suggestions from our users, and we gladly welcome solid contributions intended to improve Yaws performance. If you’re just dying to show off your chops, note that improving performance in a long-lived and highly stable codebase like Yaws without breaking anyone’s code is far more challenging than writing another new server that basically doesn’t differ much from what already exists.

Or perhaps better yet, contribute to the Erlang core. IMO the next major performance improvements in Erlang web servers will come not from minor tweaks in handling binaries or such things, but rather via radical improvements in the Erlang TCP driver or even from developing a whole new HTTP-specific driver. Unlike a war of artificial benchmarks among Erlang web servers, these approaches have a great chance to improve the lot of all Erlang web systems.

by steve at May 10, 2011 02:01 AM

May 06, 2011

erlang.org RSS Feed

ACM SIGPLAN Erlang Workshop in Tokyo on September 23, 2011

ACM SIGPLAN Erlang Workshop

The Tenth ACM SIGPLAN Erlang Workshop will take place in Tokyo, Japan, on September 23, 2011. Please see the call for papers here.

May 06, 2011 11:08 PM

May 04, 2011

Erlang Factory News

Book your place at the Erlang Factory London Now and save £100!

Book your place now at the Erlang Factory London 2011 at the Early Bird Rate of £495 for the conference and £1295 for the University and conference. This is a saving of £100 off the standard price of the conference! This is only available for a limited time!

Speakers so far include Kostis Sagonas, leader of the HiPE team at Uppsala University; Marcus Kern, CTO at MIG; Steve Vinoski, distributed systems expert; and Scott Lystig Fritchie, senior software engineer at Basho Technologies. Many more speakers are still to be announced.

Book now to secure this fantastic rate!

May 04, 2011 10:31 AM

Anders Conbere's Journal

Learning to Program is Easy

Years ago I spent a summer teaching science classes to kids aged 6 to 12 at the Oregon Museum of Science and Industry. This program wasn't particularly well managed, mostly a disaster, but quite a bit of fun. The most fun I had was helping teach a class on game programming to 5th and 6th graders. The class used a pretty nice implementation of Logo called MicroWorlds which benefited from having a simple GUI and IDE for the kids to use. Maybe today we'd use something like Scratch or EToys. Either way the class began with a simple introduction to making things happen in this virtual world, and some of the basics of the command structure. And within minutes there was a volley of hands that shot into the air, each with its very own bug.

That might not sound like much, but by the time you've engaged someone to the point that they've typed in some commands and arrived at an unexpected occurrence... you've basically got them hook, line, and sinker. So that day and some of the next was often spent detailing the use of variables, the intricacies of syntax, and what rules exist and why they make their life easier. By the end of the week we would have flies narrowly dodging venus fly traps, and robots that had to defeat monsters.

I'm not saying every kid produced a work of genius, and some had more difficulties than others. But the nature of programming does not have to be a complex and difficult venture. Simply mapping input to a behavior gets you pretty far. This is a big part of the idea of Scheme. How do we build a language that is simple enough that our English majors can accomplish something in it, but powerful enough that our computer science students can develop complex ideas with it? The point is that most people can be taught to create at least rudimentary programs in just a few hours of tutorial.

And this really brings me to the reason I'm writing this post: while programming might be easy, making applications and architecting large projects, putting all the various pieces together in a way that gives you the flexibility to accomplish what you might need to but the structure to finish what you have to, is amazingly difficult. These are two different skills. You can teach someone to use Python like a calculator, and to write functions that help them manage their bills. But the ability to take that knowledge and expand upon it to write a personal finance application is much more difficult to teach.

Some of this is abstraction. Abstraction is difficult for humans, and I say this knowing that of all the creatures we've ever encountered we're likely the most capable abstractors in existence. But while that might be true, our brains are wired to work inside a rather constrained and physical world. There's a test that's often mentioned in the literature where participants are asked to accomplish a mathematical task using variable names, and are then asked again with the task framed in terms of cards. In the first case most participants answer incorrectly, while in the second they do it with ease. The great difficulty of mathematics, as well as programming, arises from the depths of abstraction.

Just as mathematics might be the deep study of one abstract layer after another, complex application design mirrors this. In mathematics, at the bottom might be something like Category Theory from which derives Algebra from which derive the infinity of number systems from which derive Real and Complex numbers from which derive Real Analysis and then the Calculuses and on down the chain. A complex application might have a User Interface which abstracts a Programmable Interface which calls down into a Data Layer that then talks to a Persistence Layer that talks to the Operating System on down to the bare metal. Most of the extremely competent application architects I know might be considered a software version of a "tall, thin man": someone who is familiar with all the ramifications of his design, from the top layers to the cost at the bottom layers.

What I'm trying to say is that programming might be easy. But understanding both the ramifications of the current environment (what can be done with the current language and tools) and the cost of the various layers below it is hard. And teaching THAT is a much more difficult task than teaching someone how to use a fancy calculator. Books like SICP attempt to accomplish this task by dealing with abstraction in its own right, by merging programming and mathematics and enabling a kind of cross-pollination thereof. But even that can leave a student with little understanding of how to put all the pieces together. I write this because my focus since college has been on the art of programming. I'm beginning to understand where the limitations of my studies have been, and where I'm headed. My new goal is to find more people to look at my code and to find new ways of expressing those problems.

May 04, 2011 01:42 AM

May 03, 2011

Erlang Factory News

Only 5 Places left at the Erlang Factory London!

There are only 5 chances left to book your place at the Erlang Factory London 2011 at the Very Early Bird Rate of £395 for the conference and £1195 for the University and conference. This is a saving of £200 off the standard price of the conference! When they are gone, they are gone!

Speakers so far include Kostis Sagonas, the leader of the HIPE Team at Uppsala University; Marcus Kern, CTO at MIG; Steve Vinoski, distributed systems expert; and Scott Lystig Fritchie, senior software engineer at Basho Technologies. Many more speakers are still to be announced.

Book now to secure this fantastic rate!

May 03, 2011 03:02 PM

erlang.org RSS Feed

The Erlang Factory London is back!

The Erlang Factory London is back! The dates you need for your diary are 6th, 7th and 8th June for the Erlang University courses and 9th and 10th June for the Erlang Factory Conference.

There are 10 places left at the Very Early Bird Rate of £395, which is a saving of £200! Book now to get your place!

 

May 03, 2011 09:10 AM

April 30, 2011

Programming in the 21st Century

Impressed by Slow Code

At one time I was interested in--even enthralled by--low-level optimization.

Beautiful and clever tricks abound. Got a function call followed by a return statement? Replace the pair with a single jump instruction. Once you've realized that "load effective address" operations are actually doing math, then they can subsume short sequences of adds and shifts. On processors with fast "count leading zero bits" instructions, entire loops can be replaced with a couple of lines of linear code.

I spent a long time doing that before I realized it was a mechanical process.

I don't necessarily mean mechanical in the "a good compiler can do the same thing" sense, but that it's a raw engineering problem to take a function and make it faster. Take a simple routine that potentially loops through a lot of data, like a case insensitive string comparison. The first step is to get as many instructions out of the loop as possible. See if what remains can be rephrased using fewer or more efficient instructions. Can any of the calculations be replaced with a small table? Is there a way to process multiple elements at the same time using vector instructions?

The truth is that there's no magic in taking a well-understood, working function, analyzing it, and rewriting it in a way that involves doing slightly or even dramatically less work at run-time. If I ended up with a routine that was a bottleneck, I know I could take the time to make it faster. Or someone else could. Or if it was small enough I could post it to an assembly language programming forum and come back in a couple of days when the dust settled.

What's much more interesting is speeding up something complex, a program where all the time isn't going into a couple of obvious hotspots.

All of a sudden, that view through the low-level magnifying glass is misleading. Yes, that's clearly an N-squared algorithm right there, but it may not matter at all. (It might only get called with low values of N, for example.) This loop here contains many extraneous instructions, but that's hardly a big picture view. None of this helps with understanding the overall data flow, how much computation is really being done, and where the potential for simplification lies.

Working at that level, it makes sense to use a language that keeps you from thinking about exactly how your code maps to the underlying hardware. It can take a bit of faith to set aside deeply ingrained instincts about performance and concerns with low-level benchmarks, but I've seen Python programs that ended up faster than C. I've seen complex programs running under the Erlang virtual machine that are done executing before my finger is off the return key.

And that's what's impressive: code that is so easy to label as slow upon first glance, code containing functions that can--in isolation--be definitively proven to be dozens or hundreds of times slower than what's possible on a given CPU, and yet the overall program is decidedly one of high performance.

(If you liked this, you might enjoy Timidity Does Not Convince.)

by James Hague at April 30, 2011 06:00 AM

April 29, 2011

Learn You Some Erlang

Building an Application With OTP

This chapter lets us make practical use of the OTP behaviours seen so far. We do this by writing a process pool application that will let us handle resources and tasks. We explore ideas behind process trees, onion layer theories for processes and general views of how OTP can be used to write software.

April 29, 2011 12:00 PM

April 20, 2011

Erlang Training and Consulting
- News

20 April 2011: Registration is now open for the Erlang Factory London - Early Bird Rate Available

 

Erlang Factory

 

The registration for the Erlang Factory London 2011 is now open! Register now and you will save £100 by taking advantage of the Early-Bird offer. So far each Erlang Factory has been met with great interest so don't waste time. Register before it sells out!

The Erlang Factory London is the best place to exchange knowledge, experience and the vision for the future of Erlang. We have great speakers who will share all that with YOU. This conference has become one of the best Erlang networking events in Europe.

Among our first confirmed speakers for the Erlang Factory London are: Kostis Sagonas, the leader of the HIPE Team at Uppsala University, Marcus Kern, CTO at MIG, Steve Vinoski, distributed systems expert and Scott Lystig Fritchie, senior software engineer at Basho Technologies with many more still to be announced.

The list of speakers is available here, with many more to be announced.

In addition to the conference, there will be a three day Erlang University, allowing you to learn Erlang from the experts. You can choose from the following courses: Erlang Express, OTP Express, Erlang and Test Driven Development or Quick Check for Erlang Developers.

The Erlang University training courses will take place on the 6th, 7th and 8th June and the Erlang Factory conference on 9th and 10th June. Don't miss out - register now.

We will also be offering certification at Foundation and Intermediate level. You can register for this when you book your place at the Factory. Pricing for each exam is £195, or you can sit both for £345. For more information on Certification please visit the website.

 

April 20, 2011 11:18 AM

Erlang Factory News

Erlang Factory London Open for Registration - Save £200

Be one of the first 50 people to book your place at the Erlang Factory London 2011 at the Very Early Bird Rate of £395 for the conference and £1195 for the University and conference. This is a saving of £200 off the standard price of the conference!

Speakers so far include Kostis Sagonas, the leader of the HIPE Team at Uppsala University; Marcus Kern, CTO at MIG; Steve Vinoski, distributed systems expert; and Scott Lystig Fritchie, senior software engineer at Basho Technologies. Many more speakers are still to be announced.

Book now to secure this fantastic rate!

April 20, 2011 09:16 AM

Programming in the 21st Century

Follow the Vibrancy

Back in 1999 or 2000, I started reading a now-defunct Linux game news site. I thought the combination of enthusiastic people wanting to write video games and the excitement surrounding both Linux and open source would result in a vibrant, creative community.

Instead there were endless emulators and uninspired rewrites of stale old games.

I could theorize about why there was such a lack of spark, a lack of motivation to create anything distinctive and exciting. Perhaps most of the projects were intended to fulfill coding itches, not personal visions. I don't know. But I lost interest, and I stopped following that site.

When I wanted to modernize my programming skills, I took a long look at Lisp. It's a beautiful and powerful language, but I was put off by the community. It was a justifiably smug community, yes, but it was an empty smugness. Where were the people using this amazing technology to build impressive applications? Why was everyone so touchy and defensive? That doesn't directly point at the language being flawed--not by any means--but it seemed an indication that something wasn't right, that maybe there was a reason that people driven to push boundaries and create new experiences weren't drawn to the tremendous purported advantages of Lisp. So I moved on.

Vibrancy is an indicator of worthwhile technology. If people are excited, if there's a community of developers more concerned with building things than advocating or justifying, then that's a good place to be. "Worthwhile" may not mean the best or fastest, but I'll take enthusiasm and creativity over either of those.

(If you liked this, you might enjoy The Pure Tech Side is the Dark Side.)

by James Hague at April 20, 2011 06:00 AM

April 19, 2011

Scattered Thoughts

telehash: router

Now that we have all the necessary datastructures we can build the router itself. Most of the routing table logic is handled by the bit_tree and bucket modules. The router just ties these together and handles I/O.

Before actually running the routing table, the router has to find out its own address as it is seen from the outside world. It does this by sending +end signals to a list of known telehash nodes (e.g. telehash.org:42424).

-record(bootstrap, { % the state of the router when bootstrapping
          timeout, % give up if no address received before this time
          addresses % list of addresses contacted to find out our address
         }).

bootstrap(Addresses, Timeout) ->
    ?INFO([bootstrapping]),
    State = #bootstrap{timeout=Timeout, addresses=Addresses},
    {ok, _Pid} = gen_server:start_link(?MODULE, State, []).
                          
init(State) ->
    switch:listen(),
    case State of
        #bootstrap{timeout=Timeout, addresses=Addresses} ->
            Telex = telex:end_signal(util:random_end()),
            lists:foreach(fun (Address) -> switch:send(Address, Telex) end, Addresses),
            erlang:send_after(Timeout, self(), giveup);
        #state{} ->
            ok
    end,
    {ok, State}.

Then we listen until we either get a reply with a _to field or run out of time.

handle_info({switch, {recv, From, Telex}}, #bootstrap{addresses=Addresses}=Bootstrap) ->
    % bootstrapping, waiting to receive a message telling us our own address
    case {lists:member(From, Addresses), telex:get(Telex, '_to')} of
        {true, {ok, Binary}} ->
            try util:to_end(util:binary_to_address(Binary)) of
                End ->
                    Self = util:to_bits(End),
                    Table = touched(From, Self, empty_table(Self)),
                    dialer:dial(End, [From], ?ROUTER_DIAL_TIMEOUT),
                    refresh(Self, Table),
                    ?INFO([bootstrap, finished, {self, Binary}, {from, From}]),
                    {noreply, #state{self=Self, pinged=sets:new(), table=Table}}
            catch
                _ ->
                    ?WARN([bootstrap, bad_self, {self, Binary}, {from, From}]),
                    {noreply, Bootstrap}
            end;
        _ ->
            {noreply, Bootstrap}
    end;

handle_info(giveup, #bootstrap{}=Bootstrap) ->
    % failed to bootstrap, die
    ?INFO([giveup, {state, Bootstrap}]),
    {stop, {shutdown, gaveup}, Bootstrap};

Once we know our own address we can fill in the state record and start managing the routing table.

-record(state, { % the state of the router in normal operation
          self, % the bits of the routers own end
          pinged, % set of addresses which have been pinged and not yet replied/timedout
          table % the routing table, a bit_tree containing buckets of nodes
         }).

One of the jobs of the router is to remove unresponsive nodes from the routing table. To check if a node is responsive we just send a random +end signal and wait for a reply. If the node is unresponsive it gets marked as stale and we try to find a suitable replacement. The node won't actually be dropped from the table until a replacement is found. This prevents the table from getting flushed if our network connection goes down.

ping(To) ->
    Telex = telex:end_signal(util:random_end()),
    % do this in a message to self to avoid some awkward control flow
    self() ! {pinging, To},
    switch:send(To, Telex),
    erlang:send_after(?ROUTER_PING_TIMEOUT, self(), {timeout, To}).

timedout(Address, Self, Table) ->
    bit_tree:update(
      fun (_Suffix, _Depth, _Gap, Bucket) ->
              case bucket:timedout(Address, Bucket) of
                  {node, Node, Update} ->
                      % try to touch this node, might be suitable replacement
                      ping(Node),
                      Update;
                  Update ->
                      Update
              end
      end,
      util:to_bits(Address),
      Self,
      Table
     ).

handle_info({pinging, Address}, #state{pinged=Pinged}=State) ->
    % do this in a message to self to avoid some awkward control flow
    ?INFO([recording_ping, {address, Address}]),
    Pinged2 = sets:add_element(Address, Pinged),
    {noreply, State#state{pinged=Pinged2}};
handle_info({timeout, Address}, #state{self=Self, pinged=Pinged, table=Table}=State) ->
    % Pinged is a set, so membership is tested with sets:is_element/2
    case sets:is_element(Address, Pinged) of
        true ->
            % ping timedout
            ?INFO([timeout, {address, Address}]),
            Table2 = timedout(Address, Self, Table),
            {noreply, State#state{table=Table2}};
        false ->
            % address already replied
            {noreply, State}
    end;

One of the rules of the router is that it should never pass on information about a node that it hasn't personally confirmed to exist. Once we receive a message from a node we know that it exists (later we will implement _ring/_line to protect against address spoofing):

touched(Address, Self, Table) ->
    bit_tree:update(
      fun (Suffix, _Depth, Gap, Bucket) ->
              May_split = (Gap < ?K), % !!! or (Depth < ?ROUTER_TABLE_EXPANSION)
              bucket:touched(Address, Suffix, now(), Bucket, May_split)
      end,
      util:to_bits(Address),
      Self,
      Table
     ).

On receiving a .see command we record all the contained addresses as potential nodes and ping them to try to confirm their existence.

seen(Address, Self, Table) ->
    bit_tree:update(
      fun (Suffix, _Depth, _Gap, Bucket) ->
              case bucket:seen(Address, Suffix, now(), Bucket) of
                  {node, Node, Update} ->
                      % check if this node is stale
                      ping(Node),
                      Update;
                  Update ->
                      Update
              end
      end,
      util:to_bits(Address),
      Self,
      Table
     ).

On receiving a +end signal we reply with a .see command containing the nearest K addresses which we have confirmed to exist.

see(To, End, Table) ->
    Telex = telex:see_command(nearest(?K, End, Table)),
    switch:send(To, Telex).

nearest(N, End, Table) when N>=0 ->
    Bits = util:to_bits(End),
    iter:take(
      N,
      iter:flatten(
        iter:map(
          fun ({_Prefix, Bucket}) -> bucket:by_dist(End, Bucket) end,
          bit_tree:iter(Bits, Table)))).

On receiving a message we have to handle the above three cases, which gets a little ugly.

handle_info({switch, {recv, From, Telex}}, #state{self=Self, pinged=Pinged, table=Table}=State) ->
    % this counts as a reply
    Pinged2 = sets:del_element(From, Pinged),
    % touched the sender
    % !!! eventually will check _line here
    ?INFO([touched, {node, From}]),
    Table2 = touched(From, Self, Table),
    % maybe seen some nodes
    Table3 =
        case telex:get(Telex, '.see') of
            {ok, Binaries} ->
                try [util:binary_to_address(Bin) || Bin <- Binaries] of
                    Addresses ->
                        ?INFO([seen, {nodes, Addresses}, {from, From}]),
                        lists:foldl(fun (Address, Table_acc) -> seen(Address, Self, Table_acc) end, Table2, Addresses)
                catch
                    _ ->
                        ?INFO([bad_seen, {nodes, Binaries}, {from, From}]),
                        Table2
                end;
            _ ->
                Table2
        end,
    % maybe send some nodes back
    case telex:get(Telex, '+end') of
        {ok, Hex} ->
            try util:hex_to_end(Hex) of
                End ->
                    ?INFO([see, {'end', End}, {from, From}]),
                    see(From, End, Table3)
            catch
                _ ->
                    ?WARN([bad_see, {'end', Hex}, {from, From}])
            end;
        _ ->
            ok
    end,
    % keep Table3 so that nodes recorded from .see are not thrown away
    {noreply, State#state{pinged=Pinged2, table=Table3}};

The last responsibility of the router is to periodically refresh buckets which haven't recently seen any activity.

handle_info(refresh, #state{self=Self, table=Table}=State) ->
    ?INFO([refreshing_table]),
    refresh(Self, Table),
    {noreply, State};
handle_info({dialed, _, _}, State) ->
    % response from a bucket refresh, we don't care
    {noreply, State};

dialed(Address, Self, Table) ->
    bit_tree:update(
      fun (_Suffix, _Depth, _Gap, Bucket) ->
              bucket:dialed(now(), Bucket)
      end,
      util:to_bits(Address),
      Self,
      Table
     ).

needs_refresh(Bucket, Now) ->
    case bucket:last_dialed(Bucket) of
        never ->
            true;
        Last ->
            % now_diff is in microseconds; refresh once the bucket has been idle long enough
            (timer:now_diff(Now, Last) div 1000) >= ?ROUTER_REFRESH_TIME
    end.

refresh(Self, Table) ->
    Now = now(),
    iter:foreach(
      fun ({Prefix, Bucket}) ->
              case needs_refresh(Bucket, Now) of
                  true ->
                      ?INFO([refreshing_bucket, {prefix, Prefix}, {bucket, Bucket}]),
                      To = util:random_end(Prefix),
                      From = nearest(?K, To, Table),
                      dialer:dial(To, From, ?ROUTER_DIAL_TIMEOUT);
                  false ->
                      ok
              end
      end,
      bit_tree:iter(Self, Table)
     ),
    erlang:send_after(?ROUTER_REFRESH_TIME, self(), refresh),
    ok.

That's it. As usual the (untested) code is in the repo. The next post will probably deal with taps.

April 19, 2011 06:24 PM

telehash: gen_event woes

I ran into some tricky bugs caused by a misconception I had about gen_event. Since this is not explicitly stated in the gen_event documentation I will say it here: gen_event does NOT spawn individual processes for each handler. Each handler is run sequentially in the event manager process.

Now obviously the documentation is not at fault here. I assumed that each handler got its own process solely because the callbacks resembled gen_server. However, a little googling reveals that several other people made the same mistake so I thought it was worth mentioning.
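For anyone who wants to see this for themselves, here is a minimal sketch that is not part of the telehash code. The module name handler_pid_demo and the which_pid request are made up for the illustration. It attaches the same handler module twice under different ids and asks each one for self().

```erlang
-module(handler_pid_demo).
-behaviour(gen_event).

-export([run/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Start an event manager, add two handlers and ask each handler
%% which process it is running in. Returns {Manager, PidA, PidB}.
run() ->
    {ok, Manager} = gen_event:start_link(),
    ok = gen_event:add_handler(Manager, {?MODULE, a}, []),
    ok = gen_event:add_handler(Manager, {?MODULE, b}, []),
    PidA = gen_event:call(Manager, {?MODULE, a}, which_pid),
    PidB = gen_event:call(Manager, {?MODULE, b}, which_pid),
    ok = gen_event:stop(Manager),
    {Manager, PidA, PidB}.

%% gen_event callbacks (hypothetical demo handler with no real state)
init([]) -> {ok, nostate}.
handle_event(_Event, State) -> {ok, State}.
%% self() here is evaluated inside the event manager process
handle_call(which_pid, State) -> {ok, self(), State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

All three pids in the returned tuple should be the same, because both handlers are just callbacks executed by the manager process.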

Here is how I found this out. I was working on the router implementation for telehash. When I tested the bootstrapping algorithm everything looked fine until the first dial, after which nothing else happened. Straight away I suspected a bug in the dialer, but repeating the exact same call in the console worked fine. After a few dead ends I opened pman to look for anything suspicious but couldn't find the dialer process (because it doesn't exist, it's an event handler). I assumed that it was somehow crashing silently and wasted an hour or so reading and rereading the code and stepping through various calls in the debugger. No matter what I tried, the dialer worked absolutely perfectly unless it was called by the router.

Eventually I noticed that the switch_event process was blocking inside a receive and the whole thing unravelled. The dialer is an event handler so when started it calls:

gen_event:add_handler(switch_event, dialer, State)

which is a synchronous call to the switch_event process. The router event handler is running inside the switch_event process so when the router tries to dial it deadlocks.

The moral of this story is RTFM.

This is easily fixed by changing

dialer:dial(End, [Address])

to

spawn(fun () -> dialer:dial(End, [Address]) end)

but there were more problems. Most of the event handlers used erlang:send_after to handle timeouts, but since they all run in the same process they all receive each other's timeouts. Also, every event handler is run sequentially, so the switch_event process becomes a huge bottleneck.

The solution I settled on was to change each event handler into a gen_server and write a simple event handler that just forwards events to its owner. By using gen_event:add_sup_handler and listening for event handler exits we can keep the two in sync.
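A forwarding handler along those lines might look roughly like the sketch below. This is an illustration, not the code in the repo: the module name event_forwarder, the attach/1 helper and the {event, Event} message shape are all assumptions made for the example.

```erlang
-module(event_forwarder).
-behaviour(gen_event).

-export([attach/1]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Call this from the owning process, e.g. in a gen_server's init/1.
%% add_sup_handler ties the handler to the caller: if the handler is
%% removed later the owner gets {gen_event_EXIT, Handler, Reason},
%% and if the owner exits the handler is removed.
attach(Manager) ->
    Owner = self(),
    gen_event:add_sup_handler(Manager, {?MODULE, Owner}, Owner).

%% The handler's only state is the pid of its owner.
init(Owner) -> {ok, Owner}.

%% Do no real work inside the event manager: pass the event on and return.
handle_event(Event, Owner) ->
    Owner ! {event, Event},
    {ok, Owner}.

handle_call(_Request, Owner) -> {ok, ok, Owner}.
handle_info(_Info, Owner) -> {ok, Owner}.
terminate(_Reason, _Owner) -> ok.
code_change(_OldVsn, Owner, _Extra) -> {ok, Owner}.
```

The owner then handles {event, Event} in its own handle_info, where timers and blocking calls only affect that one process.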

April 19, 2011 05:33 PM

Erlang Factory News

Erlang Factory SF Bay Area 2011 - A Great Success

On the 24th and 25th of March 2011, over 170 people gathered in the SF Bay Area for the 3rd Erlang Factory. With two keynotes this year, delivered by Kostis Sagonas and Dan Ingalls, 10 more talks than last year and a record-breaking number of delegates, this year's conference was nothing short of a tremendous success.

We would like to thank everyone who attended. Keep an eye out for the videos, which will be released soon!

April 19, 2011 02:01 PM

Only 5 Places left at the Erlang Factory SF Bay Area!

We only have 5 places left at the Erlang Factory SF Bay Area so if you haven't booked yet, don't miss out and book now! The OTP Express course has now sold out but there are a few places left on the other courses at the University.

April 19, 2011 02:01 PM

Erlang Factory London 2011 - Talk Submission Open!

Do you have something you want to share with the programming world? If you have a great project, or something new or innovative, then we want to hear from you. If you want to give a presentation at the Erlang Factory, you can find more information on how to do that here.

The submission deadline is the 11th April 2011.

April 19, 2011 02:01 PM

Early Bird Rate - 1 Week left to save over $200!

There is only 1 week left to book your place at the Early Bird Rate of $690 so don't miss out!

You will get the chance to see some fantastic names from the Erlang world, including Joe Armstrong, Kostis Sagonas, Damien Katz, and Dan Ingalls, network with your fellow programmers, and find out what’s new not only in the world of Erlang but in the world of programming.

Register now to receive the Early Bird Rate and save over $200!

April 19, 2011 02:01 PM

Erlang Factory SF Bay Area - Save over $200

We are offering a fantastic saving of over $200 when you register for the conference before the 1st of March 2011.

You will get the chance to see some fantastic names from the Erlang world, including Joe Armstrong, Kostis Sagonas, Damien Katz, and Dan Ingalls, network with your fellow programmers, and find out what’s new not only in the world of Erlang but in the world of programming.

Register now to receive the Early Bird Rate and save over $200!

April 19, 2011 02:01 PM

April 15, 2011

Programming in the 21st Century

Revisiting "Tricky When You Least Expect It"

Since writing Tricky When You Least Expect It in June 2010, I've gotten a number of responses offering better solutions to the angle_diff problem. The final version I presented in the original article was this:

angle_diff(Begin, End) ->
   D = End - Begin,
   DA = abs(D),
   case {DA > 180, D > 0} of
      {true, true} -> DA - 360;
      {true, _}    -> 360 - DA;
      _ -> D
   end.

But, maybe surprisingly, this function can be written in two lines:

angle_diff(Begin, End) ->
   (End - Begin + 540) rem 360 - 180.

The key is to shift the difference into the range -180 to 180 before the modulo operation. The "- 180" at the end adjusts it back. One quirk of Erlang is that the modulo operator (rem) gives a negative result if the first value is negative. That's easily fixed by adding 360 to the difference (180 + 360 = 540) to ensure that it's always positive. (Remember that adding 360 to an angle gives the same angle.)
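As a sanity check, here is a small sketch that compares the two versions. It assumes whole-degree angles from 0 to 359, since rem only works on integers, and the module and function names are made up for the comparison. It lists every pair where the results differ.

```erlang
-module(angle_diff_check).
-export([diff_case/2, diff_rem/2, mismatches/0]).

%% The case-based version from the original article.
diff_case(Begin, End) ->
    D = End - Begin,
    DA = abs(D),
    case {DA > 180, D > 0} of
        {true, true} -> DA - 360;
        {true, _}    -> 360 - DA;
        _            -> D
    end.

%% The two-line version.
diff_rem(Begin, End) ->
    (End - Begin + 540) rem 360 - 180.

%% Every {Begin, End, CaseResult, RemResult} over whole degrees 0..359
%% for which the two versions return different numbers.
mismatches() ->
    [{B, E, diff_case(B, E), diff_rem(B, E)}
     || B <- lists:seq(0, 359),
        E <- lists:seq(0, 359),
        diff_case(B, E) =/= diff_rem(B, E)].
```

Under that assumption the only disagreements are the pairs where End - Begin is exactly 180: the case version returns 180 and the rem version returns -180, which describe the same half turn.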

So how did I miss this simpler solution? I got off track by thinking I needed an absolute value, and things went downhill from there. I'd like to think if I could rewind and re-attempt the problem from scratch, then I'd see the error of my ways, but I suspect I'd miss it the second time, too. And that's what I was getting at when I wrote "Tricky When You Least Expect It": that you never know when it will take some real thought to solve a seemingly simple problem.

(Thanks to Samuel Tardieu, Benjamin Newman, and Greg Rosenblatt, who all sent almost identical solutions.)

by James Hague at April 15, 2011 06:00 AM

April 13, 2011

Erlang Training and Consulting
- News

13 April 2011: Ulf Wiger and Robert Virding to be on the judging panel at Spawnfest - 9 - 10 July 2011

We are happy to announce that our CTO, Ulf Wiger, and Robert Virding will be on the panel of judges at this year's Spawnfest. Spawnfest is an annual 48 hour development competition in which teams of skilled application developers get exactly one weekend to develop the best Erlang applications that they can. The first contest is scheduled for July 9th and 10th. Please follow them on Twitter to stay updated.

The judging panel also includes Bob Ippolito, Co-Founder/CTO of Mochi Media, Inc.

For more information on the event please see our events page.

April 13, 2011 03:02 PM

Erlang Inside

Memory Models in Erlang vs Java

A view of Erlang (focused on the memory model) from a “Java Code Geek” – http://www.javacodegeeks.com/2011/04/erlang-vs-java-memory-architecture.html

by Chad DePue at April 13, 2011 11:58 AM

April 11, 2011

Programming in the 21st Century

Caught-Up with 20 Years of UI Criticism

Interaction designers have leveled some harsh criticisms at the GUI status-quo over the last 20+ years. The mouse is an inefficient input device. The desktop metaphor is awkward and misguided. Users shouldn't be exposed to low-level details like the raw file-system and having to save their work.

And they were right.

But instead of better human/computer interaction, we got faster processors and hotter processors and multiple processors and entire processors devoted to 3D graphics. None of which are bad, mind you, but it was always odd to see such tremendous advances in hardware while the researchers promoting more pleasant user experiences wrote books that were eagerly read by people who enjoyed smirking at the wrong-headedness of an entire industry--yet who weren't motivated enough to do anything about it. Or so it seemed.

It's miraculous that in 2011, the biggest selling computers are mouse-free, run programs that take over the entire screen without the noise of a faux-desktop, and the entire concept of "saving" has been rendered obsolete.

Clearly, someone listened.

(If you liked this, you might enjoy Free Your Technical Aesthetic from the 1970s.)

by James Hague at April 11, 2011 06:00 AM

April 10, 2011

Functional Jobs

Programmer - Lisp and other languages at Streamtech bv (Full-time)

Streamtech is a small Dutch company specialized in the development of high quality custom (web) applications. We are looking for programmers who like to write pragmatic but elegant code.

who we're looking for

Elegant code makes you happy.

To you, programming is not just a job but a hobby. You've been programming for fun for years.

You're not blind to practical demands, but you relish doing things the right way. When something seems to work but you're not sure how or why, you bang your head against it until you understand.

You like the net.

You love exploring interesting new languages, concepts and approaches.

Having played with Lisp, Haskell, Erlang, Prolog, Smalltalk, or other non-mainstream languages gets you bonus points. The same goes for language/compiler implementation, kernel hacking, cryptography, etc. Impress us with your need to learn cool stuff.

Because of the unfortunately strict immigration laws, already having permission to work in the Netherlands is also a plus.

the job

You'll be working on the development of our diverse web based projects – in streaming video, lawful interception, narrowcasting, and many other topics – together with the rest of the programming team, from our office in The Hague.

Together with the rest of the team, you'll make applications that are technically elegant and pleasant to work with, using whatever language and technologies suit the project best.

You will be involved in projects from A to Z – including finding out what exactly the customer needs, choosing the best approach for the job, and then making it so.

what we use

In a perfect world, we'd be Lisping all day long. In the real world, the needs of our clients don't always allow this, so there's also quite a lot of Python, and even some C and PHP.

We don't expect you to have experience with all the languages we use, but we do expect you to be able to learn new things and enjoy doing it.

what we offer

A challenging, fun job in a small but professional team that does not think Java, MVC and SOAP are the answer to every question.

April 10, 2011 08:42 PM

REALITY.SYS

Aristocratic Farce

When it comes to publicity, Guttenberg is actually very consistent: whenever there was a chance to strike a pose, the cameras were welcome (the common people even had to endure a talk show broadcast from Afghanistan). Later, however, when it came to answering for his own mistakes, he censored for all he was worth. First he tried to give his initial statement on the plagiarism allegations to “selected” members of the press while a federal press conference was being held at the same time. Then he banned the live broadcast of his resignation from the Federal Ministry of Defence, so that the common people could hear the “live voice” of the former Dr./minister only through the relayed mobile phone of an NTV reporter. And now we are witnessing yet another attempt at censorship. Typically aristocratic, but certainly not “noble”, behaviour: always strike pretty poses, but deliver hardly any real work. That such “illustrious” circles think little of diligence, toil and honesty, yet drum those same values into their voters, does not need to be discussed at length here. BUT: that may have worked in centuries past. In this day and age, as even a Mr KTG has now realised, a mere pose is no longer enough to thrill the masses. Not even the royalty-obsessed British allow their blue-blooded posers to be anything more than yellow-press trainees! Mr KTG is just a typical example of the superficiality of this country's former rulers, who apparently still dream of steering the people through mere posing and “good” manners. For most of Germany, however, those times are long gone. The feudal matrix no longer works. KTG, you're out!

by admin at April 10, 2011 05:03 PM

April 07, 2011

Process-one Blogs

Sea Beyond 2011 Talk 7: Jukka Alakontiola on Nokia Push Notifications

The Nokia Push Notification system was developed by Nokia to send notifications directly to mobile devices.

Jukka Alakontiola presents the architecture of Nokia push and speaks about mobile-related XMPP optimisations.

Do not miss this video, it features lots of great insights on mobile realtime services.

Here is the video of his presentation:

You can see the slides here:

Do not miss the event summary and the other videos from the Sea Beyond event.

by Mickaël Rémond at April 07, 2011 12:53 PM

April 06, 2011

Erlang Training and Consulting
- Jobs

Erlang Developer, Mountain View, CA, USA

Erlang Developer (Mountain View, CA)
As a software engineer with experience in Erlang and call processing, you would be responsible for evolving the key call processing and event distribution systems at the heart of the platform. Your responsibilities will include designing and developing new features and capabilities in the Erlang based components.

Key Responsibilities
    •    Design and develop key features and capabilities in the Erlang based systems
    •    Maintain and evolve the call processing and event distribution components

Requirements
    •    Bachelor's degree or 5+ years of comparable experience in architecture and development using Erlang and OTP
    •    Experience in eunit and quickcheck
    •    Experience in developing call processing software, telephony systems
    •    Experience working in an agile environment (Scrum, Kanban or both)

Desired Skills
    •    Experience in Python, Perl, Java, Javascript
    •    Knowledge of telephony protocols (SIP, RTP, RTMP, etc.)
    •    Experience with Jira, Perforce
    •    Experience with AGILE development
    •    Experience with Test Driven Development
    •    Must be team-oriented, possess a positive attitude and work well with others
    •    Must be flexible and able to work accurately in a fast-paced environment
    •    Ability to work independently and deliver on schedule with little supervision
    •    Ability to quickly understand and articulate interactions in a complex technical environment
    •    Able to plan and execute own tasks in a timely manner
    •    Passionate about software development, willing to learn new technology, self-motivated with high technical competency

 

Please send your CV and a cover letter when applying.

April 06, 2011 02:00 PM

March 30, 2011

Scattered Thoughts

telehash: buckets

The other half of the routing table is the buckets which store node addresses.

Usual disclaimer: none of this is properly tested yet.

The Kademlia paper has much to say on the issue of routing, most of it contradictory. My takeaway from many readings and from browsing the source code of various different implementations is that the following points are the most important:

  • each bucket should contain at most K nodes

  • we should only ever report node addresses which we have personally confirmed exist

  • responsive nodes should never be removed from buckets

  • nodes should never be removed from buckets unless a suitable replacement exists

The first three points make the routing table very resistant to flooding and spoofing. In particular, they prevent a common attack for p2p networks where some bad guy floods the routing tables of all the other nodes so that all traffic is routed through nodes controlled by the bad guy. The last point prevents nodes from flushing their routing tables if their own network connection goes down.

I think the implementation I have come up with is fairly clean, if a little lengthy. Like the bit_tree I want the bucket to be completely pure. All side effects will be handled by the router itself. The main data structures are explained pretty well by the comments:

-define(K, ?DIAL_DEPTH).

-record(node, {
          address, % node #address{} record
          'end', % node end
          suffix, % the remaining bits of the nodes end left over from the bit_tree
          status, % one of [live, stale, cache]
          last_seen % for live/stale nodes, the time of the last received message. for cache nodes the time of the last .see reference to the node
         }).

-record(bucket, {
          nodes, % gb_tree mapping addresses to {Status, Last_seen}
          % remaining fields are pq's of nodes sorted by their last_seen field
          live, % nodes currently expected to be alive
          stale, % nodes which have not replied recently
          cache % potential nodes which we have not yet verified
         }). % invariant: pq_maps:size(live) + pq_maps:size(stale) <= ?K

The bucket is a two-stage data structure. This allows us to keep nodes of different statuses sorted by the last_seen time while still being able to get/delete nodes knowing just the address. The get_node function should make it clear how this works:

get_node(Address,
         #bucket{nodes=Nodes, live=Live, stale=Stale, cache=Cache}) ->
    case gb_trees:lookup(Address, Nodes) of
        {value, {Status, Last_seen}} ->
            case Status of
                live ->
                    {ok, pq_maps:get({Last_seen, Address}, Live)};
                stale ->
                    {ok, pq_maps:get({Last_seen, Address}, Stale)};
                cache ->
                    {ok, pq_maps:get({Last_seen, Address}, Cache)}
            end;
        none ->
            none
    end.

This is only long because records are purely a compile-time structure, i.e. we can't write Bucket#bucket.Status, so we have to pattern match on Status instead. We also define add_node/2, del_node/2 and update_node/2, which look pretty similar, as well as to_list/1, from_list/1 and sizes/1.
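To make the two-stage idea concrete, here is a minimal, self-contained sketch. It is not the post's code: it swaps the pq_maps module (whose API is not shown here) for plain gb_trees, and it uses maps rather than records so that the per-status queue can be picked by key instead of by pattern matching. Maps did not exist in the Erlang releases of the time, and the module and function names are hypothetical.

```erlang
-module(two_stage_sketch).
-export([new/0, add/2, lookup/2, del/2, oldest/2]).

%% Stage 1: index maps Address -> {Status, LastSeen}.
%% Stage 2: one gb_tree per status, keyed by {LastSeen, Address},
%% so each status stays sorted by last-seen time.
new() ->
    #{index => gb_trees:empty(),
      live  => gb_trees:empty(),
      stale => gb_trees:empty(),
      cache => gb_trees:empty()}.

%% Crashes if Address is already present (gb_trees:insert/3 requires a new key).
add(#{address := Address, status := Status, last_seen := LastSeen} = Node,
    #{index := Index} = Bucket) ->
    Queue = maps:get(Status, Bucket),
    Bucket#{index := gb_trees:insert(Address, {Status, LastSeen}, Index),
            Status := gb_trees:insert({LastSeen, Address}, Node, Queue)}.

%% Find a node knowing only its address.
lookup(Address, #{index := Index} = Bucket) ->
    case gb_trees:lookup(Address, Index) of
        {value, {Status, LastSeen}} ->
            {ok, gb_trees:get({LastSeen, Address}, maps:get(Status, Bucket))};
        none ->
            none
    end.

%% Remove a node from both stages; crashes if the address is unknown.
del(Address, #{index := Index} = Bucket) ->
    {Status, LastSeen} = gb_trees:get(Address, Index),
    Queue = maps:get(Status, Bucket),
    Bucket#{index := gb_trees:delete(Address, Index),
            Status := gb_trees:delete({LastSeen, Address}, Queue)}.

%% Least recently seen node of the given status, if any.
oldest(Status, Bucket) ->
    Queue = maps:get(Status, Bucket),
    case gb_trees:is_empty(Queue) of
        true -> none;
        false ->
            {_Key, Node} = gb_trees:smallest(Queue),
            {ok, Node}
    end.
```

The index costs one extra tree update per operation, but it means a timeout or a reply, which only carries an address, can find the right queue entry without scanning all three queues.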

The router is going to react to various events by calling the appropriate bucket functions and possibly sending out messages based on the result. The first event it has to handle is a node becoming unresponsive. The bucket will mark this node as stale and return a cache node which the router can attempt to verify.

% this address failed to reply in a timely manner
timedout(Address, Bucket) ->
    log:info([?MODULE, timing_out, Address, Bucket]),
    case get_node(Address, Bucket) of
        {ok, Node} ->
            case Node#node.status of
                live ->
                    % mark as stale, return a cache node that might be a suitable replacement
                    Bucket2 = update_node(Node#node{status=stale}, Bucket),
                    pop_cache_hi(Bucket2);
                _ ->
                    % if cache or stale already we don't care
                    ok(Bucket)
            end;
        none ->
            % wtf? we don't even know this node?
            % one way this could happen:
            % send N1, send N1, timedout N1, add N2 (pushing N1 out of stale), timedout N1
            log:warning([?MODULE, unknown_node_timedout, Address, Bucket]),
            ok(Bucket)
    end.

% return most recently seen cache node, if any exist
pop_cache_hi(#bucket{cache=Cache}=Bucket) ->
    case pq_maps:pop_hi(Cache) of
        {_Key, Node, Cache2} ->
            {node, Node, ok(Bucket#bucket{cache=Cache2})};
        false ->
            ok(Bucket)
    end.

The next event is receiving a .see command. This may be as a result of a +end sent by the router but is more likely to be part of a dialing process happening elsewhere. The beauty of Kademlia is that the router can populate the routing table just by listening in on dialing attempts.

For each node listed in the .see command the router will call seen. This adds the node to the cache and returns the least recently seen live node so the router can check that it is still responsive.

% this address has been reported to exist by another node
seen(Address, Time, Suffix, Bucket) ->
    log:info([?MODULE, seeing, Address, Bucket]),
    case get_node(Address, Bucket) of
        {ok, Node} ->
            case Node#node.status of
                cache ->
                    % for cache nodes being in a .see is good enough
                    ok(update_node(Node#node{last_seen=Time}, Bucket));
                _ ->
                    % for live/stale nodes we require direct contact so ignore this
                    ok(Bucket)
            end;
        none ->
            % put node in cache, return a live node to ping
            Node = #node{
              address = Address,
              'end' = util:to_end(Address),
              suffix = Suffix,
              status = cache,
              last_seen = Time
             },
            Bucket2 = add_node(Node, Bucket),
            case peek_live_lo(Bucket) of
                none -> ok(Bucket2);
                {ok, Live_node} -> {node, Live_node, ok(Bucket2)}
            end
    end.

% return the oldest live node
peek_live_lo(#bucket{live=Live}) ->
    case pq_maps:peek_lo(Live) of
        none -> none;
        {_, Node} -> {ok, Node}
    end.

Any time we receive a message we learn that the node sending it exists (or not - we'll deal with address spoofing in a later post) so we can potentially mark it as a live node. The touched function checks if the node is already in the bucket or if it needs to be added.

% this address has been verified as actually existing
touched(Address, Suffix, Time, Bucket, May_split) ->
    log:info([?MODULE, touching, Address, Bucket]),
    case get_node(Address, Bucket) of
        {ok, Node} ->
            case Node#node.status of
                live ->
                    % update last_seen time
                    ok(update_node(Node#node{last_seen=Time}, Bucket));
                stale ->
                    % update last_seen time and promote to live
                    ok(update_node(Node#node{last_seen=Time, status=live}, Bucket));
                cache ->
                    % potentially promote the node to live
                    Bucket2 = del_node(Node, Bucket),
                    new_node(Address, Suffix, Time, Bucket2, May_split)
            end;
        none ->
            % potentially add the node to live
            new_node(Address, Suffix, Time, Bucket, May_split)
    end.

If the node needs to be added then touched calls new_node, which decides if there is space in the bucket and, if so, adds the new node. If the bucket is full and May_split is true then new_node will split the bucket before adding the new node. Deciding whether or not splitting is allowed is the router's job.

% assumes Address is not already in Bucket, otherwise crashes
new_node(Address, Suffix, Time, Bucket, May_split) ->
    Node = #node{
      address = Address,
      'end' = util:to_end(Address),
      suffix = Suffix,
      status = undefined,
      last_seen = Time
     },
    {Lives, Stales, _} = sizes(Bucket),
    if
        Lives + Stales < ?K ->
            % space left in live
            log:info([?MODULE, adding, Node, Bucket]),
            ok(add_node(Node#node{status=live}, Bucket));
        (Lives < ?K) and (Stales > 0) ->
            % space left in live if we push something out of stale
            log:info([?MODULE, adding, Node, Bucket]),
            Bucket2 = drop_stale(Bucket),
            ok(add_node(Node#node{status=live}, Bucket2));
        May_split and (Suffix /= []) ->
            % allowed to split the bucket to make space
            log:info([?MODULE, splitting, Node, Bucket]),
            {split, BucketF, BucketT} = split(Bucket),
            [Bit | Suffix2] = Suffix,
            case Bit of
                false ->
                    BucketF2 = new_node(Address, Suffix2, Time, BucketF, May_split),
                    {split, BucketF2, BucketT};
                true ->
                    BucketT2 = new_node(Address, Suffix2, Time, BucketT, May_split),
                    {split, BucketF, BucketT2}
            end;
        true ->
            % not allowed to split, will have to go in the cache
            log:info([?MODULE, caching, Node, Bucket]),
            ok(add_node(Node#node{status=cache}, Bucket))
    end.

% drop the oldest stale node, crashes if none exist
drop_stale(#bucket{stale=Stale}=Bucket) ->
    {_Key, _Node, Stale2} = pq_maps:pop_one_hi(Stale),
    Bucket#bucket{stale=Stale2}.

split(Bucket) ->
    Nodes = to_list(Bucket),
    NodesF = [Node#node{suffix=Suffix2} || #node{suffix=[false|Suffix2]}=Node <- Nodes],
    NodesT = [Node#node{suffix=Suffix2} || #node{suffix=[true|Suffix2]}=Node <- Nodes],
    {split, from_list(NodesF), from_list(NodesT)}.

Finally, upon receiving a +end signal the router needs to reply with a .see command listing the K nearest nodes to the specified end. This will be done using a combination of bit_tree:iter and bucket:nearest.

nearest(N, End, #bucket{live=Live, stale=Stale}) ->
    Nodes = pq_maps:to_list(Live) ++ pq_maps:to_list(Stale),
    Num_nodes = pq_maps:size(Live) + pq_maps:size(Stale),
    if
        Num_nodes =< N ->
            [Node#node.address || {_Key, Node} <- Nodes];
        true ->
            % !!! maybe should prefer to return live nodes even if further away
            Nodes_by_dist = [{util:distance(End, Node#node.'end'), Node} || {_Key, Node} <- Nodes],
            {Closest, _} = lists:split(N, lists:sort(Nodes_by_dist)),
            [Node#node.address || {_Dist, Node} <- Closest]
    end.
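The selection step in nearest can be sketched on its own. The version below is an illustration under stated assumptions, not the post's code: ends are represented as plain integers, the distance is the Kademlia-style XOR metric (the real util:distance is not shown here and may differ), and the module name is hypothetical. It uses lists:sublist/2, which returns at most N elements, so it does not crash when fewer than N candidates exist.

```erlang
-module(nearest_sketch).
-export([distance/2, nearest/3]).

%% Assumed metric: XOR of the two ends, compared as integers.
distance(A, B) ->
    A bxor B.

%% Return up to N ends from Ends, closest to End first.
nearest(N, End, Ends) ->
    Sorted = lists:sort([{distance(End, E), E} || E <- Ends]),
    [E || {_Dist, E} <- lists:sublist(Sorted, N)].
```

For example, with End = 2#1000 and candidates 2#1001, 2#0000 and 2#1100, the distances are 1, 8 and 4, so asking for two nodes yields 2#1001 followed by 2#1100.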

As usual all the code is sitting in the repo.

March 30, 2011 07:44 PM

March 29, 2011

Process-one Blogs

Sea Beyond 2011 Talk 6: Diana Cheng on OneSocialWeb

Diana Cheng, from Vodafone, introduces the OneSocialWeb project.

Diana Cheng presents the OneSocialWeb distributed social web initiative, primarily based on the XMPP and Activitystreams standards.

Here is the video of her presentation:

You can see the slides here:

Do not miss the event summary and the other videos from the Sea Beyond event.

by Mickaël Rémond at March 29, 2011 11:45 AM

Erlang Factory News

Erlang Factory SF Bay Area 2011 - A Great Success

On the 24th - 25th March 2011, over 170 people gathered in the SF Bay Area for the 3rd Erlang Factory. With two keynotes this year, delivered by Kostis Sagonas and Dan Ingalls, 10 extra talks compared to last year and a record-breaking number of delegates, this year's conference was nothing short of a tremendous success.

We would like to thank everyone who attended. Keep an eye out for the videos, which will be released soon!

March 29, 2011 11:34 AM

March 28, 2011

Trapexit's Erlang Blog Filter

London Erlang User Group Slides

While attending QCon recently I also spoke at the London Erlang User Group meeting. I gave a quick talk about Erlang Native Implemented Functions (NIFs) as used in my little erlsha2 implementation. Slides are here (pdf).

by steve at March 28, 2011 06:00 PM

March 26, 2011

Programming in the 21st Century

If You're Not Gonna Use It, Why Are You Building It?

Just about every image editing or photo editing program I've tried has a big collection of visual filters. There's one to make an image look like a mosaic, one to make it look like watercolors, and so on. Except for a few of the most fundamental image adjustments, like saturation and sharpness, I never use any of them.

I have this suspicion that the programmers of these tools got hold of some image processing textbooks and implemented everything in them. If an algorithm had any tweakable parameters, then those were exposed to the user as sliders.

Honestly, that sounds like something I might have done in the past. The process of implementing those filters is purely technical--almost mechanical--yet it makes the feature list longer and more impressive. And they could be fun to code up. But no consideration is given to whether those filters have any practical value.

Contrast this with apps like Instagram and Hipstamatic. Those programs use your phone's camera to grab images, then apply built-in filters to them. They're fully automatic; you can't make any manual adjustments. And yet unlike all of those filter-laden photo editors I've used in the past, I'm completely hooked on Hipstamatic. It rekindled my interest in photography, and I can't thank the authors enough.

What's the difference between those apps and old-fashioned photo editors?

The Hipstamatic and Instagram filters were designed with clear goals in mind: to emulate certain retro-camera aesthetics, to serve as starting points and inspirations for photographs. Or more succinctly: they were built to be used.

If you find yourself creating something, and you don't understand how it will be used, and you don't plan on using it yourself, then it's time to take a few steps back and reevaluate what you're doing.

(If you liked this, you might like Advice to Aimless, Excited Programmers.)

by James Hague at March 26, 2011 06:00 AM

RJ's Blog

Erlang rebar tutorial: generating releases and upgrades

During my experiments with rebar, I made a simple example app for testing upgrades and releases. This article will walk you through using rebar to create an application, lay it out properly, package and deploy it, and create and install new versions without downtime.

The code accompanying this article is in various branches of github.com/RJ/erlang_rebar_example_project.

N.B. The OTP Design Principles docs are a good place to start if you want an overview of the OTP approach to Erlang apps and releases. However, rebar isn’t (yet) part of OTP, so consider that background reading. Rebar makes things much easier.

Creating the project

Build rebar:

$ cd ~/src
$ git clone https://github.com/basho/rebar.git
Initialized empty Git repository in /tmp/rebar/.git/
remote: Counting objects: 2651, done.
remote: Compressing objects: 100% (1344/1344), done.
remote: Total 2651 (delta 1540), reused 2227 (delta 1174)
Receiving objects: 100% (2651/2651), 622.99 KiB | 495 KiB/s, done.
Resolving deltas: 100% (1540/1540), done.
$ cd rebar && make
...snip....
==> rebar (compile)
Congratulations! You now have a self-contained script called "rebar" in
your current working directory. Place this script anywhere in your path
and you can use rebar to build OTP-compliant apps.

Now we’ll make a project directory called “dummy_proj”, copy rebar into it, and use rebar to generate a skeleton application:

$ mkdir -p ~/src/dummy_proj/apps
$ cd ~/src/dummy_proj/
$ cp ../rebar/rebar .
$ cd apps
$ ../rebar create-app appid=dummy_proj
==> dummy_proj (create-app)
Writing src/dummy_proj.app.src
Writing src/dummy_proj_app.erl
Writing src/dummy_proj_sup.erl

To the skeleton, I added a basic gen_server called dummy_proj_server, which just keeps track of the number of times it was poked, i.e. it holds some state, for demonstration purposes.
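The actual server lives in the accompanying repository. As a rough idea of its shape, here is a minimal sketch of a gen_server that counts pokes. The function names poke/0 and num_pokes/0 and their return values are taken from the shell sessions in this article; the state record and everything else are assumptions.

```erlang
-module(dummy_proj_server).
-behaviour(gen_server).

%% API
-export([start_link/0, poke/0, num_pokes/0]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

%% Assumed state layout: a single counter.
-record(state, {num_pokes = 0}).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Increment the counter and return the new value as {ok, N}.
poke() ->
    gen_server:call(?MODULE, poke).

%% Return the current counter value.
num_pokes() ->
    gen_server:call(?MODULE, num_pokes).

init([]) ->
    {ok, #state{}}.

handle_call(poke, _From, #state{num_pokes = N} = State) ->
    {reply, {ok, N + 1}, State#state{num_pokes = N + 1}};
handle_call(num_pokes, _From, #state{num_pokes = N} = State) ->
    {reply, N, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, _State) ->
    ok.

%% No state change between versions yet, so pass the state through.
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

Keeping the counter inside the process state is what makes the later upgrade demonstration meaningful: if the count survives an upgrade, the process was not restarted.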

I also renamed dummy_proj_app.erl to just dummy_proj.erl, and added a start/0 function, which is useful when starting the application during development, when not running from a generated release.

Compiling with rebar

You need a rebar.config; place this in the top-level project directory:

{sub_dirs, [
            "apps/dummy_proj",
            "rel"
           ]}.
{erl_opts, [debug_info, fail_on_warning]}.

{require_otp_vsn, "R14"}.

And now to compile, you do:

$ ./rebar compile
==> dummy_proj (compile)
Compiled src/dummy_proj_sup.erl
Compiled src/dummy_proj.erl
Compiled src/dummy_proj_server.erl
==> rel (compile)
==> dummy_proj (compile)

Note that you now have .beam files in apps/dummy_proj/ebin/, and that rebar generated apps/dummy_proj/ebin/dummy_proj.app from the .app.src for you, with a complete modules list.

N.B. I made a simple Makefile that calls ‘rebar compile’, because I’m too used to typing make. Find it in the git repo.

Running your app (development)

Here’s how you can start the application (and sasl, for nice error reporting):

$ erl -pa apps/*/ebin -boot start_sasl -s dummy_proj
...snip...
=INFO REPORT==== 16-Mar-2011::14:17:04 ===
Starting dummy_proj application...

=PROGRESS REPORT==== 16-Mar-2011::14:17:04 ===
          supervisor: {local,dummy_proj_sup}
             started: [{pid,<0.45.0>},
                       {name,dummy_proj_server},
                       {mfargs,{dummy_proj_server,start_link,[]} },
                       {restart_type,permanent},
                       {shutdown,5000},
                       {child_type,worker}]

=PROGRESS REPORT==== 16-Mar-2011::14:17:04 ===
         application: dummy_proj
          started_at: nonode@nohost
Eshell V5.8.1  (abort with ^G)
1> dummy_proj_server:num_pokes().
0
2> dummy_proj_server:poke().     
{ok,1}
3> dummy_proj_server:poke().
{ok,2}
4> dummy_proj_server:num_pokes().
2
5>  

Now you have a nice sensibly structured Erlang project that you can compile with rebar. Exit the VM with q(). and let’s use rebar to package it up, so you can deploy it on a production box.

Generating your first release

When you generate a release with rebar, and indeed if you use the erlang tools manually (not recommended, just use rebar), you end up with the whole Erlang VM and required libraries packaged up under one directory.

This means you have a self-contained environment containing Erlang, the OTP libraries you need, and all your application code and dependencies. You can just tar it up, ship it over to another machine (of the same architecture, eg GNU/Linux 64-bit), and run it there.

Creating a node config

Use rebar to create a default node configuration in a rel subdirectory:

$ mkdir rel
$ cd rel/
$ ../rebar create-node nodeid=dummynode
==> rel (create-node)
Writing reltool.config
Writing files/erl
Writing files/nodetool
Writing files/dummynode
Writing files/app.config
Writing files/vm.args

You need to edit reltool.config a little: point it to your apps directory, and make sure the version number matches your .app.src file. You should also add dummy_proj to the list of applications that are started as part of the release. Here’s reltool.config from my v1 tag.
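As a rough guide, the relevant part of an edited reltool.config might look like the excerpt below. This is a sketch, assuming the application is named dummy_proj as elsewhere in this article and that the generated template has the usual sys/rel layout; the file in the repository's v1 tag is the authoritative version.

```erlang
%% Excerpt only: entries not shown are left as generated by create-node.
{sys, [
       %% Assumption: apps live in ../apps relative to rel/
       {lib_dirs, ["../apps"]},
       %% Release name and version; the version should match the .app.src
       {rel, "dummynode", "1",
        [
         kernel,
         stdlib,
         sasl,
         dummy_proj
        ]},
       {boot_rel, "dummynode"},
       {app, dummy_proj, [{incl_cond, include}]}
      ]}.
```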

Generating the release

Back in the top level directory, just run:

$ ./rebar generate
==> rel (generate)

Now have a look in rel/dummynode. This is the release directory containing everything you need to run your application.

We are going to be creating more releases later, so rename rel/dummynode to rel/dummynode_first, and then launch it using the handy script that rebar created for us:

$ cd rel/dummynode_first
$ ./bin/dummynode console
...snip...
Erlang R14B (erts-5.8.1) [source] [64-bit] [smp:8:8] [rq:8] [async-threads:5] [hipe] [kernel-poll:true]
=INFO REPORT==== 16-Mar-2011::13:29:59 ===
Starting dummy_proj application...
Eshell V5.8.1  (abort with ^G)
(dummynode@127.0.0.1)1> 
(dummynode@127.0.0.1)1> dummy_proj_server:num_pokes().
0
(dummynode@127.0.0.1)2> dummy_proj_server:poke().     
{ok,1}
(dummynode@127.0.0.1)3> dummy_proj_server:poke().
{ok,2}
(dummynode@127.0.0.1)4> dummy_proj_server:num_pokes().
2
(dummynode@127.0.0.1)5> 

Now that the release is running, we never want to have to restart it again, so open up another console and leave this one running whilst we work on version 2.

N.B. In a production environment, you would start with “./bin/dummynode start” so it runs in the background, and use “dummynode attach” to get a console.

Check the 'v1 branch' on github for code up to this point.

Upgrading to Version 2

Add the poke_twice() function to dummy_proj_server.

Change the version from “1” to “2”, in both apps/dummy_proj/src/dummy_proj.app.src and rel/reltool.config.
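For orientation, an .app.src with the bumped version might look roughly like this. It is a sketch: the description string is hypothetical, and the mod entry assumes the application callback module is dummy_proj, following the rename described earlier.

```erlang
{application, dummy_proj,
 [
  {description, "Dummy project"},   % hypothetical description
  {vsn, "2"},                       % was "1"; must match the rel version
  {registered, []},
  {applications, [kernel, stdlib]},
  {mod, {dummy_proj, []}},          % assumed callback module
  {env, []}
 ]}.
```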

Here’s the github diff between v1...v2

Erlang application version numbers can be any string - I tend to use a date format with letter: “20110316a”, but you can use any scheme you want. I tag releases in git with the same version as the erlang application. We’ll just use “1”, “2”, “3” here for simplicity.

N.B. If you use {vsn, git} as the version in your .app.src, rebar will get the version string from the closest git tag.

Now build the new version:

$ ./rebar compile
$ ./rebar generate

So now you have rel/dummynode, containing a full release (VM included) of version 2. If you don’t care about online upgrades, you could just kill your version 1 VM, and start version 2 from this new release directory.

Writing the .appup upgrade instructions

In order to make an upgrade, you must have a valid .appup file. This tells the erlang release_handler how to upgrade and downgrade between specific versions of your application.

Rebar has a (relatively new) command called ‘generate-appups’. I’ll show how it works, but ultimately we’ll write our .appup manually, and keep it in our project directory (in git).

$ ./rebar generate-appups previous_release=dummynode_first
==> rel (generate-appups)
Generated appup for dummy_proj
Appup generation complete
$ cat ./rel/dummynode/lib/dummy_proj-2/ebin/dummy_proj.appup
%% appup generated for dummy_proj by rebar ("2011/03/16 13:37:43")
{"2", [{"1", [{update,dummy_proj_server,{advanced,[]}}]}], [{"1", []}]}.

Get rid of the autogenerated one, and create the appup file manually, in apps/dummy_proj/ebin/dummy_proj.appup:

{"2", 
    %% Upgrade instructions from 1 to 2
    [{"1", [
        {load_module, dummy_proj_server}    
    ]}], 
    %% Downgrade instructions from 2 to 1
    [{"1",[
        {load_module, dummy_proj_server}    
    ]}]
}.

This .appup contains instructions for upgrading and downgrading between versions “2” and “1”. Typically the downgrade instructions are the reverse of the upgrade instructions. Since we just added a function to our server process, without changing any internal state, we can just use load_module instructions. The Appup Cookbook explains the various upgrade instructions in depth.

Now generate again, overwriting the previous version 2. This will just make sure the .appup is part of the release directory:

$ ./rebar generate -f

And now, create the upgrade package:

$ ./rebar generate-upgrade previous_release=dummynode_first
==> rel (generate-upgrade)
dummynode_2 upgrade package created

The generate-upgrade command will look for rel/dummynode as the current version, and rel/dummynode_first as the previous version. It should have created the upgrade .tar.gz in rel:

$ ls -lh rel/
total 15M
drwxr-xr-x 8 rj rj 4.0K 2011-03-16 13:42 dummynode
drwxr-xr-x 8 rj rj 4.0K 2011-03-16 13:29 dummynode_first
-rw-r--r-- 1 rj rj  14M 2011-03-16 13:45 dummynode_2.tar.gz
drwxr-xr-x 2 rj rj 4.0K 2011-03-16 13:11 files
-rw-r--r-- 1 rj rj  922 2011-03-16 13:36 reltool.config

Installing the upgrade package

You should still have the VM running from dummynode_first. Make sure you called poke(), so the internal state is something other than the default. This will help illustrate that the upgrade worked seamlessly.

Copy the upgrade package to the releases directory of the running release:

$ cp rel/dummynode_2.tar.gz rel/dummynode_first/releases

Now, at the Erlang console where version 1 is running, we use release_handler to check which releases are currently available, and install our new one:

(dummynode@127.0.0.1)5> release_handler:which_releases().
[{"dummynode","1",[],permanent}]
(dummynode@127.0.0.1)6> release_handler:unpack_release("dummynode_2").
{ok,"2"}
(dummynode@127.0.0.1)7> release_handler:install_release("2").
{ok,"1",[]}   
(dummynode@127.0.0.1)8> dummy_proj_server:num_pokes().
2
(dummynode@127.0.0.1)9> dummy_proj_server:poke_twice().
{ok,4}
(dummynode@127.0.0.1)10> dummy_proj_server:num_pokes(). 
4
(dummynode@127.0.0.1)11> release_handler:which_releases().
[{"dummynode","2",
  ["kernel-2.14.1","stdlib-1.17.1","dummy_proj-2",
   "sasl-2.1.9.2","compiler-4.7.1","crypto-2.0.1",
   "syntax_tools-1.6.6","edoc-0.7.6.7","et-1.4.1","gs-1.5.13",
   "hipe-3.7.7","inets-5.5","mnesia-4.4.15","observer-0.9.8.3",
   "public_key-0.8","runtime_tools-1.8.4.1","ssl-4.0.1",
   "tools-2.6.6.1","webtool-0.8.7","wx-0.98.7","xmerl-1.2.6"],
  current},
 {"dummynode","1",[],permanent}]

The upgrade worked; you can see that the num_pokes() was preserved, and that the new poke_twice() function is available.

release_handler shows our version 2 as “current”, and the original version 1 as “permanent”. This means that although version 2 is running right now, if you restart the VM, version “1” will be booted up.

If you are happy with the upgrade, make it permanent, meaning it will boot instead of version 1 if you restart the VM:

(dummynode@127.0.0.1)12> release_handler:make_permanent("2").

Check the 'v2 branch' on github for code up to this point.

Version 3 and beyond

The upgrade from v1 to v2 was simple: we just added a function without changing the internal #state{} record.

Erlang .appup files can do all sorts of clever stuff, allowing you to rewire your running applications during the upgrade process.

The Appup Cookbook details the various commands you can put in your .appup.

Let’s do an upgrade with a more complex appup - we’ll change the #state record in the dummy_proj_server process.

For version 3, we’ll track prods as well as pokes, which will require another field in the state record.

Here’s the github diff between v2...v3.

Check out the addition to .appup for this release:

{"3", 
    %% Upgrade instructions
    [{"2", [
        {update,dummy_proj_server,{advanced,[from2to3]}}
    ]}], 
    %% Downgrade instructions
    [{"2",[
        {update,dummy_proj_server,{advanced,[from3to2]}}
    ]}]
}.

This {update..} directive will result in the code_change function being called on the dummy_proj_server. The purpose of code_change is to change the State from the old (v2) format to the new (v3) format.

Although it’s not strictly necessary, I pass ‘from2to3’ as the ‘Extra’ field in the code_change call. This can be pattern matched on, and makes it clear in your code_change code exactly what version upgrade is expected.
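A code_change function along these lines could handle both directions. This is a sketch, not the repository's code: it assumes the v2 state was a record with a single num_pokes field (so an old state is the tuple {state, N}), that v3 adds a num_prods field, and that the Extra term arrives exactly as written in the .appup, i.e. as the list [from2to3] or [from3to2]. In a real project the function would live in the gen_server module itself; the standalone module name here is hypothetical.

```erlang
-module(code_change_sketch).
-export([code_change/3]).

%% Assumed v3 state: pokes plus the new prods counter.
-record(state, {num_pokes = 0, num_prods = 0}).

%% Upgrade: the old state is a bare v2 tuple, because the v2 record
%% definition is no longer available in the new code.
code_change(_OldVsn, {state, NumPokes}, [from2to3]) ->
    {ok, #state{num_pokes = NumPokes, num_prods = 0}};
%% Downgrade: OldVsn is {down, Vsn}; drop the prods counter.
code_change({down, _Vsn}, #state{num_pokes = NumPokes}, [from3to2]) ->
    {ok, {state, NumPokes}};
%% Anything else: leave the state untouched.
code_change(_OldVsn, State, _Extra) ->
    {ok, State}.
```

Matching on Extra means an unexpected upgrade path falls through to the last clause instead of silently converting a state it does not understand.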

Packaging and upgrading to v3

Move the generated release dir for v2:

$ mv rel/dummynode rel/dummynode_2

Compile and generate for v3, then create the upgrade package:

$ ./rebar compile
$ ./rebar generate
$ ./rebar generate-upgrade previous_release=dummynode_2

N.B. You need to provide the full, standalone generated release dir as the previous_release. You can’t use dummynode_first, even though that contains version 2 of your release.

As before, copy the upgrade package to the releases directory of the running release:

$ cp rel/dummynode_3.tar.gz rel/dummynode_first/releases

Now, at the Erlang console where you upgraded v1 to v2:

(dummynode@127.0.0.1)12> release_handler:unpack_release("dummynode_3").
{ok,"3"}
(dummynode@127.0.0.1)13> release_handler:install_release("3").
{ok,"2",[]}

Congratulations

Now you can deploy hot code upgrades the proper OTP way. This is ideal for complex or large upgrades that change internal state or require special upgrade hooks. Read the Appup Cookbook a few times, and test your upgrade packages in a staging environment before deploying. You can just tar up and copy the live environment to your staging box, to get an exact clone of the production system to test upgrades against.

Warning: a current issue with downgrades

To downgrade, you just install a previous release. However, there is currently a bug where release_handler chokes during downgrades to the first version, because of a discrepancy in the naming of .boot files in the release. release_handler has start.boot hardcoded, but rebar will generate appname.boot, with start.boot as a symlink. If you need to do downgrades, test this carefully before deploying; you may need to manually rename the boot file.

Cowboying out quick fixes

Writing appup files and generating releases is rather heavyweight. Here’s an overview of the process I’m using on IRCCloud at the moment:

Complex upgrades are proper releases, with .appup

No way around it; bit of a pain to create and test, but glorious when you pull off a complex upgrade with zero downtime. Releases are tagged in git with the datetime and letter version, eg: v20110324a. If I do a second release that day, v20110324b.

‘Hotfixes’ preserve my sanity

If I make a quick fix that simply requires reloading a module with no risk of instability, I do the following:

  1. Reset code to currently deployed tag, eg v20110324a
  2. Write the fix
  3. Commit as v20110324a-hotfix1
  4. Build this version of the specific module that’s changed, and copy it into the production environment, eg to: /somewhere/dummyapp/lib/dummyapp-20110324a/ebin/
  5. Reload the module, eg by using l(module_name). at the shell.

It’s of vital importance to have a repeatable process, so you know exactly which version of code (ie, the git tag) is currently running in production. If you can’t be sure, then it’s much harder to write successful upgrade code later on.

This process gives a reasonable balance between periodic ‘proper’ releases, during which any complex changes are made that rewire internal state, and quick fixes that just require a module reload.

Pitfalls to avoid

In the IRCCloud app, there’s one place that I don’t use a supervisor, but wish I had; the user process acts as a sort of supervisor for connection processes, because I needed exponential backoff / more control over restarting crashed children.

Don’t do that. release_handler isn’t aware of the child processes I spawn myself, so I can’t use the normal appup process to call code_change.

by rj@metabrew.com (Richard Jones) at March 26, 2011 12:00 AM

March 25, 2011

erlang.org RSS Feed

Mailing Lists Operating

The mailing lists at erlang.org are now back online after fixing a subtle Python "gotcha" configuration error.

Please report any posts that you feel slipped into the void, or double posts or whatnot!

March 25, 2011 06:52 PM

March 24, 2011

Scattered Thoughts

telehash: bit_trees revisited

It has been suggested that the bit_trees presented in the last post are overly complicated. Indeed, in the cold light of the morning there is absolutely no need for that zipper. Without further ado, here is the much simpler version.

% implements the tree part of kademlias k-buckets
% a bit_tree maps ends (lists of bits) to buckets
% as far as the bit_tree is concerned the buckets are completely opaque
% the bit_tree also calculates various numbers needed for splitting decisions

-module(bit_tree).

-include("conf.hrl").

-export([empty/2, update/4, iter/2]).

% a bit_tree is either a leaf or a branch
-record(leaf, {
          size, % size of bucket
          bucket % some opaque bucket of stuff
         }).
-record(branch, {
          size, % size(childF) + size(childT)
          childF, % tree containing nodes whose next bit is false
          childT % tree containing nodes whose next bit is true
         }).

% --- api ---

empty(Size, Bucket) ->
    #leaf{size=Size, bucket=Bucket}.
                
update(Fun, Bits, Self, Tree) when is_function(Fun), is_list(Bits), is_list(Self) ->
    update(Fun, Bits, {self, Self}, 0, Tree).

update(Fun, Bits, Gap, Depth, #leaf{bucket=Bucket}) ->
    Gap_size =
        case Gap of
            {gap, G} -> G;
            {self, _} -> 0
        end,
    bucket_update_to_tree(Fun(Bits, Depth, Gap_size, Bucket));
update(Fun, Bits, Self, Depth, #branch{childF=ChildF, childT=ChildT}) ->
    [Next|Bits2] = Bits,
    Self2 =
        case Self of
            {gap, _} -> Self;
            {self, [Next|Rest]} -> {self, Rest};
            {self, [false|_]} -> {gap, tree_size(ChildF)};
            {self, [true|_]} -> {gap, tree_size(ChildT)}
        end,
    Depth2 = Depth+1,
    case Next of
        true ->
            ChildT2 = update(Fun, Bits2, Self2, Depth2, ChildT),
            Size = tree_size(ChildF) + tree_size(ChildT2),
            #branch{size=Size, childF=ChildF, childT=ChildT2};
        false ->
            ChildF2 = update(Fun, Bits2, Self2, Depth2, ChildF),
            Size = tree_size(ChildF2) + tree_size(ChildT),
            #branch{size=Size, childF=ChildF2, childT=ChildT}
    end.

% iterate through buckets in ascending order of xor distance to Bits
iter(Bits, Tree) ->
    iter(Bits, Tree, fun() -> done end).
                             
iter(_Bits, #leaf{bucket=Bucket}, Iter) ->
    fun () ->
            {Bucket, Iter}
    end;
iter([Bit|Bits], #branch{childF=ChildF, childT=ChildT}, Iter) ->
    case Bit of
        true ->
            iter(Bits, ChildT, iter(Bits, ChildF, Iter));
        false ->
            iter(Bits, ChildF, iter(Bits, ChildT, Iter))
    end.

% --- internal functions ---

tree_size(#leaf{size=Size}) ->
    Size;
tree_size(#branch{size=Size}) ->
    Size.

bucket_update_to_tree({ok, Size, Bucket}) ->
    #leaf{size=Size, bucket=Bucket};
bucket_update_to_tree({split, SplitF, SplitT}) ->
    ChildF = bucket_update_to_tree(SplitF),
    ChildT = bucket_update_to_tree(SplitT),
    #branch{size=tree_size(ChildF)+tree_size(ChildT), childF=ChildF, childT=ChildT}.

% --- end ---

And the corresponding test code.

% simple buckets used for testing bit_tree

-module(test_bucket).

-include("conf.hrl").

-export([bits/1, add/3, split/1, add_to_tree/2, make_tree/1, distance/2, list_from/2]).

-define(MAX_SIZE, 3).
-define(BITS, ?END_BITS).

bits(Int) ->
    util:to_bits(<<Int:?BITS>>).

add(Suffix, Int, Bucket) ->
    split([{Suffix, Int} | Bucket]).

split(Bucket) ->
    if
        length(Bucket) > ?MAX_SIZE ->
            BucketF = [{Suffix2, Int2} || {[false | Suffix2], Int2} <- Bucket],
            BucketT = [{Suffix2, Int2} || {[true | Suffix2], Int2} <- Bucket],
            {split, split(BucketF), split(BucketT)};
        true ->
            {ok, length(Bucket), Bucket}
    end.

add_to_tree(Int, Tree) ->
    bit_tree:update(
      fun (Suffix, _Depth, _Gap_size, Bucket) ->
              add(Suffix, Int, Bucket)
      end,
      bits(Int),
      bits(Int), % dont care about gap for now
      Tree).

make_tree(Ints) ->
    Tree = bit_tree:empty(0, []),
    lists:foldl(fun add_to_tree/2, Tree, Ints).

distance(IntA, IntB) ->
    util:distance({'end', <<IntA:?BITS>>}, {'end', <<IntB:?BITS>>}).

% output *should* be in ascending order
list_from(Int, Tree) ->
    List = util:iter_to_list(bit_tree:iter(bits(Int), Tree)),
    lists:map(
      fun (Bucket) ->
              lists:sort([{distance(Int, Elem), Elem} || {_,Elem} <- Bucket])
      end,
      List).

March 24, 2011 11:08 PM

telehash: bit_trees

The next step in building a switch is managing a routing table. Actually, the next step is handling sessions via _ring/_line but I'm still mulling over the protocol so we'll skip to the routing table.

I'll add the usual 'I don't understand Kademlia and I don't test my code' disclaimer in here.

Routing in the Kademlia paper is described using what can best be called the 'mash everything together and be vague about the details' pattern. I want my switch to be a bit cleaner than that so I've split it into three modules. The first of these is the bit_tree.

The bit_tree is a suffix tree which maps ends (lists of bits) to buckets. The bit_tree neither knows nor cares what a bucket is and for now you don't either. The utility of this tree comes down to one important property: the floor of the log (base 2) of the XOR distance between two ends is the height of the smallest sub-tree which contains both of them. Got that? For example, if log(distance(EndA,EndB)) == 7.234... then the height of the smallest sub-tree containing both EndA and EndB is 7 nodes. This makes it easy to locate the nearest known nodes to a specified end, something we are supposed to do in response to a .see command.
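A small sketch can make the relationship between XOR distance and shared bits concrete. It uses plain integers as ends and assumes bits are listed most significant first; the module and function names are made up for this example and are not part of the switch code.

```erlang
-module(xor_prefix_sketch).
-export([distance/2, floor_log2/1, to_bits/2, shared_prefix/3]).

%% XOR distance between two ends given as non-negative integers.
distance(A, B) -> A bxor B.

%% floor(log2(N)) for N > 0, i.e. the index of the highest set bit.
floor_log2(1) -> 0;
floor_log2(N) when is_integer(N), N > 1 -> 1 + floor_log2(N bsr 1).

%% A Width-bit integer as a list of booleans, most significant bit first.
to_bits(Int, Width) ->
    [Bit =:= 1 || <<Bit:1>> <= <<Int:Width>>].

%% Number of leading bits that two ends have in common.
shared_prefix(A, B, Width) ->
    common(to_bits(A, Width), to_bits(B, Width)).

common([Bit | As], [Bit | Bs]) -> 1 + common(As, Bs);
common(_, _) -> 0.
```

With 8-bit ends 176 (10110000) and 160 (10100000), the distance is 16 and floor_log2(16) is 4. The two ends share their first 3 bits, which is 8 - 1 - 4: the larger the log of the distance, the nearer the root the two paths diverge.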

So here is a bog-standard binary suffix tree:

% a bit_tree is either a leaf or a branch
-record(leaf, {
          size, % size of bucket
          bucket % some opaque bucket of stuff
         }).
-record(branch, {
          size, % size(childF) + size(childT)
          childF, % tree containing nodes whose next bit is false
          childT % tree containing nodes whose next bit is true
         }).

When adding nodes to a bucket we need to keep track of certain numbers which will be used by the router to decide when to split buckets. Some of these are quite complicated so to make this easier we will work with a zipper-like structure instead of using leaf and branch directly. If you know what a zipper is the code in this post will make sense. If you don't know what a zipper is, go find out. When you come back the code in this post will make sense.

% zipper-esque structure marking a position in a bit_tree
-record(finger, {
          sizer, % a size function for buckets
          tree, % current sub-tree
          self, % the path *to* self (the nodes own end). either {down, Down_bits} or {up, Up_bits, Down_bits, Gap}
                % where Gap is the size of the largest tree containing self but not touching this finger
          depth, % the number of bits away from the root tree
          zipper % a list of {Bit, Tree} pairs marking branches NOT taken
         }).

The finger keeps track of where the node's own end is located in the tree in order to calculate something I have termed the gap: the size of the largest sub-tree containing the node's own end but not touching the finger.

The empty bit_tree is easy to define:

empty(Self, Bucket, Sizer) ->
    #finger{
       sizer = Sizer,
       tree = #leaf{size=Sizer(Bucket), bucket=Bucket},
       self = {down, Self},
       depth = 0,
       zipper = []
      }.

Moving around within the tree is a little more complicated but if you already went away and read about zippers it should feel familiar. Most of the work is in keeping track of the gap.

extend(Bits, #finger{tree=#leaf{}}=Finger) -> % must always end on a leaf
    {Bits, Finger};
extend([Next | Bits],
       #finger{
         tree = #branch{childF=ChildF, childT=ChildT},
         self = Self,
         depth = Depth,
         zipper = Zipper
        }=Finger) ->
    {Branch_taken, Branch_missed} =
        case Next of
            false -> {ChildF, ChildT};
            true -> {ChildT, ChildF}
        end,
    Self2 =
        case Self of
            {up, Up, Down, Gap} ->
                % already stepped out of gap
                {up, [not(Next)|Up], Down, Gap};
            {down, [Bit|Down]} when Bit == Next ->
                % still in the gap
                {down, Down};
            {down, [Bit|Down]} when Bit /= Next ->
                % leaving gap, check its size
                {up, [not(Next)], [Bit|Down], tree_size(Branch_missed)}
        end,
    Depth2 = Depth+1,
    Zipper2 = [{not(Next), Branch_missed} | Zipper],
    Finger2 = Finger#finger{
      tree = Branch_taken,
      self = Self2,
      depth = Depth2,
      zipper = Zipper2
     },
    extend(Bits, Finger2).

retract(0, Finger) ->
    Finger;
retract(N,
        #finger{
          tree = Tree,
          self = Self,
          depth = Depth,
          zipper = [{Last,Branch}|Zipper]
         }=Finger) when N>0 ->
    Size = tree_size(Tree) + tree_size(Branch),
    Tree2 =
        case Last of
            false -> #branch{size=Size, childF=Branch, childT=Tree};
            true -> #branch{size=Size, childF=Tree, childT=Branch}
        end,
    Self2 =
        case Self of
            {down, Down} ->
                % already in gap
                {down, [Last|Down]};
            {up, [], Down, _Gap} ->
                % just entered gap
                {down, [Last|Down]};
            {up, [Bit|Up], Down, Gap} ->
                % still outside gap
                true = (Bit==Last), % assert
                {up, Up, Down, Gap}
        end,
    Depth2 = Depth-1,
    Finger2 =
        Finger#finger{
          tree=Tree2,
          self=Self2,
          depth=Depth2,
          zipper=Zipper
         },
    retract(N-1, Finger2).

The extend and retract functions are only used internally. We export a much simpler function, move_to, which moves the finger to point at the bucket corresponding to the specified end.

move_to(Bits, #finger{depth=Depth}=Finger) when length(Bits) == ?END_BITS ->
    % !!! naive version
    extend(Bits, retract(Depth, Finger)).

We could make this more efficient by only retracting until the finger meets Bits partway up. For now I don't expect performance of the bit_tree to be an issue.

Now that we can find buckets we can modify them. Deciding when to split buckets is not the concern of the bit_tree so we delegate it to the caller.

update(Fun,
       #finger{
         sizer=Sizer,
         tree=#leaf{bucket=Bucket}
        }=Finger) ->
    Tree = bucket_update_to_tree(Sizer, Fun(Bucket)),
    Finger#finger{tree=Tree}.

bucket_update_to_tree(Sizer, {ok, Bucket}) ->
    #leaf{size=Sizer(Bucket), bucket=Bucket};
bucket_update_to_tree(Sizer, {split, SplitF, SplitT}) ->
    ChildF = bucket_update_to_tree(Sizer, SplitF),
    ChildT = bucket_update_to_tree(Sizer, SplitT),
    #branch{size=tree_size(ChildF)+tree_size(ChildT), childF=ChildF, childT=ChildT}.

To handle .see commands, the iter function returns buckets in ascending order of distance from the specified end. It relies on the property of the bit_tree described earlier to do this efficiently.

% iterate through buckets in ascending order of xor distance to (current position ++ Suffix)
iter(Suffix, #finger{tree=Tree, zipper=Zipper}) ->
    iter_buckets(Tree, Suffix, iter_zipper(Zipper, Suffix)).

% iterate through buckets in ascending order of xor distance to (current position ++ Suffix)
iter_zipper([], _Suffix) ->
    fun () ->
            done
    end;
iter_zipper([{Bit, Tree} | Zipper], Suffix) ->
    iter_buckets(Tree, Suffix, iter_zipper(Zipper, [not(Bit)|Suffix])).

% iterate through buckets in ascending order of xor distance to Bits, then hand over to Iter
iter_buckets(#leaf{bucket=Bucket}, _Bits, Iter) ->
    fun () ->
            {Bucket, Iter}
    end;
iter_buckets(#branch{childF=ChildF, childT=ChildT}, [Bit|Bits], Iter) ->
    case Bit of
        true ->
            iter_buckets(ChildT, Bits, iter_buckets(ChildF, Bits, Iter));
        false ->
            iter_buckets(ChildF, Bits, iter_buckets(ChildT, Bits, Iter))
    end.

It will typically be called like this:

{Suffix, Tree2} = bit_tree:move_to(util:to_bits(End), Tree),
bit_tree:iter(Suffix, Tree2)

Splitting the routing table into separate structures like this makes for easier testing. The bit_tree can be tested independently using really simple buckets where the elements are just integers and the buckets split when they reach more than three elements.

% simple buckets used for testing bit_tree

-module(test_bucket).

-include("conf.hrl").

-export([bits/1, add/3, split/1, move_to/2, add_to_tree/2, make_tree/2, distance/2, move_list_from/2, list_from/3]).

-define(MAX_SIZE, 3).
-define(BITS, ?END_BITS).

bits(Int) ->
    util:to_bits(<<Int:?BITS>>).

add(Suffix, Int, Bucket) ->
    split([{Suffix, Int} | Bucket]).

split(Bucket) ->
    if
        length(Bucket) > ?MAX_SIZE ->
            BucketF = [{Suffix2, Int2} || {[false | Suffix2], Int2} <- Bucket],
            BucketT = [{Suffix2, Int2} || {[true | Suffix2], Int2} <- Bucket],
            {split, split(BucketF), split(BucketT)};
        true ->
            {ok, Bucket}
    end.

move_to(Int, Tree) ->
    bit_tree:move_to(bits(Int), Tree).

add_to_tree(Int, Tree) ->
    {Suffix, Tree2} = move_to(Int, Tree),
    bit_tree:update(fun (Bucket) -> add(Suffix, Int, Bucket) end, Tree2).

make_tree(Int, Ints) ->
    Tree = bit_tree:empty(bits(Int), [], fun (Bucket) -> length(Bucket) end),
    lists:foldl(fun add_to_tree/2, Tree, Ints).

distance(IntA, IntB) ->
    util:distance({'end', <<IntA:?BITS>>}, {'end', <<IntB:?BITS>>}).

% output *should* be in ascending order
move_list_from(Int, Tree) ->
    {Suffix, Tree2} = bit_tree:move_to(bits(Int), Tree),
    list_from(Int, Suffix, Tree2).

list_from(Int, Suffix, Tree) ->
    List = util:iter_to_list(bit_tree:iter(Suffix, Tree)),
    lists:map(
      fun (Bucket) ->
              lists:sort([{distance(Int, Elem), Elem} || {_,Elem} <- Bucket])
      end,
      List).

We can play around with the test buckets a bit:

25> Tree = test_bucket:make_tree(47, lists:seq(1,1000)).
{finger,#Fun<test_bucket.1.121651971>,
        {leaf,1,[{[false,false,false],1000}]},
        {up,[false,true,false,false,false,false,false,true,true,
             true,true,true,true,true,true,true,true,true,true,true,true,
             true,true|...],
            [true,true,true,true,true,true,true,true,true,true,true,
             true,true,true,true,true,true,true,true,true,true,true|...],
            0},
        157,
        [{false,{branch,8,
                        {branch,4,
                                {leaf,2,[{[true],993},{[false],992}]},
                                {leaf,2,[{[true],995},{[false],994}]}},
                        {branch,4,
                                {leaf,2,[{[true],997},{[false],996}]},
                                {leaf,2,[{[true],999},{[false],998}]}}}},
         {true,{leaf,0,[]}},
         {false,{branch,32,
                        {branch,16,
                                {branch,8,
                                        {branch,4,
                                                {leaf,2,[{[true],961},{[...],...}]},
                                                {leaf,2,[{[...],...},{...}]}},
                                        {branch,4,
                                                {leaf,2,[{[...],...},{...}]},
                                                {leaf,2,[{...}|...]}}},
                                {branch,8,
                                        {branch,4,{leaf,2,[{[...],...},{...}]},{leaf,2,[{...}|...]}},
                                        {branch,4,{leaf,2,[{...}|...]},{leaf,2,[...]}}}},
                        {branch,16,
                                {branch,8,
                                        {branch,4,{leaf,2,[{[...],...},{...}]},{leaf,2,[{...}|...]}},
                                        {branch,4,{leaf,2,[{...}|...]},{leaf,2,[...]}}},
                                {branch,8,
                                        {branch,4,{leaf,2,[{...}|...]},{leaf,2,[...]}},
                                        {branch,4,{leaf,2,[...]},{leaf,2,...}}}}}},
         {false,{branch,64,
                        {branch,32,
                                {branch,16,
                                        {branch,8,
                                                {branch,4,{leaf,2,...},{leaf,...}},
                                                {branch,4,{leaf,...},{...}}},
                                        {branch,8,{branch,4,{leaf,...},{...}},{branch,4,{...},...}}},
                                {branch,16,
                                        {branch,8,{branch,4,{leaf,...},{...}},{branch,4,{...},...}},
                                        {branch,8,{branch,4,{...},...},{branch,4,...}}}},
                        {branch,32,
                                {branch,16,
                                        {branch,8,{branch,4,{leaf,...},{...}},{branch,4,{...},...}},
                                        {branch,8,{branch,4,{...},...},{branch,4,...}}},
                                {branch,16,
                                        {branch,8,{branch,4,{...},...},{branch,4,...}},
                                        {branch,8,{branch,4,...},{branch,...}}}}}},
         {false,{branch,128,
                        {branch,64,
                                {branch,32,
                                        {branch,16,
                                                {branch,8,{branch,...},{...}},
                                                {branch,8,{...},...}},
                                        {branch,16,{branch,8,{...},...},{branch,8,...}}},
                                {branch,32,
                                        {branch,16,{branch,8,{...},...},{branch,8,...}},
                                        {branch,16,{branch,8,...},{branch,...}}}},
                        {branch,64,
                                {branch,32,
                                        {branch,16,{branch,8,{...},...},{branch,8,...}},
                                        {branch,16,{branch,8,...},{branch,...}}},
                                {branch,32,
                                        {branch,16,{branch,8,...},{branch,...}},
                                        {branch,16,{branch,...},{...}}}}}},
         {false,{branch,256,
                        {branch,128,
                                {branch,64,
                                        {branch,32,{branch,16,{...},...},{branch,16,...}},
                                        {branch,32,{branch,16,...},{branch,...}}},
                                {branch,64,
                                        {branch,32,{branch,16,...},{branch,...}},
                                        {branch,32,{branch,...},{...}}}},
                        {branch,128,
                                {branch,64,
                                        {branch,32,{branch,16,...},{branch,...}},
                                        {branch,32,{branch,...},{...}}},
                                {branch,64,
                                        {branch,32,{branch,...},{...}},
                                        {branch,32,{...},...}}}}},
         {false,{branch,511,
                        {branch,255,
                                {branch,127,
                                        {branch,63,{branch,31,...},{branch,...}},
                                        {branch,64,{branch,...},{...}}},
                                {branch,128,
                                        {branch,64,{branch,...},{...}},
                                        {branch,64,{...},...}}},
                        {branch,256,
                                {branch,128,
                                        {branch,64,{branch,...},{...}},
                                        {branch,64,{...},...}},
                                {branch,128,{branch,64,{...},...},{branch,64,...}}}}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,[]}},
         {true,{leaf,0,...}},
         {true,{leaf,...}},
         {true,{...}},
         {true,...},
         {...}|...]}
26> List = test_bucket:move_list_from(657, Tree).
[[{0,657},{1,656}],
 [{2,659},{3,658}],
 [{4,661},{5,660}],
 [{6,663},{7,662}],
 [{8,665},{9,664}],
 [{10,667},{11,666}],
 [{12,669},{13,668}],
 [{14,671},{15,670}],
 [{16,641},{17,640}],
 [{18,643},{19,642}],
 [{20,645},{21,644}],
 [{22,647},{23,646}],
 [{24,649},{25,648}],
 [{26,651},{27,650}],
 [{28,653},{29,652}],
 [{30,655},{31,654}],
 [{32,689},{33,688}],
 [{34,691},{35,690}],
 [{36,693},{37,692}],
 [{38,695},{39,694}],
 [{40,697},{41,696}],
 [{42,699},{43,698}],
 [{44,701},{45,700}],
 [{46,703},{47,702}],
 [{48,673},{49,672}],
 [{50,675},{51,...}],
 [{52,...},{...}],
 [{...}|...],
 [...]|...]
27> lists:flatten(List) == lists:sort(lists:flatten(List)).
true

As usual the full code is in the repo.

March 24, 2011 07:34 PM

erlang.org RSS Feed

New Site Look

Our demo site has now been launched as the regular erlang.org. We hope you like it. Report any problems.

March 24, 2011 06:11 PM

Erlang Training and Consulting
- News

24 March 2011: Erlang Solutions partners with Travelping to provide next generation AAA Solutions to telecom operators

Erlang Solutions (ESL), the driver behind the worldwide uptake of the Erlang development environment, today announced its partnership with Travelping, a Germany based software vendor offering solutions for network operators to build, manage and charge their broadband services securely and efficiently.  Together, the two companies will offer operators highly scalable, robust and distributed, control plane solutions to build AAA and session control systems for their next generation networks (NGN).

For more information please see the press release.

March 24, 2011 05:17 PM

March 23, 2011

Trapexit's Erlang Blog Filter

Don’t Lose Your ets Tables

In my recent QCon talk I talked about accidentally crashing an Erlang process on a customer’s subscription streaming video website running live in production. The code involved had not been used in production before, and the customer had decided somewhat unexpectedly to turn on a new feature that required it. The developer who wrote it had not tested it and had long since left the company.

The purpose of the code was to monitor bandwidth and session usage for each video subscriber to make sure they weren’t streaming more than they’d paid for. Concerned about the viability of the code, a colleague and I logged into the customer site (with their permission, of course), chose a subscriber at random, and, in an Erlang shell, I interactively invoked a function in the code in question to check that subscriber’s current bandwidth and session count. After a second check, we saw the numbers dropping, potentially indicating the subscriber was logging out, and we wanted to make sure all went well when the subscriber completely stopped streaming. After waiting a bit, I interactively called the function again, and — BAM! — the process holding session state for all paying customers crashed.

The original developer had used an Erlang ets table, an in-memory data store, to hold the subscriber data, and wrote something like this for lookups:

[SubscriberData] = ets:lookup(Table, Subscriber),

My interactive call from the shell looked up a nonexistent subscriber, so the result was the empty list [] rather than [SubscriberData], which caused a pattern mismatch and a badmatch exception. Uncaught, the exception crashed the process. Since the process owned the ets table, when it went down it took the ets table and all subscriber session data with it. It wasn’t so bad, since all it meant was that for a few hours a few subscribers potentially got a bit more video than they’d paid for, but still, it’s not at all the kind of design Erlang’s “Let It Crash” philosophy actually encourages. Crashing a process when something unexpected occurs is perfectly fine, since coding defensively introduces problems of its own, but you can still avoid losing your ets tables like this relatively easily.

Name an Heir

When you create an ets table you can also name a process to inherit the table should the creating process die:

TableId = ets:new(my_table, [{heir, SomeOtherProcess, HeirData}]),

If the creating process dies, the process SomeOtherProcess will receive a message of the form

{'ETS-TRANSFER', TableId, OldOwner, HeirData}

where TableId is the table identifier returned from ets:new, OldOwner is the pid of the process that owned the table, and HeirData is the data provided with the heir option passed to ets:new. Once it receives this message, SomeOtherProcess owns the table.
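Here is a minimal, self-contained sketch of the heir mechanism. The module name, table contents and the handed_over heir data are invented for illustration; the calling process plays the role of SomeOtherProcess.

```erlang
-module(ets_heir_sketch).
-export([demo/0]).

demo() ->
    Heir = self(),
    %% The owner creates the table, names the caller as heir, then crashes.
    Owner = spawn(fun() ->
        Tab = ets:new(my_table, [{heir, Heir, handed_over}]),
        true = ets:insert(Tab, {subscriber_1, 42}),
        exit(simulated_crash)
    end),
    %% The heir is told about the transfer and now owns the table.
    receive
        {'ETS-TRANSFER', Tab, Owner, handed_over} ->
            ets:lookup(Tab, subscriber_1)
    after 1000 ->
        timeout
    end.
```

Calling ets_heir_sketch:demo() should return the row written by the crashed owner, showing that the data outlived the process that created it.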

Give It Away

Alternatively, you can create an ets table and then give it to some other process to keep it:

TableId = ets:new(my_table, []),
ets:give_away(TableId, SomeOtherProcess, GiftData),

As soon as the table is given away, the process SomeOtherProcess will receive a message of the form

{'ETS-TRANSFER', TableId, OldOwner, GiftData}

where TableId is the table identifier returned from ets:new, OldOwner is the pid of the process that owned the table, and GiftData is the data provided in the ets:give_away call. Once it receives this message, SomeOtherProcess owns the table.
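A small sketch of give_away in action follows. The names here (the module, the a_gift term, the reply message) are hypothetical. Note that in this sketch the receiver gets the 'ETS-TRANSFER' message at the time ets:give_away/3 is called, while the original owner is still alive.

```erlang
-module(ets_give_away_sketch).
-export([demo/0]).

demo() ->
    Parent = self(),
    %% The receiver waits for the table and reports what it was given.
    Receiver = spawn(fun() ->
        receive
            {'ETS-TRANSFER', GivenTab, OldOwner, GiftData} ->
                IsOwner = ets:info(GivenTab, owner) =:= self(),
                Parent ! {received, OldOwner, GiftData, IsOwner}
        end
    end),
    Tab = ets:new(my_table, []),
    true = ets:give_away(Tab, Receiver, a_gift),
    receive
        {received, From, Gift, ReceiverIsOwner} ->
            {From =:= self(), Gift, ReceiverIsOwner}
    after 1000 ->
        timeout
    end.
```

The result {true, a_gift, true} would indicate that the message named the caller as the old owner, carried the gift data, and that the receiver had become the table owner.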

Table Manager

Instead of naming an heir or giving a table away, you can just have your Erlang supervisor process create a child process whose sole task is to own the table. This process creates the table as a named public table, thus allowing other processes to know its name and read/write it directly, with ets built-in concurrency protection dealing with any concurrency issues. Since the owner process does nothing more than create the table and then wait to be told to shut down, the likelihood of it crashing and taking the table with it is practically nil. The drawback here, though, is that the process actually using the table may have to coordinate with the owner process to ensure the table is available, and worse, it ends up using what is essentially a global variable — the table name — which can make code harder to read and maintain.

A Combination Approach

A nice way of managing ets tables, though, is to use a combination of the three previous techniques:

  1. The Erlang supervisor creates a table manager process. Since all this process does is manage the table, the likelihood of it crashing is very low.
  2. The table manager links itself to the table user process and traps exits, allowing it to receive an EXIT message if the table user process dies unexpectedly.
  3. The table manager creates a table, names itself (self()) as the heir, and then gives it away to the table user process.
  4. If the table user process dies, the table manager is informed of the process death and also inherits the table back.

Once it inherits the table, the table manager can then for example wait until the supervisor recreates the table user process, and then repeat the steps above to give the table to the new table user process. Other variations on this approach, like maybe a small pool of child process clones that cooperate to transfer the table between them in case of error, are of course also possible. Even though there are still process coordination issues here (but nothing difficult), I like this approach because it avoids global named tables and takes advantage of Erlang's supervision hierarchy.
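The four steps above could be sketched roughly as follows. This uses bare processes rather than a supervisor and gen_server to stay short, and every name in it (the module, the give_to message, the from_manager and returned terms) is an assumption made for the example, not code from the talk.

```erlang
-module(table_manager_sketch).
-export([start/0, demo/0]).

%% Manager: creates the table with itself as heir, then lends it out.
start() ->
    spawn(fun() ->
        process_flag(trap_exit, true),
        Tab = ets:new(session_table, [{heir, self(), returned}]),
        idle(Tab)
    end).

%% Manager holds the table and waits for a user process to give it to.
idle(Tab) ->
    receive
        {give_to, User} ->
            link(User),
            true = ets:give_away(Tab, User, from_manager),
            lent(Tab, User);
        {'EXIT', _Pid, _Reason} ->
            idle(Tab)   % late exit notice from an earlier user
    end.

%% Table is with User; wait to inherit it back when User terminates.
lent(Tab, User) ->
    receive
        {'EXIT', User, _Reason} ->
            lent(Tab, User);
        {'ETS-TRANSFER', Tab, User, returned} ->
            idle(Tab)
    end.

%% A table user: either writes a row and crashes, or reads the row back.
user(Action, Report) ->
    receive
        {'ETS-TRANSFER', Tab, _Manager, from_manager} ->
            case Action of
                write_then_crash ->
                    true = ets:insert(Tab, {subscriber_1, 42}),
                    exit(simulated_crash);
                read ->
                    Report ! {rows, ets:lookup(Tab, subscriber_1)}
            end
    end.

demo() ->
    Me = self(),
    Manager = start(),
    User1 = spawn(fun() -> user(write_then_crash, Me) end),
    Manager ! {give_to, User1},
    %% Stands in for the supervisor restarting the table user.
    User2 = spawn(fun() -> user(read, Me) end),
    Manager ! {give_to, User2},
    Result = receive {rows, Rows} -> Rows after 1000 -> timeout end,
    exit(Manager, kill),
    Result.
```

The second give_to request simply sits in the manager's mailbox until the table has come back, so the replacement user only receives the table after the first one has died. The row written before the crash is still there when the second user reads it.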

The title of my QCon talk was "Let It Crash...Except When You Shouldn't." This scenario is an example of "when you shouldn't" — losing ets data due to a process crash is easily avoided.

by steve at March 23, 2011 03:01 PM

March 21, 2011

Damien Katz

Couchbase SF Training Was Awesome

I had a blast teaching the first Couchbase CouchDB Training with training pro Alan McKean last week: 2 intensive days of hands-on teaching and talking about Apache CouchDB to enthusiastic and excited people. It was actually a learning experience for me too; there's a lot in CouchDB I haven't had a chance to use yet :)

It's not too late to sign up for the remaining 3 cities on the Couchbase Training World Tour: Austin, London and Berlin.

by Damien Katz at March 21, 2011 10:49 PM

Scattered Thoughts

telehash: dialing

The next step in building a telehash switch is being able to dial.

First a disclaimer: this post reflects my current understanding of TeleHash and Kademlia and is highly likely to be wrong. This code has only received minimal testing. Properly testing a p2p network is not something I'm entirely sure how to do yet. Expect to see more on that in later posts.

Each TeleHash node and each key in the DHT is identified by a 160 bit sha1 hash (aka end). In the original Kademlia paper the node ids are selected at random but in TeleHash they are the hashed address (IP:port) of the node. This means that malicious nodes don't get to choose where they are inserted in the DHT.

Kademlia routing is based on the XOR distance between ends. This forms a metric space over the set of ends.

distance(A, B) ->
    {'end', EndA} = to_end(A),
    {'end', EndB} = to_end(B),
    Bytes = lists:zip(binary_to_list(EndA), binary_to_list(EndB)),
    Xor = list_to_binary([ByteA bxor ByteB || {ByteA, ByteB} <- Bytes]),
    <<Dist:?END_BITS>> = Xor,
    Dist.
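For comparison, the same distance can be computed by decoding each 20-byte end as a single 160-bit integer. This is a minimal sketch, not the code from the repo: the module name `xor_distance` is made up, `to_end/1` here takes a plain "IP:port" string and returns a bare binary rather than the `{'end', Bin}` tuple used above, and `crypto:hash/2` assumes a reasonably recent OTP release.

```erlang
-module(xor_distance).
-export([to_end/1, distance/2]).

%% Hash an "IP:port" string (or binary) to a 160-bit SHA-1 end.
to_end(Address) ->
    crypto:hash(sha, Address).

%% XOR distance between two 20-byte ends, as a non-negative integer.
distance(EndA, EndB) when byte_size(EndA) =:= 20, byte_size(EndB) =:= 20 ->
    <<A:160>> = EndA,
    <<B:160>> = EndB,
    A bxor B.
```

Because `bxor` is symmetric and any value xor itself is zero, `distance(E, E)` is 0 and `distance(A, B)` equals `distance(B, A)`, which are two of the properties a metric needs.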

The Kademlia paper defines two constants, K and A. K controls the amount of redundant storage in the DHT and A controls the number of parallel requests issued by each node. To insert a key into the DHT a node must be able to locate the K nodes whose IDs are closest to the key. This process is called dialing.

Dialing works roughly as follows. Each node keeps track of all the other nodes it has seen. Upon receiving a +end signal a node will reply with a .see command containing the K nodes it is aware of which are closest to the specified end. To dial an end we send a +end signal to each of the K closest nodes we are aware of. Then to each node contained in the .see replies we send +end signals, and so on until we run out of nodes to contact.

Now this is nice and simple and will work but it generates a huge amount of load on the network. To reduce this Kademlia introduces two additional rules. First, we only send up to A signals at a time and don't send any new signals until previous signals have either generated a reply or timed out. Second, we finish early if at any point we have received replies from K nodes which are closer to the end than all the nodes we are waiting to contact. The Kademlia paper proves that under reasonable assumptions about the knowledge of each node this still has a very high chance to return the correct results.
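To make the early-finish rule concrete, here is a small single-process simulation. It is a sketch under heavy assumptions rather than the dialer itself: node ids are small integers instead of 160-bit ends, the network is a map from each node to the nodes it knows, nodes are contacted one at a time (so the A parallel requests and the timeouts are left out), and the module name `dial_sim` is made up. Maps assume OTP 17 or later.

```erlang
-module(dial_sim).
-export([dial/4]).

%% Graph maps each node id to the ids that node knows about.
%% Returns up to K contacted nodes, closest to Target first. Assumes K >= 1.
dial(Target, Seeds, Graph, K) ->
    Fresh = sort_by_distance(Target, lists:usort(Seeds)),
    loop(Target, Fresh, [], Fresh, Graph, K).

loop(Target, Fresh, Ponged, Seen, Graph, K) ->
    Best = lists:sublist(sort_by_distance(Target, Ponged), K),
    case finished(Target, Fresh, Best, K) of
        true ->
            Best;
        false ->
            %% Contact the closest node we have not yet talked to.
            [Node | Rest] = Fresh,
            See = see(Target, Node, Graph, K),
            New = [N || N <- lists:usort(See), not lists:member(N, Seen)],
            loop(Target, sort_by_distance(Target, Rest ++ New),
                 [Node | Ponged], Seen ++ New, Graph, K)
    end.

%% Finished when nothing is left to contact, or when the K best replies
%% are all closer to the target than the closest uncontacted node.
finished(_Target, [], _Best, _K) ->
    true;
finished(_Target, _Fresh, Best, K) when length(Best) < K ->
    false;
finished(Target, [Closest | _], Best, _K) ->
    (lists:last(Best) bxor Target) < (Closest bxor Target).

%% Simulated .see reply: the K nodes known to Node that are closest to Target.
see(Target, Node, Graph, K) ->
    Known = maps:get(Node, Graph, []),
    lists:sublist(sort_by_distance(Target, Known), K).

sort_by_distance(Target, Ids) ->
    lists:sort(fun(A, B) -> (A bxor Target) =< (B bxor Target) end, Ids).
```

With the graph `#{1 => [2,3], 2 => [4], 3 => [5], 4 => [], 5 => []}`, the call `dial_sim:dial(4, [1], Graph, 2)` returns `[4,1]` without ever contacting node 3, because both replies are already closer to the target than node 3 is.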

The dialer process is an event handler which has two important data structures. The first stores the dialer configuration:

-record(conf, {
          target, % the end to dial
          timeout, % the timeout for the entire dialing process
          ref, caller % reply details
         }).

The second record stores the state of the dialing process. The principle around which the dialer is designed is that the state record is a reflection of the outside world and the sole job of the dialer is to keep this record up to date while maintaining the invariants in the comments. This is often the way that I write code and I feel that it needs its own post once I can articulate it properly. It's certainly heavily informed both by the designs in Okasaki's Purely Functional Data Structures and by Conal Elliott's ideas about denotational semantics and type class morphisms.

-record(state, {
          fresh, % nodes which have not yet been contacted
          pinged, % nodes which have been contacted and have not replied
          waiting, % nodes in pinged which were contacted less than ?DIAL_TIMEOUT ago
          ponged, % nodes which have been contacted and have replied
          seen % all nodes which have been seen
         }). % invariant: pq:length(waiting) = ?A or pq:is_empty(fresh)

The dialer module exports the dial function which creates the records and starts the event handler.

dial(To, From, Timeout) ->
    log:info([?MODULE, dialing, To, From, Timeout]),
    Ref = erlang:make_ref(),
    Target = util:to_end(To),
    Conf = #conf{
      target = Target,
      timeout = Timeout,
      ref = Ref,
      caller = self()
     },
    Nodes = [{util:distance(Address, Target), Address}
             || Address <- From],
    State = #state{
      fresh=pq:from_list(Nodes),
      pinged=sets:new(),
      waiting=pq:empty(),
      ponged=pq:empty(),
      seen=sets:new()
     },
    ok = switch:add_handler(?MODULE, {Conf, State}),
    Ref.
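The pq module itself is not shown in this post. To follow the code it is enough to think of it as a priority queue of `{Distance, Address}` pairs with the smallest distance first. The sketch below is a hypothetical stand-in built on sorted lists: the module name `pq_sketch` and every function body are assumptions, with only the function names and argument orders taken from the calls in this post.

```erlang
-module(pq_sketch).
%% We define our own length/1, so stop the BIF being auto-imported.
-compile({no_auto_import, [length/1]}).
-export([empty/0, from_list/1, is_empty/1, length/1, peek/1,
         pop/2, push/2, push_one/2, delete/2]).

%% The queue is a sorted list without duplicates, smallest item first.
empty() -> [].

from_list(Items) -> lists:usort(Items).

is_empty(Queue) -> Queue =:= [].

length(Queue) -> erlang:length(Queue).

%% Smallest item. Deliberately undefined (crashes) for an empty queue.
peek([Item | _]) -> Item.

%% Remove up to N of the smallest items; returns {Items, RestOfQueue}.
pop(Queue, N) when N =< 0 ->
    {[], Queue};
pop(Queue, N) ->
    lists:split(min(N, erlang:length(Queue)), Queue).

%% Add a list of items to the queue.
push(Items, Queue) ->
    lists:umerge(lists:usort(Items), Queue).

push_one(Item, Queue) ->
    push([Item], Queue).

delete(Item, Queue) ->
    lists:delete(Item, Queue).
```

A sorted list is the simplest structure that gives the right ordering; a heap or a gb_trees-based queue would scale better, and the real module may well behave differently on empty queues.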

The aim is to handle events and maintain the state invariants until we are finished. How do we define finished?

% is the dialing finished yet?
finished(#state{fresh=Fresh, waiting=Waiting, ponged=Ponged}) ->
    % andalso/orelse short-circuit, so later checks only run when needed
    (pq:is_empty(Fresh) andalso pq:is_empty(Waiting)) % no way to continue
    orelse
    (case pq:length(Ponged) >= ?K of
         false ->
             false; % don't yet have K nodes
         true ->
             % finish if the K closest nodes we know are closer than all the nodes we haven't checked yet
             {Dist_fresh, _} = pq:peek(Fresh),
             {Dist_waiting, _} = pq:peek(Waiting),
             {Nodes, _} = pq:pop(Ponged, ?K),
             {Dist_ponged, _} = lists:last(Nodes),
             (Dist_ponged < Dist_fresh) andalso (Dist_ponged < Dist_waiting)
     end).

One of the invariants we aim to maintain is that either the fresh queue is empty or the length of the waiting queue is A. This ensures that we send out +end signals whenever possible. This invariant is maintained by calling the ping_nodes function after every event.

% contact nodes from fresh until the waiting list is full
ping_nodes(#conf{target=Target}, #state{fresh=Fresh, waiting=Waiting, pinged=Pinged}=State) ->
    Num = ?A - pq:length(Waiting),
    {Nodes, Fresh2} = pq:pop(Fresh, Num),
    Telex = {struct, [{'+end', util:end_to_hex(Target)}]},
    lists:foreach(
      fun ({Dist, Address}=Node) ->
              log:info([?MODULE, ping, Dist, Address]),
              switch:send(Address, Telex),
              erlang:send_after(?DIAL_TIMEOUT, self(), {timeout, Node})
      end,
      Nodes),
    Waiting2 = pq:push(Nodes, Waiting),
    Pinged2 = sets:union(Pinged, sets:from_list(Nodes)),
    State#state{fresh=Fresh2, waiting=Waiting2, pinged=Pinged2}.

We handle replies by moving the replying node from the waiting queue to the ponged queue and inserting the .see nodes into the fresh list. We cannot allow duplicate nodes so the seen set is kept up to date. The pinged set will be used later to ensure that we only accept replies from nodes we have already contacted and only accept one reply per node.

% handle a reply from a node
ponged(Node, See, #state{fresh=Fresh, waiting=Waiting, pinged=Pinged, ponged=Ponged, seen=Seen}=State) ->
    Waiting2 = pq:delete(Node, Waiting),
    Pinged2 = sets:del_element(Node, Pinged),
    Ponged2 = pq:push_one(Node, Ponged),
    New_nodes = lists:filter(fun (See_node) -> not(sets:is_element(See_node, Seen)) end, See),
    Fresh2 = pq:push(New_nodes, Fresh),
    Seen2 = sets:union(Seen, sets:from_list(See)),
    State#state{fresh=Fresh2, waiting=Waiting2, pinged=Pinged2, ponged=Ponged2, seen=Seen2}.

Once we are finished we need to return the results to the caller.

% return results to the caller
return(#conf{ref=Ref, caller=Caller}, #state{ponged=Ponged}) ->
    {Nodes, _} = pq:pop(Ponged, ?K),
    log:info([?MODULE, returning, Nodes]),
    Result = [Address || {_Dist, Address} <- Nodes],
    Caller ! {dialed, Ref, Result}.
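On the calling side, the reference returned by dial can be used to pick the matching reply out of the mailbox. The sketch below shows one way a blocking wrapper could wait for it. The module name `dial_wait` and the `await/2` helper are hypothetical, and the demo fakes the dialer with a process that sends the same `{dialed, Ref, Result}` message.

```erlang
-module(dial_wait).
-export([await/2, demo/0]).

%% Block until the reply tagged with Ref arrives, or give up after Timeout ms.
await(Ref, Timeout) ->
    receive
        {dialed, Ref, Result} ->
            {ok, Result}
    after Timeout ->
        {error, timeout}
    end.

%% Stand-in for a dialer: a process that sends the reply message back.
demo() ->
    Ref = make_ref(),
    Caller = self(),
    spawn(fun() ->
        Caller ! {dialed, Ref, [{address, "208.68.163.247", 42424}]}
    end),
    await(Ref, 1000).
```

Matching on the already bound `Ref` in the receive pattern means replies for other dial requests are left in the mailbox untouched.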

Finally, after each event we call continue to decide whether to finish and return results or to carry on sending signals.

% either continue to dial or return results
% meant for use at the end of a gen_event callback
continue(Conf, State) ->
    case finished(State) of
        true ->
            return(Conf, State),
            remove_handler;
        false ->
            State2 = ping_nodes(Conf, State),
            {ok, {Conf, State2}}
    end.

The functions above are glued together by a gen_event handler. The handler is attached to the switch gen_event manager and receives an event for each telex arriving at the switch.

-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, terminate/2, code_change/3]).

The init function is called when the handler is started. It sends out the first +end signals and sets a timer that tells the handler when to give up dialing.

init({#conf{timeout=Timeout}=Conf, State}) ->
    erlang:send_after(Timeout, self(), giveup),
    State2 = ping_nodes(Conf, State),
    {ok, {Conf, State2}}.

The giveup timeout is simple to deal with.

handle_info(giveup, {Conf, State}) ->
    log:info([?MODULE, giveup, Conf, State]),
    remove_handler;

As are the timeouts from individual signals.

handle_info({timeout, Node}, {Conf, #state{waiting=Waiting}=State}) ->
    log:info([?MODULE, timeout, Node]),
    State2 = State#state{waiting=pq:delete(Node, Waiting)},
    continue(Conf, State2);

The last callback is the messiest. This essentially just calls ponged and continue, but first has to sanity check the incoming message.

handle_event({recv, Address, Telex}, {#conf{target=Target}=Conf, #state{pinged=Pinged}=State}) ->
    case telex:get(Telex, '.see') of
        {error, not_found} ->
            {ok, {Conf, State}};
        {ok, Address_binaries} ->
            Dist = util:distance(Address, Target),
            Node = {Dist, Address},
            case sets:is_element(Node, Pinged) of % !!! command ids would make a better check
                false ->
                    {ok, {Conf, State}};
                true ->
                    try [{util:distance(Target, Bin), util:binary_to_address(Bin)} || Bin <- Address_binaries] of
                        Nodes ->
                            log:info([?MODULE, pong, Node, Nodes]),
                            State2 = ponged(Node, Nodes, State),
                            continue(Conf, State2)
                    catch
                        _:Error ->
                            log:info([?MODULE, bad_see, Address, Telex, Error, erlang:get_stacktrace()]),
                            {ok, {Conf, State}}
                    end
            end
    end;
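The callback above relies on util:binary_to_address, which is not shown in the post. Judging by the logs, a .see entry such as `<<"204.232.205.180:42424">>` becomes `{address, "204.232.205.180", 42424}`. A hypothetical version of that conversion, and its inverse, might look like the sketch below. The module name `addr_sketch` and both function bodies are assumptions, and only IPv4-style "host:port" strings are handled.

```erlang
-module(addr_sketch).
-export([binary_to_address/1, address_to_binary/1]).

%% <<"host:port">> -> {address, "host", Port}.
%% Crashes on malformed input, which a surrounding try/catch can report.
binary_to_address(Bin) when is_binary(Bin) ->
    [Host, Port] = binary:split(Bin, <<":">>),
    {address, binary_to_list(Host), list_to_integer(binary_to_list(Port))}.

%% {address, "host", Port} -> <<"host:port">>.
address_to_binary({address, Host, Port}) when is_list(Host), is_integer(Port) ->
    iolist_to_binary([Host, $:, integer_to_list(Port)]).
```

Letting the conversion crash on bad input fits the try/catch in the handler: a malformed .see entry from a remote node is logged as bad_see instead of taking the handler down.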

That's pretty much it - we now (probably) have a working dialer. I spent a fair few hours teasing this apart but hopefully the end result is fairly simple to understand. The full code is in the repo as always.

4> switch:start_link().
{ok,<0.79.0>,<0.80.0>}
5> Root = {address, "208.68.163.247", 42424}.
{address,"208.68.163.247",42424}
6> dialer:dial_sync(Root, [Root], 10000).

=INFO REPORT==== 21-Mar-2011::14:48:06 ===
    pid: <0.35.0>
    dialer
    dialing
    {address,"208.68.163.247",42424}
    [{address,"208.68.163.247",42424}]
    10000

=INFO REPORT==== 21-Mar-2011::14:48:06 ===
    pid: <0.79.0>
    dialer
    ping
    0
    {address,"208.68.163.247",42424}

=INFO REPORT==== 21-Mar-2011::14:48:06 ===
    pid: <0.80.0>
    switch_event
    send
    {address,"208.68.163.247",42424}
    struct: [{<<"_to">>,<<"208.68.163.247:42424">>},
             {'+end',<<"38666817e1b38470644e004b9356c1622368fa57">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.80.0>
    switch_event
    recv
    {address,"208.68.163.247",42424}
    struct: [{<<"_ring">>,18115},
             {<<".see">>,
              [<<"204.232.205.180:42424">>,<<"208.68.163.247:42424">>]},
             {<<"_br">>,240},
             {<<"_to">>,<<"203.218.138.245:42424">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    pong
    0: {address,"208.68.163.247",42424}
    [{535375931004298447338698443374311161987273280591,
      {address,"204.232.205.180",42424}},
     {0,{address,"208.68.163.247",42424}}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    ping
    0
    {address,"208.68.163.247",42424}

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    ping
    535375931004298447338698443374311161987273280591
    {address,"204.232.205.180",42424}

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.80.0>
    switch_event
    send
    {address,"208.68.163.247",42424}
    struct: [{<<"_to">>,<<"208.68.163.247:42424">>},
             {'+end',<<"38666817e1b38470644e004b9356c1622368fa57">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.80.0>
    switch_event
    send
    {address,"204.232.205.180",42424}
    struct: [{<<"_to">>,<<"204.232.205.180:42424">>},
             {'+end',<<"38666817e1b38470644e004b9356c1622368fa57">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.80.0>
    switch_event
    recv
    {address,"204.232.205.180",42424}
    struct: [{<<"_ring">>,16506},
             {<<".see">>,
              [<<"204.232.205.180:42424">>,<<"208.68.163.247:42424">>]},
             {<<"_br">>,162},
             {<<"_to">>,<<"203.218.138.245:42424">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    pong
    535375931004298447338698443374311161987273280591: {address,
                                                       "204.232.205.180",
                                                       42424}
    [{535375931004298447338698443374311161987273280591,
      {address,"204.232.205.180",42424}},
     {0,{address,"208.68.163.247",42424}}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.80.0>
    switch_event
    recv
    {address,"208.68.163.247",42424}
    struct: [{<<"_ring">>,18115},
             {<<".see">>,
              [<<"204.232.205.180:42424">>,<<"208.68.163.247:42424">>]},
             {<<"_br">>,320},
             {<<"_to">>,<<"203.218.138.245:42424">>}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    pong
    0: {address,"208.68.163.247",42424}
    [{535375931004298447338698443374311161987273280591,
      {address,"204.232.205.180",42424}},
     {0,{address,"208.68.163.247",42424}}]

=INFO REPORT==== 21-Mar-2011::14:48:07 ===
    pid: <0.79.0>
    dialer
    returning
    [{0,{address,"208.68.163.247",42424}},
     {535375931004298447338698443374311161987273280591,
      {address,"204.232.205.180",42424}}]
{ok,[{address,"208.68.163.247",42424},
     {address,"204.232.205.180",42424}]}

One last note: after I finished writing this I started thinking about what would happen if I run more than one dialer in parallel. Unlike Kademlia, TeleHash does not currently use command IDs so the dialer cannot tell if the response came in reply to its own command or in reply to the command of another dialer on the same node. It's the kind of bug that would be very rare in actual use but might be carefully exploited by a malicious node. Finding these kinds of bugs is going to be really hard.

March 21, 2011 03:14 PM

March 18, 2011

Learn You Some Erlang

Who Supervises The Supervisors?

Just in time for the Bay Area Erlang Conference, the supervisors take their place in Learn You Some Erlang. We see how to set up an OTP supervisor, the restart strategies available, how to write child specifications and have a little demonstration where a band manager takes pleasure in firing band members.

March 18, 2011 12:30 PM

March 17, 2011

Process-one Blogs

Sea Beyond 2011 Talk 5: Marek Foss on designing mobile collaboration software

This year's Sea Beyond event focused heavily on mobile real-time applications.

Marek Foss, Chief Web Officer at ProcessOne, gave a talk at Sea Beyond 2011, sharing some useful design patterns for collaborative real-time mobile applications.

Here is the video of his presentation:

You can see the slides here:

Do not miss the event summary and the other videos from the Sea Beyond event.

by Mickaël Rémond at March 17, 2011 03:46 PM

Scattered Thoughts

telehash: basics

TeleHash is a p2p network based on the Kademlia DHT that provides addressing and NAT traversal. These are problems that every p2p app has to deal with, including my poppi. Unfortunately there is no Erlang implementation yet so I have to roll my own. The code so far lives here. In this first post I'll just cover the absolute basics - sending, receiving, encoding and decoding messages.

TeleHash messages (telexes) are utf8-encoded json packets sent over udp. Luckily, mochijson2 uses utf8 by default so encoding/decoding is trivial.

encode(Telex) ->
    mochijson2:encode(Telex).

decode(Json) ->
    mochijson2:decode(Json).

The telex module also defines some convenience functions for working with json - get/2, set/3, update/4 - which are used like this:

2> T = telex:decode("{\"foo\":[\"bar\", {\"baz\":0}]}").
{struct,[{<<"foo">>,[<<"bar">>,{struct,[{<<"baz">>,0}]}]}]}
3> telex:get(T, foo).
[<<"bar">>,{struct,[{<<"baz">>,0}]}]
4> telex:get(T, {foo,2}).
{struct,[{<<"baz">>,0}]}
5> telex:get(T, {foo,2,baz}).
0
6> telex:set(T, {foo,2,baz}, 1).
{struct,[{<<"foo">>,[<<"bar">>,{struct,[{<<"baz">>,1}]}]}]}
7> telex:set(T, bigger, true).
{struct,[{<<"bigger">>,true},
         {<<"foo">>,[<<"bar">>,{struct,[{<<"baz">>,0}]}]}]}
8> telex:update(T, {foo,2,baz}, fun (X) -> X + 10 end, -1).
{struct,[{<<"foo">>,[<<"bar">>,{struct,[{<<"baz">>,10}]}]}]}
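The implementation of these helpers is not shown here. As a rough idea of how a path lookup over mochijson2-style terms could work, here is a hypothetical sketch of get/2 only. The module name `json_path` is made up, and unlike the shell session above, which prints bare values, this version returns `{ok, Value}` or `{error, not_found}` so that a missing key can be told apart from a stored value.

```erlang
-module(json_path).
-export([get/2]).

%% Path is a single key, or a tuple of keys and 1-based list indexes,
%% for example foo or {foo, 2, baz}.
get(Json, Path) when is_tuple(Path) ->
    walk(Json, tuple_to_list(Path));
get(Json, Key) ->
    walk(Json, [Key]).

walk(Value, []) ->
    {ok, Value};
%% Objects are {struct, Proplist} with binary keys, as mochijson2 decodes them.
walk({struct, Props}, [Key | Rest]) when is_atom(Key) ->
    case lists:keyfind(atom_to_binary(Key, utf8), 1, Props) of
        {_, Value} -> walk(Value, Rest);
        false -> {error, not_found}
    end;
%% Arrays are plain lists, indexed from 1.
walk(List, [Index | Rest])
  when is_list(List), is_integer(Index), Index >= 1, Index =< length(List) ->
    walk(lists:nth(Index, List), Rest);
walk(_Value, _Path) ->
    {error, not_found}.
```

For the term `T` decoded in the session above, `json_path:get(T, {foo,2,baz})` gives `{ok,0}` and `json_path:get(T, missing)` gives `{error,not_found}`.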

The next step is to be able to send and receive messages. The switch module runs a gen_server which manages the udp socket and a gen_event which allows other processes to subscribe to incoming messages.

-module(switch).

-include("conf.hrl").

-export([start_link/0, add_handler/2, add_sup_handler/2, send/2]).

-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2, code_change/3]).

-record(state, {socket}).
-define(EVENT, switch_event).
-define(SERVER, switch_server).

% --- api ---

start_link() ->
    {ok, Gen_event} = gen_event:start_link({local, ?EVENT}),
    {ok, Gen_server} = gen_server:start_link({local, ?SERVER}, ?MODULE, [], []),
    {ok, Gen_event, Gen_server}.

add_handler(Module, Args) ->
    gen_event:add_handler(?EVENT, Module, Args).

add_sup_handler(Module, Args) ->
    gen_event:add_sup_handler(?EVENT, Module, Args).

send({address, _Host, _Port}=Address, Telex) ->
    gen_server:cast(?SERVER, {telex, Address, Telex}).

% --- gen_server callbacks ---

init([]) ->
    {ok, Socket} = gen_udp:open(?PORT),
    {ok, #state{socket=Socket}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast({telex, {address, Host, Port}, Telex}, #state{socket=Socket}=State) ->
    gen_udp:send(Socket, Host, Port, telex:encode(Telex)),
    {noreply, State};
handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info({udp, Socket, Host, Port, Msg}, #state{socket=Socket}=State) ->
    Event = {telex, {address, Host, Port}, telex:decode(Msg)},
    gen_event:notify(?EVENT, Event),
    {noreply, State};
handle_info(_Info, State) ->
    {noreply, State}.

terminate(_Reason, #state{socket=Socket}) ->
    gen_udp:close(Socket),
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

% --- end ---
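The part of this module that does the real work is small: an open UDP socket in active mode delivers each incoming packet to its owning process as a `{udp, Socket, Host, Port, Msg}` message, which is what handle_info matches on. The self-contained sketch below shows just that mechanism over the loopback interface. The module name `udp_loopback_sketch` and the payload are made up, and the sockets are opened in binary mode on ephemeral ports rather than on ?PORT.

```erlang
-module(udp_loopback_sketch).
-export([demo/0]).

demo() ->
    %% Port 0 asks the OS for any free port.
    {ok, Receiver} = gen_udp:open(0, [binary, {active, true}]),
    {ok, Sender} = gen_udp:open(0, [binary]),
    {ok, Port} = inet:port(Receiver),
    ok = gen_udp:send(Sender, {127, 0, 0, 1}, Port, <<"{\"+end\":\"abc\"}">>),
    %% In active mode the packet arrives as an ordinary message.
    Result =
        receive
            {udp, Receiver, _Host, _SenderPort, Packet} ->
                {ok, Packet}
        after 1000 ->
            {error, timeout}
        end,
    gen_udp:close(Sender),
    gen_udp:close(Receiver),
    Result.
```

UDP gives no delivery guarantee, which is why the demo has a timeout branch, although loss over loopback is unlikely.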

To demonstrate this, let's write the simplest possible event handler:

-module(log).

-export([start/0]).
-export([info/1, warn/1, error/1]).

-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2, handle_info/2, terminate/2, code_change/3]).

% --- api ---

start() ->
    switch:add_sup_handler(?MODULE, none).

info(Info) ->
    error_logger:info_report([{pid, self()} | Info]).

warn(Warn) ->
    error_logger:warning_report([{pid, self()} | Warn]).

error(Error) ->
    error_logger:error_report([{pid, self()} | Error]).

% --- gen_event callbacks ---

init(none) ->
    {ok, none}.

handle_event(Event, State) ->
    log:info([Event]),
    {ok, State}.

handle_call(_Request, State) ->
    {ok, ok, State}.

handle_info(_Info, State) ->
    {ok, State}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.

% --- end ---
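To see the same handler pattern without the switch or a socket, the sketch below starts an anonymous gen_event manager, attaches a handler that forwards every event to a given process, and sends one event through it. The module name `forwarder` and the `{event, Event}` message shape are made up for the example; the gen_event calls themselves are standard OTP.

```erlang
-module(forwarder).
-behaviour(gen_event).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).
-export([demo/0]).

%% The handler state is the pid that events are forwarded to.
init(Pid) ->
    {ok, Pid}.

handle_event(Event, Pid) ->
    Pid ! {event, Event},
    {ok, Pid}.

handle_call(_Request, Pid) ->
    {ok, ok, Pid}.

handle_info(_Info, Pid) ->
    {ok, Pid}.

terminate(_Reason, _Pid) ->
    ok.

code_change(_OldVsn, Pid, _Extra) ->
    {ok, Pid}.

%% Start a manager, attach this handler, notify once, and return the event.
demo() ->
    {ok, Manager} = gen_event:start_link(),
    ok = gen_event:add_handler(Manager, ?MODULE, self()),
    ok = gen_event:notify(Manager, hello),
    Result =
        receive
            {event, Event} -> Event
        after 1000 ->
            timeout
        end,
    ok = gen_event:stop(Manager),
    Result.
```

Note that the callbacks run inside the event manager process, not in the process that called add_handler, which is why the handler has to be told explicitly where to send things.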

Here we have some wrappers around the standard error logger and an event handler which (after masses of gen_event boilerplate) simply logs every event.

This is enough functionality now to start talking to a TeleHash node:

1> c(util), c(telex), c(switch), c(log).
{ok,log}
2> switch:start_link().
{ok,<0.55.0>,<0.56.0>}
3> log:start().
ok
4> T = {struct, [{'+end', 'a9993e364706816aba3e25717850c26c9cd0d89d'}]}.
{struct,[{'+end',a9993e364706816aba3e25717850c26c9cd0d89d}]}
5> switch:send({address, "127.0.0.1", 55555}, T).
ok
6>
=INFO REPORT==== 17-Mar-2011::12:21:13 ===
    pid: <0.55.0>
    {telex,{address,{127,0,0,1},55555},
           {struct,[{<<"_ring">>,5932},
                    {<<".see">>,[]},
                    {<<"_br">>,51},
                    {<<"_to">>,<<"127.0.0.1:42424">>}]}}

Here we ask localhost:55555 for the nearest nodes it knows to the end 'a99...89d'. The reply is contained in the .see field (which is empty because localhost:55555 hasn't seeded itself yet and so doesn't know any nodes at all).

The next post will deal with dialing, at which point we will have a working announcer.

March 17, 2011 12:47 PM

erlang.org RSS Feed

R14B02 released

Erlang/OTP R14B02 was released as planned on March 16th, 2011. It is the second R14 service release.

See the release notes in the readme file.

Download the new release from the download page.

Highlights:

  • The "halfword" emulator is now official: a 64-bit emulator that uses less memory than the full 64-bit emulator.
  • EDoc handles Erlang specifications and types.
  • All test suites now run with CommonTest.

March 17, 2011 09:47 AM

Last updated:
June 04, 2011 07:33 PM