The road to serverless computing


This is my presentation about serverless computing for TEFcon 2016.


Ansible. Automate everything.


Ansible is an automation tool that works by running tasks (a playbook) against a set of hosts (inventory).

The first beta of Ansible 2.x is ready, and it ships with modules to automate the management of your OpenStack infrastructure (modules for other cloud platforms are also available). This is my presentation for beginners who want to start using Ansible and stop wasting their time on repetitive manual work.


Apache Mesos and the future of microservice architectures


Here you have a presentation about Apache Mesos, Marathon and Flock that I gave for the cloud community in Telefónica.

Flock is an unstable proof of concept I’ve been working on lately. It is a next-generation service platform that uses Marathon to allocate services across a cluster of machines.

I think the time has come to care only about services, not machines. Compute resources should be treated as a commodity, like tap water.


Simple moving average with bacon.js


Bacon is a small functional reactive programming library for JavaScript. Sometimes it is easier to handle data as a stream and react to changes in the stream than to process individual events. The example below (Node.js) computes a simple moving average.

var Bacon = require('baconjs');

function avg(array) {
  var sum = array.reduce(function (a, b) { return a + b; }, 0);
  return sum / array.length;
}

var bus = new Bacon.Bus();

// Moving average over a sliding window of (at most) the last 3 values
bus.slidingWindow(3, 1)
   .map(avg)
   .onValue(function (value) { console.log(value); });

bus.push(1); // output = 1
bus.push(2); // output = 1.5
bus.push(3); // output = 2

You can see another example (github) where an event stream is created and populated with your mouse positions. The “y” position is represented along with the moving average.


Evolutionary computation · TEFcon 2014


This is my presentation (in Spanish) about evolutionary computation for TEFcon 2014. It was a talk about how we code and how genetic algorithms and genetic programming might help us, because “programming should be more about the what and less about the how”.

Edit: I’ve pushed the Java code for the queens problem using genetic algorithms to github.
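To give a flavour of the idea, here is a minimal generational genetic algorithm sketch in JavaScript (not the queens code above — a toy “OneMax” problem where we only state the what, a fitness function counting 1-bits, and let evolution find the how):

```javascript
// Toy genetic algorithm: evolve a bit string towards all 1s (OneMax).
// All names and parameters here are illustrative choices.
function randomIndividual(length) {
  var bits = [];
  for (var i = 0; i < length; i++) bits.push(Math.random() < 0.5 ? 0 : 1);
  return bits;
}

function fitness(bits) { // the "what": number of 1-bits
  return bits.reduce(function (a, b) { return a + b; }, 0);
}

function crossover(a, b) { // single-point crossover
  var point = Math.floor(Math.random() * a.length);
  return a.slice(0, point).concat(b.slice(point));
}

function mutate(bits, rate) { // flip each bit with a small probability
  return bits.map(function (bit) {
    return Math.random() < rate ? 1 - bit : bit;
  });
}

function evolve(populationSize, length, generations) {
  var population = [];
  for (var i = 0; i < populationSize; i++) {
    population.push(randomIndividual(length));
  }
  for (var g = 0; g < generations; g++) {
    // Rank by fitness, keep the best half (elitism), breed the rest
    population.sort(function (a, b) { return fitness(b) - fitness(a); });
    var parents = population.slice(0, populationSize / 2);
    var next = parents.slice();
    while (next.length < populationSize) {
      var p1 = parents[Math.floor(Math.random() * parents.length)];
      var p2 = parents[Math.floor(Math.random() * parents.length)];
      next.push(mutate(crossover(p1, p2), 0.01));
    }
    population = next;
  }
  population.sort(function (a, b) { return fitness(b) - fitness(a); });
  return population[0];
}

var best = evolve(30, 20, 100);
console.log(fitness(best)); // usually 20 (the optimum) after enough generations
```

The same loop (select, cross over, mutate) applies to the queens problem; only the individual encoding and the fitness function change.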


DTrace para flipar pepinillos


This is my presentation (in Spanish) about DTrace, a tracing framework created by Sun that is really cool. It is available on Solaris systems, but also on OS X, BSD, and through some Linux ports.

It is a really powerful tool once you get used to it.


How to catch an Internet troll


Some weeks ago I carried out a social experiment (in Spanish) that consisted of writing a poem in a collaborative and anonymous way. This means that anyone could add a new verse to the poem without identifying themselves or leaving any metadata (no cookies, no IP address tracking, etc.).

Our first trolls didn’t take long to appear, mostly in the form of copyrighted material, spam and offensive content. Is it possible to automatically classify an anonymous verse as spammy?


Text classification

LingPipe is a powerful Java toolkit for processing text, free for research use under certain conditions. I followed its Text Classification Tutorial to classify verses into one of two categories: “spam” or “love”.

The classifier I built uses 80% of the poem (already classified by hand into the “spam” and “love” categories) as a training set to learn and build a language model. Then it uses the remaining 20% of the poem (48 verses) to cross-validate this model.

You can find the code in Annex I; it is less than 50 lines of code.

Classification results

The classification accuracy is 75% ± 12.25%, so we can say that our model performs better than a monkey at a significance level of 0.05.

Categories=[spam, love]
Total Count=48
Total Correct=36
Total Accuracy=0.75
95% Confidence Interval=0.75 +/- 0.1225

Confusion Matrix
Macro-averaged Precision=0.7555
Macro-averaged Recall=0.7412
Macro-averaged F=0.7428
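The reported interval can be sanity-checked with the normal approximation for a binomial proportion. This quick JavaScript check is mine, not part of the LingPipe output:

```javascript
// 95% confidence interval for a binomial proportion (normal approximation):
// margin = 1.96 * sqrt(p * (1 - p) / n)
function confidenceInterval95(correct, total) {
  var p = correct / total;
  var margin = 1.96 * Math.sqrt(p * (1 - p) / total);
  return { accuracy: p, margin: margin };
}

var ci = confidenceInterval95(36, 48);
console.log(ci.accuracy); // 0.75
console.log(ci.margin);   // 0.1225
```

It reproduces the 0.75 ± 0.1225 figure reported by the evaluator.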

It seems pretty promising and can serve as inspiration but, to be honest, I don’t think it is such a good model. With so few contributions to the poem, it is prone to overfitting, so it is probably just learning to classify our usual trolls, who are not original whatsoever.

Moreover, we are not taking into account other factors that would greatly improve the results, such as the structure of the verse (length, number of words, etc.), the relations between verses (rhyme) or the presence of nonexistent words and typos. If you want to investigate further, I suggest taking a look at logistic regression to build better models that also include these kinds of factors.

On a practical note, if you ever plan to carry out a similar experiment, remember two rules. First, make it easier for you to revert vandalism than for the troll to vandalize your site. Second, don’t feed the troll. They will eventually get tired.

Annex I. Java Code

import com.aliasi.classify.Classification;
import com.aliasi.classify.Classified;
import com.aliasi.classify.ConfusionMatrix;
import com.aliasi.classify.DynamicLMClassifier;
import com.aliasi.classify.JointClassifier;
import com.aliasi.classify.JointClassifierEvaluator;
import com.aliasi.lm.NGramProcessLM;
import com.aliasi.util.AbstractExternalizable;

String[] CATEGORIES = { "spam", "love" };
int NGRAM_SIZE = 6;

// Training sets (excerpts of the hand-classified 80% of the poem)
String textSpamTraining =
        "Ola ke Ase\n" +
        "censurator\n";

String textLoveTraining =
        "Me ahogo en un suspiro,\n" +
        "miro tus ojos de cristal\n";

// Cross-validation sets (excerpts of the held-out 20%)
String[] textSpamCrossValidation = {
        "os va a censurar",
        "esto es una mierda"
};

String[] textLoveCrossValidation = {
        "el experimento ha revelado",
        "que el gran poeta no era orador",
        "y al no resultar como esperado",
        "se ha tornado en vil censor"
};

// FIRST STEP - learn a character n-gram language model per category
DynamicLMClassifier<NGramProcessLM> classifier =
        DynamicLMClassifier.createNGramProcess(CATEGORIES, NGRAM_SIZE);

classifier.handle(new Classified<CharSequence>(
        textSpamTraining, new Classification("spam")));
classifier.handle(new Classified<CharSequence>(
        textLoveTraining, new Classification("love")));

// SECOND STEP - compile the classifier and set up the evaluator
// (compile throws IOException and ClassNotFoundException)
JointClassifier<CharSequence> compiledClassifier =
        (JointClassifier<CharSequence>) AbstractExternalizable.compile(classifier);

JointClassifierEvaluator<CharSequence> evaluator =
        new JointClassifierEvaluator<CharSequence>(
                compiledClassifier, CATEGORIES, true);

// THIRD STEP - cross-validate against the held-out verses
for (String textSpamItem : textSpamCrossValidation) {
    evaluator.handle(new Classified<CharSequence>(
            textSpamItem, new Classification("spam")));
}

for (String textLoveItem : textLoveCrossValidation) {
    evaluator.handle(new Classified<CharSequence>(
            textLoveItem, new Classification("love")));
}

ConfusionMatrix matrix = evaluator.confusionMatrix();
System.out.println("Total Accuracy: " + matrix.totalAccuracy());


Lazy loading of modules in nodejs


This is a pattern I found in pkgcloud to lazy-load nodejs modules. That is, to defer their loading until a module is actually needed.

var providers = [ 'amazon', 'azure', /* ... */ 'joyent' ];

// Setup all providers as lazy-loaded getters
providers.forEach(function (provider) {
  pkgcloud.providers.__defineGetter__(provider, function () {
    return require('./pkgcloud/' + provider);
  });
});
It basically defines a getter, so modules won’t be loaded until you do:

var provider =;

It might be useful in applications where you have different adapters (“providers” in the example above) offering different implementations of the same API, and you want to let the user choose which one to use at runtime. This is a common requirement in cloud environments but it could be applicable to other scenarios as well (e.g. choose a payment gateway).

This is the first time I have seen this pattern, so please share your thoughts on it and any alternative approaches.


Cloud is not cheap


There is a myth about cloud computing. Many people think they will save money moving their services to the cloud, but the reality is that the cloud is not cheap.

Virtualization, one of the core parts of cloud computing, tries to deliver on the promise of elastic capacity and pay-as-you-go pricing. Despite this promise, the truth is that today we are running virtual machines that don’t do much because, most of the time, our applications are not doing anything. Their processors are underutilized. While this is an opportunity for cloud providers to oversubscribe their data centers, it also means we are overpaying. There is still much untapped potential for applications running in the cloud.

Services in the 21st century

In the last few years we have seen many improvements in the way applications are packaged and deployed to the cloud, and in how these processes are automated, and we have learned that we have to build applications for failure (see “There will be no reliable cloud“).

But what I have not seen yet is services communicating with each other to share their health status. I think services in the cloud should be able to expose their status in real time. This way they could talk to others and say “hey, I’m struggling to handle this load, who can help me out with 2 extra GB of RAM for less than 10 cents/hour?”.

How do you think cloud will change apps in the next 5-10 years?

Ryan Dahl – How do you see the future of PaaS (see 4:38)


The long tail in this blog


This blog is two years old, and I’d like to share how its >50K visits are distributed.

[Figure: long-tail distribution of visits per post]

One single post drives 40% of the traffic to the blog. At the bottom, 70% of its posts represent 4% of the traffic.

In my opinion, the most popular ones are not the best ones. They are about very specific technical subjects, containing keywords in the title and in the URL slug. Google does the rest.
