Serverless computing
What’s that old schoolyard rhyme? “AWS and Azure, sitting in a tree, I-A-A-S, P-A-Y-G. First come VMs, then containers, then come stateless microservices running on public cloud infrastructure at fractions of a cent per second.” Or something like that.
Anyway, application deployments are getting lighter, backend microservices are getting smaller, and now many development shops are moving toward “serverless architectures” in which dynamic computational tasks are handled using a few cycles on somebody else’s managed server. As of 2016, the public cloud giants (AWS, Google Cloud and Microsoft Azure) all have their own “serverless services” that allow you to buy processing time for cheap. And I do mean cheap - a million AWS Lambda requests per month, each lasting five seconds, will set you back about $10.62.
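For the curious, here is a back-of-the-envelope sketch of how a number like that can come about. The inputs are assumptions, not figures stated in this article: 2016-era Lambda list prices of $0.20 per million requests and $0.00001667 per GB-second, the smallest 128 MB memory setting, and no free-tier credit. They were picked because they appear to reproduce the figure above.

```python
# Assumed 2016-era AWS Lambda list prices (not quoted in the article).
PRICE_PER_MILLION_REQUESTS = 0.20   # USD
PRICE_PER_GB_SECOND = 0.00001667    # USD


def monthly_lambda_cost(requests, seconds_per_request, memory_mb=128):
    """Estimate a monthly bill in USD, ignoring any free tier."""
    # Lambda bills compute in GB-seconds: duration times allocated memory.
    gb_seconds = requests * seconds_per_request * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = (requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


cost = monthly_lambda_cost(requests=1_000_000, seconds_per_request=5)
print(f"${cost:.2f}")
```

Under these assumptions nearly all of the bill is compute time; the per-request charge is only 20 cents of the total.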
Developers gravitate toward this approach because it’s scalable, cost-effective and requires little to no infrastructure maintenance. In AWS, you might deploy an application with data stores in RDS or DynamoDB, static web content hosted in S3, an API Gateway directing traffic and Lambda functions running the business rules - look Mom, no servers!
But wait a minute. Is a pay-as-you-go public cloud really the only place to run serverless compute functions? After all, a handful of computer scientists have been running little pieces of code on distributed computers for years, at a price even Lambda will never beat: free.
Enter the “volunteer cloud”
Volunteer computing is a decades-old distributed computing model, typically used by a university or other research facility to crowdsource computationally intensive tasks. Ordinary computer users connect their home PCs to a centralized service, using their CPU’s idle cycles to process small jobs as part of the research project’s larger goal.
The largest and best-organized volunteer computing effort is UC Berkeley’s open source BOINC framework, currently facilitating about 60 research projects around the world. This UC Berkeley paper, uploaded in 2013, has a good overview of volunteer computing’s history and its ongoing challenges. If you read the paper carefully, you might catch the money quote near the end:
“There have been attempts to commercialize volunteer computing by paying participants, directly or via a lottery, and reselling the computing power. These efforts have failed because the potential buyers, such as pharmaceutical companies, are unwilling to have their data on computers outside of their control.”
To understand this quote, you have to think about volunteer computing the way the author does: as a method for processing huge computations in bite-size pieces. And he’s right that highly regulated companies with piles of data to crunch may not be able to farm out their calculations to strangers on the Internet.
But all of a sudden, there’s another widespread commercial use case for itty bitty pieces of distributed compute time: serverless architectures. Couldn’t somebody leverage volunteer computing to provide a free, highly available public compute infrastructure for some of these new-school apps?
Advantages
Free compute on an open source platform is a volunteer cloud’s biggest selling point for commercial enterprises. You could also argue that a cloud comprising millions of personal devices would be much more highly available than AWS or Azure, which rely on a relatively small number of datacenters, but as we’ll see below, this argument is a bit illusory.
In fact, a public volunteer cloud would face even bigger versions of some of volunteer computing’s traditional challenges. Let’s consider a few potential issues in roughly increasing order of difficulty.
Issue #1: Incentive
How do you convince people to run someone else’s code on their computers? BOINC projects woo users with lofty humanist or humanitarian goals, but there’s no altruistic incentive to enable commercial projects. Paying cloud participants a flat rate, or allowing them to bid on jobs, may never be feasible; the existing cloud services mentioned above are simply too cheap an alternative.
But here’s where the new paradigm comes in. With so many people running serverless architectures now, a free, public distributed compute service has the ability to directly benefit more people than ever before. So maybe peer-to-peer ethics are the answer. Maybe the rule is, if you submit X requests to the service, you have to accept Y jobs in return. The more compute cycles you pay into the grid, the more you get out.
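To make the X-for-Y rule concrete, here is a minimal sketch of a credit ledger. Everything in it is hypothetical: the `CreditLedger` class, the `requests_per_job` exchange rate and the `starter_credit` allowance are illustrative names, not part of BOINC or any existing service.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """One participant's give-and-take with the grid (hypothetical)."""
    jobs_completed: int = 0      # jobs this participant ran for others
    requests_submitted: int = 0  # requests this participant sent to the grid


class CreditLedger:
    def __init__(self, requests_per_job=1.0, starter_credit=0):
        # requests_per_job is the assumed exchange rate: how many requests
        # a participant earns for each job completed on behalf of others.
        self.requests_per_job = requests_per_job
        self.starter_credit = starter_credit
        self.accounts = {}

    def _account(self, user):
        return self.accounts.setdefault(user, Account())

    def record_completed_job(self, user):
        self._account(user).jobs_completed += 1

    def allowance(self, user):
        """Requests the user may still submit before owing more work."""
        acct = self._account(user)
        earned = self.starter_credit + acct.jobs_completed * self.requests_per_job
        return earned - acct.requests_submitted

    def submit_request(self, user):
        """Accept the request only if the user has paid enough into the grid."""
        if self.allowance(user) < 1:
            return False
        self._account(user).requests_submitted += 1
        return True


ledger = CreditLedger(requests_per_job=2.0)
ledger.record_completed_job("alice")
print(ledger.submit_request("alice"))  # first request is covered
print(ledger.submit_request("alice"))  # second request is covered
print(ledger.submit_request("alice"))  # third is refused until more jobs are run
```

A nonzero `starter_credit` would let newcomers try the service before contributing, at the cost of giving freeloaders a small opening.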
Issue #2: Churn
How do you ensure the availability of a volunteer cloud? Volunteer computing resources, by definition, are voluntary: they can appear or disappear at any time, even in the middle of a calculation. For anybody other than hobbyists to use a service like this, you need at least a few 9’s of average uptime.
One solution may be to assign a rank to individual computer nodes based on their availability (how often they are online and accepting jobs) and reliability (what percentage of assigned jobs they complete), with a steep penalty for reliability lapses. In a sufficiently large network, untrustworthy nodes would rarely or never receive requests. In fact, a node might be required to bank a certain amount of “goodwill”, demonstrated by time spent online and graceful offline exits, before becoming a full-fledged participant in the grid.
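As a rough illustration of that ranking idea, the sketch below scores each node as availability times reliability raised to a power, so that small reliability lapses cost a lot, and it excludes nodes that have not yet banked enough goodwill. The formula, the exponent, the goodwill threshold and every name here are assumptions made for the example, not an established scheme.

```python
from dataclasses import dataclass

# All three constants are illustrative assumptions.
RELIABILITY_EXPONENT = 4   # higher means a steeper penalty for dropped jobs
GRACEFUL_EXIT_BONUS = 5.0  # goodwill earned per clean sign-off
MIN_GOODWILL = 50.0        # goodwill needed before a node receives jobs


@dataclass
class NodeStats:
    hours_online: float    # time spent online and accepting jobs
    hours_observed: float  # total time the grid has known about the node
    jobs_assigned: int
    jobs_completed: int
    graceful_exits: int    # times the node went offline cleanly

    @property
    def availability(self):
        return self.hours_online / self.hours_observed if self.hours_observed else 0.0

    @property
    def reliability(self):
        # A node with no job history has no lapses yet; goodwill gates it instead.
        return self.jobs_completed / self.jobs_assigned if self.jobs_assigned else 1.0

    @property
    def goodwill(self):
        return self.hours_online + GRACEFUL_EXIT_BONUS * self.graceful_exits

    @property
    def score(self):
        return self.availability * self.reliability ** RELIABILITY_EXPONENT


def rank_nodes(nodes):
    """Return names of eligible nodes, best score first."""
    eligible = {name: n for name, n in nodes.items() if n.goodwill >= MIN_GOODWILL}
    return sorted(eligible, key=lambda name: eligible[name].score, reverse=True)


nodes = {
    "steady": NodeStats(90, 100, 100, 99, 10),
    "flaky": NodeStats(95, 100, 100, 70, 0),
    "newcomer": NodeStats(2, 2, 0, 0, 0),
}
print(rank_nodes(nodes))
```

With these sample numbers, the node that is online slightly more often but drops 30 percent of its jobs ranks well below the steadier one, and the newcomer is held back until it has banked more goodwill.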