#MesosCon 2015 has ended


Ops
Thursday, August 20


Fully Fault tolerant Streaming workflows at Scale using Apache Mesos & Spark Streaming - Akhil Das, Sigmoid
Reliability, maintainability, and scalability are the key concerns when designing any data-intensive application. With the advent of real-time streaming platforms such as Apache Spark and Storm, it is important that these computation frameworks address all three. Running a real-time Spark Streaming pipeline in production poses many challenges. In this presentation we will showcase how to overcome these challenges and develop a fully scalable, fault-tolerant streaming system with the help of Apache Mesos, which not only makes it easier to deploy and manage resources but also helps handle varying data loads through dynamic resource management and allocation.



Akhil Das is a Software Developer at Sigmoid with a focus on distributed computing, big data analytics, scaling, and performance optimization. Sigmoid has worked with over 25 customers in the big data space to get them real-time insights on terabytes of data using Apache Spark and Spark Streaming. Previously...

Thursday August 20, 2015 11:00am - 11:40am
Grand Ballroom B


How to Monitor Mesos - Alexis Le-Quoc, Datadog
By providing a robust abstraction over core computing resources, Mesos does away with what has been until now the foundation of most monitoring systems: the individual host. With that gone, what should monitoring of applications running on Mesos revolve around? In this talk, Alexis Le-Quoc argues that imperative monitoring of hosts must give way to declarative monitoring, built on tags, tasks and queries. With concrete examples and live monitoring data, he will present a better way to monitor Mesos.


Thursday August 20, 2015 11:50am - 12:30pm
Grand Ballroom B


Twitter’s Production Scale: Mesos and Aurora Operations - Joe Smith, Twitter
Twitter has used Aurora and Mesos to scale from only tens of nodes to tens of thousands. During the process of scaling up, the configuration, deployment, and operational procedures for this system have evolved and improved significantly. This talk will offer an operations perspective on the management of a Mesos+Aurora cluster, and cover many of the cluster management best practices that have evolved at Twitter from real-world production experience. It will explore methods that decrease operational overhead, as well as examples of outages and incidents to illustrate various failure domains. Furthermore, the talk will highlight current and planned safeguards in Aurora and Mesos that mitigate the impact of these failures.


Joe Smith

Site Reliability Engineer, Slack
Joe Smith is a Site Reliability Engineer at Slack, working on the Identity, Compliance, & Security team. Previously he was at Twitter, and built the Aurora and Mesos cluster from tens of nodes to tens of thousands. As Mesos and Aurora SRE, he automated the build, deployment, management...

Thursday August 20, 2015 2:00pm - 2:40pm
Grand Ballroom B


Running Stateful Services with Mesos - Arunabha Ghosh, Moz & Ankan Mukherjee, Moz
Mesos allows services to be decoupled from machines; however, this decoupling creates problems for legacy apps that rely on persistent state. Traditional SQL databases are a prime example of such apps. It is still possible, however, to run such apps under Mesos and gain the operational advantages it provides. We will cover techniques used at Moz to successfully run SQL databases and other legacy persistent-state services under Mesos. The presentation will also cover challenges, best practices, and a look at how to leverage Mesos primitives for stateful services. We will demonstrate several such services running on Mesos.


Arunabha Ghosh

Arunabha Ghosh is the Director of Engineering at Moz, where, among other things, he leads the effort to move Moz onto Mesos. Prior to Moz, Arunabha worked at Yahoo and Google, focusing on building large-scale infrastructure. Arunabha leads the Systems Research group at the HackerDojo...

Ankan Mukherjee

Ankan Mukherjee is a senior engineer at Moz and is currently focused on building the next-generation cluster operating system for Moz's datacenters. Prior to working at Moz, he held many different roles in the enterprise software world - software engineer, technical architect, technical/project...

Thursday August 20, 2015 4:00pm - 4:40pm
Grand Ballroom B


Building A Machine Learning Platform to Predict User Behavior on Mesos - Jeremy Stanley, Sailthru
In 2014 Sailthru launched a new machine learning platform, named Sightlines, using Mesos. This platform predicts the behavior of hundreds of millions of users for hundreds of clients. In this talk, Jeremy Stanley and Alex Gaudio will tell the story of how (and why) this platform was built on Mesos, and outline some of the key challenges and detours taken along the way. This will include an overview of the goals and architecture of the system, and an account of the 10-month effort to develop, scale, and stabilize the system for production.

Topics covered will include our use of AWS spot instances and Netflix’s Asgard to save costs; Marathon and ZooKeeper to schedule jobs; detours with Chronos, Spark, and Redis as our architecture evolved; and the challenges we faced in maximizing resource utilization and ensuring we had identical, isolated test and production environments.


Jeremy Stanley

Jeremy Stanley is the Chief Data Scientist & EVP of Engineering at Sailthru, where he is focused on building data-driven solutions for marketers that drive long-term customer engagement and optimize revenue opportunities. Prior to Sailthru, Jeremy was the CTO at Collective where he...

Thursday August 20, 2015 4:50pm - 5:30pm
Grand Ballroom B
Friday, August 21


Global Control of Decentralized Mesos Clusters - Daniel Giribet, Schibsted
Apache Mesos and the frameworks running on top of it are a great resource management solution, but they lack some features needed for automatic scaling and communication of microservices that run across different locations and vendors. In this presentation Schibsted will discuss these limitations and present their plan for a Global Scheduler capable of controlling independent Mesos clusters across multiple distant datacenters and service providers.


Daniel Giribet

Daniel Giribet is the Infrastructure Platform Development leader at Schibsted Products and Technology. He holds a Computer Science degree and has been focusing on video processing, web engineering, systems architecture, and content management. He has worked on indie projects and also...

Friday August 21, 2015 10:40am - 11:20am
Grand Ballroom B


Migrating over 1,000 Production Components to a Mesos-Based Platform-as-a-Service (PaaS) - Tom Petr, HubSpot
At HubSpot we built a turn-key PaaS system our engineers use to deploy all the different parts that make up the HubSpot application. In this talk we'll discuss the benefits of using Mesos as a starting point for deploying a PaaS, the open source framework we created called Singularity, and the lessons learned from moving our entire production application to this stack. We will enumerate many of the specific benefits that became apparent after the system was in place, including enhanced security, reliability, operational simplicity, and cost efficiency. Lastly, we will shed light on both the ways in which we extended what Mesos offered in order to empower our product developers, and how we moved over 100 software load balancers into Mesos to make our infrastructure team more efficient.


Tom Petr

Engineering Lead, HubSpot
Tom Petr is an Engineering Lead at HubSpot. Prior to working on Vitess and Kubernetes, he was a maintainer of Singularity, an open source Mesos framework. Tom has spoken at multiple conferences about HubSpot's platform infrastructure.

Friday August 21, 2015 11:30am - 12:10pm
Grand Ballroom B


Tactical Mesos: How Internet-Scale Ad Bidding Works on Mesos / Aurora - Dobromir Montauk, TellApart
Real-time bidding on the large Internet exchanges (DoubleClick, Facebook, etc.) requires large-scale, low-latency serving systems: >100K QPS at peak with <100ms tail response times. Time is quite literally money. Dobromir will present TellApart's full stack in excruciating detail (if you want it), which includes Mesos/Aurora, ZK service discovery, Finagle-Mux RPC, and a Lambda architecture with Voldemort as the serving layer.


Dobromir Montauk

Software Engineer, Twitter
Nine years at Google, including work on Google+ Stream backend and Search Infrastructure. Recently joined TellApart as Uber Tech Lead working on infrastructure. In charge of TellApart bidding platform performance: >100K QPS with sub-10ms latencies. Optimized the Bidders to run at...

Friday August 21, 2015 1:30pm - 2:10pm
Grand Ballroom B


Simplifying Maintenance with Mesos - Benjamin Mahler, Twitter
You have computing resources that your developers want to leverage. However, providing developers with direct access to machines would set you up for an operational nightmare: a critical security vulnerability comes along and now you need to reboot all of your machines to apply the kernel fix, without any downtime. Good luck!

Mesos is a “cluster management” layer that provides developers with access to computing resources while encouraging behavior that leads to simplified maintenance. This talk will address the challenges of maintenance when dealing with Mesos clusters running multiple frameworks (e.g. services, storage, batch compute). We’ll explore a current proposal for adding simple and flexible maintenance primitives in Mesos to address these concerns and enable tooling for automated maintenance.


Benjamin Mahler

Software Engineer, Mesosphere
Benjamin Mahler is a committer and PMC member of Apache Mesos and has been working on Mesos since 2012. Benjamin now works at Mesosphere as a technical lead and has given Mesos-related talks at several conferences and companies. His interests include distributed systems, fault tolerance...

Friday August 21, 2015 2:20pm - 3:00pm
Grand Ballroom B