Elastic{ON} 2017: What’s Next

Recently the SignalFx team attended Elastic{ON} for the second year in a row. It was exciting to hear about the widespread adoption of Elastic with nearly 100M downloads and 85,000 open community users. We heard many compelling use cases from Elasticsearch practitioners across a variety of industries.

One of the highlights of the conference was hearing how honorees of the first Elastic Cause Award use Elasticsearch. IST Research powers its Pulse platform with Elastic, collecting, analyzing, and visualizing online and offline data to gain insights. This helps them identify people and methods involved in human trafficking, as well as other complex real-world issues. eHealth Africa uses Elasticsearch to enable full-text searches on call records, so that records can be sorted by multiple criteria, as part of its mission to fight the West African Ebola epidemic. NoSchoolViolence.org uses Elasticsearch to index documents and query how youth behaviors are linked to forms of school violence recognized by the Department of Justice, CDC, and NIH.

We also learned about compelling business-focused use cases. Walmart gets near real-time retail analytics, such as how many bananas are sold per second. General Mills leverages metadata so that users can search across a wide variety of data on its website, such as recipes, articles, videos, products, and coupons. A national eTailer uses Elasticsearch to match against its customers’ past orders and generate accurate quotes. A hotel and travel app uses Elasticsearch for its booking queries. Tinder explained how they use Elasticsearch to make connections on match criteria such as location.

With the continued growth of customer-focused apps, it is increasingly important for organizations to track query latency and the other Elasticsearch metrics that matter, and to be alerted before customers are affected. Common health-check and node-based monitoring tools such as Nagios aren’t designed for this type of active alerting and monitoring. Evolving an alerting strategy beyond Nagios is an effort that our engineers have worked through.
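As a rough illustration of what tracking query latency can look like, here is a minimal Python sketch. It assumes you already have two snapshots of the node stats response (`GET /_nodes/stats/indices/search`), whose `query_total` and `query_time_in_millis` counters are cumulative. The function name, the sample snapshots, and the 12 ms threshold are hypothetical, chosen only for illustration, and are not taken from our own tooling.

```python
def average_query_latency_ms(previous, current):
    """Mean query latency (ms) across all nodes between two stats snapshots.

    Both arguments are parsed JSON bodies from the node stats API.
    Returns None when no queries ran in the interval, or when the
    counters went backwards (for example after a node restart).
    """
    def totals(stats):
        count = 0
        millis = 0
        for node in stats.get("nodes", {}).values():
            search = node.get("indices", {}).get("search", {})
            count += search.get("query_total", 0)
            millis += search.get("query_time_in_millis", 0)
        return count, millis

    prev_count, prev_millis = totals(previous)
    cur_count, cur_millis = totals(current)
    queries = cur_count - prev_count
    if queries <= 0:
        return None
    return (cur_millis - prev_millis) / queries


# Hypothetical snapshots taken one interval apart (hand-written sample data).
before = {"nodes": {
    "a": {"indices": {"search": {"query_total": 100, "query_time_in_millis": 500}}},
    "b": {"indices": {"search": {"query_total": 200, "query_time_in_millis": 1000}}},
}}
after = {"nodes": {
    "a": {"indices": {"search": {"query_total": 150, "query_time_in_millis": 1000}}},
    "b": {"indices": {"search": {"query_total": 250, "query_time_in_millis": 2000}}},
}}

LATENCY_THRESHOLD_MS = 12  # hypothetical alert threshold

latency = average_query_latency_ms(before, after)
print(latency)  # 15.0
if latency is not None and latency > LATENCY_THRESHOLD_MS:
    print("query latency above threshold")
```

Working from the difference between two snapshots, rather than the raw counters, gives the latency for the most recent interval instead of an average over the whole lifetime of the node, which is what an alert usually needs.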

We love this type of shop talk. Because we monitor and run Elasticsearch at scale, Elastic{ON} was the perfect place for it. Our own Elasticsearch experts attended — Mahdi Ben Hamida, who oversees the search and metadata persistence layers of SignalFx, and Anupama Mann, who built a generic framework to migrate Elasticsearch indexes between versions that can be leveraged by any system using ES. They both enjoyed technical conversations with the community. Many of their conversations centered around upgrading Elasticsearch, from weighing the benefits of upgrading to implementations and timing.

Updates to Elasticsearch versions have focused on resiliency, reliability, simplification, and performance, as well as new features and enhancements. For the SignalFx engineering team, the most relevant and interesting features of the newer Elasticsearch version included:

  • Resiliency Improvements to ensure no data loss with Elasticsearch.
  • Dynamic Merge Scheduling with an auto-regulating feedback mechanism. This eliminates past worries around manually adjusting merge throttling settings and allows Elasticsearch to provide more stable search performance even when the cluster is under heavy indexing load.
  • Synced Flushes to enable instantaneous recovery of existing replica shards. In previous versions, a node failure or reboot could trigger a shard allocation storm, in which entire shards were copied over the network even though the receiving node already held a large portion of the data. Recovery times of over a day have been reported for restarting a single node.
  • More Lenient Reallocation to avoid unnecessary rescheduling during node restarts. Based on experience, this works well if the indices are primarily read-only (for example, for time-based indices).
  • Memory Pressure Management and Better Handling of Expensive Queries, both of which were limited in the 1.x versions of Elasticsearch.
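To make the synced flush item a little more concrete: in Elasticsearch versions that expose the `POST /_flush/synced` endpoint (it was later deprecated and removed), the response reports shard counts per index, and it can be worth checking that response before a planned node restart. The following Python sketch is only an illustration under that assumption. The helper name and the sample response are hypothetical, and the sample is hand-written rather than captured from a real cluster.

```python
def indices_with_failed_sync(response):
    """Return the names of indices where at least one shard copy failed to sync.

    `response` is the parsed JSON body of a synced flush request: a
    top-level "_shards" summary plus one entry per index, each with
    "total", "successful", and "failed" counts.
    """
    failed = []
    for name, summary in response.items():
        if name == "_shards":
            continue  # cluster-wide totals, not an index
        if summary.get("failed", 0) > 0:
            failed.append(name)
    return sorted(failed)


# Hypothetical response for two indices, one of which did not fully sync.
sample = {
    "_shards": {"total": 4, "successful": 3, "failed": 1},
    "logs-2017-03": {"total": 2, "successful": 2, "failed": 0},
    "users": {"total": 2, "successful": 1, "failed": 1},
}

print(indices_with_failed_sync(sample))  # ['users']
```

An index that still has ongoing writes is the usual reason for a failed sync, which fits the observation above that these recovery improvements work best for primarily read-only indices.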

Though Elastic{ON} is over, we look forward to keeping the conversation flowing!

About the authors

Melanie Salman

Melanie works in marketing at SignalFx. Previously she was at PagerDuty and Loggly in field and event marketing.
