Every company wants to serve its users quickly and reliably, but achieving low latency at global scale requires a substantial investment of time and resources. Wellhub took a decisive step by adopting a multi-regional architecture for its autocomplete service, built on Elasticsearch, to optimize the search experience. This strategic choice produced a significant improvement in latency and changed how users interact with the platform. Along the way, several techniques and innovations were introduced to keep responses fast and relevant, even in a global environment.
In this article, we explore how Wellhub invested in a multi-regional architecture to offer a low-latency autocomplete service. The solution, developed in Go, uses Elasticsearch to predict user input and takes geolocation into account to sharpen the relevance of search results. By relying on AWS Global Accelerator, the company optimized traffic routing and ensured low-latency connections. A data replication strategy was also put in place so that backups can be restored in different regions, meeting data-refresh requirements. Finally, to further improve perceived performance, Wellhub introduced a pre-fetch endpoint that enhances the user experience while reducing the load on the main infrastructure.
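The pre-fetch idea mentioned above can be sketched in Go as a precomputed prefix index served without touching the main search backend. The term list, index shape, and function names below are illustrative assumptions, not Wellhub's actual implementation:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefetchIndex maps popular query prefixes to precomputed suggestions.
// In a real deployment this subset would be built offline and shipped to
// clients or edge caches; here it is a small in-memory stand-in.
type prefetchIndex map[string][]string

// buildIndex precomputes suggestions for every prefix of each popular
// term, so lookups are O(1) and never reach Elasticsearch.
func buildIndex(popular []string) prefetchIndex {
	idx := make(prefetchIndex)
	for _, term := range popular {
		for i := 1; i <= len(term); i++ {
			p := strings.ToLower(term[:i])
			idx[p] = append(idx[p], term)
		}
	}
	for p := range idx {
		sort.Strings(idx[p])
	}
	return idx
}

// Suggest returns prefetched suggestions for a prefix, or nil when the
// prefix is not covered and the client must fall back to the live service.
func (idx prefetchIndex) Suggest(prefix string) []string {
	return idx[strings.ToLower(prefix)]
}

func main() {
	idx := buildIndex([]string{"yoga", "pilates", "crossfit", "crossfit kids"})
	fmt.Println(idx.Suggest("cro")) // prints: [crossfit crossfit kids]
	fmt.Println(idx.Suggest("zz")) // uncovered prefix: prints []
}
```

Serving such a subset from memory is what makes suggestions feel instant for popular queries while shielding the primary infrastructure from a large share of requests.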
Improving Wellhub’s Autocomplete Service Latency
Wellhub’s autocomplete service relies on a multi-regional architecture designed to maximize speed and responsiveness. With initial latencies reaching up to 600 ms, optimizations were essential to improve the user experience. By deploying on AWS and integrating Elasticsearch, we brought this delay down to acceptable levels.
Strategies Implemented
Several strategies were adopted to reduce latency. First, replicating data across regions brings information closer to users, minimizing transfer delays. Geo-localized queries complement this approach, improving the relevance of results based on the user's location. These methods were coupled with intelligent pre-loading of data subsets, enabling near-instant suggestions.
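A geo-localized query of the kind described above can be expressed with Elasticsearch's function_score and a Gaussian distance decay. The sketch below only builds the request body; the field names ("name.suggest", "location") and decay parameters are assumptions for illustration, not Wellhub's actual mapping:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildGeoQuery returns an Elasticsearch request body that matches the
// typed prefix and boosts documents near the user's coordinates.
func buildGeoQuery(prefix string, lat, lon float64) ([]byte, error) {
	body := map[string]any{
		"query": map[string]any{
			"function_score": map[string]any{
				"query": map[string]any{
					"match": map[string]any{
						// Assumed search-as-you-type subfield for prefix matching.
						"name.suggest": map[string]any{"query": prefix},
					},
				},
				"functions": []any{
					map[string]any{
						// Gaussian decay: full score within 5 km of the
						// user, roughly halved around 50 km away.
						"gauss": map[string]any{
							"location": map[string]any{
								"origin": map[string]any{"lat": lat, "lon": lon},
								"offset": "5km",
								"scale":  "50km",
							},
						},
					},
				},
				// Multiply text relevance by the distance decay.
				"boost_mode": "multiply",
			},
		},
	}
	return json.Marshal(body)
}

func main() {
	q, err := buildGeoQuery("yog", -23.55, -46.63)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(q))
}
```

Combining text relevance with a distance decay, rather than hard-filtering by radius, keeps strong textual matches visible even when they are slightly farther away.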
Results Achieved
After these improvements, the service's performance increased significantly: query latency dropped to about 123.3 ms on iOS and 134.6 ms on Android, making the system much more efficient. The architecture has also enabled better use of mobile networks, and the notable reduction in Wi-Fi usage suggests that users now rely on the service over mobile connections more than before.