Exploring Nacos: Application Scenarios for Alibaba’s Open Source Framework

This article continues Alibaba’s mini-series on Nacos, its open source naming and configuration service, by looking at the application scenarios the framework suits best, from storing sensitive configuration information to traffic management.

Alibaba’s open source Nacos platform has recently drawn interest from developers as an easy-to-use framework for service discovery, management, and configuration in cloud-native applications. Beyond the problems it can help to solve, potential users may wonder which specific scenarios it suits best, and the technical reasons why.

Drawing on insights from Alibaba’s He Xu, this article looks in detail at how Nacos can assist in configuration management, rate limiting and degradation, and dynamic dispatch to greatly simplify these scenarios for developers.

Before Nacos: Problems with Sensitive Information

He Xu recalls once being approached by a friend about a configuration problem that called for a highly involved solution, at a time when Nacos did not yet exist to help. His friend had a rapidly developing business with a growing team whose members were frequently coming and going, and needed a way to ensure data security and prevent leaks. More specifically, he had personally written the company’s service backend code at the time of its founding, and had stored the connection configuration for the backend and database directly in the project’s configuration file. He had used the Spring Boot framework, under which the configuration information was stored in application.properties, while Spring’s profile property ensured that each environment connected to its own database, as follows:

Production environment: application-prod.properties

Development environment: application-dev.properties

The test and pre-launch environments were handled in the same way. Because database connection information was stored directly in the configuration file and managed through Git along with the project code, there was a high risk of data leakage. A newcomer to research and development who had merely been granted permission to pull the code from Git would, in effect, also hold the credentials needed to operate directly on the online database.

Based on this predicament, He suggested several approaches to reducing data leakage. The first was to strip sensitive configurations like database connection information from the project. Next, he suggested adding IP whitelist connection restrictions on the database. In keeping with the principle of minimum permission, it would then make sense to configure only the necessary permissions for each account and to avoid high-risk operations like dropping tables or dropping databases.

Finally, He advised regularly modifying accounts and passwords in the database.

In retrospect, he acknowledges that these suggestions not only fell short of solving his friend’s problem, but also brought forth additional issues. For instance, the question remained of where to store sensitive configuration information once it had been stripped from the project. While it might seem possible to store it in an environment variable or a file on the application deployment machine, the risk of someone gaining access when logging in to troubleshoot problems would still remain.

In short, He’s friend still needed a way to minimize the risk of leakage throughout the entire development process. Applications may connect to more than one data source, and there are cases of sharding, heterogeneous data stores (such as MySQL/Redis/Elasticsearch), and many other sensitive configuration items. As the volume of configuration data grows, it becomes considerably harder to manage.

Additional questions remained of how to quickly and easily distribute regular modifications of database accounts and passwords to all applications. Because these are sensitive configurations, any changes in them present large risks, and need to be verified on a small number of machines to ensure they will not impact the service. Further, should the Canary Release function be considered when sending these to other machines, and how can operation quickly be rolled back if the application encounters an anomaly that affects the service after modified accounts and passwords are distributed? Should there be version control and fast rollback functions to this end? As not all employees engaged in development have permission to modify sensitive configuration information, should there be a permission management function? Given that any operations on sensitive configurations should be documented, should there be a change auditing function?

At the time, there was no one comprehensive solution to these issues to recommend.

Improving Sensitive Configuration Management with Nacos

As of July 2018, He could offer a friend with the same problems a better solution: using the Nacos configuration management module to store sensitive configuration information in Nacos.

One of the core functions of Nacos configuration management is protecting sensitive configurations. Nacos offers the functions needed for the scenarios discussed in the previous section: it distinguishes among different environments (such as development, testing, pre-release, and production) through namespaces, guarantees traceability of changes through version control, minimizes the impact of erroneous changes through quick rollback, and ensures smooth and safe configuration changes through Canary Release. Additionally, more comprehensive functions like permission management and change auditing are set to be supported in the near future.
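To make the combination of namespaces, version control, and rollback concrete, here is a minimal in-memory sketch. It is not the Nacos implementation or its API: the class VersionedConfigStore and its methods are hypothetical names chosen for illustration, and it only assumes that a configuration is identified by a namespace plus a dataId.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model for illustration; not the Nacos implementation or API.
class VersionedConfigStore {
    // Key is "namespace/dataId"; value is the full history, newest version last.
    private final Map<String, List<String>> history = new HashMap<>();

    private static String key(String namespace, String dataId) {
        return namespace + "/" + dataId;
    }

    // Publishing never overwrites: it appends a new version and returns its 1-based number.
    synchronized int publish(String namespace, String dataId, String content) {
        List<String> versions = history.computeIfAbsent(key(namespace, dataId), k -> new ArrayList<>());
        versions.add(content);
        return versions.size();
    }

    // Returns the latest version, or null if nothing has been published yet.
    synchronized String get(String namespace, String dataId) {
        List<String> versions = history.get(key(namespace, dataId));
        return (versions == null || versions.isEmpty()) ? null : versions.get(versions.size() - 1);
    }

    // Rollback re-publishes an older version, so the rollback itself stays traceable.
    synchronized int rollbackTo(String namespace, String dataId, int version) {
        List<String> versions = history.get(key(namespace, dataId));
        if (versions == null || version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("unknown version " + version);
        }
        return publish(namespace, dataId, versions.get(version - 1));
    }
}
```

In this sketch, publishing mysql.properties twice in a "prod" namespace and then calling rollbackTo with version 1 restores the first content as version 3, while the same dataId in a "dev" namespace is left untouched.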

With such a powerful framework, the next question is how to move sensitive configuration items out of configuration files and into Nacos. The following steps describe how to do so for the example of connecting to MySQL through Spring Boot.

1. Add the dependency

Note that Spring Boot 1.x uses the nacos-config-spring-boot-starter 0.1.x version, and Spring Boot 2.x uses the nacos-config-spring-boot-starter 0.2.x version.

2. Add the Nacos connection configuration in application.properties

While this is a simple example, in actual production the Nacos namespace information (distinguishing environments) and authentication information (such as AccessKey, SecretKey, and so on) also need to be configured. With ACM, the Alibaba Cloud product that corresponds to Nacos configuration management, the RAM role of an ECS instance can be used so that the AccessKey and SecretKey do not need to be filled in at all.

3. Add the annotation of @NacosPropertySource

4. Create a configuration with the dataId “mysql.properties” in the locally launched Nacos console

Using the above four steps, MySQL connection information from the original application.properties can be migrated to Nacos, allowing Nacos to manage sensitive configurations and thus reducing the risk of data leakage. Meanwhile, the powerful functions provided by Nacos configuration management like unified management and control, version control, and fast rollback also greatly enhance the convenience of operation and maintenance management.
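As a rough illustration of what ends up stored under the dataId mysql.properties and how an application might consume it, the sketch below parses properties-format text with the JDK alone. In the steps above this binding is done by the Nacos Spring Boot starter through @NacosPropertySource. The class MysqlConfigSketch and its methods are hypothetical, and the keys in the usage note are the usual Spring Boot datasource keys with placeholder values, not values from the original example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for illustration; the real binding is done by the Nacos starter.
class MysqlConfigSketch {
    // Parses properties-format text, such as the content stored under "mysql.properties".
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // A StringReader performs no real I/O, so this is not expected to happen.
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Fails fast when a required connection setting is missing or empty.
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("missing configuration item: " + key);
        }
        return value;
    }
}
```

For example, parsing content with the lines spring.datasource.url=jdbc:mysql://localhost:3306/test, spring.datasource.username=root, and spring.datasource.password=secret yields the three values through require, and none of them has to live in the Git repository.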

A complete coding example can be found at https://github.com/nacos-group/nacos-examples/tree/master/nacos-spring-boot-example/nacos-spring-boot-config-mysql-example.

Rate Limiting and Degradation with Nacos

Another key application of Nacos is in rate limiting and degradation, two key points to consider when developing high-concurrency systems and two major tools for protecting a system at runtime. The rate-limit threshold and the degradation switch are ultimately abstracted into individual configuration items. To adjust the threshold dynamically and to turn the switch on and off at runtime, these configuration items need to be stored somewhere that supports dynamic updates, and the Nacos configuration module is an ideal place.
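The idea of a degradation switch as a single configuration item can be sketched as follows. This is an illustration only, not Sentinel or Nacos code: the class DegradeSwitchSketch, the onConfigChange callback, and the key name recommend.enabled are all hypothetical, and the sketch assumes that some configuration client calls onConfigChange whenever the item changes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical switch for illustration; not part of Sentinel or Nacos.
class DegradeSwitchSketch {
    // true means the non-critical feature is enabled; flipped at runtime on config changes.
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    // Assumed callback, invoked with the new item content, e.g. "recommend.enabled=false".
    void onConfigChange(String content) {
        String value = content.substring(content.indexOf('=') + 1).trim();
        enabled.set(Boolean.parseBoolean(value));
    }

    // Runs the feature while enabled; otherwise returns the cheap fallback without calling it.
    <T> T call(Supplier<T> feature, T fallback) {
        return enabled.get() ? feature.get() : fallback;
    }
}
```

Because the switch is read on every call, a change pushed at runtime takes effect on the next request without a restart or redeployment.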

Since August 2018, Alibaba has made Sentinel open source. Sentinel is a powerful library from Alibaba’s middleware team that takes traffic as its entry point and protects the stability of services along multiple dimensions such as flow control, circuit breaking and degradation, and system load protection. Within the Alibaba Group, Nacos and Sentinel have been applied together for a number of years, with important uses in the 11.11 Global Shopping Festival and other online commerce experiences for consumers.

The following example with Sentinel flow control illustrates the principles of dynamic flow control during runtime in Nacos:

1. Add the dependency

2. Simulate concurrent requests

3. Configure the Nacos connection information, dataId, and so on, and set Nacos as the data source for Sentinel

4. Create a flow control configuration with com.alibaba.nacos.demo.flow.rule as the dataId in the locally launched Nacos console

5. Run NacosDynamicFlowDemo

The following standard output information will be shown:

6. Modify the newly created flow control configuration in the Nacos console and change the value of the current-limit threshold count to 1.0

The complete standard output information is as follows:

As this example shows, the core of implementing dynamic flow control with Nacos and Sentinel is the dynamic push capability of the Nacos configuration module. The principle is that sentinel-datasource-nacos integrates nacos-client, which maintains a connection with nacos-server. When the user changes a configuration in the Nacos console, nacos-server quickly pushes the latest content to nacos-client. Once it has obtained the latest flow control configuration, Sentinel updates its flow control policy, for example by adjusting the flow control threshold to 1.0, and thus limits the traffic flowing into the system’s service process.
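The push-then-apply principle described above can be sketched without either library. In the sketch below, a plain fixed-window counter stands in for Sentinel’s own statistics, and a direct method call stands in for the push from nacos-server. The class DynamicFlowLimiterSketch and its methods are hypothetical, and the only assumption about the rule content is that it is JSON containing a numeric count field.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical limiter for illustration; a stand-in for Sentinel, not its implementation.
class DynamicFlowLimiterSketch {
    // Extracts the numeric value of a "count" field from the pushed rule content.
    private static final Pattern COUNT =
            Pattern.compile("\"count\"\\s*:\\s*([0-9]+(?:\\.[0-9]+)?)");

    private double qpsLimit;         // current threshold, replaced on every push
    private long windowSecond = -1;  // the one-second window the counter belongs to
    private int passedInWindow;      // requests already let through in that window

    DynamicFlowLimiterSketch(double initialQpsLimit) {
        this.qpsLimit = initialQpsLimit;
    }

    // Assumed listener callback: receives the full new rule content after a change.
    synchronized void onConfigChange(String ruleJson) {
        Matcher m = COUNT.matcher(ruleJson);
        if (m.find()) {
            qpsLimit = Double.parseDouble(m.group(1));
        }
        // Content without a "count" field is ignored, keeping the previous threshold.
    }

    // Fixed-window check; the caller passes the clock so the behaviour is deterministic.
    synchronized boolean tryPass(long nowMillis) {
        long second = nowMillis / 1000;
        if (second != windowSecond) {
            windowSecond = second;
            passedInWindow = 0;
        }
        if (passedInWindow < qpsLimit) {
            passedInWindow++;
            return true;
        }
        return false;
    }
}
```

Starting with a threshold of 5.0, five calls within the same second pass and the sixth is rejected. After onConfigChange receives content whose count is 1.0, only one call per second passes, which mirrors the effect of editing the rule in the console.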

A complete coding example can be found at https://github.com/nacos-group/nacos-examples/tree/master/nacos-sentinel-example.

Dynamic Traffic Dispatch with Nacos

When a service grows to a certain scale and a single cluster can no longer carry all user requests, it becomes necessary to divert user traffic to different clusters. Going a step further, the clusters can be deployed in different regions, which not only relieves the pressure of service processing but also gives the system disaster tolerance.

Consider the example of an e-commerce platform with 100 million users whose traffic is divided by user ID. Requests from users with IDs 1 to 10,000,000 are distributed to cluster a in region A, requests from users with IDs 10,000,001 to 20,000,000 are distributed to cluster b in region B, and so on. In the end, request traffic for all users is spread across clusters in ten different regions, and each cluster keeps some redundant system resources. When an unavoidable disaster (such as an earthquake) hits the data center in region A, there must be a mechanism for dynamically dispatching traffic elsewhere. Ideally, traffic should be dispatched away from region A in a matter of seconds.

Nacos configuration management can enable this dispatching when the user ID shards and their routing rules are stored in Nacos and applied by components such as the unified access layer. In this way, traffic can be spread across all clusters, and the system is able to carry more traffic and better support service growth. Additionally, because they are stored in Nacos, the shards and rules can be reconfigured dynamically. This means that once a problem occurs in a particular region and the infrastructure cannot be restored in time, it is only necessary to modify the routing rules of the ID shards in the Nacos console. Once this is done, traffic in the affected region can be quickly switched to other available regions, leaving services almost completely intact. In Alibaba’s case, Nacos can push changes to on the order of 100,000 machines within seconds.
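A minimal sketch of such a routing table follows, assuming the rules arrive as text with one firstUserId=cluster pair per line. That format, the class ShardRouterSketch, and the onConfigChange callback are all hypothetical and chosen for illustration. They are not a Nacos API or the format used at Alibaba.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical router for illustration; the rule format is an assumption.
class ShardRouterSketch {
    // Maps the first user ID of each shard to the cluster that serves it.
    // The map is swapped as a whole, so readers never see a half-applied update.
    private volatile NavigableMap<Long, String> rules = new TreeMap<>();

    // Assumed callback: content holds one "firstUserId=cluster" pair per line.
    void onConfigChange(String content) {
        NavigableMap<Long, String> next = new TreeMap<>();
        for (String line : content.split("\n")) {
            line = line.trim();
            if (line.isEmpty()) continue;
            String[] parts = line.split("=", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("malformed rule: " + line);
            }
            next.put(Long.parseLong(parts[0].trim()), parts[1].trim());
        }
        rules = next;
    }

    // Picks the shard whose first ID is the greatest one not exceeding userId.
    String route(long userId) {
        Map.Entry<Long, String> shard = rules.floorEntry(userId);
        if (shard == null) {
            throw new IllegalArgumentException("no shard for user " + userId);
        }
        return shard.getValue();
    }
}
```

With the rules 1=cluster-a and 10000001=cluster-b, user 5,000,000 is routed to cluster-a. Pushing a new rule set in which the first line reads 1=cluster-b moves that user to cluster-b on the next request, which is the failover described above.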

Key Takeaways

Beyond the three scenarios discussed here in detail, Nacos is suited to other application scenarios, including algorithm adjustment in big data real-time computing, off-site disaster recovery across multiple active centers, dynamic push in business application scenarios, and many more that are discussed in detail for ACM, the Alibaba Cloud product corresponding to Nacos.

The Nacos configuration management module converges and controls sensitive configurations, greatly reducing the risk of data leakage. It also offers functions like dynamic push, version control, and fast rollback to ensure smooth, secure changes to sensitive configurations. Dynamic flow control with Nacos and Sentinel, as demonstrated in the rate limiting example, is a typical application scenario for Nacos configuration management, as is degradation: during the peak of the 11.11 Global Shopping Festival, Nacos’s dynamic push function makes it possible to shut down non-critical system components and turn them back on after the peak period. In short, Nacos offers an excellent management and control mechanism for virtually any scenario in which sensitive configuration and dynamic configuration are involved.

(Original article by Huang Xiaoyu黄晓禹)

Alibaba Tech

First hand and in-depth information about Alibaba’s latest technology → Facebook: “Alibaba Tech”. Twitter: “AlibabaTech”.
