Pipcook: an algorithm platform for front end development based on tfjs-node
Enable AI to improve the efficiency of development and minimize the burden of using machine learning
With the development of deep learning, intelligent technology has begun to empower all walks of life. Being closest to users, front-end developers also hope to use AI capabilities to improve efficiency, reduce labor costs, and create a better experience for users. Front-end intelligentization is widely seen as an important direction for the front end. However, when we actually talk with front-end engineers, we often hear questions like these:
- My business is very mature and my users are happy with it. Why do I need machine learning?
- It is said that machine learning needs massive amounts of data and manual labeling, so aren't traditional rules (if…else…) much easier?
- Doesn't AI require advanced mathematical knowledge?
- I heard that I need to learn Python or other languages, and I have no time for that.
Based on these questions, we wanted a solution that enables AI to improve front-end efficiency while minimizing the cost and burden of using machine learning. From this starting point, we came up with the idea of a JS framework that is friendly to front-end engineers: it should let them quickly collect and process data and run machine learning experiments without advanced knowledge of mathematics or deep learning, while remaining flexible and scalable enough to be usable at an industrial level. With these goals, we launched pipcook, a front-end algorithm engineering framework based on tfjs-node.
Through communication and research, we have summarized the main barriers that keep the front end out of the field of artificial intelligence:
- Algorithm barrier: the mathematical and algorithmic knowledge required is a huge challenge for the front end.
- Scenario barrier: there are relatively few obvious scenarios in the front-end field where intelligent technology can be applied, which weakens the motivation to enter this field. In other words, front-end developers cannot yet clearly define problems in terms of machine learning, so they do not consider this direction at all.
- Data barrier: acquiring high-quality data is a common problem across the whole intelligent field, and on top of that, the formats and specifications of existing datasets are not friendly to the front end.
We believe the problems above are what keep front-end engineers from moving towards intelligentization. Let's analyze in detail how to solve each of them.
As mentioned before, artificial intelligence has already empowered many industries, and there is no doubt that some web scenarios can benefit from it as well. However, in many cases non-machine-learning professionals cannot effectively identify which scenarios can be solved by machine learning. Due to the lack of deep knowledge of models and algorithms, many non-machine-learning developers are also unsure how well deep learning can solve a given problem, or whether the solution is better than a traditional expert (rule) engine. There are two ways to solve this problem:
- The first is to have front-end engineers deeply study model knowledge and understand the principles behind each algorithm, so they can judge which kind of problem can be solved by which technology. However, this approach is too costly for web developers, and they may not have much enthusiasm for it.
- The second is to summarize, inside our framework, a set of scenarios that may be encountered in front-end business, and classify them. This effectively forms a case library. Through these cases, web developers can easily find similar scenarios, get a better sense of which problems machine learning can solve, and finally apply these or similar cases to their own business in an intuitive way with a low learning cost.
We know that the core of deep learning is data and models. If the model is a rocket engine, then data is its fuel, and machine learning needs a large amount of high-quality fuel to give full play to its advantages. In fact, the front end has accumulated a lot of data over the years, and, being closest to users, it is also in a very advantageous position for data collection.
Let’s look back and see what data the front end has:
- UI data: design documents and UI libraries of varying quality.
- Code: front-end engineers write code every day.
- Logs: data collected after the business goes online (performance, errors, other custom metrics, etc.).
- Specific data owned by particular businesses.
These data can basically be classified as computer vision data plus a portion of text data, and CV/NLP are exactly the key problems machine learning aims to solve. The problem now is that, even with data in hand, front-end engineers often don't know how to process it so that it can become fuel for a model. Accordingly, our framework should provide fast and simple data processing, as well as convenient capabilities such as data quality assessment and data visualization.
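As a hypothetical illustration of the kind of data-quality helper such a framework could expose (the `Sample` shape and function name below are our own for illustration, not part of any released pipcook API), a simple label-distribution check might look like this:

```typescript
// Hypothetical sample shape: a piece of data paired with its label.
interface Sample {
  data: string;   // e.g. an image path or a text snippet
  label: string;  // e.g. "button", "icon", "navbar"
}

// Count how many samples each label has, so under-represented
// classes can be spotted before training starts.
function labelDistribution(samples: Sample[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const s of samples) {
    counts.set(s.label, (counts.get(s.label) ?? 0) + 1);
  }
  return counts;
}

const dataset: Sample[] = [
  { data: 'img/a.png', label: 'button' },
  { data: 'img/b.png', label: 'button' },
  { data: 'img/c.png', label: 'icon' },
];
const dist = labelDistribution(dataset);
console.log(dist.get('button')); // 2
```

A badly skewed distribution is usually the first thing to fix; the same walk over the samples could just as easily power a visualization.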
Machine Learning Algorithm
Perhaps for non-professional engineers, another huge obstacle is the model or algorithm itself. People feel confused: if I do not understand the mathematical principles of a model, and accordingly do not know how to use deep learning frameworks such as TensorFlow, what should I do? This problem is both easy and hard to solve.
It is easy because, at this stage, most fields have accumulated years of experience and have popular, mature, industrial-grade models. We only need to provide these implementations in the framework so they can be used out of the box, without any need to care about the internals. What is harder is that some people find such a model too much of a black box: they know some algorithm theory and want to make fine adjustments to the model. Therefore, the framework should also provide intervention and adjustment capabilities.
This problem is both simple and complex. Simply put, web developers just want to use JS, the language the front end is most familiar with, and we agree. Therefore, pipcook is developed in pure TypeScript and provides JS APIs; most plugins for data processing and models are implemented on top of tfjs-node. In terms of complexity, however, the JS machine learning ecosystem is not yet mature, and it is unrealistic to replicate the Python ecosystem in a short time. If we limited ourselves to JS only, the framework would inevitably be an incomplete system to some extent. Our solution is therefore to provide Python bridging for Node.js, similar to Swift's Python interoperability, so that Python libraries can be called from Node.js and the front end gets the whole ecosystem.
By addressing the points above, we have basically answered the questions of why to use machine learning, whether it can be used, and how to use it, each from the perspective of front-end engineers. As pipcook and the whole JS machine learning ecosystem gradually mature, we believe front-end engineers will step up to a new level of intelligence.
As shown in the figure, by answering the questions above about scenarios, algorithms, data, and languages, we designed a pipeline-based front-end machine learning framework. Models and data flow through this pipeline, into which plug-ins can be embedded to process them and pass the results downstream. Each plug-in has a clear division of labor and is responsible for one specific task of the machine learning cycle. By defining a series of specifications, pipcook ensures that third-party developers can write plug-ins to extend its capabilities. The framework performs machine learning training on tfjs, and can also tap into the Python ecosystem through Python bridging. Next, we introduce several key parts of this architecture.
Pipeline and plug-ins
Pipcook is a pipeline-based framework covering data collection, data access, data processing, model configuration, model training, model service deployment, and online training. Each step is handled by a specific plug-in, so every link in the chain can be customized, and the pipeline connects the plug-ins to implement algorithm engineering. The whole process runs on the Node.js technical stack, and plug-ins are managed and maintained in the NPM ecosystem. In particular, data and model service deployment can be deeply combined with the existing front-end technical system.
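The plug-in contract can be sketched in plain TypeScript. The interfaces and plug-in names below are illustrative, not the exact published pipcook plug-in API:

```typescript
// Each pipeline stage is a plug-in that transforms a shared context.
interface PipelineContext {
  dataset?: unknown;   // filled in by data plug-ins
  model?: unknown;     // filled in by model plug-ins
  [key: string]: unknown;
}

interface Plugin {
  name: string;                                        // e.g. 'data-collect', 'model-train'
  run(ctx: PipelineContext): Promise<PipelineContext>; // transform and pass downstream
}

// The pipeline simply runs plug-ins in order, feeding each
// stage's output context into the next stage.
async function runPipeline(
  plugins: Plugin[],
  ctx: PipelineContext = {}
): Promise<PipelineContext> {
  for (const plugin of plugins) {
    ctx = await plugin.run(ctx);
  }
  return ctx;
}

// A toy two-stage pipeline: collect "data", then "train" on it.
const collect: Plugin = {
  name: 'data-collect',
  run: async (ctx) => ({ ...ctx, dataset: [1, 2, 3] }),
};
const train: Plugin = {
  name: 'model-train',
  run: async (ctx) => ({
    ...ctx,
    model: `trained on ${(ctx.dataset as number[]).length} samples`,
  }),
};

runPipeline([collect, train]).then((ctx) => console.log(ctx.model));
```

Because each stage only sees and returns the shared context, any stage can be swapped for a third-party NPM package that honors the same contract.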
Data collection, access, and processing
Pipcook defines a set of dataset specifications. Plug-ins for data collection, access, and processing follow these specifications, which avoids the access and usage costs caused by inconsistent dataset standards and ensures that data can be shared between different pipelines. The protocol behind these data plug-ins supports generating standard, unified datasets from different labeling tools, and data-processing plug-ins greatly reduce the difficulty of understanding and optimizing a dataset.
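To make the idea concrete, an annotation for one image sample under such a specification might look like the following JSON. This is an illustrative shape in the spirit of PASCAL VOC-style annotations, not the literal pipcook dataset schema:

```json
{
  "annotation": {
    "filename": "button-01.png",
    "size": { "width": 128, "height": 64 },
    "object": {
      "name": "button",
      "bndbox": { "xmin": 0, "ymin": 0, "xmax": 128, "ymax": 64 }
    }
  }
}
```

With one shared shape like this, a dataset produced by any labeling tool can be consumed by any downstream pipeline without per-tool adapters.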
The underlying model and algorithm capabilities of pipcook are provided by tfjs-node, the Node.js version of TensorFlow, a well-known machine learning framework. The emergence of tfjs-node greatly improves the experience of doing machine learning in the JS language: it ships some official mature models (such as MobileNet), lets you build new models from basic operators, and its tensor operations make up for the fact that the JS platform has nothing like Python's numpy package.
In fact, as a brand-new JS-based machine learning engineering platform, pipcook is bound to have many imperfections. In pursuit of the larger goal of pushing the whole front-end industry towards intelligent development, we will keep improving this platform.
Currently, pipcook's built-in plugins support pipelines for image classification and object detection; the object-detection pipeline relies on Python's capabilities. In line with the principle of expanding the JS machine learning ecosystem, we still hope to develop models on native tfjs-node going forward. Pipcook will continue to add support for natural language processing and plugins for other popular deep learning tasks such as image segmentation. Of course, we also welcome third-party developers to contribute such models.
As datasets grow and models become more complex, computing power may become insufficient. In the future, we will add the ability to deploy models to multiple devices for training, with support for data parallelism, distributed parallelism, and asynchronous training, and for using clusters to solve computing-power problems.
Currently, pipcook only supports simple solutions such as local deployment. In the future, pipcook will integrate with various cloud services (such as Alibaba Cloud, AWS, and GCloud) to deploy directly to a cloud machine-learning deployment service within the pipeline, so that you can start the prediction service as soon as training completes.
In the future, we hope to combine the strength of Alibaba's internal front-end intelligence teams and the whole open-source community to continuously improve the front-end intelligence effort represented by pipcook: make front-end intelligent technology solutions broadly accessible, accumulate more competitive samples and models, and provide intelligent code generation services with higher accuracy and availability. We want to effectively improve front-end R&D efficiency, reduce simple repetitive work, and cut the overtime it causes, so that we can all focus on more challenging work!
How to contribute?
If you are interested in our project and want to contribute to front-end intelligence, you are welcome at our GitHub open-source repository: https://github.com/alibaba/pipcook