Intelligent Data Management System (ART/233CP)

ART/233CP
Platform
07 / 07 / 2017 - 06 / 05 / 2019
11,765

Dr Vincent M K LAU

The deliverables comprise (1) Developer Services, (2) Machine Learning and (3) Distributed Systems.

1. Developer Service:
1a. Research and design documentation for the ML web framework, catering for the Chinese handwriting paper forms and document management systems used by different organizations.
1b. Web-based ML setup portal for cloud and distributed devices, easing backend and frontend deployment for paper form processing and improving the chance of commercialization.
1c. Data visualization modules for report generation and for visualizing data flow, training status, model definition, knowledge tree and accuracy, enabling users to set strategy and improve accuracy for specific contexts and industrial applications.
1d. Sample code and projects for App and Web, providing the building blocks for Chinese handwriting paper form processing and improving the chance of commercialization.

2. Machine Learning:
2a. Research and design documentation for ML models and algorithms that represent ML checkpoints in timelines and branches, with incremental learning from samples collected through the distributed system, catering for Chinese recognition in specific contexts and domain semantics.
2b. Detailed design of models and algorithms to improve the accuracy of Chinese recognition in specific contexts and domain semantics, and to create classifiers for different parts of forms and fields of documents.
2c. Development of models and algorithms, including heterogeneous software on various platforms, with consideration of distributed systems.
2d. Integration of previous ML engines from seed projects for Chinese Handwriting Recognition and Signature, plus a framework for future integration of multiple ML engines and OCR for non-Chinese text.

3. Distributed Systems:
3a. Design documentation of the core engine/framework, including the enhancement of the ML engine for knowledge topology and the handling of Chinese handwriting images in different contexts.
3b. Data and Machine Learning framework integration, for incremental machine learning to handle more Chinese handwriting samples from a large number of frontend devices and users.
3c. Device management system, allowing devices to upgrade automatically under centralized management and allowing human intelligence to take part in verification and correction, generating more training samples and feedback.
3d. Agent management system for collecting labeled samples in different contexts, taking into account the knowledge topology from a large number of frontend devices and users.

4. CS deliverable (for Broadlearning):
4a. Reference design of a document and database management system with Chinese handwriting support for the financial, education and other industries, including frontends on various portable devices and computers and backends in the cloud or on dedicated servers. The reference design also provides an SDK and an interface to commercial systems, and will handle at least 3 kinds of paper forms: (1) change of address, (2) application for account and (3) termination of account.

BROADLEARNING EDUCATION (ASIA) LIMITED
Comba Telecom Limited
Innodimension Company Limited
Konica Minolta Business Solution (HK) Ltd.


Today, most Chinese handwriting recognition engines are relatively static, and users must repeat a correction even when the same handwriting appears again. Companies also use a wide range of devices (smartphones, tablets, desktop and laptop computers). In a previous ASTRI seed project, we developed a proof-of-concept system that applies Machine Learning (ML) to Chinese handwriting images. It attracted the interest of several local companies, which support this full platform project to develop the technologies further, with the aim of improving the efficiency of paper document processing. This full platform project, “Intelligent Data Management System” (IDMS), will develop a paper document processing system with incremental learning for Chinese handwriting image recognition. It applies Human Intelligence (HI) interaction on a distributed system: human verification results from different frontend systems become new training samples for ML, improving the accuracy of Chinese handwriting recognition. The platform is scalable and flexible for SMEs and enterprises: its Developer Service offers an open Software Development Kit (SDK) and sample source code, so that companies can develop specific applications with workflow features using fewer resources. Our framework also improves the ML with knowledge topology, enabling accurate results in different domains. In addition, the platform technology and framework can be applied to other industries and to other data sources generated by smart systems, such as those in healthcare and smart city applications.
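
The feedback loop described above, in which a human-verified result becomes a new training sample, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class `IncrementalCentroidClassifier`, the function `process_field`, the two-dimensional feature vectors and the `verify` callback are hypothetical and are not taken from the IDMS project, whose actual models are not described in this summary. A toy nearest-centroid classifier stands in for the real recognition engine.

```python
from collections import defaultdict


class IncrementalCentroidClassifier:
    """Toy nearest-centroid classifier that learns one sample at a time.

    Hypothetical stand-in for a recognition engine; not the IDMS model.
    """

    def __init__(self):
        self.sums = {}                  # label -> running per-dimension feature sum
        self.counts = defaultdict(int)  # label -> number of samples seen

    def learn(self, features, label):
        """Fold one labeled sample into the running centroid of its label."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the label whose centroid is closest, or None if untrained."""
        if not self.sums:
            return None

        def squared_distance(label):
            n = self.counts[label]
            return sum((s / n - f) ** 2
                       for s, f in zip(self.sums[label], features))

        return min(self.sums, key=squared_distance)


def process_field(model, features, verify):
    """Recognize one field, let a human verify it, then learn from the result.

    `verify` is a hypothetical callback standing in for the human reviewer:
    it receives the engine's guess and returns the confirmed label.
    """
    guess = model.predict(features)
    confirmed = verify(guess)
    model.learn(features, confirmed)  # the verified result becomes a training sample
    return guess, confirmed


# Seed the model with one sample per character (made-up feature vectors).
model = IncrementalCentroidClassifier()
model.learn([0.9, 0.1], "陳")
model.learn([0.1, 0.9], "林")

# A new writer's "林" initially looks closer to "陳"; the reviewer corrects it once.
sample = [0.6, 0.4]
first_guess, confirmed = process_field(model, sample, verify=lambda guess: "林")
second_guess = model.predict(sample)
```

In this sketch the first guess for the new writer's sample is wrong, and after a single correction the same handwriting is recognized without further intervention, which is the behavior the abstract contrasts with static engines. In a distributed deployment the `learn` calls would presumably be fed by verification results gathered from many frontend devices, but that part is not modeled here.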