Intelligent Data Management System (ART/233CP)

ART/233CP
Platform
07 / 07 / 2017 - 06 / 05 / 2019
11,765

Dr. 劉文建

The deliverables comprise (1) Developer Services, (2) Machine Learning and (3) Distributed Systems.

1. Developer Services
1a. Research and design documentation for an ML web framework that caters for the Chinese handwriting paper forms and document management systems used by different organizations.
1b. A web-based ML setup portal for cloud and distributed devices, to ease backend and frontend deployment for paper form processing and improve the chances of commercialization.
1c. Data visualization modules for report generation and for visualizing data flow, training status, model definition, knowledge tree and accuracy, so that users can set strategy and improve accuracy for specific contexts and industrial applications.
1d. Sample code and projects for App and Web that serve as building blocks for Chinese handwriting paper form processing and improve the chances of commercialization.

2. Machine Learning
2a. Research and design documentation for ML models and algorithms that represent ML checkpoints in timelines and branches, learn incrementally from samples collected through the distributed system, and cater for Chinese recognition in specific contexts and domain semantics.
2b. Detailed design of models and algorithms to improve the accuracy of Chinese recognition in specific contexts and domain semantics, and to create classifiers for different parts of a form and fields of a document.
2c. Development of models and algorithms, including heterogeneous software on various platforms, with distributed systems taken into consideration.
2d. Integration of the previous ML engines from seed projects for Chinese handwriting recognition and signatures, together with a framework for future integration of multiple ML engines and of OCR for non-Chinese text.

3. Distributed Systems
3a. Design documentation of the core engine/framework, including enhancement of the ML engine for knowledge topology and for handling Chinese handwriting images in different contexts.
3b. Integration of data and machine learning frameworks, so that incremental machine learning can handle more Chinese handwriting samples from a large number of frontend devices and users.
3c. A device management system that lets devices upgrade automatically under centralized management and brings human intelligence into verification and correction, generating more training samples and feedback.
3d. An agent management system for collecting labeled samples in different contexts from a large number of frontend devices and users, taking the knowledge topology into account.

4. CS deliverable (for Broadlearning)
4a. Reference design of a document and database management system with Chinese handwriting support for the financial, education and other industries, including frontends on various portable devices and computers and backends in the cloud or on dedicated servers. The reference design also provides an SDK and an interface to commercial systems, and will handle at least three kinds of paper forms: (1) change of address, (2) application for an account and (3) termination of an account.

博文教育(亞洲) 有限公司
京信通信有限公司
創次元有限公司
柯尼卡美能達商業系統(香港)有限公司


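Today, most Chinese handwriting image recognition engines are relatively static: users have to repeat the same corrections even for identical Chinese handwriting. In addition, companies use a wide variety of devices (smartphones, tablets, desktop computers and notebook computers).

In a previous ASTRI seed project, we developed a proof-of-concept system that applies machine learning (ML) to Chinese handwriting image recognition. It attracted several local companies to support this platform project, in the hope of further developing technology that processes paper documents more efficiently.

In this platform project, the Intelligent Data Management System (IDMS), we will develop a document processing system whose Chinese handwriting image recognition improves continuously through the interaction of human intelligence (HI) with distributed systems: results verified by users on different frontend systems become new training samples for ML, raising the accuracy of Chinese handwriting recognition. The platform is scalable and flexible, and suits small and medium-sized enterprises with few developers. We also provide an open software development kit (SDK) and sample source code, so that applications with workflows can be developed with fewer resources. Our framework further uses knowledge topology to improve ML and achieve accurate results in different domains.

Moreover, the platform's technologies and framework can be extended to other industries and other data sources, with data generated by different smart devices, in areas such as healthcare and smart cities.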
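
The feedback loop described above, in which human-verified results from frontend devices become new training samples, can be illustrated with a small Python sketch. This is not the project's implementation: `VerifiedSample`, `ToyRecognizer` and `feedback_round` are hypothetical names, the `features` string merely stands in for real image data, and the "model" is a simple vote counter rather than an ML engine. It only shows how a correction made once can stop the same error from recurring in the same context.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedSample:
    """One human-verified recognition result (all fields are hypothetical)."""
    image_id: str   # identifier of the handwriting image crop
    features: str   # stand-in for image features (a real system would use pixels or embeddings)
    predicted: str  # what the recognizer originally output
    corrected: str  # what the human verifier confirmed or typed
    context: str    # form context, e.g. "address" or "name"


class ToyRecognizer:
    """Toy stand-in for an ML engine: keeps label votes per (context, features)."""

    def __init__(self):
        self._votes = defaultdict(Counter)

    def predict(self, features, context):
        """Return the most frequently verified label, or None if nothing was learned yet."""
        votes = self._votes.get((context, features))
        if not votes:
            return None
        return votes.most_common(1)[0][0]

    def learn(self, samples):
        """Incrementally add verified labels; earlier votes are kept."""
        for s in samples:
            self._votes[(s.context, s.features)][s.corrected] += 1


def feedback_round(recognizer, verified_batch):
    """Feed one batch of verified samples back into the recognizer.

    Returns the number of samples in which the human changed the prediction.
    """
    corrections = sum(1 for s in verified_batch if s.predicted != s.corrected)
    recognizer.learn(verified_batch)
    return corrections
```

In this sketch the context is part of the lookup key, so a correction verified in an "address" field does not affect predictions in a "name" field. That is a deliberately crude stand-in for the context-specific recognition the project describes.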