Natural Language Processing

  • Natural language processing (NLP) is an important area of computer science and artificial intelligence. ASTRI has built core competence and technologies in NLP and applied them to a variety of applications.


    Technology Highlights

    1. Chinese Natural Language Processor

    In Hong Kong, it is common to write in standard Chinese mixed with English, emojis, local slang, colloquial terms, domain jargon, and intentionally or unintentionally ‘mis-written terms’. This distinctive mix of linguistic expressions makes it difficult to analyse the language and the mood and emotions it conveys. ASTRI has developed natural language processing (NLP) technology with an embedded sentence parser designed for colloquial Cantonese to extract the full meaning of such written passages. It also includes a built-in colloquial Cantonese vocabulary database, a large set of synonyms, and confusion sets of words that are similar in sound or in form.
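
    As a rough illustration of how confusion sets can support the correction of mis-written terms, the Python sketch below swaps each character of an unknown word for its confusable alternatives and keeps the variants found in a vocabulary. The entries in CONFUSION_SETS and VOCABULARY, and the function names build_index and suggest, are hypothetical examples and are not taken from ASTRI's database or implementation.

```python
from collections import defaultdict

# Hypothetical confusion sets: characters that are easily mistaken for one another.
CONFUSION_SETS = [
    {"己", "已", "巳"},  # similar in form
    {"在", "再"},        # similar in sound
]

# Hypothetical vocabulary standing in for a real word database.
VOCABULARY = {"已經", "自己", "再見", "現在"}


def build_index(confusion_sets):
    """Map each character to the other characters it may be confused with."""
    index = defaultdict(set)
    for group in confusion_sets:
        for ch in group:
            index[ch] |= group - {ch}
    return index


def suggest(word, index, vocabulary):
    """Return vocabulary words reachable by replacing one confusable character."""
    if word in vocabulary:
        return []  # already a known word, nothing to correct
    suggestions = []
    for i, ch in enumerate(word):
        for alt in sorted(index.get(ch, ())):
            candidate = word[:i] + alt + word[i + 1:]
            if candidate in vocabulary:
                suggestions.append(candidate)
    return suggestions


INDEX = build_index(CONFUSION_SETS)
```

    For example, suggest("己經", INDEX, VOCABULARY) proposes "已經". A production system would also need context, since a confusable character can itself form a valid word.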

    This technology can be applied to big data applications in social media analytics, including mood/sentiment analysis, topic analysis and contextual analysis. It can also be used to help teachers assess primary students’ Chinese writing and suggest corrections for common mistakes.

    When the NLP technology is connected to a voice recognition engine, it can be developed into a chatbot or other voice-interactive tools.



    2. Multi-lingual address recognition for smartphones

    The ASTRI team has developed an address recognition system, implemented with regular expressions, that works with short messages in English, French, German, Italian, Spanish, and Portuguese on mobile phones. With this technology, a mobile phone can recognize addresses inside the short messages it receives. The accuracy and recall rates are both above 85%, verified on hundreds of thousands of annotated sentences in each language. The user can then simply tap a recognized address to look it up on a map.
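
    To give a feel for regular-expression-based address recognition, here is a minimal Python sketch for English-style street addresses only. The pattern, the list of street types, and the function name find_addresses are assumptions made for illustration. They do not reproduce ASTRI's expressions, which cover six languages and many more address forms.

```python
import re

# Hypothetical, deliberately small list of English street-type words.
STREET_TYPES = r"(?:Street|St|Road|Rd|Avenue|Ave|Boulevard|Blvd|Lane|Ln|Drive|Dr)"

# House number, one to three capitalised name words, then a street type.
ADDRESS_RE = re.compile(
    rf"\b\d{{1,5}}\s+(?:[A-Z][a-z]+\s+){{1,3}}{STREET_TYPES}\b"
)


def find_addresses(message):
    """Return (start, end, text) for each address-like span in a message."""
    return [(m.start(), m.end(), m.group()) for m in ADDRESS_RE.finditer(message)]
```

    The returned character offsets are what a messaging app would need in order to turn the matched span into a tappable link that opens a map search.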