Natural Language Processing

Natural language processing (NLP) is an important area of computer science and artificial intelligence. ASTRI has built core competence and technologies in NLP and has applied them in a variety of applications.


Technology Highlights

I. Artificial Intelligence Chatbot for Cantonese/English Mixed Languages

Many chatbots are already on the market. ASTRI chatbots differ from them in the following ways:

  1. ASTRI chatbots can handle voice input that mixes Cantonese and English. Speaking Cantonese interspersed with English words is a common practice that is also prevalent in the Greater Bay Area. The chatbots can also recognise speech with a slight accent.
  2. Chinese has many words that sound the same and many characters that look alike, which can cause considerable confusion. ASTRI’s natural language processor can correct these common mistakes based on the context of the sentence.
  3. Chatbots usually find answers based solely on keywords. ASTRI chatbots extract the meaning of the full sentence in addition to the keywords found, so sentences containing negation or ambiguity can be handled.
  4. ASTRI chatbots can learn new phrases and new answers through a learning process.
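
As a toy illustration of why keyword-only matching fails on negated sentences, the sketch below compares a plain keyword matcher with one that also checks for a nearby negation word. It is not ASTRI's method: the intent names, keyword sets, negation list and window size are all hypothetical, and a real system would rely on full sentence understanding rather than this heuristic.

```python
import re

# Hypothetical intent keywords for illustration; not ASTRI's actual vocabulary.
INTENT_KEYWORDS = {
    "open_account": {"open", "account"},
    "close_account": {"close", "account"},
}
# Hypothetical list of negation cues.
NEGATION_WORDS = {"not", "no", "never", "don't", "dont", "cannot", "can't"}


def tokenize(sentence):
    # Lowercase word tokens, keeping apostrophes so "don't" stays one token.
    return re.findall(r"[a-z']+", sentence.lower())


def keyword_only_intent(sentence):
    # Return the first intent whose keywords all appear in the sentence.
    tokens = set(tokenize(sentence))
    for intent, keywords in INTENT_KEYWORDS.items():
        if keywords <= tokens:
            return intent
    return None


def negation_aware_intent(sentence, window=3):
    # Return (intent, negated). The intent counts as negated when a negation
    # word occurs within `window` tokens before one of its keywords.
    tokens = tokenize(sentence)
    intent = keyword_only_intent(sentence)
    if intent is None:
        return None, False
    for i, tok in enumerate(tokens):
        if tok in INTENT_KEYWORDS[intent]:
            preceding = tokens[max(0, i - window):i]
            if any(t in NEGATION_WORDS for t in preceding):
                return intent, True
    return intent, False
```

With these assumptions, the keyword-only matcher treats "I do not want to close my account" as a request to close the account, while the negation-aware version flags the same intent as negated so the chatbot can respond differently.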


Chatbots have a wide range of applications, from smart homes and smart cities to AI telephone customer service. They can also categorise the questions raised in emails. ASTRI is working with the financial, insurance, banking and retail industries to develop chatbots.


Chatbot for general banking inquiry



II. Multilingual Address Recognition for Smartphones

The ASTRI team has developed an address recognition system that uses regular expressions to detect addresses in English, French, German, Italian, Spanish and Portuguese short messages on mobile phones. With this technology, a mobile phone can recognise addresses in the short messages it receives. Accuracy and recall are both above 85%, verified on hundreds of thousands of annotated sentences in each language. Users can then simply tap an address to search for it on a map.
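
The general idea of regular-expression address detection can be sketched as follows. This is a minimal illustration for English only, assuming a simple "house number, street name, street type" layout; the pattern, the street-type list and the function name `find_addresses` are hypothetical and far narrower than a production system, and other languages would need their own patterns.

```python
import re

# Hypothetical list of English street-type words (longer forms listed first).
STREET_TYPES = r"(?:Street|St|Road|Rd|Avenue|Ave|Lane|Ln|Drive|Dr|Boulevard|Blvd)"

# House number, one to four capitalised words, then a street-type word.
EN_ADDRESS = re.compile(
    r"\b\d{1,5}\s+(?:[A-Z][a-z]+\s+){1,4}" + STREET_TYPES + r"\b"
)


def find_addresses(message):
    # Return (start, end, text) for each address-like span in the message,
    # so a user interface could make that span tappable.
    return [(m.start(), m.end(), m.group()) for m in EN_ADDRESS.finditer(message)]


if __name__ == "__main__":
    sms = "Meet me at 221 Baker Street tomorrow at 5."
    for start, end, text in find_addresses(sms):
        print(start, end, text)
```

Returning character offsets rather than only the matched text lets the messaging application highlight the exact span and pass it to a map search when the user taps it.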