At the 2023 Artificial Intelligence Day event, themed "AI - Rebuilding Reality" and held on December 5 and 6, VinAI Artificial Intelligence Research and Application Company (Vingroup) announced PhoGPT, its open-source research project on a large language model built specifically for Vietnamese.
Unlike proprietary software such as OpenAI's ChatGPT, PhoGPT is open source. As a result, it carries no commercial restrictions: any party can use PhoGPT to build its own applications, including commercial ones. In effect, it serves as a platform for the domestic community developing AI applications.
According to Dr. Bui Hai Hung, General Director of VinAI Artificial Intelligence Research and Application Company, existing Vietnamese language models have not achieved optimal performance and lack an open-source codebase. One of the urgent tasks facing the AI community in general, and the natural language processing (NLP) community in particular, is therefore to build a new, more powerful model that can process Vietnamese with high accuracy and efficiency.
According to AI experts, PhoGPT is a large language model with 7.5 billion parameters, built on the Transformer decoder architecture. It was trained from scratch using advanced techniques such as FlashAttention and ALiBi, which allows a model to extrapolate to longer context lengths.
These techniques not only help the model gain a deeper understanding of context but also enhance PhoGPT's ability to engage in natural dialogue and interaction. This makes the model a versatile and flexible tool capable of meeting the diverse language needs of its users.
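The article does not describe how PhoGPT implements these techniques. As a rough illustration only, ALiBi as generally described replaces positional embeddings with a per-head linear penalty that is added to the attention scores and grows with the distance between the query and key tokens. The sketch below computes that penalty in plain Python; the function names `alibi_slopes` and `alibi_bias` are hypothetical, and it assumes a power-of-two number of attention heads. It is not taken from PhoGPT's code.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumption for this sketch: num_heads is a power of two.
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch handles power-of-two head counts only")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]


def alibi_bias(num_heads, seq_len):
    """Return bias[h][i][j], the value added to head h's attention score
    for query position i and key position j.

    Past and current positions (j <= i) get slope * (j - i), which is <= 0,
    so more distant tokens are penalized more. Future positions (j > i)
    get -inf, which acts as the causal mask of a decoder.
    """
    slopes = alibi_slopes(num_heads)
    return [
        [
            [m * (j - i) if j <= i else -math.inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for m in slopes
    ]


if __name__ == "__main__":
    # Print the bias matrix of the first head for a 4-token sequence.
    for row in alibi_bias(num_heads=8, seq_len=4)[0]:
        print(row)
```

Because the penalty depends only on relative distance, the same formula applies to sequences longer than those seen in training, which is the intuition behind the context-length extrapolation mentioned above.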
Dr. Bui Hai Hung added that the company developed PhoGPT from scratch, independently of all other models worldwide. Because the model is open source, the community in Vietnam can use it and improve it further. Making the PhoGPT source code publicly available and readily accessible creates an environment and a community in which users can develop unique, customized applications.
One of the goals of open source is to create a platform so that people do not have to duplicate work that has already been done, allowing organizations to build on the PhoGPT large language model. This will help establish a quality open-source community around Vietnamese large language models, with a positive ripple effect that lets many companies take part and apply the model in various fields. VinAI Artificial Intelligence Research and Application Company said it plans to use PhoGPT to research and develop Vietnamese-language applications for individual users and comprehensive support solutions for businesses, in fields such as healthcare and education.
PhoGPT lays the first foundations for developing high-performance Vietnamese language models and serves as a basis for practical, effective applications in line with the Government's AI development strategy through 2030.
BA TAN