Large Language Model
MathGPT: Leveraging Llama 2 to create a platform for highly personalized learning
April 15, 2024

Breakthroughs in AI have the potential to benefit the global community—if more people have the education and access to leverage them. As STEM education remains a critical factor in economic mobility and innovation, Mathpresso is doing its part with its learning platform QANDA, which has made personalized learning content accessible to people across 50 countries. When the Seoul-based company set out to create a highly accurate math-specific LLM with leading Korean AI startup Upstage, Meta’s open-source model Llama 2 was a fitting choice.

“Our primary focus has been mathematics,” says Mathpresso Co-Founder Jake Yongjae Lee, who adds that QANDA is best suited to subjects that are typically taught with a structured curriculum, like lectures and practice problem sets. However, he says, “Commercial LLMs like ChatGPT lack the customization needed for the complex education landscape.”

According to Lee, student learning is influenced by hyperlocal factors including curriculum, school district, exam trends, teaching style, and more. Through that complexity, Llama 2 is helping Mathpresso envision a world in which everyone has access to quality education.

With an open-source model like Llama 2, the team could create flexible, domain-specific educational products while fully utilizing their own specialized data and techniques. The result is MathGPT, an LLM with powerful, accurate math capabilities that uses Llama 2 as a base model. Upstage handles the model’s engine and fine-tuning, while Mathpresso and QANDA contribute specialized mathematical data for model learning.

Earlier this year, MathGPT set a new world record in benchmarks assessing mathematical performance at both the primary and secondary school level.

Helping students deepen math comprehension with MathGPT

Mathpresso started by training smaller models and gradually moved on to larger ones to test their performance in mathematics. In the process, the team used data generated from the larger models to train other models, finding that methodologies applied to the smaller models also worked effectively in the larger models. Today, the team is using Llama 2 to build a fine-tuned model based on QANDA’s own mathematical data, generating training data through supervised fine-tuning and data augmentation.

What sets MathGPT apart is its focus on fostering students’ comprehension of the solution process rather than merely providing answers to math problems. As a result, it provides detailed explanations that are also broken down into step-by-step procedures, helping cultivate a deeper level of understanding compared to typical explanations.

The Mathpresso team conducts full fine-tuning using data collected from the QANDA platform. Since that data usually exists as pairs of problems and their solutions, the training process involves presenting the model with a question and training it to generate a correct answer. This data is proprietary to QANDA, so the team concluded that using an open-source model was preferable to a closed-source or hosted model, because it allowed them to retain control of their own data.
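As a rough illustration of training on problem and solution pairs, the sketch below turns one pair into a supervised fine-tuning example. It is not Mathpresso's or Upstage's actual pipeline: the prompt template, the `ToyTokenizer` class, and the `build_example` helper are all hypothetical, and masking the prompt tokens out of the loss is a common convention assumed here rather than something the team has described.

```python
from dataclasses import dataclass
from typing import Dict, List

# Label value that PyTorch's cross-entropy loss skips by default (ignore_index=-100).
IGNORE_INDEX = -100


@dataclass
class Example:
    input_ids: List[int]  # prompt tokens followed by solution tokens
    labels: List[int]     # IGNORE_INDEX over the prompt, token ids over the solution


class ToyTokenizer:
    """Hypothetical whitespace tokenizer standing in for a real subword tokenizer."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def encode(self, text: str) -> List[int]:
        # Assign each previously unseen token the next free integer id.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in text.split()]


def build_example(problem: str, solution: str, tokenizer: ToyTokenizer) -> Example:
    """Format one (problem, solution) pair so only the solution contributes to the loss."""
    # Assumed prompt template; the real template is not given in the article.
    prompt_ids = tokenizer.encode(f"Problem: {problem}\nSolution:")
    answer_ids = tokenizer.encode(solution)
    return Example(
        input_ids=prompt_ids + answer_ids,
        labels=[IGNORE_INDEX] * len(prompt_ids) + answer_ids,
    )


tokenizer = ToyTokenizer()
example = build_example(
    problem="Solve 2x + 3 = 7.",
    solution="Subtract 3 from both sides: 2x = 4. Divide by 2: x = 2.",
    tokenizer=tokenizer,
)
prompt_len = example.labels.count(IGNORE_INDEX)
```

With labels built this way, a training loop would compute the loss only on the step-by-step solution, so the model is trained to produce the worked answer rather than to reproduce the question.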

Mathpresso needed a model with exceptional capability in interpreting mathematical expressions. To that end, the team selectively incorporated data specialized in mathematical expressions, which strengthened Llama 2’s handling of expressions written in LaTeX.

Outperforming on the Open LLM leaderboard

For Upstage, the Llama journey began with its pursuit of a versatile language model that could excel in English and other languages, like Korean, and seamlessly adapt to various company needs. To measure progress, it targeted the top spot on HuggingFace’s Open LLM Leaderboard, aiming to surpass GPT-3.5’s benchmark score. After considering BERT-based models, Upstage discovered academic papers demonstrating that Llama 2 offered higher benchmark performance.

“To create a champion language model for the Open LLM Leaderboard, we needed a strong starting point; that’s where Llama 2 came in,” says Upstage CEO Sung Kim. “As a top performer and the go-to choice in the open-source LLM world, Llama 2 was the perfect foundation for our project.”

The company first used Llama 2 for fine-tuning to compete on the leaderboard, which involved adjusting an existing model to excel on that benchmark. Its Llama2-70b model successfully moved up to the number one position, making Upstage the first company globally to outperform GPT-3.5 on the Open LLM Leaderboard.

Next, Upstage leveraged a smaller version, Llama2-7b, for research on Korean language support and to develop its own foundation model. This allowed for the exploration of Korean capabilities and the building of a customized base model. The company adopted the Llama 2 architecture as its default because of its widespread support in open-source libraries.

Since then, the company’s work with Mathpresso, which was part of a strategic partnership with telecom giant KT, resulted in the MathGPT record. Upstage also developed its first pre-trained LLM, SOLAR-10.7B (short for Specialized and Optimized LLM and Applications with Reliability), which also topped the Open LLM Leaderboard last December. Compared to larger models with hundreds of billions of parameters, Solar is a lightweight model with fewer than 20 billion parameters. Because it uses a smaller training dataset, the model can run inference at lower costs and about 2.5 times faster than GPT-3.5.

“We wouldn’t have been able to achieve this rapid rise if Llama 2 hadn’t been released as an open-source model,” says Kim. “Our story exemplifies the power of open source for rising generative AI startups.”

Llama 2’s real-world open-source impact on education

For Mathpresso, making 1:1 personalized education available for everyone through an AI tutor has been a long-standing goal.

“Through the QANDA platform, we have been able to meticulously collect and digitize unique data on each student’s learning path and needs,” says Lee. “With open-source models like Llama 2, we have the flexibility to create affordable educational tools that leverage our unique insights, helping students worldwide to reach their fullest potential.”

Both Mathpresso and Upstage believe that open-source models like Llama 2 can profoundly impact companies, large and small.

“Access to cutting-edge open-source tools and libraries can level the playing field,” says Kim, “enabling organizations to leverage advanced technologies and methodologies that may otherwise be out of reach.”

