The Chinese startup, which roiled markets with its low-cost reasoning model that emerged in January, collaborated with researchers from the Beijing institution on a paper detailing a novel approach to reinforcement learning to make models more efficient.
The new method aims to help artificial intelligence models better adhere to human preferences by offering rewards for more accurate and understandable responses, the researchers wrote. Reinforcement learning has proven effective in speeding up AI tasks in narrow applications and spheres. However, ...