The Allen Institute for Artificial Intelligence (AI2) announced today it has created an artificial intelligence (AI) system that can solve SAT geometry questions as well as the average American 11th-grade student, a breakthrough in AI research. This system, called GeoS, uses a combination of computer vision to interpret diagrams, natural language processing to read and understand text, and a geometric solver to achieve 49 percent accuracy on geometry questions from the official SAT tests. If these results were extrapolated to the entire Math SAT test, the system would score roughly 500 (out of 800), the average test score for 2015.
These results, presented at the Conference on Empirical Methods in Natural Language Processing (EMNLP) in Lisbon, Portugal, were achieved by GeoS solving unaltered SAT questions that it had never seen before and that required an understanding of implicit relationships, ambiguous references, and the relationships between diagrams and natural-language text.
A demonstration of the system’s problem solving is available here: geometry.allenai.org
“Unlike the Turing Test, standardized tests such as the SAT provide us today with a way to measure a machine’s ability to reason and to compare its abilities with that of a human,” said Oren Etzioni, CEO of AI2. “Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate. Creating a system to be able to successfully take these tests is challenging, and we are proud to achieve these unprecedented results.”
Said Ali Farhadi, senior research manager for Vision at AI2 and assistant professor of computer science and engineering at UW, “We are excited about GeoS’s performance on real-world tasks. Our biggest challenge was converting the question to a computer-understandable language. One needs to go beyond standard pattern-matching approaches for problems like solving geometry questions that require in-depth understanding of text, diagram, and reasoning.”
Source: AI system solves SAT geometry questions as well as average human test taker