{"id":1659,"date":"2025-01-25T01:31:17","date_gmt":"2025-01-25T01:31:17","guid":{"rendered":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/"},"modified":"2025-01-25T01:31:17","modified_gmt":"2025-01-25T01:31:17","slug":"a-test-so-hard-no-ai-system-can-pass-it-yet","status":"publish","type":"post","link":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/","title":{"rendered":"A Test So Hard No AI System Can Pass It \u2014 Yet"},"content":{"rendered":"<div data-testid=\"companionColumn-0\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">If you\u2019re looking for a new reason to be nervous about artificial intelligence, try this: Some of the smartest humans in the world are struggling to create tests that A.I. systems can\u2019t pass.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">For years, A.I. systems were measured by giving new models a variety of standardized benchmark tests. Many of these tests consisted of challenging, S.A.T.-caliber problems in areas like math, science and logic. Comparing the models\u2019 scores over time served as a rough measure of A.I. progress.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">But A.I. 
systems eventually got too good at those tests, so new, harder tests were created \u2014 often with the types of questions graduate students might encounter on their exams.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Those tests aren\u2019t in good shape, either. New models from companies like OpenAI, Google and Anthropic have been getting high scores on many Ph.D.-level challenges, limiting those tests\u2019 usefulness and leading to a chilling question: Are A.I. systems getting too smart for us to measure?<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-1\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">This week, researchers at the Center for AI Safety and Scale AI are releasing a possible answer to that question: A new evaluation, called \u201c<a class=\"css-yywogo\" href=\"https:\/\/lastexam.ai\/\" title=\"\" rel=\"noopener noreferrer\" target=\"_blank\">Humanity\u2019s Last Exam<\/a>,\u201d that they claim is the hardest test ever administered to A.I. systems. <\/p>\n<p class=\"css-at9mc1 evys1bk0\">Humanity\u2019s Last Exam is the brainchild of Dan Hendrycks, a well-known A.I. safety researcher and director of the Center for AI Safety. (The test\u2019s original name, \u201cHumanity\u2019s Last Stand,\u201d was discarded for being overly dramatic.)<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Mr. Hendrycks worked with Scale AI, an A.I. company where he is an advisor, to compile the test, which consists of roughly 3,000 multiple-choice and short answer questions designed to test A.I. systems\u2019 abilities in areas ranging from analytic philosophy to rocket engineering.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Questions were submitted by experts in these fields, including college professors and prizewinning mathematicians, who were asked to come up with extremely difficult questions they knew the answers to. 
<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-2\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">Here, try your hand at a question about hummingbird anatomy from the test:<\/p>\n<blockquote class=\"css-1ggt3fz etf134l0\">\n<p class=\"css-12wzsk6 evys1bk0\">Hummingbirds within Apodiformes uniquely have a bilaterally paired oval bone, a sesamoid embedded in the caudolateral portion of the expanded, cruciate aponeurosis of insertion of m. depressor caudae. How many paired tendons are supported by this sesamoid bone? Answer with a number.<\/p>\n<\/blockquote>\n<p class=\"css-at9mc1 evys1bk0\">Or, if physics is more your speed, try this one:<\/p>\n<blockquote class=\"css-1ggt3fz etf134l0\">\n<p class=\"css-12wzsk6 evys1bk0\">A block is placed on a horizontal rail, along which it can slide frictionlessly. It is attached to the end of a rigid, massless rod of length R. A mass is attached at the other end. Both objects have weight W. The system is initially stationary, with the mass directly above the block. The mass is given an infinitesimal push, parallel to the rail. Assume the system is designed so that the rod can rotate through a full 360 degrees without interruption. When the rod is horizontal, it carries tension T1. When the rod is vertical again, with the mass directly below the block, it carries tension T2. (Both these quantities could be negative, which would indicate that the rod is in compression.) What is the value of (T1\u2212T2)\/W?<\/p>\n<\/blockquote>\n<p class=\"css-at9mc1 evys1bk0\">(I would print the answers here, but that would spoil the test for any A.I. systems being trained on this column. 
Also, I\u2019m far too dumb to verify the answers myself.)<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-3\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">The questions on Humanity\u2019s Last Exam went through a two-step filtering process. First, submitted questions were given to leading A.I. models to solve.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">If the models couldn\u2019t answer them (or if, in the case of multiple-choice questions, the models did worse than by random guessing), the questions were given to a set of human reviewers, who refined them and verified the correct answers. Experts who wrote top-rated questions were paid between $500 and $5,000 per question, as well as receiving credit for contributing to the exam.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Kevin Zhou, a postdoctoral researcher in theoretical particle physics at the University of California, Berkeley, submitted a handful of questions to the test. Three of his questions were chosen, all of which he told me were \u201calong the upper range of what one might see in a graduate exam.\u201d <\/p>\n<p class=\"css-at9mc1 evys1bk0\">Mr. Hendrycks, who helped create a widely used A.I. test known as Massive Multitask Language Understanding, or M.M.L.U., said he was inspired to create harder A.I. tests by a conversation with Elon Musk. (Mr. Hendrycks is also a safety advisor to Mr. Musk\u2019s A.I. company, xAI.) Mr. Musk, he said, raised concerns about the existing tests given to A.I. models, which he thought were too easy.<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-4\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">\u201cElon looked at the M.M.L.U. questions and said, \u2018These are undergrad level. I want things that a world-class expert could do,\u2019\u201d Mr. 
Hendrycks said.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">There are other tests trying to measure advanced A.I. capabilities in certain domains, such as FrontierMath, a test developed by Epoch AI, and <a class=\"css-yywogo\" href=\"https:\/\/arcprize.org\/arc\" title=\"\" rel=\"noopener noreferrer\" target=\"_blank\">ARC-AGI<\/a>, a test<span class=\"css-8l6xbc evw5hdy0\">  <\/span>developed by the A.I. researcher Fran\u00e7ois Chollet. <\/p>\n<p class=\"css-at9mc1 evys1bk0\">But Humanity\u2019s Last Exam is aimed at determining how good A.I. systems are at answering complex questions across a wide variety of academic subjects, giving us what might be thought of as a general intelligence score.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">\u201cWe are trying to estimate the extent to which A.I. can automate a lot of really difficult intellectual labor,\u201d Mr. Hendrycks said.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Once the list of questions had been compiled, the researchers gave Humanity\u2019s Last Exam to six leading A.I. models, including Google\u2019s Gemini 1.5 Pro and Anthropic\u2019s Claude 3.5 Sonnet. All of them failed miserably. OpenAI\u2019s o1 system scored the highest of the bunch, with a score of 8.3 percent.<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-5\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">(The New York Times has <a class=\"css-yywogo\" href=\"https:\/\/www.nytimes.com\/2023\/12\/27\/business\/media\/new-york-times-open-ai-microsoft-lawsuit.html\" title=\"\">sued<\/a> OpenAI and its partner, Microsoft, accusing them of copyright infringement of news content related to A.I. systems. OpenAI and Microsoft have denied those claims.)<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Mr. Hendrycks said he expected those scores to rise quickly, and potentially to surpass 50 percent by the end of the year. At that point, he said, A.I. 
systems might be considered \u201cworld-class oracles,\u201d capable of answering questions on any topic more accurately than human experts. And we might have to look for other ways to measure A.I.\u2019s impacts, like looking at economic data or judging whether it can make novel discoveries in areas like math and science.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">\u201cYou can imagine a better version of this where we can give questions that we don\u2019t know the answers to yet, and we\u2019re able to verify if the model is able to help solve it for us,\u201d said Summer Yue, Scale AI\u2019s director of research and an organizer of the exam.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Part of what\u2019s so confusing about A.I. progress these days is how jagged it is. We have A.I. models capable of <a class=\"css-yywogo\" href=\"https:\/\/www.nytimes.com\/2024\/11\/17\/health\/chatgpt-ai-doctors-diagnosis.html\" title=\"\">diagnosing diseases more effectively than human doctors<\/a>, <a class=\"css-yywogo\" href=\"https:\/\/deepmind.google\/discover\/blog\/ai-solves-imo-problems-at-silver-medal-level\/\" title=\"\" rel=\"noopener noreferrer\" target=\"_blank\">winning silver medals at the International Math Olympiad<\/a> and <a class=\"css-yywogo\" href=\"https:\/\/www.youtube.com\/live\/SKBG1sqdyIU?si=bq2zPAx1NrBxHPTf&amp;t=149\" title=\"\" rel=\"noopener noreferrer\" target=\"_blank\">beating top human programmers<\/a> on competitive coding challenges. <\/p>\n<p class=\"css-at9mc1 evys1bk0\">But these same models sometimes struggle with basic tasks, like arithmetic or writing metered poetry. That has given them a reputation as astoundingly brilliant at some things and totally useless at others, and it has created vastly different impressions of how fast A.I. is improving, depending on whether you\u2019re looking at the best or the worst outputs. 
<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<div data-testid=\"companionColumn-6\">\n<div class=\"css-53u6y8\">\n<p class=\"css-at9mc1 evys1bk0\">That jaggedness has also made measuring these models hard. I wrote last year that <a class=\"css-yywogo\" href=\"https:\/\/www.nytimes.com\/2024\/04\/15\/technology\/ai-models-measurement.html\" title=\"\">we need better evaluations for A.I. systems<\/a>. I still believe that. But I also believe that we need more creative methods of tracking A.I. progress that don\u2019t rely on standardized tests, because most of what humans do \u2014 and what we fear A.I. will do better than us \u2014 can\u2019t be captured on a written exam.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">Mr. Zhou, the theoretical particle physics researcher who submitted questions to Humanity\u2019s Last Exam, told me that while A.I. models were often impressive at answering complex questions, he didn\u2019t consider them a threat to him and his colleagues, because their jobs involve much more than spitting out correct answers.<\/p>\n<p class=\"css-at9mc1 evys1bk0\">\u201cThere\u2019s a big gulf between what it means to take an exam and what it means to be a practicing physicist and researcher,\u201d he said. \u201cEven an A.I. 
that can answer these questions might not be ready to help in research, which is inherently less structured.\u201d<\/p>\n<\/div>\n<aside class=\"css-ew4tgv\" aria-label=\"companion column\"\/><\/div>\n<p><a href=\"https:\/\/www.nytimes.com\/2025\/01\/23\/technology\/ai-test-humanitys-last-exam.html\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you\u2019re looking for a new reason to be nervous about artificial intelligence, try this: Some of the smartest humans<\/p>\n","protected":false},"author":1,"featured_media":1660,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[39],"tags":[],"class_list":["post-1659","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>A Test So Hard No AI System Can Pass It \u2014 Yet - World News<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"A Test So Hard No AI System Can Pass It \u2014 Yet 
- World News\" \/>\n<meta property=\"og:description\" content=\"If you\u2019re looking for a new reason to be nervous about artificial intelligence, try this: Some of the smartest humans\" \/>\n<meta property=\"og:url\" content=\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\" \/>\n<meta property=\"og:site_name\" content=\"World News\" \/>\n<meta property=\"article:published_time\" content=\"2025-01-25T01:31:17+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg\" \/>\n<meta name=\"author\" content=\"News\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"News\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\"},\"author\":{\"name\":\"News\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/person\/1d8ae7348e048cb7984f26b4dd818ff2\"},\"headline\":\"A Test So Hard No AI System Can Pass It \u2014 Yet\",\"datePublished\":\"2025-01-25T01:31:17+00:00\",\"dateModified\":\"2025-01-25T01:31:17+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\"},\"wordCount\":1392,\"publisher\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#organization\"},\"image\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1\",\"articleSection\":[\"Technology\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\",\"url\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\",\"name\":\"A Test So Hard No AI System Can Pass It \u2014 Yet - World 
News\",\"isPartOf\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1\",\"datePublished\":\"2025-01-25T01:31:17+00:00\",\"dateModified\":\"2025-01-25T01:31:17+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage\",\"url\":\"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1\",\"width\":1050,\"height\":550},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/site.itshrt.com\/worldnews\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"A Test So Hard No AI System Can Pass It \u2014 Yet\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#website\",\"url\":\"https:\/\/site.itshrt.com\/worldnews\/\",\"name\":\"World News\",\"description\":\"World 
News\",\"publisher\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/site.itshrt.com\/worldnews\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#organization\",\"name\":\"World News\",\"url\":\"https:\/\/site.itshrt.com\/worldnews\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/site.itshrt.com\/worldnews\/wp-content\/uploads\/sites\/13\/2025\/01\/cropped-download.png\",\"contentUrl\":\"https:\/\/site.itshrt.com\/worldnews\/wp-content\/uploads\/sites\/13\/2025\/01\/cropped-download.png\",\"width\":581,\"height\":581,\"caption\":\"World News\"},\"image\":{\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/person\/1d8ae7348e048cb7984f26b4dd818ff2\",\"name\":\"News\",\"url\":\"https:\/\/site.itshrt.com\/worldnews\/author\/admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"A Test So Hard No AI System Can Pass It \u2014 Yet - World News","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/","og_locale":"en_US","og_type":"article","og_title":"A Test So Hard No AI System Can Pass It \u2014 Yet - World News","og_description":"If you\u2019re looking for a new reason to be nervous about artificial intelligence, try this: Some of the smartest humans","og_url":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/","og_site_name":"World News","article_published_time":"2025-01-25T01:31:17+00:00","og_image":[{"url":"https:\/\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg","type":"","width":"","height":""}],"author":"News","twitter_card":"summary_large_image","twitter_image":"https:\/\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg","twitter_misc":{"Written by":"News","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#article","isPartOf":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/"},"author":{"name":"News","@id":"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/person\/1d8ae7348e048cb7984f26b4dd818ff2"},"headline":"A Test So Hard No AI System Can Pass It \u2014 Yet","datePublished":"2025-01-25T01:31:17+00:00","dateModified":"2025-01-25T01:31:17+00:00","mainEntityOfPage":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/"},"wordCount":1392,"publisher":{"@id":"https:\/\/site.itshrt.com\/worldnews\/#organization"},"image":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1","articleSection":["Technology"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/","url":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/","name":"A Test So Hard No AI System Can Pass It \u2014 Yet - World 
News","isPartOf":{"@id":"https:\/\/site.itshrt.com\/worldnews\/#website"},"primaryImageOfPage":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage"},"image":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage"},"thumbnailUrl":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1","datePublished":"2025-01-25T01:31:17+00:00","dateModified":"2025-01-25T01:31:17+00:00","breadcrumb":{"@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#primaryimage","url":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1","contentUrl":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1","width":1050,"height":550},{"@type":"BreadcrumbList","@id":"https:\/\/site.itshrt.com\/worldnews\/a-test-so-hard-no-ai-system-can-pass-it-yet\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/site.itshrt.com\/worldnews\/"},{"@type":"ListItem","position":2,"name":"A Test So Hard No AI System Can Pass It \u2014 Yet"}]},{"@type":"WebSite","@id":"https:\/\/site.itshrt.com\/worldnews\/#website","url":"https:\/\/site.itshrt.com\/worldnews\/","name":"World News","description":"World 
News","publisher":{"@id":"https:\/\/site.itshrt.com\/worldnews\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/site.itshrt.com\/worldnews\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/site.itshrt.com\/worldnews\/#organization","name":"World News","url":"https:\/\/site.itshrt.com\/worldnews\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/logo\/image\/","url":"https:\/\/site.itshrt.com\/worldnews\/wp-content\/uploads\/sites\/13\/2025\/01\/cropped-download.png","contentUrl":"https:\/\/site.itshrt.com\/worldnews\/wp-content\/uploads\/sites\/13\/2025\/01\/cropped-download.png","width":581,"height":581,"caption":"World News"},"image":{"@id":"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/site.itshrt.com\/worldnews\/#\/schema\/person\/1d8ae7348e048cb7984f26b4dd818ff2","name":"News","url":"https:\/\/site.itshrt.com\/worldnews\/author\/admin\/"}]}},"magazineBlocksPostFeaturedMedia":{"thumbnail":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=150&resize=150,150&ssl=1","medium":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=300&resize=300,300&ssl=1","medium_large":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=768&resize=768,0&ssl=1","large":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1024&resize=1024,1024&ssl=1","1536x1536":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1536&resize=1536,1536&ssl=1","2048x2048":"https:\/\/i0.wp.com\/static0
1.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=2048&resize=2048,2048&ssl=1","colormag-highlighted-post":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=392&resize=392,272&ssl=1","colormag-featured-post-medium":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=390&resize=390,205&ssl=1","colormag-featured-post-small":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=130&resize=130,90&ssl=1","colormag-featured-image":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=800&resize=800,445&ssl=1","colormag-default-news":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=150&resize=150,150&ssl=1","colormag-featured-image-large":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1400&resize=1400,600&ssl=1","colormag-elementor-block-extra-large-thumbnail":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1155&resize=1155,480&ssl=1","colormag-elementor-grid-large-thumbnail":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=600&resize=600,417&ssl=1","colormag-elementor-grid-small-thumbnail":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=285&resize=285,450&ssl=1","colormag-elementor-grid-medium-large-thumbnail":"https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=575&resize=575,198&ssl=1"},"magazineBlocksPostAuthor":{"name":"News","avatar":"https:\/\/secure.gravatar.com\/avatar\/e85cf08852547781a5f631f3713dbb5197ed4b54e32ae19c7d5769b8c8504897?s=96&d=mm&r=g"},"magazineB
locksPostCommentsNumber":false,"magazineBlocksPostExcerpt":"If you\u2019re looking for a new reason to be nervous about artificial intelligence, try this: Some of the smartest humans","magazineBlocksPostCategories":["Technology"],"magazineBlocksPostViewCount":61,"magazineBlocksPostReadTime":7,"magazine_blocks_featured_image_url":{"full":["https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=1050&resize=1050,550&ssl=1",1050,550,false],"medium":["https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=300&resize=300,300&ssl=1",300,300,false],"thumbnail":["https:\/\/i0.wp.com\/static01.nyt.com\/images\/2025\/01\/22\/business\/00roose\/00roose-facebookJumbo.jpg?w=150&resize=150,150&ssl=1",150,150,false]},"magazine_blocks_author":{"display_name":"News","author_link":"https:\/\/site.itshrt.com\/worldnews\/author\/admin\/"},"magazine_blocks_comment":0,"magazine_blocks_author_image":"https:\/\/secure.gravatar.com\/avatar\/e85cf08852547781a5f631f3713dbb5197ed4b54e32ae19c7d5769b8c8504897?s=96&d=mm&r=g","magazine_blocks_category":"<a href=\"#\" class=\"category-link 
category-link-39\">Technology<\/a>","_links":{"self":[{"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/posts\/1659","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/comments?post=1659"}],"version-history":[{"count":0,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/posts\/1659\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/media\/1660"}],"wp:attachment":[{"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/media?parent=1659"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/categories?post=1659"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/site.itshrt.com\/worldnews\/wp-json\/wp\/v2\/tags?post=1659"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}