Let the AIs play games against each other. Is the resulting leaderboard more precise than benchmarks?
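For anyone wondering how game results could turn into a leaderboard, here is a minimal sketch that assumes Elo ratings. The post doesn't say which rating system is used, so the function names, the starting rating of 1500, and the K-factor of 32 are all illustrative assumptions.

```python
# Sketch only: Elo is an assumed rating system, not one named in the post.

def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


def leaderboard(results, start=1500.0, k=32.0):
    """Build a ranking from (player_a, player_b, score_a) tuples."""
    ratings = {}
    for player_a, player_b, score_a in results:
        ra = ratings.setdefault(player_a, start)
        rb = ratings.setdefault(player_b, start)
        ratings[player_a], ratings[player_b] = update_elo(ra, rb, score_a, k)
    # Highest rating first.
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)
```

Calling `leaderboard([("model_a", "model_b", 1.0)])` with those hypothetical model names would put `model_a` on top. How precise the ranking gets depends on how many games are played, which this sketch doesn't address.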
The funny thing: AFAIK, LLMs are terrible at chess compared to purpose-trained chess AIs. https://dynomight.net/more-chess/
They often suggest illegal chess moves.
You are a chess grandmaster. You will be given a partially completed game. After seeing it, you should choose the next move. Use standard algebraic notation, e.g. “e4” or “Rdf8” or “R1a3”. NEVER give a turn number. NEVER explain your choice.
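A rough sketch of how a reply to that prompt could be screened before it reaches a chess engine. It only checks that the reply looks like a single move in standard algebraic notation, with no turn number or explanation. It does not check whether the move is legal in the current position, which needs a real board model. The regex and the name `looks_like_san` are my own assumptions, not something from the linked article.

```python
import re

# Rough SAN shape: castling, or an optional piece letter, optional
# disambiguating file/rank, optional capture, a target square, optional
# promotion, then an optional check or mate marker.
SAN_PATTERN = re.compile(
    r"(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?"
)


def looks_like_san(reply):
    """Return True if the reply is a single SAN-shaped move and nothing else.

    Syntax check only: a move can pass here and still be illegal on the board.
    """
    return SAN_PATTERN.fullmatch(reply.strip()) is not None
```

The three examples from the prompt ("e4", "Rdf8", "R1a3") pass this check, while a reply such as "12. e4" or "I would play e4" is rejected.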


