Line 868:
}}
|timestamp=9:32 PM · Jun 9, 2025
}}
{{Tweet
|image=Eric profile picture.jpg
|nameurl=https://x.com/EricRWeinstein/status/1938248787425956219
|name=Eric Weinstein
|usernameurl=https://x.com/EricRWeinstein
|username=EricRWeinstein
|content=That’s where the state of play is as of June 2025 in STEM. It’s not close to humans of 70 years ago (before '''modern peer review''') as of these iterations.
Maybe it’s a bit closer to simulating today’s humans in the Claudine Gay/'''Peer Review''' academic era.
We are converging down as it moves up. And we may soon meet in the middle it seems.
|media1=ERW-X-post-1938248787425956219-GuYLABXaAAAfTgi.jpg
|thread=
{{Tweet
|image=Eric profile picture.jpg
|nameurl=https://x.com/EricRWeinstein/status/1938248774637523301
|name=Eric Weinstein
|usernameurl=https://x.com/EricRWeinstein
|username=EricRWeinstein
|content='''LLM Peer Review''':
Gemini was asked to write a journal submission on a STEM topic I know a bit about.
ChatGPT was asked to Peer Review that article for publication and make suggestions.
I recommend every STEM academic try this adversarial exercise in an area he/she knows well.
|timestamp=2:51 PM · Jun 26, 2025
|media1=ERW-X-post-1938248774637523301-GuYK_R3bEAMdFX5.jpg
}}
{{Tweet
|image=Eric profile picture.jpg
|nameurl=https://x.com/EricRWeinstein/status/1938248781121892706
|name=Eric Weinstein
|usernameurl=https://x.com/EricRWeinstein
|username=EricRWeinstein
|content=ChatGPT had all kinds of issues with it and complained while rewriting it to improve it.
Gemini was given the reviewer report and it more or less freaked out: “How is this reviewer even competent to make these claims?” and pointed out how lousy the reviewer report was. In fact ChatGPT as reviewer had, in fact, misrepresented Gemini’s work as well as engaged in making claims to have improved the work…which were false.
Gemini was basically far more correct. ChatGPT suggestions made Gemini worse. It may be that ChatGPT’s output tokens are far fewer so it gutted Gemini’s work.
Thus I wanted to see if Gemini was the clear winner.
So I then asked Gemini for specific references to back up its claims in a comprehensive literature search. And it promptly totally fabricated convincing quotes, page numbers, article titles and mathematical equations.
|timestamp=2:51 PM · Jun 26, 2025
|media1=ERW-X-post-1938248781121892706-GuYK_nxa4AAq5TW.jpg
}}
|timestamp=2:51 PM · Jun 26, 2025
}}