So I’ve been working on a tool built on GPT-4 Turbo that’s designed to ingest entire papers into its context window and turn them into summaries understandable by someone with a high school education (I originally went for 8th grade max, but that led to rather patronizing results lol). The model tells me what the content should be for a given paper and I make it using a few tools like Premiere Pro and Photoshop. I’ve never made videos like this before though, so it’s a bit rough.
I was hoping to use this tool to expand the general public’s access to scientific papers. Those papers are hella dense and def need some translation.
I’ve attached my first full attempt at using my tool. It’s kinda rough, but I’d love to get some feedback on how to make it better (especially as it pertains to the script).
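For anyone curious what the summarization step might look like: here’s a minimal sketch of building the prompt, assuming the OpenAI chat API. The function name, system prompt wording, and `reading_level` parameter are all hypothetical illustrations, not the author’s actual code.

```python
# Hypothetical prompt-construction step for the paper-summarization tool.
# Names and prompt wording are illustrative assumptions.

def build_summary_messages(paper_text: str, reading_level: str = "high school") -> list[dict]:
    """Build a chat message list asking the model to summarize a full paper."""
    system = (
        f"You are a science communicator. Rewrite the paper below as a summary "
        f"understandable to someone with a {reading_level} education. "
        "Cover every section, including the limitations."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": paper_text},
    ]

# The actual call (requires an API key) would look roughly like:
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4-turbo",
#     messages=build_summary_messages(full_paper_text),
# )
```

Keeping the whole paper as a single user message is what lets the model draw on every section rather than just the abstract.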
My comment isn’t on your script; it’s on paper selection…
Please have several orthogonal selection systems:
- most cited
- most trustworthy researchers
- most unique area of research, or most unique question, or something
- most central to new tech
- most central to old tech
- most undernoticed with big potential (things that should be big news, but the global novelty-addiction ignores them)
- best fundamental science (definitely include this category!)
- best citizen science
- etc…
In other words, the algorithm most people use, which is “paying attention to what others are paying attention to,” is an algorithm we shouldn’t be relying on, you know?
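The orthogonal-criteria idea above can be sketched as a simple weighted score instead of ranking by raw citation count. The criterion names and weights here are purely illustrative assumptions; each per-criterion score is presumed already normalized to the 0–1 range.

```python
# Illustrative sketch: combine several orthogonal selection criteria into one
# rank score, rather than sorting papers by citations alone.
# Criterion names and weights are made up for this example.

CRITERIA_WEIGHTS = {
    "citations": 0.2,
    "researcher_track_record": 0.2,
    "novelty_of_question": 0.2,
    "relevance_to_tech": 0.15,
    "under_noticed_potential": 0.15,
    "fundamental_science": 0.1,
}

def selection_score(scores: dict[str, float]) -> float:
    """Weighted sum of per-criterion scores (each assumed in 0..1)."""
    return sum(CRITERIA_WEIGHTS[name] * scores.get(name, 0.0)
               for name in CRITERIA_WEIGHTS)
```

You could also run each criterion as its own separate feed, which keeps them genuinely orthogonal instead of blending them into one number.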
Also, you may need a flexible limit for dumbing-down, since some papers have a higher ceiling and others a lower one: some could be easy for most people to get the sense of, while others might stay tricky… and it might well do more good to let them have their different thresholds, see?
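One way to pick a per-paper threshold is to estimate the source text’s reading grade first, e.g. with the standard Flesch–Kincaid grade formula. This is a rough sketch: the syllable counter is a crude vowel-group heuristic, not an exact one.

```python
import re

# Sketch: estimate a text's Flesch-Kincaid grade level so each paper can get
# its own simplification target. The syllable count is a rough heuristic.

def _syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (minimum 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syll = sum(_syllables(w) for w in words)
    return 0.39 * (n_words / sentences) + 11.8 * (n_syll / n_words) - 15.59
```

A paper whose abstract already scores near grade 12 probably can’t be pushed to grade 8 without distortion, so the target level could float with the source’s score.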
Isn’t that the whole purpose of an Abstract?
It’s actually not. Abstracts are targeted at academics and researchers, and they often preserve the complexity. Take, for example, the abstract of the paper this video’s about:
Reported here are experiments that show that ribonucleoside triphosphates are converted to polyribonucleic acid when incubated with rock glasses similar to those likely present 4.3–4.4 billion years ago on the Hadean Earth surface, where they were formed by impacts and volcanism. This polyribonucleic acid averages 100–300 nucleotides in length, with a substantial fraction of 3′,5′-dinucleotide linkages. Chemical analyses, including classical methods that were used to prove the structure of natural RNA, establish a polyribonucleic acid structure for these products. The polyribonucleic acid accumulated and was stable for months, with a synthesis rate of 2 × 10−3 pmoles of triphosphate polymerized each hour per gram of glass (25°C, pH 7.5). These results suggest that polyribonucleotides were available to Hadean environments if triphosphates were. As many proposals are emerging describing how triphosphates might have been made on the Hadean Earth, the process observed here offers an important missing step in models for the prebiotic synthesis of RNA.
While it is less complex than the paper itself, it is nevertheless dense and jargon-laden. Your average person with a high school education will either not understand it well or be absolutely turned off by its density. They’re also just very unlikely to stumble across it.
I could have the model reword it, but the abstract’s information isn’t comprehensive, which reduces quality. With the entire paper in its context window, the LLM is less likely to hallucinate, and the added information helps it make better summaries that draw on all of the paper’s sections, especially the limitations section.
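Feeding the whole paper in does assume it fits in the context window. A quick sanity check might look like this; the 4-characters-per-token ratio is a common rule of thumb rather than an exact tokenizer, and GPT-4 Turbo’s window is 128k tokens.

```python
# Rough sketch: check whether a full paper plausibly fits in the model's
# context window before sending it. The chars-per-token ratio is a heuristic.

CONTEXT_TOKENS = 128_000   # GPT-4 Turbo context window
CHARS_PER_TOKEN = 4        # rough rule of thumb, not an exact tokenizer

def fits_in_context(paper_text: str, reserve_for_output: int = 4_000) -> bool:
    """Estimate token count and leave room for the generated summary."""
    est_tokens = len(paper_text) / CHARS_PER_TOKEN
    return est_tokens + reserve_for_output <= CONTEXT_TOKENS
```

For an exact count you’d use the model’s actual tokenizer (e.g. the `tiktoken` library), but the heuristic is enough to flag papers that clearly won’t fit.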
Fun fact: there’s a whole scientific discipline solely focused on making science accessible to the average Joe; it’s called science communication.
Not sure what you’d expect an LLM to do, but reading deeper into science communication will certainly help you understand and achieve what you’re looking for.