Dev Log: Cursor-Inspired ArXiv Research Tool
I’ve been trying to get back into reading research papers, but it’s been years since my linear algebra classes. Every paper feels like it’s written in a foreign language. So I built a tool to help me (and hopefully others) understand them better.
The Problem
ArXiv papers are dense. You’re reading along, following the introduction, then BAM - you hit a wall of equations and terminology that assumes you remember your graduate-level math.
The traditional approach: Open 15 Wikipedia tabs, lose context, forget why you were reading the paper in the first place.
The Inspiration
I’ve been using Cursor for coding, and their CMD+K feature is brilliant. Select some code, hit the shortcut, and get an AI explanation in context.
Why not do the same for research papers?
Synthetic Trail
That’s the placeholder name for now. Here’s how it works:
- Find an ArXiv paper you want to read
- Open the HTML version (not PDF)
- Prepend r.apurn.com/ to the URL
- Select any confusing text
- Hit CMD+Shift+L
- Get an explanation that knows the paper’s context
The Technical Approach
The implementation is surprisingly straightforward:
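// Remember the most recent non-empty selection
let currentSelection = ''

// Listen for text selection
document.addEventListener('mouseup', () => {
  const text = window.getSelection().toString()
  if (text.length > 0) {
    currentSelection = text
  }
})

// Listen for keyboard shortcut
// e.key can be 'L' or 'l' depending on browser and modifiers, so compare case-insensitively
document.addEventListener('keydown', (e) => {
  if (e.metaKey && e.shiftKey && e.key.toLowerCase() === 'l' && currentSelection) {
    e.preventDefault()
    explainSelection(currentSelection)
  }
})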
The magic happens in explainSelection(). It:
- Grabs the selected text
- Adds context from surrounding paragraphs
- Includes the paper title and abstract
- Sends it all to GPT-4 with a prompt optimized for academic explanations
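Here’s a rough sketch of how those steps could fit together. It is not the tool’s actual code: buildExplainRequest, its field names, and the prompt wording are all made up for illustration. The only parts taken from above are the inputs (selection, surrounding text, title, abstract) and the model.

```javascript
// Hypothetical helper: turn a selection plus paper context into a chat-style request body.
// Field names and prompt wording are assumptions for illustration only.
function buildExplainRequest({ selection, surrounding, title, abstract }) {
  const system =
    'You explain passages from research papers in plain language. ' +
    'Use the paper context you are given and define any jargon you introduce.'

  // Put the broad context first and the selection last, so the question is the final thing read
  const user = [
    `Paper title: ${title}`,
    `Abstract: ${abstract}`,
    `Surrounding text: ${surrounding}`,
    `Explain this selection: ${selection}`,
  ].join('\n\n')

  return {
    model: 'gpt-4',
    messages: [
      { role: 'system', content: system },
      { role: 'user', content: user },
    ],
  }
}
```

Keeping this as a pure function means the prompt can be inspected and tweaked without making an API call.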
The Context Window
The key insight: Don’t just explain the selected text. Include:
- The paragraph it’s from
- The section heading
- The paper’s abstract
- Previously explained terms from this session
This way, when you select “We use a VAE to…”, the AI knows what paper you’re reading and can explain VAE in that specific context.
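One practical wrinkle is that all this context has to fit in the prompt. A minimal sketch of gathering the pieces and trimming them is below. The function names and the character limits are assumptions picked for the example, not tuned values from the tool.

```javascript
// Hypothetical sketch: gather the context pieces listed above into one object,
// trimming long fields so the prompt stays a predictable size.
// The limits below are arbitrary assumptions.
function truncate(text, maxChars) {
  return text.length <= maxChars ? text : text.slice(0, maxChars) + '...'
}

function assembleContext({ paragraph, heading, abstract, explainedTerms }) {
  return {
    paragraph: truncate(paragraph, 2000),
    heading,
    abstract: truncate(abstract, 1500),
    // Keep only the most recent terms so the list cannot grow without bound
    explainedTerms: explainedTerms.slice(-20),
  }
}
```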
Current Features
Smart Explanations
The AI adjusts its explanation based on what you select:
- Equations: Step-by-step breakdown
- Terms: Definition + how it’s used in this paper
- Paragraphs: Plain English summary
- Citations: What the referenced paper contributes
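Deciding which of those four styles applies could be done with a simple heuristic before the prompt is built. The sketch below is one guess at how that might look. The patterns and word-count thresholds are assumptions, and in practice the model itself could be asked to make this call instead.

```javascript
// Hypothetical heuristic for picking an explanation style from the selected text.
// The patterns and thresholds are rough assumptions, not the tool's real rules.
function classifySelection(text) {
  const trimmed = text.trim()
  const wordCount = trimmed.split(/\s+/).length

  // Bracketed numbers like [12] or [3, 7]
  const looksLikeBracketCitation = /^\[\d+(\s*[,-]\s*\d+)*\]$/.test(trimmed)
  // Short selections like "Author et al. (2017)"
  const looksLikeAuthorCitation = wordCount <= 6 && /et al\.?,?\s*\(?\d{4}\)?/.test(trimmed)
  if (looksLikeBracketCitation || looksLikeAuthorCitation) return 'citation'

  // LaTeX commands or common math symbols
  if (/\\[a-zA-Z]+|[=∑∫^_]/.test(trimmed)) return 'equation'

  return wordCount <= 4 ? 'term' : 'paragraph'
}
```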
Session Memory
Each paper gets its own context. Previously explained terms are remembered, so explanations get more sophisticated as you read.
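A per-paper memory like this can be as small as a map of maps. The class below is an illustrative sketch under that assumption. SessionMemory and its method names are invented for the example.

```javascript
// Hypothetical per-paper session store: one map of explained terms per arXiv ID.
class SessionMemory {
  constructor() {
    this.byPaper = new Map()
  }

  // Record that a term was explained while reading a given paper
  remember(paperId, term, explanation) {
    if (!this.byPaper.has(paperId)) {
      this.byPaper.set(paperId, new Map())
    }
    // Lowercase the key so "VAE" and "vae" count as the same term
    this.byPaper.get(paperId).set(term.toLowerCase(), explanation)
  }

  // Terms already explained for this paper, in the order they were first seen
  termsFor(paperId) {
    const terms = this.byPaper.get(paperId)
    return terms ? [...terms.keys()] : []
  }
}
```

The list from termsFor() is what would be passed along as “previously explained terms” on the next request.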
Future Plans
Local Storage
Currently, explanations vanish when you refresh. Planning to store chat history by ArXiv ID:
localStorage.setItem(`arxiv_${paperId}`, JSON.stringify(explanations))
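Reading the data back needs a little more care than writing it, since the stored value may be missing or corrupted. Here is a minimal sketch of a save/load pair built on the key scheme above. The helper names are made up, and the storage object is passed in as a parameter (in the browser it would be window.localStorage).

```javascript
// Sketch of save/load helpers around the arxiv_<id> key scheme.
// `storage` is any object with getItem/setItem, such as window.localStorage.
function saveExplanations(storage, paperId, explanations) {
  storage.setItem(`arxiv_${paperId}`, JSON.stringify(explanations))
}

function loadExplanations(storage, paperId) {
  const raw = storage.getItem(`arxiv_${paperId}`)
  if (raw === null) return []
  try {
    const parsed = JSON.parse(raw)
    return Array.isArray(parsed) ? parsed : []
  } catch {
    // Corrupted or hand-edited entry: start fresh rather than crash
    return []
  }
}
```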
Authentication
Considering Google OAuth so your reading history follows you across devices. Privacy-first though - all data stays client-side unless you explicitly sync.
Annotations
Thinking about letting users highlight and annotate papers, Genius-style. Build up a personal knowledge base of paper notes.
Paper Graph
Every paper cites others. What if you could visualize the citation graph and see explanations for how papers connect?
Technical Challenges
CORS
ArXiv doesn’t set CORS headers, so I had to proxy requests. Adds latency but it works.
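The routing part of a proxy like this can be sketched in a few lines. This is a guess at the shape, not the real server: upstreamUrlFromPath is a made-up name, the exact URL format the proxy accepts may differ, and the host allowlist is an assumption added so the proxy can only fetch from arXiv.

```javascript
// Hypothetical routing step for a proxy: map an incoming request path to an upstream URL.
// Assumes the path is the arXiv URL with or without its scheme; the real format may differ.
function upstreamUrlFromPath(path) {
  const rest = path.replace(/^\/+/, '')
  const candidate = /^https?:\/\//.test(rest) ? rest : `https://${rest}`

  let url
  try {
    url = new URL(candidate)
  } catch {
    // Not a parseable URL
    return null
  }

  // Only forward to arXiv so the proxy cannot be used to fetch arbitrary sites
  const allowedHosts = new Set(['arxiv.org', 'www.arxiv.org'])
  return allowedHosts.has(url.hostname) ? url.href : null
}
```

A server would call this on each request, fetch the returned URL, and respond with a 4xx status when it gets null.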
LaTeX Rendering
ArXiv HTML has LaTeX equations as images. Extracting the actual math for better explanations is tricky.
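One possible angle, assuming the equation markup carries its TeX source in an alt attribute (on an img) or an alttext attribute (on a math element), is to pull that text out and send it instead of the rendered equation. Whether a given page actually has those attributes is something to check per page. extractLatex below is a hypothetical helper that works on an HTML string.

```javascript
// Hypothetical extractor: pull TeX-looking strings out of alt/alttext attributes.
// Assumes equations appear as <img alt="..."> or <math alttext="...">; real pages may differ.
function extractLatex(html) {
  const pattern = /<(?:img|math)\b[^>]*?\b(?:alt|alttext)="([^"]*)"/g
  const found = []
  for (const match of html.matchAll(pattern)) {
    // Decode the few entities likely to show up in math; &amp; goes last to avoid double decoding
    const tex = match[1]
      .replace(/&lt;/g, '<')
      .replace(/&gt;/g, '>')
      .replace(/&amp;/g, '&')
      .trim()
    if (tex.length > 0) found.push(tex)
  }
  return found
}
```

In the browser, querying the DOM for those elements and reading the attribute would be more robust than a regex. The string version is used here only so the sketch runs anywhere.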
Rate Limits
OpenAI rate limits are real. Implemented caching and request queuing to avoid hitting them.
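A minimal sketch of the caching-plus-queuing idea is below. It is an illustration, not the tool’s real code: createCachedQueue is a made-up name, the cache key is assumed to be a plain string, and the queue simply runs one request at a time rather than tracking any actual rate limit numbers.

```javascript
// Hypothetical cache + serial queue around an async `explain(key)` function.
// Identical requests share one promise; distinct requests run one at a time.
function createCachedQueue(explain) {
  const cache = new Map()
  let tail = Promise.resolve()

  return function request(key) {
    if (cache.has(key)) return cache.get(key)

    // Chain onto the previous request so only one call is in flight at once
    const result = tail.then(() => explain(key))
    // Keep the chain alive even if this request fails
    tail = result.catch(() => {})
    // Drop failed requests from the cache so they can be retried
    result.catch(() => cache.delete(key))

    cache.set(key, result)
    return result
  }
}
```

Caching the promise rather than the resolved value means a second request for the same text, made while the first is still pending, does not trigger another API call.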
Why This Matters
Academic papers are humanity’s knowledge repository, but they’re locked behind jargon. If we can lower the barrier to entry, more people can learn from and contribute to research.
I’ve already used it to finally understand a paper on transformer architectures that’s been in my reading list for months.
Try It
It’s live at r.apurn.com. Just prepend it to any ArXiv HTML URL.
Fair warning: It’s still rough. Sometimes the explanations are too verbose. Sometimes they miss nuance. But it’s already helpful enough that I use it daily.
The Code
Thinking about open-sourcing it. The core is just:
- A Chrome extension for the keyboard shortcuts
- A simple proxy server
- OpenAI API calls
- Some JavaScript glue
If there’s interest, I’ll clean it up and put it on GitHub.
The goal isn’t to replace deep understanding. It’s to help you build it. Sometimes you just need someone to explain that one concept that’s blocking everything else.
That’s what Synthetic Trail tries to be - your patient study buddy who never judges you for forgetting what an eigenvalue is.