Akshat Mittal
Weekly Update: July 15, 2025
🧠 North-star Metric
- Number of PDFs successfully parsed and readable through our interface
- Average session duration per user when engaging with documents
✅ What I Got Done
- Implemented functionality to extract authors from search input and use AI to match with the most relevant author profile.
- Built and deployed new query suggestion feature based on users’ previous queries.
- Researched OpenAlex’s approach to identifying related papers to explore how we can integrate or replicate their method for improving paper discovery.
💡 Learnings / New Thoughts
Realized I may be too attached to our current MVP. YC advice reminds me to stay detached until we’re building something that clearly resonates. The product works, but it’s not yet novel or differentiated — and that’s okay at this stage.
🚧 What’s Limiting Right Now
The current solution feels too close to traditional academic search engines — it’s functional but lacks a distinct perspective or edge. I’m struggling to see what will make this product stand out or feel indispensable to users. This lack of uniqueness is making it hard to stay inspired and push bold ideas forward.
🎯 Goals for This Week
- Conduct 2 user interviews with academic researchers to understand how they currently explore related papers and where they get stuck.
- Co-ideate with Eli and produce 3 UI concepts that help users narrow down paper selection and run focused queries.
- Prototype and test a workflow where users can select a subset of papers and run batch LLM queries; evaluate usability with at least 1 user.
Weekly Update: July 7, 2025
🧠 North-star Metric
- Number of PDFs successfully parsed and readable through our interface
- Average session duration per user when engaging with documents
✅ What I Got Done
- Built a PDF upload and search demo using a RAG (Retrieval-Augmented Generation) pipeline for semantic search
- Implemented user authentication (Sign Up and Login) to track returning users
- Added a post-paper survey prompt to collect structured user feedback
- Integrated OpenAlex and Semantic Scholar APIs to improve academic paper discovery
- Continued exploring connected papers and citation graphs to enhance filtering by relevance
- Structured the codebase to support parallel development of both "read" and "find" components
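The RAG demo above hinges on semantic retrieval over parsed PDF chunks. As a minimal, self-contained sketch of that retrieval step — using a toy bag-of-words "embedding" purely as a stand-in for a real embedding model, which is an assumption, not our actual stack:

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank parsed PDF chunks by similarity to the query; in a full RAG
    # loop, the top-k chunks would be passed to the LLM as context.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Swapping `embed` for a real model (and a vector index for the linear scan) turns this into the production shape without changing the interface.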
💡 Learnings / New Thoughts
- Gained a deeper appreciation of how LLMs act as a new kind of programmable interface—allowing functionality to be expressed and executed through natural language to a meaningful extent
🚧 What’s Limiting Right Now
- Lack of a deeper, grounded understanding of the core user problem in academic research—more discovery and validation is needed
🎯 Goals for This Week
- Reactivate and debug the "read" pipeline
- Implement text and element tree-type fetching for connected papers
- Conduct two user interviews
- Await visa feedback
Weekly Update: June 30, 2025
🧠 North-star Metric
Paper Searches through Find Papers Fast: 25
Target for this week: 100 searches
✅ What did I get done?
- Integrated PostHog and a search API to enable analytics and paper search functionality in Find Papers Fast.
- Collected initial user feedback on search results.
- Circulated a form to identify users’ most painful problems.
- Scheduled visa appointment for Germany (personal milestone).
⛔ What is limiting right now?
No major blockers.
🎯 Goals for this week
- Reach 100 completed paper searches via Find Papers Fast.
- Collect 50+ user emails and pain point submissions via the form.
- Integrate OpenAlex API and validate improvement in result breadth.
- Complete and submit German visa application.
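The OpenAlex integration goal comes down to calling its public works endpoint. A minimal sketch of that call, assuming the documented `search` and `per-page` query parameters of the OpenAlex REST API (field names in the response handling are based on its published schema):

```python
import json
import urllib.parse
import urllib.request

OPENALEX_WORKS = "https://api.openalex.org/works"

def build_search_url(query: str, per_page: int = 5) -> str:
    # Full-text search over works; per-page caps the result count.
    params = urllib.parse.urlencode({"search": query, "per-page": per_page})
    return f"{OPENALEX_WORKS}?{params}"

def fetch_titles(query: str, per_page: int = 5) -> list[str]:
    # Fetch matching works and pull out their titles.
    with urllib.request.urlopen(build_search_url(query, per_page)) as resp:
        data = json.load(resp)
    return [w.get("title") for w in data.get("results", [])]
```

Validating "improvement in result breadth" could then be as simple as comparing title lists for the same queries before and after the integration.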
Weekly Update: June 23, 2025
🧠 North-star Metric
Primary metrics:
- Number of PDFs successfully parsed and readable through our interface
- Average session duration per user when engaging with documents
✅ What did I get done?
- Added extracted reference links and notes to the frontend
- Refactored code into modular components for PaperC
- Generated ordered topics for PDF documents
- Enabled uploading and linking of PDFs to relevant summaries
- Implemented paper fetching based on generated keywords
💡 Learnings / New Thoughts
Using LLMs has significantly increased my coding throughput—about 3x more code in the same amount of time—but this can also lead to messier structure and less deliberate design. I’m adjusting by building in more upfront planning and modular architecture.
I'm also realizing how LLMs can act as makeshift backends through prompting, enabling surprisingly structured outputs without traditional API logic.
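One concrete pattern behind the "LLM as makeshift backend" idea is asking the model for strict JSON and validating the reply before using it. A minimal sketch — the prompt wording and the schema fields here are illustrative placeholders, not our actual schema:

```python
import json

def build_prompt(user_query: str) -> str:
    # Ask the model to behave like an API endpoint: text in, JSON out.
    return (
        "Extract search intent from the query below. "
        'Reply with only JSON: {"topic": str, "authors": [str]}.\n'
        f"Query: {user_query}"
    )

def parse_response(raw: str) -> dict:
    # Models often wrap JSON in prose or code fences; recover the object
    # between the first "{" and the last "}", then validate required keys.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object in model reply")
    obj = json.loads(raw[start : end + 1])
    missing = {"topic", "authors"} - obj.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return obj
```

The validation layer is what makes the prompt usable as a backend: downstream code can rely on the keys being present even when the model's formatting drifts.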
🚧 What is limiting right now?
I'm currently struggling with a lack of inspiration and direction when it comes to deepening the paper search experience. This seems rooted in a deeper uncertainty about the product's direction.
🎯 Goals for this week
- Restructure document-fetching logic into clean, testable service modules
- Add search options that allow users to explore a broader set of research papers
- Conduct 2 user interviews focusing on how users currently search for relevant academic papers
Weekly Update: June 16, 2025
🧠 North-star Metric
Primary metrics:
- Number of PDFs successfully parsed and readable through our interface
- Average session duration per user when engaging with documents
✅ What did I get done?
- Deployed the larger GROBID model, enabling broader extraction coverage including titles, authors, formulas, figures, references, and footnotes/endnotes
- Integrated the frontend to dynamically fetch and display all extracted data from the GROBID output
💡 Learnings / New Thoughts
- Cursor’s inline interaction model inspired the idea of treating research papers more like modular, interactive codebases
- Thinking of papers as structured, navigable artifacts (like repos) could help drive UI and UX decisions
- Realized that users prefer writing their own prompts; we should embrace customization rather than abstract prompts away
- Structuring the document well is a prerequisite for building effective search and other layered features
🚧 What is limiting right now?
- No external blockers currently
- A potential internal constraint: a lack of structured daily goals may be reducing my productivity ceiling—experimenting with tighter day-to-day objectives
🎯 Goals for this week
- Implement a clean, intuitive UI with proper loading states for each document section
- Fully link all extracted content (text, images, formulas) to their correct positions within the document
- Extract and render formulas accurately within the reading interface
Weekly Update: June 9, 2025
🧠 North-star Metric
Number of PDFs uploaded and read via Read Papers Fast
Surpassed this week’s target—very happy with the progress and system performance.
✅ What did I get done?
- Deployed GROBID and a Python microservice to production, improving PDF parsing throughput and automation.
- Deployed a machine learning model (PubLayNet) for scientific image extraction.
- Implemented a CI/CD pipeline that reduces deployment-to-production time, enabling faster iteration.
- Refactored and updated the database schema to better support linking extracted content (images, text).
💡 Learnings / New Thoughts
- Realized that striving for perfection slows down user feedback—shipping early, even if imperfect, is more valuable.
- Practicing more flexibility in defining MVP features is accelerating collaboration and progress.
- Letting go of anxiety around launch readiness is helping me stay focused on learning from real usage, not just planning.
🚧 What is limiting right now?
- Lack of a dedicated remote machine is delaying model training and experimentation cycles.
- No clear timeline or milestone structure for VISDA, making prioritization and resource planning difficult.
🎯 Goals for this week
- Set up PostHog with tracking for at least 5 critical user events across upload and reading flows.
- Link extracted images to the correct content sections.
- Extract and render formulas in the reading interface.
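For the PostHog goal, event capture reduces to a distinct id, an event name, and a properties dict. A sketch of how the five upload/reading events might be modeled — the event names are placeholders I've invented for illustration, and the SDK usage in the comment assumes the official PostHog Python client:

```python
# Candidate critical events for the upload and reading flows
# (names are placeholders, not a finalized taxonomy).
CRITICAL_EVENTS = [
    "pdf_upload_started",
    "pdf_upload_succeeded",
    "pdf_parse_failed",
    "document_opened",
    "section_expanded",
]

def make_event(distinct_id: str, event: str, properties: dict) -> dict:
    # Validate against the allow-list so dashboards stay consistent.
    if event not in CRITICAL_EVENTS:
        raise ValueError(f"unknown event: {event}")
    return {"distinct_id": distinct_id, "event": event, "properties": properties}

# With the official SDK, this payload would map onto roughly:
#   import posthog
#   posthog.project_api_key = "<key>"
#   posthog.capture(distinct_id, event, properties)
```

Funneling every capture through an allow-list like this keeps event names from drifting as the flows evolve.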
Weekly Update: June 2, 2025
📝 Context
Back after a 10-day vacation and some much-needed time with family. Feeling a bit under the weather today, so taking it slow while ramping back up.
✅ What did I get done?
- Reviewed the Read Papers Fast architecture documents and diagrams, gaining a clear understanding of how we will access the database and how the Python and Next.js pipeline will operate.
- Initialized a clean microservice repository in Python with proper import structure; migrated legacy code to begin integration.
- Completed GCP verification steps to enable access and begin validating the cloud environment for CI/CD setup.
💡 Learnings / New Thoughts
- Stepping away from coding for a few days helped me return with renewed focus and a broader strategic lens. I was able to see technical decisions more clearly and prioritize better.
- Beginning a meditation practice revealed how crucial mental clarity is to sustaining deep work. It's prompted me to rethink my daily schedule to better support focus and energy management.
🚧 What is limiting right now?
- Insufficient cloud quota in the Netherlands region is preventing access to GPUs, which is slowing down testing and deployment of resource-intensive components.
- Need greater clarity on the direction and requirements for image, table, and formula extraction, especially regarding output format and integration with the rest of the pipeline.
🎯 Goals for this week
- Finalize the CI/CD pipeline such that Python services are integrated with GROBID and the Next.js frontend, enabling full deployment and testing. This will create a stable foundation for building and iterating on further extraction and integration features.
- Ensure that all extracted images are accurately matched to their corresponding figure metadata and correctly referenced in the paragraphs where they should be displayed.
- Test the end-to-end pipeline using Gemini-generated keywords and structured GROBID-extracted data to validate the integration and information flow.