Further Augmenting Long Term Memory

Integrating LLMs with Flashcard Revision

Mar 27, 2024

Style Comment

This post is intended to have a wide audience: from experienced Spaced-Repetition flashcards users to students just discovering its benefits. Therefore, I try to strike a style between a technical essay and a fun blog post.

Intro

If you do not currently spend at least 5 minutes a day reviewing personal flashcards, you should.

Stay with me.

You can put anything on those flashcards: the mistakes you want to learn from, your boss’ favourite food, the definition of mitosis for your upcoming exam, the maths of the attention mechanism, good one-liners.

“Anki makes memory a choice” - Michael Nielsen

That’s a lot of useful information.

I’ve been using flashcards for the last five years and they have transformed my capacity for creative ideas, do my work well, and just be a better friend.

Reasonable objections at this point:

Not enough time
Flashcards are not fun
Making flashcards is hard
Not actually using the information on your flashcards in real life

Fair points.

So let’s overcome those objections.

We’ll start with a story:

I am the eldest child of four children – which is awesome because it means I get to (try to) help my younger siblings in a hope a more relatable way than my parents can.

For example, I was recently trying to help my younger brother revise for his British A Levels (End of School leaving exams). Piece of advice number one (if you haven’t picked this up yet): make Anki flashcards to transfer content into long-term memory so that he doesn’t have to cram before an exam.

Except I’d often find him staring blankly at a stack of flashcards trying desperately to motivate himself to get through them. Because to be fair to him, as important as spaced repetition is, memorising facts on virtual tiny pieces of paper often isn’t as attractive to our brains as the allure of TikTok or YouTube.

So, trying to be a helpful older brother, I offered to revise with him: I’d look at his flashcards and then challenge him to teach me about the topics he was learning. We ended up having some pretty lively discussions about what he was learning.

Unfortunately, though, real life means I can’t always be there in person when he wants to revise or needs an academic pep talk.

So how could I provide my brother with a tutor to be with him all the time?

The obvious answer to that in 2024 is to make an AI chatbot that does exactly that. I wanted to turn his studying into a game - giving him a way to make information stick not through mindless repetition but rather through lively, engaging discussion tailored to how he learns best.

This ended up transforming the way both of us use Anki and, having now shared this with a few friends, it seems like something other people might also find helpful.

So, this is me sharing with you what I made for him. I think it overcomes all of our previous objections and might even persuade you to spend 10 minutes a day using flashcards.

Overview

In this post, I discuss:

🧠 Long term memory and why I think it is still important in a world of LLMs

🏔️ Some of the limitations of spaced repetition flashcard apps

🔮 The 5 key principles I used for designing an optimal AI study tutor

🚀 How to set up our implementation of a GPT to give your flashcards a major upgrade

🤯 Real examples of these AI tutor sessions in action and limitations

💭 A hypothesis on mind map flashcards as a challenging dataset for testing multi-modal AI abilities

These go against the grain of traditional advice to keep flashcards as atomic as possible, but I’ve found them to be incredibly powerful for increasing creativity and understanding.

Plus at the end, I also have some suggestions for tiny changes to the ChatGPT frontend based on this experiment.

“So if you’re ready to level up your study game, let’s dive in and explore how AI can breathe new life into your flashcards! No more snooze-fest memorization - it’s time for fun, engaging knowledge osmosis. 😎 Let me show you how it works... ” – Claude Anthropic – when given the context of this blog post

If you are already convinced and want to dive into using GPT4 to rewrite your flashcards and then provide you with a tutorial on them, here is the link to set up the GPT:

AnkiTutorGPT Set Up Guide

Adam Winnifrith

Mar 28

AnkiConnect First you’ll need to install the AnkiConnect plugin for local Anki – installation instructions here, and briefly described below: Open the Install Add-on dialog by selecting Tools | Add-ons | Get Add-ons... in Anki. Input 2055492159 into the text box labeled

Read full story

Why care about long-term memory in a world of LLMs

But briefly, why is long-term memory so important?

Creative ideas are (almost) always about someone putting together a combination of other ideas in a new way. Novel ideas, therefore, are based on a rich, and easily-accessible, long-term memory. When we store information in our long-term memory and spend time recalling that information, we build up the mental framework for creativity. Put another way, creative thinking reinforces the mental apparatus for long-term memory.

Problems with traditional active recall programs for personal knowledge databases

I think there are three broad problems with the current way people use Anki:

1. Anki provides you with the same prompt for each piece of information that you’d like to learn. As a result, you just learn a response to a prompt and limit your ability to recall information in context – i.e. in situations where you’d actually find it useful. In contrast, recalling information in an alternative context to that which you learnt is known as embedding – and it is one of the most effective forms of active recall.

2. You have to reveal the entire answer to see if you got your answer right, which limits your ability for complete active recall. There’s been many a time when I was reviewing flashcards and I correctly answered part of the flashcard. It would be great if in these situations I had someone else to learn with who could have looked at the back of flashcards and say “That’s partly right, what about this specific part though …”

For example, see this flashcard below that I made after I got curious about the smell from some cacao nibs. When I reviewed this flashcard recently, I correctly remembered that it had something to do with fermentation but not that the smell specifically was from ethanoic acid.

As a biochemist, I find it particularly interesting that a chemical so small can have such a distinct smell (quite why I’m not sure), so I’d like to remember it! It would have been ideal if I could have had a friend respond to my first answer with “That’s exactly right that the smell comes from microorganisms growing and metabolising on the cacao nibs. Can you remember exactly what chemical causes the particular vinegary smell?”

(Though perhaps this is also a sign that my flashcards aren’t atomic enough – we’ll get back to this later – as we also provide a flashcard GPT that can help with this.)

3. Anki lacks some good principles for making learning effective and fun – namely positive feedback when you get something right. This is something that the best language and skill-learning apps (Duolingo, Brilliant) do incredibly well. You’re provided with a fun (and somehow confusingly addictive) sound when you get the right answer and an animal (that you somehow have lots of affection for and desperately don’t want to let down) that provides you with praise and encouragement. These apps go even further and utilise some tricks of social media to make learning addictive. They have streaks that you don’t want to lose and leagues that you want to climb.

Design criteria for AnkiTutorGPT

I therefore set out to implement a flashcard study tutor with the following design criteria:

1.       The tutor should be encouraging and make learning fun.
2.       The tutor should be able to provide follow-up questions if part of the answer is incorrect.
3.       Experimentation: the tutor should ask questions that prompt the learner to explore information in different contexts. Creating a discussion around a topic.
4.       Cognitive load: The tutor should be able to adjust to the learner’s grasp and understanding of the content to scale how difficult follow up questions are.
5.       Feedback: The tutor should provide positive feedback when you get things right and specific actionable feedback when you get things wrong.

AntiTutorGPT

To keep this blog post concise, we provide a description of how to set up your own AnkiGPT here. We unfortunately cannot yet provide a link to a GPT that works off-the-shelf because, as of 23/03/2024 a custom HTTPS port from your local Anki has to be created and this has to be defined in the GPT Actions specification and will be different for everyone.

If you happen to be from OpenAI and are reading this, we’d be so grateful for your help fixing this – please see our suggestion list below (there are two super simple ways to do it)!

As a brief explanation of the setup, the GPT we’ve created pulls the current flashcard that you have open in the Anki desktop app. It does this by connecting to the AnkiConnect API through a local tunnel URL that connects to a local port on your computer in which Anki sits (please read the guide if that sentence doesn’t make sense – or ask ChatGPT to break it down step by step 😊). It then looks at both the front and back of your flashcard and rewords the flashcard front into a different context.

Additionally, we quickly realised that there are many GPT4 calls and therefore if you already had a ChatGPT Plus subscription AnkiGPT very quickly becomes very cost-effective to use inside a ChatGPT Plus subscription instead of API calls. (Using inside ChatGPT comes with a speed limitation and we working on developing a setup for API calls for faster learning later).

Lessons from learning with AnkiTutorGPT

Since creating AnkiTutorGPT, for most of my text-based flashcards, it is now the default way in which I learn.

One of its behaviours is helping me recall when I don’t get the full answer; see this example where I misremembered what the coccus part of the name of the staphylococcus bacteria is.

Beyond just missing pieces of information, AnkiTutorGPT will encourage you to show a full understanding if you have a flashcard that is more than just a singular atomic idea. See this example of it encouraging me to recall in full a geometric understanding of the dot product of vectors: https://chat.openai.com/share/eba9a4ae-aae0-4a12-9a4a-f78ca1f97319

AnkiTutorGPT also intentionally asks follow-up questions to help embed the information in new contexts: https://chat.openai.com/share/179a885c-7cbb-4bbf-bcfd-02be63b0a806

We created a second prompt that focuses specifically on getting you to recall the information within a new and interesting context. We call this embedding mode:

of course, throughout it maintains a positive and inspiring tone!

We provide a full breakdown of the prompts in the set-up blog post.

We’ve also used AnkiGPT to rewrite flashcards or create DALLE 3 images to help us remember particular flashcards. We found that we were doing this so often we created a specific GPT that rewrites flashcards to focus on key core concepts, creates fun mnemonics, and always offers to create a DALLE 3 image. This one you can find here.

Which mode, Recall or Embedding, works better?

We’ve yet to fully test which mode works better. We’re making these prompts available partly in case any psychologists out there want to study which works better.

Conversations with ChatGPT audio mode are even better

An unexpected advantage of being forced to create the HTTPS set up, was that we could use AnkiGPT on the ChatGPT app. This enabled us to have entirely voice conversations about flashcards and we found this to be extremely effective for embedding learning.

This was almost like having an actual 1-to-1 tutoring session. There’s something about being forced to explain something out loud and then being given feedback audibly that really forces you to think about the wider context. It also allows you to do it while doing chores if you want to!

However, this does slow your learning down as you have to listen to ChatGPT respond. The quickest set-up is to use voice typing and then read the ChatGPT response.

Limitations

This 80/20 proof of concept has a few limitations that we’d like to improve:

1) Most obviously, AnkiGPT requires that you have Anki open on your computer. So you can’t currently learn on the go.

2) Speed. Whilst AnkiGPT seems to be more effective for learning, it takes longer. You have to make the prompt into ChatGPT to look at the flashcard and then often will spend more time answering follow up questions. I think the extra engagement with the flashcard likely extends the interval by which you next need to see the card, and so total revision time might go down, but I don’t have any good metrics to prove this. To fix this we’re making a Anki AddIn that rewrites the flashcards before you see them. This will also allow us to separate the question prompt from the prompt for the tutor’s behaviour.

I’ve also created a GPT prompt that does not reword the question that you see. This speeds of the process as you can give the answer straight away – this is the mode that I am using most at the moment as a balance between speed and engaging tutor interaction.

3) Photo based flashcards. Many times, a flashcard has a photo prompt or answer. As AnkiGPT currently doesn’t look at the photos on the flashcards, when this forms the majority of the information to be recalled, it does not make sense to use AnkiGPT (though breaking up text based flashcards with photos or diagrams does help extend the time before we were hitting our GPT4 message caps so we actually found this quite helpful!).

In particular, for me a lot of my flashcards are in the form of asking me to recall summary mind maps. Mind maps are, in my opinion, one of the most effective ways to store information in long term memory in the human brain. Having a conversation with an LLM about my mind maps would be amazing - I have a suggestion below to frontier model building companies about a benchmark dataset on this.

Suggestions for the next OpenAI hackathon

If you happen to work at OpenAI and are reading this, there are a number of super simple front-end changes that we’d love to build into AnkiGPT. We’re making these suggestions here because we think it makes more sense to build them into the ChatGPT frontend than create our own app:

1)      The ability for someone to provide their HTTPS URL whenever they start a conversation (without having to make a GPT4 call) OR the ability to connect to local port URLs.
2)      Streaks. Streaks make learning addictive. Please can we have a streak for AnkiGPT?
3)      Fun noise when you get something right. Brains are weird. Fun noises make us happy. Please can we have some fun noises when we get things right (Any chance Duolingo team would be happy to share their noise? I am so addicted).
4)      Timer! Timer’s allow us to hit our goals. Can we have a timer for how long we spend talking with AnkiGPT daily?
5)      Self-prompting – Especially in the embedding mode, we observed that the model was able to better direct conversations in an interesting direction when we made it write out a summary paragraph of content related to the flashcard first. We tried various ways to make this invisible to the user (asking the model to “Run code” that creates a summary paragraph – as this is usually hidden by default, and asking the model to write in grey markdown text so the user couldn’t see it) but couldn’t get it to work – if there was a way to easily provide a space for GPTs to self prompt that would be very helpful.

Conclusions

I think that learning can and should be not just fun but a tool through which we make the world a better place. The most innovative ideas arise from combining knowledge - connecting the dots in novel ways made possible by wide exposure across disciplines.

I want to see more innovative ideas in the world. This will make the world a better place. This is good.

However it seems that many education systems set students up to learn content purely for exams. Qualifications are of course life-changing tools, but we’re hopeful that more and more of us will learn information not just for exams but to apply it in generative ways.

I hope that Anki Tutor GPT makes a tiny contribution to this.

It was really important to make this open source so that any learning app can learn from this: Khan Academy, Quizlet, Duolingo, and Seneca (though we know that many of these companies probably did this in 2022!). We also don’t think it replaces these platforms – their value lies in the course content creation – the organisation of the course content.

We built this on top of Anki because of the number of incredible flashcard decks that already exist for numerous exam systems worldwide. We, therefore, hope that this acts as an incredibly useful learning tool for anyone studying for exams on any piece of content. If anyone uses Supermemo and would like this go ahead and share it.

I also think it is highly likely that the prompts that I have created could be improved. Please share any learnings you have with improving the prompts in the comments of the set up blog post. The system that I currently find most helpful is the basic AnkiTutor GPT (not embedding) that looks at my flashcard and my answer and helps me to recall it correctly.

I’ve had the immense privilege of learning at Oxford, where the tutorial system creates a fun environment in which an expert in what you are learning about asks you probing questions to further your understanding. I’ve attempted to provide some of that to my younger brother and I hope this goes some way to sharing it even further and making this kind of education accessible to all.

Happy learning!

Appendix A: Could an LLM understand and have a conversation about a Mind Map?

I create most of my notes about papers and lectures and plan blog posts, articles, and essays using mind maps. I’ve found them to be one of the most effective learning tools there is. To help me recall entire papers in 5 minutes, I put flashcards into my Anki deck that ask me to recall mind map summaries. For example, below are some of the mind map summaries I made for papers that feature in my review of generative artificial intelligence for de novo protein design.

Link to the paper: CaLM

Link to paper

Mind maps work even better for talks where there are key ideas the speaker is trying to communicate as they make excellent branch headings:

Link to talk

To speed up the time it takes to recall these mind maps, I will, ideally, speak out loud retelling the story of the talk or the paper. I’ll then look at the mind map and check that I hit all the key ideas and if there was anything important that I missed.

I’ve tried a few times providing ChatGPT/Gemini/Claude Opus with the photo of my mind map and the transcript of my out loud ramblings about the mind map and asked it to see if I recalled most of the information. At the moment most of them don’t succeed very well. This might be because of my Dark background choice (but when I switch to a white background it is still as bad).

I’ve wondered whether creating a benchmark dataset of MindMap Images and description pairs would be an interesting multi-modal task for LLMs. This is because of the structure of a mind map. It is a series of keywords associated with lines (it aims to mimic neurons). An LLM in theory should be great at this – working out how each keyword relates to the other and therefore what the mind map is trying to tell. 0

I’d love to get feedback on this idea! I’d also be happy to share lots of my mind maps and descriptions to go with them if anyone working on frontier models also thinks this is interesting.

Acknowledgements

Thank you to all the people who read drafts on this blog post and gave very helpful feedback: Patrick, Piotr, Lily, Osaid. Thank you to Nathan for the helpful discussion which framed a lot of this narrative!

Adam’s Substack