Hacker News
Jupyter AI (jupyter-ai.readthedocs.io)
272 points by jonbaer on Aug 6, 2023 | 36 comments


This looks useful, but not quite what I hoped.

GPT4 with Code Interpreter is a fun, frustrating experience where you’re writing a dialog about writing some code, sort of like pair programming or a code interview. Compared to a notebook, it’s terrible. The sandbox environment resets if you take a break. There’s also a quota, and if you hit that it forces taking a break, causing a reset.

In a notebook, you could rerun all the cells and pick up where you left off. When using Code Interpreter, GPT4 will see a stack trace indicating that a symbol is undefined, interpret that as a reset, and write the code again. It’s sort of cool the first time it happens but it’s unnecessary and becomes tedious.

The resulting experience is a cross between a roguelike and a text adventure, where I try to get something fun accomplished in one sitting before running out of quota. (This is strictly recreational programming.)

It’s beta. I assume they know it has problems and will eventually fix it.

I’d like to see a recreation of this “writing a dialog together about coding” experience using a notebook-like interface that isn’t terrible. The point isn’t just to write the code (it’s doing it the hard way), it’s writing a tutorial about how to solve a problem.

Jupyter AI looks like a somewhat more practical tool. It’s designed to not use the AI API too much to keep expenses down, and doesn’t have the impractical limitation that you cannot write the code yourself. It’s not the same game, though.


You might find it interesting to try my open source AI coding tool "aider".

It lets you pair program with gpt-4 like you are describing. But the source code lives in your local git repo. You can start a new project or work with an existing repo. You can fluidly switch back and forth between a coding chat where you ask gpt to edit the code and your own editor to make edits yourself.

https://github.com/paul-gauthier/aider


This looks like it might be quite nice for practical use. Does it work for a Jupyter notebook, including plotting things and making images?

I saw someone make it work in Colab, though it looks like a bit of a hack, and how they handle credentials looks iffy.

Ultimately, I'd like the final result to be a tutorial-style blog post, so git isn't strictly required for my purposes. The conversation is as important as the code.


Right now aider isn't integrated with jupyter notebooks, but it is certainly on the roadmap.

I have been sharing aider conversations [0] to help folks understand what it's like to pair program with GPT-4. I've had some users asking how they can share aider chat transcripts like this, so I'm hoping to add that capability soon. I don't think it's a full solution to your needs, but it might be helpful?

[0] https://aider.chat/examples/2048-game.html


Yes, more or less. I'd want to paste the output into an editor and then edit it into somewhat more polished prose. For many blogging tools, perhaps Markdown would be best?

I also wouldn't want the whole thing to look like a terminal window or to contain diffs, since the idea wouldn't be to represent aider or GPT4's output faithfully. Instead, the dialog would be about two characters who make additions to some code, like you do in a repl or notebook. Being able to download the notebook would be nice too.

This seems rather different (and more specialized) than the git-based approach that aider uses, so it's probably a different tool that I should get to writing someday.


If you are happy with markdown, aider already logs the conversations that way. You can find them in `.aider.chat.history.md`.


This is great, thank you! I can see myself using this whenever I program, though I can feel my skills atrophy whenever I have GPT write code.


Seems very good! In the "edit a whole repo" example, how do you account for many files, i.e., many tokens? Can it also do a vector search?


Aider scans the repo for all the important identifiers/symbols and condenses them down to make a "repo map" [0]. You tell aider which files you want it to edit, and it uses the repo map to augment them with all the relevant code context from the rest of the repo. This way when GPT makes code changes, it is able to respect and utilize the existing modules and abstractions present in the codebase.
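Not aider's actual implementation (the linked page describes a ctags-based approach), but the repo-map idea can be sketched in a few lines of Python with the standard `ast` module:

```python
import ast

def repo_map(sources: dict) -> str:
    """Condense each Python file to its top-level function/class names,
    giving the model cross-file context without sending full file bodies."""
    lines = []
    for path, code in sorted(sources.items()):
        tree = ast.parse(code)
        names = [node.name for node in tree.body
                 if isinstance(node, (ast.FunctionDef,
                                      ast.AsyncFunctionDef,
                                      ast.ClassDef))]
        lines.append(f"{path}: {', '.join(names)}")
    return "\n".join(lines)

files = {"utils.py": "def slugify(s):\n    return s.lower()\n\nclass Cache:\n    pass\n"}
print(repo_map(files))  # utils.py: slugify, Cache
```

The condensed map is prepended to the chat context, so the model can call `slugify` or `Cache` correctly even though it never saw their bodies.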

[0] https://aider.chat/docs/ctags.html


This is probably the more helpful page as it shows what you can do: https://jupyter-ai.readthedocs.io/en/latest/users/index.html...

This is a nice feature! Not huge, but it's great DevEx (MLEngEx...?)


I found the sum example somewhat vexing. The right answer there should have been to just use the `sum` builtin directly rather than adding that silly extra function.
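Assuming the docs example looks something like the helper below, the wrapper adds nothing over calling the builtin directly:

```python
# The generated helper (paraphrased) just wraps the builtin:
def calculate_sum(numbers):
    return sum(numbers)

# The idiomatic answer is simply:
total = sum([1, 2, 3, 4])
print(total)  # 10
```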


Thanks for the link, although for some reason I keep getting an "Incorrect API key provided" error on Jupyter for ChatGPT. I'm on a paid account, so I'm not sure why...


“Welcome to Jupyter AI, which brings generative AI to Jupyter”

Kinda strange statement. I think of Jupyter as one of the places generative AI originated!


This innocent announcement is how it begins. "Lemme help you with that… yeah, grab a few more GPUs for me, thx…! much appreciated!"


full circle indeed


I've tried a number of notebook AIs: Jupyter AI, Hex, Deepnote, Einblick. The one that worked best for me was Einblick, probably because it's data-aware. For AIs that don't support that, you need to be overly specific when writing prompts, which is annoying, and you keep having to rename/reference the correct dataframes and variables (even more annoying).


Would love to exchange notes on this if you're up for it!

For louie.ai, we've been going for data-aware from the get-go, and more broadly, doing a LLM-first tool design rethink. In the large, as I look around, it feels super early for the dev community figuring out core genAI notebook tool uses, flows, & assumptions. Likewise, zooming-in on individual feature experiments, current tools feel rough & underpowered relative to what we already know is possible.

We've been forced to question a lot as we've been learning from going operational and experimenting with design. Again, if up for it, would love to chat & exchange notes!


Can you expand on this "data awareness"? What does it mean, and what are its benefits?


By data-aware I mean that the AI leverages additional context about the data to generate code for a given prompt. Say you ask an AI to "build a regression model for column X". To give you a targeted, executable response, the AI needs to know: which dataframes contain a column named "X"? If there are many such dataframes, which one should be referenced for the regression task? Is X a numeric column, and if not, can it be converted to one? Does the data need to be normalized beforehand? If the AI is unable to answer such questions on its own, it will only ever be able to return a generic answer. That's equivalent to typing the prompt into ChatGPT, requiring the user to modify the returned code before it actually does what was asked. That clearly isn't great for an AI that operates on data. A data-aware AI, on the other hand, can provide more targeted responses that require much less user intervention, because it has access to the broader context.

A couple of other benefits:

- the AI will have an easier time automatically fixing runtime errors
- it knows how to fix and transform user input into the correct data format, e.g., "san fancisco" => "San Francisco"
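A minimal sketch of that kind of schema context (hypothetical, not any particular tool's implementation): collect column names and dtypes from the live namespace and prepend them to the prompt, so the model can pick the right frame and column on its own.

```python
import pandas as pd

def describe_dataframes(namespace: dict) -> str:
    """Summarize every DataFrame in the user's namespace (name, columns,
    dtypes) as plain text to prepend to the model's prompt."""
    lines = []
    for name, obj in namespace.items():
        if isinstance(obj, pd.DataFrame):
            cols = ", ".join(f"{c} ({obj[c].dtype})" for c in obj.columns)
            lines.append(f"DataFrame `{name}`: {cols}")
    return "\n".join(lines)

sales = pd.DataFrame({"X": [1.0, 2.5], "city": ["SF", "NY"]})
prompt = (describe_dataframes({"sales": sales})
          + "\n\nbuild a regression model for column X")
```

With that context included, the model can see that `sales.X` is already numeric and reference the right dataframe without the user spelling it out.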


So I’ve just gotten a GGML version of Llama 2 Chat running on my computer. Would starting the included Flask OpenAI API clone allow for integration with Jupyter AI while staying local on my machine? Or are the code helpers all trained differently than the general chat models?

Edit: it’s running via a llama.cpp server
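In principle, yes: any tool that speaks the OpenAI chat API can be pointed at a local OpenAI-compatible endpoint instead. A sketch of the request body such a bridge would POST (the URL and model name below are assumptions; check your server's docs):

```python
import json

# Assumed local endpoint for an OpenAI-compatible llama.cpp bridge.
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_request(prompt, model="llama-2-13b-chat"):
    """Build the OpenAI-style JSON body a client would POST to BASE_URL."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    })

payload = build_request("Explain this stack trace.")
```

As long as the local server accepts this shape and returns the matching response schema, a general chat model will work; code-tuned models just tend to answer better.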


Google Colab had generative AI in the notebook for quite a while but for some reason I didn't find myself using it.

Wondering if others are finding use for genAI in notebooks.


I do find it quite useful to have copilot on when working on notebooks within VS Code.

It's particularly useful when I can just write a comment explaining the kind of transformation I want, and it then writes a pandas incantation which I can immediately check and iterate on.
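For example, a comment like the first one below is usually enough for Copilot to produce the pivot (the table contents here are made up):

```python
import pandas as pd

# Pivot the long-format sales table so each region becomes a column,
# summing revenue per month.  (Copilot typically completes the rest
# from a comment like this one.)
sales = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Feb"],
    "region": ["EU", "US", "EU", "US"],
    "revenue": [100, 150, 120, 130],
})
wide = sales.pivot_table(index="month", columns="region",
                         values="revenue", aggfunc="sum")
```

The result is small enough to eyeball immediately, which is what makes the check-and-iterate loop fast in a notebook.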


Self plug. If you're looking for something less-integrated into JupyterLab (with support for ipython and Jupyter Notebooks), check out: https://github.com/santiagobasulto/ipython-gpt

I wrote the package to solve my own issue: I needed a really lightweight interface to GPT, primarily from IPython.


Bummer there's no support for local models. Llama 2 would run trivially on most Jupyter installs.


Stop with the chatbots. AI is much more useful for automatically handling things we couldn't handle before.


This is very cool! Coincidentally, I just spent an hour playing with this on Google Colab before seeing the docs posted here on HN.

While I don’t find this to be a life-changing new thing, it is useful, and the ChatGPT output is formatted beautifully. I use an iPad a lot when I am reading and doing quick code experiments about what I am reading, so the Copilot-like functionality and things like ChatGPT support really make Colab more than OK for quick code experiments, especially when I need an A100 GPU.


I made something similar just for fun: https://github.com/aleksanderhan/labpilot

It seems Jupyter is perfect for these kinds of code-assist tools. Instead of going to Wikipedia and scratching your head about how that algorithm worked again, you can just mention what you want and have a rough, good-enough implementation in no time.


Installation section:

"Installation via pip within Conda environment (recommended)"

This is one of the signs of Python's package management being too messy. I learnt NOT to use pip to install packages inside of conda environments after considerable pain. Now this guide says that's recommended?


I installed this with pipenv with no issues. It looks to me like they’re just recommending that you do not install in the global Python, not specifically recommending Conda.


More like Jupyter LLM.


...reproducible? Doesn't GPT change all the time?


I just use Copilot in PyCharm. It's really good. You can use a remote server.


Alternatively, there's also Chapyter: https://github.com/chapyter/chapyter


I want Jupyter-notebook-style cells overlaid on an LLM, not an LLM overlaid on my notebook.


For every development like this, you have the other side of the coin like the article also on the front page about Zoom using AI without any opt-out. Playing with fire indeed.



