Agreed. Notebook environments are great for exploration, discovery and pedagogy. They aren't so good for productionizing code.
We found this out the hard way when we tried to productionize ML code written in Jupyter. We had to export to .py files and add boilerplate. This works fine unless there is back-and-forth iteration between modeling and production, which there invariably is: our data scientists had to make changes to the notebook, we had to redo our boilerplate, and so the notebook code and production code were constantly out of sync. This could have been alleviated with automation -- but such automation is bespoke and hard to generalize.
PyCharm has a Scientific Mode (similar to RStudio's IDE approach, where you are actually writing code in a text file but can statefully/interactively run code blocks by pressing Ctrl-Enter). Spyder, MATLAB and a bunch of other IDEs implement this idea too.
This is, I feel, a better middle ground than notebooks between interactive exploration and having production-ready code.
I solve this in my personal workflow by extracting the important bit to a module, editing in that module, and testing/exploring changes in a notebook by reloading the module.
This is how I work as well: all the code I'm actively working with in a Jupyter notebook is directly visible on my screen. Any other code is generally 'finished' and lives in a text editor.
Additionally, I use the following settings in my ipython_config.py file to automatically reload modules:
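The settings themselves didn't survive into the quoted comment, but the standard way to do this looks roughly like the following (the exact config may have differed):

```python
# ipython_config.py -- run `ipython profile create` to generate one.
c = get_config()  # provided by IPython when it loads this file

c.InteractiveShellApp.exec_lines = [
    "%load_ext autoreload",  # enable the autoreload extension
    "%autoreload 2",         # reload all modules before executing code
]
```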
Note that the autoreload features can be very tricky to use safely with Python.
For example, at least in some previous versions, Caffe and TensorFlow made incompatible assumptions about their ability to claim all available GPU memory. So there can be situations where you first import Caffe, then later import TensorFlow with restrictions on its GPU policy. If you naively re-import the Caffe code, it can evict TensorFlow from whatever GPUs it had claimed, and coming up with a group of settings that reliably prevents this, across the possibly different machines where the notebook will be run, is very tricky.
This once led to a huge time sink because someone on my team created a mistaken GitHub issue claiming our TensorFlow model had a bug (since the notebook was producing an error). We spent all this time trying to reproduce it and figure out why it wasn't working, and eventually realized it was because of this hidden auto-reload setting on his specific IPython setup that caused Caffe to evict TensorFlow just for his specific usage pattern, resulting in strange errors because the TensorFlow model was no longer loaded in GPU memory.
There can be other problems too, like auto-reloading modules that have large start-up times (say if they load a very large model into memory). Sometimes you want to re-run a cell without auto-reload, even if you still want selective auto-reload functionality in other parts.
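For the selective case, IPython's autoreload extension does support a narrower mode: `%autoreload 1` reloads only modules explicitly whitelisted with `%aimport`, and `%aimport -name` excludes a module entirely. A rough sketch (module names here are hypothetical):

```python
# In an IPython session, not plain Python:
%load_ext autoreload
%autoreload 1           # reload only modules marked with %aimport
%aimport analysis       # track this fast-loading module
%aimport -heavy_model   # never reload this one (e.g. it loads a large model)
```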
Thanks for explaining the downsides of using this feature. Like all good config options, there are tradeoffs. Luckily I haven't been bitten by it yet, but I'll remember this if I run into issues.
Yes, I tried doing this too but it forces me to flip between the notebook and a separate text editor (for editing that module). It's not that seamless and the context switches were a little expensive (at least for me), but your mileage may vary.
On that console / IDE point you made, IPython can still be quite good for that if you use the interactive shell.
For example, I might make two shell tabs in tmux, and make one a small rectangle towards the bottom of the screen (holds my running IPython session), and a large rectangle above it (holds my Emacs where I’m editing source code).
And I might have a third shell tab somewhere that detects any time source files are changed and re-runs unit tests.
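That watcher can be anything from a shell one-liner with `entr` to a tiny polling script. Here's a minimal stdlib sketch of the mtime-polling approach; the `pytest` command and watched paths are illustrative, and dedicated tools (entr, pytest-watch) handle this more robustly.

```python
import os
import time

def changed_files(paths, mtimes):
    """Return files whose mtime differs from the last snapshot,
    updating the snapshot in place. First sighting counts as changed."""
    dirty = []
    for p in paths:
        mtime = os.stat(p).st_mtime
        if mtimes.get(p) != mtime:
            mtimes[p] = mtime
            dirty.append(p)
    return dirty

def watch(paths, interval=1.0):
    """Poll every `interval` seconds and rerun the test suite on change."""
    mtimes = {}
    changed_files(paths, mtimes)  # prime the snapshot
    while True:
        time.sleep(interval)
        if changed_files(paths, mtimes):
            os.system("pytest -q")
```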
Yes, that's definitely a possibility. Though it would be nice if it were more nicely integrated into an environment like RStudio where you can interactively set breakpoints, watch variables, etc. while still maintaining the interactivity.
I do the tmux/vim too, but for exploratory work the experience is less well-integrated than it could be with an Rstudio-like IDE.
I agree, and the IDE setups can be very valuable for certain use cases or certain preferences. The equivalent thing in the IPython shell approach, basically using a souped-up pudb, is not quite as nicely interactive for the debugging cycle: setting breakpoints, watchpoints, etc. is either a matter of editing them into the source code and re-running, or becoming a master of specifying them on the command line. Both require stepping out of the tight iteration workflow slightly (though to be fair, they also offer more power than the preconfigured options available in the IDE debugger features).
Yes, my IDE is vim but it's a hard sell to a lot of folks... especially having to map a shortcut key to "import ipdb; ipdb.set_trace()" for breakpoints...
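For what it's worth, since Python 3.7 the built-in `breakpoint()` can replace that mapped snippet: it dispatches through the `PYTHONBREAKPOINT` environment variable, so it can be pointed at `ipdb.set_trace` (or disabled entirely) without editing the source. A minimal sketch:

```python
import os

# Disable the hook so this snippet runs non-interactively; setting
# PYTHONBREAKPOINT=ipdb.set_trace instead would drop into ipdb here.
os.environ["PYTHONBREAKPOINT"] = "0"

def f(x):
    breakpoint()  # no-op under PYTHONBREAKPOINT=0
    return x + 1

print(f(1))  # 2
```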
Rodeo [1] was an attempt at such an IDE, but development died, and now that yhat's been acquired there's no sign of any further work on it. I wish the Jupyter folks would push more in this direction (and they are, with JupyterLab), but I get the sense they are really invested in the notebook paradigm.
Well, I guess they are invested in it as a component of the JupyterLab toolbox, but JupyterLab tries to integrate it with consoles and editing windows: https://lwn.net/Articles/748937/