
If you work in Jupyter often, you probably find yourself sharing code between notebooks. Whether it’s creating plots, processing data, or running pipelines, replication is a part of data science. If you’re aiming to stay DRY (don’t repeat yourself), keep your code in sync across notebooks, or ease the transition from development toward reusable code, jupyckage can help. It’s a little package that makes for a big workflow upgrade.
TL;DR
If you want to convert your Jupyter notebook to a Python package,
pip install jupyckage
import jupyckage.jupyckage as jp
jp.notebook_to_package("<notebook_name.ipynb>")
OR
jupyckage --nb <notebook_name.ipynb>
will allow you to import your notebook as
import notebooks.src.<notebook_name>.<notebook_name>
Existing Options and Why I Don’t Love Them
Execute One Notebook Inside Another Notebook
You could use some magic to import that code, %run notebook_name.ipynb, but that’ll execute your notebook, which in data science is often not desirable. Here are two reasons, in addition to the ones I’ll give for nbconvert:
- If your notebook trains a model, downloads data, saves a file, or executes any long processes, you probably don’t want to run your notebook just to access its functions.
- It requires % magic, which is great in notebooks but not useful otherwise.
I’ve rarely been in a situation in which this was a viable option.
NBConvert and Import
You could use nbconvert to create a Python script and then import that file. I don’t love this solution because
- It creates a mess in my directory that I inevitably have to clean up. As soon as I do clean it up (i.e., organize the modules into directories), I have to do the work of creating a package anyway (i.e., __init__.py files and appropriate structure) in order to import the modules.
- Usually I’m working towards reusable code, which means I’m going to have to create a package from the modules anyway. nbconvert doesn’t help with this.
- It either must be done in the terminal, which interrupts workflow, or requires bang (!) magic, which is great while working in a notebook but not great when you move past it.
- Everything in my notebook is converted to the module, which means I have to be “done” or “done-ish” with that notebook to convert it.
These were big enough problems for long enough that I finally pulled together a simple solution.
A New, Slicker Solution: jupyckage
jupyckage creates (local) python packages from notebooks in one line of code.
Pip Install
The package is up on PyPI, so just pip install jupyckage.
Create and Import the Package
To create a package from your notebook, import jupyckage and run
import jupyckage.jupyckage as jp
jp.notebook_to_package("<notebook_name.ipynb>")
Or you can run this via terminal
jupyckage --nb <notebook_name.ipynb>
Either will create a local package for you, which you can import as
import notebooks.src.<notebook_name>.<notebook_name>
Yes, the import statement is long, but it should tab complete for you (either in Jupyter or an IDE).
Also, your notebook name will be reformatted to all lower case, with any spaces replaced by _. But if your notebook name contains any disallowed characters, there will be an error.
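To make the renaming rule concrete, here is a minimal sketch of it. This is not jupyckage’s own code: the helper name module_name_from_notebook is made up for illustration, and treating “disallowed characters” as “anything that does not leave a valid Python identifier” is my assumption.

```python
from pathlib import Path


def module_name_from_notebook(notebook_path: str) -> str:
    """Hypothetical helper: derive an importable module name from a notebook filename."""
    stem = Path(notebook_path).stem        # drop any directory and the ".ipynb" suffix
    name = stem.lower().replace(" ", "_")  # all lower case, spaces become underscores
    # Assumption: "disallowed" means the result is not a valid Python identifier.
    if not name.isidentifier():
        raise ValueError(f"{name!r} is not a valid Python module name")
    return name


print(module_name_from_notebook("My Analysis.ipynb"))  # my_analysis
```

Under that assumption, a notebook called my-notebook.ipynb would raise, because a hyphen cannot appear in a module name.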
I recommend importing the package as something convenient, e.g. as abv, where abv is your chosen abbreviation. As usual, you can access the functions and objects defined in your notebook via abv.<your_function>() and abv.<your_object>.
You Can Leave Stuff Out of the Module
Nice! If you are still working in your notebook, you probably don’t want all of it accessible in a module. You can add a Markdown cell with the contents
# DO NOT ADD BELOW TO SCRIPT
and nothing below will be added to the module.
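If you’re curious how a cutoff like this can work, here is a rough sketch that reads the notebook’s JSON and keeps only the code cells above the marker. It is an illustration and not jupyckage’s actual implementation; the helper name code_above_marker and the exact matching rule (a Markdown cell whose whole text is the marker) are assumptions.

```python
import json

STOP_MARKER = "# DO NOT ADD BELOW TO SCRIPT"


def code_above_marker(notebook_json: str) -> str:
    """Hypothetical helper: join the code cells that appear before the stop marker."""
    notebook = json.loads(notebook_json)
    kept = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # The notebook format stores "source" as a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        if cell.get("cell_type") == "markdown" and text.strip() == STOP_MARKER:
            break  # everything from this cell down stays out of the module
        if cell.get("cell_type") == "code":
            kept.append(text)
    return "\n\n".join(kept)


# A tiny in-memory notebook: one function, the marker, then scratch work.
example = json.dumps({
    "cells": [
        {"cell_type": "code", "source": ["def add(a, b):\n", "    return a + b"]},
        {"cell_type": "markdown", "source": ["# DO NOT ADD BELOW TO SCRIPT"]},
        {"cell_type": "code", "source": ["add(1, 2)  # scratch work"]},
    ]
})
print(code_above_marker(example))
```

Only the add function survives; the scratch cell below the marker is dropped.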
Directories and Files Created
You’ll also notice that the file structure below has been created.
notebooks/
├── src/
│   └── <notebook_name>/
│       ├── __init__.py
│       └── <notebook_name>.py
└── bin/
    └── <notebook_name>.py  # executable
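The layout above also explains the long import path. Assuming the tree is complete (that is, notebooks/ and src/ have no __init__.py of their own), Python 3 treats those two directories as namespace packages, so the import works whenever the directory containing notebooks/ is on the import path. The sketch below rebuilds the src/ half of the layout by hand in a temporary directory to show this. The name demo and the greet function are placeholders, and none of this calls jupyckage itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_layout(root: Path, name: str) -> None:
    """Recreate notebooks/src/<name>/ for a hypothetical notebook called `name`."""
    package_dir = root / "notebooks" / "src" / name
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text("")
    (package_dir / f"{name}.py").write_text("def greet():\n    return 'hello'\n")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_layout(Path(tmp), "demo")
    sys.path.insert(0, tmp)  # stand-in for starting Python in the folder that holds notebooks/
    importlib.invalidate_caches()
    try:
        # Same shape as: import notebooks.src.<notebook_name>.<notebook_name>
        demo = importlib.import_module("notebooks.src.demo.demo")
        result = demo.greet()
    finally:
        sys.path.remove(tmp)

print(result)  # hello
```

In everyday use you would not touch sys.path; starting Jupyter or Python from the folder that contains notebooks/ has the same effect.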
Create a Package from a Collection of Notebooks
If you want to convert other notebooks in the same directory, they’ll show up alongside your first one.
notebooks/
├── src/
│   ├── <notebook_name>/
│   │   ├── __init__.py
│   │   └── <notebook_name>.py
│   ├── <notebook_name2>/
│   │   ├── __init__.py
│   │   └── <notebook_name2>.py
│   └── <notebook_name3>/
│       ├── __init__.py
│       └── <notebook_name3>.py
└── bin/
    ├── <notebook_name>.py
    ├── <notebook_name2>.py
    └── <notebook_name3>.py
More Info
If you want to learn more, request a feature, report a bug, or contribute:
- PyPI: https://pypi.org/project/jupyckage/0.1.2/
- Code: https://github.com/kefrailey/jupyckage
- Issues: https://github.com/kefrailey/jupyckage/issues
Thanks for reading. Hope this helps and have a great day!