Introducing jupyckage: Create Python Packages from Notebooks in One Line of Code

If you work in Jupyter often, you probably find yourself sharing code between notebooks. Whether it’s creating plots, processing data, or running pipelines, replication is a part of data science. If you’re aiming to keep DRY (don’t repeat yourself), keep your code in sync across notebooks, or ease the transition from dev towards reusable code, jupyckage can help. It’s a little package that makes for a big workflow upgrade.

  1. TLDR;
  2. Existing Options and Why I Don’t Love Them
    1. Execute One Notebook Inside Another Notebook
    2. NBConvert and Import
  3. A New, Slicker Solution: jupyckage
    1. Pip Install
    2. Create and Import the Package
    3. You Can Leave Stuff Out of the Module
    4. Directories and Files Created
    5. Create a Package from a Collection of Notebooks
  4. More Info

TLDR;

If you want to convert your jupyter notebook to a python package,

pip install jupyckage
import jupyckage.jupyckage as jp
jp.notebook_to_package("<notebook_name.ipynb>")

OR

jupyckage --nb <notebook_name.ipynb>

will allow you to import your notebook as

import notebooks.src.<notebook_name>.<notebook_name>

Existing Options and Why I Don’t Love Them

Execute One Notebook Inside Another Notebook

You could use some magic to import that code, %run notebook_name.ipynb, but that’ll execute your notebook, which in data science is often not desirable. Here are two reasons, in addition to the ones I’ll give for nbconvert:

  • If your notebook trains a model, downloads data, saves a file, or executes any long processes, you probably don’t want to run your notebook just to access its functions.
  • It requires % magic which is great in notebooks, but not useful otherwise.

I’ve rarely been in a situation in which this was a viable option.

NBConvert and Import

You could use nbconvert to create a Python script and then import that file. I don’t love this solution because:

  • It creates a mess in my directory that I inevitably have to clean up. As soon as I do clean it up (i.e., organize the modules into directories), I have to do the work of creating a package anyway (i.e., __init__.py files and appropriate structure) in order to import the modules.
  • Usually I’m working towards reusable code, which means I’m going to have to create a package from the modules anyway. nbconvert doesn’t help with this.
  • It either must be done in the terminal, which interrupts workflow, or requires bang (!) magic, which is great while working in a notebook but not great when you move past it.
  • Everything in my notebook is converted to the module, which means I have to be “done” or “done-ish” with that notebook to convert it.

These were big enough problems for long enough that I finally pulled together a simple solution.

A New, Slicker Solution: jupyckage

jupyckage creates (local) python packages from notebooks in one line of code.

Pip Install

The package is on PyPI: just pip install jupyckage.

Create and Import the Package

To create a package from your notebook, simply run

import jupyckage.jupyckage as jp
jp.notebook_to_package("<notebook_name.ipynb>")

Or you can run this via terminal

jupyckage --nb <notebook_name.ipynb>

Either will create a local package for you, which you can import as

import notebooks.src.<notebook_name>.<notebook_name>

Yes, the import statement is long, but it should tab complete for you (either in jupyter or an IDE).

Also, your notebook name will be reformatted to all lower case, with any spaces replaced by _. But if your notebook name contains any disallowed characters, there will be an error.
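
To make the renaming rule concrete, here is a minimal sketch of that kind of normalization. It is not jupyckage’s actual code: to_module_name is a hypothetical helper, and it assumes “disallowed characters” means anything that stops the result from being a valid Python identifier.

```python
from pathlib import Path


def to_module_name(notebook_filename: str) -> str:
    """Hypothetical helper: derive an importable module name from a notebook filename."""
    stem = Path(notebook_filename).stem        # drop the .ipynb extension
    name = stem.lower().replace(" ", "_")      # lower-case, spaces -> underscores
    # Assumption: a "disallowed" name is one that is not a valid Python identifier,
    # e.g. it contains "-" or starts with a digit.
    if not name.isidentifier():
        raise ValueError(
            f"{notebook_filename!r} does not map to a valid module name: {name!r}"
        )
    return name


print(to_module_name("My Plots.ipynb"))
```

Under these assumptions, a notebook called My Plots.ipynb would end up as my_plots, while something like my-plots.ipynb would raise an error until you rename it.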

I recommend importing the package as something convenient, e.g. as abv where abv is your chosen abbreviation. As usual, you can access the functions and objects defined in your notebook via abv.<your_function>() and abv.<your_object>.

You Can Leave Stuff Out of the Module

Nice! If you are still working in your notebook, you probably don’t want all of it accessible in a module. You can add a Markdown cell with the contents

# DO NOT ADD BELOW TO SCRIPT

and nothing below will be added to the module.
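
If you’re curious how a cutoff like this can work, here is a rough sketch using only the standard library. It is an illustration, not jupyckage’s implementation: exportable_code is a hypothetical function, and the demo notebook is a stripped-down stand-in for a real .ipynb file (which is JSON with a list of cells).

```python
import json

MARKER = "# DO NOT ADD BELOW TO SCRIPT"


def exportable_code(notebook_json: str, marker: str = MARKER) -> str:
    """Return the source of all code cells above the first Markdown cell containing `marker`."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # Notebook files store cell source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if cell.get("cell_type") == "markdown" and marker in source:
            break  # everything from here on stays out of the module
        if cell.get("cell_type") == "code":
            chunks.append(source)
    return "\n\n".join(chunks)


# A tiny in-memory notebook for illustration (real notebooks carry more fields).
demo = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "source": ["def add(a, b):\n", "    return a + b"]},
        {"cell_type": "markdown", "source": ["# DO NOT ADD BELOW TO SCRIPT"]},
        {"cell_type": "code", "source": ["add(1, 2)  # scratch work"]},
    ],
})

print(exportable_code(demo))
```

In this sketch the add function makes it into the exported source and the scratch cell below the marker does not, which is the behavior described above.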

Directories and Files Created

You’ll also notice that the file structure below has been created.

notebooks/
├── src/
│   └── <notebook_name>/
│       ├── __init__.py
│       └── <notebook_name>.py
└── bin/
    └── <notebook_name>.py #executable
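
To see why this layout leads to the long import path, here is a small sketch that builds the same structure by hand in a temporary directory and imports from it. It does not call jupyckage; my_plots and double are made-up names. It also assumes the tree above is complete, so notebooks/ and src/ have no __init__.py of their own and Python treats them as namespace packages, and that the import runs from the directory containing notebooks/.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the layout by hand in a throwaway directory; "my_plots" is a made-up name.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "notebooks" / "src" / "my_plots"
pkg_dir.mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "my_plots.py").write_text("def double(x):\n    return 2 * x\n")

# Make the directory that contains notebooks/ importable, then import
# using the same dotted path as in the post.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
mp = importlib.import_module("notebooks.src.my_plots.my_plots")

print(mp.double(21))
```

The dotted path simply mirrors the folders: notebooks, then src, then the package directory, then the module file inside it.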

Create a Package from a Collection of Notebooks

If you want to convert other notebooks in the same directory, they’ll show up alongside your first one.

notebooks/
├── src/
│   ├── <notebook_name>/
│   │   ├── __init__.py
│   │   └── <notebook_name>.py
│   ├── <notebook_name2>/
│   │   ├── __init__.py
│   │   └── <notebook_name2>.py
│   └── <notebook_name3>/
│       ├── __init__.py
│       └── <notebook_name3>.py
└── bin/
    ├── <notebook_name>.py
    ├── <notebook_name2>.py
    └── <notebook_name3>.py

More Info

If you want to learn more, request a feature, report a bug, or contribute–

Thanks for reading. Hope this helps and have a great day!