feat: plaintext support #1499

Open: wants to merge 10 commits into `main`

Conversation

lukastk
Contributor

@lukastk lukastk commented Feb 22, 2025

Closes #1498

Adds support for exporting the following plaintext formats as if they were regular .ipynb files:

  • percent
  • light
  • sphinx
  • myst
  • pandoc

See this for details on the above plain-text notebook formats.

Running nbdev_export now converts, by default, files with the extensions *.{pct.py,lgt.py,spx.py,myst.md,pandoc.md,ipynb}. See this for a demonstration (and some documentation) of the new exporter.

The feature requires jupytext and nbformat as (optional) dependencies. They will only be required if users attempt to export non-ipynb files.
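
A minimal sketch of how such an optional-dependency guard could look, assuming a module-level flag like the `plaintext_supported` one visible in the diff further down. The helper name `check_plaintext_deps` is hypothetical and not taken from the PR:

```python
# Optional dependencies: only needed when a non-.ipynb source is exported.
try:
    import jupytext, nbformat  # noqa: F401
    plaintext_supported = True
except ImportError:
    plaintext_supported = False

def check_plaintext_deps(fname):
    "Hypothetical helper: complain only if a plaintext file is exported without the optional deps."
    # Regular notebooks never need jupytext/nbformat, so they always pass.
    if str(fname).endswith(".ipynb"): return
    if not plaintext_supported:
        raise ImportError(f"Exporting {fname} requires `jupytext` and `nbformat`")
```

With this shape, users who only export `.ipynb` files never hit the import error, which matches the behaviour described above.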

deven367 and others added 9 commits December 1, 2024 15:45
Can now export plaintext files in the `percent`, `light`, `sphinx`, `myst` and `pandoc` formats. The new feature uses `jupytext` to convert the `.{pct,lgt,spx}.py` and `.{myst,pandoc}.md` files to an intermediate `.ipynb` file, which is then processed by `nbdev` in the usual manner.

To export a plaintext file (in the `percent` format):

```python
nbdev.export.nb_export("my_file.pct.py", lib_path="my_lib")
```

To export plaintext files in the `nbs` folder (in the `percent` format):

```bash
nbdev_export --file_glob '*.pct.py'
```
Example plaintext files are in `tests/plaintext_files`. We run through the exporting process of each of these files into an `.ipynb` notebook.
…port

The `fmt` argument takes any of the following: 'py:percent', 'py:light', 'py:sphinx', 'md:myst', 'md:pandoc' and 'ipynb'.
`nbdev_export` now exports files of the form "*.{pct.py,lgt.py,spx.py,myst.md,pandoc.md,ipynb}" by default.
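
As a rough illustration of how the file extensions could map onto the `fmt` values listed above, here is a small sketch. The names `SUFFIX_TO_FMT` and `infer_fmt` are hypothetical; the PR's own `plaintext_file_formats` table is not shown in full, so this is an assumption based on the extensions and format names in this message:

```python
from pathlib import Path

# Hypothetical mapping, assembled from the extensions and `fmt` values named above.
SUFFIX_TO_FMT = {
    ".pct.py": "py:percent",
    ".lgt.py": "py:light",
    ".spx.py": "py:sphinx",
    ".myst.md": "md:myst",
    ".pandoc.md": "md:pandoc",
    ".ipynb": "ipynb",
}

def infer_fmt(fname):
    "Return the `fmt` string for `fname`, or None if the suffix is not recognised."
    name = Path(fname).name
    for suffix, fmt in SUFFIX_TO_FMT.items():
        if name.endswith(suffix): return fmt
    return None
```

A plain `.py` or `.md` file returns `None` here, so only files that opt in through the compound suffix would be treated as notebooks.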

In order to do this, I added support for multiple file globs in `nbglob`, and adjusted the default file glob in `nbglob_cli` to cover all supported plaintext and notebook filetypes.
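
The multi-glob idea can be sketched with the standard library alone. This is not the `nbglob` implementation (which delegates to `globtastic`); `match_globs` is a hypothetical stand-in that only shows how a single glob or a list of globs might be accepted and applied:

```python
from fnmatch import fnmatch

def match_globs(fnames, file_glob):
    "Keep the names in `fnames` that match `file_glob`, which may be one glob or a list of globs."
    # Normalise a single glob string to a one-element list.
    globs = file_glob if isinstance(file_glob, (list, tuple)) else [file_glob]
    # One pass over the files, so a name matching several globs is returned once.
    return [f for f in fnames if any(fnmatch(f, g) for g in globs)]
```

For example, `match_globs(["a.ipynb", "b.pct.py", "c.py"], ["*.ipynb", "*.pct.py"])` keeps the first two names and drops `c.py`.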
@lukastk
Contributor Author

lukastk commented Feb 22, 2025

Note: In this commit I change the default behaviour of nbdev_export to also include the plaintext formats. This can be rolled back if it's deemed too drastic a change, in which case plaintext formats can still be exported by running, for example:

nbdev_export --file_glob '*.pct.py'

@lukastk changed the title from "Feature/plaintext support" to "feat: plaintext support" on Feb 23, 2025
@hamelsmu
Contributor

Thanks for providing the blog post with the motivation for this PR.

Plain text formats don't capture outputs, which are required to render nbdev docs, so I'm afraid this is not a good fit. Please let me know if I'm wrong or overlooking something!

@lukastk
Contributor Author

lukastk commented Feb 27, 2025

Hi @hamelsmu. Thanks for taking the time to look through the PR and issue.

I agree that plaintext formats do not capture outputs, but I don't think these are required as such to render them via Quarto, which itself has support for rendering notebook-formatted .py files (see this).

I take your point that we often want to render the outputs in the docs as well, although for some use-cases (like mine) it's less necessary. One option would be to add a pre-processing step where the plaintext files are converted to .ipynb notebooks, and then executed. This way the outputs of the code cells would be rendered. I'd be happy to take a look into this if you think it's a good idea.

@jlopezpena

I have used jupytext before for docs, and it is possible to get it to run the notebooks before generating the documentation; jupyterbook does so in a very nice way. I personally like my documentation rendered (and published!) as part of the CI/CD process, and since the notebooks need to run anyway for the test part, there is no additional overhead.

In my opinion, there is a great benefit in not having to deal with the outputs under version control, particularly when they are binary objects (like images). Besides the pain of having to use a paid tool for proper code reviews, changes in binary files massively increase the size of the git history, which quickly becomes annoying to deal with.

@jlopezpena

PS Quarto also has the option to execute cells when rendering: https://quarto.org/docs/computations/execution-options.html

Contributor

@jph00 jph00 left a comment

Thank you for this PR. Let's make this opt in, disabled by default. You can add a setting for it. I've added some minor formatting requests.

It's unusual to have a separate "tests" notebook. Can this be integrated into the docs instead, creating tests that also illustrate the behavior?

res = globtastic(path, file_glob=file_glob, skip_folder_re=skip_folder_re,
skip_file_re=skip_file_re, recursive=recursive, **kwargs)
if type(file_glob) != list: file_glob = [file_glob]
res = []
Contributor

This can be a list comprehension.

@@ -138,7 +142,7 @@ def nbglob_cli(
@call_parse
@delegates(nbglob_cli)
def nbdev_export(
path:str=None, # Path or filename
path:str=None, # Path or filename,
Contributor

Looks like a mistake?

@@ -16,6 +16,14 @@

from collections import defaultdict

try:
import nbformat
Contributor

Imports can be on one line.

Comment on lines +24 to +25
except ImportError:
plaintext_supported = False
Contributor

Suggested change
except ImportError:
plaintext_supported = False
except ImportError: plaintext_supported = False

@@ -88,17 +96,43 @@ def _mk_procs(procs, nb): return L(procs).map(instantiate, nb=nb)
# %% ../nbs/api/03_process.ipynb
def _is_direc(f): return getattr(f, '__name__', '-')[-1]=='_'

# %% ../nbs/api/03_process.ipynb
plaintext_file_formats = {
Contributor

This can be on one line.

nbformat.write(nb_converted, temp_file.name)
self.nb = read_nb(temp_file.name) if nb is None else nb
return
if fmt is None or fmt == "ipynb":
Contributor

Each part here can be on one line.

@lukastk
Contributor Author

lukastk commented Mar 29, 2025

Thanks @jph00 for your comments.

> Let's make this opt in, disabled by default. You can add a setting for it. I've added some minor formatting requests.

Sounds good, will do!

> It's unusual to have a separate "tests" notebook. Can this be integrated into the docs instead, creating tests that also illustrate the behavior?

Sure, can integrate these into the docs.

Successfully merging this pull request may close these issues.

Feature: add support for .py files in plaintext formats
5 participants